All of lore.kernel.org
 help / color / mirror / Atom feed
* ghes_edac: enable HIP08 platform edac driver
@ 2018-05-11 11:52 Zhengqiang
  0 siblings, 0 replies; 32+ messages in thread
From: Zhengqiang @ 2018-05-11 11:52 UTC (permalink / raw)
  To: mchehab, bp; +Cc: toshi.kani, linux-edac, linuxarm

HISI HIP08 platform support ghes_edac driver, add platform id to enable.

We will support this feature with future platforms.
We use OEM table id to distinguish each platform, that means each
platform we have unique OME table id.

With plat list method, we will upstream oem table id again and again
for future platforms. Is any simple way to solve it?

Signed-off-by: Qiang Zheng <zhengqiang10@huawei.com>
---
 drivers/edac/ghes_edac.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 68b6ee1..3caefef 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -424,6 +424,7 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
  */
 static struct acpi_platform_list plat_list[] = {
 	{"HPE   ", "Server  ", 0, ACPI_SIG_FADT, all_versions},
+	{"HISI  ", "HIP08   ", 0, ACPI_SIG_FADT, all_versions},
 	{ } /* End */
 };
 

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
@ 2018-05-11 12:19 Borislav Petkov
  0 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-11 12:19 UTC (permalink / raw)
  To: Qiang Zheng; +Cc: mchehab, toshi.kani, linux-edac, linuxarm, James Morse

On Fri, May 11, 2018 at 07:52:23PM +0800, Qiang Zheng wrote:
> HISI HIP08 platform support ghes_edac driver, add platform id to enable.

What does that mean exactly? You prefer to report hw errors through GHES
on your platforms or?

What kind of a platform is that, ARM64?

Because if so, we don't have an EDAC driver for AMD64 so ghes_edac
should load without the platform check there, AFAICT.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
@ 2018-05-14  4:11 Zhengqiang
  0 siblings, 0 replies; 32+ messages in thread
From: Zhengqiang @ 2018-05-14  4:11 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: mchehab, toshi.kani, linux-edac, linuxarm, James Morse

On 2018/5/11 20:19, Borislav Petkov wrote:
> On Fri, May 11, 2018 at 07:52:23PM +0800, Qiang Zheng wrote:
>> HISI HIP08 platform support ghes_edac driver, add platform id to enable.
> 
> What does that mean exactly? You prefer to report hw errors through GHES
> on your platforms or?
> 
> What kind of a platform is that, ARM64?
> 

Yes, our platform is ARM64, we report hw errors through GHES.

> Because if so, we don't have an EDAC driver for AMD64 so ghes_edac
> should load without the platform check there, AFAICT.
> 
In ARM64 defconfig, ghes_edac is default load. memory error report to
user space rasdaemon tool through function ghes_edac_report_mem_error, we need it.

Thanks.
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
@ 2018-05-14  9:47 Borislav Petkov
  0 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-14  9:47 UTC (permalink / raw)
  To: Zhengqiang, James Morse; +Cc: mchehab, toshi.kani, linux-edac, linuxarm

On Mon, May 14, 2018 at 12:11:28PM +0800, Zhengqiang wrote:
> In ARM64 defconfig, ghes_edac is default load. memory error report to
> user space rasdaemon tool through function ghes_edac_report_mem_error,
> we need it.

So depending on whether there will be an ARM64 edac driver, we can do
the platform whitelisting on x86 only if ARM prefers to do the reporting
through ghes_edac. James?

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 863fbf3db29f..7b298fd506e9 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -440,12 +440,14 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
 	struct ghes_edac_dimm_fill dimm_fill;
-	int idx;
+	int idx = -1;
 
-	/* Check if safe to enable on this system */
-	idx = acpi_match_platform_list(plat_list);
-	if (!force_load && idx < 0)
-		return -ENODEV;
+	if (IS_ENABLED(CONFIG_X86)) {
+		/* Check if safe to enable on this system */
+		idx = acpi_match_platform_list(plat_list);
+		if (!force_load && idx < 0)
+			return -ENODEV;
+	}
 
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
  2018-05-14  9:47 Borislav Petkov
@ 2018-05-14 15:12 ` James Morse
  -1 siblings, 0 replies; 32+ messages in thread
From: James Morse @ 2018-05-14 15:12 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Zhengqiang, mchehab, toshi.kani, linux-edac, linuxarm, linux-arm-kernel

Hi Borislav,

(CC: +linux-arm list, just in case there is wider discussion)

On 14/05/18 10:47, Borislav Petkov wrote:
> On Mon, May 14, 2018 at 12:11:28PM +0800, Zhengqiang wrote:
>> In ARM64 defconfig, ghes_edac is default load. memory error report to
>> user space rasdaemon tool through function ghes_edac_report_mem_error,
>> we need it.
> 
> So depending on whether there will be an ARM64 edac driver, we can do

I'm afraid there could be a mix: The v8.2 CPU RAS Extensions mean the kernel can
do kernel first. (I agree for those systems there should only be one edac driver).
For systems without the v8.2 CPU RAS Extensions firmware-first is the only way
of doing it.


> the platform whitelisting on x86 only if ARM prefers to do the reporting
> through ghes_edac. James?

I'm afraid I'd like to keep both doors open. Kernel-first handling will require
some ACPI-table/DT property as some aspects of the CPU extensions aren't
discover-able. Can't we use this to pick up whether the platform supports
firmware-first (HEST and GHES entries) or kernel-first via some as-yet-undefined
HEST bits?

Without GHES entries this code would never be run. So we 'just' need to catch
systems that are describing both. (which can be the platform specific kernel
first bits problem to do)


Thanks,

James
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] ghes_edac: enable HIP08 platform edac driver
@ 2018-05-14 15:12 ` James Morse
  0 siblings, 0 replies; 32+ messages in thread
From: James Morse @ 2018-05-14 15:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Borislav,

(CC: +linux-arm list, just in case there is wider discussion)

On 14/05/18 10:47, Borislav Petkov wrote:
> On Mon, May 14, 2018 at 12:11:28PM +0800, Zhengqiang wrote:
>> In ARM64 defconfig, ghes_edac is default load. memory error report to
>> user space rasdaemon tool through function ghes_edac_report_mem_error,
>> we need it.
> 
> So depending on whether there will be an ARM64 edac driver, we can do

I'm afraid there could be a mix: The v8.2 CPU RAS Extensions mean the kernel can
do kernel first. (I agree for those systems there should only be one edac driver).
For systems without the v8.2 CPU RAS Extensions firmware-first is the only way
of doing it.


> the platform whitelisting on x86 only if ARM prefers to do the reporting
> through ghes_edac. James?

I'm afraid I'd like to keep both doors open. Kernel-first handling will require
some ACPI-table/DT property as some aspects of the CPU extensions aren't
discover-able. Can't we use this to pick up whether the platform supports
firmware-first (HEST and GHES entries) or kernel-first via some as-yet-undefined
HEST bits?

Without GHES entries this code would never be run. So we 'just' need to catch
systems that are describing both. (which can be the platform specific kernel
first bits problem to do)


Thanks,

James

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
  2018-05-14 15:12 ` [PATCH] " James Morse
@ 2018-05-14 16:47 ` Borislav Petkov
  -1 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-14 16:47 UTC (permalink / raw)
  To: James Morse
  Cc: Zhengqiang, mchehab, toshi.kani, linux-edac, linuxarm, linux-arm-kernel

On Mon, May 14, 2018 at 04:12:08PM +0100, James Morse wrote:
> I'm afraid I'd like to keep both doors open. Kernel-first handling will require
> some ACPI-table/DT property as some aspects of the CPU extensions aren't
> discover-able. Can't we use this to pick up whether the platform supports
> firmware-first (HEST and GHES entries) or kernel-first via some as-yet-undefined
> HEST bits?

So how you detect those platforms is largely undefined as we're walking
new grounds here with the FF crap on the one hand and platform-specific
drivers on the other. So whatever works for you and as long as the
ugliness is nicely hidden. :-)

> Without GHES entries this code would never be run. So we 'just' need to catch
> systems that are describing both. (which can be the platform specific kernel
> first bits problem to do)

So the reason why we're doing this on x86 is that the majority of
GHES-advertizing platforms out there are a serious turd when it comes
to functioning fw.

So we've opted for known-good platforms list where there's backing from
the vendor to have firmware which is getting fixes and is being tested
properly.

And everything else we assume is crap. Thus we use the platform-specific
EDAC driver which we know it works and we can fix if there's an issue.
VS firmware which we can't. (I doubt anyone can, for that matter. :)...)

Anyway, this is the story in x86 land. JFYI guys in case it helps making
some decisions.

Thx.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] ghes_edac: enable HIP08 platform edac driver
@ 2018-05-14 16:47 ` Borislav Petkov
  0 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-14 16:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 14, 2018 at 04:12:08PM +0100, James Morse wrote:
> I'm afraid I'd like to keep both doors open. Kernel-first handling will require
> some ACPI-table/DT property as some aspects of the CPU extensions aren't
> discover-able. Can't we use this to pick up whether the platform supports
> firmware-first (HEST and GHES entries) or kernel-first via some as-yet-undefined
> HEST bits?

So how you detect those platforms is largely undefined as we're walking
new grounds here with the FF crap on the one hand and platform-specific
drivers on the other. So whatever works for you and as long as the
ugliness is nicely hidden. :-)

> Without GHES entries this code would never be run. So we 'just' need to catch
> systems that are describing both. (which can be the platform specific kernel
> first bits problem to do)

So the reason why we're doing this on x86 is that the majority of
GHES-advertizing platforms out there are a serious turd when it comes
to functioning fw.

So we've opted for known-good platforms list where there's backing from
the vendor to have firmware which is getting fixes and is being tested
properly.

And everything else we assume is crap. Thus we use the platform-specific
EDAC driver which we know it works and we can fix if there's an issue.
VS firmware which we can't. (I doubt anyone can, for that matter. :)...)

Anyway, this is the story in x86 land. JFYI guys in case it helps making
some decisions.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
  2018-05-14 16:47 ` [PATCH] " Borislav Petkov
@ 2018-05-16 13:38 ` James Morse
  -1 siblings, 0 replies; 32+ messages in thread
From: James Morse @ 2018-05-16 13:38 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Zhengqiang, mchehab, toshi.kani, linux-edac, linuxarm, linux-arm-kernel

Hi Borislav, Zhengqiang

On 14/05/18 17:47, Borislav Petkov wrote:
> On Mon, May 14, 2018 at 04:12:08PM +0100, James Morse wrote:
>> I'm afraid I'd like to keep both doors open. Kernel-first handling will require
>> some ACPI-table/DT property as some aspects of the CPU extensions aren't
>> discover-able. Can't we use this to pick up whether the platform supports
>> firmware-first (HEST and GHES entries) or kernel-first via some as-yet-undefined
>> HEST bits?
> 
> So how you detect those platforms is largely undefined as we're walking
> new grounds here with the FF crap on the one hand and platform-specific
> drivers on the other. So whatever works for you and as long as the
> ugliness is nicely hidden. :-)

>> Without GHES entries this code would never be run. So we 'just' need to catch
>> systems that are describing both. (which can be the platform specific kernel
>> first bits problem to do)
> 
> So the reason why we're doing this on x86 is that the majority of
> GHES-advertizing platforms out there are a serious turd when it comes
> to functioning fw.
> 
> So we've opted for known-good platforms list where there's backing from
> the vendor to have firmware which is getting fixes and is being tested
> properly.

Okay, so the question is whether we need a white/blacklist. The two older
platforms that describes GHES in their HEST tables are AMD-Seattle and APM-XGene.

XGene has its own edac driver, but it doesn't probe when booted via ACPI so
won't conflict with ghes_edac. On XGene:
|  ghes_edac: GOT 2 DIMMS
| EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV
ghes (INTERRUPT)

On Seattle:
[    6.119586] ghes_edac: GOT 4 DIMMS
[    6.127294] EDAC MC0: Giving out device to module ghes_edac.c controller ghes
_edac: DEV ghes (INTERRUPT)

The thing has 4 dimm slots, but only two are populated. I swapped them round and
the table was regenerated, so I don't think its faking it.


So I think we're good to make the whitelist x86 only.
Your diff-hunk makes 'idx=-1', so we always get the 'Unfortunately' warning. I'd
like to suppress this unless force_load has been used.


> Anyway, this is the story in x86 land. JFYI guys in case it helps making
> some decisions.

Thanks!

What is the history behind the fake thing here? It predates 32fa1f53c2d
"ghes_edac: do a better job of filling EDAC DIMM info", was it to support a
valid system, or just to ease merging the driver when not all systems had the
dmi table?

It looks like even the oldest Arm64 ACPI systems have dmi tables, so we can
probably require DMI or the 'force' flag.


Thanks,

James
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] ghes_edac: enable HIP08 platform edac driver
@ 2018-05-16 13:38 ` James Morse
  0 siblings, 0 replies; 32+ messages in thread
From: James Morse @ 2018-05-16 13:38 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Borislav, Zhengqiang

On 14/05/18 17:47, Borislav Petkov wrote:
> On Mon, May 14, 2018 at 04:12:08PM +0100, James Morse wrote:
>> I'm afraid I'd like to keep both doors open. Kernel-first handling will require
>> some ACPI-table/DT property as some aspects of the CPU extensions aren't
>> discover-able. Can't we use this to pick up whether the platform supports
>> firmware-first (HEST and GHES entries) or kernel-first via some as-yet-undefined
>> HEST bits?
> 
> So how you detect those platforms is largely undefined as we're walking
> new grounds here with the FF crap on the one hand and platform-specific
> drivers on the other. So whatever works for you and as long as the
> ugliness is nicely hidden. :-)

>> Without GHES entries this code would never be run. So we 'just' need to catch
>> systems that are describing both. (which can be the platform specific kernel
>> first bits problem to do)
> 
> So the reason why we're doing this on x86 is that the majority of
> GHES-advertizing platforms out there are a serious turd when it comes
> to functioning fw.
> 
> So we've opted for known-good platforms list where there's backing from
> the vendor to have firmware which is getting fixes and is being tested
> properly.

Okay, so the question is whether we need a white/blacklist. The two older
platforms that describes GHES in their HEST tables are AMD-Seattle and APM-XGene.

XGene has its own edac driver, but it doesn't probe when booted via ACPI so
won't conflict with ghes_edac. On XGene:
|  ghes_edac: GOT 2 DIMMS
| EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV
ghes (INTERRUPT)

On Seattle:
[    6.119586] ghes_edac: GOT 4 DIMMS
[    6.127294] EDAC MC0: Giving out device to module ghes_edac.c controller ghes
_edac: DEV ghes (INTERRUPT)

The thing has 4 dimm slots, but only two are populated. I swapped them round and
the table was regenerated, so I don't think its faking it.


So I think we're good to make the whitelist x86 only.
Your diff-hunk makes 'idx=-1', so we always get the 'Unfortunately' warning. I'd
like to suppress this unless force_load has been used.


> Anyway, this is the story in x86 land. JFYI guys in case it helps making
> some decisions.

Thanks!

What is the history behind the fake thing here? It predates 32fa1f53c2d
"ghes_edac: do a better job of filling EDAC DIMM info", was it to support a
valid system, or just to ease merging the driver when not all systems had the
dmi table?

It looks like even the oldest Arm64 ACPI systems have dmi tables, so we can
probably require DMI or the 'force' flag.


Thanks,

James

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
  2018-05-16 13:38 ` [PATCH] " James Morse
@ 2018-05-16 18:29 ` Borislav Petkov
  -1 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-16 18:29 UTC (permalink / raw)
  To: James Morse
  Cc: Zhengqiang, mchehab, toshi.kani, linux-edac, linuxarm, linux-arm-kernel

On Wed, May 16, 2018 at 02:38:38PM +0100, James Morse wrote:
> XGene has its own edac driver, but it doesn't probe when booted via ACPI so
> won't conflict with ghes_edac.

Actually it will. EDAC core can have only one EDAC driver loaded. Don't
ask me why - it has been that way since forever. We can change it some
day but frankly, I don't see reasoning for it. One driver can easily
manage *all* error sources on a system, I'd say.

> ... The thing has 4 dimm slots, but only two are populated. I swapped
> them round and the table was regenerated, so I don't think its faking
> it.

Ok.

> So I think we're good to make the whitelist x86 only.
> Your diff-hunk makes 'idx=-1', so we always get the 'Unfortunately' warning. I'd
> like to suppress this unless force_load has been used.

Yeah, we should handle that differently for ARM. Toshi added the idx
thing in

  5deed6b6a479 ("EDAC, ghes: Add platform check")

to dump this when the platform is not whitelisted. So let's do that:
---

---

> What is the history behind the fake thing here? It predates 32fa1f53c2d
> "ghes_edac: do a better job of filling EDAC DIMM info", was it to support a
> valid system, or just to ease merging the driver when not all systems had the
> dmi table?

I wouldn't be surprised if there were some, TBH.

Looks to me like it used to fake DIMMs, see

-       /* FIXME: FAKE DATA */
-       dimm->nr_pages = 1000;
-       dimm->grain = 128;
-       dimm->mtype = MEM_UNKNOWN;
-       dimm->dtype = DEV_UNKNOWN;
-       dimm->edac_mode = EDAC_SECDED;

which 32fa1f53c2d removes.

$ git annotate drivers/edac/ghes_edac.c 32fa1f53c2d~1

shows you the driver before the DMI scanning so it looks like initially
it was faking stuff to satisfy EDAC core when it creates sysfs entries
using struct dimm_info descriptors.

> It looks like even the oldest Arm64 ACPI systems have dmi tables, so we can
> probably require DMI or the 'force' flag.

Well, with the hunk above it would still do ghes_edac_count_dimms() on
ARM and if it fails to find something, it will set fake, which is a good
sanity-check as it screams loudly. :)

Thx.

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 863fbf3db29f..473aeec4b1da 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
 	struct ghes_edac_dimm_fill dimm_fill;
-	int idx;
+	int idx = -1;
 
-	/* Check if safe to enable on this system */
-	idx = acpi_match_platform_list(plat_list);
-	if (!force_load && idx < 0)
-		return -ENODEV;
+	if (IS_ENABLED(CONFIG_X86)) {
+		/* Check if safe to enable on this system */
+		idx = acpi_match_platform_list(plat_list);
+		if (!force_load && idx < 0)
+			return -ENODEV;
+	} else {
+		idx = 0;
+	}
 
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH] ghes_edac: enable HIP08 platform edac driver
@ 2018-05-16 18:29 ` Borislav Petkov
  0 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-16 18:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 16, 2018 at 02:38:38PM +0100, James Morse wrote:
> XGene has its own edac driver, but it doesn't probe when booted via ACPI so
> won't conflict with ghes_edac.

Actually it will. EDAC core can have only one EDAC driver loaded. Don't
ask me why - it has been that way since forever. We can change it some
day but frankly, I don't see reasoning for it. One driver can easily
manage *all* error sources on a system, I'd say.

> ... The thing has 4 dimm slots, but only two are populated. I swapped
> them round and the table was regenerated, so I don't think its faking
> it.

Ok.

> So I think we're good to make the whitelist x86 only.
> Your diff-hunk makes 'idx=-1', so we always get the 'Unfortunately' warning. I'd
> like to suppress this unless force_load has been used.

Yeah, we should handle that differently for ARM. Toshi added the idx
thing in

  5deed6b6a479 ("EDAC, ghes: Add platform check")

to dump this when the platform is not whitelisted. So let's do that:

---
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 863fbf3db29f..473aeec4b1da 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
 	struct ghes_edac_dimm_fill dimm_fill;
-	int idx;
+	int idx = -1;
 
-	/* Check if safe to enable on this system */
-	idx = acpi_match_platform_list(plat_list);
-	if (!force_load && idx < 0)
-		return -ENODEV;
+	if (IS_ENABLED(CONFIG_X86)) {
+		/* Check if safe to enable on this system */
+		idx = acpi_match_platform_list(plat_list);
+		if (!force_load && idx < 0)
+			return -ENODEV;
+	} else {
+		idx = 0;
+	}
 
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.

---

> What is the history behind the fake thing here? It predates 32fa1f53c2d
> "ghes_edac: do a better job of filling EDAC DIMM info", was it to support a
> valid system, or just to ease merging the driver when not all systems had the
> dmi table?

I wouldn't be surprised if there were some, TBH.

Looks to me like it used to fake DIMMs, see

-       /* FIXME: FAKE DATA */
-       dimm->nr_pages = 1000;
-       dimm->grain = 128;
-       dimm->mtype = MEM_UNKNOWN;
-       dimm->dtype = DEV_UNKNOWN;
-       dimm->edac_mode = EDAC_SECDED;

which 32fa1f53c2d removes.

$ git annotate drivers/edac/ghes_edac.c 32fa1f53c2d~1

shows you the driver before the DMI scanning so it looks like initially
it was faking stuff to satisfy EDAC core when it creates sysfs entries
using struct dimm_info descriptors.

> It looks like even the oldest Arm64 ACPI systems have dmi tables, so we can
> probably require DMI or the 'force' flag.

Well, with the hunk above it would still do ghes_edac_count_dimms() on
ARM and if it fails to find something, it will set fake, which is a good
sanity-check as it screams loudly. :)

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
  2018-05-16 18:29 ` [PATCH] " Borislav Petkov
@ 2018-05-17 18:02 ` James Morse
  -1 siblings, 0 replies; 32+ messages in thread
From: James Morse @ 2018-05-17 18:02 UTC (permalink / raw)
  To: Borislav Petkov, Zhengqiang, Tyler Baicar
  Cc: mchehab, toshi.kani, linux-edac, linuxarm, linux-arm-kernel

Hi guys,

Tyler, Zhengqiang, I assume all your shipped platforms with HEST->GHES entries
also have DMI tables.


On 16/05/18 19:29, Borislav Petkov wrote:
> On Wed, May 16, 2018 at 02:38:38PM +0100, James Morse wrote:
>> XGene has its own edac driver, but it doesn't probe when booted via ACPI so
>> won't conflict with ghes_edac.
> 
> Actually it will. EDAC core can have only one EDAC driver loaded. Don't
> ask me why - it has been that way since forever.

By won't probe I mean it only works on DT systems:

| static const struct of_device_id xgene_edac_of_match[] = {
|	{ .compatible = "apm,xgene-edac" },
|	{},
| };

|	.driver = {
|		.name = "xgene-edac",
|		.of_match_table = xgene_edac_of_match,
|	},

To work on a system with GHES it would need an 'struct acpi_device_id' to
describe the HID (?) and populate driver's acpi_match_table.


> We can change it some
> day but frankly, I don't see reasoning for it. One driver can easily
> manage *all* error sources on a system, I'd say.

I agree, there is no reason to support two at the same time, if this happens
then there is probably something wrong with the platform (e.g. races with
firmware reading the same hardware registers), so we should make some noise.

Xgene's edac driver would be a good example of this, it looks like it reads data
from some mmio region, if something else is doing the same we're going to make a
mess.


>> So I think we're good to make the whitelist x86 only.
>> Your diff-hunk makes 'idx=-1', so we always get the 'Unfortunately' warning. I'd
>> like to suppress this unless force_load has been used.
> 
> Yeah, we should handle that differently for ARM. Toshi added the idx
> thing in
> 
>   5deed6b6a479 ("EDAC, ghes: Add platform check")
> 
> to dump this when the platform is not whitelisted. So let's do that:
> 
> ---
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 863fbf3db29f..473aeec4b1da 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	struct mem_ctl_info *mci;
>  	struct edac_mc_layer layers[1];
>  	struct ghes_edac_dimm_fill dimm_fill;
> -	int idx;
> +	int idx = -1;
>  
> -	/* Check if safe to enable on this system */
> -	idx = acpi_match_platform_list(plat_list);
> -	if (!force_load && idx < 0)
> -		return -ENODEV;

v4.17-rc5 has 'return 0' here. Wouldn't this change means no ghes can be
registered unless ghes_edac is also supported by the platform?
Shouldn't this be '0' for a silent failure?


> +	if (IS_ENABLED(CONFIG_X86)) {
> +		/* Check if safe to enable on this system */
> +		idx = acpi_match_platform_list(plat_list);
> +		if (!force_load && idx < 0)
> +			return -ENODEV;
> +	} else {
> +		idx = 0;
> +	}
>  
>  	/*
>  	 * We have only one logical memory controller to which all DIMMs belong.

Tested on Seattle and some cranky homebrew-no-DMI firmware:
Tested-by: James Morse <james.morse@arm.com>

With the ENODEV/0 thing above:
Reviewed-by: James Morse <james.morse@arm.com>


>> It looks like even the oldest Arm64 ACPI systems have dmi tables, so we can
>> probably require DMI or the 'force' flag.
> 
> Well, with the hunk above it would still do ghes_edac_count_dimms() on
> ARM and if it fails to find something, it will set fake, which is a good
> sanity-check as it screams loudly. :)


Thanks,

James
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] ghes_edac: enable HIP08 platform edac driver
@ 2018-05-17 18:02 ` James Morse
  0 siblings, 0 replies; 32+ messages in thread
From: James Morse @ 2018-05-17 18:02 UTC (permalink / raw)
  To: linux-arm-kernel

Hi guys,

Tyler, Zhengqiang, I assume all your shipped platforms with HEST->GHES entries
also have DMI tables.


On 16/05/18 19:29, Borislav Petkov wrote:
> On Wed, May 16, 2018 at 02:38:38PM +0100, James Morse wrote:
>> XGene has its own edac driver, but it doesn't probe when booted via ACPI so
>> won't conflict with ghes_edac.
> 
> Actually it will. EDAC core can have only one EDAC driver loaded. Don't
> ask me why - it has been that way since forever.

By won't probe I mean it only works on DT systems:

| static const struct of_device_id xgene_edac_of_match[] = {
|	{ .compatible = "apm,xgene-edac" },
|	{},
| };

|	.driver = {
|		.name = "xgene-edac",
|		.of_match_table = xgene_edac_of_match,
|	},

To work on a system with GHES it would need an 'struct acpi_device_id' to
describe the HID (?) and populate driver's acpi_match_table.


> We can change it some
> day but frankly, I don't see reasoning for it. One driver can easily
> manage *all* error sources on a system, I'd say.

I agree, there is no reason to support two at the same time, if this happens
then there is probably something wrong with the platform (e.g. races with
firmware reading the same hardware registers), so we should make some noise.

Xgene's edac driver would be a good example of this, it looks like it reads data
from some mmio region, if something else is doing the same we're going to make a
mess.


>> So I think we're good to make the whitelist x86 only.
>> Your diff-hunk makes 'idx=-1', so we always get the 'Unfortunately' warning. I'd
>> like to suppress this unless force_load has been used.
> 
> Yeah, we should handle that differently for ARM. Toshi added the idx
> thing in
> 
>   5deed6b6a479 ("EDAC, ghes: Add platform check")
> 
> to dump this when the platform is not whitelisted. So let's do that:
> 
> ---
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 863fbf3db29f..473aeec4b1da 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	struct mem_ctl_info *mci;
>  	struct edac_mc_layer layers[1];
>  	struct ghes_edac_dimm_fill dimm_fill;
> -	int idx;
> +	int idx = -1;
>  
> -	/* Check if safe to enable on this system */
> -	idx = acpi_match_platform_list(plat_list);
> -	if (!force_load && idx < 0)
> -		return -ENODEV;

v4.17-rc5 has 'return 0' here. Wouldn't this change means no ghes can be
registered unless ghes_edac is also supported by the platform?
Shouldn't this be '0' for a silent failure?


> +	if (IS_ENABLED(CONFIG_X86)) {
> +		/* Check if safe to enable on this system */
> +		idx = acpi_match_platform_list(plat_list);
> +		if (!force_load && idx < 0)
> +			return -ENODEV;
> +	} else {
> +		idx = 0;
> +	}
>  
>  	/*
>  	 * We have only one logical memory controller to which all DIMMs belong.

Tested on Seattle and some cranky homebrew-no-DMI firmware:
Tested-by: James Morse <james.morse@arm.com>

With the ENODEV/0 thing above:
Reviewed-by: James Morse <james.morse@arm.com>


>> It looks like even the oldest Arm64 ACPI systems have dmi tables, so we can
>> probably require DMI or the 'force' flag.
> 
> Well, with the hunk above it would still do ghes_edac_count_dimms() on
> ARM and if it fails to find something, it will set fake, which is a good
> sanity-check as it screams loudly. :)


Thanks,

James

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
  2018-05-17 18:02 ` [PATCH] " James Morse
@ 2018-05-18  7:13 ` Zhengqiang
  -1 siblings, 0 replies; 32+ messages in thread
From: Zhengqiang @ 2018-05-18  7:13 UTC (permalink / raw)
  To: James Morse, Borislav Petkov, Tyler Baicar
  Cc: mchehab, toshi.kani, linux-edac, linuxarm, linux-arm-kernel

On 2018/5/18 2:02, James Morse wrote:
> Hi guys,
> 
> Tyler, Zhengqiang, I assume all your shipped platforms with HEST->GHES entries
> also have DMI tables.
> 

Sure, Our ARM64 platform have DMI tables. thanks.

> 
> On 16/05/18 19:29, Borislav Petkov wrote:
>> On Wed, May 16, 2018 at 02:38:38PM +0100, James Morse wrote:
>>> XGene has its own edac driver, but it doesn't probe when booted via ACPI so
>>> won't conflict with ghes_edac.
>>
>> Actually it will. EDAC core can have only one EDAC driver loaded. Don't
>> ask me why - it has been that way since forever.
> 
> By won't probe I mean it only works on DT systems:
> 
> | static const struct of_device_id xgene_edac_of_match[] = {
> |	{ .compatible = "apm,xgene-edac" },
> |	{},
> | };
> 
> |	.driver = {
> |		.name = "xgene-edac",
> |		.of_match_table = xgene_edac_of_match,
> |	},
> 
> To work on a system with GHES it would need an 'struct acpi_device_id' to
> describe the HID (?) and populate driver's acpi_match_table.
> 
> 
>> We can change it some
>> day but frankly, I don't see reasoning for it. One driver can easily
>> manage *all* error sources on a system, I'd say.
> 
> I agree, there is no reason to support two at the same time, if this happens
> then there is probably something wrong with the platform (e.g. races with
> firmware reading the same hardware registers), so we should make some noise.
> 
> Xgene's edac driver would be a good example of this, it looks like it reads data
> from some mmio region, if something else is doing the same we're going to make a
> mess.
> 
> 
>>> So I think we're good to make the whitelist x86 only.
>>> Your diff-hunk makes 'idx=-1', so we always get the 'Unfortunately' warning. I'd
>>> like to suppress this unless force_load has been used.
>>
>> Yeah, we should handle that differently for ARM. Toshi added the idx
>> thing in
>>
>>   5deed6b6a479 ("EDAC, ghes: Add platform check")
>>
>> to dump this when the platform is not whitelisted. So let's do that:
>>
>> ---
>> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
>> index 863fbf3db29f..473aeec4b1da 100644
>> --- a/drivers/edac/ghes_edac.c
>> +++ b/drivers/edac/ghes_edac.c
>> @@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>>  	struct mem_ctl_info *mci;
>>  	struct edac_mc_layer layers[1];
>>  	struct ghes_edac_dimm_fill dimm_fill;
>> -	int idx;
>> +	int idx = -1;
>>  
>> -	/* Check if safe to enable on this system */
>> -	idx = acpi_match_platform_list(plat_list);
>> -	if (!force_load && idx < 0)
>> -		return -ENODEV;
> 
> v4.17-rc5 has 'return 0' here. Wouldn't this change means no ghes can be
> registered unless ghes_edac is also supported by the platform?
> Shouldn't this be '0' for a silent failure?
> 
> 
>> +	if (IS_ENABLED(CONFIG_X86)) {
>> +		/* Check if safe to enable on this system */
>> +		idx = acpi_match_platform_list(plat_list);
>> +		if (!force_load && idx < 0)
>> +			return -ENODEV;
>> +	} else {
>> +		idx = 0;
>> +	}
>>  
>>  	/*
>>  	 * We have only one logical memory controller to which all DIMMs belong.
> 
> Tested on Seattle and some cranky homebrew-no-DMI firmware:
> Tested-by: James Morse <james.morse@arm.com>
> 
> With the ENODEV/0 thing above:
> Reviewed-by: James Morse <james.morse@arm.com>
> 
> 
>>> It looks like even the oldest Arm64 ACPI systems have dmi tables, so we can
>>> probably require DMI or the 'force' flag.
>>
>> Well, with the hunk above it would still do ghes_edac_count_dimms() on
>> ARM and if it fails to find something, it will set fake, which is a good
>> sanity-check as it screams loudly. :)
> 
> 
> Thanks,
> 
> James
> 
> .
>
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] ghes_edac: enable HIP08 platform edac driver
@ 2018-05-18  7:13 ` Zhengqiang
  0 siblings, 0 replies; 32+ messages in thread
From: Zhengqiang @ 2018-05-18  7:13 UTC (permalink / raw)
  To: linux-arm-kernel



On 2018/5/18 2:02, James Morse wrote:
> Hi guys,
> 
> Tyler, Zhengqiang, I assume all your shipped platforms with HEST->GHES entries
> also have DMI tables.
> 

Sure, Our ARM64 platform have DMI tables. thanks.

> 
> On 16/05/18 19:29, Borislav Petkov wrote:
>> On Wed, May 16, 2018 at 02:38:38PM +0100, James Morse wrote:
>>> XGene has its own edac driver, but it doesn't probe when booted via ACPI so
>>> won't conflict with ghes_edac.
>>
>> Actually it will. EDAC core can have only one EDAC driver loaded. Don't
>> ask me why - it has been that way since forever.
> 
> By won't probe I mean it only works on DT systems:
> 
> | static const struct of_device_id xgene_edac_of_match[] = {
> |	{ .compatible = "apm,xgene-edac" },
> |	{},
> | };
> 
> |	.driver = {
> |		.name = "xgene-edac",
> |		.of_match_table = xgene_edac_of_match,
> |	},
> 
> To work on a system with GHES it would need an 'struct acpi_device_id' to
> describe the HID (?) and populate driver's acpi_match_table.
> 
> 
>> We can change it some
>> day but frankly, I don't see reasoning for it. One driver can easily
>> manage *all* error sources on a system, I'd say.
> 
> I agree, there is no reason to support two at the same time, if this happens
> then there is probably something wrong with the platform (e.g. races with
> firmware reading the same hardware registers), so we should make some noise.
> 
> Xgene's edac driver would be a good example of this, it looks like it reads data
> from some mmio region, if something else is doing the same we're going to make a
> mess.
> 
> 
>>> So I think we're good to make the whitelist x86 only.
>>> Your diff-hunk makes 'idx=-1', so we always get the 'Unfortunately' warning. I'd
>>> like to suppress this unless force_load has been used.
>>
>> Yeah, we should handle that differently for ARM. Toshi added the idx
>> thing in
>>
>>   5deed6b6a479 ("EDAC, ghes: Add platform check")
>>
>> to dump this when the platform is not whitelisted. So let's do that:
>>
>> ---
>> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
>> index 863fbf3db29f..473aeec4b1da 100644
>> --- a/drivers/edac/ghes_edac.c
>> +++ b/drivers/edac/ghes_edac.c
>> @@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>>  	struct mem_ctl_info *mci;
>>  	struct edac_mc_layer layers[1];
>>  	struct ghes_edac_dimm_fill dimm_fill;
>> -	int idx;
>> +	int idx = -1;
>>  
>> -	/* Check if safe to enable on this system */
>> -	idx = acpi_match_platform_list(plat_list);
>> -	if (!force_load && idx < 0)
>> -		return -ENODEV;
> 
> v4.17-rc5 has 'return 0' here. Wouldn't this change means no ghes can be
> registered unless ghes_edac is also supported by the platform?
> Shouldn't this be '0' for a silent failure?
> 
> 
>> +	if (IS_ENABLED(CONFIG_X86)) {
>> +		/* Check if safe to enable on this system */
>> +		idx = acpi_match_platform_list(plat_list);
>> +		if (!force_load && idx < 0)
>> +			return -ENODEV;
>> +	} else {
>> +		idx = 0;
>> +	}
>>  
>>  	/*
>>  	 * We have only one logical memory controller to which all DIMMs belong.
> 
> Tested on Seattle and some cranky homebrew-no-DMI firmware:
> Tested-by: James Morse <james.morse@arm.com>
> 
> With the ENODEV/0 thing above:
> Reviewed-by: James Morse <james.morse@arm.com>
> 
> 
>>> It looks like even the oldest Arm64 ACPI systems have dmi tables, so we can
>>> probably require DMI or the 'force' flag.
>>
>> Well, with the hunk above it would still do ghes_edac_count_dimms() on
>> ARM and if it fails to find something, it will set fake, which is a good
>> sanity-check as it screams loudly. :)
> 
> 
> Thanks,
> 
> James
> 
> .
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ghes_edac: enable HIP08 platform edac driver
  2018-05-17 18:02 ` [PATCH] " James Morse
@ 2018-05-18 11:11 ` Borislav Petkov
  -1 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-18 11:11 UTC (permalink / raw)
  To: James Morse
  Cc: Zhengqiang, Tyler Baicar, mchehab, toshi.kani, linux-edac,
	linuxarm, linux-arm-kernel

On Thu, May 17, 2018 at 07:02:18PM +0100, James Morse wrote:
> v4.17-rc5 has 'return 0' here. Wouldn't this change means no ghes can be
> registered unless ghes_edac is also supported by the platform?
> Shouldn't this be '0' for a silent failure?

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/commit/?h=for-next&id=cc7f3f132658289b6661ab8294ab08a9d32ea026

> Tested on Seattle and some cranky homebrew-no-DMI firmware:
> Tested-by: James Morse <james.morse@arm.com>
> 
> With the ENODEV/0 thing above:
> Reviewed-by: James Morse <james.morse@arm.com>

Thanks, adding.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] ghes_edac: enable HIP08 platform edac driver
@ 2018-05-18 11:11 ` Borislav Petkov
  0 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-18 11:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 17, 2018 at 07:02:18PM +0100, James Morse wrote:
> v4.17-rc5 has 'return 0' here. Wouldn't this change means no ghes can be
> registered unless ghes_edac is also supported by the platform?
> Shouldn't this be '0' for a silent failure?

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/commit/?h=for-next&id=cc7f3f132658289b6661ab8294ab08a9d32ea026

> Tested on Seattle and some cranky homebrew-no-DMI firmware:
> Tested-by: James Morse <james.morse@arm.com>
> 
> With the ENODEV/0 thing above:
> Reviewed-by: James Morse <james.morse@arm.com>

Thanks, adding.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* EDAC, ghes: Make platform-based whitelisting x86-only
  2018-05-18 11:11 ` [PATCH] " Borislav Petkov
@ 2018-05-18 11:20 ` Borislav Petkov
  -1 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-18 11:20 UTC (permalink / raw)
  To: James Morse
  Cc: Zhengqiang, Tyler Baicar, mchehab, toshi.kani, linux-edac,
	linuxarm, linux-arm-kernel

From: Borislav Petkov <bp@suse.de>

ARM machines all have DMI tables so if they request hw error reporting
through GHES, then the driver should be able to detect DIMMs and report
errors successfully (famous last words :)).

Make the platform-based list x86-specific so that ghes_edac can load on
ARM.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Cc: Qiang Zheng <zhengqiang10@huawei.com>
Link: https://lkml.kernel.org/r/1526039543-180996-1-git-send-email-zhengqiang10@huawei.com
---
 drivers/edac/ghes_edac.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 863fbf3db29f..473aeec4b1da 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
 	struct ghes_edac_dimm_fill dimm_fill;
-	int idx;
+	int idx = -1;
 
-	/* Check if safe to enable on this system */
-	idx = acpi_match_platform_list(plat_list);
-	if (!force_load && idx < 0)
-		return -ENODEV;
+	if (IS_ENABLED(CONFIG_X86)) {
+		/* Check if safe to enable on this system */
+		idx = acpi_match_platform_list(plat_list);
+		if (!force_load && idx < 0)
+			return -ENODEV;
+	} else {
+		idx = 0;
+	}
 
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH] EDAC, ghes: Make platform-based whitelisting x86-only
@ 2018-05-18 11:20 ` Borislav Petkov
  0 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-18 11:20 UTC (permalink / raw)
  To: linux-arm-kernel

From: Borislav Petkov <bp@suse.de>

ARM machines all have DMI tables so if they request hw error reporting
through GHES, then the driver should be able to detect DIMMs and report
errors successfully (famous last words :)).

Make the platform-based list x86-specific so that ghes_edac can load on
ARM.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Cc: Qiang Zheng <zhengqiang10@huawei.com>
Link: https://lkml.kernel.org/r/1526039543-180996-1-git-send-email-zhengqiang10 at huawei.com
---
 drivers/edac/ghes_edac.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 863fbf3db29f..473aeec4b1da 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
 	struct ghes_edac_dimm_fill dimm_fill;
-	int idx;
+	int idx = -1;
 
-	/* Check if safe to enable on this system */
-	idx = acpi_match_platform_list(plat_list);
-	if (!force_load && idx < 0)
-		return -ENODEV;
+	if (IS_ENABLED(CONFIG_X86)) {
+		/* Check if safe to enable on this system */
+		idx = acpi_match_platform_list(plat_list);
+		if (!force_load && idx < 0)
+			return -ENODEV;
+	} else {
+		idx = 0;
+	}
 
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.
-- 
2.17.0.391.g1f1cddd558b5


-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* EDAC, ghes: Make platform-based whitelisting x86-only
  2018-05-18 11:20 ` [PATCH] " Borislav Petkov
@ 2018-05-21  9:39 ` Zhengqiang
  -1 siblings, 0 replies; 32+ messages in thread
From: Zhengqiang @ 2018-05-21  9:39 UTC (permalink / raw)
  To: Borislav Petkov, James Morse
  Cc: Tyler Baicar, mchehab, toshi.kani, linux-edac, linuxarm,
	linux-arm-kernel

Thanks, it works for me.

On 2018/5/18 19:20, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> ARM machines all have DMI tables so if they request hw error reporting
> through GHES, then the driver should be able to detect DIMMs and report
> errors successfully (famous last words :)).
> 
> Make the platform-based list x86-specific so that ghes_edac can load on
> ARM.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Reviewed-by: James Morse <james.morse@arm.com>
> Tested-by: James Morse <james.morse@arm.com>
> Cc: Qiang Zheng <zhengqiang10@huawei.com>
> Link: https://lkml.kernel.org/r/1526039543-180996-1-git-send-email-zhengqiang10@huawei.com

Tested-by: Qiang Zheng <zhengqiang10@huawei.com>

> ---
>  drivers/edac/ghes_edac.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 863fbf3db29f..473aeec4b1da 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	struct mem_ctl_info *mci;
>  	struct edac_mc_layer layers[1];
>  	struct ghes_edac_dimm_fill dimm_fill;
> -	int idx;
> +	int idx = -1;
>  
> -	/* Check if safe to enable on this system */
> -	idx = acpi_match_platform_list(plat_list);
> -	if (!force_load && idx < 0)
> -		return -ENODEV;
> +	if (IS_ENABLED(CONFIG_X86)) {
> +		/* Check if safe to enable on this system */
> +		idx = acpi_match_platform_list(plat_list);
> +		if (!force_load && idx < 0)
> +			return -ENODEV;
> +	} else {
> +		idx = 0;
> +	}
>  
>  	/*
>  	 * We have only one logical memory controller to which all DIMMs belong.
>
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] EDAC, ghes: Make platform-based whitelisting x86-only
@ 2018-05-21  9:39 ` Zhengqiang
  0 siblings, 0 replies; 32+ messages in thread
From: Zhengqiang @ 2018-05-21  9:39 UTC (permalink / raw)
  To: linux-arm-kernel

Thanks, it works for me.

On 2018/5/18 19:20, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> ARM machines all have DMI tables so if they request hw error reporting
> through GHES, then the driver should be able to detect DIMMs and report
> errors successfully (famous last words :)).
> 
> Make the platform-based list x86-specific so that ghes_edac can load on
> ARM.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Reviewed-by: James Morse <james.morse@arm.com>
> Tested-by: James Morse <james.morse@arm.com>
> Cc: Qiang Zheng <zhengqiang10@huawei.com>
> Link: https://lkml.kernel.org/r/1526039543-180996-1-git-send-email-zhengqiang10 at huawei.com

Tested-by: Qiang Zheng <zhengqiang10@huawei.com>

> ---
>  drivers/edac/ghes_edac.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 863fbf3db29f..473aeec4b1da 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -440,12 +440,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	struct mem_ctl_info *mci;
>  	struct edac_mc_layer layers[1];
>  	struct ghes_edac_dimm_fill dimm_fill;
> -	int idx;
> +	int idx = -1;
>  
> -	/* Check if safe to enable on this system */
> -	idx = acpi_match_platform_list(plat_list);
> -	if (!force_load && idx < 0)
> -		return -ENODEV;
> +	if (IS_ENABLED(CONFIG_X86)) {
> +		/* Check if safe to enable on this system */
> +		idx = acpi_match_platform_list(plat_list);
> +		if (!force_load && idx < 0)
> +			return -ENODEV;
> +	} else {
> +		idx = 0;
> +	}
>  
>  	/*
>  	 * We have only one logical memory controller to which all DIMMs belong.
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* EDAC, ghes: Make platform-based whitelisting x86-only
  2018-05-18 11:20 ` [PATCH] " Borislav Petkov
@ 2018-05-21 13:48 ` Tyler Baicar
  -1 siblings, 0 replies; 32+ messages in thread
From: Tyler Baicar @ 2018-05-21 13:48 UTC (permalink / raw)
  To: Borislav Petkov, James Morse
  Cc: Zhengqiang, mchehab, toshi.kani, linux-edac, linuxarm, linux-arm-kernel

On 5/18/2018 7:20 AM, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
>
> ARM machines all have DMI tables so if they request hw error reporting
> through GHES, then the driver should be able to detect DIMMs and report
> errors successfully (famous last words :)).
>
> Make the platform-based list x86-specific so that ghes_edac can load on
> ARM.
>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Reviewed-by: James Morse <james.morse@arm.com>
> Tested-by: James Morse <james.morse@arm.com>
> Cc: Qiang Zheng <zhengqiang10@huawei.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

Was it intentional to make idx=0 so that the pr_info later in this function no 
longer hits?

I don't see an issue with not printing out the long BIOS statement, but the 
number of DIMM sockets print could still be useful.

Thanks,
Tyler

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] EDAC, ghes: Make platform-based whitelisting x86-only
@ 2018-05-21 13:48 ` Tyler Baicar
  0 siblings, 0 replies; 32+ messages in thread
From: Tyler Baicar @ 2018-05-21 13:48 UTC (permalink / raw)
  To: linux-arm-kernel

On 5/18/2018 7:20 AM, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
>
> ARM machines all have DMI tables so if they request hw error reporting
> through GHES, then the driver should be able to detect DIMMs and report
> errors successfully (famous last words :)).
>
> Make the platform-based list x86-specific so that ghes_edac can load on
> ARM.
>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Reviewed-by: James Morse <james.morse@arm.com>
> Tested-by: James Morse <james.morse@arm.com>
> Cc: Qiang Zheng <zhengqiang10@huawei.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

Was it intentional to make idx=0 so that the pr_info later in this function no 
longer hits?

I don't see an issue with not printing out the long BIOS statement, but the 
number of DIMM sockets print could still be useful.

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* EDAC, ghes: Make platform-based whitelisting x86-only
  2018-05-21 13:48 ` [PATCH] " Tyler Baicar
@ 2018-05-21 17:15 ` Borislav Petkov
  -1 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-21 17:15 UTC (permalink / raw)
  To: Tyler Baicar
  Cc: James Morse, Zhengqiang, mchehab, toshi.kani, linux-edac,
	linuxarm, linux-arm-kernel

On Mon, May 21, 2018 at 09:48:23AM -0400, Tyler Baicar wrote:
> I don't see an issue with not printing out the long BIOS statement, but the
> number of DIMM sockets print could still be useful.

Well, if you wanna dump the number of DIMMs - then maybe that line
should issue unconditionally. However, "DIMM sockets" is silly - it
should simply say:

	"%d DIMMs detected"

or so.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] EDAC, ghes: Make platform-based whitelisting x86-only
@ 2018-05-21 17:15 ` Borislav Petkov
  0 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-21 17:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 21, 2018 at 09:48:23AM -0400, Tyler Baicar wrote:
> I don't see an issue with not printing out the long BIOS statement, but the
> number of DIMM sockets print could still be useful.

Well, if you wanna dump the number of DIMMs - then maybe that line
should issue unconditionally. However, "DIMM sockets" is silly - it
should simply say:

	"%d DIMMs detected"

or so.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* EDAC, ghes: Make platform-based whitelisting x86-only
  2018-05-21 17:15 ` [PATCH] " Borislav Petkov
@ 2018-05-21 20:34 ` Tyler Baicar
  -1 siblings, 0 replies; 32+ messages in thread
From: Tyler Baicar @ 2018-05-21 20:34 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: James Morse, Zhengqiang, mchehab, toshi.kani, linux-edac,
	linuxarm, linux-arm-kernel

On 5/21/2018 1:15 PM, Borislav Petkov wrote:
> On Mon, May 21, 2018 at 09:48:23AM -0400, Tyler Baicar wrote:
>> I don't see an issue with not printing out the long BIOS statement, but the
>> number of DIMM sockets print could still be useful.
> Well, if you wanna dump the number of DIMMs - then maybe that line
> should issue unconditionally. However, "DIMM sockets" is silly - it
> should simply say:
>
> 	"%d DIMMs detected"
>
> or so.
Agreed. Sounds good to me.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] EDAC, ghes: Make platform-based whitelisting x86-only
@ 2018-05-21 20:34 ` Tyler Baicar
  0 siblings, 0 replies; 32+ messages in thread
From: Tyler Baicar @ 2018-05-21 20:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 5/21/2018 1:15 PM, Borislav Petkov wrote:
> On Mon, May 21, 2018 at 09:48:23AM -0400, Tyler Baicar wrote:
>> I don't see an issue with not printing out the long BIOS statement, but the
>> number of DIMM sockets print could still be useful.
> Well, if you wanna dump the number of DIMMs - then maybe that line
> should issue unconditionally. However, "DIMM sockets" is silly - it
> should simply say:
>
> 	"%d DIMMs detected"
>
> or so.
Agreed. Sounds good to me.

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* EDAC, ghes: Make platform-based whitelisting x86-only
  2018-05-21 20:34 ` [PATCH] " Tyler Baicar
@ 2018-05-21 20:44 ` Borislav Petkov
  -1 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-21 20:44 UTC (permalink / raw)
  To: Tyler Baicar
  Cc: James Morse, Zhengqiang, mchehab, toshi.kani, linux-edac,
	linuxarm, linux-arm-kernel

On Mon, May 21, 2018 at 04:34:11PM -0400, Tyler Baicar wrote:
> Agreed. Sounds good to me.

Meaning you're sending me a patch soon or ...?

:-D

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] EDAC, ghes: Make platform-based whitelisting x86-only
@ 2018-05-21 20:44 ` Borislav Petkov
  0 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2018-05-21 20:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 21, 2018 at 04:34:11PM -0400, Tyler Baicar wrote:
> Agreed. Sounds good to me.

Meaning you're sending me a patch soon or ...?

:-D

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* EDAC, ghes: Make platform-based whitelisting x86-only
  2018-05-21 20:44 ` [PATCH] " Borislav Petkov
@ 2018-05-22  2:10 ` Tyler Baicar
  -1 siblings, 0 replies; 32+ messages in thread
From: Tyler Baicar @ 2018-05-22  2:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: James Morse, Zhengqiang, mchehab, toshi.kani, linux-edac,
	linuxarm, linux-arm-kernel

On 5/21/2018 4:44 PM, Borislav Petkov wrote:
> On Mon, May 21, 2018 at 04:34:11PM -0400, Tyler Baicar wrote:
>> Agreed. Sounds good to me.
> Meaning you're sending me a patch soon or ...?
>
> :-D
>
Yes, I'll put a patch together and send it out tomorrow.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] EDAC, ghes: Make platform-based whitelisting x86-only
@ 2018-05-22  2:10 ` Tyler Baicar
  0 siblings, 0 replies; 32+ messages in thread
From: Tyler Baicar @ 2018-05-22  2:10 UTC (permalink / raw)
  To: linux-arm-kernel

On 5/21/2018 4:44 PM, Borislav Petkov wrote:
> On Mon, May 21, 2018 at 04:34:11PM -0400, Tyler Baicar wrote:
>> Agreed. Sounds good to me.
> Meaning you're sending me a patch soon or ...?
>
> :-D
>
Yes, I'll put a patch together and send it out tomorrow.

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2018-05-22  2:10 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-11 12:19 ghes_edac: enable HIP08 platform edac driver Borislav Petkov
  -- strict thread matches above, loose matches on Subject: below --
2018-05-22  2:10 EDAC, ghes: Make platform-based whitelisting x86-only Tyler Baicar
2018-05-22  2:10 ` [PATCH] " Tyler Baicar
2018-05-21 20:44 Borislav Petkov
2018-05-21 20:44 ` [PATCH] " Borislav Petkov
2018-05-21 20:34 Tyler Baicar
2018-05-21 20:34 ` [PATCH] " Tyler Baicar
2018-05-21 17:15 Borislav Petkov
2018-05-21 17:15 ` [PATCH] " Borislav Petkov
2018-05-21 13:48 Tyler Baicar
2018-05-21 13:48 ` [PATCH] " Tyler Baicar
2018-05-21  9:39 Zhengqiang
2018-05-21  9:39 ` [PATCH] " Zhengqiang
2018-05-18 11:20 Borislav Petkov
2018-05-18 11:20 ` [PATCH] " Borislav Petkov
2018-05-18 11:11 ghes_edac: enable HIP08 platform edac driver Borislav Petkov
2018-05-18 11:11 ` [PATCH] " Borislav Petkov
2018-05-18  7:13 Zhengqiang
2018-05-18  7:13 ` [PATCH] " Zhengqiang
2018-05-17 18:02 James Morse
2018-05-17 18:02 ` [PATCH] " James Morse
2018-05-16 18:29 Borislav Petkov
2018-05-16 18:29 ` [PATCH] " Borislav Petkov
2018-05-16 13:38 James Morse
2018-05-16 13:38 ` [PATCH] " James Morse
2018-05-14 16:47 Borislav Petkov
2018-05-14 16:47 ` [PATCH] " Borislav Petkov
2018-05-14 15:12 James Morse
2018-05-14 15:12 ` [PATCH] " James Morse
2018-05-14  9:47 Borislav Petkov
2018-05-14  4:11 Zhengqiang
2018-05-11 11:52 Zhengqiang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.