All of lore.kernel.org
 help / color / mirror / Atom feed
* EDAC, ghes: Do not register instance if platform is not whitelisted
@ 2018-04-23 20:35 Toshi Kani
  0 siblings, 0 replies; 6+ messages in thread
From: Toshi Kani @ 2018-04-23 20:35 UTC (permalink / raw)
  To: bp; +Cc: alex_gagniuc, tony.luck, james.morse, austin_bolen, linux-edac

T24gTW9uLCAyMDE4LTA0LTIzIGF0IDIyOjI4ICswMjAwLCBCb3Jpc2xhdiBQZXRrb3Ygd3JvdGU6
DQo+IE9uIE1vbiwgQXByIDIzLCAyMDE4IGF0IDA4OjEzOjIzUE0gKzAwMDAsIEthbmksIFRvc2hp
IHdyb3RlOg0KPiA+IFVtbS4uLiwgSSBkbyBub3QgdGhpbmsgdGhpcyBjaGFuZ2UgaXMgY29ycmVj
dC4gIFRoaXMgY2F1c2VzIGFuIGVycm9yIHRvDQo+ID4gZ2hlc19wcm9iZSgpLCB3aGljaCBkaXNh
YmxlcyBHSEVTIGl0c2VsZi4gIGdoZXMgYW5kIGdoZXNfZWRhYyBhcmUNCj4gPiBzZXBhcmF0ZSBt
b2R1bGVzLiAgVGhhdCBpcywgZ2hlc19lZGFjIHdvcmtzIG9uIHRvcCBvZiBnaGVzIGFzIGFuDQo+
ID4gb3B0aW9uYWwgbW9kdWxlLg0KPiA+IA0KPiA+IEkgdGhpbmsgdGhlIHJpZ2h0IGZpeCBpcyB0
byBzaW1wbHkgcmVtb3ZlIHRoZSBlcnJvciBtZXNzYWdlLiAgVGhpcyB3aWxsDQo+ID4gbWFrZSBp
dCBjb25zaXN0ZW50IHdpdGggdGhlIGNhc2Ugb2YgQ09ORklHX0VEQUNfR0hFUyBkaXNhYmxlZC4g
IFBsZWFzZQ0KPiA+IHNlZSBpbmNsdWRlL2FjcGkvZ2hlcy5oLg0KPiANCj4gSSBndWVzcy4gQWx0
aG91Z2ggYW4gZXZlbiByaWdodC1lciBmaXggd291bGQgYmUgdG8gbm90IGNhbGwNCj4gZ2hlc19l
ZGFjX3JlcG9ydF9tZW1fZXJyb3IoKSBhdCBhbGwsIGlmIHRoZXJlJ3Mgbm8gZ2hlc19lZGFjIG1v
ZHVsZQ0KPiByZWdpc3RlcmVkLg0KPiANCj4gVGhhdCB3b3VsZCBuZWVkIG1vcmUgZGlzc2VjdGlu
ZyB0aG91Z2guLi4uDQoNClRoYXQncyBjbGVhbmVyLCBidXQgd2lsbCBsZWFkIHRvIGEgbW9kdWxl
IG9yZGVyaW5nIGlzc3VlLi4uICBGb3Igbm93LA0KSSdkIHNpbXBseSBmaXggZ2hlc19lZGFjX3Jl
cG9ydF9tZW1fZXJyb3IoKSB0byBoYW5kbGUgdGhlICFwdnQgY2FzZSBhcyBhDQpub3JtYWwgY2Fz
ZS4NCg0KVGhhbmtzLA0KLVRvc2hp
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* EDAC, ghes: Do not register instance if platform is not whitelisted
@ 2018-04-27 20:54 Toshi Kani
  0 siblings, 0 replies; 6+ messages in thread
From: Toshi Kani @ 2018-04-27 20:54 UTC (permalink / raw)
  To: bp; +Cc: alex_gagniuc, tony.luck, james.morse, austin_bolen, linux-edac

On Fri, 2018-04-27 at 13:43 +0200, Borislav Petkov wrote:
> On Mon, Apr 23, 2018 at 08:35:31PM +0000, Kani, Toshi wrote:
> > On Mon, 2018-04-23 at 22:28 +0200, Borislav Petkov wrote:
> > > On Mon, Apr 23, 2018 at 08:13:23PM +0000, Kani, Toshi wrote:
> > > > Umm..., I do not think this change is correct.  This causes an error to
> > > > ghes_probe(), which disables GHES itself.  ghes and ghes_edac are
> > > > separate modules.  That is, ghes_edac works on top of ghes as an
> > > > optional module.
> > > > 
> > > > I think the right fix is to simply remove the error message.  This will
> > > > make it consistent with the case of CONFIG_EDAC_GHES disabled.  Please
> > > > see include/acpi/ghes.h.
> > > 
> > > I guess. Although an even right-er fix would be to not call
> > > ghes_edac_report_mem_error() at all, if there's no ghes_edac module
> > > registered.
> > > 
> > > That would need more dissecting though....
> > 
> > That's cleaner, but will lead to a module ordering issue...  For now,
> > I'd simply fix ghes_edac_report_mem_error() to handle the !pvt case as a
> > normal case.
> 
> Well, the !pvt case is not a normal case. The only justification for
> removing the pr_err() because of the hack how ghes_edac is grafted onto
> ghes.c
> 
> Actually, what I'm thinking is the below along with taking Sughosh's
> patch here:
> 
> https://marc.info/?l=linux-edac&m=152473783708615
> 
> Also, ghes_edac_register() needs to return an error when it does not hit
> the whitelist - there's no other way to see it.
> 
> Which means, ghes.c can register ghes_edac as the last thing it does and
> the outcome of that registration should not have any influence on GHES
> itself because ghes_edac does only reporting.

I haven't had chance to test your change, but it looks good to me.
Please also change ghes_edac_register() in include/acpi/ghes.h to return
-ENODEV as well.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* EDAC, ghes: Do not register instance if platform is not whitelisted
@ 2018-04-27 11:43 Borislav Petkov
  0 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2018-04-27 11:43 UTC (permalink / raw)
  To: Kani, Toshi
  Cc: alex_gagniuc, tony.luck, james.morse, austin_bolen, linux-edac

On Mon, Apr 23, 2018 at 08:35:31PM +0000, Kani, Toshi wrote:
> On Mon, 2018-04-23 at 22:28 +0200, Borislav Petkov wrote:
> > On Mon, Apr 23, 2018 at 08:13:23PM +0000, Kani, Toshi wrote:
> > > Umm..., I do not think this change is correct.  This causes an error to
> > > ghes_probe(), which disables GHES itself.  ghes and ghes_edac are
> > > separate modules.  That is, ghes_edac works on top of ghes as an
> > > optional module.
> > > 
> > > I think the right fix is to simply remove the error message.  This will
> > > make it consistent with the case of CONFIG_EDAC_GHES disabled.  Please
> > > see include/acpi/ghes.h.
> > 
> > I guess. Although an even right-er fix would be to not call
> > ghes_edac_report_mem_error() at all, if there's no ghes_edac module
> > registered.
> > 
> > That would need more dissecting though....
> 
> That's cleaner, but will lead to a module ordering issue...  For now,
> I'd simply fix ghes_edac_report_mem_error() to handle the !pvt case as a
> normal case.

Well, the !pvt case is not a normal case. The only justification for
removing the pr_err() because of the hack how ghes_edac is grafted onto
ghes.c

Actually, what I'm thinking is the below along with taking Sughosh's
patch here:

https://marc.info/?l=linux-edac&m=152473783708615

Also, ghes_edac_register() needs to return an error when it does not hit
the whitelist - there's no other way to see it.

Which means, ghes.c can register ghes_edac as the last thing it does and
the outcome of that registration should not have any influence on GHES
itself because ghes_edac does only reporting.
---
 drivers/acpi/apei/ghes.c | 14 ++++++--------
 drivers/edac/ghes_edac.c |  6 ++----
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 1efefe919555..88103333ee1b 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1087,10 +1087,6 @@ static int ghes_probe(struct platform_device *ghes_dev)
 		goto err;
 	}
 
-	rc = ghes_edac_register(ghes, &ghes_dev->dev);
-	if (rc < 0)
-		goto err;
-
 	switch (generic->notify.type) {
 	case ACPI_HEST_NOTIFY_POLLED:
 		timer_setup(&ghes->timer, ghes_poll_func, TIMER_DEFERRABLE);
@@ -1102,14 +1098,14 @@ static int ghes_probe(struct platform_device *ghes_dev)
 		if (rc) {
 			pr_err(GHES_PFX "Failed to map GSI to IRQ for generic hardware error source: %d\n",
 			       generic->header.source_id);
-			goto err_edac_unreg;
+			goto err;
 		}
 		rc = request_irq(ghes->irq, ghes_irq_func, IRQF_SHARED,
 				 "GHES IRQ", ghes);
 		if (rc) {
 			pr_err(GHES_PFX "Failed to register IRQ for generic hardware error source: %d\n",
 			       generic->header.source_id);
-			goto err_edac_unreg;
+			goto err;
 		}
 		break;
 
@@ -1132,14 +1128,16 @@ static int ghes_probe(struct platform_device *ghes_dev)
 	default:
 		BUG();
 	}
+
 	platform_set_drvdata(ghes_dev, ghes);
 
+	ghes_edac_register(ghes, &ghes_dev->dev);
+
 	/* Handle any pending errors right away */
 	ghes_proc(ghes);
 
 	return 0;
-err_edac_unreg:
-	ghes_edac_unregister(ghes);
+
 err:
 	if (ghes) {
 		ghes_fini(ghes);
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 68b6ee18bea6..7fdbfe871f5c 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -183,10 +183,8 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
 	char *p;
 	u8 grain_bits;
 
-	if (!pvt) {
-		pr_err("Internal error: Can't find EDAC structure\n");
+	if (!pvt)
 		return;
-	}
 
 	/*
 	 * We can do the locking below because GHES defers error processing
@@ -439,7 +437,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	/* Check if safe to enable on this system */
 	idx = acpi_match_platform_list(plat_list);
 	if (!force_load && idx < 0)
-		return 0;
+		return -ENODEV;
 
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* EDAC, ghes: Do not register instance if platform is not whitelisted
@ 2018-04-23 20:28 Borislav Petkov
  0 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2018-04-23 20:28 UTC (permalink / raw)
  To: Kani, Toshi
  Cc: tony.luck, alex_gagniuc, austin_bolen, james.morse, linux-edac

On Mon, Apr 23, 2018 at 08:13:23PM +0000, Kani, Toshi wrote:
> Umm..., I do not think this change is correct.  This causes an error to
> ghes_probe(), which disables GHES itself.  ghes and ghes_edac are
> separate modules.  That is, ghes_edac works on top of ghes as an
> optional module.
> 
> I think the right fix is to simply remove the error message.  This will
> make it consistent with the case of CONFIG_EDAC_GHES disabled.  Please
> see include/acpi/ghes.h.

I guess. Although an even right-er fix would be to not call
ghes_edac_report_mem_error() at all, if there's no ghes_edac module
registered.

That would need more dissecting though....

^ permalink raw reply	[flat|nested] 6+ messages in thread

* EDAC, ghes: Do not register instance if platform is not whitelisted
@ 2018-04-23 20:13 Toshi Kani
  0 siblings, 0 replies; 6+ messages in thread
From: Toshi Kani @ 2018-04-23 20:13 UTC (permalink / raw)
  To: tony.luck, bp; +Cc: alex_gagniuc, austin_bolen, james.morse, linux-edac

On Mon, 2018-04-23 at 14:27 +0200, Borislav Petkov wrote:
> Lemme write an official patch:
> 
> ---
> Tony reported seeing
> 
>   "Internal error: Can't find EDAC structure"
> 
> when injecting correctable errors.
> 
> The problem is that we should exit registration when platform doesn't
> match the whitelisted ones.
> 
> Do that.
> 
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Cc: Toshi Kani <toshi.kani@hpe.com>
> Fixes: 5deed6b6a479 ("EDAC, ghes: Add platform check")
> Link: https://lkml.kernel.org/r/20180420182015.zao3olss4tvvlxki@agluck-desk
> ---
>  drivers/edac/ghes_edac.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 68b6ee18bea6..4a9231993699 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -439,7 +439,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	/* Check if safe to enable on this system */
>  	idx = acpi_match_platform_list(plat_list);
>  	if (!force_load && idx < 0)
> -		return 0;
> +		return -ENODEV;
>  

Umm..., I do not think this change is correct.  This causes an error to
ghes_probe(), which disables GHES itself.  ghes and ghes_edac are
separate modules.  That is, ghes_edac works on top of ghes as an
optional module.

I think the right fix is to simply remove the error message.  This will
make it consistent with the case of CONFIG_EDAC_GHES disabled.  Please
see include/acpi/ghes.h.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* EDAC, ghes: Do not register instance if platform is not whitelisted
@ 2018-04-23 12:27 Borislav Petkov
  0 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2018-04-23 12:27 UTC (permalink / raw)
  To: Luck, Tony
  Cc: alex_gagniuc, austin_bolen, james.morse, linux-edac, Toshi Kani

Lemme write an official patch:
---
Tony reported seeing

  "Internal error: Can't find EDAC structure"

when injecting correctable errors.

The problem is that we should exit registration when platform doesn't
match the whitelisted ones.

Do that.

Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Fixes: 5deed6b6a479 ("EDAC, ghes: Add platform check")
Link: https://lkml.kernel.org/r/20180420182015.zao3olss4tvvlxki@agluck-desk
---
 drivers/edac/ghes_edac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 68b6ee18bea6..4a9231993699 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -439,7 +439,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	/* Check if safe to enable on this system */
 	idx = acpi_match_platform_list(plat_list);
 	if (!force_load && idx < 0)
-		return 0;
+		return -ENODEV;
 
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-04-27 20:54 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-23 20:35 EDAC, ghes: Do not register instance if platform is not whitelisted Toshi Kani
  -- strict thread matches above, loose matches on Subject: below --
2018-04-27 20:54 Toshi Kani
2018-04-27 11:43 Borislav Petkov
2018-04-23 20:28 Borislav Petkov
2018-04-23 20:13 Toshi Kani
2018-04-23 12:27 Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.