Linux-EDAC Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
@ 2019-10-14 17:19 James Morse
  2019-10-14 17:19 ` [PATCH 1/2] EDAC, ghes: Fix Use after free in ghes_edac remove path James Morse
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: James Morse @ 2019-10-14 17:19 UTC (permalink / raw)
  To: linux-edac
  Cc: Mauro Carvalho Chehab, Borislav Petkov, Tony Luck,
	Robert Richter, John Garry

Hello,

ghes_edac can only be registered once, later attempts will silently
do nothing as the driver is already setup. The unregister path also
only expects to be called once, but doesn't check.

This leads to KASAN splats if multiple GHES entries are unregistered,
as the free()d memory is dereferenced, and if we're lucky, free()d
a second time.

Link: lore.kernel.org/r/304df85b-8b56-b77e-1a11-aa23769f2e7c@huawei.com

Patch 1 is the minimum needed to prevent the dereference and double
free, but this does expose the lack of symmetry. If we unregister
one GHES entry, subsequent notifications will be lost.
Unregistering is unsafe if another CPU is using the free()d memory in
ghes_edac_report_mem_error().

To fix this, Patch 2 uses ghes_init as a reference count.

We can now unbind all the GHES entries, causing ghes_edac to be
unregistered, and start rebinding them again.


Thanks,

James Morse (2):
  EDAC, ghes: Fix Use after free in ghes_edac remove path
  EDAC, ghes: Reference count GHES users of ghes_edac

 drivers/edac/ghes_edac.c | 4 ++++
 1 file changed, 4 insertions(+)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] EDAC, ghes: Fix Use after free in ghes_edac remove path
  2019-10-14 17:19 [PATCH 0/2] EDAC, ghes: Fix use after free and add reference James Morse
@ 2019-10-14 17:19 ` James Morse
  2019-10-14 17:19 ` [PATCH 2/2] EDAC, ghes: Reference count GHES users of ghes_edac James Morse
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: James Morse @ 2019-10-14 17:19 UTC (permalink / raw)
  To: linux-edac
  Cc: Mauro Carvalho Chehab, Borislav Petkov, Tony Luck,
	Robert Richter, John Garry

ghes_edac models a single logical memory controller, and uses a global
ghes_init variable to ensure only the first ghes_edac_register() will
do anything.

ghes_edac is registered the first time a GHES entry in the HEST is probed.
There may be multiple entries, so subsequent attempts to register
ghes_edac are silently ignored as the work has already been done.

When a GHES entry is unregistered, it calls ghes_edac_unregister(), which
free()s the memory behind the global variables in ghes_edac.

... but there may be multiple GHES entries, the next call to
ghes_edac_unregister() will dereference the free()d memory, and
attempt to free it a second time.

This may also be triggered on a platform with one GHES entry, if the
driver is unbound/re-bound and unbound. The re-bind step will do
nothing because of ghes_init, the second unbind will then do the same
work as the first.

This was detected by KASAN and DEBUG_TEST_DRIVER_REMOVE.

Reported-by: John Garry <john.garry@huawei.com>
Link: lore.kernel.org/r/304df85b-8b56-b77e-1a11-aa23769f2e7c@huawei.com
Signed-off-by: James Morse <james.morse@arm.com>
Fixes: 0fe5f281f749 ("EDAC, ghes: Model a single, logical memory controller")
---
 drivers/edac/ghes_edac.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index d413a0bdc9ad..955b59b6aade 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -554,6 +554,7 @@ void ghes_edac_unregister(struct ghes *ghes)
 		return;
 
 	mci = ghes_pvt->mci;
+	ghes_pvt = NULL;
 	edac_mc_del_mc(mci->pdev);
 	edac_mc_free(mci);
 }
-- 
2.20.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/2] EDAC, ghes: Reference count GHES users of ghes_edac
  2019-10-14 17:19 [PATCH 0/2] EDAC, ghes: Fix use after free and add reference James Morse
  2019-10-14 17:19 ` [PATCH 1/2] EDAC, ghes: Fix Use after free in ghes_edac remove path James Morse
@ 2019-10-14 17:19 ` James Morse
  2019-10-14 17:30 ` [PATCH 0/2] EDAC, ghes: Fix use after free and add reference Borislav Petkov
  2019-10-15 13:25 ` Borislav Petkov
  3 siblings, 0 replies; 13+ messages in thread
From: James Morse @ 2019-10-14 17:19 UTC (permalink / raw)
  To: linux-edac
  Cc: Mauro Carvalho Chehab, Borislav Petkov, Tony Luck,
	Robert Richter, John Garry

ghes_edac only does work for the first GHES entry that registers it.
The same is true for unregister: first come first served. This lack
of symmetry is problematic if we have more than one GHES entry, as
ghes_edac_register() can only be called once, nothing decrements
ghes_init.

Doing the unregister work on the first call is unsafe, as another
CPU may be processing a notification in ghes_edac_report_mem_error(),
using the memory we are about to free.

ghes_init is already half of the reference counting. We only need
to do the register work for the first call, and the unregister work
for the last. Add the unregister check.

This means we no longer free ghes_edac's memory while there are
GHES entries that may receive a notification.

Signed-off-by: James Morse <james.morse@arm.com>
Fixes: 0fe5f281f749 ("EDAC, ghes: Model a single, logical memory controller")
---
 drivers/edac/ghes_edac.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 955b59b6aade..0bb62857ffb2 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -553,6 +553,9 @@ void ghes_edac_unregister(struct ghes *ghes)
 	if (!ghes_pvt)
 		return;
 
+	if (atomic_dec_return(&ghes_init))
+		return;
+
 	mci = ghes_pvt->mci;
 	ghes_pvt = NULL;
 	edac_mc_del_mc(mci->pdev);
-- 
2.20.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-14 17:19 [PATCH 0/2] EDAC, ghes: Fix use after free and add reference James Morse
  2019-10-14 17:19 ` [PATCH 1/2] EDAC, ghes: Fix Use after free in ghes_edac remove path James Morse
  2019-10-14 17:19 ` [PATCH 2/2] EDAC, ghes: Reference count GHES users of ghes_edac James Morse
@ 2019-10-14 17:30 ` Borislav Petkov
  2019-10-14 17:40   ` James Morse
  2019-10-15 13:25 ` Borislav Petkov
  3 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2019-10-14 17:30 UTC (permalink / raw)
  To: James Morse
  Cc: linux-edac, Mauro Carvalho Chehab, Tony Luck, Robert Richter, John Garry

On Mon, Oct 14, 2019 at 06:19:17PM +0100, James Morse wrote:
> ghes_edac can only be registered once, later attempts will silently
> do nothing as the driver is already setup. The unregister path also
> only expects to be called once, but doesn't check.
> 
> This leads to KASAN splats if multiple GHES entries are unregistered,
> as the free()d memory is dereferenced, and if we're lucky, free()d
> a second time.
> 
> Link: lore.kernel.org/r/304df85b-8b56-b77e-1a11-aa23769f2e7c@huawei.com
> 
> Patch 1 is the minimum needed to prevent the dereference and double
> free, but this does expose the lack of symmetry. If we unregister
> one GHES entry, subsequent notifications will be lost.
> Unregistering is unsafe if another CPU is using the free()d memory in
> ghes_edac_report_mem_error().
> 
> To fix this, Patch 2 uses ghes_init as a reference count.
> 
> We can now unbind all the GHES entries, causing ghes_edac to be
> unregistered, and start rebinding them again.
> 
> 
> Thanks,
> 
> James Morse (2):
>   EDAC, ghes: Fix Use after free in ghes_edac remove path
>   EDAC, ghes: Reference count GHES users of ghes_edac

Thanks for debugging and fixing this but why two patches?

One is perfectly fine IMO. Or?

Btw, I admit that the ghes_init thing is ugly as hell. ;-\

I wonder if it could be ripped out completely and we use only ghes_pvt
for controlling the single driver instance loading and unloading...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-14 17:30 ` [PATCH 0/2] EDAC, ghes: Fix use after free and add reference Borislav Petkov
@ 2019-10-14 17:40   ` James Morse
  2019-10-14 17:53     ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: James Morse @ 2019-10-14 17:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-edac, Mauro Carvalho Chehab, Tony Luck, Robert Richter, John Garry

Hi Boris,

On 14/10/2019 18:30, Borislav Petkov wrote:
> On Mon, Oct 14, 2019 at 06:19:17PM +0100, James Morse wrote:
>> ghes_edac can only be registered once, later attempts will silently
>> do nothing as the driver is already setup. The unregister path also
>> only expects to be called once, but doesn't check.
>>
>> This leads to KASAN splats if multiple GHES entries are unregistered,
>> as the free()d memory is dereferenced, and if we're lucky, free()d
>> a second time.
>>
>> Link: lore.kernel.org/r/304df85b-8b56-b77e-1a11-aa23769f2e7c@huawei.com
>>
>> Patch 1 is the minimum needed to prevent the dereference and double
>> free, but this does expose the lack of symmetry. If we unregister
>> one GHES entry, subsequent notifications will be lost.
>> Unregistering is unsafe if another CPU is using the free()d memory in
>> ghes_edac_report_mem_error().
>>
>> To fix this, Patch 2 uses ghes_init as a reference count.
>>
>> We can now unbind all the GHES entries, causing ghes_edac to be

> Thanks for debugging and fixing this but why two patches?
> 
> One is perfectly fine IMO. Or?

They can be merged together if you prefer.

Keeping both avoids the !pvt check in ghes_edac_report_mem_error() going wrong, but I'm
not entirely sure what that is trying to stop...

The possibility of races between notifications and unregister only occurred to me after
testing the first patch, so they ended up as different things in my head: I thought it
deserved its own commit log as its unrelated to the KASAN splat.


> Btw, I admit that the ghes_init thing is ugly as hell. ;-\

> I wonder if it could be ripped out completely and we use only ghes_pvt
> for controlling the single driver instance loading and unloading...

I think you need some kind of reference count to know how many more GHES.x there are out
there that may call unregister, otherwise you race with notifications.


Thanks,

James

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-14 17:40   ` James Morse
@ 2019-10-14 17:53     ` Borislav Petkov
  2019-10-16 15:17       ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2019-10-14 17:53 UTC (permalink / raw)
  To: James Morse
  Cc: linux-edac, Mauro Carvalho Chehab, Tony Luck, Robert Richter, John Garry

On Mon, Oct 14, 2019 at 06:40:33PM +0100, James Morse wrote:
> Keeping both avoids the !pvt check in ghes_edac_report_mem_error() going wrong, but I'm
> not entirely sure what that is trying to stop...

It must've been some sanity-check in

  f04c62a7036a ("ghes_edac: add support for reporting errors via EDAC")

> The possibility of races between notifications and unregister only occurred to me after
> testing the first patch, so they ended up as different things in my head: I thought it
> deserved its own commit log as its unrelated to the KASAN splat.

To me they fix two different aspects of the missing counting on unreg.

> I think you need some kind of reference count to know how many more GHES.x there are out
> there that may call unregister, otherwise you race with notifications.

Provided unregister cannot be called concurrently, the

        if (!ghes_pvt)
                return;

in ghes_edac_unregister() should suffice.

But just to be on the safe side, it could get an "instance_mutex" or so
under which ghes_pvt is set and cleared and then that silly counter can
simply go away.

Thoughts?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-14 17:19 [PATCH 0/2] EDAC, ghes: Fix use after free and add reference James Morse
                   ` (2 preceding siblings ...)
  2019-10-14 17:30 ` [PATCH 0/2] EDAC, ghes: Fix use after free and add reference Borislav Petkov
@ 2019-10-15 13:25 ` Borislav Petkov
  3 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2019-10-15 13:25 UTC (permalink / raw)
  To: James Morse
  Cc: linux-edac, Mauro Carvalho Chehab, Tony Luck, Robert Richter, John Garry

On Mon, Oct 14, 2019 at 06:19:17PM +0100, James Morse wrote:
> Hello,
> 
> ghes_edac can only be registered once, later attempts will silently
> do nothing as the driver is already setup. The unregister path also
> only expects to be called once, but doesn't check.
> 
> This leads to KASAN splats if multiple GHES entries are unregistered,
> as the free()d memory is dereferenced, and if we're lucky, free()d
> a second time.
> 
> Link: lore.kernel.org/r/304df85b-8b56-b77e-1a11-aa23769f2e7c@huawei.com
> 
> Patch 1 is the minimum needed to prevent the dereference and double
> free, but this does expose the lack of symmetry. If we unregister
> one GHES entry, subsequent notifications will be lost.
> Unregistering is unsafe if another CPU is using the free()d memory in
> ghes_edac_report_mem_error().
> 
> To fix this, Patch 2 uses ghes_init as a reference count.
> 
> We can now unbind all the GHES entries, causing ghes_edac to be
> unregistered, and start rebinding them again.
> 
> 
> Thanks,
> 
> James Morse (2):
>   EDAC, ghes: Fix Use after free in ghes_edac remove path
>   EDAC, ghes: Reference count GHES users of ghes_edac
> 
>  drivers/edac/ghes_edac.c | 4 ++++
>  1 file changed, 4 insertions(+)

Patches squashed and queued here:

https://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git/log/?h=edac-urgent

Will send it to Linus soonish as it is stable material, I guess.

In the meantime, we can iron the whole deal out and improve it.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-14 17:53     ` Borislav Petkov
@ 2019-10-16 15:17       ` Borislav Petkov
  2019-10-16 15:30         ` James Morse
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2019-10-16 15:17 UTC (permalink / raw)
  To: James Morse
  Cc: linux-edac, Mauro Carvalho Chehab, Tony Luck, Robert Richter, John Garry

On Mon, Oct 14, 2019 at 07:53:19PM +0200, Borislav Petkov wrote:
> Provided unregister cannot be called concurrently, the
> 
>         if (!ghes_pvt)
>                 return;
> 
> in ghes_edac_unregister() should suffice.
> 
> But just to be on the safe side, it could get an "instance_mutex" or so
> under which ghes_pvt is set and cleared and then that silly counter can
> simply go away.
> 
> Thoughts?

IOW, something simple and straight-forward like this:

---
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 0bb62857ffb2..b600f010fa0e 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -26,9 +26,11 @@ struct ghes_edac_pvt {
 	char msg[80];
 };
 
-static atomic_t ghes_init = ATOMIC_INIT(0);
 static struct ghes_edac_pvt *ghes_pvt;
 
+/* GHES instances reg/dereg mutex */
+static DEFINE_MUTEX(ghes_reg_mutex);
+
 /*
  * Sync with other, potentially concurrent callers of
  * ghes_edac_report_mem_error(). We don't know what the
@@ -461,7 +463,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
 	struct ghes_edac_dimm_fill dimm_fill;
-	int idx = -1;
+	int idx = -1, err = 0;
 
 	if (IS_ENABLED(CONFIG_X86)) {
 		/* Check if safe to enable on this system */
@@ -472,11 +474,13 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 		idx = 0;
 	}
 
+	mutex_lock(&ghes_reg_mutex);
+
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.
 	 */
-	if (atomic_inc_return(&ghes_init) > 1)
-		return 0;
+	if (ghes_pvt)
+		goto unlock;
 
 	/* Get the number of DIMMs */
 	dmi_walk(ghes_edac_count_dimms, &num_dimm);
@@ -494,7 +498,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
 	if (!mci) {
 		pr_info("Can't allocate memory for EDAC data\n");
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto unlock;
 	}
 
 	ghes_pvt	= mci->pvt_info;
@@ -541,23 +546,29 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	if (rc < 0) {
 		pr_info("Can't register at EDAC core\n");
 		edac_mc_free(mci);
-		return -ENODEV;
+		err = -ENODEV;
 	}
-	return 0;
+
+unlock:
+	mutex_unlock(&ghes_reg_mutex);
+
+	return err;
 }
 
 void ghes_edac_unregister(struct ghes *ghes)
 {
 	struct mem_ctl_info *mci;
 
-	if (!ghes_pvt)
-		return;
+	mutex_lock(&ghes_reg_mutex);
 
-	if (atomic_dec_return(&ghes_init))
-		return;
+	if (!ghes_pvt)
+		goto unlock;
 
 	mci = ghes_pvt->mci;
 	ghes_pvt = NULL;
 	edac_mc_del_mc(mci->pdev);
 	edac_mc_free(mci);
+
+unlock:
+	mutex_unlock(&ghes_reg_mutex);
 }


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-16 15:17       ` Borislav Petkov
@ 2019-10-16 15:30         ` James Morse
  2019-10-16 18:50           ` Borislav Petkov
  2019-10-22 13:25           ` Robert Richter
  0 siblings, 2 replies; 13+ messages in thread
From: James Morse @ 2019-10-16 15:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-edac, Mauro Carvalho Chehab, Tony Luck, Robert Richter, John Garry

Hi Boris,

On 16/10/2019 16:17, Borislav Petkov wrote:
> On Mon, Oct 14, 2019 at 07:53:19PM +0200, Borislav Petkov wrote:
>> Provided unregister cannot be called concurrently, the
>>
>>         if (!ghes_pvt)
>>                 return;
>>
>> in ghes_edac_unregister() should suffice.
>>
>> But just to be on the safe side, it could get an "instance_mutex" or so
>> under which ghes_pvt is set and cleared and then that silly counter can
>> simply go away.
>>
>> Thoughts?
> 
> IOW, something simple and straight-forward like this:


> ---
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 0bb62857ffb2..b600f010fa0e 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -26,9 +26,11 @@ struct ghes_edac_pvt {
>  	char msg[80];
>  };
>  
> -static atomic_t ghes_init = ATOMIC_INIT(0);
>  static struct ghes_edac_pvt *ghes_pvt;
>  
> +/* GHES instances reg/dereg mutex */
> +static DEFINE_MUTEX(ghes_reg_mutex);
> +
>  /*
>   * Sync with other, potentially concurrent callers of
>   * ghes_edac_report_mem_error(). We don't know what the
> @@ -461,7 +463,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	struct mem_ctl_info *mci;
>  	struct edac_mc_layer layers[1];
>  	struct ghes_edac_dimm_fill dimm_fill;
> -	int idx = -1;
> +	int idx = -1, err = 0;
>  
>  	if (IS_ENABLED(CONFIG_X86)) {
>  		/* Check if safe to enable on this system */
> @@ -472,11 +474,13 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  		idx = 0;
>  	}
>  
> +	mutex_lock(&ghes_reg_mutex);
> +
>  	/*
>  	 * We have only one logical memory controller to which all DIMMs belong.
>  	 */
> -	if (atomic_inc_return(&ghes_init) > 1)
> -		return 0;
> +	if (ghes_pvt)
> +		goto unlock;
>  
>  	/* Get the number of DIMMs */
>  	dmi_walk(ghes_edac_count_dimms, &num_dimm);
> @@ -494,7 +498,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
>  	if (!mci) {
>  		pr_info("Can't allocate memory for EDAC data\n");
> -		return -ENOMEM;
> +		err = -ENOMEM;
> +		goto unlock;
>  	}
>  
>  	ghes_pvt	= mci->pvt_info;
> @@ -541,23 +546,29 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	if (rc < 0) {
>  		pr_info("Can't register at EDAC core\n");
>  		edac_mc_free(mci);
> -		return -ENODEV;
> +		err = -ENODEV;
>  	}
> -	return 0;
> +
> +unlock:
> +	mutex_unlock(&ghes_reg_mutex);
> +
> +	return err;
>  }

There are a few more warts we should try and get rid of with this:
ghes_edac_register() publishes the ghes_pvt pointer under the mutex, but the irq handler
reads it without taking the mutex. (obviously it can't).

ghes_edac_register() publishes the pointer before its called edac_mc_add_mc(), which is
pleasant.

(sorry I've been sitting on this while I found new and exciting ways to break my test
machine!)

Combined with the spinlocky bits of:
--------------------------%<--------------------------
From 61fa061790fe7c19af25b25693b61bb75a498058 Mon Sep 17 00:00:00 2001
From: James Morse <james.morse@arm.com>
Date: Wed, 16 Oct 2019 10:02:15 +0100
Subject: [PATCH] EDAC, ghes: Move ghes_init and ghes_pvt under the ghes_lock

ghes_edac has an irqsave spinlock that protects the contents of ghes_pvt,
but not the pointer itself. The register, unregister and irq-handler
functions read this bare global variable expecting to see NULL or the
allocated value. Without READ_ONCE()/WRITE_ONCE(), this is racy.
ghes_edac_register() also publishes the pointer before it has registered
the mci.

Replace ghes_init with an unsigned int counter in ghes_pvt. To access
this or read the ghes_pvt pointer, you must hold the ghes_lock. This
prevents races when these values are modified.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/edac/ghes_edac.c | 69 ++++++++++++++++++++++++++++------------
 1 file changed, 48 insertions(+), 21 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 0bb62857ffb2..804bb07c6acf 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -15,7 +15,10 @@
 #include "edac_module.h"
 #include <ras/ras_event.h>

+/* Hold ghes_lock when accessing ghes_pvt */
 struct ghes_edac_pvt {
+	unsigned int users;
+
 	struct list_head list;
 	struct ghes *ghes;
 	struct mem_ctl_info *mci;
@@ -26,7 +29,6 @@ struct ghes_edac_pvt {
 	char msg[80];
 };

-static atomic_t ghes_init = ATOMIC_INIT(0);
 static struct ghes_edac_pvt *ghes_pvt;

 /*
@@ -79,9 +81,8 @@ static void ghes_edac_count_dimms(const struct dmi_header *dh, void *arg)
 		(*num_dimm)++;
 }

-static int get_dimm_smbios_index(u16 handle)
+static int get_dimm_smbios_index(u16 handle, struct mem_ctl_info *mci)
 {
-	struct mem_ctl_info *mci = ghes_pvt->mci;
 	int i;

 	for (i = 0; i < mci->tot_dimms; i++) {
@@ -197,15 +198,12 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err
*mem_err)
 {
 	enum hw_event_mc_err_type type;
 	struct edac_raw_error_desc *e;
+	struct ghes_edac_pvt *pvt;
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt = ghes_pvt;
 	unsigned long flags;
 	char *p;
 	u8 grain_bits;

-	if (!pvt)
-		return;
-
 	/*
 	 * We can do the locking below because GHES defers error processing
 	 * from NMI to IRQ context. Whenever that changes, we'd at least
@@ -215,6 +213,11 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err
*mem_err)
 		return;

 	spin_lock_irqsave(&ghes_lock, flags);
+	pvt = ghes_pvt;
+	if (!pvt) {
+		spin_unlock_irqrestore(&ghes_lock, flags);
+		return;
+	}

 	mci = pvt->mci;
 	e = &mci->error_desc;
@@ -348,7 +351,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 			p += sprintf(p, "DIMM DMI handle: 0x%.4x ",
 				     mem_err->mem_dev_handle);

-		index = get_dimm_smbios_index(mem_err->mem_dev_handle);
+		index = get_dimm_smbios_index(mem_err->mem_dev_handle, mci);
 		if (index >= 0) {
 			e->top_layer = index;
 			e->enable_per_layer_report = true;
@@ -457,8 +460,10 @@ static struct acpi_platform_list plat_list[] = {
 int ghes_edac_register(struct ghes *ghes, struct device *dev)
 {
 	bool fake = false;
+	unsigned long flags;
 	int rc, num_dimm = 0;
 	struct mem_ctl_info *mci;
+	struct ghes_edac_pvt *pvt;
 	struct edac_mc_layer layers[1];
 	struct ghes_edac_dimm_fill dimm_fill;
 	int idx = -1;
@@ -475,7 +480,13 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.
 	 */
-	if (atomic_inc_return(&ghes_init) > 1)
+	spin_lock_irqsave(&ghes_lock, flags);
+	pvt = ghes_pvt;
+	if (pvt)
+		pvt->users++;
+	spin_unlock_irqrestore(&ghes_lock, flags);
+
+	if (pvt)
 		return 0;

 	/* Get the number of DIMMs */
@@ -497,9 +508,10 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 		return -ENOMEM;
 	}

-	ghes_pvt	= mci->pvt_info;
-	ghes_pvt->ghes	= ghes;
-	ghes_pvt->mci	= mci;
+	pvt		= mci->pvt_info;
+	pvt->ghes	= ghes;
+	pvt->mci	= mci;
+	pvt->users	= 1;

 	mci->pdev = dev;
 	mci->mtype_cap = MEM_FLAG_EMPTY;
@@ -543,21 +555,36 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 		edac_mc_free(mci);
 		return -ENODEV;
 	}
+
+	spin_lock_irqsave(&ghes_lock, flags);
+	WARN_ON_ONCE(ghes_pvt);
+	ghes_pvt = pvt;
+	spin_unlock_irqrestore(&ghes_lock, flags);
+
 	return 0;
 }

 void ghes_edac_unregister(struct ghes *ghes)
 {
 	struct mem_ctl_info *mci;
+	bool do_free = false;
+	unsigned long flags;

-	if (!ghes_pvt)
-		return;
-
-	if (atomic_dec_return(&ghes_init))
-		return;
+	spin_lock_irqsave(&ghes_lock, flags);
+	if (ghes_pvt) {
+		ghes_pvt->users -= 1;
+
+		/* Are we the last user? */
+		if (!ghes_pvt->users) {
+			do_free = true;
+			mci = ghes_pvt->mci;
+			ghes_pvt = NULL;
+		}
+	}
+	spin_unlock_irqrestore(&ghes_lock, flags);

-	mci = ghes_pvt->mci;
-	ghes_pvt = NULL;
-	edac_mc_del_mc(mci->pdev);
-	edac_mc_free(mci);
+	if (do_free) {
+		edac_mc_del_mc(mci->pdev);
+		edac_mc_free(mci);
+	}
 }
-- 
2.20.1
--------------------------%<--------------------------



Thanks,

James

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-16 15:30         ` James Morse
@ 2019-10-16 18:50           ` Borislav Petkov
  2019-10-21  7:37             ` Borislav Petkov
  2019-10-22 13:25           ` Robert Richter
  1 sibling, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2019-10-16 18:50 UTC (permalink / raw)
  To: James Morse
  Cc: linux-edac, Mauro Carvalho Chehab, Tony Luck, Robert Richter, John Garry

On Wed, Oct 16, 2019 at 04:30:24PM +0100, James Morse wrote:
> There are a few more warts we should try and get rid of with this:
> ghes_edac_register() publishes the ghes_pvt pointer under the mutex, but the irq handler
> reads it without taking the mutex. (obviously it can't).
> 
> ghes_edac_register() publishes the pointer before its called edac_mc_add_mc(), which is
> pleasant.

Yeah, before we do this, lemme try to simplify the situation more. And
yeah, we don't need the mutex - we can use the spinlock only. But let's
get rid of the need to access the pvt in the IRQ handler. Yeah, we need
the *mci pointer but one can't have it all :)

Anyway, here's what I have, it is only build-tested. I wanna give it a
stern look tomorrow, on a clear head again:

---
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 0bb62857ffb2..2075e55d49ab 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -19,15 +19,16 @@ struct ghes_edac_pvt {
 	struct list_head list;
 	struct ghes *ghes;
 	struct mem_ctl_info *mci;
-
-	/* Buffers for the error handling routine */
-	char detail_location[240];
-	char other_detail[160];
-	char msg[80];
 };
 
 static atomic_t ghes_init = ATOMIC_INIT(0);
 static struct ghes_edac_pvt *ghes_pvt;
+static struct mem_ctl_info *ghes_mci;
+
+/* Buffers for the error handling routine */
+static char detail_location[240];
+static char other_detail[160];
+static char msg[80];
 
 /*
  * Sync with other, potentially concurrent callers of
@@ -196,15 +197,10 @@ static void ghes_edac_dmidecode(const struct dmi_header *dh, void *arg)
 void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 {
 	enum hw_event_mc_err_type type;
-	struct edac_raw_error_desc *e;
-	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt = ghes_pvt;
+	struct edac_raw_error_desc e;
 	unsigned long flags;
-	char *p;
 	u8 grain_bits;
-
-	if (!pvt)
-		return;
+	char *p;
 
 	/*
 	 * We can do the locking below because GHES defers error processing
@@ -216,20 +212,20 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 
 	spin_lock_irqsave(&ghes_lock, flags);
 
-	mci = pvt->mci;
-	e = &mci->error_desc;
+	if (!ghes_mci)
+		goto unlock;
 
 	/* Cleans the error report buffer */
-	memset(e, 0, sizeof (*e));
-	e->error_count = 1;
-	strcpy(e->label, "unknown label");
-	e->msg = pvt->msg;
-	e->other_detail = pvt->other_detail;
-	e->top_layer = -1;
-	e->mid_layer = -1;
-	e->low_layer = -1;
-	*pvt->other_detail = '\0';
-	*pvt->msg = '\0';
+	memset(&e, 0, sizeof (e));
+	e.error_count = 1;
+	strcpy(e.label, "unknown label");
+	e.msg = msg;
+	e.other_detail = other_detail;
+	e.top_layer = -1;
+	e.mid_layer = -1;
+	e.low_layer = -1;
+	*other_detail = '\0';
+	*msg = '\0';
 
 	switch (sev) {
 	case GHES_SEV_CORRECTED:
@@ -251,7 +247,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 
 	/* Error type, mapped on e->msg */
 	if (mem_err->validation_bits & CPER_MEM_VALID_ERROR_TYPE) {
-		p = pvt->msg;
+		p = msg;
 		switch (mem_err->error_type) {
 		case 0:
 			p += sprintf(p, "Unknown");
@@ -306,21 +302,21 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 				     mem_err->error_type);
 		}
 	} else {
-		strcpy(pvt->msg, "unknown error");
+		strcpy(msg, "unknown error");
 	}
 
 	/* Error address */
 	if (mem_err->validation_bits & CPER_MEM_VALID_PA) {
-		e->page_frame_number = mem_err->physical_addr >> PAGE_SHIFT;
-		e->offset_in_page = mem_err->physical_addr & ~PAGE_MASK;
+		e.page_frame_number = mem_err->physical_addr >> PAGE_SHIFT;
+		e.offset_in_page = mem_err->physical_addr & ~PAGE_MASK;
 	}
 
 	/* Error grain */
 	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK)
-		e->grain = ~(mem_err->physical_addr_mask & ~PAGE_MASK);
+		e.grain = ~(mem_err->physical_addr_mask & ~PAGE_MASK);
 
 	/* Memory error location, mapped on e->location */
-	p = e->location;
+	p = e.location;
 	if (mem_err->validation_bits & CPER_MEM_VALID_NODE)
 		p += sprintf(p, "node:%d ", mem_err->node);
 	if (mem_err->validation_bits & CPER_MEM_VALID_CARD)
@@ -337,12 +333,13 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 		p += sprintf(p, "col:%d ", mem_err->column);
 	if (mem_err->validation_bits & CPER_MEM_VALID_BIT_POSITION)
 		p += sprintf(p, "bit_pos:%d ", mem_err->bit_pos);
+
 	if (mem_err->validation_bits & CPER_MEM_VALID_MODULE_HANDLE) {
 		const char *bank = NULL, *device = NULL;
 		int index = -1;
 
 		dmi_memdev_name(mem_err->mem_dev_handle, &bank, &device);
-		if (bank != NULL && device != NULL)
+		if (bank && device)
 			p += sprintf(p, "DIMM location:%s %s ", bank, device);
 		else
 			p += sprintf(p, "DIMM DMI handle: 0x%.4x ",
@@ -350,16 +347,16 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 
 		index = get_dimm_smbios_index(mem_err->mem_dev_handle);
 		if (index >= 0) {
-			e->top_layer = index;
-			e->enable_per_layer_report = true;
+			e.top_layer = index;
+			e.enable_per_layer_report = true;
 		}
 
 	}
-	if (p > e->location)
+	if (p > e.location)
 		*(p - 1) = '\0';
 
 	/* All other fields are mapped on e->other_detail */
-	p = pvt->other_detail;
+	p = other_detail;
 	if (mem_err->validation_bits & CPER_MEM_VALID_ERROR_STATUS) {
 		u64 status = mem_err->error_status;
 
@@ -421,6 +418,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 			break;
 		}
 	}
+
 	if (mem_err->validation_bits & CPER_MEM_VALID_REQUESTOR_ID)
 		p += sprintf(p, "requestorID: 0x%016llx ",
 			     (long long)mem_err->requestor_id);
@@ -430,19 +428,21 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 	if (mem_err->validation_bits & CPER_MEM_VALID_TARGET_ID)
 		p += sprintf(p, "targetID: 0x%016llx ",
 			     (long long)mem_err->responder_id);
-	if (p > pvt->other_detail)
+	if (p > other_detail)
 		*(p - 1) = '\0';
 
 	/* Generate the trace event */
-	grain_bits = fls_long(e->grain);
-	snprintf(pvt->detail_location, sizeof(pvt->detail_location),
-		 "APEI location: %s %s", e->location, e->other_detail);
-	trace_mc_event(type, e->msg, e->label, e->error_count,
-		       mci->mc_idx, e->top_layer, e->mid_layer, e->low_layer,
-		       (e->page_frame_number << PAGE_SHIFT) | e->offset_in_page,
-		       grain_bits, e->syndrome, pvt->detail_location);
-
-	edac_raw_mc_handle_error(type, mci, e);
+	grain_bits = fls_long(e.grain);
+	snprintf(detail_location, sizeof(detail_location),
+		 "APEI location: %s %s", e.location, e.other_detail);
+	trace_mc_event(type, msg, e.label, e.error_count,
+		       0, e.top_layer, e.mid_layer, e.low_layer,
+		       (e.page_frame_number << PAGE_SHIFT) | e.offset_in_page,
+		       grain_bits, e.syndrome, detail_location);
+
+	edac_raw_mc_handle_error(type, ghes_mci, &e);
+
+unlock:
 	spin_unlock_irqrestore(&ghes_lock, flags);
 }
 
@@ -500,6 +500,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	ghes_pvt	= mci->pvt_info;
 	ghes_pvt->ghes	= ghes;
 	ghes_pvt->mci	= mci;
+	ghes_mci	= mci;
 
 	mci->pdev = dev;
 	mci->mtype_cap = MEM_FLAG_EMPTY;

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-16 18:50           ` Borislav Petkov
@ 2019-10-21  7:37             ` Borislav Petkov
  2019-10-21 10:52               ` Robert Richter
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2019-10-21  7:37 UTC (permalink / raw)
  To: James Morse
  Cc: linux-edac, Mauro Carvalho Chehab, Tony Luck, Robert Richter, John Garry

On Wed, Oct 16, 2019 at 08:50:41PM +0200, Borislav Petkov wrote:
> Yeah, before we do this, lemme try to simplify the situation more. And
> yeah, we don't need the mutex - we can use the spinlock only. But let's
> get rid of the need to access the pvt in the IRQ handler. Yeah, we need
> the *mci pointer but one can't have it all :)
> 
> Anyway, here's what I have, it is only build-tested. I wanna give it a
> stern look tomorrow, on a clear head again:

And now I went and redid your patch ontop and thus completely got rid of
ghes_pvt in favor of having one global ghes_mci pointer.

We access it only under the lock and we publish it in
ghes_edac_register() only in the success case. Can't get any simpler
than that.

Thoughts?

I think it is a lot cleaner and straight-forward this way. Lemme know if
you wanna run it on your setup and I'll push the two patches somewhere.

Thx.

---
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index bcb6a9c579c6..e55a9cb8ab73 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -15,14 +15,6 @@
 #include "edac_module.h"
 #include <ras/ras_event.h>
 
-struct ghes_edac_pvt {
-	struct list_head list;
-	struct ghes *ghes;
-	struct mem_ctl_info *mci;
-};
-
-static atomic_t ghes_init = ATOMIC_INIT(0);
-static struct ghes_edac_pvt *ghes_pvt;
 static struct mem_ctl_info *ghes_mci;
 
 /* Buffers for the error handling routine */
@@ -80,9 +72,8 @@ static void ghes_edac_count_dimms(const struct dmi_header *dh, void *arg)
 		(*num_dimm)++;
 }
 
-static int get_dimm_smbios_index(u16 handle)
+static int get_dimm_smbios_index(struct mem_ctl_info *mci, u16 handle)
 {
-	struct mem_ctl_info *mci = ghes_pvt->mci;
 	int i;
 
 	for (i = 0; i < mci->tot_dimms; i++) {
@@ -345,7 +336,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 			p += sprintf(p, "DIMM DMI handle: 0x%.4x ",
 				     mem_err->mem_dev_handle);
 
-		index = get_dimm_smbios_index(mem_err->mem_dev_handle);
+		index = get_dimm_smbios_index(ghes_mci, mem_err->mem_dev_handle);
 		if (index >= 0) {
 			e.top_layer = index;
 			e.enable_per_layer_report = true;
@@ -456,12 +447,13 @@ static struct acpi_platform_list plat_list[] = {
 
 int ghes_edac_register(struct ghes *ghes, struct device *dev)
 {
-	bool fake = false;
-	int rc, num_dimm = 0;
-	struct mem_ctl_info *mci;
-	struct edac_mc_layer layers[1];
 	struct ghes_edac_dimm_fill dimm_fill;
-	int idx = -1;
+	struct edac_mc_layer layers[1];
+	struct mem_ctl_info *mci;
+	int err = 0, idx = -1;
+	int rc, num_dimm = 0;
+	unsigned long flags;
+	bool fake = false;
 
 	if (IS_ENABLED(CONFIG_X86)) {
 		/* Check if safe to enable on this system */
@@ -475,8 +467,9 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.
 	 */
-	if (atomic_inc_return(&ghes_init) > 1)
-		return 0;
+	spin_lock_irqsave(&ghes_lock, flags);
+	if (ghes_mci)
+		goto unlock;
 
 	/* Get the number of DIMMs */
 	dmi_walk(ghes_edac_count_dimms, &num_dimm);
@@ -491,17 +484,13 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	layers[0].size = num_dimm;
 	layers[0].is_virt_csrow = true;
 
-	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, 0);
 	if (!mci) {
 		pr_info("Can't allocate memory for EDAC data\n");
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto unlock;
 	}
 
-	ghes_pvt	= mci->pvt_info;
-	ghes_pvt->ghes	= ghes;
-	ghes_pvt->mci	= mci;
-	ghes_mci	= mci;
-
 	mci->pdev = dev;
 	mci->mtype_cap = MEM_FLAG_EMPTY;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE;
@@ -542,23 +531,30 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	if (rc < 0) {
 		pr_info("Can't register at EDAC core\n");
 		edac_mc_free(mci);
-		return -ENODEV;
+		err = -ENODEV;
 	}
-	return 0;
+
+	ghes_mci = mci;
+
+unlock:
+	spin_unlock_irqrestore(&ghes_lock, flags);
+
+	return err;
 }
 
 void ghes_edac_unregister(struct ghes *ghes)
 {
-	struct mem_ctl_info *mci;
+	unsigned long flags;
 
-	if (!ghes_pvt)
-		return;
+	spin_lock_irqsave(&ghes_lock, flags);
 
-	if (atomic_dec_return(&ghes_init))
-		return;
+	if (!ghes_mci)
+		goto unlock;
 
-	mci = ghes_pvt->mci;
-	ghes_pvt = NULL;
-	edac_mc_del_mc(mci->pdev);
-	edac_mc_free(mci);
+	edac_mc_del_mc(ghes_mci->pdev);
+	edac_mc_free(ghes_mci);
+	ghes_mci = NULL;
+
+unlock:
+	spin_unlock_irqrestore(&ghes_lock, flags);
 }

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-21  7:37             ` Borislav Petkov
@ 2019-10-21 10:52               ` Robert Richter
  0 siblings, 0 replies; 13+ messages in thread
From: Robert Richter @ 2019-10-21 10:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: James Morse, linux-edac, Mauro Carvalho Chehab, Tony Luck, John Garry

Hi Boris,

On 21.10.19 09:37:58, Borislav Petkov wrote:
> On Wed, Oct 16, 2019 at 08:50:41PM +0200, Borislav Petkov wrote:
> > Yeah, before we do this, lemme try to simplify the situation more. And
> > yeah, we don't need the mutex - we can use the spinlock only. But let's
> > get rid of the need to access the pvt in the IRQ handler. Yeah, we need
> > the *mci pointer but one can't have it all :)
> > 
> > Anyway, here's what I have, it is only build-tested. I wanna give it a
> > stern look tomorrow, on a clear head again:
> 
> And now I went and redid your patch ontop and thus completely got rid of
> ghes_pvt in favor of having one global ghes_mci pointer.

please do not do that. While we have atm only one ghes instance
including a single mci, we will need in the future multiple mci
instances to reflect the memory layout in sysfs. I sent already a
patch for this and am going to resent an updated version, see:

 https://lore.kernel.org/patchwork/patch/1093472/

> We access it only under the lock and we publish it in
> ghes_edac_register() only in the success case. Can't get any simpler
> than that.
> 
> Thoughts?

So, could we just keep all data in struct ghes_edac_pvt instead of
global data?

> I think it is a lot cleaner and straight-forward this way. Lemme know if
> you wanna run it on your setup and I'll push the two patches somewhere.

Could we please properly repost patches for review? James last one got
lost as an answer in this mail thread, but contains fundamental
changes of data structures and locking (while the both initially
posted patches contained reasonable oneliners).

I will start reviewing James' last patch now.

Thanks,

-Robert

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] EDAC, ghes: Fix use after free and add reference
  2019-10-16 15:30         ` James Morse
  2019-10-16 18:50           ` Borislav Petkov
@ 2019-10-22 13:25           ` Robert Richter
  1 sibling, 0 replies; 13+ messages in thread
From: Robert Richter @ 2019-10-22 13:25 UTC (permalink / raw)
  To: James Morse
  Cc: Borislav Petkov, linux-edac, Mauro Carvalho Chehab, Tony Luck,
	John Garry

On 16.10.19 16:30:24, James Morse wrote:
> Hi Boris,
> 
> On 16/10/2019 16:17, Borislav Petkov wrote:
> > On Mon, Oct 14, 2019 at 07:53:19PM +0200, Borislav Petkov wrote:
> >> Provided unregister cannot be called concurrently, the
> >>
> >>         if (!ghes_pvt)
> >>                 return;

ghes_pvt should not be used at all outside the irq handler. It can
being protected by the mci device's refcount and as long mci exists,
ghes_pvt exists too (mci->pvt_info).

So a better fix is to use the mci's devices refcounter (get_device())
to prevent ghes_pvt from being freed until this pointer is invalidated
for sure. The mci can be freed then with just a put_device() call.  Of
course this needs a proper refcounting of all users and a release
handler. Since the mci or ghes_pvt is needed in the irq handler the
pointer to it can be set/unset when registering/releasing the mci.

See also below.

> >>
> >> in ghes_edac_unregister() should suffice.
> >>
> >> But just to be on the safe side, it could get an "instance_mutex" or so
> >> under which ghes_pvt is set and cleared and then that silly counter can
> >> simply go away.
> >>
> >> Thoughts?
> > 
> > IOW, something simple and straight-forward like this:
> 
> 
> > ---
> > diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> > index 0bb62857ffb2..b600f010fa0e 100644
> > --- a/drivers/edac/ghes_edac.c
> > +++ b/drivers/edac/ghes_edac.c
> > @@ -26,9 +26,11 @@ struct ghes_edac_pvt {
> >  	char msg[80];
> >  };
> >  
> > -static atomic_t ghes_init = ATOMIC_INIT(0);
> >  static struct ghes_edac_pvt *ghes_pvt;
> >  
> > +/* GHES instances reg/dereg mutex */
> > +static DEFINE_MUTEX(ghes_reg_mutex);
> > +
> >  /*
> >   * Sync with other, potentially concurrent callers of
> >   * ghes_edac_report_mem_error(). We don't know what the
> > @@ -461,7 +463,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
> >  	struct mem_ctl_info *mci;
> >  	struct edac_mc_layer layers[1];
> >  	struct ghes_edac_dimm_fill dimm_fill;
> > -	int idx = -1;
> > +	int idx = -1, err = 0;
> >  
> >  	if (IS_ENABLED(CONFIG_X86)) {
> >  		/* Check if safe to enable on this system */
> > @@ -472,11 +474,13 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
> >  		idx = 0;
> >  	}
> >  
> > +	mutex_lock(&ghes_reg_mutex);
> > +
> >  	/*
> >  	 * We have only one logical memory controller to which all DIMMs belong.
> >  	 */
> > -	if (atomic_inc_return(&ghes_init) > 1)
> > -		return 0;
> > +	if (ghes_pvt)
> > +		goto unlock;
> >  
> >  	/* Get the number of DIMMs */
> >  	dmi_walk(ghes_edac_count_dimms, &num_dimm);
> > @@ -494,7 +498,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
> >  	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
> >  	if (!mci) {
> >  		pr_info("Can't allocate memory for EDAC data\n");
> > -		return -ENOMEM;
> > +		err = -ENOMEM;
> > +		goto unlock;
> >  	}
> >  
> >  	ghes_pvt	= mci->pvt_info;
> > @@ -541,23 +546,29 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
> >  	if (rc < 0) {
> >  		pr_info("Can't register at EDAC core\n");
> >  		edac_mc_free(mci);
> > -		return -ENODEV;
> > +		err = -ENODEV;
> >  	}
> > -	return 0;
> > +
> > +unlock:
> > +	mutex_unlock(&ghes_reg_mutex);
> > +
> > +	return err;
> >  }
> 
> There are a few more warts we should try and get rid of with this:
> ghes_edac_register() publishes the ghes_pvt pointer under the mutex, but the irq handler
> reads it without taking the mutex. (obviously it can't).
> 
> ghes_edac_register() publishes the pointer before its called edac_mc_add_mc(), which is
> pleasant.
> 
> (sorry I've been sitting on this while I found new and exciting ways to break my test
> machine!)
> 
> Combined with the spinlocky bits of:
> --------------------------%<--------------------------
> >From 61fa061790fe7c19af25b25693b61bb75a498058 Mon Sep 17 00:00:00 2001
> From: James Morse <james.morse@arm.com>
> Date: Wed, 16 Oct 2019 10:02:15 +0100
> Subject: [PATCH] EDAC, ghes: Move ghes_init and ghes_pvt under the ghes_lock
> 
> ghes_edac has an irqsave spinlock that protects the contents of ghes_pvt,
> but not the pointer itself. The register, unregister and irq-handler
> functions read this bare global variable expecting to see NULL or the
> allocated value. Without READ_ONCE()/WRITE_ONCE(), this is racy.
> ghes_edac_register() also publishes the pointer before it has registered
> the mci.
> 
> Replace ghes_init with an unsigned int counter in ghes_pvt. To access
> this or read the ghes_pvt pointer, you must hold the ghes_lock. This
> prevents races when these values are modified.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/edac/ghes_edac.c | 69 ++++++++++++++++++++++++++++------------
>  1 file changed, 48 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 0bb62857ffb2..804bb07c6acf 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -15,7 +15,10 @@
>  #include "edac_module.h"
>  #include <ras/ras_event.h>
> 
> +/* Hold ghes_lock when accessing ghes_pvt */
>  struct ghes_edac_pvt {
> +	unsigned int users;
> +
>  	struct list_head list;
>  	struct ghes *ghes;
>  	struct mem_ctl_info *mci;
> @@ -26,7 +29,6 @@ struct ghes_edac_pvt {
>  	char msg[80];
>  };
> 
> -static atomic_t ghes_init = ATOMIC_INIT(0);
>  static struct ghes_edac_pvt *ghes_pvt;
> 
>  /*
> @@ -79,9 +81,8 @@ static void ghes_edac_count_dimms(const struct dmi_header *dh, void *arg)
>  		(*num_dimm)++;
>  }
> 
> -static int get_dimm_smbios_index(u16 handle)
> +static int get_dimm_smbios_index(u16 handle, struct mem_ctl_info *mci)
>  {
> -	struct mem_ctl_info *mci = ghes_pvt->mci;
>  	int i;
> 
>  	for (i = 0; i < mci->tot_dimms; i++) {
> @@ -197,15 +198,12 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err
> *mem_err)
>  {
>  	enum hw_event_mc_err_type type;
>  	struct edac_raw_error_desc *e;
> +	struct ghes_edac_pvt *pvt;
>  	struct mem_ctl_info *mci;
> -	struct ghes_edac_pvt *pvt = ghes_pvt;
>  	unsigned long flags;
>  	char *p;
>  	u8 grain_bits;
> 
> -	if (!pvt)
> -		return;
> -
>  	/*
>  	 * We can do the locking below because GHES defers error processing
>  	 * from NMI to IRQ context. Whenever that changes, we'd at least
> @@ -215,6 +213,11 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err
> *mem_err)
>  		return;
> 
>  	spin_lock_irqsave(&ghes_lock, flags);
> +	pvt = ghes_pvt;
> +	if (!pvt) {
> +		spin_unlock_irqrestore(&ghes_lock, flags);
> +		return;
> +	}
> 
>  	mci = pvt->mci;
>  	e = &mci->error_desc;
> @@ -348,7 +351,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
>  			p += sprintf(p, "DIMM DMI handle: 0x%.4x ",
>  				     mem_err->mem_dev_handle);
> 
> -		index = get_dimm_smbios_index(mem_err->mem_dev_handle);
> +		index = get_dimm_smbios_index(mem_err->mem_dev_handle, mci);
>  		if (index >= 0) {
>  			e->top_layer = index;
>  			e->enable_per_layer_report = true;
> @@ -457,8 +460,10 @@ static struct acpi_platform_list plat_list[] = {
>  int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  {
>  	bool fake = false;
> +	unsigned long flags;
>  	int rc, num_dimm = 0;
>  	struct mem_ctl_info *mci;
> +	struct ghes_edac_pvt *pvt;
>  	struct edac_mc_layer layers[1];
>  	struct ghes_edac_dimm_fill dimm_fill;
>  	int idx = -1;
> @@ -475,7 +480,13 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	/*
>  	 * We have only one logical memory controller to which all DIMMs belong.
>  	 */
> -	if (atomic_inc_return(&ghes_init) > 1)
> +	spin_lock_irqsave(&ghes_lock, flags);
> +	pvt = ghes_pvt;
> +	if (pvt)
> +		pvt->users++;

As written above, we should better use the mci's refcount here.

> +	spin_unlock_irqrestore(&ghes_lock, flags);
> +
> +	if (pvt)
>  		return 0;
> 
>  	/* Get the number of DIMMs */
> @@ -497,9 +508,10 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  		return -ENOMEM;
>  	}
> 
> -	ghes_pvt	= mci->pvt_info;
> -	ghes_pvt->ghes	= ghes;
> -	ghes_pvt->mci	= mci;
> +	pvt		= mci->pvt_info;
> +	pvt->ghes	= ghes;
> +	pvt->mci	= mci;
> +	pvt->users	= 1;
> 
>  	mci->pdev = dev;
>  	mci->mtype_cap = MEM_FLAG_EMPTY;
> @@ -543,21 +555,36 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  		edac_mc_free(mci);
>  		return -ENODEV;
>  	}
> +
> +	spin_lock_irqsave(&ghes_lock, flags);
> +	WARN_ON_ONCE(ghes_pvt);
> +	ghes_pvt = pvt;
> +	spin_unlock_irqrestore(&ghes_lock, flags);

This still is not safe as ghes_pvt could being setup by another
instance in between.

> +
>  	return 0;
>  }
> 
>  void ghes_edac_unregister(struct ghes *ghes)
>  {
>  	struct mem_ctl_info *mci;
> +	bool do_free = false;
> +	unsigned long flags;
> 
> -	if (!ghes_pvt)
> -		return;
> -
> -	if (atomic_dec_return(&ghes_init))
> -		return;
> +	spin_lock_irqsave(&ghes_lock, flags);
> +	if (ghes_pvt) {
> +		ghes_pvt->users -= 1;
> +
> +		/* Are we the last user? */
> +		if (!ghes_pvt->users) {
> +			do_free = true;
> +			mci = ghes_pvt->mci;
> +			ghes_pvt = NULL;
> +		}
> +	}
> +	spin_unlock_irqrestore(&ghes_lock, flags);
> 
> -	mci = ghes_pvt->mci;
> -	ghes_pvt = NULL;
> -	edac_mc_del_mc(mci->pdev);
> -	edac_mc_free(mci);
> +	if (do_free) {
> +		edac_mc_del_mc(mci->pdev);
> +		edac_mc_free(mci);
> +	}

This does not look straight forward here, it is exactly what a
device release function is for.

-Robert

>  }
> -- 
> 2.20.1
> --------------------------%<--------------------------
> 
> 
> 
> Thanks,
> 
> James

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, back to index

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-14 17:19 [PATCH 0/2] EDAC, ghes: Fix use after free and add reference James Morse
2019-10-14 17:19 ` [PATCH 1/2] EDAC, ghes: Fix Use after free in ghes_edac remove path James Morse
2019-10-14 17:19 ` [PATCH 2/2] EDAC, ghes: Reference count GHES users of ghes_edac James Morse
2019-10-14 17:30 ` [PATCH 0/2] EDAC, ghes: Fix use after free and add reference Borislav Petkov
2019-10-14 17:40   ` James Morse
2019-10-14 17:53     ` Borislav Petkov
2019-10-16 15:17       ` Borislav Petkov
2019-10-16 15:30         ` James Morse
2019-10-16 18:50           ` Borislav Petkov
2019-10-21  7:37             ` Borislav Petkov
2019-10-21 10:52               ` Robert Richter
2019-10-22 13:25           ` Robert Richter
2019-10-15 13:25 ` Borislav Petkov

Linux-EDAC Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-edac/0 linux-edac/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-edac linux-edac/ https://lore.kernel.org/linux-edac \
		linux-edac@vger.kernel.org
	public-inbox-index linux-edac

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-edac


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git