linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Robert Richter <rrichter@marvell.com>
To: Borislav Petkov <bp@alien8.de>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Tony Luck <tony.luck@intel.com>
Cc: James Morse <james.morse@arm.com>,
	Aristeu Rozanski <aris@redhat.com>,
	Robert Richter <rrichter@marvell.com>,
	<linux-edac@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	John Garry <john.garry@huawei.com>
Subject: [PATCH v3 2/4] EDAC/mc: Fix use-after-free and memleaks during device removal
Date: Wed, 12 Feb 2020 13:03:38 +0100	[thread overview]
Message-ID: <20200212120340.4764-3-rrichter@marvell.com> (raw)
In-Reply-To: <20200212120340.4764-1-rrichter@marvell.com>

A test kernel with the options set below revealed several issues when
removing a mci device:

 DEBUG_TEST_DRIVER_REMOVE
 KASAN
 DEBUG_KMEMLEAK

Issues seen:

1) Use-after-free:

On 27.11.19 17:07:33, John Garry wrote:
> [   22.104498] BUG: KASAN: use-after-free in
> edac_remove_sysfs_mci_device+0x148/0x180

The use-after-free is caused by the mci_for_each_dimm() iterator that
is called in edac_remove_sysfs_mci_device(). The iterator was
introduced with commit c498afaf7df8 ("EDAC: Introduce an
mci_for_each_dimm() iterator"). The iterator loop calls function
device_unregister(&dimm->dev), which removes the sysfs entry of the
device, but also frees the dimm struct in dimm_attr_release(). When
incrementing the loop in mci_for_each_dimm(), the dimm struct is
accessed again, but it is already freed.

The fix is to free all the mci device's subsequent dimm and csrow
objects at a later point in _edac_mc_free() when the mci device is
freed. This keeps the data structures intact and the mci device can be
fully used until its removal. The change allows the save usage of
mci_for_each_dimm() to release dimm devices from sysfs.

2) Memory leaks:

Following memory leaks have been detected:

 # grep edac /sys/kernel/debug/kmemleak | sort | uniq -c
       1     [<000000003c0f58f9>] edac_mc_alloc+0x3bc/0x9d0      # mci->csrows
      16     [<00000000bb932dc0>] edac_mc_alloc+0x49c/0x9d0      # csr->channels
      16     [<00000000e2734dba>] edac_mc_alloc+0x518/0x9d0      # csr->channels[chn]
       1     [<00000000eb040168>] edac_mc_alloc+0x5c8/0x9d0      # mci->dimms
      34     [<00000000ef737c29>] ghes_edac_register+0x1c8/0x3f8 # see edac_mc_alloc()

All leaks are from memory created by edac_mc_alloc().

Note: The test above shows that edac_mc_alloc() was called here from
ghes_edac_register(), thus both functions show up in the stack dump,
but the driver causing the leaks is edac_mc. The comments with the
data structures involved were made manually by analyzing the objdump.

The data structures listed above and created by edac_mc_alloc() are
not properly removed during device removal, which is done in
edac_mc_free(). There are two paths implemented to remove the device
depending on device registration, _edac_mc_free() is called if the
device is not registered and edac_unregister_sysfs() otherwise. The
implemenations differ. For the sysfs case the mci device removal lacks
the removal of subsequent data structures (csrows, channels, dimms).
This causes the memory leaks (see mci_attr_release()).

Fixing this as follows:

Unify code and use the _edac_mc_free() code path to free the mci
struct and subsequent memory allocations. An effect of this is that no
data is freed in edac_mc_sysfs.c (except the "mc" sysfs root node).

The patch has been tested with the above kernel options, no issues
seen any longer.

To backport this patch to 5.4+ kernels, the previous revert patch
needs to be applied too or squashed with this patch to avoid
conflicts.

Reported-by: John Garry <john.garry@huawei.com>
Fixes: c498afaf7df8 ("EDAC: Introduce an mci_for_each_dimm() iterator")
Fixes: faa2ad09c01c ("edac_mc: edac_mc_free() cannot assume mem_ctl_info is registered in sysfs.")
Fixes: 7a623c039075 ("edac: rewrite the sysfs code to use struct device")
Signed-off-by: Robert Richter <rrichter@marvell.com>
---
 drivers/edac/edac_mc.c       | 12 +++---------
 drivers/edac/edac_mc_sysfs.c | 15 +++------------
 2 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 7243b88f81d8..69e0d90460e6 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -505,16 +505,10 @@ void edac_mc_free(struct mem_ctl_info *mci)
 {
 	edac_dbg(1, "\n");
 
-	/* If we're not yet registered with sysfs free only what was allocated
-	 * in edac_mc_alloc().
-	 */
-	if (!device_is_registered(&mci->dev)) {
-		_edac_mc_free(mci);
-		return;
-	}
+	if (device_is_registered(&mci->dev))
+		edac_unregister_sysfs(mci);
 
-	/* the mci instance is freed here, when the sysfs object is dropped */
-	edac_unregister_sysfs(mci);
+	_edac_mc_free(mci);
 }
 EXPORT_SYMBOL_GPL(edac_mc_free);
 
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index ba0937140fe4..1c9c6a7b9f66 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -276,10 +276,7 @@ static const struct attribute_group *csrow_attr_groups[] = {
 
 static void csrow_attr_release(struct device *dev)
 {
-	struct csrow_info *csrow = container_of(dev, struct csrow_info, dev);
-
-	edac_dbg(1, "Releasing csrow device %s\n", dev_name(dev));
-	kfree(csrow);
+	/* release device with _edac_mc_free() */
 }
 
 static const struct device_type csrow_attr_type = {
@@ -608,10 +605,7 @@ static const struct attribute_group *dimm_attr_groups[] = {
 
 static void dimm_attr_release(struct device *dev)
 {
-	struct dimm_info *dimm = container_of(dev, struct dimm_info, dev);
-
-	edac_dbg(1, "Releasing dimm device %s\n", dev_name(dev));
-	kfree(dimm);
+	/* release device with _edac_mc_free() */
 }
 
 static const struct device_type dimm_attr_type = {
@@ -893,10 +887,7 @@ static const struct attribute_group *mci_attr_groups[] = {
 
 static void mci_attr_release(struct device *dev)
 {
-	struct mem_ctl_info *mci = container_of(dev, struct mem_ctl_info, dev);
-
-	edac_dbg(1, "Releasing csrow device %s\n", dev_name(dev));
-	kfree(mci);
+	/* release device with _edac_mc_free() */
 }
 
 static const struct device_type mci_attr_type = {
-- 
2.20.1


  parent reply	other threads:[~2020-02-12 12:04 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-12 12:03 [PATCH v3 0/4] EDAC/mc: Fixes for mci device removal Robert Richter
2020-02-12 12:03 ` [PATCH v3 1/4] Revert parts of "EDAC/mc_sysfs: Make debug messages consistent" Robert Richter
2020-02-12 12:03 ` Robert Richter [this message]
2020-02-12 12:03 ` [PATCH v3 3/4] EDAC/sysfs: Remove csrow objects on errors Robert Richter
2020-02-12 12:03 ` [PATCH v3 4/4] EDAC/mc: Change mci device removal to use put_device() Robert Richter
2020-02-13 11:05 ` [PATCH v3 0/4] EDAC/mc: Fixes for mci device removal Borislav Petkov
2020-02-13 11:10   ` John Garry
2020-02-13 12:08     ` John Garry
2020-02-13 12:28       ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200212120340.4764-3-rrichter@marvell.com \
    --to=rrichter@marvell.com \
    --cc=aris@redhat.com \
    --cc=bp@alien8.de \
    --cc=james.morse@arm.com \
    --cc=john.garry@huawei.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchehab@kernel.org \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).