All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Reinette Chatre <reinette.chatre@intel.com>,
	Xiaochen Shen <xiaochen.shen@intel.com>,
	Borislav Petkov <bp@suse.de>, Tony Luck <tony.luck@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 02/70] x86/resctrl: Fix use-after-free when deleting resource groups
Date: Mon,  3 Feb 2020 16:19:14 +0000	[thread overview]
Message-ID: <20200203161912.599819041@linuxfoundation.org> (raw)
In-Reply-To: <20200203161912.158976871@linuxfoundation.org>

From: Xiaochen Shen <xiaochen.shen@intel.com>

commit b8511ccc75c033f6d54188ea4df7bf1e85778740 upstream.

A resource group (rdtgrp) contains a reference count (rdtgrp->waitcount)
that indicates how many waiters expect this rdtgrp to exist. Waiters
could be waiting on rdtgroup_mutex or some work sitting on a task's
workqueue for when the task returns from kernel mode or exits.

The deletion of a rdtgrp is intended to have two phases:

  (1) while holding rdtgroup_mutex the necessary cleanup is done and
  rdtgrp->flags is set to RDT_DELETED,

  (2) after releasing the rdtgroup_mutex, the rdtgrp structure is freed
  only if there are no waiters and its flag is set to RDT_DELETED. Upon
  gaining access to rdtgroup_mutex or rdtgrp, a waiter is required to check
  for the RDT_DELETED flag.

When unmounting the resctrl file system or deleting ctrl_mon groups,
all of the subdirectories are removed and the data structure of rdtgrp
is forcibly freed without checking rdtgrp->waitcount. If at this point
there was a waiter on rdtgrp then a use-after-free issue occurs when the
waiter starts running and accesses the rdtgrp structure it was waiting
on.

See kfree() calls in [1], [2] and [3] in these two call paths in
following scenarios:
(1) rdt_kill_sb() -> rmdir_all_sub() -> free_all_child_rdtgrp()
(2) rdtgroup_rmdir() -> rdtgroup_rmdir_ctrl() -> free_all_child_rdtgrp()

There are several scenarios that result in use-after-free issue in
following:

Scenario 1:
-----------
In Thread 1, rdtgroup_tasks_write() adds a task_work callback
move_myself(). If move_myself() is scheduled to execute after Thread 2
rdt_kill_sb() is finished, referring to earlier rdtgrp memory
(rdtgrp->waitcount) which was already freed in Thread 2 results in
use-after-free issue.

Thread 1 (rdtgroup_tasks_write)        Thread 2 (rdt_kill_sb)
-------------------------------        ----------------------
rdtgroup_kn_lock_live
  atomic_inc(&rdtgrp->waitcount)
  mutex_lock
rdtgroup_move_task
  __rdtgroup_move_task
    /*
     * Take an extra refcount, so rdtgrp cannot be freed
     * before the call back move_myself has been invoked
     */
    atomic_inc(&rdtgrp->waitcount)
    /* Callback move_myself will be scheduled for later */
    task_work_add(move_myself)
rdtgroup_kn_unlock
  mutex_unlock
  atomic_dec_and_test(&rdtgrp->waitcount)
  && (flags & RDT_DELETED)
                                       mutex_lock
                                       rmdir_all_sub
                                         /*
                                          * sentry and rdtgrp are freed
                                          * without checking refcount
                                          */
                                         free_all_child_rdtgrp
                                           kfree(sentry)*[1]
                                         kfree(rdtgrp)*[2]
                                       mutex_unlock
/*
 * Callback is scheduled to execute
 * after rdt_kill_sb is finished
 */
move_myself
  /*
   * Use-after-free: refer to earlier rdtgrp
   * memory which was freed in [1] or [2].
   */
  atomic_dec_and_test(&rdtgrp->waitcount)
  && (flags & RDT_DELETED)
    kfree(rdtgrp)

Scenario 2:
-----------
In Thread 1, rdtgroup_tasks_write() adds a task_work callback
move_myself(). If move_myself() is scheduled to execute after Thread 2
rdtgroup_rmdir() is finished, referring to earlier rdtgrp memory
(rdtgrp->waitcount) which was already freed in Thread 2 results in
use-after-free issue.

Thread 1 (rdtgroup_tasks_write)        Thread 2 (rdtgroup_rmdir)
-------------------------------        -------------------------
rdtgroup_kn_lock_live
  atomic_inc(&rdtgrp->waitcount)
  mutex_lock
rdtgroup_move_task
  __rdtgroup_move_task
    /*
     * Take an extra refcount, so rdtgrp cannot be freed
     * before the call back move_myself has been invoked
     */
    atomic_inc(&rdtgrp->waitcount)
    /* Callback move_myself will be scheduled for later */
    task_work_add(move_myself)
rdtgroup_kn_unlock
  mutex_unlock
  atomic_dec_and_test(&rdtgrp->waitcount)
  && (flags & RDT_DELETED)
                                       rdtgroup_kn_lock_live
                                         atomic_inc(&rdtgrp->waitcount)
                                         mutex_lock
                                       rdtgroup_rmdir_ctrl
                                         free_all_child_rdtgrp
                                           /*
                                            * sentry is freed without
                                            * checking refcount
                                            */
                                           kfree(sentry)*[3]
                                         rdtgroup_ctrl_remove
                                           rdtgrp->flags = RDT_DELETED
                                       rdtgroup_kn_unlock
                                         mutex_unlock
                                         atomic_dec_and_test(
                                                     &rdtgrp->waitcount)
                                         && (flags & RDT_DELETED)
                                           kfree(rdtgrp)
/*
 * Callback is scheduled to execute
 * after rdt_kill_sb is finished
 */
move_myself
  /*
   * Use-after-free: refer to earlier rdtgrp
   * memory which was freed in [3].
   */
  atomic_dec_and_test(&rdtgrp->waitcount)
  && (flags & RDT_DELETED)
    kfree(rdtgrp)

If CONFIG_DEBUG_SLAB=y, Slab corruption on kmalloc-2k can be observed
like following. Note that "0x6b" is POISON_FREE after kfree(). The
corrupted bits "0x6a", "0x64" at offset 0x424 correspond to
waitcount member of struct rdtgroup which was freed:

  Slab corruption (Not tainted): kmalloc-2k start=ffff9504c5b0d000, len=2048
  420: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkjkkkkkkkkkkk
  Single bit error detected. Probably bad RAM.
  Run memtest86+ or a similar memory test tool.
  Next obj: start=ffff9504c5b0d800, len=2048
  000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

  Slab corruption (Not tainted): kmalloc-2k start=ffff9504c58ab800, len=2048
  420: 6b 6b 6b 6b 64 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkdkkkkkkkkkkk
  Prev obj: start=ffff9504c58ab000, len=2048
  000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Fix this by taking reference count (waitcount) of rdtgrp into account in
the two call paths that currently do not do so. Instead of always
freeing the resource group it will only be freed if there are no waiters
on it. If there are waiters, the resource group will have its flags set
to RDT_DELETED.

It will be left to the waiter to free the resource group when it starts
running and finding that it was the last waiter and the resource group
has been removed (rdtgrp->flags & RDT_DELETED) since. (1) rdt_kill_sb()
-> rmdir_all_sub() -> free_all_child_rdtgrp() (2) rdtgroup_rmdir() ->
rdtgroup_rmdir_ctrl() -> free_all_child_rdtgrp()

Backporting notes:

Since upstream commit fa7d949337cc ("x86/resctrl: Rename and move rdt
files to a separate directory"), the file
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c has been renamed and moved to
arch/x86/kernel/cpu/resctrl/rdtgroup.c.

Apply the change against file arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
in older stable trees.

Fixes: f3cbeacaa06e ("x86/intel_rdt/cqm: Add rmdir support")
Fixes: 60cf5e101fd4 ("x86/intel_rdt: Add mkdir to resctrl file system")
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1578500886-21771-2-git-send-email-xiaochen.shen@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 841a0246eb897..db22ba0bf9167 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -2167,7 +2167,11 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
 	list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
 		free_rmid(sentry->mon.rmid);
 		list_del(&sentry->mon.crdtgrp_list);
-		kfree(sentry);
+
+		if (atomic_read(&sentry->waitcount) != 0)
+			sentry->flags = RDT_DELETED;
+		else
+			kfree(sentry);
 	}
 }
 
@@ -2205,7 +2209,11 @@ static void rmdir_all_sub(void)
 
 		kernfs_remove(rdtgrp->kn);
 		list_del(&rdtgrp->rdtgroup_list);
-		kfree(rdtgrp);
+
+		if (atomic_read(&rdtgrp->waitcount) != 0)
+			rdtgrp->flags = RDT_DELETED;
+		else
+			kfree(rdtgrp);
 	}
 	/* Notify online CPUs to update per cpu storage and PQR_ASSOC MSR */
 	update_closid_rmid(cpu_online_mask, &rdtgroup_default);
-- 
2.20.1




  parent reply	other threads:[~2020-02-03 16:31 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-03 16:19 [PATCH 4.19 00/70] 4.19.102-stable review Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 01/70] vfs: fix do_last() regression Greg Kroah-Hartman
2020-02-03 16:19 ` Greg Kroah-Hartman [this message]
2020-02-03 16:19 ` [PATCH 4.19 03/70] x86/resctrl: Fix use-after-free due to inaccurate refcount of rdtgroup Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 04/70] x86/resctrl: Fix a deadlock due to inaccurate reference Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 05/70] crypto: pcrypt - Fix user-after-free on module unload Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 06/70] rsi: add hci detach for hibernation and poweroff Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 07/70] rsi: fix use-after-free on failed probe and unbind Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 08/70] perf c2c: Fix return type for histogram sorting comparision functions Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 09/70] PM / devfreq: Add new name attribute for sysfs Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 10/70] tools lib: Fix builds when glibc contains strlcpy() Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 11/70] arm64: kbuild: remove compressed images on make ARCH=arm64 (dist)clean Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 12/70] ext4: validate the debug_want_extra_isize mount option at parse time Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 13/70] mm/mempolicy.c: fix out of bounds write in mpol_parse_str() Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 14/70] reiserfs: Fix memory leak of journal device string Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 15/70] media: digitv: dont continue if remote control state cant be read Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 16/70] media: af9005: uninitialized variable printked Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 17/70] media: vp7045: do not read uninitialized values if usb transfer fails Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 18/70] media: gspca: zero usb_buf Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 19/70] media: dvb-usb/dvb-usb-urb.c: initialize actlen to 0 Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 20/70] tomoyo: Use atomic_t for statistics counter Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 21/70] ttyprintk: fix a potential deadlock in interrupt context issue Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 22/70] Bluetooth: Fix race condition in hci_release_sock() Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 23/70] cgroup: Prevent double killing of css when enabling threaded cgroup Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 24/70] media: si470x-i2c: Move free() past last use of radio Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 25/70] ARM: dts: sun8i: a83t: Correct USB3503 GPIOs polarity Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 26/70] ARM: dts: am57xx-beagle-x15/am57xx-idk: Remove "gpios" for endpoint dt nodes Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 27/70] ARM: dts: beagle-x15-common: Model 5V0 regulator Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 28/70] soc: ti: wkup_m3_ipc: Fix race condition with rproc_boot Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 29/70] tools lib traceevent: Fix memory leakage in filter_event Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 30/70] rseq: Unregister rseq for clone CLONE_VM Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 31/70] clk: sunxi-ng: h6-r: Fix AR100/R_APB2 parent order Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 32/70] mac80211: mesh: restrict airtime metric to peered established plinks Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 33/70] clk: mmp2: Fix the order of timer mux parents Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 34/70] ASoC: rt5640: Fix NULL dereference on module unload Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 35/70] ixgbevf: Remove limit of 10 entries for unicast filter list Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 36/70] ixgbe: Fix calculation of queue with VFs and flow director on interface flap Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 37/70] igb: Fix SGMII SFP module discovery for 100FX/LX Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 38/70] platform/x86: GPD pocket fan: Allow somewhat lower/higher temperature limits Greg Kroah-Hartman
2020-02-04 19:12   ` Pavel Machek
2020-02-03 16:19 ` [PATCH 4.19 39/70] ASoC: sti: fix possible sleep-in-atomic Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 40/70] qmi_wwan: Add support for Quectel RM500Q Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 41/70] parisc: Use proper printk format for resource_size_t Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 42/70] wireless: fix enabling channel 12 for custom regulatory domain Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 43/70] cfg80211: Fix radar event during another phy CAC Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 44/70] mac80211: Fix TKIP replay protection immediately after key setup Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 45/70] wireless: wext: avoid gcc -O3 warning Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 46/70] netfilter: nft_tunnel: ERSPAN_VERSION must not be null Greg Kroah-Hartman
2020-02-03 16:19 ` [PATCH 4.19 47/70] net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 48/70] bnxt_en: Fix ipv6 RFS filter matching logic Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 49/70] riscv: delete temporary files Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 50/70] iwlwifi: mvm: fix NVM check for 3168 devices Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 51/70] iwlwifi: Dont ignore the cap field upon mcc update Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 52/70] Input: aiptek - use descriptors of current altsetting Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 53/70] ARM: dts: am335x-boneblack-common: fix memory size Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 54/70] vti[6]: fix packet tx through bpf_redirect() Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 55/70] xfrm interface: " Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 56/70] xfrm: interface: do not confirm neighbor when do pmtu update Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 57/70] scsi: fnic: do not queue commands during fwreset Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 58/70] ARM: 8955/1: virt: Relax arch timer version check during early boot Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 59/70] tee: optee: Fix compilation issue with nommu Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 60/70] airo: Fix possible info leak in AIROOLDIOCTL/SIOCDEVPRIVATE Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 61/70] airo: Add missing CAP_NET_ADMIN check " Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 62/70] r8152: get default setting of WOL before initializing Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 63/70] ARM: dts: am43x-epos-evm: set data pin directions for spi0 and spi1 Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 64/70] qlcnic: Fix CPU soft lockup while collecting firmware dump Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 65/70] powerpc/fsl/dts: add fsl,erratum-a011043 Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 66/70] net/fsl: treat fsl,erratum-a011043 Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 67/70] net: fsl/fman: rename IF_MODE_XGMII to IF_MODE_10G Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 68/70] seq_tab_next() should increase position index Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 69/70] l2t_seq_next " Greg Kroah-Hartman
2020-02-03 16:20 ` [PATCH 4.19 70/70] net: Fix skb->csum update in inet_proto_csum_replace16() Greg Kroah-Hartman
     [not found] ` <20200203161912.158976871-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
2020-02-03 21:39   ` [PATCH 4.19 00/70] 4.19.102-stable review Jon Hunter
2020-02-03 21:39     ` Jon Hunter
2020-02-04 12:37 ` Pavel Machek
2020-02-05 14:42   ` Greg Kroah-Hartman
2020-02-04 13:37 ` Naresh Kamboju
2020-02-04 17:19 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200203161912.599819041@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bp@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=reinette.chatre@intel.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=xiaochen.shen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.