LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Breno Leitao <leitao@debian.org>,
	Michael Ellerman <mpe@ellerman.id.au>
Subject: [PATCH 4.9 55/63] powerpc/tm: Set MSR[TS] just prior to recheckpoint
Date: Fri, 11 Jan 2019 15:14:58 +0100
Message-ID: <20190111131054.515256345@linuxfoundation.org> (raw)
In-Reply-To: <20190111131046.387528003@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Breno Leitao <leitao@debian.org>

commit e1c3743e1a20647c53b719dbf28b48f45d23f2cd upstream.

On a signal handler return, the user could set a context with MSR[TS] bits
set, and these bits would be copied to task regs->msr.

At restore_tm_sigcontexts(), after current task regs->msr[TS] bits are set,
several __get_user() are called and then a recheckpoint is executed.

This is a problem since a page fault (in kernel space) could happen when
calling __get_user(). If it happens, the process MSR[TS] bits were
already set, but recheckpoint was not executed, and SPRs are still invalid.

The page fault can cause the current process to be de-scheduled, with
MSR[TS] active and without tm_recheckpoint() being called.  More
importantly, without TEXASR[FS] bit set also.

Since TEXASR might not have the FS bit set, and when the process is
scheduled back, it will try to reclaim, which will be aborted because of
the CPU is not in the suspended state, and, then, recheckpoint. This
recheckpoint will restore thread->texasr into TEXASR SPR, which might be
zero, hitting a BUG_ON().

	kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434!
	cpu 0xb: Vector: 700 (Program Check) at [c00000041f1576d0]
	    pc: c000000000054550: restore_gprs+0xb0/0x180
	    lr: 0000000000000000
	    sp: c00000041f157950
	   msr: 8000000100021033
	  current = 0xc00000041f143000
	  paca    = 0xc00000000fb86300	 softe: 0	 irq_happened: 0x01
	    pid   = 1021, comm = kworker/11:1
	kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434!
	Linux version 4.9.0-3-powerpc64le (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26)
	enter ? for help
	[c00000041f157b30] c00000000001bc3c tm_recheckpoint.part.11+0x6c/0xa0
	[c00000041f157b70] c00000000001d184 __switch_to+0x1e4/0x4c0
	[c00000041f157bd0] c00000000082eeb8 __schedule+0x2f8/0x990
	[c00000041f157cb0] c00000000082f598 schedule+0x48/0xc0
	[c00000041f157ce0] c0000000000f0d28 worker_thread+0x148/0x610
	[c00000041f157d80] c0000000000f96b0 kthread+0x120/0x140
	[c00000041f157e30] c00000000000c0e0 ret_from_kernel_thread+0x5c/0x7c

This patch simply delays the MSR[TS] set, so, if there is any page fault in
the __get_user() section, it does not have regs->msr[TS] set, since the TM
structures are still invalid, thus avoiding doing TM operations for
in-kernel exceptions and possible process reschedule.

With this patch, the MSR[TS] will only be set just before recheckpointing
and setting TEXASR[FS] = 1, thus avoiding an interrupt with TM registers in
invalid state.

Other than that, if CONFIG_PREEMPT is set, there might be a preemption just
after setting MSR[TS] and before tm_recheckpoint(), thus, this block must
be atomic from a preemption perspective, thus, calling
preempt_disable/enable() on this code.

It is not possible to move tm_recheckpoint to happen earlier, because it is
required to get the checkpointed registers from userspace, with
__get_user(), thus, the only way to avoid this undesired behavior is
delaying the MSR[TS] set.

The 32-bits signal handler seems to be safe this current issue, but, it
might be exposed to the preemption issue, thus, disabling preemption in
this chunk of code.

Changes from v2:
 * Run the critical section with preempt_disable.

Fixes: 87b4e5393af7 ("powerpc/tm: Fix return of active 64bit signals")
Cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/powerpc/kernel/signal_32.c |   20 +++++++++++++++++-
 arch/powerpc/kernel/signal_64.c |   44 +++++++++++++++++++++++++++-------------
 2 files changed, 49 insertions(+), 15 deletions(-)

--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -866,7 +866,23 @@ static long restore_tm_user_regs(struct
 	/* If TM bits are set to the reserved value, it's an invalid context */
 	if (MSR_TM_RESV(msr_hi))
 		return 1;
-	/* Pull in the MSR TM bits from the user context */
+
+	/*
+	 * Disabling preemption, since it is unsafe to be preempted
+	 * with MSR[TS] set without recheckpointing.
+	 */
+	preempt_disable();
+
+	/*
+	 * CAUTION:
+	 * After regs->MSR[TS] being updated, make sure that get_user(),
+	 * put_user() or similar functions are *not* called. These
+	 * functions can generate page faults which will cause the process
+	 * to be de-scheduled with MSR[TS] set but without calling
+	 * tm_recheckpoint(). This can cause a bug.
+	 *
+	 * Pull in the MSR TM bits from the user context
+	 */
 	regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr_hi & MSR_TS_MASK);
 	/* Now, recheckpoint.  This loads up all of the checkpointed (older)
 	 * registers, including FP and V[S]Rs.  After recheckpointing, the
@@ -891,6 +907,8 @@ static long restore_tm_user_regs(struct
 	}
 #endif
 
+	preempt_enable();
+
 	return 0;
 }
 #endif
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -452,20 +452,6 @@ static long restore_tm_sigcontexts(struc
 	if (MSR_TM_RESV(msr))
 		return -EINVAL;
 
-	/* pull in MSR TS bits from user context */
-	regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr & MSR_TS_MASK);
-
-	/*
-	 * Ensure that TM is enabled in regs->msr before we leave the signal
-	 * handler. It could be the case that (a) user disabled the TM bit
-	 * through the manipulation of the MSR bits in uc_mcontext or (b) the
-	 * TM bit was disabled because a sufficient number of context switches
-	 * happened whilst in the signal handler and load_tm overflowed,
-	 * disabling the TM bit. In either case we can end up with an illegal
-	 * TM state leading to a TM Bad Thing when we return to userspace.
-	 */
-	regs->msr |= MSR_TM;
-
 	/* pull in MSR LE from user context */
 	regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
 
@@ -557,6 +543,34 @@ static long restore_tm_sigcontexts(struc
 	tm_enable();
 	/* Make sure the transaction is marked as failed */
 	tsk->thread.tm_texasr |= TEXASR_FS;
+
+	/*
+	 * Disabling preemption, since it is unsafe to be preempted
+	 * with MSR[TS] set without recheckpointing.
+	 */
+	preempt_disable();
+
+	/* pull in MSR TS bits from user context */
+	regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr & MSR_TS_MASK);
+
+	/*
+	 * Ensure that TM is enabled in regs->msr before we leave the signal
+	 * handler. It could be the case that (a) user disabled the TM bit
+	 * through the manipulation of the MSR bits in uc_mcontext or (b) the
+	 * TM bit was disabled because a sufficient number of context switches
+	 * happened whilst in the signal handler and load_tm overflowed,
+	 * disabling the TM bit. In either case we can end up with an illegal
+	 * TM state leading to a TM Bad Thing when we return to userspace.
+	 *
+	 * CAUTION:
+	 * After regs->MSR[TS] being updated, make sure that get_user(),
+	 * put_user() or similar functions are *not* called. These
+	 * functions can generate page faults which will cause the process
+	 * to be de-scheduled with MSR[TS] set but without calling
+	 * tm_recheckpoint(). This can cause a bug.
+	 */
+	regs->msr |= MSR_TM;
+
 	/* This loads the checkpointed FP/VEC state, if used */
 	tm_recheckpoint(&tsk->thread, msr);
 
@@ -570,6 +584,8 @@ static long restore_tm_sigcontexts(struc
 		regs->msr |= MSR_VEC;
 	}
 
+	preempt_enable();
+
 	return err;
 }
 #endif



  parent reply index

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-11 14:14 [PATCH 4.9 00/63] 4.9.150-stable review Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 01/63] pinctrl: meson: fix pull enable register calculation Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 02/63] powerpc: Fix COFF zImage booting on old powermacs Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 03/63] ARM: imx: update the cpu power up timing setting on i.mx6sx Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 04/63] ARM: dts: imx7d-nitrogen7: Fix the description of the Wifi clock Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 05/63] Input: restore EV_ABS ABS_RESERVED Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 06/63] checkstack.pl: fix for aarch64 Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 07/63] xfrm: Fix bucket count reported to userspace Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 08/63] netfilter: seqadj: re-load tcp header pointer after possible head reallocation Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 09/63] scsi: bnx2fc: Fix NULL dereference in error handling Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 10/63] Input: omap-keypad - fix idle configuration to not block SoC idle states Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 11/63] netfilter: ipset: do not call ipset_nest_end after nla_nest_cancel Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 12/63] bnx2x: Clear fip MAC when fcoe offload support is disabled Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 13/63] bnx2x: Remove configured vlans as part of unload sequence Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 14/63] bnx2x: Send update-svid ramrod with retry/poll flags enabled Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 15/63] scsi: target: iscsi: cxgbit: fix csk leak Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 16/63] scsi: target: iscsi: cxgbit: add missing spin_lock_init() Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 17/63] drivers: net: xgene: Remove unnecessary forward declarations Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 18/63] w90p910_ether: remove incorrect __init annotation Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 19/63] net: hns: Incorrect offset address used for some registers Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 20/63] net: hns: All ports can not work when insmod hns ko after rmmod Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 21/63] net: hns: Some registers use wrong address according to the datasheet Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 22/63] net: hns: Fixed bug that netdev was opened twice Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 23/63] net: hns: Clean rx fbd when ae stopped Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 24/63] net: hns: Free irq when exit from abnormal branch Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 25/63] net: hns: Avoid net reset caused by pause frames storm Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 26/63] net: hns: Fix ntuple-filters status error Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 27/63] net: hns: Add mac pcs config when enable|disable mac Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 28/63] SUNRPC: Fix a race with XPRT_CONNECTING Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 29/63] lan78xx: Resolve issue with changing MAC address Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 30/63] vxge: ensure data0 is initialized in when fetching firmware version information Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 31/63] net: netxen: fix a missing check and an uninitialized use Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 32/63] serial/sunsu: fix refcount leak Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 33/63] scsi: zfcp: fix posting too many status read buffers leading to adapter shutdown Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 34/63] libceph: fix CEPH_FEATURE_CEPHX_V2 check in calc_signature() Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 35/63] fork: record start_time late Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 36/63] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 37/63] mm, devm_memremap_pages: mark devm_memremap_pages() EXPORT_SYMBOL_GPL Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 38/63] mm, devm_memremap_pages: kill mapping "System RAM" support Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 39/63] sunrpc: fix cache_head leak due to queued request Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 40/63] sunrpc: use SVC_NET() in svcauth_gss_* functions Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 41/63] MIPS: math-emu: Write-protect delay slot emulation pages Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 42/63] crypto: x86/chacha20 - avoid sleeping with preemption disabled Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 43/63] vhost/vsock: fix uninitialized vhost_vsock->guest_cid Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 44/63] IB/hfi1: Incorrect sizing of sge for PIO will OOPs Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 45/63] ALSA: cs46xx: Potential NULL dereference in probe Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 46/63] ALSA: usb-audio: Avoid access before bLength check in build_audio_procunit() Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 47/63] ALSA: usb-audio: Fix an out-of-bound read in create_composite_quirks Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 48/63] dlm: fixed memory leaks after failed ls_remove_names allocation Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 49/63] dlm: possible memory leak on error path in create_lkb() Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 50/63] dlm: lost put_lkb on error path in receive_convert() and receive_unlock() Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 51/63] dlm: memory leaks on error path in dlm_user_request() Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 52/63] gfs2: Get rid of potential double-freeing in gfs2_create_inode Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 53/63] gfs2: Fix loop in gfs2_rbm_find Greg Kroah-Hartman
2019-01-11 14:14 ` [PATCH 4.9 54/63] b43: Fix error in cordic routine Greg Kroah-Hartman
2019-01-11 14:14 ` Greg Kroah-Hartman [this message]
2019-01-11 14:14 ` [PATCH 4.9 56/63] 9p/net: put a lower bound on msize Greg Kroah-Hartman
2019-01-11 14:15 ` [PATCH 4.9 57/63] rxe: fix error completion wr_id and qp_num Greg Kroah-Hartman
2019-01-11 14:15 ` [PATCH 4.9 58/63] iommu/vt-d: Handle domain agaw being less than iommu agaw Greg Kroah-Hartman
2019-01-11 14:15 ` [PATCH 4.9 59/63] ceph: dont update importing caps mseq when handing cap export Greg Kroah-Hartman
2019-01-11 14:15 ` [PATCH 4.9 60/63] genwqe: Fix size check Greg Kroah-Hartman
2019-01-11 14:15 ` [PATCH 4.9 61/63] intel_th: msu: Fix an off-by-one in attribute store Greg Kroah-Hartman
2019-01-11 14:15 ` [PATCH 4.9 62/63] power: supply: olpc_battery: correct the temperature units Greg Kroah-Hartman
2019-01-11 14:15 ` [PATCH 4.9 63/63] drm/vc4: Set ->is_yuv to false when num_planes == 1 Greg Kroah-Hartman
2019-01-11 21:42 ` [PATCH 4.9 00/63] 4.9.150-stable review shuah
2019-01-12  8:07 ` Naresh Kamboju
2019-01-12 17:43 ` Guenter Roeck

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190111131054.515256345@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=leitao@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git