From: Amit Pundir <amit.pundir@linaro.org>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: Stable <stable@vger.kernel.org>,
	Prateek Sood <prsood@codeaurora.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	sramana@codeaurora.org, Ingo Molnar <mingo@kernel.org>
Subject: [PATCH for-4.9.y 12/14] locking/osq_lock: Fix osq_lock queue corruption
Date: Wed, 29 Aug 2018 01:43:23 +0530
Message-ID: <1535487205-26280-13-git-send-email-amit.pundir@linaro.org>
In-Reply-To: <1535487205-26280-1-git-send-email-amit.pundir@linaro.org>

From: Prateek Sood <prsood@codeaurora.org>

commit 50972fe78f24f1cd0b9d7bbf1f87d2be9e4f412e upstream.

Fix ordering of link creation between node->prev and prev->next in
osq_lock(). Consider a case in which the optimistic spin queue is
CPU6->CPU2, where CPU6 has acquired the lock.

        tail
          v
  ,-. <- ,-.
  |6|    |2|
  `-' -> `-'

At this point if CPU0 comes in to acquire osq_lock, it will update the
tail count.

  CPU2			CPU0
  ----------------------------------

				       tail
				         v
			  ,-. <- ,-.    ,-.
			  |6|    |2|    |0|
			  `-' -> `-'    `-'

After the tail count update, if CPU2 starts to unqueue itself from the
optimistic spin queue, it will find the tail count updated to CPU0 and
set CPU2's node->next to NULL in osq_wait_next().

  unqueue-A

	       tail
	         v
  ,-. <- ,-.    ,-.
  |6|    |2|    |0|
  `-'    `-'    `-'

  unqueue-B

  ->tail != curr && !node->next

If the following stores are reordered, prev->next, where prev is the
CPU2 node, would be updated to point to the CPU0 node:

				       tail
				         v
			  ,-. <- ,-.    ,-.
			  |6|    |2|    |0|
			  `-'    `-' -> `-'

  osq_wait_next()
    node->next <- 0
    xchg(node->next, NULL)

	       tail
	         v
  ,-. <- ,-.    ,-.
  |6|    |2|    |0|
  `-'    `-'    `-'

  unqueue-C

At this point, if the next instruction
	WRITE_ONCE(next->prev, prev);
in the CPU2 path is committed before CPU0's update node->prev = prev,
then CPU0's node->prev will point to the CPU6 node.

	       tail
    v----------. v
  ,-. <- ,-.    ,-.
  |6|    |2|    |0|
  `-'    `-'    `-'
     `----------^

At this point, if the CPU0 path's node->prev = prev is committed, CPU0's
prev changes back to the CPU2 node. CPU2's node->next is currently
NULL,

				       tail
			                 v
			  ,-. <- ,-. <- ,-.
			  |6|    |2|    |0|
			  `-'    `-'    `-'
			     `----------^

so if CPU0 gets into the unqueue path of osq_lock(), it will keep spinning
in an infinite loop, as the condition prev->next == node will never be true.
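
The store ordering that the fix relies on can be sketched in userspace
with C11 atomics. This is only an illustration, not the kernel code:
struct sketch_node, sketch_enqueue() and sketch_demo() are hypothetical
names, and a release fence is used as a rough stand-in for smp_wmb()
(it is somewhat stronger than a pure store-store barrier). The demo
only checks the resulting links on a single thread; it does not
reproduce the race described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's per-CPU queue node. */
struct sketch_node {
	_Atomic(struct sketch_node *) prev;
	_Atomic(struct sketch_node *) next;
};

/*
 * Link 'node' behind 'prev' in the order the patch enforces:
 * store node->prev first, and only then make the node reachable
 * through prev->next.
 */
static void sketch_enqueue(struct sketch_node *node, struct sketch_node *prev)
{
	atomic_store_explicit(&node->prev, prev, memory_order_relaxed);

	/*
	 * Assumption: approximates smp_wmb(). The store above may not be
	 * reordered after the store below.
	 */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&prev->next, node, memory_order_relaxed);
}

/* Single-threaded sanity check of the links; returns 1 on success. */
static int sketch_demo(void)
{
	struct sketch_node cpu2, cpu0;

	atomic_init(&cpu2.prev, NULL);
	atomic_init(&cpu2.next, NULL);
	atomic_init(&cpu0.prev, NULL);
	atomic_init(&cpu0.next, NULL);

	sketch_enqueue(&cpu0, &cpu2);

	assert(atomic_load(&cpu0.next) == NULL);
	return atomic_load(&cpu2.next) == &cpu0 &&
	       atomic_load(&cpu0.prev) == &cpu2;
}
```

With this ordering, any CPU that observes prev->next == node and then
executes a full barrier (as the unqueue path does) cannot have its later
store to next->prev overwritten by the still-pending node->prev = prev.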

Signed-off-by: Prateek Sood <prsood@codeaurora.org>
[ Added pictures, rewrote comments. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: sramana@codeaurora.org
Link: http://lkml.kernel.org/r/1500040076-27626-1-git-send-email-prsood@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
---
To be applied on 4.4.y as well.
Build tested on v4.4.153.

 kernel/locking/osq_lock.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a37857ab55..8d7047ecef4e 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -104,6 +104,19 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 
 	prev = decode_cpu(old);
 	node->prev = prev;
+
+	/*
+	 * osq_lock()			unqueue
+	 *
+	 * node->prev = prev		osq_wait_next()
+	 * WMB				MB
+	 * prev->next = node		next->prev = prev // unqueue-C
+	 *
+	 * Here 'node->prev' and 'next->prev' are the same variable and we need
+	 * to ensure these stores happen in-order to avoid corrupting the list.
+	 */
+	smp_wmb();
+
 	WRITE_ONCE(prev->next, node);
 
 	/*
-- 
2.7.4
