From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg KH <gregkh@linuxfoundation.org>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk,
	Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
	Avi Kivity <avi@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Paul Gortmaker <paul.gortmaker@windriver.com>,
	Andrea Arcangeli <aarcange@redhat.com>
Subject: [ 11/44] mm: mmu_notifier: fix freed page still mapped in secondary MMU
Date: Mon, 13 Aug 2012 15:02:18 -0700
Message-ID: <20120813220143.157785504@linuxfoundation.org>
In-Reply-To: <20120813220142.113186818@linuxfoundation.org>

From: Greg KH <gregkh@linuxfoundation.org>

3.0-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>

commit 3ad3d901bbcfb15a5e4690e55350db0899095a68 upstream.

mmu_notifier_release() is called when the process is exiting.  It deletes
all the mmu notifiers.  But at this time the pages belonging to the process
are still present in the page tables and on the LRU list, so this race can
happen:

      CPU 0                 CPU 1
mmu_notifier_release:    try_to_unmap:
   hlist_del_init_rcu(&mn->hlist);
                            ptep_clear_flush_notify:
                                  mmu notifier not found
                            free page  !!!!!!
                            /*
                             * At this point, the page has been
                             * freed, but it is still mapped in
                             * the secondary MMU.
                             */

  mn->ops->release(mn, mm);

The machine then becomes unstable, and sometimes we hit this bug:

[  738.075923] BUG: Bad page state in process migrate-perf  pfn:03bec
[  738.075931] page:ffffea00000efb00 count:0 mapcount:0 mapping:          (null) index:0x8076
[  738.075936] page flags: 0x20000000000014(referenced|dirty)

The same issue is present in mmu_notifier_unregister().

We can call ->release before deleting the notifier to ensure the page has
been unmapped from the secondary MMU before it is freed.
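
The ordering argument can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not kernel code: every
name in it (model_notifier, model_release, model_reclaim,
model_old_order, model_new_order) is hypothetical, and the race is
replayed as a fixed single-threaded interleaving rather than with real
concurrency.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_notifier {
	bool registered;       /* notifier is still on the mm's list */
	bool secondary_mapped; /* secondary MMU still maps the page */
};

/* Models ->release: the driver drops all of its secondary mappings. */
void model_release(struct model_notifier *mn)
{
	mn->secondary_mapped = false;
}

/*
 * Models reclaim (try_to_unmap followed by freeing the page).  The
 * invalidate callback only runs if the notifier is still on the list.
 * Returns true if the page is freed while still mapped in the
 * secondary MMU, which is the bug described above.
 */
bool model_reclaim(struct model_notifier *mn)
{
	if (mn->registered)
		mn->secondary_mapped = false;
	return mn->secondary_mapped;
}

/* Old order: unlink first, reclaim hits the window, ->release last. */
bool model_old_order(void)
{
	struct model_notifier mn = { .registered = true, .secondary_mapped = true };
	bool stale;

	mn.registered = false;     /* hlist_del_init_rcu() */
	stale = model_reclaim(&mn); /* notifier not found, page freed */
	model_release(&mn);        /* too late */
	return stale;
}

/*
 * New order: ->release first, unlink afterwards.  reclaim_step picks
 * where reclaim runs: 0 = before ->release, 1 = between ->release and
 * the unlink, 2 = after the unlink.
 */
bool model_new_order(int reclaim_step)
{
	struct model_notifier mn = { .registered = true, .secondary_mapped = true };
	bool stale = false;

	if (reclaim_step == 0)
		stale = model_reclaim(&mn);
	model_release(&mn);
	if (reclaim_step == 1)
		stale = model_reclaim(&mn);
	mn.registered = false;
	if (reclaim_step == 2)
		stale = model_reclaim(&mn);
	return stale;
}
```

In this model, model_old_order() reports a page freed while still
mapped, while model_new_order() reports none for any of the three
reclaim positions.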

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/mmu_notifier.c |   45 +++++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -33,6 +33,24 @@
 void __mmu_notifier_release(struct mm_struct *mm)
 {
 	struct mmu_notifier *mn;
+	struct hlist_node *n;
+
+	/*
+	 * RCU here will block mmu_notifier_unregister until
+	 * ->release returns.
+	 */
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_mm->list, hlist)
+		/*
+		 * if ->release runs before mmu_notifier_unregister it
+		 * must be handled as it's the only way for the driver
+		 * to flush all existing sptes and stop the driver
+		 * from establishing any more sptes before all the
+		 * pages in the mm are freed.
+		 */
+		if (mn->ops->release)
+			mn->ops->release(mn, mm);
+	rcu_read_unlock();
 
 	spin_lock(&mm->mmu_notifier_mm->lock);
 	while (unlikely(!hlist_empty(&mm->mmu_notifier_mm->list))) {
@@ -46,23 +64,6 @@ void __mmu_notifier_release(struct mm_st
 		 * mmu_notifier_unregister to return.
 		 */
 		hlist_del_init_rcu(&mn->hlist);
-		/*
-		 * RCU here will block mmu_notifier_unregister until
-		 * ->release returns.
-		 */
-		rcu_read_lock();
-		spin_unlock(&mm->mmu_notifier_mm->lock);
-		/*
-		 * if ->release runs before mmu_notifier_unregister it
-		 * must be handled as it's the only way for the driver
-		 * to flush all existing sptes and stop the driver
-		 * from establishing any more sptes before all the
-		 * pages in the mm are freed.
-		 */
-		if (mn->ops->release)
-			mn->ops->release(mn, mm);
-		rcu_read_unlock();
-		spin_lock(&mm->mmu_notifier_mm->lock);
 	}
 	spin_unlock(&mm->mmu_notifier_mm->lock);
 
@@ -284,16 +285,13 @@ void mmu_notifier_unregister(struct mmu_
 {
 	BUG_ON(atomic_read(&mm->mm_count) <= 0);
 
-	spin_lock(&mm->mmu_notifier_mm->lock);
 	if (!hlist_unhashed(&mn->hlist)) {
-		hlist_del_rcu(&mn->hlist);
-
 		/*
 		 * RCU here will force exit_mmap to wait ->release to finish
 		 * before freeing the pages.
 		 */
 		rcu_read_lock();
-		spin_unlock(&mm->mmu_notifier_mm->lock);
+
 		/*
 		 * exit_mmap will block in mmu_notifier_release to
 		 * guarantee ->release is called before freeing the
@@ -302,8 +300,11 @@ void mmu_notifier_unregister(struct mmu_
 		if (mn->ops->release)
 			mn->ops->release(mn, mm);
 		rcu_read_unlock();
-	} else
+
+		spin_lock(&mm->mmu_notifier_mm->lock);
+		hlist_del_rcu(&mn->hlist);
 		spin_unlock(&mm->mmu_notifier_mm->lock);
+	}
 
 	/*
 	 * Wait any running method to finish, of course including


