From: Jiri Slaby <jslaby@suse.cz>
To: stable@vger.kernel.org
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Hugh Dickins <hughd@google.com>, Michal Hocko <mhocko@kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Gerald Schaefer <gerald.schaefer@de.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jiri Slaby <jslaby@suse.cz>
Subject: [patch added to 3.12-stable] mm, hugetlb: use pte_present() instead of pmd_present() in follow_huge_pmd()
Date: Mon, 10 Apr 2017 14:59:23 +0200
Message-ID: <20170410125930.26495-46-jslaby@suse.cz>
In-Reply-To: <20170410125930.26495-1-jslaby@suse.cz>

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

This patch has been added to the 3.12 stable tree. If you have any
objections, please let us know.

===============

commit c9d398fa237882ea07167e23bcfc5e6847066518 upstream.

I found a race condition that triggers the following bug when
move_pages() and soft offline are called concurrently on a single
hugetlb page.

    Soft offlining page 0x119400 at 0x700000000000
    BUG: unable to handle kernel paging request at ffffea0011943820
    IP: follow_huge_pmd+0x143/0x190
    PGD 7ffd2067
    PUD 7ffd1067
    PMD 0
        [61163.582052] Oops: 0000 [#1] SMP
    Modules linked in: binfmt_misc ppdev virtio_balloon parport_pc pcspkr i2c_piix4 parport i2c_core acpi_cpufreq ip_tables xfs libcrc32c ata_generic pata_acpi virtio_blk 8139too crc32c_intel ata_piix serio_raw libata virtio_pci 8139cp virtio_ring virtio mii floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: cap_check]
    CPU: 0 PID: 22573 Comm: iterate_numa_mo Tainted: P           OE   4.11.0-rc2-mm1+ #2
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:follow_huge_pmd+0x143/0x190
    RSP: 0018:ffffc90004bdbcd0 EFLAGS: 00010202
    RAX: 0000000465003e80 RBX: ffffea0004e34d30 RCX: 00003ffffffff000
    RDX: 0000000011943800 RSI: 0000000000080001 RDI: 0000000465003e80
    RBP: ffffc90004bdbd18 R08: 0000000000000000 R09: ffff880138d34000
    R10: ffffea0004650000 R11: 0000000000c363b0 R12: ffffea0011943800
    R13: ffff8801b8d34000 R14: ffffea0000000000 R15: 000077ff80000000
    FS:  00007fc977710740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffea0011943820 CR3: 000000007a746000 CR4: 00000000001406f0
    Call Trace:
     follow_page_mask+0x270/0x550
     SYSC_move_pages+0x4ea/0x8f0
     SyS_move_pages+0xe/0x10
     do_syscall_64+0x67/0x180
     entry_SYSCALL64_slow_path+0x25/0x25
    RIP: 0033:0x7fc976e03949
    RSP: 002b:00007ffe72221d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000117
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc976e03949
    RDX: 0000000000c22390 RSI: 0000000000001400 RDI: 0000000000005827
    RBP: 00007ffe72221e00 R08: 0000000000c2c3a0 R09: 0000000000000004
    R10: 0000000000c363b0 R11: 0000000000000246 R12: 0000000000400650
    R13: 00007ffe72221ee0 R14: 0000000000000000 R15: 0000000000000000
    Code: 81 e4 ff ff 1f 00 48 21 c2 49 c1 ec 0c 48 c1 ea 0c 4c 01 e2 49 bc 00 00 00 00 00 ea ff ff 48 c1 e2 06 49 01 d4 f6 45 bc 04 74 90 <49> 8b 7c 24 20 40 f6 c7 01 75 2b 4c 89 e7 8b 47 1c 85 c0 7e 2a
    RIP: follow_huge_pmd+0x143/0x190 RSP: ffffc90004bdbcd0
    CR2: ffffea0011943820
    ---[ end trace e4f81353a2d23232 ]---
    Kernel panic - not syncing: Fatal exception
    Kernel Offset: disabled

This bug is triggered when pmd_present() returns true for a non-present
hugetlb entry, so fixing the present check in follow_huge_pmd() prevents
it. Using pmd_present() to distinguish present from non-present hugetlb
entries is not correct, because for historical reasons pmd_present()
checks multiple bits (not only _PAGE_PRESENT) and can misjudge the
hugetlb state.

Fixes: e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()")
Link: http://lkml.kernel.org/r/1490149898-20231-1-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
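To illustrate the present-check problem described in the changelog, here
is a minimal user-space sketch, not kernel code. It assumes an x86-like
layout in which the pmd-style check also tests a PSE ("huge") bit while
the pte-style check does not. The MODEL_* constants and model_*()
helpers are hypothetical names invented for this example, and the entry
value is made up; the exact bit pattern that triggered the oops is not
stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag positions, modeled on x86; illustrative only. */
#define MODEL_PAGE_PRESENT  (UINT64_C(1) << 0)
#define MODEL_PAGE_PSE      (UINT64_C(1) << 7)
#define MODEL_PAGE_PROTNONE (UINT64_C(1) << 8)

/* pmd_present()-style check: true if any of several bits is set. */
static bool model_pmd_present(uint64_t entry)
{
	return (entry & (MODEL_PAGE_PRESENT | MODEL_PAGE_PROTNONE |
			 MODEL_PAGE_PSE)) != 0;
}

/* pte_present()-style check: the PSE bit is not considered. */
static bool model_pte_present(uint64_t entry)
{
	return (entry & (MODEL_PAGE_PRESENT | MODEL_PAGE_PROTNONE)) != 0;
}

/* Returns true when the two checks disagree about an entry. */
static bool model_checks_disagree(uint64_t entry)
{
	return model_pmd_present(entry) != model_pte_present(entry);
}

/* Self-check using a hypothetical non-present entry that keeps PSE set. */
static void model_selfcheck(void)
{
	uint64_t entry = MODEL_PAGE_PSE | (UINT64_C(0x1234) << 12);

	assert(model_pmd_present(entry));   /* pmd-style check says present */
	assert(!model_pte_present(entry));  /* pte-style check says not present */
	assert(model_checks_disagree(entry));
}
```

Calling model_selfcheck() from a small main() exercises the mismatch: an
entry that the pmd-style check treats as present would be dereferenced
through pmd_page(), whereas the pte-style check falls through to the
non-present branch, which is the behavior the patch below relies on.
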
 mm/hugetlb.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 24d50334d51c..ea69c897330e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3512,6 +3512,7 @@ follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 {
 	struct page *page = NULL;
 	spinlock_t *ptl;
+	pte_t pte;
 retry:
 	ptl = &mm->page_table_lock;
 	spin_lock(ptl);
@@ -3521,12 +3522,13 @@ retry:
 	 */
 	if (!pmd_huge(*pmd))
 		goto out;
-	if (pmd_present(*pmd)) {
+	pte = huge_ptep_get((pte_t *)pmd);
+	if (pte_present(pte)) {
 		page = pmd_page(*pmd) + ((address & ~PMD_MASK) >> PAGE_SHIFT);
 		if (flags & FOLL_GET)
 			get_page(page);
 	} else {
-		if (is_hugetlb_entry_migration(huge_ptep_get((pte_t *)pmd))) {
+		if (is_hugetlb_entry_migration(pte)) {
 			spin_unlock(ptl);
 			__migration_entry_wait(mm, (pte_t *)pmd, ptl);
 			goto retry;
-- 
2.12.2
