All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, jgg@mellanox.com, linux-mm@kvack.org,
	longpeng2@huawei.com, mike.kravetz@oracle.com,
	mm-commits@vger.kernel.org, sean.j.christopherson@intel.com,
	stable@vger.kernel.org, torvalds@linux-foundation.org,
	willy@infradead.org
Subject: [patch 05/15] mm/hugetlb: fix a addressing exception caused by huge_pte_offset
Date: Mon, 20 Apr 2020 18:13:51 -0700	[thread overview]
Message-ID: <20200421011351.qwgQp5LVv%akpm@linux-foundation.org> (raw)
In-Reply-To: <20200420181310.c18b3c0aa4dc5b3e5ec1be10@linux-foundation.org>

From: Longpeng <longpeng2@huawei.com>
Subject: mm/hugetlb: fix a addressing exception caused by huge_pte_offset

Our machine encountered a panic(addressing exception) after run for a long
time and the calltrace is:

RIP: 0010:[<ffffffff9dff0587>]  [<ffffffff9dff0587>] hugetlb_fault+0x307/0xbe0
RSP: 0018:ffff9567fc27f808  EFLAGS: 00010286
RAX: e800c03ff1258d48 RBX: ffffd3bb003b69c0 RCX: e800c03ff1258d48
RDX: 17ff3fc00eda72b7 RSI: 00003ffffffff000 RDI: e800c03ff1258d48
RBP: ffff9567fc27f8c8 R08: e800c03ff1258d48 R09: 0000000000000080
R10: ffffaba0704c22a8 R11: 0000000000000001 R12: ffff95c87b4b60d8
R13: 00005fff00000000 R14: 0000000000000000 R15: ffff9567face8074
FS:  00007fe2d9ffb700(0000) GS:ffff956900e40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffd3bb003b69c0 CR3: 000000be67374000 CR4: 00000000003627e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 [<ffffffff9df9b71b>] ? unlock_page+0x2b/0x30
 [<ffffffff9dff04a2>] ? hugetlb_fault+0x222/0xbe0
 [<ffffffff9dff1405>] follow_hugetlb_page+0x175/0x540
 [<ffffffff9e15b825>] ? cpumask_next_and+0x35/0x50
 [<ffffffff9dfc7230>] __get_user_pages+0x2a0/0x7e0
 [<ffffffff9dfc648d>] __get_user_pages_unlocked+0x15d/0x210
 [<ffffffffc068cfc5>] __gfn_to_pfn_memslot+0x3c5/0x460 [kvm]
 [<ffffffffc06b28be>] try_async_pf+0x6e/0x2a0 [kvm]
 [<ffffffffc06b4b41>] tdp_page_fault+0x151/0x2d0 [kvm]
 ...
 [<ffffffffc06a6f90>] kvm_arch_vcpu_ioctl_run+0x330/0x490 [kvm]
 [<ffffffffc068d919>] kvm_vcpu_ioctl+0x309/0x6d0 [kvm]
 [<ffffffff9deaa8c2>] ? dequeue_signal+0x32/0x180
 [<ffffffff9deae34d>] ? do_sigtimedwait+0xcd/0x230
 [<ffffffff9e03aed0>] do_vfs_ioctl+0x3f0/0x540
 [<ffffffff9e03b0c1>] SyS_ioctl+0xa1/0xc0
 [<ffffffff9e53879b>] system_call_fastpath+0x22/0x27

For 1G hugepages, huge_pte_offset() wants to return NULL or pudp, but it
may return a wrong 'pmdp' if there is a race.  Please look at the
following code snippet:

    ...
    pud = pud_offset(p4d, addr);
    if (sz != PUD_SIZE && pud_none(*pud))
        return NULL;
    /* hugepage or swap? */
    if (pud_huge(*pud) || !pud_present(*pud))
        return (pte_t *)pud;

    pmd = pmd_offset(pud, addr);
    if (sz != PMD_SIZE && pmd_none(*pmd))
        return NULL;
    /* hugepage or swap? */
    if (pmd_huge(*pmd) || !pmd_present(*pmd))
        return (pte_t *)pmd;
    ...

The following sequence would trigger this bug:
1. CPU0: sz = PUD_SIZE and *pud = 0 , continue
1. CPU0: "pud_huge(*pud)" is false
2. CPU1: calling hugetlb_no_page and set *pud to xxxx8e7(PRESENT)
3. CPU0: "!pud_present(*pud)" is false, continue
4. CPU0: pmd = pmd_offset(pud, addr) and maybe return a wrong pmdp
However, we want CPU0 to return NULL or pudp in this case.

We must make sure there is exactly one dereference of pud and pmd.

Link: http://lkml.kernel.org/r/20200413010342.771-1-longpeng2@huawei.com
Signed-off-by: Longpeng <longpeng2@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

--- a/mm/hugetlb.c~mm-hugetlb-fix-a-addressing-exception-caused-by-huge_pte_offset
+++ a/mm/hugetlb.c
@@ -5365,8 +5365,8 @@ pte_t *huge_pte_offset(struct mm_struct
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
-	pud_t *pud;
-	pmd_t *pmd;
+	pud_t *pud, pud_entry;
+	pmd_t *pmd, pmd_entry;
 
 	pgd = pgd_offset(mm, addr);
 	if (!pgd_present(*pgd))
@@ -5376,17 +5376,19 @@ pte_t *huge_pte_offset(struct mm_struct
 		return NULL;
 
 	pud = pud_offset(p4d, addr);
-	if (sz != PUD_SIZE && pud_none(*pud))
+	pud_entry = READ_ONCE(*pud);
+	if (sz != PUD_SIZE && pud_none(pud_entry))
 		return NULL;
 	/* hugepage or swap? */
-	if (pud_huge(*pud) || !pud_present(*pud))
+	if (pud_huge(pud_entry) || !pud_present(pud_entry))
 		return (pte_t *)pud;
 
 	pmd = pmd_offset(pud, addr);
-	if (sz != PMD_SIZE && pmd_none(*pmd))
+	pmd_entry = READ_ONCE(*pmd);
+	if (sz != PMD_SIZE && pmd_none(pmd_entry))
 		return NULL;
 	/* hugepage or swap? */
-	if (pmd_huge(*pmd) || !pmd_present(*pmd))
+	if (pmd_huge(pmd_entry) || !pmd_present(pmd_entry))
 		return (pte_t *)pmd;
 
 	return NULL;
_

  parent reply	other threads:[~2020-04-21  1:13 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-21  1:13 incoming Andrew Morton
2020-04-21  1:13 ` [patch 01/15] sh: fix build error in mm/init.c Andrew Morton
2020-04-21  1:13 ` [patch 02/15] slub: avoid redzone when choosing freepointer location Andrew Morton
2020-04-21  1:13 ` [patch 03/15] mm/userfaultfd: disable userfaultfd-wp on x86_32 Andrew Morton
2020-04-21  1:13 ` [patch 04/15] MAINTAINERS: add an entry for kfifo Andrew Morton
2020-04-21 12:31   ` Andy Shevchenko
2020-04-21  1:13 ` Andrew Morton [this message]
2020-04-21  1:13 ` [patch 06/15] mm, gup: return EINTR when gup is interrupted by fatal signals Andrew Morton
2020-04-21  1:13 ` [patch 07/15] checkpatch: fix a typo in the regex for $allocFunctions Andrew Morton
2020-04-21  1:14 ` [patch 08/15] tools/build: tweak unused value workaround Andrew Morton
2020-04-21  1:14 ` [patch 09/15] mm/ksm: fix NULL pointer dereference when KSM zero page is enabled Andrew Morton
2020-04-21  1:14 ` [patch 10/15] mm/shmem: fix build without THP Andrew Morton
2020-04-21  1:14 ` [patch 11/15] vmalloc: fix remap_vmalloc_range() bounds checks Andrew Morton
2020-04-21  1:14 ` [patch 12/15] shmem: fix possible deadlocks on shmlock_user_lock Andrew Morton
2020-04-21  1:14 ` [patch 13/15] mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path Andrew Morton
2020-04-21  1:14 ` [patch 14/15] coredump: fix null pointer dereference on coredump Andrew Morton
2020-04-21  1:14 ` [patch 15/15] tools/vm: fix cross-compile build Andrew Morton
2020-04-21  2:00 ` + mm-memory_hotplug-refrain-from-adding-memory-into-an-impossible-node.patch added to -mm tree Andrew Morton
2020-04-21  2:48 ` + x86-mm-define-mm_p4d_folded.patch " Andrew Morton
2020-04-21  2:52 ` + mm-debug-add-tests-validating-architecture-page-table-helpers-v17.patch " Andrew Morton
2020-04-21  2:59 ` + mm-mmapc-add-more-sanity-checks-to-get_unmapped_area.patch " Andrew Morton
2020-04-21  2:59 ` + mm-mmapc-do-not-allow-mappings-outside-of-allowed-limits.patch " Andrew Morton
2020-04-21  3:07 ` + initrdmem=-option-to-specify-initrd-physical-address-checkpatch-fixes.patch " Andrew Morton
2020-04-21  3:58 ` + initrdmem=-option-to-specify-initrd-physical-address.patch " Andrew Morton
2020-04-21  5:43 ` mmotm 2020-04-20-22-43 uploaded Andrew Morton
2020-04-21  5:43 ` Andrew Morton
2020-04-22  1:36 ` + mm-swapfilec-found_free-could-be-represented-by-tmp-max.patch added to -mm tree Andrew Morton
2020-04-22  1:36 ` + mm-swapfilec-tmp-is-always-smaller-than-max.patch " Andrew Morton
2020-04-22  1:36 ` + mm-swapfilec-omit-a-duplicate-code-by-compare-tmp-and-max-first.patch " Andrew Morton
2020-04-23 22:36 ` + kasan-initialise-array-in-kasan_memcmp-test.patch " Andrew Morton
2020-04-23 22:38 ` + kvm-svm-change-flag-passed-to-gup-fast-in-sev_pin_memory.patch " Andrew Morton
2020-04-23 22:41 ` + mm-pass-task-and-mm-to-do_madvise-fix.patch " Andrew Morton
2020-04-23 22:44 ` + mm-support-vector-address-ranges-for-process_madvise.patch " Andrew Morton
2020-04-23 22:44 ` + mm-support-vector-address-ranges-for-process_madvise-fix.patch " Andrew Morton
2020-04-23 22:48 ` + kasan-stop-tests-being-eliminated-as-dead-code-with-fortify_source.patch " Andrew Morton
2020-04-23 22:48 ` + stringh-fix-incompatibility-between-fortify_source-and-kasan.patch " Andrew Morton
2020-04-23 23:03 ` + powerpc-add-support-for-folded-p4d-page-tables-fix.patch " Andrew Morton
2020-04-23 23:09 ` [folded-merged] memcg-optimize-memorynuma_stat-like-memorystat-fix.patch removed from " Andrew Morton
2020-04-23 23:32 ` + slub-remove-userspace-notifier-for-cache-add-remove.patch added to " Andrew Morton
2020-04-23 23:35 ` + ocfs2-mount-shared-volume-without-ha-stack.patch " Andrew Morton
2020-04-24  0:29 ` + mm-memory_hotplug-handle-memblocks-only-with-config_arch_keep_memblock.patch " Andrew Morton
2020-04-24  1:17 ` + mm-return-true-in-cpupid_pid_unset.patch " Andrew Morton
2020-04-24  1:20 ` + kernel-better-document-the-use_mm-unuse_mm-api-contract-v2-fix.patch " Andrew Morton
2020-04-24  1:40 ` + mm-thp-rename-pmd_mknotpresent-as-pmd_mkinvalid-v2.patch " Andrew Morton
2020-04-24  1:47 ` + ipc-convert-ipcs_idr-to-xarray-update.patch " Andrew Morton
2020-06-05 19:58   ` Qian Cai
2020-06-05 20:11     ` Matthew Wilcox
2020-06-05 21:20       ` Andrew Morton
2020-06-10  2:14         ` Matthew Wilcox
2020-12-30 15:44   ` Manfred Spraul
2020-04-24  2:06 ` + powerpc-spufs-simplify-spufs-core-dumping.patch " Andrew Morton
2020-04-24  2:06 ` + signal-factor-copy_siginfo_to_external32-from-copy_siginfo_to_user32.patch " Andrew Morton
2020-04-24  2:06 ` + binfmt_elf-femove-the-set_fs-in-fill_siginfo_note.patch " Andrew Morton
2020-04-24  2:06 ` + binfmt_elf-remove-the-set_fskernel_ds-in-elf_core_dump.patch " Andrew Morton
2020-04-24  2:06 ` + binfmt_elf_fdpic-remove-the-set_fskernel_ds-in-elf_fdpic_core_dump.patch " Andrew Morton
2020-04-24  2:06 ` + exec-simplify-the-copy_strings_kernel-calling-convention.patch " Andrew Morton
2020-04-24  2:06 ` + exec-open-code-copy_string_kernel.patch " Andrew Morton
2020-04-24  3:24 ` + add-kernel-config-option-for-twisting-kernel-behavior.patch " Andrew Morton
2020-04-24  3:24 ` + twist-allow-disabling-k_spec-function-in-drivers-tty-vt-keyboardc.patch " Andrew Morton
2020-04-24  3:24 ` + twist-add-option-for-selecting-twist-options-for-syzkallers-testing.patch " Andrew Morton
2020-04-24  3:32 ` + eventpoll-fix-missing-wakeup-for-ovflist-in-ep_poll_callback.patch " Andrew Morton
2020-04-24  3:49 ` [obsolete] linux-next-rejects.patch removed from " Andrew Morton
2020-04-24  3:51 ` + mips-mm-add-page-soft-dirty-tracking.patch added to " Andrew Morton
2020-04-24 23:36 ` + mm-memory_hotplug-set-node_start_pfn-of-hotadded-pgdat-to-0.patch " Andrew Morton
2020-04-26  0:09 ` + mm-switch-the-test_vmalloc-module-to-use-__vmalloc_node-fix-fix.patch " Andrew Morton
2020-04-26  0:17 ` + mm-hugetlb-avoid-unnecessary-check-on-pud-and-pmd-entry-in-huge_pte_offset.patch " Andrew Morton
2020-04-26  0:29 ` + eventpoll-fix-missing-wakeup-for-ovflist-in-ep_poll_callback-v2.patch " Andrew Morton
2020-04-26  0:41 ` [withdrawn] kasan-initialise-array-in-kasan_memcmp-test.patch removed from " Andrew Morton
2020-04-26  0:41 ` + kasan-stop-tests-being-eliminated-as-dead-code-with-fortify_source-v4.patch added to " Andrew Morton
2020-04-26  0:48 ` + checkpatch-test-git_dir-changes.patch " Andrew Morton
2020-04-26  1:06 ` + mm-add-debug_wx-support.patch " Andrew Morton
2020-04-26  1:06 ` + riscv-support-debug_wx.patch " Andrew Morton
2020-04-26  1:06 ` + riscv-support-debug_wx-fix.patch " Andrew Morton
2020-04-26  1:06 ` + x86-mm-use-arch_has_debug_wx-instead-of-arch-defined.patch " Andrew Morton
2020-04-26  1:06 ` + arm64-mm-use-arch_has_debug_wx-instead-of-arch-defined.patch " Andrew Morton
2020-04-26  1:09 ` [folded-merged] initrdmem=-option-to-specify-initrd-physical-address-checkpatch-fixes.patch removed from " Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200421011351.qwgQp5LVv%akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=jgg@mellanox.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=longpeng2@huawei.com \
    --cc=mike.kravetz@oracle.com \
    --cc=mm-commits@vger.kernel.org \
    --cc=sean.j.christopherson@intel.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.