All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Alexander Egorenkov <egorenar@linux.ibm.com>,
	Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.14 02/50] mm/userfaultfd: do not access vma->vm_mm after calling handle_userfault()
Date: Tue,  1 Dec 2020 09:53:01 +0100	[thread overview]
Message-ID: <20201201084645.054722406@linuxfoundation.org> (raw)
In-Reply-To: <20201201084644.803812112@linuxfoundation.org>

From: Gerald Schaefer <gerald.schaefer@linux.ibm.com>

commit bfe8cc1db02ab243c62780f17fc57f65bde0afe1 upstream.

Alexander reported a syzkaller / KASAN finding on s390, see below for
complete output.

In do_huge_pmd_anonymous_page(), the pre-allocated pagetable will be
freed in some cases.  In the case of userfaultfd_missing(), this will
happen after calling handle_userfault(), which might have released the
mmap_lock.  Therefore, the following pte_free(vma->vm_mm, pgtable) will
access an unstable vma->vm_mm, which could have been freed or re-used
already.

For all architectures other than s390 this will go w/o any negative
impact, because pte_free() simply frees the page and ignores the
passed-in mm.  The implementation for SPARC32 would also access
mm->page_table_lock for pte_free(), but there is no THP support in
SPARC32, so the buggy code path will not be used there.

For s390, the mm->context.pgtable_list is being used to maintain the 2K
pagetable fragments, and operating on an already freed or even re-used
mm could result in various more or less subtle bugs due to list /
pagetable corruption.

Fix this by calling pte_free() before handle_userfault(), similar to how
it is already done in __do_huge_pmd_anonymous_page() for the WRITE /
non-huge_zero_page case.

Commit 6b251fc96cf2c ("userfaultfd: call handle_userfault() for
userfaultfd_missing() faults") actually introduced both, the
do_huge_pmd_anonymous_page() and also __do_huge_pmd_anonymous_page()
changes wrt to calling handle_userfault(), but only in the latter case
it put the pte_free() before calling handle_userfault().

  BUG: KASAN: use-after-free in do_huge_pmd_anonymous_page+0xcda/0xd90 mm/huge_memory.c:744
  Read of size 8 at addr 00000000962d6988 by task syz-executor.0/9334

  CPU: 1 PID: 9334 Comm: syz-executor.0 Not tainted 5.10.0-rc1-syzkaller-07083-g4c9720875573 #0
  Hardware name: IBM 3906 M04 701 (KVM/Linux)
  Call Trace:
    do_huge_pmd_anonymous_page+0xcda/0xd90 mm/huge_memory.c:744
    create_huge_pmd mm/memory.c:4256 [inline]
    __handle_mm_fault+0xe6e/0x1068 mm/memory.c:4480
    handle_mm_fault+0x288/0x748 mm/memory.c:4607
    do_exception+0x394/0xae0 arch/s390/mm/fault.c:479
    do_dat_exception+0x34/0x80 arch/s390/mm/fault.c:567
    pgm_check_handler+0x1da/0x22c arch/s390/kernel/entry.S:706
    copy_from_user_mvcos arch/s390/lib/uaccess.c:111 [inline]
    raw_copy_from_user+0x3a/0x88 arch/s390/lib/uaccess.c:174
    _copy_from_user+0x48/0xa8 lib/usercopy.c:16
    copy_from_user include/linux/uaccess.h:192 [inline]
    __do_sys_sigaltstack kernel/signal.c:4064 [inline]
    __s390x_sys_sigaltstack+0xc8/0x240 kernel/signal.c:4060
    system_call+0xe0/0x28c arch/s390/kernel/entry.S:415

  Allocated by task 9334:
    slab_alloc_node mm/slub.c:2891 [inline]
    slab_alloc mm/slub.c:2899 [inline]
    kmem_cache_alloc+0x118/0x348 mm/slub.c:2904
    vm_area_dup+0x9c/0x2b8 kernel/fork.c:356
    __split_vma+0xba/0x560 mm/mmap.c:2742
    split_vma+0xca/0x108 mm/mmap.c:2800
    mlock_fixup+0x4ae/0x600 mm/mlock.c:550
    apply_vma_lock_flags+0x2c6/0x398 mm/mlock.c:619
    do_mlock+0x1aa/0x718 mm/mlock.c:711
    __do_sys_mlock2 mm/mlock.c:738 [inline]
    __s390x_sys_mlock2+0x86/0xa8 mm/mlock.c:728
    system_call+0xe0/0x28c arch/s390/kernel/entry.S:415

  Freed by task 9333:
    slab_free mm/slub.c:3142 [inline]
    kmem_cache_free+0x7c/0x4b8 mm/slub.c:3158
    __vma_adjust+0x7b2/0x2508 mm/mmap.c:960
    vma_merge+0x87e/0xce0 mm/mmap.c:1209
    userfaultfd_release+0x412/0x6b8 fs/userfaultfd.c:868
    __fput+0x22c/0x7a8 fs/file_table.c:281
    task_work_run+0x200/0x320 kernel/task_work.c:151
    tracehook_notify_resume include/linux/tracehook.h:188 [inline]
    do_notify_resume+0x100/0x148 arch/s390/kernel/signal.c:538
    system_call+0xe6/0x28c arch/s390/kernel/entry.S:416

  The buggy address belongs to the object at 00000000962d6948 which belongs to the cache vm_area_struct of size 200
  The buggy address is located 64 bytes inside of 200-byte region [00000000962d6948, 00000000962d6a10)
  The buggy address belongs to the page: page:00000000313a09fe refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x962d6 flags: 0x3ffff00000000200(slab)
  raw: 3ffff00000000200 000040000257e080 0000000c0000000c 000000008020ba00
  raw: 0000000000000000 000f001e00000000 ffffffff00000001 0000000096959501
  page dumped because: kasan: bad access detected
  page->mem_cgroup:0000000096959501

  Memory state around the buggy address:
   00000000962d6880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   00000000962d6900: 00 fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb
  >00000000962d6980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                        ^
   00000000962d6a00: fb fb fc fc fc fc fc fc fc fc 00 00 00 00 00 00
   00000000962d6a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ==================================================================

Changes for v4.14 stable:
  - Make it apply w/o
    * Commit 4cf58924951ef ("mm: treewide: remove unused address argument
      from pte_alloc functions")
    * Commit 2b7403035459c ("mm: Change return type int to vm_fault_t for
      fault handlers")

Fixes: 6b251fc96cf2c ("userfaultfd: call handle_userfault() for userfaultfd_missing() faults")
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: <stable@vger.kernel.org>	[4.3+]
Link: https://lkml.kernel.org/r/20201110190329.11920-1-gerald.schaefer@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/huge_memory.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -688,7 +688,6 @@ int do_huge_pmd_anonymous_page(struct vm
 			transparent_hugepage_use_zero_page()) {
 		pgtable_t pgtable;
 		struct page *zero_page;
-		bool set;
 		int ret;
 		pgtable = pte_alloc_one(vma->vm_mm, haddr);
 		if (unlikely(!pgtable))
@@ -701,25 +700,25 @@ int do_huge_pmd_anonymous_page(struct vm
 		}
 		vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
 		ret = 0;
-		set = false;
 		if (pmd_none(*vmf->pmd)) {
 			ret = check_stable_address_space(vma->vm_mm);
 			if (ret) {
 				spin_unlock(vmf->ptl);
+				pte_free(vma->vm_mm, pgtable);
 			} else if (userfaultfd_missing(vma)) {
 				spin_unlock(vmf->ptl);
+				pte_free(vma->vm_mm, pgtable);
 				ret = handle_userfault(vmf, VM_UFFD_MISSING);
 				VM_BUG_ON(ret & VM_FAULT_FALLBACK);
 			} else {
 				set_huge_zero_page(pgtable, vma->vm_mm, vma,
 						   haddr, vmf->pmd, zero_page);
 				spin_unlock(vmf->ptl);
-				set = true;
 			}
-		} else
+		} else {
 			spin_unlock(vmf->ptl);
-		if (!set)
 			pte_free(vma->vm_mm, pgtable);
+		}
 		return ret;
 	}
 	gfp = alloc_hugepage_direct_gfpmask(vma);



  parent reply	other threads:[~2020-12-01  9:30 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-01  8:52 [PATCH 4.14 00/50] 4.14.210-rc1 review Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 01/50] perf event: Check ref_reloc_sym before using it Greg Kroah-Hartman
2020-12-01  8:53   ` Greg Kroah-Hartman
2020-12-01  8:53 ` Greg Kroah-Hartman [this message]
2020-12-01  8:53 ` [PATCH 4.14 03/50] btrfs: fix lockdep splat when reading qgroup config on mount Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 04/50] wireless: Use linux/stddef.h instead of stddef.h Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 05/50] PCI: Add device even if driver attach failed Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 06/50] btrfs: tree-checker: Enhance chunk checker to validate chunk profile Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 07/50] btrfs: adjust return values of btrfs_inode_by_name Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 08/50] btrfs: inode: Verify inode mode to avoid NULL pointer dereference Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 09/50] KVM: x86: Fix split-irqchip vs interrupt injection window request Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 10/50] arm64: pgtable: Fix pte_accessible() Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 11/50] arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect() Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 12/50] ALSA: hda/hdmi: Use single mutex unlock in error paths Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 13/50] ALSA: hda/hdmi: fix incorrect locking in hdmi_pcm_close Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 14/50] HID: cypress: Support Varmilo Keyboards media hotkeys Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 15/50] Input: i8042 - allow insmod to succeed on devices without an i8042 controller Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 16/50] HID: hid-sensor-hub: Fix issue with devices with no report ID Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 17/50] dmaengine: xilinx_dma: use readl_poll_timeout_atomic variant Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 18/50] x86/xen: dont unbind uninitialized lock_kicker_irq Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 19/50] HID: Add Logitech Dinovo Edge battery quirk Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 20/50] proc: dont allow async path resolution of /proc/self components Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 21/50] nvme: free sq/cq dbbuf pointers when dbbuf set fails Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 22/50] dmaengine: pl330: _prep_dma_memcpy: Fix wrong burst size Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 23/50] scsi: libiscsi: Fix NOP race condition Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 24/50] scsi: target: iscsi: Fix cmd abort fabric stop race Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 25/50] perf/x86: fix sysfs type mismatches Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 26/50] phy: tegra: xusb: Fix dangling pointer on probe failure Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 27/50] batman-adv: set .owner to THIS_MODULE Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 28/50] scsi: ufs: Fix race between shutdown and runtime resume flow Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 29/50] bnxt_en: fix error return code in bnxt_init_one() Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 30/50] bnxt_en: fix error return code in bnxt_init_board() Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 31/50] video: hyperv_fb: Fix the cache type when mapping the VRAM Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 32/50] bnxt_en: Release PCI regions when DMA mask setup fails during probe Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 33/50] IB/mthca: fix return value of error branch in mthca_init_cq() Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 34/50] nfc: s3fwrn5: use signed integer for parsing GPIO numbers Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 35/50] net: ena: set initial DMA width to avoid intel iommu issue Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 36/50] ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 37/50] ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 38/50] efivarfs: revert "fix memory leak in efivarfs_create()" Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 39/50] can: gs_usb: fix endianess problem with candleLight firmware Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 40/50] platform/x86: toshiba_acpi: Fix the wrong variable assignment Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 41/50] can: m_can: fix nominal bitiming tseg2 min for version >= 3.1 Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 42/50] perf probe: Fix to die_entrypc() returns error correctly Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 43/50] USB: core: Change %pK for __user pointers to %px Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 44/50] usb: gadget: f_midi: Fix memleak in f_midi_alloc Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 45/50] usb: gadget: Fix memleak in gadgetfs_fill_super Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 46/50] x86/speculation: Fix prctl() when spectre_v2_user={seccomp,prctl},ibpb Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 47/50] x86/resctrl: Remove superfluous kernfs_get() calls to prevent refcount leak Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 48/50] x86/resctrl: Add necessary kernfs_put() " Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 49/50] USB: core: add endpoint-blacklist quirk Greg Kroah-Hartman
2020-12-01  8:53 ` [PATCH 4.14 50/50] USB: core: Fix regression in Hercules audio card Greg Kroah-Hartman
2020-12-01 13:39 ` [PATCH 4.14 00/50] 4.14.210-rc1 review Jon Hunter
2020-12-01 21:39 ` Guenter Roeck
2020-12-02  6:04 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201201084645.054722406@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=egorenar@linux.ibm.com \
    --cc=gerald.schaefer@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.