linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Song Liu <songliubraving@fb.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.19 88/92] khugepaged: retract_page_tables() remember to test exit
Date: Thu, 20 Aug 2020 11:22:13 +0200	[thread overview]
Message-ID: <20200820091542.214329888@linuxfoundation.org> (raw)
In-Reply-To: <20200820091537.490965042@linuxfoundation.org>

From: Hugh Dickins <hughd@google.com>

commit 18e77600f7a1ed69f8ce46c9e11cad0985712dfa upstream.

Only once have I seen this scenario (and forgot even to notice what forced
the eventual crash): a sequence of "BUG: Bad page map" alerts from
vm_normal_page(), from zap_pte_range() servicing exit_mmap();
pmd:00000000, pte values corresponding to data in physical page 0.

The pte mappings being zapped in this case were supposed to be from a huge
page of ext4 text (but could as well have been shmem): my belief is that
it was racing with collapse_file()'s retract_page_tables(), found *pmd
pointing to a page table, locked it, but *pmd had become 0 by the time
start_pte was decided.

In most cases, that possibility is excluded by holding mmap lock; but
exit_mmap() proceeds without mmap lock.  Most of what's run by khugepaged
checks khugepaged_test_exit() after acquiring mmap lock:
khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
for example.  But retract_page_tables() did not: fix that.

The fix is for retract_page_tables() to check khugepaged_test_exit(),
after acquiring mmap lock, before doing anything to the page table.
Getting the mmap lock serializes with __mmput(), which briefly takes and
drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
mm_users makes sure we don't touch the page table once exit_mmap() might
reach it, since exit_mmap() will be proceeding without mmap lock, not
expecting anyone to be racing with it.

Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: <stable@vger.kernel.org>	[4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021215400.27773@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/khugepaged.c |   22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1251,6 +1251,7 @@ static void collect_mm_slot(struct mm_sl
 static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 {
 	struct vm_area_struct *vma;
+	struct mm_struct *mm;
 	unsigned long addr;
 	pmd_t *pmd, _pmd;
 
@@ -1264,7 +1265,8 @@ static void retract_page_tables(struct a
 			continue;
 		if (vma->vm_end < addr + HPAGE_PMD_SIZE)
 			continue;
-		pmd = mm_find_pmd(vma->vm_mm, addr);
+		mm = vma->vm_mm;
+		pmd = mm_find_pmd(mm, addr);
 		if (!pmd)
 			continue;
 		/*
@@ -1273,14 +1275,16 @@ static void retract_page_tables(struct a
 		 * re-fault. Not ideal, but it's more important to not disturb
 		 * the system too much.
 		 */
-		if (down_write_trylock(&vma->vm_mm->mmap_sem)) {
-			spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);
-			/* assume page table is clear */
-			_pmd = pmdp_collapse_flush(vma, addr, pmd);
-			spin_unlock(ptl);
-			up_write(&vma->vm_mm->mmap_sem);
-			mm_dec_nr_ptes(vma->vm_mm);
-			pte_free(vma->vm_mm, pmd_pgtable(_pmd));
+		if (down_write_trylock(&mm->mmap_sem)) {
+			if (!khugepaged_test_exit(mm)) {
+				spinlock_t *ptl = pmd_lock(mm, pmd);
+				/* assume page table is clear */
+				_pmd = pmdp_collapse_flush(vma, addr, pmd);
+				spin_unlock(ptl);
+				mm_dec_nr_ptes(mm);
+				pte_free(mm, pmd_pgtable(_pmd));
+			}
+			up_write(&mm->mmap_sem);
 		}
 	}
 	i_mmap_unlock_write(mapping);



  parent reply	other threads:[~2020-08-20 12:21 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-20  9:20 [PATCH 4.19 00/92] 4.19.141-rc1 review Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 01/92] smb3: warn on confusing error scenario with sec=krb5 Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 02/92] genirq/affinity: Make affinity setting if activated opt-in Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 03/92] PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context() Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 04/92] PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 05/92] PCI: Add device even if driver attach failed Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 06/92] PCI: qcom: Define some PARF params needed for ipq8064 SoC Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 07/92] PCI: qcom: Add support for tx term offset for rev 2.1.0 Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 08/92] PCI: Probe bridge window attributes once at enumeration-time Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 09/92] btrfs: free anon block device right after subvolume deletion Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 10/92] btrfs: dont allocate anonymous block device for user invisible roots Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 11/92] btrfs: ref-verify: fix memory leak in add_block_entry Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 12/92] btrfs: dont traverse into the seed devices in show_devname Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 13/92] btrfs: open device without device_list_mutex Greg Kroah-Hartman
2020-08-20  9:20 ` [PATCH 4.19 14/92] btrfs: fix messages after changing compression level by remount Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 15/92] btrfs: only search for left_info if there is no right_info in try_merge_free_space Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 16/92] btrfs: fix memory leaks after failure to lookup checksums during inode logging Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 17/92] btrfs: fix return value mixup in btrfs_get_extent Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 18/92] dt-bindings: iio: io-channel-mux: Fix compatible string in example code Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 19/92] iio: dac: ad5592r: fix unbalanced mutex unlocks in ad5592r_read_raw() Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 20/92] xtensa: fix xtensa_pmu_setup prototype Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 21/92] cifs: Fix leak when handling lease break for cached root fid Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 22/92] powerpc: Allow 4224 bytes of stack expansion for the signal frame Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 23/92] powerpc: Fix circular dependency between percpu.h and mmu.h Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 24/92] media: vsp1: dl: Fix NULL pointer dereference on unbind Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 25/92] net: ethernet: stmmac: Disable hardware multicast filter Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 26/92] net: stmmac: dwmac1000: provide multicast filter fallback Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 27/92] net/compat: Add missing sock updates for SCM_RIGHTS Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 28/92] md/raid5: Fix Force reconstruct-write io stuck in degraded raid5 Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 29/92] bcache: allocate meta data pages as compound pages Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 30/92] bcache: fix overflow in offset_to_stripe() Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 31/92] mac80211: fix misplaced while instead of if Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 32/92] driver core: Avoid binding drivers to dead devices Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 33/92] MIPS: CPU#0 is not hotpluggable Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 34/92] ext2: fix missing percpu_counter_inc Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 35/92] ocfs2: change slot number type s16 to u16 Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 36/92] mm/page_counter.c: fix protection usage propagation Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 37/92] ftrace: Setup correct FTRACE_FL_REGS flags for module Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 38/92] kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler Greg Kroah-Hartman
2020-09-28 20:02   ` Naresh Kamboju
2020-09-28 22:09     ` Steven Rostedt
2020-09-28 22:15       ` Steven Rostedt
2020-09-29  5:49         ` Masami Hiramatsu
2020-09-29  6:52           ` Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 39/92] tracing/hwlat: Honor the tracing_cpumask Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 40/92] tracing: Use trace_sched_process_free() instead of exit() for pid tracing Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 41/92] watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 42/92] watchdog: f71808e_wdt: remove use of wrong watchdog_info option Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 43/92] watchdog: f71808e_wdt: clear watchdog timeout occurred flag Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 44/92] pseries: Fix 64 bit logical memory block panic Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 45/92] module: Correctly truncate sysfs sections output Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 46/92] perf intel-pt: Fix FUP packet state Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 47/92] remoteproc: qcom: q6v5: Update running state before requesting stop Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 48/92] drm/imx: imx-ldb: Disable both channels for split mode in enc->disable() Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 49/92] mfd: arizona: Ensure 32k clock is put on driver unbind and error Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 50/92] RDMA/ipoib: Return void from ipoib_ib_dev_stop() Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 51/92] RDMA/ipoib: Fix ABBA deadlock with ipoib_reap_ah() Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 52/92] media: rockchip: rga: Introduce color fmt macros and refactor CSC mode logic Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 53/92] media: rockchip: rga: Only set output CSC mode for RGB input Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 54/92] USB: serial: ftdi_sio: make process-packet buffer unsigned Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 55/92] USB: serial: ftdi_sio: clean up receive processing Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 56/92] USB: serial: ftdi_sio: fix break and sysrq handling Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 57/92] mmc: renesas_sdhi_internal_dmac: clean up the code for dma complete Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 58/92] gpu: ipu-v3: image-convert: Combine rotate/no-rotate irq handlers Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 59/92] dm rq: dont call blk_mq_queue_stopped() in dm_stop_queue() Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 60/92] selftests/powerpc: ptrace-pkey: Rename variables to make it easier to follow code Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 61/92] selftests/powerpc: ptrace-pkey: Update the test to mark an invalid pkey correctly Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 62/92] selftests/powerpc: ptrace-pkey: Dont update expected UAMOR value Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 63/92] iommu/omap: Check for failure of a call to omap_iommu_dump_ctx Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 64/92] iommu/vt-d: Enforce PASID devTLB field mask Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 65/92] i2c: rcar: slave: only send STOP event when we have been addressed Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 66/92] clk: clk-atlas6: fix return value check in atlas6_clk_init() Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 67/92] pwm: bcm-iproc: handle clk_get_rate() return Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 68/92] tools build feature: Use CC and CXX from parent Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 69/92] i2c: rcar: avoid race when unregistering slave Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 70/92] openrisc: Fix oops caused when dumping stack Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 71/92] scsi: lpfc: nvmet: Avoid hang / use-after-free again when destroying targetport Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 72/92] watchdog: initialize device before misc_register Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 73/92] Input: sentelic - fix error return when fsp_reg_write fails Greg Kroah-Hartman
2020-08-20  9:21 ` [PATCH 4.19 74/92] drm/vmwgfx: Use correct vmw_legacy_display_unit pointer Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 75/92] drm/vmwgfx: Fix two list_for_each loop exit tests Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 76/92] net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 77/92] nfs: Fix getxattr kernel panic and memory overflow Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 78/92] fs/minix: set s_maxbytes correctly Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 79/92] fs/minix: fix block limit check for V1 filesystems Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 80/92] fs/minix: remove expected error message in block_to_path() Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 81/92] fs/ufs: avoid potential u32 multiplication overflow Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 82/92] test_kmod: avoid potential double free in trigger_config_run_type() Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 83/92] mfd: dln2: Run event handler loop under spinlock Greg Kroah-Hartman
2020-08-21  7:21   ` Pavel Machek
2020-08-21  9:06     ` Andy Shevchenko
2020-08-21  9:14       ` Greg Kroah-Hartman
2020-08-21  9:15         ` Greg Kroah-Hartman
2020-08-21 10:54           ` Andy Shevchenko
2020-08-21 11:21             ` Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 84/92] ALSA: echoaudio: Fix potential Oops in snd_echo_resume() Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 85/92] perf bench mem: Always memset source before memcpy Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 86/92] tools build feature: Quote CC and CXX for their arguments Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 87/92] sh: landisk: Add missing initialization of sh_io_port_base Greg Kroah-Hartman
2020-08-20  9:22 ` Greg Kroah-Hartman [this message]
2020-08-20  9:22 ` [PATCH 4.19 89/92] arm64: dts: marvell: espressobin: add ethernet alias Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 90/92] drm/radeon: fix fb_div check in ni_init_smc_spll_table() Greg Kroah-Hartman
2020-08-21  7:27   ` Pavel Machek
2020-08-21  7:37     ` Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 91/92] drm: Added orientation quirk for ASUS tablet model T103HAF Greg Kroah-Hartman
2020-08-20  9:22 ` [PATCH 4.19 92/92] drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume Greg Kroah-Hartman
2020-08-20 20:03 ` [PATCH 4.19 00/92] 4.19.141-rc1 review Guenter Roeck
2020-08-20 20:05 ` Guenter Roeck
2020-08-20 23:49 ` Shuah Khan
2020-08-21  7:09 ` Naresh Kamboju
2020-08-21  7:39 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200820091542.214329888@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=songliubraving@fb.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).