All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ben Hutchings <ben@decadent.org.uk>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: akpm@linux-foundation.org, Denis Kirjanov <kda@linux-powerpc.org>,
	"Hugh Dickins" <hughd@google.com>,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Mike Kravetz" <mike.kravetz@oracle.com>,
	"Michal Hocko" <mhocko@suse.com>,
	"Jason Gunthorpe" <jgg@mellanox.com>,
	"Mike Rapoport" <rppt@linux.vnet.ibm.com>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Jann Horn" <jannh@google.com>
Subject: [PATCH 3.16 60/87] coredump: fix race condition between collapse_huge_page() and core dumping
Date: Wed, 02 Oct 2019 20:06:51 +0100	[thread overview]
Message-ID: <lsq.1570043211.584670199@decadent.org.uk> (raw)
In-Reply-To: <lsq.1570043210.379046399@decadent.org.uk>

3.16.75-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Andrea Arcangeli <aarcange@redhat.com>

commit 59ea6d06cfa9247b586a695c21f94afa7183af74 upstream.

When fixing the race conditions between the coredump and the mmap_sem
holders outside the context of the process, we focused on
mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix
race condition between mmget_not_zero()/get_task_mm() and core
dumping"), but those aren't the only cases where the mmap_sem can be
taken outside of the context of the process as Michal Hocko noticed
while backporting that commit to older -stable kernels.

If mmgrab() is called in the context of the process, but then the
mm_count reference is transferred outside the context of the process,
that can also be a problem if the mmap_sem has to be taken for writing
through that mm_count reference.

khugepaged registration calls mmgrab() in the context of the process,
but the mmap_sem for writing is taken later in the context of the
khugepaged kernel thread.

collapse_huge_page() after taking the mmap_sem for writing doesn't
modify any vma, so it's not obvious that it could cause a problem to the
coredump, but it happens to modify the pmd in a way that breaks an
invariant that pmd_trans_huge_lock() relies upon.  collapse_huge_page()
needs the mmap_sem for writing just to block concurrent page faults that
call pmd_trans_huge_lock().

Specifically the invariant that "!pmd_trans_huge()" cannot become a
"pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.

The coredump will call __get_user_pages() without mmap_sem for reading,
which eventually can invoke a lockless page fault which will need a
functional pmd_trans_huge_lock().

So collapse_huge_page() needs to use mmget_still_valid() to check it's
not running concurrently with the coredump...  as long as the coredump
can invoke page faults without holding the mmap_sem for reading.

This has "Fixes: khugepaged" to facilitate backporting, but in my view
it's more a bug in the coredump code that will eventually have to be
rewritten to stop invoking page faults without the mmap_sem for reading.
So the long term plan is still to drop all mmget_still_valid().

Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
Fixes: ba76149f47d8 ("thp: khugepaged")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.16:
 - Don't set result variable; collapse_huge_range() returns void
 - Adjust filenames]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2439,6 +2439,10 @@ static inline void mmdrop(struct mm_stru
  * followed by taking the mmap_sem for writing before modifying the
  * vmas or anything the coredump pretends not to change from under it.
  *
+ * It also has to be called when mmgrab() is used in the context of
+ * the process, but then the mm_count refcount is transferred outside
+ * the context of the process to run down_write() on that pinned mm.
+ *
  * NOTE: find_extend_vma() called from GUP context is the only place
  * that can modify the "mm" (notably the vm_start/end) under mmap_sem
  * for reading and outside the context of the process, so it is also
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2428,6 +2428,8 @@ static void collapse_huge_page(struct mm
 	 * handled by the anon_vma lock + PG_lock.
 	 */
 	down_write(&mm->mmap_sem);
+	if (!mmget_still_valid(mm))
+		goto out;
 	if (unlikely(khugepaged_test_exit(mm)))
 		goto out;
 


  parent reply	other threads:[~2019-10-02 19:09 UTC|newest]

Thread overview: 101+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-02 19:06 [PATCH 3.16 00/87] 3.16.75-rc1 review Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 27/87] genwqe: Prevent an integer overflow in the ioctl Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 52/87] ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 61/87] cfg80211: fix memory leak of wiphy device name Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 10/87] drm/gma500/cdv: Check vbt config bits when detecting lvds panels Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 04/87] ASoC: cs42xx8: Add regcache mask dirty Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 80/87] bonding: Always enable vlan tx offload Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 59/87] fs/ocfs2: fix race in ocfs2_dentry_attach_lock() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 41/87] kernel/signal.c: trace_signal_deliver when signal_group_exit Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 21/87] ipv4/igmp: fix build error if !CONFIG_IP_MULTICAST Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 26/87] gpio: fix gpio-adp5588 build errors Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 40/87] net-gro: fix use-after-free read in napi_gro_frags() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 42/87] USB: usb-storage: Add new ID to ums-realtek Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 66/87] perf/core: Fix perf_sample_regs_user() mm check Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 57/87] libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 47/87] net: rds: fix memory leak in rds_ib_flush_mr_pool Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 63/87] btrfs: start readahead also in seed devices Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 82/87] sctp: change to hold sk after auth shkey is created successfully Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 20/87] ipv4/igmp: fix another memory leak in igmpv3_del_delrec() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 62/87] Btrfs: fix race between readahead and device replace/removal Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 05/87] scsi: bnx2fc: fix incorrect cast to u64 on shift operation Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 75/87] be2net: fix link failure after ethtool offline test Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 58/87] cifs: add spinlock for the openFileList to cifsInodeInfo Ben Hutchings
2019-10-28 22:19   ` Pavel Shilovskiy
2019-10-29 13:15     ` Ben Hutchings
2019-11-19 14:49       ` Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 55/87] KVM: arm64: Filter out invalid core register IDs in KVM_GET_REG_LIST Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 69/87] net: netem: fix backlog accounting for corrupted GSO frames Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 16/87] usb: xhci: avoid null pointer deref when bos field is NULL Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 36/87] usbip: usbip_host: fix stub_dev lock context imbalance regression Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 81/87] bonding: Add vlan tx offload to hw_enc_features Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 48/87] pktgen: do not sleep with the thread lock held Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 03/87] Btrfs: fix race between ranged fsync and writeback of adjacent ranges Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 13/87] tty: max310x: Fix external crystal register setup Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 23/87] Input: uinput - add compat ioctl number translation for UI_*_FF_UPLOAD Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 67/87] SMB3: retry on STATUS_INSUFFICIENT_RESOURCES instead of failing write Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 72/87] net/af_iucv: always register net_device notifier Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 44/87] s390/qeth: fix VLAN attribute in bridge_hostnotify udev event Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 12/87] serial: sh-sci: disable DMA for uart_console Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 17/87] net: stmmac: fix reset gpio free missing Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 01/87] net/mlx4_core: Change the error print to info print Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 32/87] configfs: Fix use-after-free when accessing sd->s_dentry Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 70/87] scsi: ufs: Avoid runtime suspend possibly being blocked forever Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 09/87] USB: rio500: fix memory leak in close after disconnect Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 85/87] scsi: target/iblock: Fix overrun in WRITE SAME emulation Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 46/87] parisc: Use implicit space register selection for loading the coherence index of I/O pdirs Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 11/87] USB: serial: pl2303: add Allied Telesis VT-Kit3 Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 76/87] perf/ioctl: Add check for the sample_period value Ben Hutchings
2019-10-02 19:06   ` Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 15/87] powerpc/perf: Fix MMCRA corruption by bhrb_filter Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 38/87] scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs) Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 87/87] crypto: user - prevent operating on larval algorithms Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 50/87] can: af_can: Fix error path of can_init() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 29/87] staging: iio: cdc: Don't put an else right after a return Ben Hutchings
2019-10-02 21:36   ` Joe Perches
2019-10-03 14:47     ` Ben Hutchings
2019-10-03 15:09       ` Joe Perches
2019-10-03 22:06         ` Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 39/87] signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 31/87] i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 35/87] s390/crypto: fix possible sleep during spinlock aquired Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 64/87] be2net: Fix number of Rx queues used for flow hashing Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 68/87] apparmor: enforce nullbyte at end of tag string Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 30/87] staging:iio:ad7150: fix threshold mode config bit Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 22/87] sbitmap: fix improper use of smp_mb__before_atomic() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 43/87] USB: Fix chipmunk-like voice when using Logitech C270 for recording audio Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 49/87] can: flexcan: fix timeout when set small bitrate Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 06/87] USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 78/87] x86/speculation: Allow guests to use SSBD even if host does not Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 71/87] net/af_iucv: remove GFP_DMA restriction for HiperTransport Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 19/87] igmp: add a missing spin_lock_init() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 73/87] scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 74/87] x86/apic: Fix integer overflow on 10 bit left shift of cpu_khz Ben Hutchings
2019-10-02 19:06   ` Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 24/87] perf/ring_buffer: Fix exposing a temporarily decreased data_head Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 79/87] cpu/speculation: Warn on unsupported mitigations= parameter Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 45/87] hwmon: (pmbus/core) Treat parameters as paged if on multiple pages Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 53/87] ptrace: restore smp_rmb() in __ptrace_may_access() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 34/87] CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 18/87] igmp: acquire pmc lock for ip_mc_clear_src() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 25/87] perf/ring_buffer: Add ordering to rb->nest increment Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 02/87] spi: bitbang: Fix NULL pointer dereference in spi_unregister_master Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 37/87] scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 28/87] net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 56/87] bcache: fix stack corruption by PRECEDING_KEY() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 86/87] lib/mpi: Fix karactx leak in mpi_powm Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 14/87] powerpc/perf: add missing put_cpu_var in power_pmu_event_init Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 77/87] MIPS: Add missing EHB in mtc0 -> mfc0 sequence Ben Hutchings
2019-10-02 19:06 ` Ben Hutchings [this message]
2019-10-02 19:06 ` [PATCH 3.16 84/87] tracing/snapshot: Resize spare buffer if size changed Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 51/87] can: purge socket error queue on sock destruct Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 65/87] neigh: fix use-after-free read in pneigh_get_next Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 08/87] usbip: usbip_host: fix BUG: sleeping function called from invalid context Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 33/87] llc: fix skb leak in llc_build_and_send_ui_pkt() Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 07/87] USB: Add LPM quirk for Surface Dock GigE adapter Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 54/87] i2c: acorn: fix i2c warning Ben Hutchings
2019-10-02 19:06 ` [PATCH 3.16 83/87] ALSA: seq: fix incorrect order of dest_client/dest_ports arguments Ben Hutchings
2019-10-03 12:54 ` [PATCH 3.16 00/87] 3.16.75-rc1 review Guenter Roeck
2019-10-03 22:25   ` Ben Hutchings
2019-10-04 23:09 ` Guenter Roeck
2019-10-05 20:29   ` Ben Hutchings

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=lsq.1570043211.584670199@decadent.org.uk \
    --to=ben@decadent.org.uk \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=jannh@google.com \
    --cc=jgg@mellanox.com \
    --cc=kda@linux-powerpc.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=oleg@redhat.com \
    --cc=peterx@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.