linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, cai@lca.pw, elver@google.com,
	kirill@shutemov.name, linux-mm@kvack.org,
	mm-commits@vger.kernel.org, torvalds@linux-foundation.org,
	willy@infradead.org
Subject: [patch 24/39] mm/filemap.c: fix a data race in filemap_fault()
Date: Fri, 14 Aug 2020 17:31:27 -0700	[thread overview]
Message-ID: <20200815003127.4HDfspt4g%akpm@linux-foundation.org> (raw)
In-Reply-To: <20200814172939.55d6d80b6e21e4241f1ee1f3@linux-foundation.org>

From: Kirill A. Shutemov <kirill@shutemov.name>
Subject: mm/filemap.c: fix a data race in filemap_fault()

struct file_ra_state ra.mmap_miss could be accessed concurrently during
page faults as noticed by KCSAN,

 BUG: KCSAN: data-race in filemap_fault / filemap_map_pages

 write to 0xffff9b1700a2c1b4 of 4 bytes by task 3292 on cpu 30:
  filemap_fault+0x920/0xfc0
  do_sync_mmap_readahead at mm/filemap.c:2384
  (inlined by) filemap_fault at mm/filemap.c:2486
  __xfs_filemap_fault+0x112/0x3e0 [xfs]
  xfs_filemap_fault+0x74/0x90 [xfs]
  __do_fault+0x9e/0x220
  do_fault+0x4a0/0x920
  __handle_mm_fault+0xc69/0xd00
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x6f9
  page_fault+0x34/0x40

 read to 0xffff9b1700a2c1b4 of 4 bytes by task 3313 on cpu 32:
  filemap_map_pages+0xc2e/0xd80
  filemap_map_pages at mm/filemap.c:2625
  do_fault+0x3da/0x920
  __handle_mm_fault+0xc69/0xd00
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x6f9
  page_fault+0x34/0x40

 Reported by Kernel Concurrency Sanitizer on:
 CPU: 32 PID: 3313 Comm: systemd-udevd Tainted: G        W    L 5.5.0-next-20200210+ #1
 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019

ra.mmap_miss is used to contribute the readahead decisions, a data race
could be undesirable.  Both the read and write is only under non-exclusive
mmap_sem, two concurrent writers could even underflow the counter.  Fix
the underflow by writing to a local variable before committing a final
store to ra.mmap_miss given a small inaccuracy of the counter should be
acceptable.

Link: http://lkml.kernel.org/r/20200211030134.1847-1-cai@lca.pw
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Qian Cai <cai@lca.pw>
Tested-by: Qian Cai <cai@lca.pw>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/filemap.c |   20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

--- a/mm/filemap.c~mm-filemap-fix-a-data-race-in-filemap_fault
+++ a/mm/filemap.c
@@ -2468,6 +2468,7 @@ static struct file *do_sync_mmap_readahe
 	struct address_space *mapping = file->f_mapping;
 	struct file *fpin = NULL;
 	pgoff_t offset = vmf->pgoff;
+	unsigned int mmap_miss;
 
 	/* If we don't want any read-ahead, don't bother */
 	if (vmf->vma->vm_flags & VM_RAND_READ)
@@ -2483,14 +2484,15 @@ static struct file *do_sync_mmap_readahe
 	}
 
 	/* Avoid banging the cache line if not needed */
-	if (ra->mmap_miss < MMAP_LOTSAMISS * 10)
-		ra->mmap_miss++;
+	mmap_miss = READ_ONCE(ra->mmap_miss);
+	if (mmap_miss < MMAP_LOTSAMISS * 10)
+		WRITE_ONCE(ra->mmap_miss, ++mmap_miss);
 
 	/*
 	 * Do we miss much more than hit in this file? If so,
 	 * stop bothering with read-ahead. It will only hurt.
 	 */
-	if (ra->mmap_miss > MMAP_LOTSAMISS)
+	if (mmap_miss > MMAP_LOTSAMISS)
 		return fpin;
 
 	/*
@@ -2516,13 +2518,15 @@ static struct file *do_async_mmap_readah
 	struct file_ra_state *ra = &file->f_ra;
 	struct address_space *mapping = file->f_mapping;
 	struct file *fpin = NULL;
+	unsigned int mmap_miss;
 	pgoff_t offset = vmf->pgoff;
 
 	/* If we don't want any read-ahead, don't bother */
 	if (vmf->vma->vm_flags & VM_RAND_READ || !ra->ra_pages)
 		return fpin;
-	if (ra->mmap_miss > 0)
-		ra->mmap_miss--;
+	mmap_miss = READ_ONCE(ra->mmap_miss);
+	if (mmap_miss)
+		WRITE_ONCE(ra->mmap_miss, --mmap_miss);
 	if (PageReadahead(page)) {
 		fpin = maybe_unlock_mmap_for_io(vmf, fpin);
 		page_cache_async_readahead(mapping, ra, file,
@@ -2688,6 +2692,7 @@ void filemap_map_pages(struct vm_fault *
 	unsigned long max_idx;
 	XA_STATE(xas, &mapping->i_pages, start_pgoff);
 	struct page *page;
+	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
 
 	rcu_read_lock();
 	xas_for_each(&xas, page, end_pgoff) {
@@ -2724,8 +2729,8 @@ void filemap_map_pages(struct vm_fault *
 		if (page->index >= max_idx)
 			goto unlock;
 
-		if (file->f_ra.mmap_miss > 0)
-			file->f_ra.mmap_miss--;
+		if (mmap_miss > 0)
+			mmap_miss--;
 
 		vmf->address += (xas.xa_index - last_pgoff) << PAGE_SHIFT;
 		if (vmf->pte)
@@ -2745,6 +2750,7 @@ next:
 			break;
 	}
 	rcu_read_unlock();
+	WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss);
 }
 EXPORT_SYMBOL(filemap_map_pages);
 
_


  parent reply	other threads:[~2020-08-15  0:31 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-15  0:29 incoming Andrew Morton
2020-08-15  0:30 ` [patch 01/39] asm-generic: pgalloc.h: use correct #ifdef to enable pud_alloc_one() Andrew Morton
2020-08-15  0:30 ` [patch 02/39] Revert "mm/vmstat.c: do not show lowmem reserve protection information of empty zone" Andrew Morton
2020-08-15  0:30 ` [patch 03/39] lz4: fix kernel decompression speed Andrew Morton
2020-08-15  0:30 ` [patch 04/39] exec: restore EACCES of S_ISDIR execve() Andrew Morton
2020-08-15  0:30 ` [patch 05/39] selftests/exec: add file type errno tests Andrew Morton
2020-08-15  0:30 ` [patch 06/39] mailmap: add entry for Greg Kurz Andrew Morton
2020-08-15  0:30 ` [patch 07/39] mm: store compound_nr as well as compound_order Andrew Morton
2020-08-15  0:30 ` [patch 08/39] mm: move page-flags include to top of file Andrew Morton
2020-08-15  0:30 ` [patch 09/39] mm: add thp_order Andrew Morton
2020-08-15  0:30 ` [patch 10/39] mm: add thp_size Andrew Morton
2020-08-15  0:30 ` [patch 11/39] mm: replace hpage_nr_pages with thp_nr_pages Andrew Morton
2020-08-15  0:30 ` [patch 12/39] mm: add thp_head Andrew Morton
2020-08-15  0:30 ` [patch 13/39] mm: introduce offset_in_thp Andrew Morton
2020-08-15  0:30 ` [patch 14/39] fs: autofs: delete repeated words in comments Andrew Morton
2020-08-15  0:30 ` [patch 15/39] mm/madvise: pass task and mm to do_madvise Andrew Morton
2020-08-15  0:30 ` [patch 16/39] pid: move pidfd_get_pid() to pid.c Andrew Morton
2020-08-15  0:30 ` [patch 17/39] mm/madvise: introduce process_madvise() syscall: an external memory hinting API Andrew Morton
2020-08-16  8:12   ` Christian Brauner
2020-08-17 15:10     ` Minchan Kim
2020-08-15  0:31 ` [patch 18/39] mm/madvise: check fatal signal pending of target process Andrew Morton
2020-08-15  2:53   ` Linus Torvalds
2020-08-15  4:59     ` Minchan Kim
2020-08-15 14:57       ` Linus Torvalds
2020-08-15 18:34         ` Minchan Kim
2020-08-16  1:43           ` Linus Torvalds
2020-08-16  5:58             ` Minchan Kim
2020-08-15  0:31 ` [patch 19/39] all arch: remove system call sys_sysctl Andrew Morton
2020-08-15  0:31 ` [patch 20/39] mm/kmemleak: silence KCSAN splats in checksum Andrew Morton
2020-08-15  0:31 ` [patch 21/39] mm/frontswap: mark various intentional data races Andrew Morton
2020-08-15  0:31 ` [patch 22/39] mm/page_io: " Andrew Morton
2020-08-15  0:31 ` [patch 23/39] mm/swap_state: " Andrew Morton
2020-08-15  0:31 ` Andrew Morton [this message]
2020-08-15  0:31 ` [patch 25/39] mm/swapfile: fix and annotate various " Andrew Morton
2020-08-15  0:31 ` [patch 26/39] mm/page_counter: fix various data races at memsw Andrew Morton
2020-08-15  0:31 ` [patch 27/39] mm/memcontrol: fix a data race in scan count Andrew Morton
2020-08-15  0:31 ` [patch 28/39] mm/list_lru: fix a data race in list_lru_count_one Andrew Morton
2020-08-15  0:31 ` [patch 29/39] mm/mempool: fix a data race in mempool_free() Andrew Morton
2020-08-15  0:31 ` [patch 30/39] mm/rmap: annotate a data race at tlb_flush_batched Andrew Morton
2020-08-15  0:31 ` [patch 31/39] mm/swap.c: annotate data races for lru_rotate_pvecs Andrew Morton
2020-08-15  0:31 ` [patch 32/39] mm: annotate a data race in page_zonenum() Andrew Morton
2020-08-15  0:31 ` [patch 33/39] include/asm-generic/vmlinux.lds.h: align ro_after_init Andrew Morton
2020-08-15  0:32 ` [patch 34/39] sh: clkfwk: remove r8/r16/r32 Andrew Morton
2020-08-15  0:32 ` [patch 35/39] sh: use generic strncpy() Andrew Morton
2020-08-15  0:32 ` [patch 36/39] iomap: constify ioreadX() iomem argument (as in generic implementation) Andrew Morton
2020-08-15  0:32 ` [patch 37/39] rtl818x: " Andrew Morton
2020-08-15  0:32 ` [patch 38/39] ntb: intel: " Andrew Morton
2020-08-15  0:32 ` [patch 39/39] virtio: pci: " Andrew Morton
2020-08-19 23:09 ` mmotm 2020-08-19-16-09 uploaded Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200815003127.4HDfspt4g%akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=cai@lca.pw \
    --cc=elver@google.com \
    --cc=kirill@shutemov.name \
    --cc=linux-mm@kvack.org \
    --cc=mm-commits@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).