From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, dan.j.williams@intel.com,
jmoyer@redhat.com, Justin.He@arm.com,
kirill.shutemov@linux.intel.com, kirill@shutemov.name,
linux-mm@kvack.org, mm-commits@vger.kernel.org,
stable@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 3/7] mm: avoid data corruption on CoW fault into PFN-mapped VMA
Date: Thu, 05 Mar 2020 22:28:32 -0800 [thread overview]
Message-ID: <20200306062832.Eauf-skRC%akpm@linux-foundation.org> (raw)
In-Reply-To: <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
From: "Kirill A. Shutemov" <kirill@shutemov.name>
Subject: mm: avoid data corruption on CoW fault into PFN-mapped VMA
Jeff Moyer has reported that one of xfstests triggers a warning when run
on DAX-enabled filesystem:
WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
...
wp_page_copy+0x98c/0xd50 (unreliable)
do_wp_page+0xd8/0xad0
__handle_mm_fault+0x748/0x1b90
handle_mm_fault+0x120/0x1f0
__do_page_fault+0x240/0xd70
do_page_fault+0x38/0xd0
handle_page_fault+0x10/0x30
The warning happens on failed __copy_from_user_inatomic() which tries to
copy data into a CoW page.
This happens because of race between MADV_DONTNEED and CoW page fault:
CPU0 CPU1
handle_mm_fault()
do_wp_page()
wp_page_copy()
do_wp_page()
madvise(MADV_DONTNEED)
zap_page_range()
zap_pte_range()
ptep_get_and_clear_full()
<TLB flush>
__copy_from_user_inatomic()
sees empty PTE and fails
WARN_ON_ONCE(1)
clear_page()
The solution is to re-try __copy_from_user_inatomic() under PTL after
checking that PTE is matches the orig_pte.
The second copy attempt can still fail, like due to non-readable PTE, but
there's nothing reasonable we can do about, except clearing the CoW page.
Link: http://lkml.kernel.org/r/20200218154151.13349-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Justin He <Justin.He@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/memory.c | 35 +++++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)
--- a/mm/memory.c~mm-avoid-data-corruption-on-cow-fault-into-pfn-mapped-vma
+++ a/mm/memory.c
@@ -2257,7 +2257,7 @@ static inline bool cow_user_page(struct
bool ret;
void *kaddr;
void __user *uaddr;
- bool force_mkyoung;
+ bool locked = false;
struct vm_area_struct *vma = vmf->vma;
struct mm_struct *mm = vma->vm_mm;
unsigned long addr = vmf->address;
@@ -2282,11 +2282,11 @@ static inline bool cow_user_page(struct
* On architectures with software "accessed" bits, we would
* take a double page fault, so mark it accessed here.
*/
- force_mkyoung = arch_faults_on_old_pte() && !pte_young(vmf->orig_pte);
- if (force_mkyoung) {
+ if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {
pte_t entry;
vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+ locked = true;
if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
/*
* Other thread has already handled the fault
@@ -2310,18 +2310,37 @@ static inline bool cow_user_page(struct
* zeroes.
*/
if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
+ if (locked)
+ goto warn;
+
+ /* Re-validate under PTL if the page is still mapped */
+ vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+ locked = true;
+ if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
+ /* The PTE changed under us. Retry page fault. */
+ ret = false;
+ goto pte_unlock;
+ }
+
/*
- * Give a warn in case there can be some obscure
- * use-case
+ * The same page can be mapped back since last copy attampt.
+ * Try to copy again under PTL.
*/
- WARN_ON_ONCE(1);
- clear_page(kaddr);
+ if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
+ /*
+ * Give a warn in case there can be some obscure
+ * use-case
+ */
+warn:
+ WARN_ON_ONCE(1);
+ clear_page(kaddr);
+ }
}
ret = true;
pte_unlock:
- if (force_mkyoung)
+ if (locked)
pte_unmap_unlock(vmf->pte, vmf->ptl);
kunmap_atomic(kaddr);
flush_dcache_page(dst);
_
next prev parent reply other threads:[~2020-03-06 6:28 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-06 6:27 incoming Andrew Morton
2020-03-06 6:28 ` [patch 1/7] mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa Andrew Morton
2020-03-06 6:28 ` [patch 2/7] mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() Andrew Morton
2020-03-06 6:28 ` Andrew Morton [this message]
2020-03-06 6:28 ` [patch 4/7] fat: fix uninit-memory access for partial initialized inode Andrew Morton
2020-03-06 6:28 ` [patch 5/7] mm/z3fold.c: do not include rwlock.h directly Andrew Morton
2020-03-06 6:28 ` [patch 6/7] mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled Andrew Morton
2020-03-06 6:28 ` [patch 7/7] arch/Kconfig: update HAVE_RELIABLE_STACKTRACE description Andrew Morton
2020-03-12 4:12 ` mmotm 2020-03-11-21-11 uploaded Andrew Morton
2020-03-12 15:03 ` mmotm 2020-03-11-21-11 uploaded (sound/soc/codecs/wcd9335.c) Randy Dunlap
2020-03-12 16:59 ` Srinivas Kandagatla
2020-03-13 4:52 ` Masahiro Yamada
2020-03-13 4:49 ` mmotm 2020-03-12-21-49 uploaded Andrew Morton
2020-03-21 22:17 ` mmotm 2020-03-21-15-17 uploaded Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200306062832.Eauf-skRC%akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=Justin.He@arm.com \
--cc=dan.j.williams@intel.com \
--cc=jmoyer@redhat.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kirill@shutemov.name \
--cc=linux-mm@kvack.org \
--cc=mm-commits@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).