linux-mm.kvack.org archive mirror
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: Jia He <justin.he@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	linux-mm@kvack.org
Subject: Re: bug: data corruption introduced by commit 83d116c53058 ("mm: fix double page fault on arm64 if PTE_AF is cleared")
Date: Tue, 11 Feb 2020 17:51:58 +0300
Message-ID: <20200211145158.5wt7nepe3flx25bj@box>
In-Reply-To: <x49tv3yys1l.fsf@segfault.boston.devel.redhat.com>

On Mon, Feb 10, 2020 at 05:51:50PM -0500, Jeff Moyer wrote:
> Hi,
> 
> While running xfstests test generic/437, I noticed that the following
> WARN_ON_ONCE inside cow_user_page() was triggered:
> 
> 	/*
> 	 * This really shouldn't fail, because the page is there
> 	 * in the page tables. But it might just be unreadable,
> 	 * in which case we just give up and fill the result with
> 	 * zeroes.
> 	 */
> 	if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
> 		/*
> 		 * Give a warn in case there can be some obscure
> 		 * use-case
> 		 */
> 		WARN_ON_ONCE(1);
> 		clear_page(kaddr);
> 	}
> 
> Just clearing the page, in this case, will result in data corruption.  I
> think the assumption that the copy fails because the memory is
> inaccessible may be wrong.  In this instance, it may be the other thread
> issuing the madvise call?
> 
> I reverted the commit in question, and the data corruption is gone.
> 
> Below is the (ppc64) stack trace (this will reproduce on x86 as well).
> I've attached the reproducer, which is a modified version of the
> xfs test.  You'll need to setup a file system on pmem, and mount it with
> -o dax.  Then issue "./t_mmap_cow_race /mnt/pmem/foo".
> 
> Any help tracking this down is appreciated.

My guess is that MADV_DONTNEED gets the page unmapped under you, so
__copy_from_user_inatomic() sees an empty PTE instead of the populated PTE
it expects.
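
To illustrate what "unmapped under you" means, here is a minimal userspace
sketch. It is an assumption-laden simplification: it uses a private
anonymous mapping rather than the MAP_PRIVATE DAX file mapping from the
reproducer, it is single-threaded, and byte_after_dontneed() is a
hypothetical helper name. It only shows that MADV_DONTNEED drops the
populated page, so the old contents are no longer reachable through that
address.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: populate one private anonymous page, drop it
 * with MADV_DONTNEED, and return the first byte seen afterwards.
 * Returns -1 if mmap() or madvise() fails.
 */
static int byte_after_dontneed(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int seen;

	assert(psz > 0);

	p = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Fault the page in and fill it with a recognizable pattern. */
	memset(p, 0xAA, (size_t)psz);

	/* Zap the mapping; the PTE for this page is cleared. */
	if (madvise(p, (size_t)psz, MADV_DONTNEED) != 0) {
		munmap(p, (size_t)psz);
		return -1;
	}

	/* This access faults in a fresh page, not the old contents. */
	seen = p[0];

	munmap(p, (size_t)psz);
	return seen;
}
```

In the kernel case the copy runs with page faults disabled, so instead of
faulting in a fresh page the copy simply fails. That is the window
suspected above.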

Below is my completely untested attempt to fix it.

It is going to hurt performance in the common case, but it should be good
enough to test my idea.

The real solution would be to retry __copy_from_user_inatomic() under ptl
if the first attempt fails. I expect it to be ugly.
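
The control flow of that retry idea can be sketched as a small userspace
model. This is not kernel code and not a proposed patch: struct model_pte,
model_copy(), model_cow_user_page() and the COW_* results are hypothetical
names, an atomic_flag stands in for the ptl, and a "readable" flag stands
in for whatever makes the copy fail on an otherwise unchanged PTE.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64

enum cow_result { COW_COPIED, COW_RACED, COW_ZEROED };

/* Hypothetical stand-in for a PTE plus its lock. */
struct model_pte {
	atomic_flag ptl;		/* models the page table lock */
	unsigned long val;		/* 0 models an empty (zapped) PTE */
	bool readable;			/* false models an unreadable page */
	const unsigned char *page;	/* source data behind the mapping */
};

static void model_pte_init(struct model_pte *pte, unsigned long val,
			   bool readable, const unsigned char *page)
{
	atomic_flag_clear(&pte->ptl);
	pte->val = val;
	pte->readable = readable;
	pte->page = page;
}

/* Models the inatomic copy: fails instead of faulting. */
static bool model_copy(const struct model_pte *pte, unsigned char *dst)
{
	if (pte->val == 0 || !pte->readable)
		return false;
	memcpy(dst, pte->page, MODEL_PAGE_SIZE);
	return true;
}

static enum cow_result model_cow_user_page(struct model_pte *pte,
					    unsigned long orig_val,
					    unsigned char *dst)
{
	enum cow_result res;

	assert(dst != NULL);

	/* Fast path: optimistic copy without taking the lock. */
	if (model_copy(pte, dst))
		return COW_COPIED;

	/* Slow path: take the lock, revalidate, then retry the copy. */
	while (atomic_flag_test_and_set_explicit(&pte->ptl,
						 memory_order_acquire))
		;

	if (pte->val != orig_val) {
		/* PTE changed under us: let the fault be retried. */
		res = COW_RACED;
	} else if (model_copy(pte, dst)) {
		res = COW_COPIED;
	} else {
		/* Unchanged PTE but still unreadable: zero-fill. */
		memset(dst, 0, MODEL_PAGE_SIZE);
		res = COW_ZEROED;
	}

	atomic_flag_clear_explicit(&pte->ptl, memory_order_release);
	return res;
}
```

The point of the shape is that the lock is only taken after a failed copy,
so the common case stays lock-free, and zero-filling happens only when the
PTE is known to be unchanged.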

diff --git a/mm/memory.c b/mm/memory.c
index 0bccc622e482..362a791f47fd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2257,7 +2257,6 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	bool ret;
 	void *kaddr;
 	void __user *uaddr;
-	bool force_mkyoung;
 	struct vm_area_struct *vma = vmf->vma;
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long addr = vmf->address;
@@ -2278,27 +2277,18 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	kaddr = kmap_atomic(dst);
 	uaddr = (void __user *)(addr & PAGE_MASK);
 
+	vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+	if (!pte_same(*vmf->pte, vmf->orig_pte)) {
+		ret = false;
+		goto pte_unlock;
+	}
+
 	/*
 	 * On architectures with software "accessed" bits, we would
 	 * take a double page fault, so mark it accessed here.
 	 */
-	force_mkyoung = arch_faults_on_old_pte() && !pte_young(vmf->orig_pte);
-	if (force_mkyoung) {
-		pte_t entry;
-
-		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
-		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
-			/*
-			 * Other thread has already handled the fault
-			 * and we don't need to do anything. If it's
-			 * not the case, the fault will be triggered
-			 * again on the same address.
-			 */
-			ret = false;
-			goto pte_unlock;
-		}
-
-		entry = pte_mkyoung(vmf->orig_pte);
+	if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {
+		pte_t entry = pte_mkyoung(vmf->orig_pte);
 		if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
 			update_mmu_cache(vma, addr, vmf->pte);
 	}
@@ -2321,8 +2311,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	ret = true;
 
 pte_unlock:
-	if (force_mkyoung)
-		pte_unmap_unlock(vmf->pte, vmf->ptl);
+	pte_unmap_unlock(vmf->pte, vmf->ptl);
 	kunmap_atomic(kaddr);
 	flush_dcache_page(dst);
 
-- 
 Kirill A. Shutemov

