* [PATCH] mm: Avoid data corruption on CoW fault into PFN-mapped VMA
@ 2020-02-18 15:41 Kirill A. Shutemov
2020-02-19 21:22 ` Andrew Morton
0 siblings, 1 reply; 3+ messages in thread
From: Kirill A. Shutemov @ 2020-02-18 15:41 UTC (permalink / raw)
To: Andrew Morton
Cc: Dan Williams, Justin He, linux-mm, linux-kernel,
Kirill A. Shutemov, Jeff Moyer
Jeff Moyer has reported that one of xfstests triggers a warning when run
on DAX-enabled filesystem:
WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
...
wp_page_copy+0x98c/0xd50 (unreliable)
do_wp_page+0xd8/0xad0
__handle_mm_fault+0x748/0x1b90
handle_mm_fault+0x120/0x1f0
__do_page_fault+0x240/0xd70
do_page_fault+0x38/0xd0
handle_page_fault+0x10/0x30
The warning happens on failed __copy_from_user_inatomic() which tries to
copy data into a CoW page.
This happens because of race between MADV_DONTNEED and CoW page fault:
CPU0 CPU1
handle_mm_fault()
do_wp_page()
wp_page_copy()
do_wp_page()
madvise(MADV_DONTNEED)
zap_page_range()
zap_pte_range()
ptep_get_and_clear_full()
<TLB flush>
__copy_from_user_inatomic()
sees empty PTE and fails
WARN_ON_ONCE(1)
clear_page()
The solution is to re-try __copy_from_user_inatomic() under PTL after
checking that PTE is matches the orig_pte.
The second copy attempt can still fail, like due to non-readable PTE,
but there's nothing reasonable we can do about, except clearing the CoW
page.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-and-tested-by: Jeff Moyer <jmoyer@redhat.com>
---
mm/memory.c | 35 +++++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 0bccc622e482..e8bfdf0d9d1d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2257,7 +2257,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
bool ret;
void *kaddr;
void __user *uaddr;
- bool force_mkyoung;
+ bool locked = false;
struct vm_area_struct *vma = vmf->vma;
struct mm_struct *mm = vma->vm_mm;
unsigned long addr = vmf->address;
@@ -2282,11 +2282,11 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
* On architectures with software "accessed" bits, we would
* take a double page fault, so mark it accessed here.
*/
- force_mkyoung = arch_faults_on_old_pte() && !pte_young(vmf->orig_pte);
- if (force_mkyoung) {
+ if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {
pte_t entry;
vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+ locked = true;
if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
/*
* Other thread has already handled the fault
@@ -2310,18 +2310,37 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
* zeroes.
*/
if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
+ if (locked)
+ goto warn;
+
+ /* Re-validate under PTL if the page is still mapped */
+ vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+ locked = true;
+ if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
+ /* The PTE changed under us. Retry page fault. */
+ ret = false;
+ goto pte_unlock;
+ }
+
/*
- * Give a warn in case there can be some obscure
- * use-case
+ * The same page can be mapped back since last copy attampt.
+ * Try to copy again under PTL.
*/
- WARN_ON_ONCE(1);
- clear_page(kaddr);
+ if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
+ /*
+ * Give a warn in case there can be some obscure
+ * use-case
+ */
+warn:
+ WARN_ON_ONCE(1);
+ clear_page(kaddr);
+ }
}
ret = true;
pte_unlock:
- if (force_mkyoung)
+ if (locked)
pte_unmap_unlock(vmf->pte, vmf->ptl);
kunmap_atomic(kaddr);
flush_dcache_page(dst);
--
2.25.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] mm: Avoid data corruption on CoW fault into PFN-mapped VMA
2020-02-18 15:41 [PATCH] mm: Avoid data corruption on CoW fault into PFN-mapped VMA Kirill A. Shutemov
@ 2020-02-19 21:22 ` Andrew Morton
2020-02-20 12:06 ` Kirill A. Shutemov
0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2020-02-19 21:22 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Dan Williams, Justin He, linux-mm, linux-kernel,
Kirill A. Shutemov, Jeff Moyer
On Tue, 18 Feb 2020 18:41:51 +0300 "Kirill A. Shutemov" <kirill@shutemov.name> wrote:
> Jeff Moyer has reported that one of xfstests triggers a warning when run
> on DAX-enabled filesystem:
>
> WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
> ...
> wp_page_copy+0x98c/0xd50 (unreliable)
> do_wp_page+0xd8/0xad0
> __handle_mm_fault+0x748/0x1b90
> handle_mm_fault+0x120/0x1f0
> __do_page_fault+0x240/0xd70
> do_page_fault+0x38/0xd0
> handle_page_fault+0x10/0x30
>
> The warning happens on failed __copy_from_user_inatomic() which tries to
> copy data into a CoW page.
>
> This happens because of race between MADV_DONTNEED and CoW page fault:
>
> CPU0 CPU1
> handle_mm_fault()
> do_wp_page()
> wp_page_copy()
> do_wp_page()
> madvise(MADV_DONTNEED)
> zap_page_range()
> zap_pte_range()
> ptep_get_and_clear_full()
> <TLB flush>
> __copy_from_user_inatomic()
> sees empty PTE and fails
> WARN_ON_ONCE(1)
> clear_page()
>
> The solution is to re-try __copy_from_user_inatomic() under PTL after
> checking that PTE is matches the orig_pte.
>
> The second copy attempt can still fail, like due to non-readable PTE,
> but there's nothing reasonable we can do about, except clearing the CoW
> page.
You don't think this is worthy of a cc:stable?
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] mm: Avoid data corruption on CoW fault into PFN-mapped VMA
2020-02-19 21:22 ` Andrew Morton
@ 2020-02-20 12:06 ` Kirill A. Shutemov
0 siblings, 0 replies; 3+ messages in thread
From: Kirill A. Shutemov @ 2020-02-20 12:06 UTC (permalink / raw)
To: Andrew Morton
Cc: Dan Williams, Justin He, linux-mm, linux-kernel,
Kirill A. Shutemov, Jeff Moyer
On Wed, Feb 19, 2020 at 01:22:39PM -0800, Andrew Morton wrote:
> On Tue, 18 Feb 2020 18:41:51 +0300 "Kirill A. Shutemov" <kirill@shutemov.name> wrote:
>
> > Jeff Moyer has reported that one of xfstests triggers a warning when run
> > on DAX-enabled filesystem:
> >
> > WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
> > ...
> > wp_page_copy+0x98c/0xd50 (unreliable)
> > do_wp_page+0xd8/0xad0
> > __handle_mm_fault+0x748/0x1b90
> > handle_mm_fault+0x120/0x1f0
> > __do_page_fault+0x240/0xd70
> > do_page_fault+0x38/0xd0
> > handle_page_fault+0x10/0x30
> >
> > The warning happens on failed __copy_from_user_inatomic() which tries to
> > copy data into a CoW page.
> >
> > This happens because of race between MADV_DONTNEED and CoW page fault:
> >
> > CPU0 CPU1
> > handle_mm_fault()
> > do_wp_page()
> > wp_page_copy()
> > do_wp_page()
> > madvise(MADV_DONTNEED)
> > zap_page_range()
> > zap_pte_range()
> > ptep_get_and_clear_full()
> > <TLB flush>
> > __copy_from_user_inatomic()
> > sees empty PTE and fails
> > WARN_ON_ONCE(1)
> > clear_page()
> >
> > The solution is to re-try __copy_from_user_inatomic() under PTL after
> > checking that PTE is matches the orig_pte.
> >
> > The second copy attempt can still fail, like due to non-readable PTE,
> > but there's nothing reasonable we can do about, except clearing the CoW
> > page.
>
> You don't think this is worthy of a cc:stable?
Please, add it.
Although, if I read history correctly, it is 15 year old bug that nobody
noticed until we added WARN() there :/
--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-02-20 12:05 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-18 15:41 [PATCH] mm: Avoid data corruption on CoW fault into PFN-mapped VMA Kirill A. Shutemov
2020-02-19 21:22 ` Andrew Morton
2020-02-20 12:06 ` Kirill A. Shutemov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).