* [PATCH 1/2] mm: Move the backup x_devmap() functions to asm-generic/pgtable.h
@ 2019-11-15 11:58 Thomas Hellström (VMware)
2019-11-15 11:58 ` [PATCH 2/2] mm: Fix a huge pud insertion race during faulting Thomas Hellström (VMware)
0 siblings, 1 reply; 6+ messages in thread
From: Thomas Hellström (VMware) @ 2019-11-15 11:58 UTC (permalink / raw)
To: linux-kernel, linux-mm
Cc: Thomas Hellstrom, Arnd Bergmann, Andrew Morton,
Kirill A. Shutemov, Matthew Wilcox
From: Thomas Hellstrom <thellstrom@vmware.com>
The asm-generic/pgtable.h include file appears to be the correct place
for the backup x_devmap() inline functions. Moving them here is also
necessary if we want to include x_devmap() in the [pmd|pud]_unstable
functions. So move the x_devmap() functions to asm-generic/pgtable.h
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Matthew Wilcox" <willy@infradead.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
include/asm-generic/pgtable.h | 15 +++++++++++++++
include/linux/mm.h | 15 ---------------
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 818691846c90..8efa45580fd0 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -903,6 +903,21 @@ static inline int pud_write(pud_t pud)
}
#endif /* pud_write */
+#if !defined(CONFIG_ARCH_HAS_PTE_DEVMAP) || !defined(CONFIG_TRANSPARENT_HUGEPAGE)
+static inline int pmd_devmap(pmd_t pmd)
+{
+ return 0;
+}
+static inline int pud_devmap(pud_t pud)
+{
+ return 0;
+}
+static inline int pgd_devmap(pgd_t pgd)
+{
+ return 0;
+}
+#endif
+
#if !defined(CONFIG_TRANSPARENT_HUGEPAGE) || \
(defined(CONFIG_TRANSPARENT_HUGEPAGE) && \
!defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD))
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4bc93477375e..78a09ff33cd7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -564,21 +564,6 @@ int vma_is_stack_for_current(struct vm_area_struct *vma);
struct mmu_gather;
struct inode;
-#if !defined(CONFIG_ARCH_HAS_PTE_DEVMAP) || !defined(CONFIG_TRANSPARENT_HUGEPAGE)
-static inline int pmd_devmap(pmd_t pmd)
-{
- return 0;
-}
-static inline int pud_devmap(pud_t pud)
-{
- return 0;
-}
-static inline int pgd_devmap(pgd_t pgd)
-{
- return 0;
-}
-#endif
-
/*
* FIXME: take this include out, include page-flags.h in
* files which need it (119 of them)
--
2.21.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] mm: Fix a huge pud insertion race during faulting
2019-11-15 11:58 [PATCH 1/2] mm: Move the backup x_devmap() functions to asm-generic/pgtable.h Thomas Hellström (VMware)
@ 2019-11-15 11:58 ` Thomas Hellström (VMware)
2019-11-15 19:58 ` Andrew Morton
0 siblings, 1 reply; 6+ messages in thread
From: Thomas Hellström (VMware) @ 2019-11-15 11:58 UTC (permalink / raw)
To: linux-kernel, linux-mm
Cc: Thomas Hellstrom, Arnd Bergmann, Andrew Morton,
Kirill A. Shutemov, Matthew Wilcox
From: Thomas Hellstrom <thellstrom@vmware.com>
A huge pud page can theoretically be faulted in racing with pmd_alloc()
in __handle_mm_fault(). That will lead to pmd_alloc() returning an
invalid pmd pointer. Fix this by adding a pud_trans_unstable() function
similar to pmd_trans_unstable() and check whether the pud is really stable
before using the pmd pointer.
Race:
Thread 1: Thread 2: Comment
create_huge_pud() Fallback - not taken.
create_huge_pud() Taken.
pmd_alloc() Returns an invalid pointer.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Fixes: a00cc7d9dd93 ("mm, x86: add support for PUD-sized transparent hugepages")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
include/asm-generic/pgtable.h | 25 +++++++++++++++++++++++++
mm/memory.c | 6 ++++++
2 files changed, 31 insertions(+)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 8efa45580fd0..c40a0ced53bd 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -927,6 +927,31 @@ static inline int pud_trans_huge(pud_t pud)
}
#endif
+/* See pmd_none_or_trans_huge_or_clear_bad for discussion. */
+static inline int pud_none_or_trans_huge_or_dev_or_clear_bad(pud_t *pud)
+{
+ pud_t pudval = READ_ONCE(*pud);
+
+ if (pud_none(pudval) || pud_trans_huge(pudval) || pud_devmap(pudval))
+ return 1;
+ if (unlikely(pud_bad(pudval))) {
+ pud_clear_bad(pud);
+ return 1;
+ }
+ return 0;
+}
+
+/* See pmd_trans_unstable for discussion. */
+static inline int pud_trans_unstable(pud_t *pud)
+{
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \
+ defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
+ return pud_none_or_trans_huge_or_dev_or_clear_bad(pud);
+#else
+ return 0;
+#endif
+}
+
#ifndef pmd_read_atomic
static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
{
diff --git a/mm/memory.c b/mm/memory.c
index b1ca51a079f2..43ff372f4f07 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3914,6 +3914,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
vmf.pud = pud_alloc(mm, p4d, address);
if (!vmf.pud)
return VM_FAULT_OOM;
+retry_pud:
if (pud_none(*vmf.pud) && __transparent_hugepage_enabled(vma)) {
ret = create_huge_pud(&vmf);
if (!(ret & VM_FAULT_FALLBACK))
@@ -3940,6 +3941,11 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
vmf.pmd = pmd_alloc(mm, vmf.pud, address);
if (!vmf.pmd)
return VM_FAULT_OOM;
+
+ /* Huge pud page fault raced with pmd_alloc? */
+ if (pud_trans_unstable(vmf.pud))
+ goto retry_pud;
+
if (pmd_none(*vmf.pmd) && __transparent_hugepage_enabled(vma)) {
ret = create_huge_pmd(&vmf);
if (!(ret & VM_FAULT_FALLBACK))
--
2.21.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] mm: Fix a huge pud insertion race during faulting
2019-11-15 11:58 ` [PATCH 2/2] mm: Fix a huge pud insertion race during faulting Thomas Hellström (VMware)
@ 2019-11-15 19:58 ` Andrew Morton
2019-11-18 10:22 ` Kirill A. Shutemov
0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2019-11-15 19:58 UTC (permalink / raw)
To: Thomas Hellström
Cc: linux-kernel, linux-mm, Thomas Hellstrom, Arnd Bergmann,
Kirill A. Shutemov, Matthew Wilcox
On Fri, 15 Nov 2019 12:58:08 +0100 Thomas Hellström (VMware) <thomas_os@shipmail.org> wrote:
> A huge pud page can theoretically be faulted in racing with pmd_alloc()
> in __handle_mm_fault(). That will lead to pmd_alloc() returning an
> invalid pmd pointer. Fix this by adding a pud_trans_unstable() function
> similar to pmd_trans_unstable() and check whether the pud is really stable
> before using the pmd pointer.
>
> Race:
> Thread 1: Thread 2: Comment
> create_huge_pud() Fallback - not taken.
> create_huge_pud() Taken.
> pmd_alloc() Returns an invalid pointer.
What are the user-visible runtime effects of this change?
Is a -stable backport warranted?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] mm: Fix a huge pud insertion race during faulting
2019-11-15 19:58 ` Andrew Morton
@ 2019-11-18 10:22 ` Kirill A. Shutemov
2019-11-18 12:58 ` Thomas Hellström (VMware)
0 siblings, 1 reply; 6+ messages in thread
From: Kirill A. Shutemov @ 2019-11-18 10:22 UTC (permalink / raw)
To: Andrew Morton
Cc: Thomas Hellström, linux-kernel, linux-mm, Thomas Hellstrom,
Arnd Bergmann, Kirill A. Shutemov, Matthew Wilcox
On Fri, Nov 15, 2019 at 11:58:00AM -0800, Andrew Morton wrote:
> On Fri, 15 Nov 2019 12:58:08 +0100 Thomas Hellström (VMware) <thomas_os@shipmail.org> wrote:
>
> > A huge pud page can theoretically be faulted in racing with pmd_alloc()
> > in __handle_mm_fault(). That will lead to pmd_alloc() returning an
> > invalid pmd pointer. Fix this by adding a pud_trans_unstable() function
> > similar to pmd_trans_unstable() and check whether the pud is really stable
> > before using the pmd pointer.
> >
> > Race:
> > Thread 1: Thread 2: Comment
> > create_huge_pud() Fallback - not taken.
> > create_huge_pud() Taken.
> > pmd_alloc() Returns an invalid pointer.
>
> What are the user-visible runtime effects of this change?
Data corruption: kernel writes to a huge page thing it's page table.
> Is a -stable backport warranted?
I believe it is.
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] mm: Fix a huge pud insertion race during faulting
2019-11-18 10:22 ` Kirill A. Shutemov
@ 2019-11-18 12:58 ` Thomas Hellström (VMware)
2019-11-18 21:02 ` Andrew Morton
0 siblings, 1 reply; 6+ messages in thread
From: Thomas Hellström (VMware) @ 2019-11-18 12:58 UTC (permalink / raw)
To: Kirill A. Shutemov, Andrew Morton
Cc: linux-kernel, linux-mm, Thomas Hellstrom, Arnd Bergmann,
Kirill A. Shutemov, Matthew Wilcox
On 11/18/19 11:22 AM, Kirill A. Shutemov wrote:
> On Fri, Nov 15, 2019 at 11:58:00AM -0800, Andrew Morton wrote:
>> On Fri, 15 Nov 2019 12:58:08 +0100 Thomas Hellström (VMware) <thomas_os@shipmail.org> wrote:
>>
>>> A huge pud page can theoretically be faulted in racing with pmd_alloc()
>>> in __handle_mm_fault(). That will lead to pmd_alloc() returning an
>>> invalid pmd pointer. Fix this by adding a pud_trans_unstable() function
>>> similar to pmd_trans_unstable() and check whether the pud is really stable
>>> before using the pmd pointer.
>>>
>>> Race:
>>> Thread 1: Thread 2: Comment
>>> create_huge_pud() Fallback - not taken.
>>> create_huge_pud() Taken.
>>> pmd_alloc() Returns an invalid pointer.
>> What are the user-visible runtime effects of this change?
> Data corruption: kernel writes to a huge page thing it's page table.
>
>> Is a -stable backport warranted?
> I believe it is.
Note that this was caught during a code audit rather than a real
experienced problem. It looks to me like the only implementation that
currently creates huge pud pagetable entries is dev_dax_huge_fault()
which doesn't appear to care much about private (COW) mappings or
write-tracking which is, I believe, a prerequisite for create_huge_pud()
falling back on thread 1, but not in thread 2.
This means (assuming that's intentional) that a stable backport
shouldn't be needed.
For the WIP huge page support for graphics memory we'll be allowing both
COW mappings and write-tracking, though, but that's still some time away.
In any case, I think this patch needs -rc testing to catch potential
pud_devmap issues before submitted to stable.
Thanks,
Thomas
>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] mm: Fix a huge pud insertion race during faulting
2019-11-18 12:58 ` Thomas Hellström (VMware)
@ 2019-11-18 21:02 ` Andrew Morton
0 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2019-11-18 21:02 UTC (permalink / raw)
To: Thomas Hellström
Cc: Kirill A. Shutemov, linux-kernel, linux-mm, Thomas Hellstrom,
Arnd Bergmann, Kirill A. Shutemov, Matthew Wilcox
On Mon, 18 Nov 2019 13:58:04 +0100 Thomas Hellström (VMware) <thomas_os@shipmail.org> wrote:
> >> Is a -stable backport warranted?
> > I believe it is.
>
> Note that this was caught during a code audit rather than a real
> experienced problem. It looks to me like the only implementation that
> currently creates huge pud pagetable entries is dev_dax_huge_fault()
> which doesn't appear to care much about private (COW) mappings or
> write-tracking which is, I believe, a prerequisite for create_huge_pud()
> falling back on thread 1, but not in thread 2.
>
> This means (assuming that's intentional) that a stable backport
> shouldn't be needed.
>
> For the WIP huge page support for graphics memory we'll be allowing both
> COW mappings and write-tracking, though, but that's still some time away.
>
> In any case, I think this patch needs -rc testing to catch potential
> pud_devmap issues before submitted to stable.
OK, thanks, I'll queue it for 5.5-rc1 with a -stable tag. Hopefully
that way it will get a bit of exposure before the stable trees pick it
up. Maybe this is optimistic..
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-11-18 21:02 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-15 11:58 [PATCH 1/2] mm: Move the backup x_devmap() functions to asm-generic/pgtable.h Thomas Hellström (VMware)
2019-11-15 11:58 ` [PATCH 2/2] mm: Fix a huge pud insertion race during faulting Thomas Hellström (VMware)
2019-11-15 19:58 ` Andrew Morton
2019-11-18 10:22 ` Kirill A. Shutemov
2019-11-18 12:58 ` Thomas Hellström (VMware)
2019-11-18 21:02 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).