linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] fix device-dax pud crash and fixup {pte,pmd,pud}_write
@ 2017-11-11  0:44 Dan Williams
  2017-11-11  0:44 ` [PATCH 1/4] mm: fix device-dax pud write-faults triggered by get_user_pages() Dan Williams
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Dan Williams @ 2017-11-11  0:44 UTC
  To: akpm
  Cc: linux-mm, linux-nvdimm, linux-kernel, stable, Dave Hansen,
	Jérôme Glisse, David S. Miller, Kirill A. Shutemov

Andrew,

Here is a new version of the pud_write() fix [1], plus follow-on
patches that use the '_access_permitted' helpers in the fault and
get_user_pages() paths where we check whether the thread has write
access. I explicitly omit conversions for places where the kernel
checks the _PAGE_RW flag for its own purposes rather than for userspace
access.

Beyond fixing the crash, this series also fixes get_user_pages() and
fault paths to honor protection keys in the same manner as
get_user_pages_fast(). Only the crash fix is tagged for -stable. The
protection key check is added purely for consistency, since userspace
can change protection keys at will.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-November/013237.html

---

Dan Williams (4):
      mm: fix device-dax pud write-faults triggered by get_user_pages()
      mm: replace pud_write with pud_access_permitted in fault + gup paths
      mm: replace pmd_write with pmd_access_permitted in fault + gup paths
      mm: replace pte_write with pte_access_permitted in fault + gup paths


 arch/sparc/mm/gup.c            |    4 ++--
 arch/x86/include/asm/pgtable.h |    6 ++++++
 fs/dax.c                       |    3 ++-
 include/asm-generic/pgtable.h  |    9 +++++++++
 include/linux/hugetlb.h        |    8 --------
 mm/gup.c                       |    2 +-
 mm/hmm.c                       |    8 ++++----
 mm/huge_memory.c               |    6 +++---
 mm/memory.c                    |    8 ++++----
 9 files changed, 31 insertions(+), 23 deletions(-)


* [PATCH 1/4] mm: fix device-dax pud write-faults triggered by get_user_pages()
  2017-11-11  0:44 [PATCH 0/4] fix device-dax pud crash and fixup {pte,pmd,pud}_write Dan Williams
@ 2017-11-11  0:44 ` Dan Williams
  2017-11-11  0:44 ` [PATCH 2/4] mm: replace pud_write with pud_access_permitted in fault + gup paths Dan Williams
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Dan Williams @ 2017-11-11  0:44 UTC
  To: akpm
  Cc: linux-nvdimm, linux-kernel, stable, linux-mm, Dave Hansen,
	Kirill A. Shutemov

Currently only get_user_pages_fast() can safely handle the writable gup
case, because it uses pud_access_permitted() to check whether the pud
entry is writable. The gup slow path uses pud_write() instead of
pud_access_permitted(), and to date pud_write() has been unimplemented:
it just calls BUG().

    kernel BUG at ./include/linux/hugetlb.h:244!
    [..]
    RIP: 0010:follow_devmap_pud+0x482/0x490
    [..]
    Call Trace:
     follow_page_mask+0x28c/0x6e0
     __get_user_pages+0xe4/0x6c0
     get_user_pages_unlocked+0x130/0x1b0
     get_user_pages_fast+0x89/0xb0
     iov_iter_get_pages_alloc+0x114/0x4a0
     nfs_direct_read_schedule_iovec+0xd2/0x350
     ? nfs_start_io_direct+0x63/0x70
     nfs_file_direct_read+0x1e0/0x250
     nfs_file_read+0x90/0xc0

For now this just implements a simple check for the _PAGE_RW bit similar
to pmd_write. However, this implies that the gup-slow-path check is
missing the extra checks that the gup-fast-path performs with
pud_access_permitted. Later patches will align all checks to use the
'access_permitted' helper if the architecture provides it. Note that the
generic 'access_permitted' helper fallback is the simple _PAGE_RW check
on architectures that do not define the 'access_permitted' helper(s).
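
As a rough illustration of the override pattern this patch relies on,
here is a userspace sketch. The MODEL_-prefixed names, the stand-in
pud_t, and the use of abort() in place of BUG() are assumptions made to
get a compilable example; this is not kernel code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the kernel's pud_t; hypothetical, for illustration only. */
typedef struct { uint64_t val; } pud_t;

/* Same bit position as x86 _PAGE_RW (bit 1); the name is hypothetical. */
#define MODEL_PAGE_RW (UINT64_C(1) << 1)

/* "Arch header": announces that it provides pud_write(), then defines it. */
#define MODEL_HAVE_ARCH_PUD_WRITE
static inline int pud_write(pud_t pud)
{
	return (pud.val & MODEL_PAGE_RW) != 0;
}

/* "Generic header": only compiled when the arch did not opt in above. */
#ifndef MODEL_HAVE_ARCH_PUD_WRITE
static inline int pud_write(pud_t pud)
{
	(void)pud;
	abort();	/* stand-in for BUG() */
	return 0;
}
#endif
```

With the opt-in macro defined, callers get the real bit test; removing
the "arch" half leaves only the trapping fallback, which is the
behaviour that produced the crash above.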

Cc: <stable@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: a00cc7d9dd93 ("mm, x86: add support for PUD-sized transparent hugepages")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/include/asm/pgtable.h |    6 ++++++
 include/asm-generic/pgtable.h  |    9 +++++++++
 include/linux/hugetlb.h        |    8 --------
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index f735c3016325..5c396724fd0d 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1093,6 +1093,12 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm,
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp);
 }
 
+#define __HAVE_ARCH_PUD_WRITE
+static inline int pud_write(pud_t pud)
+{
+	return pud_flags(pud) & _PAGE_RW;
+}
+
 /*
  * clone_pgd_range(pgd_t *dst, pgd_t *src, int count);
  *
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 757dc6ffc7ba..bd738624bd16 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -812,6 +812,15 @@ static inline int pmd_write(pmd_t pmd)
 	return 0;
 }
 #endif /* __HAVE_ARCH_PMD_WRITE */
+
+#ifndef __HAVE_ARCH_PUD_WRITE
+static inline int pud_write(pud_t pud)
+{
+	BUG();
+	return 0;
+}
+#endif /* __HAVE_ARCH_PUD_WRITE */
+
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 #if !defined(CONFIG_TRANSPARENT_HUGEPAGE) || \
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index fbf5b31d47ee..82a25880714a 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -239,14 +239,6 @@ static inline int pgd_write(pgd_t pgd)
 }
 #endif
 
-#ifndef pud_write
-static inline int pud_write(pud_t pud)
-{
-	BUG();
-	return 0;
-}
-#endif
-
 #define HUGETLB_ANON_FILE "anon_hugepage"
 
 enum {


* [PATCH 2/4] mm: replace pud_write with pud_access_permitted in fault + gup paths
  2017-11-11  0:44 [PATCH 0/4] fix device-dax pud crash and fixup {pte,pmd,pud}_write Dan Williams
  2017-11-11  0:44 ` [PATCH 1/4] mm: fix device-dax pud write-faults triggered by get_user_pages() Dan Williams
@ 2017-11-11  0:44 ` Dan Williams
  2017-11-11  0:44 ` [PATCH 3/4] mm: replace pmd_write with pmd_access_permitted " Dan Williams
  2017-11-11  0:44 ` [PATCH 4/4] mm: replace pte_write with pte_access_permitted " Dan Williams
  3 siblings, 0 replies; 5+ messages in thread
From: Dan Williams @ 2017-11-11  0:44 UTC
  To: akpm
  Cc: linux-nvdimm, linux-kernel, linux-mm, Dave Hansen,
	David S. Miller, Kirill A. Shutemov

The 'access_permitted' helper is used in the gup-fast path and goes
beyond the simple _PAGE_RW check to also:

* validate that the mapping is writable from a protection keys
  standpoint

* validate that the pud has _PAGE_USER set, since all fault paths where
  pud_write is used must be referencing user memory.
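
The two extra checks listed above can be sketched in userspace roughly
as follows. This is a simplified model under stated assumptions: the
MODEL_ names and both functions are hypothetical, the flag bits follow
the x86 layout (present, RW, user in bits 0-2, protection key in bits
59-62), and the PKRU value is passed in as a parameter rather than read
from the register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the x86 page table layout. */
#define MODEL_PAGE_PRESENT (UINT64_C(1) << 0)
#define MODEL_PAGE_RW      (UINT64_C(1) << 1)
#define MODEL_PAGE_USER    (UINT64_C(1) << 2)
#define MODEL_PKEY_SHIFT   59
#define MODEL_PKEY_MASK    UINT64_C(0xf)

/* PKRU holds two bits per key: access-disable and write-disable. */
#define MODEL_PKRU_AD 0x1u
#define MODEL_PKRU_WD 0x2u

/* Does the given PKRU value allow this kind of access for this key? */
static bool model_pkru_allows(uint32_t pkru, unsigned int pkey, bool write)
{
	uint32_t bits = (pkru >> (pkey * 2)) & 0x3u;

	if (bits & MODEL_PKRU_AD)
		return false;
	if (write && (bits & MODEL_PKRU_WD))
		return false;
	return true;
}

/* Model of an 'access_permitted' style check on a raw entry value. */
static bool model_access_permitted(uint64_t entry, uint32_t pkru, bool write)
{
	uint64_t need = MODEL_PAGE_PRESENT | MODEL_PAGE_USER;
	unsigned int pkey;

	if (write)
		need |= MODEL_PAGE_RW;
	if ((entry & need) != need)
		return false;

	pkey = (unsigned int)((entry >> MODEL_PKEY_SHIFT) & MODEL_PKEY_MASK);
	return model_pkru_allows(pkru, pkey, write);
}
```

In this model an entry with _PAGE_RW set but _PAGE_USER clear is
rejected, and so is a write through a key whose write-disable bit is
set, neither of which a bare _PAGE_RW test would catch.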

Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/sparc/mm/gup.c |    2 +-
 mm/huge_memory.c    |    2 +-
 mm/memory.c         |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/sparc/mm/gup.c b/arch/sparc/mm/gup.c
index 5335ba3c850e..5ae2d0a01a70 100644
--- a/arch/sparc/mm/gup.c
+++ b/arch/sparc/mm/gup.c
@@ -114,7 +114,7 @@ static int gup_huge_pud(pud_t *pudp, pud_t pud, unsigned long addr,
 	if (!(pud_val(pud) & _PAGE_VALID))
 		return 0;
 
-	if (write && !pud_write(pud))
+	if (!pud_access_permitted(pud, write))
 		return 0;
 
 	refs = 0;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1981ed697dab..1e4e11275856 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1022,7 +1022,7 @@ struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
 
 	assert_spin_locked(pud_lockptr(mm, pud));
 
-	if (flags & FOLL_WRITE && !pud_write(*pud))
+	if (!pud_access_permitted(*pud, flags & FOLL_WRITE))
 		return NULL;
 
 	if (pud_present(*pud) && pud_devmap(*pud))
diff --git a/mm/memory.c b/mm/memory.c
index a728bed16c20..64f86beadcca 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3987,7 +3987,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 
 			/* NUMA case for anonymous PUDs would go here */
 
-			if (dirty && !pud_write(orig_pud)) {
+			if (dirty && !pud_access_permitted(orig_pud, WRITE)) {
 				ret = wp_huge_pud(&vmf, orig_pud);
 				if (!(ret & VM_FAULT_FALLBACK))
 					return ret;


* [PATCH 3/4] mm: replace pmd_write with pmd_access_permitted in fault + gup paths
  2017-11-11  0:44 [PATCH 0/4] fix device-dax pud crash and fixup {pte,pmd,pud}_write Dan Williams
  2017-11-11  0:44 ` [PATCH 1/4] mm: fix device-dax pud write-faults triggered by get_user_pages() Dan Williams
  2017-11-11  0:44 ` [PATCH 2/4] mm: replace pud_write with pud_access_permitted in fault + gup paths Dan Williams
@ 2017-11-11  0:44 ` Dan Williams
  2017-11-11  0:44 ` [PATCH 4/4] mm: replace pte_write with pte_access_permitted " Dan Williams
  3 siblings, 0 replies; 5+ messages in thread
From: Dan Williams @ 2017-11-11  0:44 UTC
  To: akpm
  Cc: linux-nvdimm, linux-kernel, linux-mm, Dave Hansen,
	Jérôme Glisse, Kirill A. Shutemov

The 'access_permitted' helper is used in the gup-fast path and goes
beyond the simple _PAGE_RW check to also:

* validate that the mapping is writable from a protection keys
  standpoint

* validate that the pmd has _PAGE_USER set, since all fault paths where
  pmd_write is used must be referencing user memory.
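
One subtlety of the mechanical conversion is that
'write && !pmd_write(pmd)' and '!pmd_access_permitted(pmd, write)' are
not identical predicates. A minimal sketch, assuming a fallback of the
shape 'present && (!write || writable)' and using hypothetical model_
names, compares the two bail-out conditions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy page table entry. */
#define MODEL_PRESENT (1u << 0)
#define MODEL_RW      (1u << 1)

static bool model_present(uint32_t e) { return (e & MODEL_PRESENT) != 0; }
static bool model_write(uint32_t e)   { return (e & MODEL_RW) != 0; }

/* Old form: bail out only when a write hits a read-only entry. */
static bool model_old_reject(uint32_t e, bool write)
{
	return write && !model_write(e);
}

/* New form: bail out whenever the assumed fallback denies the access. */
static bool model_new_reject(uint32_t e, bool write)
{
	return !(model_present(e) && (!write || model_write(e)));
}
```

For present entries the two forms agree. For a non-present entry the
new form also rejects reads, so the conversion is behaviour-preserving
only where the caller cannot reach the check with a non-present entry.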

Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/sparc/mm/gup.c |    2 +-
 fs/dax.c            |    3 ++-
 mm/hmm.c            |    4 ++--
 mm/huge_memory.c    |    4 ++--
 mm/memory.c         |    2 +-
 5 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/sparc/mm/gup.c b/arch/sparc/mm/gup.c
index 5ae2d0a01a70..33c0f8bb0f33 100644
--- a/arch/sparc/mm/gup.c
+++ b/arch/sparc/mm/gup.c
@@ -75,7 +75,7 @@ static int gup_huge_pmd(pmd_t *pmdp, pmd_t pmd, unsigned long addr,
 	if (!(pmd_val(pmd) & _PAGE_VALID))
 		return 0;
 
-	if (write && !pmd_write(pmd))
+	if (!pmd_access_permitted(pmd, write))
 		return 0;
 
 	refs = 0;
diff --git a/fs/dax.c b/fs/dax.c
index f001d8c72a06..3cc40eebbb9e 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -620,7 +620,8 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 
 			if (pfn != pmd_pfn(*pmdp))
 				goto unlock_pmd;
-			if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp))
+			if (!pmd_dirty(*pmdp)
+					&& !pmd_access_permitted(*pmdp, WRITE))
 				goto unlock_pmd;
 
 			flush_cache_page(vma, address, pfn);
diff --git a/mm/hmm.c b/mm/hmm.c
index a88a847bccba..cbdd47bf6a48 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -391,11 +391,11 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 		if (pmd_protnone(pmd))
 			return hmm_vma_walk_clear(start, end, walk);
 
-		if (write_fault && !pmd_write(pmd))
+		if (!pmd_access_permitted(pmd, write_fault))
 			return hmm_vma_walk_clear(start, end, walk);
 
 		pfn = pmd_pfn(pmd) + pte_index(addr);
-		flag |= pmd_write(pmd) ? HMM_PFN_WRITE : 0;
+		flag |= pmd_access_permitted(pmd, WRITE) ? HMM_PFN_WRITE : 0;
 		for (; addr < end; addr += PAGE_SIZE, i++, pfn++)
 			pfns[i] = hmm_pfn_t_from_pfn(pfn) | flag;
 		return 0;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1e4e11275856..411ba3ba45f8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -875,7 +875,7 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
 	 */
 	WARN_ONCE(flags & FOLL_COW, "mm: In follow_devmap_pmd with FOLL_COW set");
 
-	if (flags & FOLL_WRITE && !pmd_write(*pmd))
+	if (!pmd_access_permitted(*pmd, flags & FOLL_WRITE))
 		return NULL;
 
 	if (pmd_present(*pmd) && pmd_devmap(*pmd))
@@ -1379,7 +1379,7 @@ int do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd)
  */
 static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags)
 {
-	return pmd_write(pmd) ||
+	return pmd_access_permitted(pmd, WRITE) ||
 	       ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd));
 }
 
diff --git a/mm/memory.c b/mm/memory.c
index 64f86beadcca..157fd4320bb3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4020,7 +4020,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 			if (pmd_protnone(orig_pmd) && vma_is_accessible(vma))
 				return do_huge_pmd_numa_page(&vmf, orig_pmd);
 
-			if (dirty && !pmd_write(orig_pmd)) {
+			if (dirty && !pmd_access_permitted(orig_pmd, WRITE)) {
 				ret = wp_huge_pmd(&vmf, orig_pmd);
 				if (!(ret & VM_FAULT_FALLBACK))
 					return ret;


* [PATCH 4/4] mm: replace pte_write with pte_access_permitted in fault + gup paths
  2017-11-11  0:44 [PATCH 0/4] fix device-dax pud crash and fixup {pte,pmd,pud}_write Dan Williams
                   ` (2 preceding siblings ...)
  2017-11-11  0:44 ` [PATCH 3/4] mm: replace pmd_write with pmd_access_permitted " Dan Williams
@ 2017-11-11  0:44 ` Dan Williams
  3 siblings, 0 replies; 5+ messages in thread
From: Dan Williams @ 2017-11-11  0:44 UTC
  To: akpm
  Cc: linux-nvdimm, linux-kernel, linux-mm, Dave Hansen,
	Jérôme Glisse, Kirill A. Shutemov

The 'access_permitted' helper is used in the gup-fast path and goes
beyond the simple _PAGE_RW check to also:

* validate that the mapping is writable from a protection keys
  standpoint

* validate that the pte has _PAGE_USER set, since all fault paths where
  pte_write is used must be referencing user memory.
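
To show how the stricter check composes with the existing forced-write
exception in can_follow_write_pte(), here is a small userspace sketch.
The function name, the boolean parameters, and the MODEL_ flag values
are hypothetical; only the shape of the expression is taken from the
diff below.

```c
#include <stdbool.h>

/* Hypothetical flag values; the kernel's FOLL_* constants differ. */
#define MODEL_FOLL_FORCE (1u << 0)
#define MODEL_FOLL_COW   (1u << 1)

/*
 * write_permitted stands in for pte_access_permitted(pte, WRITE) and
 * dirty for pte_dirty(pte).
 */
static bool model_can_follow_write(bool write_permitted, bool dirty,
				   unsigned int flags)
{
	return write_permitted ||
	       ((flags & MODEL_FOLL_FORCE) && (flags & MODEL_FOLL_COW) && dirty);
}
```

The forced path is left untouched: an entry that fails the permission
check can still be followed, but only when both flags are set and the
entry is dirty.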

Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 mm/gup.c    |    2 +-
 mm/hmm.c    |    4 ++--
 mm/memory.c |    4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index b2b4d4263768..bb6542c47b08 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -66,7 +66,7 @@ static int follow_pfn_pte(struct vm_area_struct *vma, unsigned long address,
  */
 static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
 {
-	return pte_write(pte) ||
+	return pte_access_permitted(pte, WRITE) ||
 		((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
 }
 
diff --git a/mm/hmm.c b/mm/hmm.c
index cbdd47bf6a48..3d2e49fd851a 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -456,11 +456,11 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 			continue;
 		}
 
-		if (write_fault && !pte_write(pte))
+		if (!pte_access_permitted(pte, write_fault))
 			goto fault;
 
 		pfns[i] = hmm_pfn_t_from_pfn(pte_pfn(pte)) | flag;
-		pfns[i] |= pte_write(pte) ? HMM_PFN_WRITE : 0;
+		pfns[i] |= pte_access_permitted(pte, WRITE) ? HMM_PFN_WRITE : 0;
 		continue;
 
 fault:
diff --git a/mm/memory.c b/mm/memory.c
index 157fd4320bb3..a8cbc2c3e3c9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3922,7 +3922,7 @@ static int handle_pte_fault(struct vm_fault *vmf)
 	if (unlikely(!pte_same(*vmf->pte, entry)))
 		goto unlock;
 	if (vmf->flags & FAULT_FLAG_WRITE) {
-		if (!pte_write(entry))
+		if (!pte_access_permitted(entry, WRITE))
 			return do_wp_page(vmf);
 		entry = pte_mkdirty(entry);
 	}
@@ -4308,7 +4308,7 @@ int follow_phys(struct vm_area_struct *vma,
 		goto out;
 	pte = *ptep;
 
-	if ((flags & FOLL_WRITE) && !pte_write(pte))
+	if (!pte_access_permitted(pte, flags & FOLL_WRITE))
 		goto unlock;
 
 	*prot = pgprot_val(pte_pgprot(pte));

