linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] mm: Unexport apply_to_page_range()
@ 2021-04-12  8:00 Peter Zijlstra
  2021-04-12  8:00 ` [PATCH 1/7] mm: Unexport apply_to_existing_page_range() Peter Zijlstra
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  8:00 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, peterz, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

Hi,

These patches remove the last few modular apply_to_page_range() users and
unexports these interface. Since these functions provide direct access to our
page-tables they're a prime target for nefarious activities.

These patches rely on the remap_pfn_range_notrack() patches from Christoph that
are currently already in Andrew's tree.

Please consider.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/7] mm: Unexport apply_to_existing_page_range()
  2021-04-12  8:00 [PATCH 0/7] mm: Unexport apply_to_page_range() Peter Zijlstra
@ 2021-04-12  8:00 ` Peter Zijlstra
  2021-04-12  8:13   ` Christoph Hellwig
  2021-04-12  8:00 ` [PATCH 2/7] xen/gntdev,x86: Remove apply_to_page_range() use from module Peter Zijlstra
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  8:00 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, peterz, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

There are no modular in-tree users, remove the EXPORT.

This is an unsafe function in that it gives direct access to the
page-tables.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 mm/memory.c |    1 -
 1 file changed, 1 deletion(-)

--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2558,7 +2558,6 @@ int apply_to_existing_page_range(struct
 {
 	return __apply_to_page_range(mm, addr, size, fn, data, false);
 }
-EXPORT_SYMBOL_GPL(apply_to_existing_page_range);
 
 /*
  * handle_pte_fault chooses page fault handler according to an entry which was



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 2/7] xen/gntdev,x86: Remove apply_to_page_range() use from module
  2021-04-12  8:00 [PATCH 0/7] mm: Unexport apply_to_page_range() Peter Zijlstra
  2021-04-12  8:00 ` [PATCH 1/7] mm: Unexport apply_to_existing_page_range() Peter Zijlstra
@ 2021-04-12  8:00 ` Peter Zijlstra
  2021-04-12  8:26   ` Christoph Hellwig
  2021-04-12  8:00 ` [PATCH 3/7] xen/gntdev: " Peter Zijlstra
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  8:00 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, peterz, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

Instead of relying on apply_to_page_range() being available to
modules, move its use into core kernel code and export it's
application.

NOTE: ideally we do: use_ptemod = !auto_translate_physmap &&
gnttab_map_avail_bits and remove this hack.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/xen/page.h |    2 ++
 arch/x86/xen/mmu.c              |   26 ++++++++++++++++++++++++++
 drivers/xen/gntdev.c            |   23 +----------------------
 3 files changed, 29 insertions(+), 22 deletions(-)

--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -370,4 +370,6 @@ static inline unsigned long xen_get_swio
 	return __get_free_pages(__GFP_NOWARN, order);
 }
 
+extern void xen_set_grant_as_special(struct vm_area_struct *vma);
+
 #endif /* _ASM_X86_XEN_PAGE_H */
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -51,3 +51,29 @@ int xen_unmap_domain_gfn_range(struct vm
 	return -EINVAL;
 }
 EXPORT_SYMBOL_GPL(xen_unmap_domain_gfn_range);
+
+static int set_grant_ptes_as_special(pte_t *pte, unsigned long addr, void *data)
+{
+	set_pte_at(current->mm, addr, pte, pte_mkspecial(*pte));
+	return 0;
+}
+
+void xen_set_grant_as_special(struct vm_area_struct *vma)
+{
+	if (xen_feature(XENFEAT_gnttab_map_avail_bits))
+		return;
+
+	/*
+	 * If the PTEs were not made special by the grant map
+	 * hypercall, do so here.
+	 *
+	 * This is racy since the mapping is already visible
+	 * to userspace but userspace should be well-behaved
+	 * enough to not touch it until the mmap() call
+	 * returns.
+	 */
+	apply_to_page_range(vma->vm_mm, vma->vm_start,
+			    vma->vm_end - vma->vm_start,
+			    set_grant_ptes_as_special, NULL);
+}
+EXPORT_SYMBOL_GPL(xen_set_grant_as_special);
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -278,14 +278,6 @@ static int find_grant_ptes(pte_t *pte, u
 	return 0;
 }
 
-#ifdef CONFIG_X86
-static int set_grant_ptes_as_special(pte_t *pte, unsigned long addr, void *data)
-{
-	set_pte_at(current->mm, addr, pte, pte_mkspecial(*pte));
-	return 0;
-}
-#endif
-
 int gntdev_map_grant_pages(struct gntdev_grant_map *map)
 {
 	int i, err = 0;
@@ -1040,20 +1032,7 @@ static int gntdev_mmap(struct file *flip
 			goto out_put_map;
 	} else {
 #ifdef CONFIG_X86
-		/*
-		 * If the PTEs were not made special by the grant map
-		 * hypercall, do so here.
-		 *
-		 * This is racy since the mapping is already visible
-		 * to userspace but userspace should be well-behaved
-		 * enough to not touch it until the mmap() call
-		 * returns.
-		 */
-		if (!xen_feature(XENFEAT_gnttab_map_avail_bits)) {
-			apply_to_page_range(vma->vm_mm, vma->vm_start,
-					    vma->vm_end - vma->vm_start,
-					    set_grant_ptes_as_special, NULL);
-		}
+		xen_set_grant_as_special(vma);
 #endif
 	}
 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 3/7] xen/gntdev: Remove apply_to_page_range() use from module
  2021-04-12  8:00 [PATCH 0/7] mm: Unexport apply_to_page_range() Peter Zijlstra
  2021-04-12  8:00 ` [PATCH 1/7] mm: Unexport apply_to_existing_page_range() Peter Zijlstra
  2021-04-12  8:00 ` [PATCH 2/7] xen/gntdev,x86: Remove apply_to_page_range() use from module Peter Zijlstra
@ 2021-04-12  8:00 ` Peter Zijlstra
  2021-04-12  8:27   ` Christoph Hellwig
  2021-04-12  8:00 ` [PATCH 4/7] mm: Introduce verify_page_range() Peter Zijlstra
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  8:00 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, peterz, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

Instead of relying on apply_to_page_range() being available to
modules, move its use into core kernel code and export it's
application.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 drivers/xen/gntdev-common.h |    2 ++
 drivers/xen/gntdev.c        |   30 +-----------------------------
 drivers/xen/grant-table.c   |   37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 40 insertions(+), 29 deletions(-)

--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -86,4 +86,6 @@ bool gntdev_test_page_count(unsigned int
 
 int gntdev_map_grant_pages(struct gntdev_grant_map *map);
 
+int gnttab_use_ptemod(struct vm_area_struct *vma, struct gntdev_grant_map *map);
+
 #endif
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -262,32 +262,6 @@ void gntdev_put_map(struct gntdev_priv *
 
 /* ------------------------------------------------------------------ */
 
-static int find_grant_ptes(pte_t *pte, unsigned long addr, void *data)
-{
-	struct gntdev_grant_map *map = data;
-	unsigned int pgnr = (addr - map->vma->vm_start) >> PAGE_SHIFT;
-	int flags = map->flags | GNTMAP_application_map | GNTMAP_contains_pte;
-	u64 pte_maddr;
-
-	BUG_ON(pgnr >= map->count);
-	pte_maddr = arbitrary_virt_to_machine(pte).maddr;
-
-	/*
-	 * Set the PTE as special to force get_user_pages_fast() fall
-	 * back to the slow path.  If this is not supported as part of
-	 * the grant map, it will be done afterwards.
-	 */
-	if (xen_feature(XENFEAT_gnttab_map_avail_bits))
-		flags |= (1 << _GNTMAP_guest_avail0);
-
-	gnttab_set_map_op(&map->map_ops[pgnr], pte_maddr, flags,
-			  map->grants[pgnr].ref,
-			  map->grants[pgnr].domid);
-	gnttab_set_unmap_op(&map->unmap_ops[pgnr], pte_maddr, flags,
-			    INVALID_GRANT_HANDLE);
-	return 0;
-}
-
 int gntdev_map_grant_pages(struct gntdev_grant_map *map)
 {
 	int i, err = 0;
@@ -1028,9 +1002,7 @@ static int gntdev_mmap(struct file *flip
 		mmu_interval_read_begin(&map->notifier);
 
 		map->pages_vm_start = vma->vm_start;
-		err = apply_to_page_range(vma->vm_mm, vma->vm_start,
-					  vma->vm_end - vma->vm_start,
-					  find_grant_ptes, map);
+		err = gnttab_use_ptemod(vma, map);
 		if (err) {
 			pr_warn("find_grant_ptes() failure.\n");
 			goto out_put_map;
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -1591,6 +1591,43 @@ int gnttab_init(void)
 }
 EXPORT_SYMBOL_GPL(gnttab_init);
 
+#include <xen/gntdev.h>
+#include "gntdev-common.h"
+
+static int find_grant_ptes(pte_t *pte, unsigned long addr, void *data)
+{
+	struct gntdev_grant_map *map = data;
+	unsigned int pgnr = (addr - map->vma->vm_start) >> PAGE_SHIFT;
+	int flags = map->flags | GNTMAP_application_map | GNTMAP_contains_pte;
+	u64 pte_maddr;
+
+	BUG_ON(pgnr >= map->count);
+	pte_maddr = arbitrary_virt_to_machine(pte).maddr;
+
+	/*
+	 * Set the PTE as special to force get_user_pages_fast() fall
+	 * back to the slow path.  If this is not supported as part of
+	 * the grant map, it will be done afterwards.
+	 */
+	if (xen_feature(XENFEAT_gnttab_map_avail_bits))
+		flags |= (1 << _GNTMAP_guest_avail0);
+
+	gnttab_set_map_op(&map->map_ops[pgnr], pte_maddr, flags,
+			  map->grants[pgnr].ref,
+			  map->grants[pgnr].domid);
+	gnttab_set_unmap_op(&map->unmap_ops[pgnr], pte_maddr, flags,
+			    INVALID_GRANT_HANDLE);
+	return 0;
+}
+
+int gnttab_use_ptemod(struct vm_area_struct *vma, struct gntdev_grant_map *map)
+{
+	return apply_to_page_range(vma->vm_mm, vma->vm_start,
+				   vma->vm_end - vma->vm_start,
+				   find_grant_ptes, map);
+}
+EXPORT_SYMBOL_GPL(gnttab_use_ptemod);
+
 static int __gnttab_init(void)
 {
 	if (!xen_domain())



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 4/7] mm: Introduce verify_page_range()
  2021-04-12  8:00 [PATCH 0/7] mm: Unexport apply_to_page_range() Peter Zijlstra
                   ` (2 preceding siblings ...)
  2021-04-12  8:00 ` [PATCH 3/7] xen/gntdev: " Peter Zijlstra
@ 2021-04-12  8:00 ` Peter Zijlstra
  2021-04-12  8:28   ` Christoph Hellwig
  2021-04-12 20:05   ` Kees Cook
  2021-04-12  8:00 ` [PATCH 5/7] xen/privcmd: Use verify_page_range() Peter Zijlstra
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  8:00 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, peterz, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

Introduce and EXPORT a read-only counterpart to apply_to_page_range().

It only exposes the PTE value, not a pointer to the pagetables itself
and is thus quite a bit safer to export. A number of
apply_to_page_range() users can be converted to this primitive.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/mm.h |    4 ++++
 mm/memory.c        |   24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2876,6 +2876,10 @@ extern int apply_to_page_range(struct mm
 extern int apply_to_existing_page_range(struct mm_struct *mm,
 				   unsigned long address, unsigned long size,
 				   pte_fn_t fn, void *data);
+extern int verify_page_range(struct mm_struct *mm,
+			     unsigned long addr, unsigned long size,
+			     int (*fn)(pte_t pte, unsigned long addr, void *data),
+			     void *data);
 
 extern void init_mem_debugging_and_hardening(void);
 #ifdef CONFIG_PAGE_POISONING
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2559,6 +2559,30 @@ int apply_to_existing_page_range(struct
 	return __apply_to_page_range(mm, addr, size, fn, data, false);
 }
 
+struct vpr_data {
+	int (*fn)(pte_t pte, unsigned long addr, void *data);
+	void *data;
+};
+
+static int vpr_fn(pte_t *pte, unsigned long addr, void *data)
+{
+	struct vpr_data *vpr = data;
+	return vpr->fn(*pte, addr, vpr->data);
+}
+
+int verify_page_range(struct mm_struct *mm,
+		      unsigned long addr, unsigned long size,
+		      int (*fn)(pte_t pte, unsigned long addr, void *data),
+		      void *data)
+{
+	struct vpr_data vpr = {
+		.fn = fn,
+		.data = data,
+	};
+	return apply_to_page_range(mm, addr, size, vpr_fn, &vpr);
+}
+EXPORT_SYMBOL_GPL(verify_page_range);
+
 /*
  * handle_pte_fault chooses page fault handler according to an entry which was
  * read non-atomically.  Before making any commitment, on those architectures



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 5/7] xen/privcmd: Use verify_page_range()
  2021-04-12  8:00 [PATCH 0/7] mm: Unexport apply_to_page_range() Peter Zijlstra
                   ` (3 preceding siblings ...)
  2021-04-12  8:00 ` [PATCH 4/7] mm: Introduce verify_page_range() Peter Zijlstra
@ 2021-04-12  8:00 ` Peter Zijlstra
  2021-04-12  8:28   ` Christoph Hellwig
  2021-04-12  8:00 ` [PATCH 6/7] i915: Convert to verify_page_range() Peter Zijlstra
  2021-04-12  8:00 ` [PATCH 7/7] mm: Unexport apply_to_page_range() Peter Zijlstra
  6 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  8:00 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, peterz, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

Avoid using apply_to_page_range() from modules, use the safer
verify_page_range() instead.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 drivers/xen/privcmd.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -946,9 +946,9 @@ static int privcmd_mmap(struct file *fil
  * on a per pfn/pte basis. Mapping calls that fail with ENOENT
  * can be then retried until success.
  */
-static int is_mapped_fn(pte_t *pte, unsigned long addr, void *data)
+static int is_mapped_fn(pte_t pte, unsigned long addr, void *data)
 {
-	return pte_none(*pte) ? 0 : -EBUSY;
+	return pte_none(pte) ? 0 : -EBUSY;
 }
 
 static int privcmd_vma_range_is_mapped(
@@ -956,8 +956,8 @@ static int privcmd_vma_range_is_mapped(
 	           unsigned long addr,
 	           unsigned long nr_pages)
 {
-	return apply_to_page_range(vma->vm_mm, addr, nr_pages << PAGE_SHIFT,
-				   is_mapped_fn, NULL) != 0;
+	return verify_page_range(vma->vm_mm, addr, nr_pages << PAGE_SHIFT,
+				 is_mapped_fn, NULL) != 0;
 }
 
 const struct file_operations xen_privcmd_fops = {



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 6/7] i915: Convert to verify_page_range()
  2021-04-12  8:00 [PATCH 0/7] mm: Unexport apply_to_page_range() Peter Zijlstra
                   ` (4 preceding siblings ...)
  2021-04-12  8:00 ` [PATCH 5/7] xen/privcmd: Use verify_page_range() Peter Zijlstra
@ 2021-04-12  8:00 ` Peter Zijlstra
  2021-04-12  8:28   ` Christoph Hellwig
                     ` (2 more replies)
  2021-04-12  8:00 ` [PATCH 7/7] mm: Unexport apply_to_page_range() Peter Zijlstra
  6 siblings, 3 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  8:00 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, peterz, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

check_{present,absent}() only need R/O access, use verify_page_range()
instead to remove modular use of apply_to_page_range().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -1225,9 +1225,9 @@ static int igt_mmap_gpu(void *arg)
 	return 0;
 }
 
-static int check_present_pte(pte_t *pte, unsigned long addr, void *data)
+static int check_present_pte(pte_t pte, unsigned long addr, void *data)
 {
-	if (!pte_present(*pte) || pte_none(*pte)) {
+	if (!pte_present(pte) || pte_none(pte)) {
 		pr_err("missing PTE:%lx\n",
 		       (addr - (unsigned long)data) >> PAGE_SHIFT);
 		return -EINVAL;
@@ -1236,9 +1236,9 @@ static int check_present_pte(pte_t *pte,
 	return 0;
 }
 
-static int check_absent_pte(pte_t *pte, unsigned long addr, void *data)
+static int check_absent_pte(pte_t pte, unsigned long addr, void *data)
 {
-	if (pte_present(*pte) && !pte_none(*pte)) {
+	if (pte_present(pte) && !pte_none(pte)) {
 		pr_err("present PTE:%lx; expected to be revoked\n",
 		       (addr - (unsigned long)data) >> PAGE_SHIFT);
 		return -EINVAL;
@@ -1249,14 +1249,14 @@ static int check_absent_pte(pte_t *pte,
 
 static int check_present(unsigned long addr, unsigned long len)
 {
-	return apply_to_page_range(current->mm, addr, len,
-				   check_present_pte, (void *)addr);
+	return verify_page_range(current->mm, addr, len,
+				 check_present_pte, (void *)addr);
 }
 
 static int check_absent(unsigned long addr, unsigned long len)
 {
-	return apply_to_page_range(current->mm, addr, len,
-				   check_absent_pte, (void *)addr);
+	return verify_page_range(current->mm, addr, len,
+				 check_absent_pte, (void *)addr);
 }
 
 static int prefault_range(u64 start, u64 len)



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 7/7] mm: Unexport apply_to_page_range()
  2021-04-12  8:00 [PATCH 0/7] mm: Unexport apply_to_page_range() Peter Zijlstra
                   ` (5 preceding siblings ...)
  2021-04-12  8:00 ` [PATCH 6/7] i915: Convert to verify_page_range() Peter Zijlstra
@ 2021-04-12  8:00 ` Peter Zijlstra
  2021-04-12  8:28   ` Christoph Hellwig
  6 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  8:00 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, peterz, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

Now that all module users of apply_to_page_range() have been removed,
unexport this function.

This is an unsafe function in that it gives direct access to the
page-tables.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 mm/memory.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2544,13 +2544,14 @@ static int __apply_to_page_range(struct
 /*
  * Scan a region of virtual memory, filling in page tables as necessary
  * and calling a provided function on each leaf page table.
+ *
+ * DO NOT EXPORT; this hands out our page-tables on a platter.
  */
 int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 			unsigned long size, pte_fn_t fn, void *data)
 {
 	return __apply_to_page_range(mm, addr, size, fn, data, true);
 }
-EXPORT_SYMBOL_GPL(apply_to_page_range);
 
 /*
  * Scan a region of virtual memory, calling a provided function on
@@ -2558,6 +2559,8 @@ EXPORT_SYMBOL_GPL(apply_to_page_range);
  *
  * Unlike apply_to_page_range, this does _not_ fill in page tables
  * where they are absent.
+ *
+ * DO NOT EXPORT; this hands out our page-tables on a platter.
  */
 int apply_to_existing_page_range(struct mm_struct *mm, unsigned long addr,
 				 unsigned long size, pte_fn_t fn, void *data)



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/7] mm: Unexport apply_to_existing_page_range()
  2021-04-12  8:00 ` [PATCH 1/7] mm: Unexport apply_to_existing_page_range() Peter Zijlstra
@ 2021-04-12  8:13   ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2021-04-12  8:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

On Mon, Apr 12, 2021 at 10:00:13AM +0200, Peter Zijlstra wrote:
> There are no modular in-tree users, remove the EXPORT.
> 
> This is an unsafe function in that it gives direct access to the
> page-tables.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/7] xen/gntdev,x86: Remove apply_to_page_range() use from module
  2021-04-12  8:00 ` [PATCH 2/7] xen/gntdev,x86: Remove apply_to_page_range() use from module Peter Zijlstra
@ 2021-04-12  8:26   ` Christoph Hellwig
  2021-04-12  9:20     ` Peter Zijlstra
  2021-04-12  9:26     ` Juergen Gross
  0 siblings, 2 replies; 28+ messages in thread
From: Christoph Hellwig @ 2021-04-12  8:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

On Mon, Apr 12, 2021 at 10:00:14AM +0200, Peter Zijlstra wrote:
> Instead of relying on apply_to_page_range() being available to
> modules, move its use into core kernel code and export it's
> application.

This doesn't exactly look great, but at least it contains the damage..

> 
> NOTE: ideally we do: use_ptemod = !auto_translate_physmap &&
> gnttab_map_avail_bits and remove this hack.

Given how much pain the !auto_translate_physmap case causes all over
the kernel I wonder what a realistic timeline might be for dropping
support for this case might be..

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/7] xen/gntdev: Remove apply_to_page_range() use from module
  2021-04-12  8:00 ` [PATCH 3/7] xen/gntdev: " Peter Zijlstra
@ 2021-04-12  8:27   ` Christoph Hellwig
  2021-04-12  9:02     ` Peter Zijlstra
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2021-04-12  8:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

On Mon, Apr 12, 2021 at 10:00:15AM +0200, Peter Zijlstra wrote:
> --- a/drivers/xen/grant-table.c
> +++ b/drivers/xen/grant-table.c
> @@ -1591,6 +1591,43 @@ int gnttab_init(void)
>  }
>  EXPORT_SYMBOL_GPL(gnttab_init);
>  
> +#include <xen/gntdev.h>
> +#include "gntdev-common.h"

Can't we keep the includes at the top of the file?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] mm: Introduce verify_page_range()
  2021-04-12  8:00 ` [PATCH 4/7] mm: Introduce verify_page_range() Peter Zijlstra
@ 2021-04-12  8:28   ` Christoph Hellwig
  2021-04-12  9:17     ` Peter Zijlstra
  2021-04-12 20:05   ` Kees Cook
  1 sibling, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2021-04-12  8:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

On Mon, Apr 12, 2021 at 10:00:16AM +0200, Peter Zijlstra wrote:
> +extern int verify_page_range(struct mm_struct *mm,

No need for the extern here.

> +int verify_page_range(struct mm_struct *mm,
> +		      unsigned long addr, unsigned long size,
> +		      int (*fn)(pte_t pte, unsigned long addr, void *data),
> +		      void *data)

A kerneldoc comment would be nice for this function.

Otherwise this looks fine.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/7] xen/privcmd: Use verify_page_range()
  2021-04-12  8:00 ` [PATCH 5/7] xen/privcmd: Use verify_page_range() Peter Zijlstra
@ 2021-04-12  8:28   ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2021-04-12  8:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/7] i915: Convert to verify_page_range()
  2021-04-12  8:00 ` [PATCH 6/7] i915: Convert to verify_page_range() Peter Zijlstra
@ 2021-04-12  8:28   ` Christoph Hellwig
  2021-04-12 20:08   ` Kees Cook
  2021-04-14  3:04   ` Kees Cook
  2 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2021-04-12  8:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] mm: Unexport apply_to_page_range()
  2021-04-12  8:00 ` [PATCH 7/7] mm: Unexport apply_to_page_range() Peter Zijlstra
@ 2021-04-12  8:28   ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2021-04-12  8:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook, hch

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/7] xen/gntdev: Remove apply_to_page_range() use from module
  2021-04-12  8:27   ` Christoph Hellwig
@ 2021-04-12  9:02     ` Peter Zijlstra
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  9:02 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook

On Mon, Apr 12, 2021 at 10:27:21AM +0200, Christoph Hellwig wrote:
> On Mon, Apr 12, 2021 at 10:00:15AM +0200, Peter Zijlstra wrote:
> > --- a/drivers/xen/grant-table.c
> > +++ b/drivers/xen/grant-table.c
> > @@ -1591,6 +1591,43 @@ int gnttab_init(void)
> >  }
> >  EXPORT_SYMBOL_GPL(gnttab_init);
> >  
> > +#include <xen/gntdev.h>
> > +#include "gntdev-common.h"
> 
> Can't we keep the includes at the top of the file?

Probably could, lemme move them.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] mm: Introduce verify_page_range()
  2021-04-12  8:28   ` Christoph Hellwig
@ 2021-04-12  9:17     ` Peter Zijlstra
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  9:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook

On Mon, Apr 12, 2021 at 10:28:05AM +0200, Christoph Hellwig wrote:
> On Mon, Apr 12, 2021 at 10:00:16AM +0200, Peter Zijlstra wrote:
> > +extern int verify_page_range(struct mm_struct *mm,
> 
> No need for the extern here.

It's consistent with the rest of the functions there. Also, I personally
like that extern.

> > +int verify_page_range(struct mm_struct *mm,
> > +		      unsigned long addr, unsigned long size,
> > +		      int (*fn)(pte_t pte, unsigned long addr, void *data),
> > +		      void *data)
> 
> A kerneldoc comment would be nice for this function.
> 
> Otherwise this looks fine.

Something like so?

/**
 * verify_page_range() - Scan (and fill) a range of virtual memory and validate PTEs
 * @mm: mm identifying the virtual memory map
 * @addr: starting virtual address of the range
 * @size: size of the range
 * @fn: function that verifies the PTEs
 * @data: opaque data passed to @fn
 *
 * Scan a region of virtual memory, filling in page tables as necessary and
 * calling a provided function on each leaf, providing a copy of the
 * page-table-entry.
 *
 * Similar apply_to_page_range(), but does not provide direct access to the
 * page-tables.
 *
 * NOTE! this function does not work correctly vs large pages.
 *
 * Return: the first !0 return of the provided function, or 0 on completion.
 */
int verify_page_range(struct mm_struct *mm,
		      unsigned long addr, unsigned long size,
		      int (*fn)(pte_t pte, unsigned long addr, void *data),
		      void *data)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/7] xen/gntdev,x86: Remove apply_to_page_range() use from module
  2021-04-12  8:26   ` Christoph Hellwig
@ 2021-04-12  9:20     ` Peter Zijlstra
  2021-04-12  9:26     ` Juergen Gross
  1 sibling, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-12  9:20 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook

On Mon, Apr 12, 2021 at 10:26:40AM +0200, Christoph Hellwig wrote:
> On Mon, Apr 12, 2021 at 10:00:14AM +0200, Peter Zijlstra wrote:
> > Instead of relying on apply_to_page_range() being available to
> > modules, move its use into core kernel code and export it's
> > application.
> 
> This doesn't exactly look great, but at least it contains the damage..

That was just about as far as I got...

> > NOTE: ideally we do: use_ptemod = !auto_translate_physmap &&
> > gnttab_map_avail_bits and remove this hack.
> 
> Given how much pain the !auto_translate_physmap case causes all over
> the kernel I wonder what a realistic timeline might be for dropping
> support for this case might be..

I've no experience with any of that; it looks absolutely disguisting,
all of it. I figured that's part of the Xen 'charm' :-)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/7] xen/gntdev,x86: Remove apply_to_page_range() use from module
  2021-04-12  8:26   ` Christoph Hellwig
  2021-04-12  9:20     ` Peter Zijlstra
@ 2021-04-12  9:26     ` Juergen Gross
  1 sibling, 0 replies; 28+ messages in thread
From: Juergen Gross @ 2021-04-12  9:26 UTC (permalink / raw)
  To: Christoph Hellwig, Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, keescook


[-- Attachment #1.1.1: Type: text/plain, Size: 878 bytes --]

On 12.04.21 10:26, Christoph Hellwig wrote:
> On Mon, Apr 12, 2021 at 10:00:14AM +0200, Peter Zijlstra wrote:
>> Instead of relying on apply_to_page_range() being available to
>> modules, move its use into core kernel code and export it's
>> application.
> 
> This doesn't exactly look great, but at least it contains the damage..
> 
>>
>> NOTE: ideally we do: use_ptemod = !auto_translate_physmap &&
>> gnttab_map_avail_bits and remove this hack.
> 
> Given how much pain the !auto_translate_physmap case causes all over
> the kernel I wonder what a realistic timeline might be for dropping
> support for this case might be..

Think in the order of years.

It is basically the Xen PV guest support you are speaking of here, and
the planned replacement PVH especially for dom0 is still lacking some
functionality and it has performance issues.


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] mm: Introduce verify_page_range()
  2021-04-12  8:00 ` [PATCH 4/7] mm: Introduce verify_page_range() Peter Zijlstra
  2021-04-12  8:28   ` Christoph Hellwig
@ 2021-04-12 20:05   ` Kees Cook
  2021-04-13  6:54     ` Peter Zijlstra
  2021-04-13  7:36     ` Peter Zijlstra
  1 sibling, 2 replies; 28+ messages in thread
From: Kees Cook @ 2021-04-12 20:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, hch

On Mon, Apr 12, 2021 at 10:00:16AM +0200, Peter Zijlstra wrote:
> +struct vpr_data {
> +	int (*fn)(pte_t pte, unsigned long addr, void *data);
> +	void *data;
> +};

Eeerg. This is likely to become an attack target itself. Stored function
pointer with stored (3rd) argument.

This doesn't seem needed: only DRM uses it, and that's for error
reporting. I'd rather plumb back errors in a way to not have to add
another place in the kernel where we do func+arg stored calling.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/7] i915: Convert to verify_page_range()
  2021-04-12  8:00 ` [PATCH 6/7] i915: Convert to verify_page_range() Peter Zijlstra
  2021-04-12  8:28   ` Christoph Hellwig
@ 2021-04-12 20:08   ` Kees Cook
  2021-04-13  6:54     ` Peter Zijlstra
  2021-04-14  3:04   ` Kees Cook
  2 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2021-04-12 20:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, hch

On Mon, Apr 12, 2021 at 10:00:18AM +0200, Peter Zijlstra wrote:
> check_{present,absent}() only need R/O access, use verify_page_range()
> instead to remove modular use of apply_to_page_range().
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c |   16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> @@ -1225,9 +1225,9 @@ static int igt_mmap_gpu(void *arg)
>  	return 0;
>  }
>  
> -static int check_present_pte(pte_t *pte, unsigned long addr, void *data)
> +static int check_present_pte(pte_t pte, unsigned long addr, void *data)
>  {
> -	if (!pte_present(*pte) || pte_none(*pte)) {
> +	if (!pte_present(pte) || pte_none(pte)) {
>  		pr_err("missing PTE:%lx\n",
>  		       (addr - (unsigned long)data) >> PAGE_SHIFT);
>  		return -EINVAL;
> @@ -1236,9 +1236,9 @@ static int check_present_pte(pte_t *pte,
>  	return 0;
>  }
>  
> -static int check_absent_pte(pte_t *pte, unsigned long addr, void *data)
> +static int check_absent_pte(pte_t pte, unsigned long addr, void *data)
>  {
> -	if (pte_present(*pte) && !pte_none(*pte)) {
> +	if (pte_present(pte) && !pte_none(pte)) {
>  		pr_err("present PTE:%lx; expected to be revoked\n",
>  		       (addr - (unsigned long)data) >> PAGE_SHIFT);
>  		return -EINVAL;
> @@ -1249,14 +1249,14 @@ static int check_absent_pte(pte_t *pte,
>  
>  static int check_present(unsigned long addr, unsigned long len)
>  {
> -	return apply_to_page_range(current->mm, addr, len,
> -				   check_present_pte, (void *)addr);
> +	return verify_page_range(current->mm, addr, len,
> +				 check_present_pte, (void *)addr);

For example, switch to returning bad addr through verify_page_range(),
or have a by-reference value, etc:

	unsigned long failed;

	failed = verify_page_range(current->mm< addr, len, check_present_pte);
	if (failed) {
  		pr_err("missing PTE:%lx\n",
  		       (addr - failed) >> PAGE_SHIFT);


-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] mm: Introduce verify_page_range()
  2021-04-12 20:05   ` Kees Cook
@ 2021-04-13  6:54     ` Peter Zijlstra
  2021-04-19 23:36       ` Kees Cook
  2021-04-13  7:36     ` Peter Zijlstra
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-13  6:54 UTC (permalink / raw)
  To: Kees Cook
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, hch

On Mon, Apr 12, 2021 at 01:05:09PM -0700, Kees Cook wrote:
> On Mon, Apr 12, 2021 at 10:00:16AM +0200, Peter Zijlstra wrote:
> > +struct vpr_data {
> > +	int (*fn)(pte_t pte, unsigned long addr, void *data);
> > +	void *data;
> > +};
> 
> Eeerg. This is likely to become an attack target itself. Stored function
> pointer with stored (3rd) argument.

You got some further reading on that? How exactly are those exploited?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/7] i915: Convert to verify_page_range()
  2021-04-12 20:08   ` Kees Cook
@ 2021-04-13  6:54     ` Peter Zijlstra
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-13  6:54 UTC (permalink / raw)
  To: Kees Cook
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, hch

On Mon, Apr 12, 2021 at 01:08:38PM -0700, Kees Cook wrote:
> On Mon, Apr 12, 2021 at 10:00:18AM +0200, Peter Zijlstra wrote:
> > @@ -1249,14 +1249,14 @@ static int check_absent_pte(pte_t *pte,
> >  
> >  static int check_present(unsigned long addr, unsigned long len)
> >  {
> > -	return apply_to_page_range(current->mm, addr, len,
> > -				   check_present_pte, (void *)addr);
> > +	return verify_page_range(current->mm, addr, len,
> > +				 check_present_pte, (void *)addr);
> 
> For example, switch to returning bad addr through verify_page_range(),
> or have a by-reference value, etc:
> 
> 	unsigned long failed;
> 
> 	failed = verify_page_range(current->mm< addr, len, check_present_pte);
> 	if (failed) {
>   		pr_err("missing PTE:%lx\n",
>   		       (addr - failed) >> PAGE_SHIFT);

OK, lemme try that.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] mm: Introduce verify_page_range()
  2021-04-12 20:05   ` Kees Cook
  2021-04-13  6:54     ` Peter Zijlstra
@ 2021-04-13  7:36     ` Peter Zijlstra
  2021-04-14  3:01       ` Kees Cook
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-13  7:36 UTC (permalink / raw)
  To: Kees Cook
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, hch

On Mon, Apr 12, 2021 at 01:05:09PM -0700, Kees Cook wrote:
> On Mon, Apr 12, 2021 at 10:00:16AM +0200, Peter Zijlstra wrote:
> > +struct vpr_data {
> > +	int (*fn)(pte_t pte, unsigned long addr, void *data);
> > +	void *data;
> > +};
> 
> Eeerg. This is likely to become an attack target itself. Stored function
> pointer with stored (3rd) argument.
> 
> This doesn't seem needed: only DRM uses it, and that's for error
> reporting. I'd rather plumb back errors in a way to not have to add
> another place in the kernel where we do func+arg stored calling.

Is this any better? It does have the stored pointer, but not a stored
argument, assuming you don't count returns as arguments I suppose.

The alternative is refactoring apply_to_page_range() :-/

---

struct vpr_data {
	bool (*fn)(pte_t pte, unsigned long addr);
	unsigned long addr;
};

static int vpr_fn(pte_t *pte, unsigned long addr, void *data)
{
	struct vpr_data *vpr = data;
	if (!vpr->fn(*pte, addr)) {
		vpr->addr = addr;
		return -EINVAL;
	}
	return 0;
}

/**
 * verify_page_range() - Scan (and fill) a range of virtual memory and validate PTEs
 * @mm: mm identifying the virtual memory map
 * @addr: starting virtual address of the range
 * @size: size of the range
 * @fn: function that verifies the PTEs
 *
 * Scan a region of virtual memory, filling in page tables as necessary and
 * calling a provided function on each leaf, providing a copy of the
 * page-table-entry.
 *
 * Similar apply_to_page_range(), but does not provide direct access to the
 * page-tables.
 *
 * NOTE! this function does not work correctly vs large pages.
 *
 * Return: the address that failed verification or 0 on success.
 */
unsigned long verify_page_range(struct mm_struct *mm,
				unsigned long addr, unsigned long size,
				bool (*fn)(pte_t pte, unsigned long addr))
{
	struct vpr_data vpr = {
		.fn = fn,
		.addr = 0,
	};
	apply_to_page_range(mm, addr, size, vpr_fn, &vpr);
	return vpr.addr;
}
EXPORT_SYMBOL_GPL(verify_page_range);

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] mm: Introduce verify_page_range()
  2021-04-13  7:36     ` Peter Zijlstra
@ 2021-04-14  3:01       ` Kees Cook
  2021-04-14  7:00         ` Peter Zijlstra
  0 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2021-04-14  3:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, hch

On Tue, Apr 13, 2021 at 09:36:32AM +0200, Peter Zijlstra wrote:
> On Mon, Apr 12, 2021 at 01:05:09PM -0700, Kees Cook wrote:
> > On Mon, Apr 12, 2021 at 10:00:16AM +0200, Peter Zijlstra wrote:
> > > +struct vpr_data {
> > > +	int (*fn)(pte_t pte, unsigned long addr, void *data);
> > > +	void *data;
> > > +};
> > 
> > Eeerg. This is likely to become an attack target itself. Stored function
> > pointer with stored (3rd) argument.
> > 
> > This doesn't seem needed: only DRM uses it, and that's for error
> > reporting. I'd rather plumb back errors in a way to not have to add
> > another place in the kernel where we do func+arg stored calling.
> 
> Is this any better? It does have the stored pointer, but not a stored
> argument, assuming you don't count returns as arguments I suppose.

It's better in the sense that it's not the func/arg pair that really
bugs me, yes. :)

> The alternative is refactoring apply_to_page_range() :-/

Yeah, I'm looking now, I see what you mean.

> ---
> 
> struct vpr_data {
> 	bool (*fn)(pte_t pte, unsigned long addr);
> 	unsigned long addr;
> };
> 
> static int vpr_fn(pte_t *pte, unsigned long addr, void *data)
> {
> 	struct vpr_data *vpr = data;
> 	if (!vpr->fn(*pte, addr)) {
> 		vpr->addr = addr;
> 		return -EINVAL;
> 	}
> 	return 0;
> }

My point about passing "addr" was that nothing in the callback actually
needs it -- the top level can just as easily report the error. And that
the helper is always vpr_fn(), so it doesn't need to be passed either.

So the addr can just be encoded in "int", and no structure is needed at:

typedef bool (*vpr_fn_t)(pte_t pte);

static int vpr_fn(pte_t *pte, unsigned long addr, void *data)
{
	vpr_fn_t callback = data;

	if (!callback(*pte))
		return addr >> PAGE_SIZE;
	return 0;
}

unsigned long verify_page_range(struct mm_struct *mm,
				unsigned long addr, unsigned long size,
				vpr_fn_t callback)
{
	return apply_to_page_range(mm, addr, size, vpr_fn, callback) << PAGE_SIZE;
}

But maybe I'm missing something?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/7] i915: Convert to verify_page_range()
  2021-04-12  8:00 ` [PATCH 6/7] i915: Convert to verify_page_range() Peter Zijlstra
  2021-04-12  8:28   ` Christoph Hellwig
  2021-04-12 20:08   ` Kees Cook
@ 2021-04-14  3:04   ` Kees Cook
  2 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2021-04-14  3:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, hch

On Mon, Apr 12, 2021 at 10:00:18AM +0200, Peter Zijlstra wrote:
> check_{present,absent}() only need R/O access, use verify_page_range()
> instead to remove modular use of apply_to_page_range().
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c |   16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> @@ -1225,9 +1225,9 @@ static int igt_mmap_gpu(void *arg)
>  	return 0;
>  }
>  
> -static int check_present_pte(pte_t *pte, unsigned long addr, void *data)
> +static int check_present_pte(pte_t pte, unsigned long addr, void *data)
>  {
> -	if (!pte_present(*pte) || pte_none(*pte)) {
> +	if (!pte_present(pte) || pte_none(pte)) {
>  		pr_err("missing PTE:%lx\n",
>  		       (addr - (unsigned long)data) >> PAGE_SHIFT);
>  		return -EINVAL;
> @@ -1236,9 +1236,9 @@ static int check_present_pte(pte_t *pte,
>  	return 0;
>  }
>  
> -static int check_absent_pte(pte_t *pte, unsigned long addr, void *data)
> +static int check_absent_pte(pte_t pte, unsigned long addr, void *data)
>  {
> -	if (pte_present(*pte) && !pte_none(*pte)) {
> +	if (pte_present(pte) && !pte_none(pte)) {
>  		pr_err("present PTE:%lx; expected to be revoked\n",
>  		       (addr - (unsigned long)data) >> PAGE_SHIFT);
>  		return -EINVAL;
> @@ -1249,14 +1249,14 @@ static int check_absent_pte(pte_t *pte,
>  
>  static int check_present(unsigned long addr, unsigned long len)
>  {
> -	return apply_to_page_range(current->mm, addr, len,
> -				   check_present_pte, (void *)addr);
> +	return verify_page_range(current->mm, addr, len,
> +				 check_present_pte, (void *)addr);
>  }

And this would be:

static int check_present(unsigned long addr, unsigned long len)
	unsigned long fail;

	fail = verify_page_range(current->mm, addr, len, check_present_pte);
	if (fail) {
  		pr_err("missing PTE:%lx\n", addr);
		return -EINVAL;
	}
}

(Oh, and I think I messed up the page shifting macro name in the earlier
one...)


-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] mm: Introduce verify_page_range()
  2021-04-14  3:01       ` Kees Cook
@ 2021-04-14  7:00         ` Peter Zijlstra
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-04-14  7:00 UTC (permalink / raw)
  To: Kees Cook
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, hch

On Tue, Apr 13, 2021 at 08:01:08PM -0700, Kees Cook wrote:
> So the addr can just be encoded in "int", and no structure is needed at:
> 
> typedef bool (*vpr_fn_t)(pte_t pte);
> 
> static int vpr_fn(pte_t *pte, unsigned long addr, void *data)
> {
> 	vpr_fn_t callback = data;
> 
> 	if (!callback(*pte))
> 		return addr >> PAGE_SIZE;
> 	return 0;
> }
> 
> unsigned long verify_page_range(struct mm_struct *mm,
> 				unsigned long addr, unsigned long size,
> 				vpr_fn_t callback)
> {
> 	return apply_to_page_range(mm, addr, size, vpr_fn, callback) << PAGE_SIZE;
> }
> 
> But maybe I'm missing something?

That covers only (32+12) bits of address space and will mostly work, but
we definitely support architectures (very much including x86_64) with
larger address spaces than that.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] mm: Introduce verify_page_range()
  2021-04-13  6:54     ` Peter Zijlstra
@ 2021-04-19 23:36       ` Kees Cook
  0 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2021-04-19 23:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: akpm, linux-kernel, boris.ostrovsky, jgross, sstabellini, x86,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, chris, intel-gfx,
	linux-mm, hch

On Tue, Apr 13, 2021 at 08:54:06AM +0200, Peter Zijlstra wrote:
> On Mon, Apr 12, 2021 at 01:05:09PM -0700, Kees Cook wrote:
> > On Mon, Apr 12, 2021 at 10:00:16AM +0200, Peter Zijlstra wrote:
> > > +struct vpr_data {
> > > +	int (*fn)(pte_t pte, unsigned long addr, void *data);
> > > +	void *data;
> > > +};
> > 
> > Eeerg. This is likely to become an attack target itself. Stored function
> > pointer with stored (3rd) argument.
> 
> You got some further reading on that? How exactly are those exploited?

Sure, see "Executing code" in
https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-packet.html

I killed the entire primitive (for timer_list)
https://outflux.net/blog/archives/2018/02/05/security-things-in-linux-v4-15/#v4.15-timer_list
but that was a lot of work, so I'm trying to avoid seeing more things
like it appear. :) (And I'm trying to get rid of similar APIs, like
tasklet.)

This new code is unlikely to ever be used as widely as timer_list,
but I just cringe when I see the code pattern. I'll understand if there
isn't a solution that doesn't require major refactoring, but I can
dream. :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-04-19 23:36 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-12  8:00 [PATCH 0/7] mm: Unexport apply_to_page_range() Peter Zijlstra
2021-04-12  8:00 ` [PATCH 1/7] mm: Unexport apply_to_existing_page_range() Peter Zijlstra
2021-04-12  8:13   ` Christoph Hellwig
2021-04-12  8:00 ` [PATCH 2/7] xen/gntdev,x86: Remove apply_to_page_range() use from module Peter Zijlstra
2021-04-12  8:26   ` Christoph Hellwig
2021-04-12  9:20     ` Peter Zijlstra
2021-04-12  9:26     ` Juergen Gross
2021-04-12  8:00 ` [PATCH 3/7] xen/gntdev: " Peter Zijlstra
2021-04-12  8:27   ` Christoph Hellwig
2021-04-12  9:02     ` Peter Zijlstra
2021-04-12  8:00 ` [PATCH 4/7] mm: Introduce verify_page_range() Peter Zijlstra
2021-04-12  8:28   ` Christoph Hellwig
2021-04-12  9:17     ` Peter Zijlstra
2021-04-12 20:05   ` Kees Cook
2021-04-13  6:54     ` Peter Zijlstra
2021-04-19 23:36       ` Kees Cook
2021-04-13  7:36     ` Peter Zijlstra
2021-04-14  3:01       ` Kees Cook
2021-04-14  7:00         ` Peter Zijlstra
2021-04-12  8:00 ` [PATCH 5/7] xen/privcmd: Use verify_page_range() Peter Zijlstra
2021-04-12  8:28   ` Christoph Hellwig
2021-04-12  8:00 ` [PATCH 6/7] i915: Convert to verify_page_range() Peter Zijlstra
2021-04-12  8:28   ` Christoph Hellwig
2021-04-12 20:08   ` Kees Cook
2021-04-13  6:54     ` Peter Zijlstra
2021-04-14  3:04   ` Kees Cook
2021-04-12  8:00 ` [PATCH 7/7] mm: Unexport apply_to_page_range() Peter Zijlstra
2021-04-12  8:28   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).