linux-mm.kvack.org archive mirror
* [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
@ 2019-10-03 20:18 Davidlohr Bueso
  2019-10-03 20:18 ` [PATCH 01/11] mm: introduce vma_interval_tree_foreach_stab() Davidlohr Bueso
                   ` (12 more replies)
  0 siblings, 13 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm; +Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma, dave

Hi,

It has been discussed[1,2] that almost all users of interval trees would be better
served if the intervals were not fully closed [a, b], but instead half closed [a, b).
This series attempts to convert all callers by transitioning from
"interval_tree_generic.h" to "interval_tree_gen.h". Once all users are converted,
the former is removed.

Patch 1: adds a call that will make patch 8 easier to review by introducing stab
         queries for the vma interval tree.

Patch 2: adds the new interval_tree_gen.h which is the same as the old one but
         uses [a,b) intervals.

Patches 3-9: convert, in baby steps (as much as possible), each interval tree to
	   the new [a,b) one. It is done this way also to maintain bisectability.
	   Most conversions are pretty straightforward; however, some callers use
	   the interval 'end' in creative ways when iterating over intersecting
	   ranges within a tree, e.g. patches 3, 6 and 9.

Patch 10: deletes the interval_tree_generic.h header; there are no longer any users.

Patch 11: finally simplifies x86 pat tree to use the new interval tree machinery.

This has been lightly tested, and certainly not on driver paths that involve
non-trivial conversions. It also needs more eyeballs, as conversions can easily be
missed (even though I've tried to mitigate this by renaming the endpoint from 'last'
to 'end' in each corresponding structure).
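
For reference, the shape of a typical call-site conversion is roughly the
following (an illustrative sketch only, with made-up it_* names rather than
any particular tree's prefix):

	/* old: closed intervals [start, last] */
	node->start = addr;
	node->last  = addr + size - 1;
	for (n = it_iter_first(root, a, a + len - 1); n;
	     n = it_iter_next(n, a, a + len - 1))
		; /* n overlaps [a, a + len - 1] */

	/* new: half closed intervals [start, end) */
	node->start = addr;
	node->end   = addr + size;
	for (n = it_iter_first(root, a, a + len); n;
	     n = it_iter_next(n, a, a + len))
		; /* n overlaps [a, a + len) */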

Because this touches a lot of drivers, I'm Cc'ing the whole thing to a couple of
relevant lists (mm, dri, rdma); sorry if you consider this spam.

Applies on top of today's linux-next tree. Please consider for v5.5.

Thanks!

[1] https://lore.kernel.org/lkml/CANN689HVDJXKEwB80yPAVwvRwnV4HfiucQVAho=dupKM_iKozw@mail.gmail.com/
[2] https://lore.kernel.org/patchwork/patch/1114629/

Davidlohr Bueso (11):
  mm: introduce vma_interval_tree_foreach_stab()
  lib/interval-tree: add an equivalent tree with [a,b) intervals
  drm/amdgpu: convert amdgpu_vm_it to half closed intervals
  drm: convert drm_mm_interval_tree to half closed intervals
  IB/hfi1: convert __mmu_int_rb to half closed intervals
  IB,usnic: convert usnic_uiom_interval_tree to half closed intervals
  vhost: convert vhost_umem_interval_tree to half closed intervals
  mm: convert vma_interval_tree to half closed intervals
  lib/interval-tree: convert interval_tree to half closed intervals
  lib: drop interval_tree_generic.h
  x86/mm, pat: convert pat tree to generic interval tree

 arch/arm/mm/fault-armv.c                           |   2 +-
 arch/arm/mm/flush.c                                |   2 +-
 arch/nios2/mm/cacheflush.c                         |   2 +-
 arch/parisc/kernel/cache.c                         |   2 +-
 arch/x86/mm/pat.c                                  |  22 +--
 arch/x86/mm/pat_rbtree.c                           | 151 +++++----------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c             |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c             |  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h         |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h          |  18 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c            |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c            |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c             |  47 ++++---
 drivers/gpu/drm/drm_mm.c                           |   8 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c        |   5 +-
 .../gpu/drm/i915/gem/selftests/i915_gem_context.c  |   2 +-
 drivers/gpu/drm/radeon/radeon_mn.c                 |  11 +-
 drivers/gpu/drm/radeon/radeon_trace.h              |   2 +-
 drivers/gpu/drm/radeon/radeon_vm.c                 |  26 ++--
 drivers/gpu/drm/selftests/test-drm_mm.c            |   2 +-
 drivers/infiniband/core/umem_odp.c                 |  21 +--
 drivers/infiniband/hw/hfi1/mmu_rb.c                |  15 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c           |   8 +-
 .../infiniband/hw/usnic/usnic_uiom_interval_tree.c |  26 ++--
 .../infiniband/hw/usnic/usnic_uiom_interval_tree.h |   2 +-
 drivers/iommu/virtio-iommu.c                       |   6 +-
 drivers/vhost/vhost.c                              |  19 ++-
 drivers/vhost/vhost.h                              |   4 +-
 fs/dax.c                                           |   2 +-
 include/drm/drm_mm.h                               |   6 +-
 include/linux/interval_tree.h                      |   2 +-
 ...interval_tree_generic.h => interval_tree_gen.h} |  72 +++++-----
 include/linux/mm.h                                 |   6 +
 include/rdma/ib_umem_odp.h                         |   4 +-
 kernel/events/uprobes.c                            |   2 +-
 lib/interval_tree.c                                |   6 +-
 mm/hugetlb.c                                       |   4 +-
 mm/interval_tree.c                                 |   4 +-
 mm/khugepaged.c                                    |   2 +-
 mm/memory-failure.c                                |   6 +-
 mm/memory.c                                        |   2 +-
 mm/nommu.c                                         |   2 +-
 mm/rmap.c                                          |   6 +-
 43 files changed, 217 insertions(+), 333 deletions(-)
 rename include/linux/{interval_tree_generic.h => interval_tree_gen.h} (72%)

-- 
2.16.4



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 01/11] mm: introduce vma_interval_tree_foreach_stab()
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-03 20:18 ` [PATCH 02/11] lib/interval-tree: add an equivalent tree with [a,b) intervals Davidlohr Bueso
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Davidlohr Bueso

This better documents the nature of the stab lookup/query.

In addition, it is a step that will make the conversion of
interval tree nodes from [a,b] to [a,b) easier to review.

For symmetry with vma_interval_tree, the anon equivalent is
also introduced, albeit with only a single user. This patch
does not change any semantics.
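
For illustration (not part of this change), a typical stab query over a
file's i_mmap tree then reads:

	i_mmap_lock_read(mapping);
	vma_interval_tree_foreach_stab(vma, &mapping->i_mmap, pgoff) {
		/* 'vma' has a file range covering page offset 'pgoff' */
	}
	i_mmap_unlock_read(mapping);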

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 arch/arm/mm/fault-armv.c   | 2 +-
 arch/arm/mm/flush.c        | 2 +-
 arch/nios2/mm/cacheflush.c | 2 +-
 arch/parisc/kernel/cache.c | 2 +-
 fs/dax.c                   | 2 +-
 include/linux/mm.h         | 6 ++++++
 kernel/events/uprobes.c    | 2 +-
 mm/hugetlb.c               | 4 ++--
 mm/khugepaged.c            | 2 +-
 mm/memory-failure.c        | 6 ++----
 10 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index ae857f41f68d..85bb69f1decb 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -143,7 +143,7 @@ make_coherent(struct address_space *mapping, struct vm_area_struct *vma,
 	 * cache coherency.
 	 */
 	flush_dcache_mmap_lock(mapping);
-	vma_interval_tree_foreach(mpnt, &mapping->i_mmap, pgoff, pgoff) {
+	vma_interval_tree_foreach_stab(mpnt, &mapping->i_mmap, pgoff) {
 		/*
 		 * If this VMA is not in our MM, we can ignore it.
 		 * Note that we intentionally mask out the VMA
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 6d89db7895d1..b04384406c0f 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -249,7 +249,7 @@ static void __flush_dcache_aliases(struct address_space *mapping, struct page *p
 	pgoff = page->index;
 
 	flush_dcache_mmap_lock(mapping);
-	vma_interval_tree_foreach(mpnt, &mapping->i_mmap, pgoff, pgoff) {
+	vma_interval_tree_foreach_stab(mpnt, &mapping->i_mmap, pgoff) {
 		unsigned long offset;
 
 		/*
diff --git a/arch/nios2/mm/cacheflush.c b/arch/nios2/mm/cacheflush.c
index 65de1bd6a760..8abe26b8e29d 100644
--- a/arch/nios2/mm/cacheflush.c
+++ b/arch/nios2/mm/cacheflush.c
@@ -79,7 +79,7 @@ static void flush_aliases(struct address_space *mapping, struct page *page)
 	pgoff = page->index;
 
 	flush_dcache_mmap_lock(mapping);
-	vma_interval_tree_foreach(mpnt, &mapping->i_mmap, pgoff, pgoff) {
+	vma_interval_tree_foreach_stab(mpnt, &mapping->i_mmap, pgoff) {
 		unsigned long offset;
 
 		if (mpnt->vm_mm != mm)
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index a82b3eaa5398..c5f8aab749c1 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -348,7 +348,7 @@ void flush_dcache_page(struct page *page)
 	 * to flush one address here for them all to become coherent */
 
 	flush_dcache_mmap_lock(mapping);
-	vma_interval_tree_foreach(mpnt, &mapping->i_mmap, pgoff, pgoff) {
+	vma_interval_tree_foreach_stab(mpnt, &mapping->i_mmap, pgoff) {
 		offset = (pgoff - mpnt->vm_pgoff) << PAGE_SHIFT;
 		addr = mpnt->vm_start + offset;
 
diff --git a/fs/dax.c b/fs/dax.c
index 6bf81f931de3..f24c035defb7 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -781,7 +781,7 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
 	spinlock_t *ptl;
 
 	i_mmap_lock_read(mapping);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) {
+	vma_interval_tree_foreach_stab(vma, &mapping->i_mmap, index) {
 		struct mmu_notifier_range range;
 		unsigned long address;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index cc292273e6ba..53f9784d917d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2248,6 +2248,9 @@ struct vm_area_struct *vma_interval_tree_iter_next(struct vm_area_struct *node,
 	for (vma = vma_interval_tree_iter_first(root, start, last);	\
 	     vma; vma = vma_interval_tree_iter_next(vma, start, last))
 
+#define vma_interval_tree_foreach_stab(vma, root, start)		\
+	vma_interval_tree_foreach(vma, root, start, start)
+
 void anon_vma_interval_tree_insert(struct anon_vma_chain *node,
 				   struct rb_root_cached *root);
 void anon_vma_interval_tree_remove(struct anon_vma_chain *node,
@@ -2265,6 +2268,9 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
 	for (avc = anon_vma_interval_tree_iter_first(root, start, last); \
 	     avc; avc = anon_vma_interval_tree_iter_next(avc, start, last))
 
+#define anon_vma_interval_tree_foreach_stab(vma, root, start)		\
+	anon_vma_interval_tree_foreach(vma, root, start, start)
+
 /* mmap.c */
 extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin);
 extern int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 94d38a39d72e..6a70dbe0b4b2 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -975,7 +975,7 @@ build_map_info(struct address_space *mapping, loff_t offset, bool is_register)
 
  again:
 	i_mmap_lock_read(mapping);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
+	vma_interval_tree_foreach_stab(vma, &mapping->i_mmap, pgoff) {
 		if (!valid_vma(vma, is_register))
 			continue;
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ef37c85423a5..1a6b133ca66d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3643,7 +3643,7 @@ static void unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * __unmap_hugepage_range() is called as the lock is already held
 	 */
 	i_mmap_lock_write(mapping);
-	vma_interval_tree_foreach(iter_vma, &mapping->i_mmap, pgoff, pgoff) {
+	vma_interval_tree_foreach_stab(iter_vma, &mapping->i_mmap, pgoff) {
 		/* Do not unmap the current VMA */
 		if (iter_vma == vma)
 			continue;
@@ -4844,7 +4844,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
 		return (pte_t *)pmd_alloc(mm, pud, addr);
 
 	i_mmap_lock_write(mapping);
-	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
+	vma_interval_tree_foreach_stab(svma, &mapping->i_mmap, idx) {
 		if (svma == vma)
 			continue;
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0a1b4b484ac5..24b66416fde0 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1420,7 +1420,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 	pmd_t *pmd, _pmd;
 
 	i_mmap_lock_write(mapping);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
+	vma_interval_tree_foreach_stab(vma, &mapping->i_mmap, pgoff) {
 		/*
 		 * Check vma->anon_vma to exclude MAP_PRIVATE mappings that
 		 * got written to. These VMAs are likely not worth investing
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 7ef849da8278..79934fd00146 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -451,8 +451,7 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
 
 		if (!t)
 			continue;
-		anon_vma_interval_tree_foreach(vmac, &av->rb_root,
-					       pgoff, pgoff) {
+		anon_vma_interval_tree_foreach_stab(vmac, &av->rb_root, pgoff) {
 			vma = vmac->vma;
 			if (!page_mapped_in_vma(page, vma))
 				continue;
@@ -482,8 +481,7 @@ static void collect_procs_file(struct page *page, struct list_head *to_kill,
 
 		if (!t)
 			continue;
-		vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff,
-				      pgoff) {
+		vma_interval_tree_foreach_stab(vma, &mapping->i_mmap, pgoff) {
 			/*
 			 * Send early kill signal to tasks where a vma covers
 			 * the page but the corrupted page is not necessarily
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 02/11] lib/interval-tree: add an equivalent tree with [a,b) intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
  2019-10-03 20:18 ` [PATCH 01/11] mm: introduce vma_interval_tree_foreach_stab() Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-04 11:02   ` Michel Lespinasse
  2019-10-03 20:18 ` [PATCH 03/11] drm/amdgpu: convert amdgpu_vm_it to half closed intervals Davidlohr Bueso
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Davidlohr Bueso

While the kernel's interval tree implementation uses closed intervals, all
users (with the exception of stab queries) really want half closed intervals,
specifically [a, b), and explicitly correct for this before calling the tree API.

This patch simply duplicates the include/linux/interval_tree_generic.h header into
an interval_tree_gen.h (gen as in 'generic' and 'generate' :-)) with two differences:

- The 'last' endpoint is renamed 'end'; lib/interval_tree.c will also need
updating for this, but that comes later in the series.

- The lookup calls have been adapted accordingly, so that users no longer need
to do the subtract-one fixup.

No users are ported over in this patch.
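
For completeness, a user of the new header instantiates a tree roughly as
below (a sketch with a made-up 'struct foo_node' that is assumed to carry
'struct rb_node rb;' and 'unsigned long start, end, __subtree_last;'; the
driver conversions later in the series follow this same pattern):

	#include <linux/interval_tree_gen.h>

	#define START(node) ((node)->start)
	#define END(node)   ((node)->end)	/* exclusive endpoint */

	INTERVAL_TREE_DEFINE(struct foo_node, rb, unsigned long,
			     __subtree_last, START, END, static, foo_it)

	/*
	 * A node then overlaps [start, end) iff:
	 *   START(node) < end  &&  start < END(node)
	 */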

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/interval_tree_gen.h | 179 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 179 insertions(+)
 create mode 100644 include/linux/interval_tree_gen.h

diff --git a/include/linux/interval_tree_gen.h b/include/linux/interval_tree_gen.h
new file mode 100644
index 000000000000..5d446c0bd89a
--- /dev/null
+++ b/include/linux/interval_tree_gen.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+  Interval Trees
+  (C) 2012  Michel Lespinasse <walken@google.com>
+*/
+
+#include <linux/rbtree_augmented.h>
+
+/*
+ * Template for implementing interval trees
+ *
+ * ITSTRUCT:   struct type of the interval tree nodes
+ * ITRB:       name of struct rb_node field within ITSTRUCT
+ * ITTYPE:     type of the interval endpoints
+ * ITSUBTREE:  name of ITTYPE field within ITSTRUCT holding end-in-subtree
+ * ITSTART(n): start endpoint of ITSTRUCT node n
+ * ITEND(n):   end/last endpoint of ITSTRUCT node n
+ * ITSTATIC:   'static' or empty
+ * ITPREFIX:   prefix to use for the inline tree definitions
+ *
+ * Note - before using this, please consider if generic version
+ * (interval_tree.h) would work for you...
+ */
+
+#define INTERVAL_TREE_DEFINE(ITSTRUCT, ITRB, ITTYPE, ITSUBTREE,		      \
+			     ITSTART, ITEND, ITSTATIC, ITPREFIX)	      \
+									      \
+/* Callbacks for augmented rbtree insert and remove */			      \
+									      \
+RB_DECLARE_CALLBACKS_MAX(static, ITPREFIX ## _augment,			      \
+			 ITSTRUCT, ITRB, ITTYPE, ITSUBTREE, ITEND)	      \
+									      \
+/* Insert / remove interval nodes from the tree */			      \
+									      \
+ITSTATIC void ITPREFIX ## _insert(ITSTRUCT *node,			      \
+				  struct rb_root_cached *root)		      \
+{									      \
+	struct rb_node **link = &root->rb_root.rb_node, *rb_parent = NULL;    \
+	ITTYPE start = ITSTART(node), end = ITEND(node);		      \
+	ITSTRUCT *parent;						      \
+	bool leftmost = true;						      \
+									      \
+	while (*link) {							      \
+		rb_parent = *link;					      \
+		parent = rb_entry(rb_parent, ITSTRUCT, ITRB);		      \
+		if (parent->ITSUBTREE < end)				      \
+			parent->ITSUBTREE = end;			      \
+		if (start < ITSTART(parent))				      \
+			link = &parent->ITRB.rb_left;			      \
+		else {							      \
+			link = &parent->ITRB.rb_right;			      \
+			leftmost = false;				      \
+		}							      \
+	}								      \
+									      \
+	node->ITSUBTREE = end;						      \
+	rb_link_node(&node->ITRB, rb_parent, link);			      \
+	rb_insert_augmented_cached(&node->ITRB, root,			      \
+				   leftmost, &ITPREFIX ## _augment);	      \
+}									      \
+									      \
+ITSTATIC void ITPREFIX ## _remove(ITSTRUCT *node,			      \
+				  struct rb_root_cached *root)		      \
+{									      \
+	rb_erase_augmented_cached(&node->ITRB, root, &ITPREFIX ## _augment);  \
+}									      \
+									      \
+/*									      \
+ * Iterate over intervals intersecting [start;end)			      \
+ *									      \
+ * Note that a node's interval intersects [start;end) iff:		      \
+ *   Cond1: ITSTART(node) < end						      \
+ * and									      \
+ *   Cond2: start < ITEND(node)						      \
+ */									      \
+									      \
+static ITSTRUCT *							      \
+ITPREFIX ## _subtree_search(ITSTRUCT *node, ITTYPE start, ITTYPE end)	      \
+{									      \
+	while (true) {							      \
+		/*							      \
+		 * Loop invariant: start <= node->ITSUBTREE		      \
+		 * (Cond2 is satisfied by one of the subtree nodes)	      \
+		 */							      \
+		if (node->ITRB.rb_left) {				      \
+			ITSTRUCT *left = rb_entry(node->ITRB.rb_left,	      \
+						  ITSTRUCT, ITRB);	      \
+			if (start < left->ITSUBTREE) {			      \
+				/*					      \
+				 * Some nodes in left subtree satisfy Cond2.  \
+				 * Iterate to find the leftmost such node N.  \
+				 * If it also satisfies Cond1, that's the     \
+				 * match we are looking for. Otherwise, there \
+				 * is no matching interval as nodes to the    \
+				 * right of N can't satisfy Cond1 either.     \
+				 */					      \
+				node = left;				      \
+				continue;				      \
+			}						      \
+		}							      \
+		if (ITSTART(node) < end) {		/* Cond1 */	      \
+			if (start < ITEND(node))	/* Cond2 */	      \
+				return node;	/* node is leftmost match */  \
+			if (node->ITRB.rb_right) {			      \
+				node = rb_entry(node->ITRB.rb_right,	      \
+						ITSTRUCT, ITRB);	      \
+				if (start <= node->ITSUBTREE)		      \
+					continue;			      \
+			}						      \
+		}							      \
+		return NULL;	/* No match */				      \
+	}								      \
+}									      \
+									      \
+ITSTATIC ITSTRUCT *							      \
+ITPREFIX ## _iter_first(struct rb_root_cached *root,			      \
+			ITTYPE start, ITTYPE end)			      \
+{									      \
+	ITSTRUCT *node, *leftmost;					      \
+									      \
+	if (!root->rb_root.rb_node)					      \
+		return NULL;						      \
+									      \
+	/*								      \
+	 * Fastpath range intersection/overlap where we compare the	      \
+	 * smallest 'start' and largest 'end' in the tree. For the latter,    \
+	 * we rely on the root node, which by augmented interval tree	      \
+	 * property, holds the largest value in its end-in-subtree.	      \
+	 * This allows mitigating some of the tree walk overhead for	      \
+	 * non-intersecting ranges, maintained and consulted in O(1).	      \
+	 */								      \
+	node = rb_entry(root->rb_root.rb_node, ITSTRUCT, ITRB);		      \
+	if (node->ITSUBTREE <= start)	/* !Cond2 */			      \
+		return NULL;						      \
+									      \
+	leftmost = rb_entry(root->rb_leftmost, ITSTRUCT, ITRB);		      \
+	if (ITSTART(leftmost) >= end)	/* !Cond1 */			      \
+		return NULL;						      \
+									      \
+	return ITPREFIX ## _subtree_search(node, start, end);		      \
+}									      \
+									      \
+ITSTATIC ITSTRUCT *							      \
+ITPREFIX ## _iter_next(ITSTRUCT *node, ITTYPE start, ITTYPE end)	      \
+{									      \
+	struct rb_node *rb = node->ITRB.rb_right, *prev;		      \
+									      \
+	while (true) {							      \
+		/*							      \
+		 * Loop invariants:					      \
+		 *   Cond1: ITSTART(node) < end				      \
+		 *   rb == node->ITRB.rb_right				      \
+		 *							      \
+		 * First, search right subtree if suitable		      \
+		 */							      \
+		if (rb) {						      \
+			ITSTRUCT *right = rb_entry(rb, ITSTRUCT, ITRB);	      \
+			if (start < right->ITSUBTREE)			      \
+				return ITPREFIX ## _subtree_search(right,     \
+								start, end); \
+		}							      \
+									      \
+		/* Move up the tree until we come from a node's left child */ \
+		do {							      \
+			rb = rb_parent(&node->ITRB);			      \
+			if (!rb)					      \
+				return NULL;				      \
+			prev = &node->ITRB;				      \
+			node = rb_entry(rb, ITSTRUCT, ITRB);		      \
+			rb = node->ITRB.rb_right;			      \
+		} while (prev == rb);					      \
+									      \
+		/* Check if the node intersects [start;end) */		      \
+		if (end <= ITSTART(node))		/* !Cond1 */	      \
+			return NULL;					      \
+		else if (start < ITEND(node))		/* Cond2 */	      \
+			return node;					      \
+	}								      \
+}
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 03/11] drm/amdgpu: convert amdgpu_vm_it to half closed intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
  2019-10-03 20:18 ` [PATCH 01/11] mm: introduce vma_interval_tree_foreach_stab() Davidlohr Bueso
  2019-10-03 20:18 ` [PATCH 02/11] lib/interval-tree: add an equivalent tree with [a,b) intervals Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-04  6:54   ` Koenig, Christian
  2019-10-03 20:18 ` [PATCH 04/11] drm: convert drm_mm_interval_tree " Davidlohr Bueso
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Jerome Glisse, Alex Deucher, Christian König,
	Daniel Vetter, amd-gfx, Davidlohr Bueso

The amdgpu_vm interval tree really wants [a, b) intervals,
not fully closed ones. As such convert it to use the new
interval_tree_gen.h, and also rename the 'last' endpoint
in the node to 'end', which is both a more suitable name
for the half closed interval and also reduces the chances
of missing a conversion when doing insertion or lookup.
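
The endpoint arithmetic changes as follows (condensed from the diff below):

	/* before: closed [start, last] */
	eaddr = saddr + size - 1;
	pfn  += (last - start + 1) / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
	start = last + 1;

	/* after: half closed [start, end) */
	eaddr = saddr + size;
	pfn  += (end - start) / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
	start = end;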

Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h  | 18 ++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c    |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c    |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 46 +++++++++++++++---------------
 6 files changed, 36 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 49b767b7238f..290bfe820890 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -756,7 +756,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 			}
 
 			if ((va_start + chunk_ib->ib_bytes) >
-			    (m->last + 1) * AMDGPU_GPU_PAGE_SIZE) {
+			    m->end * AMDGPU_GPU_PAGE_SIZE) {
 				DRM_ERROR("IB va_start+ib_bytes is invalid\n");
 				return -EINVAL;
 			}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 7e99f6c58c48..60b73bc4d11a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -51,7 +51,7 @@ struct amdgpu_bo_va_mapping {
 	struct list_head		list;
 	struct rb_node			rb;
 	uint64_t			start;
-	uint64_t			last;
+	uint64_t			end;
 	uint64_t			__subtree_last;
 	uint64_t			offset;
 	uint64_t			flags;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 8227ebd0f511..c5b0e88d019c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -247,7 +247,7 @@ TRACE_EVENT(amdgpu_vm_bo_map,
 	    TP_STRUCT__entry(
 			     __field(struct amdgpu_bo *, bo)
 			     __field(long, start)
-			     __field(long, last)
+			     __field(long, end)
 			     __field(u64, offset)
 			     __field(u64, flags)
 			     ),
@@ -255,12 +255,12 @@ TRACE_EVENT(amdgpu_vm_bo_map,
 	    TP_fast_assign(
 			   __entry->bo = bo_va ? bo_va->base.bo : NULL;
 			   __entry->start = mapping->start;
-			   __entry->last = mapping->last;
+			   __entry->end = mapping->end;
 			   __entry->offset = mapping->offset;
 			   __entry->flags = mapping->flags;
 			   ),
-	    TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
-		      __entry->bo, __entry->start, __entry->last,
+	    TP_printk("bo=%p, start=%lx, end=%lx, offset=%010llx, flags=%llx",
+		      __entry->bo, __entry->start, __entry->end,
 		      __entry->offset, __entry->flags)
 );
 
@@ -271,7 +271,7 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
 	    TP_STRUCT__entry(
 			     __field(struct amdgpu_bo *, bo)
 			     __field(long, start)
-			     __field(long, last)
+			     __field(long, end)
 			     __field(u64, offset)
 			     __field(u64, flags)
 			     ),
@@ -279,12 +279,12 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
 	    TP_fast_assign(
 			   __entry->bo = bo_va ? bo_va->base.bo : NULL;
 			   __entry->start = mapping->start;
-			   __entry->last = mapping->last;
+			   __entry->end = mapping->end;
 			   __entry->offset = mapping->offset;
 			   __entry->flags = mapping->flags;
 			   ),
-	    TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
-		      __entry->bo, __entry->start, __entry->last,
+	    TP_printk("bo=%p, start=%lx, end=%lx, offset=%010llx, flags=%llx",
+		      __entry->bo, __entry->start, __entry->end,
 		      __entry->offset, __entry->flags)
 );
 
@@ -299,7 +299,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
 
 	    TP_fast_assign(
 			   __entry->soffset = mapping->start;
-			   __entry->eoffset = mapping->last + 1;
+			   __entry->eoffset = mapping->end;
 			   __entry->flags = mapping->flags;
 			   ),
 	    TP_printk("soffs=%010llx, eoffs=%010llx, flags=%llx",
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index b2c364b8695f..8094dd0b0332 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -819,7 +819,7 @@ static int amdgpu_uvd_cs_pass2(struct amdgpu_uvd_cs_ctx *ctx)
 
 	start = amdgpu_bo_gpu_offset(bo);
 
-	end = (mapping->last + 1 - mapping->start);
+	end = mapping->end - mapping->start;
 	end = end * AMDGPU_GPU_PAGE_SIZE + start;
 
 	addr -= mapping->start * AMDGPU_GPU_PAGE_SIZE;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index b70b3c45bb29..d6511bf446df 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -642,8 +642,7 @@ static int amdgpu_vce_cs_reloc(struct amdgpu_cs_parser *p, uint32_t ib_idx,
 		return r;
 	}
 
-	if ((addr + (uint64_t)size) >
-	    (mapping->last + 1) * AMDGPU_GPU_PAGE_SIZE) {
+	if ((addr + (uint64_t)size) > mapping->end * AMDGPU_GPU_PAGE_SIZE) {
 		DRM_ERROR("BO to small for addr 0x%010Lx %d %d\n",
 			  addr, lo, hi);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a2c797e34a29..2f618017617e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -26,7 +26,7 @@
  *          Jerome Glisse
  */
 #include <linux/dma-fence-array.h>
-#include <linux/interval_tree_generic.h>
+#include <linux/interval_tree_gen.h>
 #include <linux/idr.h>
 
 #include <drm/amdgpu_drm.h>
@@ -58,13 +58,13 @@
  */
 
 #define START(node) ((node)->start)
-#define LAST(node) ((node)->last)
+#define END(node) ((node)->end)
 
 INTERVAL_TREE_DEFINE(struct amdgpu_bo_va_mapping, rb, uint64_t, __subtree_last,
-		     START, LAST, static, amdgpu_vm_it)
+		     START, END, static, amdgpu_vm_it)
 
 #undef START
-#undef LAST
+#undef END
 
 /**
  * struct amdgpu_prt_cb - Helper to disable partial resident texture feature from a fence callback
@@ -1616,7 +1616,7 @@ static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev,
 	do {
 		dma_addr_t *dma_addr = NULL;
 		uint64_t max_entries;
-		uint64_t addr, last;
+		uint64_t addr, end;
 
 		if (nodes) {
 			addr = nodes->start << PAGE_SHIFT;
@@ -1654,21 +1654,21 @@ static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev,
 			addr += pfn << PAGE_SHIFT;
 		}
 
-		last = min((uint64_t)mapping->last, start + max_entries - 1);
+		end = min((uint64_t)mapping->end, start + max_entries);
 		r = amdgpu_vm_bo_update_mapping(adev, vm, false, exclusive,
-						start, last, flags, addr,
+						start, end, flags, addr,
 						dma_addr, fence);
 		if (r)
 			return r;
 
-		pfn += (last - start + 1) / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
+		pfn += (end - start) / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
 		if (nodes && nodes->size == pfn) {
 			pfn = 0;
 			++nodes;
 		}
-		start = last + 1;
+		start = end;
 
-	} while (unlikely(start != mapping->last + 1));
+	} while (unlikely(start != mapping->end));
 
 	return 0;
 }
@@ -1946,7 +1946,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 			init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
 
 		r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
-						mapping->start, mapping->last,
+						mapping->start, mapping->end,
 						init_pte_value, 0, NULL, &f);
 		amdgpu_vm_free_mapping(adev, vm, mapping, f);
 		if (r) {
@@ -2129,7 +2129,7 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
 		return -EINVAL;
 
 	/* make sure object fit at this offset */
-	eaddr = saddr + size - 1;
+	eaddr = saddr + size;
 	if (saddr >= eaddr ||
 	    (bo && offset + size > amdgpu_bo_size(bo)))
 		return -EINVAL;
@@ -2142,7 +2142,7 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
 		/* bo and tmp overlap, invalid addr */
 		dev_err(adev->dev, "bo %p va 0x%010Lx-0x%010Lx conflict with "
 			"0x%010Lx-0x%010Lx\n", bo, saddr, eaddr,
-			tmp->start, tmp->last + 1);
+			tmp->start, tmp->end);
 		return -EINVAL;
 	}
 
@@ -2151,7 +2151,7 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
 		return -ENOMEM;
 
 	mapping->start = saddr;
-	mapping->last = eaddr;
+	mapping->end = eaddr;
 	mapping->offset = offset;
 	mapping->flags = flags;
 
@@ -2194,7 +2194,7 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,
 		return -EINVAL;
 
 	/* make sure object fit at this offset */
-	eaddr = saddr + size - 1;
+	eaddr = saddr + size;
 	if (saddr >= eaddr ||
 	    (bo && offset + size > amdgpu_bo_size(bo)))
 		return -EINVAL;
@@ -2214,7 +2214,7 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,
 	eaddr /= AMDGPU_GPU_PAGE_SIZE;
 
 	mapping->start = saddr;
-	mapping->last = eaddr;
+	mapping->end = eaddr;
 	mapping->offset = offset;
 	mapping->flags = flags;
 
@@ -2299,7 +2299,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
 	LIST_HEAD(removed);
 	uint64_t eaddr;
 
-	eaddr = saddr + size - 1;
+	eaddr = saddr + size;
 	saddr /= AMDGPU_GPU_PAGE_SIZE;
 	eaddr /= AMDGPU_GPU_PAGE_SIZE;
 
@@ -2322,7 +2322,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
 		/* Remember mapping split at the start */
 		if (tmp->start < saddr) {
 			before->start = tmp->start;
-			before->last = saddr - 1;
+			before->end = saddr;
 			before->offset = tmp->offset;
 			before->flags = tmp->flags;
 			before->bo_va = tmp->bo_va;
@@ -2330,9 +2330,9 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
 		}
 
 		/* Remember mapping split at the end */
-		if (tmp->last > eaddr) {
-			after->start = eaddr + 1;
-			after->last = tmp->last;
+		if (tmp->end > eaddr) {
+			after->start = eaddr;
+			after->end = tmp->end;
 			after->offset = tmp->offset;
 			after->offset += after->start - tmp->start;
 			after->flags = tmp->flags;
@@ -2353,8 +2353,8 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
 
 		if (tmp->start < saddr)
 		    tmp->start = saddr;
-		if (tmp->last > eaddr)
-		    tmp->last = eaddr;
+		if (tmp->end > eaddr)
+		    tmp->end = eaddr;
 
 		tmp->bo_va = NULL;
 		list_add(&tmp->list, &vm->freed);
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 04/11] drm: convert drm_mm_interval_tree to half closed intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (2 preceding siblings ...)
  2019-10-03 20:18 ` [PATCH 03/11] drm/amdgpu: convert amdgpu_vm_it to half closed intervals Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-03 20:18 ` [PATCH 05/11] IB/hfi1: convert __mmu_int_rb " Davidlohr Bueso
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Maarten Lankhorst, Maxime Ripard, Daniel Vetter, intel-gfx,
	Davidlohr Bueso

The drm_mm interval tree really wants [a, b) intervals, not
fully closed as it is now. As such convert it to use the new
interval_tree_gen.h.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 drivers/gpu/drm/drm_mm.c                              | 8 ++++----
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 2 +-
 drivers/gpu/drm/selftests/test-drm_mm.c               | 2 +-
 include/drm/drm_mm.h                                  | 6 +++---
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 4581c5387372..17feb00e7d80 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -43,7 +43,7 @@
  */
 
 #include <linux/export.h>
-#include <linux/interval_tree_generic.h>
+#include <linux/interval_tree_gen.h>
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/stacktrace.h>
@@ -150,11 +150,11 @@ static void show_leaks(struct drm_mm *mm) { }
 #endif
 
 #define START(node) ((node)->start)
-#define LAST(node)  ((node)->start + (node)->size - 1)
+#define END(node)   ((node)->start + (node)->size)
 
 INTERVAL_TREE_DEFINE(struct drm_mm_node, rb,
 		     u64, __subtree_last,
-		     START, LAST, static inline, drm_mm_interval_tree)
+		     START, END, static inline, drm_mm_interval_tree)
 
 struct drm_mm_node *
 __drm_mm_interval_first(const struct drm_mm *mm, u64 start, u64 last)
@@ -172,7 +172,7 @@ static void drm_mm_interval_tree_add_node(struct drm_mm_node *hole_node,
 	struct drm_mm_node *parent;
 	bool leftmost;
 
-	node->__subtree_last = LAST(node);
+	node->__subtree_last = END(node);
 
 	if (hole_node->allocated) {
 		rb = &hole_node->rb;
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 3e6f4a65d356..af40d3bfa065 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1150,7 +1150,7 @@ static int check_scratch(struct i915_gem_context *ctx, u64 offset)
 {
 	struct drm_mm_node *node =
 		__drm_mm_interval_first(&ctx->vm->mm,
-					offset, offset + sizeof(u32) - 1);
+					offset, offset + sizeof(u32));
 	if (!node || node->start > offset)
 		return 0;
 
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c b/drivers/gpu/drm/selftests/test-drm_mm.c
index 388f9844f4ba..f34f975c1570 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -853,7 +853,7 @@ static bool assert_contiguous_in_range(struct drm_mm *mm,
 	}
 
 	if (start > 0) {
-		node = __drm_mm_interval_first(mm, 0, start - 1);
+		node = __drm_mm_interval_first(mm, 0, start);
 		if (node->allocated) {
 			pr_err("node before start: node=%llx+%llu, start=%llx\n",
 			       node->start, node->size, start);
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 2c3bbb43c7d1..0eda6180e1ef 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -485,19 +485,19 @@ __drm_mm_interval_first(const struct drm_mm *mm, u64 start, u64 last);
  * @node__: drm_mm_node structure to assign to in each iteration step
  * @mm__: drm_mm allocator to walk
  * @start__: starting offset, the first node will overlap this
- * @end__: ending offset, the last node will start before this (but may overlap)
+ * @end__: ending offset, the last node will start before this
  *
  * This iterator walks over all nodes in the range allocator that lie
  * between @start and @end. It is implemented similarly to list_for_each(),
  * but using the internal interval tree to accelerate the search for the
  * starting node, and so not safe against removal of elements. It assumes
  * that @end is within (or is the upper limit of) the drm_mm allocator.
- * If [@start, @end] are beyond the range of the drm_mm, the iterator may walk
+ * If [@start, @end) are beyond the range of the drm_mm, the iterator may walk
  * over the special _unallocated_ &drm_mm.head_node, and may even continue
  * indefinitely.
  */
 #define drm_mm_for_each_node_in_range(node__, mm__, start__, end__)	\
-	for (node__ = __drm_mm_interval_first((mm__), (start__), (end__)-1); \
+	for (node__ = __drm_mm_interval_first((mm__), (start__), (end__)); \
 	     node__->start < (end__);					\
 	     node__ = list_next_entry(node__, node_list))
 
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 05/11] IB/hfi1: convert __mmu_int_rb to half closed intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (3 preceding siblings ...)
  2019-10-03 20:18 ` [PATCH 04/11] drm: convert drm_mm_interval_tree " Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-04 11:50   ` Michel Lespinasse
  2019-10-03 20:18 ` [PATCH 06/11] IB,usnic: convert usnic_uiom_interval_tree " Davidlohr Bueso
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Davidlohr Bueso

The __mmu_int_rb interval tree really wants [a, b) intervals,
not fully closed as it is currently. As such, convert it to use
the new interval_tree_gen.h.

Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 drivers/infiniband/hw/hfi1/mmu_rb.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index 14d2a90964c3..fb6382b2d44e 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -47,7 +47,7 @@
 #include <linux/list.h>
 #include <linux/rculist.h>
 #include <linux/mmu_notifier.h>
-#include <linux/interval_tree_generic.h>
+#include <linux/interval_tree_gen.h>
 
 #include "mmu_rb.h"
 #include "trace.h"
@@ -89,7 +89,7 @@ static unsigned long mmu_node_start(struct mmu_rb_node *node)
 
 static unsigned long mmu_node_last(struct mmu_rb_node *node)
 {
-	return PAGE_ALIGN(node->addr + node->len) - 1;
+	return PAGE_ALIGN(node->addr + node->len);
 }
 
 int hfi1_mmu_rb_register(void *ops_arg, struct mm_struct *mm,
@@ -195,13 +195,13 @@ static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *handler,
 	trace_hfi1_mmu_rb_search(addr, len);
 	if (!handler->ops->filter) {
 		node = __mmu_int_rb_iter_first(&handler->root, addr,
-					       (addr + len) - 1);
+					       addr + len);
 	} else {
 		for (node = __mmu_int_rb_iter_first(&handler->root, addr,
-						    (addr + len) - 1);
+						    addr + len);
 		     node;
 		     node = __mmu_int_rb_iter_next(node, addr,
-						   (addr + len) - 1)) {
+						   addr + len)) {
 			if (handler->ops->filter(node, addr, len))
 				return node;
 		}
@@ -293,11 +293,10 @@ static int mmu_notifier_range_start(struct mmu_notifier *mn,
 	bool added = false;
 
 	spin_lock_irqsave(&handler->lock, flags);
-	for (node = __mmu_int_rb_iter_first(root, range->start, range->end-1);
+	for (node = __mmu_int_rb_iter_first(root, range->start, range->end);
 	     node; node = ptr) {
 		/* Guard against node removal. */
-		ptr = __mmu_int_rb_iter_next(node, range->start,
-					     range->end - 1);
+		ptr = __mmu_int_rb_iter_next(node, range->start, range->end);
 		trace_hfi1_mmu_mem_invalidate(node->addr, node->len);
 		if (handler->ops->invalidate(handler->ops_arg, node)) {
 			__mmu_int_rb_remove(node, root);
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 06/11] IB,usnic: convert usnic_uiom_interval_tree to half closed intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (4 preceding siblings ...)
  2019-10-03 20:18 ` [PATCH 05/11] IB/hfi1: convert __mmu_int_rb " Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-03 20:18 ` [PATCH 07/11] vhost: convert vhost_umem_interval_tree " Davidlohr Bueso
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Christian Benvenuti, Nelson Escobar, Doug Ledford,
	Jason Gunthorpe, Davidlohr Bueso

The usnic_uiom interval tree really wants [a, b) intervals,
not fully closed. As such, convert it to use the new
interval_tree_gen.h, and also rename the 'last' endpoint
in the node to 'end', which is both a more suitable name
for the half closed interval and also reduces the chances
of some caller being missed.

The conversion can get non-trivial when calling into things like:

 - find_intervals_intersection_sorted()
 - usnic_uiom_insert_interval()

which have been converted to no longer subtract one from
interval->end, hoping that the semantics of the ad-hoc usage
remain untouched.
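
Concretely, the pivot handling changes along these lines (condensed from
the usnic_uiom_get_intervals_diff() hunk below):

	/* before: closed intervals, advance past 'last' */
	if (pivot > interval->last)
		continue;
	else if (pivot <= interval->last &&
		 FLAGS_EQUAL(interval->flags, flags, flag_mask))
		pivot = interval->last + 1;

	/* after: half closed intervals, advance to 'end' directly */
	if (pivot > interval->end - 1)
		continue;
	else if (FLAGS_EQUAL(interval->flags, flags, flag_mask))
		pivot = interval->end;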

Cc: Christian Benvenuti <benve@cisco.com>
Cc: Nelson Escobar <neescoba@cisco.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 drivers/infiniband/hw/usnic/usnic_uiom.c           |  8 +++----
 .../infiniband/hw/usnic/usnic_uiom_interval_tree.c | 26 ++++++++++------------
 .../infiniband/hw/usnic/usnic_uiom_interval_tree.h |  2 +-
 3 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 62e6ffa9ad78..14f607c398a8 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -200,7 +200,7 @@ static void usnic_uiom_unmap_sorted_intervals(struct list_head *intervals,
 
 	list_for_each_entry_safe(interval, tmp, intervals, link) {
 		va = interval->start << PAGE_SHIFT;
-		size = ((interval->last - interval->start) + 1) << PAGE_SHIFT;
+		size = (interval->end - interval->start) << PAGE_SHIFT;
 		while (size > 0) {
 			/* Workaround for RH 970401 */
 			usnic_dbg("va 0x%lx size 0x%lx", va, PAGE_SIZE);
@@ -223,7 +223,7 @@ static void __usnic_uiom_reg_release(struct usnic_uiom_pd *pd,
 
 	npages = PAGE_ALIGN(uiomr->length + uiomr->offset) >> PAGE_SHIFT;
 	vpn_start = (uiomr->va & PAGE_MASK) >> PAGE_SHIFT;
-	vpn_last = vpn_start + npages - 1;
+	vpn_last = vpn_start + npages;
 
 	spin_lock(&pd->lock);
 	usnic_uiom_remove_interval(&pd->root, vpn_start,
@@ -293,7 +293,7 @@ static int usnic_uiom_map_sorted_intervals(struct list_head *intervals,
 				pa_end = pa;
 			}
 
-			if ((va >> PAGE_SHIFT) == interval_node->last) {
+			if ((va >> PAGE_SHIFT) == interval_node->end) {
 				/* Last page of the interval */
 				size = pa - pa_start + PAGE_SIZE;
 				usnic_dbg("va 0x%lx pa %pa size 0x%zx flags 0x%x\n",
@@ -354,7 +354,7 @@ struct usnic_uiom_reg *usnic_uiom_reg_get(struct usnic_uiom_pd *pd,
 	offset = addr & ~PAGE_MASK;
 	npages = PAGE_ALIGN(size + offset) >> PAGE_SHIFT;
 	vpn_start = (addr & PAGE_MASK) >> PAGE_SHIFT;
-	vpn_last = vpn_start + npages - 1;
+	vpn_last = vpn_start + npages;
 
 	uiomr = kmalloc(sizeof(*uiomr), GFP_KERNEL);
 	if (!uiomr)
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.c b/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.c
index d399523206c7..12c968447673 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.c
@@ -36,11 +36,11 @@
 #include <linux/slab.h>
 #include <linux/list_sort.h>
 
-#include <linux/interval_tree_generic.h>
+#include <linux/interval_tree_gen.h>
 #include "usnic_uiom_interval_tree.h"
 
 #define START(node) ((node)->start)
-#define LAST(node) ((node)->last)
+#define END(node) ((node)->end)
 
 #define MAKE_NODE(node, start, end, ref_cnt, flags, err, err_out)	\
 		do {							\
@@ -76,7 +76,7 @@ usnic_uiom_interval_node_alloc(long int start, long int last, int ref_cnt,
 		return NULL;
 
 	interval->start = start;
-	interval->last = last;
+	interval->end = last;
 	interval->flags = flags;
 	interval->ref_cnt = ref_cnt;
 
@@ -133,7 +133,7 @@ int usnic_uiom_get_intervals_diff(unsigned long start, unsigned long last,
 
 	list_for_each_entry(interval, &intersection_set, link) {
 		if (pivot < interval->start) {
-			MAKE_NODE_AND_APPEND(tmp, pivot, interval->start - 1,
+			MAKE_NODE_AND_APPEND(tmp, pivot, interval->start,
 						1, flags, err, err_out,
 						diff_set);
 			pivot = interval->start;
@@ -144,12 +144,10 @@ int usnic_uiom_get_intervals_diff(unsigned long start, unsigned long last,
 		 * but not in both.
 		 */
 
-		if (pivot > interval->last) {
+		if (pivot > interval->end - 1) {
 			continue;
-		} else if (pivot <= interval->last &&
-				FLAGS_EQUAL(interval->flags, flags,
-				flag_mask)) {
-			pivot = interval->last + 1;
+		} else if (FLAGS_EQUAL(interval->flags, flags, flag_mask)) {
+			pivot = interval->end;
 		}
 	}
 
@@ -195,15 +193,15 @@ int usnic_uiom_insert_interval(struct rb_root_cached *root, unsigned long start,
 		 * inserted
 		 */
 		istart = interval->start;
-		ilast = interval->last;
+		ilast = interval->end - 1;
 		iref_cnt = interval->ref_cnt;
 		iflags = interval->flags;
 
 		if (istart < lpivot) {
-			MAKE_NODE_AND_APPEND(tmp, istart, lpivot - 1, iref_cnt,
+			MAKE_NODE_AND_APPEND(tmp, istart, lpivot, iref_cnt,
 						iflags, err, err_out, &to_add);
 		} else if (istart > lpivot) {
-			MAKE_NODE_AND_APPEND(tmp, lpivot, istart - 1, 1, flags,
+			MAKE_NODE_AND_APPEND(tmp, lpivot, istart, 1, flags,
 						err, err_out, &to_add);
 			lpivot = istart;
 		} else {
@@ -222,7 +220,7 @@ int usnic_uiom_insert_interval(struct rb_root_cached *root, unsigned long start,
 						&to_add);
 		}
 
-		lpivot = ilast + 1;
+		lpivot = interval->end;
 	}
 
 	if (lpivot <= last)
@@ -267,4 +265,4 @@ void usnic_uiom_remove_interval(struct rb_root_cached *root,
 
 INTERVAL_TREE_DEFINE(struct usnic_uiom_interval_node, rb,
 			unsigned long, __subtree_last,
-			START, LAST, , usnic_uiom_interval_tree)
+			START, END, , usnic_uiom_interval_tree)
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.h b/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.h
index 1d7fc3226bca..496edc9758c1 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.h
+++ b/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.h
@@ -40,7 +40,7 @@ struct usnic_uiom_interval_node {
 	struct rb_node			rb;
 	struct list_head		link;
 	unsigned long			start;
-	unsigned long			last;
+	unsigned long		        end;
 	unsigned long			__subtree_last;
 	unsigned int			ref_cnt;
 	int				flags;
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 07/11] vhost: convert vhost_umem_interval_tree to half closed intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (5 preceding siblings ...)
  2019-10-03 20:18 ` [PATCH 06/11] IB,usnic: convert usnic_uiom_interval_tree " Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-04 12:10   ` Michel Lespinasse
  2019-10-10  5:49   ` Jason Wang
  2019-10-03 20:18 ` [PATCH 08/11] mm: convert vma_interval_tree " Davidlohr Bueso
                   ` (5 subsequent siblings)
  12 siblings, 2 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Michael, Jason Wang, virtualization, Davidlohr Bueso

The vhost_umem interval tree really wants [a, b) intervals,
not fully closed as it is now. As such, convert it to use the
new interval_tree_gen.h, and also rename the 'last' endpoint
in the node to 'end', which is both a more suitable name for
the half closed interval and also reduces the chances of some
caller being missed.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 drivers/vhost/vhost.c | 19 +++++++++----------
 drivers/vhost/vhost.h |  4 ++--
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 36ca2cf419bf..80c3cca24dc7 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -28,7 +28,7 @@
 #include <linux/sort.h>
 #include <linux/sched/mm.h>
 #include <linux/sched/signal.h>
-#include <linux/interval_tree_generic.h>
+#include <linux/interval_tree_gen.h>
 #include <linux/nospec.h>
 
 #include "vhost.h"
@@ -51,7 +51,7 @@ enum {
 
 INTERVAL_TREE_DEFINE(struct vhost_umem_node,
 		     rb, __u64, __subtree_last,
-		     START, LAST, static inline, vhost_umem_interval_tree);
+		     START, END, static inline, vhost_umem_interval_tree);
 
 #ifdef CONFIG_VHOST_CROSS_ENDIAN_LEGACY
 static void vhost_disable_cross_endian(struct vhost_virtqueue *vq)
@@ -1034,7 +1034,7 @@ static int vhost_new_umem_range(struct vhost_umem *umem,
 
 	node->start = start;
 	node->size = size;
-	node->last = end;
+	node->end = end;
 	node->userspace_addr = userspace_addr;
 	node->perm = perm;
 	INIT_LIST_HEAD(&node->link);
@@ -1112,7 +1112,7 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
 		}
 		vhost_vq_meta_reset(dev);
 		if (vhost_new_umem_range(dev->iotlb, msg->iova, msg->size,
-					 msg->iova + msg->size - 1,
+					 msg->iova + msg->size,
 					 msg->uaddr, msg->perm)) {
 			ret = -ENOMEM;
 			break;
@@ -1126,7 +1126,7 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
 		}
 		vhost_vq_meta_reset(dev);
 		vhost_del_umem_range(dev->iotlb, msg->iova,
-				     msg->iova + msg->size - 1);
+				     msg->iova + msg->size);
 		break;
 	default:
 		ret = -EINVAL;
@@ -1320,15 +1320,14 @@ static bool iotlb_access_ok(struct vhost_virtqueue *vq,
 {
 	const struct vhost_umem_node *node;
 	struct vhost_umem *umem = vq->iotlb;
-	u64 s = 0, size, orig_addr = addr, last = addr + len - 1;
+	u64 s = 0, size, orig_addr = addr, last = addr + len;
 
 	if (vhost_vq_meta_fetch(vq, addr, len, type))
 		return true;
 
 	while (len > s) {
 		node = vhost_umem_interval_tree_iter_first(&umem->umem_tree,
-							   addr,
-							   last);
+							   addr, last);
 		if (node == NULL || node->start > addr) {
 			vhost_iotlb_miss(vq, addr, access);
 			return false;
@@ -1455,7 +1454,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
 					 region->guest_phys_addr,
 					 region->memory_size,
 					 region->guest_phys_addr +
-					 region->memory_size - 1,
+					 region->memory_size,
 					 region->userspace_addr,
 					 VHOST_ACCESS_RW))
 			goto err;
@@ -2055,7 +2054,7 @@ static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
 		}
 
 		node = vhost_umem_interval_tree_iter_first(&umem->umem_tree,
-							addr, addr + len - 1);
+							   addr, addr + len);
 		if (node == NULL || node->start > addr) {
 			if (umem != dev->iotlb) {
 				ret = -EFAULT;
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index e9ed2722b633..bb36cb9ed5ec 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -53,13 +53,13 @@ struct vhost_log {
 };
 
 #define START(node) ((node)->start)
-#define LAST(node) ((node)->last)
+#define END(node) ((node)->end)
 
 struct vhost_umem_node {
 	struct rb_node rb;
 	struct list_head link;
 	__u64 start;
-	__u64 last;
+	__u64 end;
 	__u64 size;
 	__u64 userspace_addr;
 	__u32 perm;
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 08/11] mm: convert vma_interval_tree to half closed intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (6 preceding siblings ...)
  2019-10-03 20:18 ` [PATCH 07/11] vhost: convert vhost_umem_interval_tree " Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-03 20:41   ` Matthew Wilcox
  2019-10-04 12:30   ` Michel Lespinasse
  2019-10-03 20:18 ` [PATCH 09/11] lib/interval-tree: convert interval_tree " Davidlohr Bueso
                   ` (4 subsequent siblings)
  12 siblings, 2 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Davidlohr Bueso

The vma and anon vma interval trees really want [a, b) intervals,
not fully closed. As such, convert them to use the new
interval_tree_gen.h. Because of vma_last_pgoff(), the conversion
is quite straightforward.
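
The key helper and the stab macro end up as below (mirroring the diff that
follows):

	static inline unsigned long vma_last_pgoff(struct vm_area_struct *v)
	{
		return v->vm_pgoff + vma_pages(v);	/* exclusive end */
	}

	#define vma_interval_tree_foreach_stab(vma, root, start)	\
		vma_interval_tree_foreach(vma, root, start, start + 1)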

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/mm.h | 4 ++--
 mm/interval_tree.c | 4 ++--
 mm/memory.c        | 2 +-
 mm/nommu.c         | 2 +-
 mm/rmap.c          | 6 +++---
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 53f9784d917d..3bc06e1de40c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2249,7 +2249,7 @@ struct vm_area_struct *vma_interval_tree_iter_next(struct vm_area_struct *node,
 	     vma; vma = vma_interval_tree_iter_next(vma, start, last))
 
 #define vma_interval_tree_foreach_stab(vma, root, start)		\
-	vma_interval_tree_foreach(vma, root, start, start)
+	vma_interval_tree_foreach(vma, root, start, start + 1)
 
 void anon_vma_interval_tree_insert(struct anon_vma_chain *node,
 				   struct rb_root_cached *root);
@@ -2269,7 +2269,7 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
 	     avc; avc = anon_vma_interval_tree_iter_next(avc, start, last))
 
 #define anon_vma_interval_tree_foreach_stab(vma, root, start)		\
-	anon_vma_interval_tree_foreach(vma, root, start, start)
+	anon_vma_interval_tree_foreach(vma, root, start, start + 1)
 
 /* mmap.c */
 extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin);
diff --git a/mm/interval_tree.c b/mm/interval_tree.c
index 11c75fb07584..1f8a7c122dd7 100644
--- a/mm/interval_tree.c
+++ b/mm/interval_tree.c
@@ -8,7 +8,7 @@
 #include <linux/mm.h>
 #include <linux/fs.h>
 #include <linux/rmap.h>
-#include <linux/interval_tree_generic.h>
+#include <linux/interval_tree_gen.h>
 
 static inline unsigned long vma_start_pgoff(struct vm_area_struct *v)
 {
@@ -17,7 +17,7 @@ static inline unsigned long vma_start_pgoff(struct vm_area_struct *v)
 
 static inline unsigned long vma_last_pgoff(struct vm_area_struct *v)
 {
-	return v->vm_pgoff + vma_pages(v) - 1;
+	return v->vm_pgoff + vma_pages(v);
 }
 
 INTERVAL_TREE_DEFINE(struct vm_area_struct, shared.rb,
diff --git a/mm/memory.c b/mm/memory.c
index b1ca51a079f2..8f6978abf64a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2679,7 +2679,7 @@ void unmap_mapping_pages(struct address_space *mapping, pgoff_t start,
 
 	details.check_mapping = even_cows ? NULL : mapping;
 	details.first_index = start;
-	details.last_index = start + nr - 1;
+	details.last_index = start + nr;
 	if (details.last_index < details.first_index)
 		details.last_index = ULONG_MAX;
 
diff --git a/mm/nommu.c b/mm/nommu.c
index 99b7ec318824..284c2a948d79 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1793,7 +1793,7 @@ int nommu_shrink_inode_mappings(struct inode *inode, size_t size,
 	size_t r_size, r_top;
 
 	low = newsize >> PAGE_SHIFT;
-	high = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	high = (size + PAGE_SIZE) >> PAGE_SHIFT;
 
 	down_write(&nommu_region_sem);
 	i_mmap_lock_read(inode->i_mapping);
diff --git a/mm/rmap.c b/mm/rmap.c
index d9a23bb773bf..48ca7d1a06b5 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1826,7 +1826,7 @@ static void rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc,
 		return;
 
 	pgoff_start = page_to_pgoff(page);
-	pgoff_end = pgoff_start + hpage_nr_pages(page) - 1;
+	pgoff_end = pgoff_start + hpage_nr_pages(page);
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
 			pgoff_start, pgoff_end) {
 		struct vm_area_struct *vma = avc->vma;
@@ -1879,11 +1879,11 @@ static void rmap_walk_file(struct page *page, struct rmap_walk_control *rwc,
 		return;
 
 	pgoff_start = page_to_pgoff(page);
-	pgoff_end = pgoff_start + hpage_nr_pages(page) - 1;
+	pgoff_end = pgoff_start + hpage_nr_pages(page);
 	if (!locked)
 		i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap,
-			pgoff_start, pgoff_end) {
+				  pgoff_start, pgoff_end) {
 		unsigned long address = vma_address(page, vma);
 
 		cond_resched();
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 09/11] lib/interval-tree: convert interval_tree to half closed intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (7 preceding siblings ...)
  2019-10-03 20:18 ` [PATCH 08/11] mm: convert vma_interval_tree " Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-03 22:50   ` kbuild test robot
  2019-10-04  6:57   ` Koenig, Christian
  2019-10-03 20:18 ` [PATCH 10/11] lib: drop interval_tree_generic.h Davidlohr Bueso
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Christian König, Alex Deucher, David Airlie,
	Daniel Vetter, Doug Ledford, Joerg Roedel,
	Jérôme Glisse, Davidlohr Bueso

The generic interval tree really wants [a, b) intervals, not fully
closed ones. As such, convert it to use the new interval_tree_gen.h.
Most of the conversions are straightforward, with the possible
exception of radeon_vm_bo_set_addr(); the semantics have been kept
untouched as much as possible.
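
As a rough sketch (not taken from any single driver; 'objects' stands
in for the driver's interval tree root), the common mmu notifier
pattern no longer needs the 'end - 1' adjustment, since the notifier
range end is already exclusive:

        struct interval_tree_node *it;

        /* range->start/range->end are already half-open, pass through */
        it = interval_tree_iter_first(&objects, range->start, range->end);
        while (it) {
                /* ... invalidate whatever covers [it->start, it->end) ... */
                it = interval_tree_iter_next(it, range->start, range->end);
        }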

Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c      | 12 +++---------
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  5 +----
 drivers/gpu/drm/radeon/radeon_mn.c          | 11 ++++-------
 drivers/gpu/drm/radeon/radeon_trace.h       |  2 +-
 drivers/gpu/drm/radeon/radeon_vm.c          | 26 +++++++++++++-------------
 drivers/infiniband/core/umem_odp.c          | 21 +++++++--------------
 drivers/iommu/virtio-iommu.c                |  6 +++---
 include/linux/interval_tree.h               |  2 +-
 include/rdma/ib_umem_odp.h                  |  4 ++--
 lib/interval_tree.c                         |  6 +++---
 10 files changed, 38 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 31d4deb5d294..4bbaa9ffa570 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -205,9 +205,6 @@ amdgpu_mn_sync_pagetables_gfx(struct hmm_mirror *mirror,
 	bool blockable = mmu_notifier_range_blockable(update);
 	struct interval_tree_node *it;
 
-	/* notification is exclusive, but interval is inclusive */
-	end -= 1;
-
 	/* TODO we should be able to split locking for interval tree and
 	 * amdgpu_mn_invalidate_node
 	 */
@@ -254,9 +251,6 @@ amdgpu_mn_sync_pagetables_hsa(struct hmm_mirror *mirror,
 	bool blockable = mmu_notifier_range_blockable(update);
 	struct interval_tree_node *it;
 
-	/* notification is exclusive, but interval is inclusive */
-	end -= 1;
-
 	if (amdgpu_mn_read_lock(amn, blockable))
 		return -EAGAIN;
 
@@ -374,7 +368,7 @@ struct amdgpu_mn *amdgpu_mn_get(struct amdgpu_device *adev,
  */
 int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
 {
-	unsigned long end = addr + amdgpu_bo_size(bo) - 1;
+	unsigned long end = addr + amdgpu_bo_size(bo);
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
 	enum amdgpu_mn_type type =
 		bo->kfd_bo ? AMDGPU_MN_TYPE_HSA : AMDGPU_MN_TYPE_GFX;
@@ -400,7 +394,7 @@ int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
 		node = container_of(it, struct amdgpu_mn_node, it);
 		interval_tree_remove(&node->it, &amn->objects);
 		addr = min(it->start, addr);
-		end = max(it->last, end);
+		end = max(it->end, end);
 		list_splice(&node->bos, &bos);
 	}
 
@@ -412,7 +406,7 @@ int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
 	bo->mn = amn;
 
 	node->it.start = addr;
-	node->it.last = end;
+	node->it.end = end;
 	INIT_LIST_HEAD(&node->bos);
 	list_splice(&bos, &node->bos);
 	list_add(&bo->mn_list, &node->bos);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 11b231c187c5..818ff6b33102 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -99,9 +99,6 @@ userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
 	if (RB_EMPTY_ROOT(&mn->objects.rb_root))
 		return 0;
 
-	/* interval ranges are inclusive, but invalidate range is exclusive */
-	end = range->end - 1;
-
 	spin_lock(&mn->lock);
 	it = interval_tree_iter_first(&mn->objects, range->start, end);
 	while (it) {
@@ -275,7 +272,7 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
 	mo->mn = mn;
 	mo->obj = obj;
 	mo->it.start = obj->userptr.ptr;
-	mo->it.last = obj->userptr.ptr + obj->base.size - 1;
+	mo->it.end = obj->userptr.ptr + obj->base.size;
 	RB_CLEAR_NODE(&mo->it.rb);
 
 	obj->userptr.mmu_object = mo;
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index dbab9a3a969b..4810421dacc2 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -66,12 +66,9 @@ static int radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
 	struct radeon_mn *rmn = container_of(mn, struct radeon_mn, mn);
 	struct ttm_operation_ctx ctx = { false, false };
 	struct interval_tree_node *it;
-	unsigned long end;
+	unsigned long end = range->end;
 	int ret = 0;
 
-	/* notification is exclusive, but interval is inclusive */
-	end = range->end - 1;
-
 	/* TODO we should be able to split locking for interval tree and
 	 * the tear down.
 	 */
@@ -174,7 +171,7 @@ static const struct mmu_notifier_ops radeon_mn_ops = {
  */
 int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
 {
-	unsigned long end = addr + radeon_bo_size(bo) - 1;
+	unsigned long end = addr + radeon_bo_size(bo);
 	struct mmu_notifier *mn;
 	struct radeon_mn *rmn;
 	struct radeon_mn_node *node = NULL;
@@ -195,7 +192,7 @@ int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
 		node = container_of(it, struct radeon_mn_node, it);
 		interval_tree_remove(&node->it, &rmn->objects);
 		addr = min(it->start, addr);
-		end = max(it->last, end);
+		end = max(it->end, end);
 		list_splice(&node->bos, &bos);
 	}
 
@@ -210,7 +207,7 @@ int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
 	bo->mn = rmn;
 
 	node->it.start = addr;
-	node->it.last = end;
+	node->it.end = end;
 	INIT_LIST_HEAD(&node->bos);
 	list_splice(&bos, &node->bos);
 	list_add(&bo->mn_list, &node->bos);
diff --git a/drivers/gpu/drm/radeon/radeon_trace.h b/drivers/gpu/drm/radeon/radeon_trace.h
index c93f3ab3c4e3..4324f3fc5d82 100644
--- a/drivers/gpu/drm/radeon/radeon_trace.h
+++ b/drivers/gpu/drm/radeon/radeon_trace.h
@@ -73,7 +73,7 @@ TRACE_EVENT(radeon_vm_bo_update,
 
 	    TP_fast_assign(
 			   __entry->soffset = bo_va->it.start;
-			   __entry->eoffset = bo_va->it.last + 1;
+			   __entry->eoffset = bo_va->it.end;
 			   __entry->flags = bo_va->flags;
 			   ),
 	    TP_printk("soffs=%010llx, eoffs=%010llx, flags=%08x",
diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
index e0ad547786e8..b2a54aff21f4 100644
--- a/drivers/gpu/drm/radeon/radeon_vm.c
+++ b/drivers/gpu/drm/radeon/radeon_vm.c
@@ -329,7 +329,7 @@ struct radeon_bo_va *radeon_vm_bo_add(struct radeon_device *rdev,
 	bo_va->vm = vm;
 	bo_va->bo = bo;
 	bo_va->it.start = 0;
-	bo_va->it.last = 0;
+	bo_va->it.end = 0;
 	bo_va->flags = 0;
 	bo_va->ref_count = 1;
 	INIT_LIST_HEAD(&bo_va->bo_list);
@@ -456,7 +456,7 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
 
 	if (soffset) {
 		/* make sure object fit at this offset */
-		eoffset = soffset + size - 1;
+		eoffset = soffset + size;
 		if (soffset >= eoffset) {
 			r = -EINVAL;
 			goto error_unreserve;
@@ -486,14 +486,14 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
 			/* bo and tmp overlap, invalid offset */
 			dev_err(rdev->dev, "bo %p va 0x%010Lx conflict with "
 				"(bo %p 0x%010lx 0x%010lx)\n", bo_va->bo,
-				soffset, tmp->bo, tmp->it.start, tmp->it.last);
+				soffset, tmp->bo, tmp->it.start, tmp->it.end);
 			mutex_unlock(&vm->mutex);
 			r = -EINVAL;
 			goto error_unreserve;
 		}
 	}
 
-	if (bo_va->it.start || bo_va->it.last) {
+	if (bo_va->it.start || bo_va->it.end) {
 		/* add a clone of the bo_va to clear the old address */
 		struct radeon_bo_va *tmp;
 		tmp = kzalloc(sizeof(struct radeon_bo_va), GFP_KERNEL);
@@ -503,14 +503,14 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
 			goto error_unreserve;
 		}
 		tmp->it.start = bo_va->it.start;
-		tmp->it.last = bo_va->it.last;
+		tmp->it.end = bo_va->it.end;
 		tmp->vm = vm;
 		tmp->bo = radeon_bo_ref(bo_va->bo);
 
 		interval_tree_remove(&bo_va->it, &vm->va);
 		spin_lock(&vm->status_lock);
 		bo_va->it.start = 0;
-		bo_va->it.last = 0;
+		bo_va->it.end = 0;
 		list_del_init(&bo_va->vm_status);
 		list_add(&tmp->vm_status, &vm->freed);
 		spin_unlock(&vm->status_lock);
@@ -519,7 +519,7 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
 	if (soffset || eoffset) {
 		spin_lock(&vm->status_lock);
 		bo_va->it.start = soffset;
-		bo_va->it.last = eoffset;
+		bo_va->it.end = eoffset;
 		list_add(&bo_va->vm_status, &vm->cleared);
 		spin_unlock(&vm->status_lock);
 		interval_tree_insert(&bo_va->it, &vm->va);
@@ -964,7 +964,7 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
 
 	trace_radeon_vm_bo_update(bo_va);
 
-	nptes = bo_va->it.last - bo_va->it.start + 1;
+	nptes = bo_va->it.end - bo_va->it.start;
 
 	/* reserve space for one command every (1 << BLOCK_SIZE) entries
 	   or 2k dwords (whatever is smaller) */
@@ -1010,7 +1010,7 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
 	}
 
 	r = radeon_vm_update_ptes(rdev, vm, &ib, bo_va->it.start,
-				  bo_va->it.last + 1, addr,
+				  bo_va->it.end, addr,
 				  radeon_vm_page_flags(bo_va->flags));
 	if (r) {
 		radeon_ib_free(rdev, &ib);
@@ -1026,7 +1026,7 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
 		return r;
 	}
 	ib.fence->is_vm_update = true;
-	radeon_vm_fence_pts(vm, bo_va->it.start, bo_va->it.last + 1, ib.fence);
+	radeon_vm_fence_pts(vm, bo_va->it.start, bo_va->it.end, ib.fence);
 	radeon_fence_unref(&bo_va->last_pt_update);
 	bo_va->last_pt_update = radeon_fence_ref(ib.fence);
 	radeon_ib_free(rdev, &ib);
@@ -1124,12 +1124,12 @@ void radeon_vm_bo_rmv(struct radeon_device *rdev,
 	list_del(&bo_va->bo_list);
 
 	mutex_lock(&vm->mutex);
-	if (bo_va->it.start || bo_va->it.last)
+	if (bo_va->it.start || bo_va->it.end)
 		interval_tree_remove(&bo_va->it, &vm->va);
 
 	spin_lock(&vm->status_lock);
 	list_del(&bo_va->vm_status);
-	if (bo_va->it.start || bo_va->it.last) {
+	if (bo_va->it.start || bo_va->it.end) {
 		bo_va->bo = radeon_bo_ref(bo_va->bo);
 		list_add(&bo_va->vm_status, &vm->freed);
 	} else {
@@ -1158,7 +1158,7 @@ void radeon_vm_bo_invalidate(struct radeon_device *rdev,
 	list_for_each_entry(bo_va, &bo->va, bo_list) {
 		spin_lock(&bo_va->vm->status_lock);
 		if (list_empty(&bo_va->vm_status) &&
-		    (bo_va->it.start || bo_va->it.last))
+		    (bo_va->it.start || bo_va->it.end))
 			list_add(&bo_va->vm_status, &bo_va->vm->invalidated);
 		spin_unlock(&bo_va->vm->status_lock);
 	}
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index f67a30fda1ed..9b7ac93223d6 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -219,26 +219,19 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp)
 			ALIGN_DOWN(umem_odp->umem.address, page_size);
 		if (check_add_overflow(umem_odp->umem.address,
 				       (unsigned long)umem_odp->umem.length,
-				       &umem_odp->interval_tree.last))
+				       &umem_odp->interval_tree.end))
 			return -EOVERFLOW;
-		umem_odp->interval_tree.last =
-			ALIGN(umem_odp->interval_tree.last, page_size);
-		if (unlikely(umem_odp->interval_tree.last < page_size))
+		umem_odp->interval_tree.end =
+			ALIGN(umem_odp->interval_tree.end, page_size);
+		if (unlikely(umem_odp->interval_tree.end < page_size))
 			return -EOVERFLOW;
 
-		pages = (umem_odp->interval_tree.last -
+		pages = (umem_odp->interval_tree.end -
 			 umem_odp->interval_tree.start) >>
 			umem_odp->page_shift;
 		if (!pages)
 			return -EINVAL;
 
-		/*
-		 * Note that the representation of the intervals in the
-		 * interval tree considers the ending point as contained in
-		 * the interval.
-		 */
-		umem_odp->interval_tree.last--;
-
 		umem_odp->page_list = kvcalloc(
 			pages, sizeof(*umem_odp->page_list), GFP_KERNEL);
 		if (!umem_odp->page_list)
@@ -777,12 +770,12 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root,
 	if (unlikely(start == last))
 		return ret_val;
 
-	for (node = interval_tree_iter_first(root, start, last - 1);
+	for (node = interval_tree_iter_first(root, start, last);
 			node; node = next) {
 		/* TODO move the blockable decision up to the callback */
 		if (!blockable)
 			return -EAGAIN;
-		next = interval_tree_iter_next(node, start, last - 1);
+		next = interval_tree_iter_next(node, start, last);
 		umem = container_of(node, struct ib_umem_odp, interval_tree);
 		ret_val = cb(umem, start, last, cookie) || ret_val;
 	}
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 3ea9d7682999..771740981f43 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -323,7 +323,7 @@ static int viommu_add_mapping(struct viommu_domain *vdomain, unsigned long iova,
 
 	mapping->paddr		= paddr;
 	mapping->iova.start	= iova;
-	mapping->iova.last	= iova + size - 1;
+	mapping->iova.end	= iova + size;
 	mapping->flags		= flags;
 
 	spin_lock_irqsave(&vdomain->mappings_lock, irqflags);
@@ -348,7 +348,7 @@ static size_t viommu_del_mappings(struct viommu_domain *vdomain,
 {
 	size_t unmapped = 0;
 	unsigned long flags;
-	unsigned long last = iova + size - 1;
+	unsigned long last = iova + size;
 	struct viommu_mapping *mapping = NULL;
 	struct interval_tree_node *node, *next;
 
@@ -367,7 +367,7 @@ static size_t viommu_del_mappings(struct viommu_domain *vdomain,
 		 * Virtio-iommu doesn't allow UNMAP to split a mapping created
 		 * with a single MAP request, so remove the full mapping.
 		 */
-		unmapped += mapping->iova.last - mapping->iova.start + 1;
+		unmapped += mapping->iova.end - mapping->iova.start;
 
 		interval_tree_remove(node, &vdomain->mappings);
 		kfree(mapping);
diff --git a/include/linux/interval_tree.h b/include/linux/interval_tree.h
index 288c26f50732..f3d1ea9e4003 100644
--- a/include/linux/interval_tree.h
+++ b/include/linux/interval_tree.h
@@ -7,7 +7,7 @@
 struct interval_tree_node {
 	struct rb_node rb;
 	unsigned long start;	/* Start of interval */
-	unsigned long last;	/* Last location _in_ interval */
+	unsigned long end;	/* Last location _outside_ interval */
 	unsigned long __subtree_last;
 };
 
diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
index 253df1a1fa54..d0ae7aa2139b 100644
--- a/include/rdma/ib_umem_odp.h
+++ b/include/rdma/ib_umem_odp.h
@@ -97,7 +97,7 @@ static inline unsigned long ib_umem_start(struct ib_umem_odp *umem_odp)
 /* Returns the address of the page after the last one of an ODP umem. */
 static inline unsigned long ib_umem_end(struct ib_umem_odp *umem_odp)
 {
-	return umem_odp->interval_tree.last + 1;
+	return umem_odp->interval_tree.end;
 }
 
 static inline size_t ib_umem_odp_num_pages(struct ib_umem_odp *umem_odp)
@@ -165,7 +165,7 @@ rbt_ib_umem_lookup(struct rb_root_cached *root, u64 addr, u64 length)
 {
 	struct interval_tree_node *node;
 
-	node = interval_tree_iter_first(root, addr, addr + length - 1);
+	node = interval_tree_iter_first(root, addr, addr + length);
 	if (!node)
 		return NULL;
 	return container_of(node, struct ib_umem_odp, interval_tree);
diff --git a/lib/interval_tree.c b/lib/interval_tree.c
index 593ce56ece50..336ec5113a28 100644
--- a/lib/interval_tree.c
+++ b/lib/interval_tree.c
@@ -1,15 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0-only
 #include <linux/interval_tree.h>
-#include <linux/interval_tree_generic.h>
+#include <linux/interval_tree_gen.h>
 #include <linux/compiler.h>
 #include <linux/export.h>
 
 #define START(node) ((node)->start)
-#define LAST(node)  ((node)->last)
+#define END(node)  ((node)->end)
 
 INTERVAL_TREE_DEFINE(struct interval_tree_node, rb,
 		     unsigned long, __subtree_last,
-		     START, LAST,, interval_tree)
+		     START, END,, interval_tree)
 
 EXPORT_SYMBOL_GPL(interval_tree_insert);
 EXPORT_SYMBOL_GPL(interval_tree_remove);
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 10/11] lib: drop interval_tree_generic.h
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (8 preceding siblings ...)
  2019-10-03 20:18 ` [PATCH 09/11] lib/interval-tree: convert interval_tree " Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-03 20:18 ` [PATCH 11/11] x86/mm, pat: convert pat tree to generic interval tree Davidlohr Bueso
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Davidlohr Bueso

Now that we have the semi-open equivalent, interval_tree_gen.h, and
all users have been updated, we can do without this header file.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/interval_tree_generic.h | 187 ----------------------------------
 1 file changed, 187 deletions(-)
 delete mode 100644 include/linux/interval_tree_generic.h

diff --git a/include/linux/interval_tree_generic.h b/include/linux/interval_tree_generic.h
deleted file mode 100644
index aaa8a0767aa3..000000000000
--- a/include/linux/interval_tree_generic.h
+++ /dev/null
@@ -1,187 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-or-later */
-/*
-  Interval Trees
-  (C) 2012  Michel Lespinasse <walken@google.com>
-
-
-  include/linux/interval_tree_generic.h
-*/
-
-#include <linux/rbtree_augmented.h>
-
-/*
- * Template for implementing interval trees
- *
- * ITSTRUCT:   struct type of the interval tree nodes
- * ITRB:       name of struct rb_node field within ITSTRUCT
- * ITTYPE:     type of the interval endpoints
- * ITSUBTREE:  name of ITTYPE field within ITSTRUCT holding last-in-subtree
- * ITSTART(n): start endpoint of ITSTRUCT node n
- * ITLAST(n):  last endpoint of ITSTRUCT node n
- * ITSTATIC:   'static' or empty
- * ITPREFIX:   prefix to use for the inline tree definitions
- *
- * Note - before using this, please consider if generic version
- * (interval_tree.h) would work for you...
- */
-
-#define INTERVAL_TREE_DEFINE(ITSTRUCT, ITRB, ITTYPE, ITSUBTREE,		      \
-			     ITSTART, ITLAST, ITSTATIC, ITPREFIX)	      \
-									      \
-/* Callbacks for augmented rbtree insert and remove */			      \
-									      \
-RB_DECLARE_CALLBACKS_MAX(static, ITPREFIX ## _augment,			      \
-			 ITSTRUCT, ITRB, ITTYPE, ITSUBTREE, ITLAST)	      \
-									      \
-/* Insert / remove interval nodes from the tree */			      \
-									      \
-ITSTATIC void ITPREFIX ## _insert(ITSTRUCT *node,			      \
-				  struct rb_root_cached *root)	 	      \
-{									      \
-	struct rb_node **link = &root->rb_root.rb_node, *rb_parent = NULL;    \
-	ITTYPE start = ITSTART(node), last = ITLAST(node);		      \
-	ITSTRUCT *parent;						      \
-	bool leftmost = true;						      \
-									      \
-	while (*link) {							      \
-		rb_parent = *link;					      \
-		parent = rb_entry(rb_parent, ITSTRUCT, ITRB);		      \
-		if (parent->ITSUBTREE < last)				      \
-			parent->ITSUBTREE = last;			      \
-		if (start < ITSTART(parent))				      \
-			link = &parent->ITRB.rb_left;			      \
-		else {							      \
-			link = &parent->ITRB.rb_right;			      \
-			leftmost = false;				      \
-		}							      \
-	}								      \
-									      \
-	node->ITSUBTREE = last;						      \
-	rb_link_node(&node->ITRB, rb_parent, link);			      \
-	rb_insert_augmented_cached(&node->ITRB, root,			      \
-				   leftmost, &ITPREFIX ## _augment);	      \
-}									      \
-									      \
-ITSTATIC void ITPREFIX ## _remove(ITSTRUCT *node,			      \
-				  struct rb_root_cached *root)		      \
-{									      \
-	rb_erase_augmented_cached(&node->ITRB, root, &ITPREFIX ## _augment);  \
-}									      \
-									      \
-/*									      \
- * Iterate over intervals intersecting [start;last]			      \
- *									      \
- * Note that a node's interval intersects [start;last] iff:		      \
- *   Cond1: ITSTART(node) <= last					      \
- * and									      \
- *   Cond2: start <= ITLAST(node)					      \
- */									      \
-									      \
-static ITSTRUCT *							      \
-ITPREFIX ## _subtree_search(ITSTRUCT *node, ITTYPE start, ITTYPE last)	      \
-{									      \
-	while (true) {							      \
-		/*							      \
-		 * Loop invariant: start <= node->ITSUBTREE		      \
-		 * (Cond2 is satisfied by one of the subtree nodes)	      \
-		 */							      \
-		if (node->ITRB.rb_left) {				      \
-			ITSTRUCT *left = rb_entry(node->ITRB.rb_left,	      \
-						  ITSTRUCT, ITRB);	      \
-			if (start <= left->ITSUBTREE) {			      \
-				/*					      \
-				 * Some nodes in left subtree satisfy Cond2.  \
-				 * Iterate to find the leftmost such node N.  \
-				 * If it also satisfies Cond1, that's the     \
-				 * match we are looking for. Otherwise, there \
-				 * is no matching interval as nodes to the    \
-				 * right of N can't satisfy Cond1 either.     \
-				 */					      \
-				node = left;				      \
-				continue;				      \
-			}						      \
-		}							      \
-		if (ITSTART(node) <= last) {		/* Cond1 */	      \
-			if (start <= ITLAST(node))	/* Cond2 */	      \
-				return node;	/* node is leftmost match */  \
-			if (node->ITRB.rb_right) {			      \
-				node = rb_entry(node->ITRB.rb_right,	      \
-						ITSTRUCT, ITRB);	      \
-				if (start <= node->ITSUBTREE)		      \
-					continue;			      \
-			}						      \
-		}							      \
-		return NULL;	/* No match */				      \
-	}								      \
-}									      \
-									      \
-ITSTATIC ITSTRUCT *							      \
-ITPREFIX ## _iter_first(struct rb_root_cached *root,			      \
-			ITTYPE start, ITTYPE last)			      \
-{									      \
-	ITSTRUCT *node, *leftmost;					      \
-									      \
-	if (!root->rb_root.rb_node)					      \
-		return NULL;						      \
-									      \
-	/*								      \
-	 * Fastpath range intersection/overlap between A: [a0, a1] and	      \
-	 * B: [b0, b1] is given by:					      \
-	 *								      \
-	 *         a0 <= b1 && b0 <= a1					      \
-	 *								      \
-	 *  ... where A holds the lock range and B holds the smallest	      \
-	 * 'start' and largest 'last' in the tree. For the later, we	      \
-	 * rely on the root node, which by augmented interval tree	      \
-	 * property, holds the largest value in its last-in-subtree.	      \
-	 * This allows mitigating some of the tree walk overhead for	      \
-	 * for non-intersecting ranges, maintained and consulted in O(1).     \
-	 */								      \
-	node = rb_entry(root->rb_root.rb_node, ITSTRUCT, ITRB);		      \
-	if (node->ITSUBTREE < start)					      \
-		return NULL;						      \
-									      \
-	leftmost = rb_entry(root->rb_leftmost, ITSTRUCT, ITRB);		      \
-	if (ITSTART(leftmost) > last)					      \
-		return NULL;						      \
-									      \
-	return ITPREFIX ## _subtree_search(node, start, last);		      \
-}									      \
-									      \
-ITSTATIC ITSTRUCT *							      \
-ITPREFIX ## _iter_next(ITSTRUCT *node, ITTYPE start, ITTYPE last)	      \
-{									      \
-	struct rb_node *rb = node->ITRB.rb_right, *prev;		      \
-									      \
-	while (true) {							      \
-		/*							      \
-		 * Loop invariants:					      \
-		 *   Cond1: ITSTART(node) <= last			      \
-		 *   rb == node->ITRB.rb_right				      \
-		 *							      \
-		 * First, search right subtree if suitable		      \
-		 */							      \
-		if (rb) {						      \
-			ITSTRUCT *right = rb_entry(rb, ITSTRUCT, ITRB);	      \
-			if (start <= right->ITSUBTREE)			      \
-				return ITPREFIX ## _subtree_search(right,     \
-								start, last); \
-		}							      \
-									      \
-		/* Move up the tree until we come from a node's left child */ \
-		do {							      \
-			rb = rb_parent(&node->ITRB);			      \
-			if (!rb)					      \
-				return NULL;				      \
-			prev = &node->ITRB;				      \
-			node = rb_entry(rb, ITSTRUCT, ITRB);		      \
-			rb = node->ITRB.rb_right;			      \
-		} while (prev == rb);					      \
-									      \
-		/* Check if the node intersects [start;last] */		      \
-		if (last < ITSTART(node))		/* !Cond1 */	      \
-			return NULL;					      \
-		else if (start <= ITLAST(node))		/* Cond2 */	      \
-			return node;					      \
-	}								      \
-}
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 11/11] x86/mm, pat: convert pat tree to generic interval tree
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (9 preceding siblings ...)
  2019-10-03 20:18 ` [PATCH 10/11] lib: drop interval_tree_generic.h Davidlohr Bueso
@ 2019-10-03 20:18 ` Davidlohr Bueso
  2019-10-07 15:33   ` Ingo Molnar
  2019-10-03 20:32 ` [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Matthew Wilcox
  2019-10-04  0:26 ` Jason Gunthorpe
  12 siblings, 1 reply; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 20:18 UTC (permalink / raw)
  To: akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	dave, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Davidlohr Bueso

With a few considerations (noted below), the custom pat_rbtree
implementation can be simplified to use most of the generic
interval_tree machinery.

o The in-order traversal of the tree can differ slightly when there are
key ('start') collisions, because the old insert goes left on equal keys
while the generic one goes right. This, however, only affects the output
of debugfs' pat_memtype_list file.

o Generic interval trees are now semi-open [a, b), which suits what
pat wants.

o The erasing logic must remain untouched.

In order for the types to remain u64, the 'memtype_interval' helpers are
introduced, as opposed to simply using struct interval_tree_node.

In addition, the pat tree might also benefit from the fast overlap
detection for the insertion case, when looking up the first overlapping
node in the tree.

Finally, I've tested this on various servers, via sanity warnings,
running it side by side with the current version, and so far I see no
differences in the node returned by memtype_rb_lowest_match() lookups.
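
For reference, a simplified sketch of how the conflict walk reads with
the memtype_interval_* helpers defined in the diff below. It only
illustrates the iterator usage; the actual conflict rules live in
memtype_rb_check_conflict(), and 'req_type' is a placeholder here:

        struct memtype *match;

        /* visit every reservation overlapping the half-open [start, end) */
        for (match = memtype_interval_iter_first(&memtype_rbroot, start, end);
             match;
             match = memtype_interval_iter_next(match, start, end)) {
                if (match->type != req_type)
                        return -EBUSY;
        }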

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 arch/x86/mm/pat.c        |  22 +++----
 arch/x86/mm/pat_rbtree.c | 151 ++++++++++-------------------------------------
 2 files changed, 43 insertions(+), 130 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index d9fbd4f69920..a3d47b5db704 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -482,8 +482,8 @@ static int reserve_ram_pages_type(u64 start, u64 end,
 		page = pfn_to_page(pfn);
 		type = get_page_memtype(page);
 		if (type != _PAGE_CACHE_MODE_WB) {
-			pr_info("x86/PAT: reserve_ram_pages_type failed [mem %#010Lx-%#010Lx], track 0x%x, req 0x%x\n",
-				start, end - 1, type, req_type);
+			pr_info("x86/PAT: reserve_ram_pages_type failed [mem %#010Lx-%#010Lx), track 0x%x, req 0x%x\n",
+				start, end, type, req_type);
 			if (new_type)
 				*new_type = type;
 
@@ -553,8 +553,8 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,
 	start = sanitize_phys(start);
 	end = sanitize_phys(end);
 	if (start >= end) {
-		WARN(1, "%s failed: [mem %#010Lx-%#010Lx], req %s\n", __func__,
-				start, end - 1, cattr_name(req_type));
+		WARN(1, "%s failed: [mem %#010Lx-%#010Lx), req %s\n", __func__,
+				start, end, cattr_name(req_type));
 		return -EINVAL;
 	}
 
@@ -605,8 +605,8 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,
 
 	err = rbt_memtype_check_insert(new, new_type);
 	if (err) {
-		pr_info("x86/PAT: reserve_memtype failed [mem %#010Lx-%#010Lx], track %s, req %s\n",
-			start, end - 1,
+		pr_info("x86/PAT: reserve_memtype failed [mem %#010Lx-%#010Lx), track %s, req %s\n",
+			start, end,
 			cattr_name(new->type), cattr_name(req_type));
 		kfree(new);
 		spin_unlock(&memtype_lock);
@@ -616,8 +616,8 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,
 
 	spin_unlock(&memtype_lock);
 
-	dprintk("reserve_memtype added [mem %#010Lx-%#010Lx], track %s, req %s, ret %s\n",
-		start, end - 1, cattr_name(new->type), cattr_name(req_type),
+	dprintk("reserve_memtype added [mem %#010Lx-%#010Lx), track %s, req %s, ret %s\n",
+		start, end, cattr_name(new->type), cattr_name(req_type),
 		new_type ? cattr_name(*new_type) : "-");
 
 	return err;
@@ -654,14 +654,14 @@ int free_memtype(u64 start, u64 end)
 	spin_unlock(&memtype_lock);
 
 	if (IS_ERR(entry)) {
-		pr_info("x86/PAT: %s:%d freeing invalid memtype [mem %#010Lx-%#010Lx]\n",
-			current->comm, current->pid, start, end - 1);
+		pr_info("x86/PAT: %s:%d freeing invalid memtype [mem %#010Lx-%#010Lx)\n",
+			current->comm, current->pid, start, end);
 		return -EINVAL;
 	}
 
 	kfree(entry);
 
-	dprintk("free_memtype request [mem %#010Lx-%#010Lx]\n", start, end - 1);
+	dprintk("free_memtype request [mem %#010Lx-%#010Lx)\n", start, end);
 
 	return 0;
 }
diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
index 65ebe4b88f7c..89d097c50adb 100644
--- a/arch/x86/mm/pat_rbtree.c
+++ b/arch/x86/mm/pat_rbtree.c
@@ -5,14 +5,13 @@
  * Authors: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
  *          Suresh B Siddha <suresh.b.siddha@intel.com>
  *
- * Interval tree (augmented rbtree) used to store the PAT memory type
- * reservations.
+ * Interval tree used to store the PAT memory type reservations.
  */
 
 #include <linux/seq_file.h>
 #include <linux/debugfs.h>
 #include <linux/kernel.h>
-#include <linux/rbtree_augmented.h>
+#include <linux/interval_tree_gen.h>
 #include <linux/sched.h>
 #include <linux/gfp.h>
 
@@ -33,72 +32,25 @@
  *
  * memtype_lock protects the rbtree.
  */
+#define START(node) ((node)->start)
+#define END(node)  ((node)->end)
+INTERVAL_TREE_DEFINE(struct memtype, rb, u64, subtree_max_end,
+		     START, END, static, memtype_interval)
 
-static struct rb_root memtype_rbroot = RB_ROOT;
-
-static int is_node_overlap(struct memtype *node, u64 start, u64 end)
-{
-	if (node->start >= end || node->end <= start)
-		return 0;
-
-	return 1;
-}
-
-static u64 get_subtree_max_end(struct rb_node *node)
-{
-	u64 ret = 0;
-	if (node) {
-		struct memtype *data = rb_entry(node, struct memtype, rb);
-		ret = data->subtree_max_end;
-	}
-	return ret;
-}
-
-#define NODE_END(node) ((node)->end)
-
-RB_DECLARE_CALLBACKS_MAX(static, memtype_rb_augment_cb,
-			 struct memtype, rb, u64, subtree_max_end, NODE_END)
-
-/* Find the first (lowest start addr) overlapping range from rb tree */
-static struct memtype *memtype_rb_lowest_match(struct rb_root *root,
-				u64 start, u64 end)
-{
-	struct rb_node *node = root->rb_node;
-	struct memtype *last_lower = NULL;
-
-	while (node) {
-		struct memtype *data = rb_entry(node, struct memtype, rb);
-
-		if (get_subtree_max_end(node->rb_left) > start) {
-			/* Lowest overlap if any must be on left side */
-			node = node->rb_left;
-		} else if (is_node_overlap(data, start, end)) {
-			last_lower = data;
-			break;
-		} else if (start >= data->start) {
-			/* Lowest overlap if any must be on right side */
-			node = node->rb_right;
-		} else {
-			break;
-		}
-	}
-	return last_lower; /* Returns NULL if there is no overlap */
-}
+static struct rb_root_cached memtype_rbroot = RB_ROOT_CACHED;
 
 enum {
 	MEMTYPE_EXACT_MATCH	= 0,
 	MEMTYPE_END_MATCH	= 1
 };
 
-static struct memtype *memtype_rb_match(struct rb_root *root,
+static struct memtype *memtype_rb_match(struct rb_root_cached *root,
 				u64 start, u64 end, int match_type)
 {
 	struct memtype *match;
 
-	match = memtype_rb_lowest_match(root, start, end);
+	match = memtype_interval_iter_first(root, start, end);
 	while (match != NULL && match->start < end) {
-		struct rb_node *node;
-
 		if ((match_type == MEMTYPE_EXACT_MATCH) &&
 		    (match->start == start) && (match->end == end))
 			return match;
@@ -107,26 +59,21 @@ static struct memtype *memtype_rb_match(struct rb_root *root,
 		    (match->start < start) && (match->end == end))
 			return match;
 
-		node = rb_next(&match->rb);
-		if (node)
-			match = rb_entry(node, struct memtype, rb);
-		else
-			match = NULL;
+		match = memtype_interval_iter_next(match, start, end);
 	}
 
 	return NULL; /* Returns NULL if there is no match */
 }
 
-static int memtype_rb_check_conflict(struct rb_root *root,
+static int memtype_rb_check_conflict(struct rb_root_cached *root,
 				u64 start, u64 end,
 				enum page_cache_mode reqtype,
 				enum page_cache_mode *newtype)
 {
-	struct rb_node *node;
 	struct memtype *match;
 	enum page_cache_mode found_type = reqtype;
 
-	match = memtype_rb_lowest_match(&memtype_rbroot, start, end);
+	match = memtype_interval_iter_first(&memtype_rbroot, start, end);
 	if (match == NULL)
 		goto success;
 
@@ -136,19 +83,12 @@ static int memtype_rb_check_conflict(struct rb_root *root,
 	dprintk("Overlap at 0x%Lx-0x%Lx\n", match->start, match->end);
 	found_type = match->type;
 
-	node = rb_next(&match->rb);
-	while (node) {
-		match = rb_entry(node, struct memtype, rb);
-
-		if (match->start >= end) /* Checked all possible matches */
-			goto success;
-
-		if (is_node_overlap(match, start, end) &&
-		    match->type != found_type) {
+	match = memtype_interval_iter_next(match, start, end);
+	while (match) {
+		if (match->type != found_type)
 			goto failure;
-		}
 
-		node = rb_next(&match->rb);
+		match = memtype_interval_iter_next(match, start, end);
 	}
 success:
 	if (newtype)
@@ -163,28 +103,6 @@ static int memtype_rb_check_conflict(struct rb_root *root,
 	return -EBUSY;
 }
 
-static void memtype_rb_insert(struct rb_root *root, struct memtype *newdata)
-{
-	struct rb_node **node = &(root->rb_node);
-	struct rb_node *parent = NULL;
-
-	while (*node) {
-		struct memtype *data = rb_entry(*node, struct memtype, rb);
-
-		parent = *node;
-		if (data->subtree_max_end < newdata->end)
-			data->subtree_max_end = newdata->end;
-		if (newdata->start <= data->start)
-			node = &((*node)->rb_left);
-		else if (newdata->start > data->start)
-			node = &((*node)->rb_right);
-	}
-
-	newdata->subtree_max_end = newdata->end;
-	rb_link_node(&newdata->rb, parent, node);
-	rb_insert_augmented(&newdata->rb, root, &memtype_rb_augment_cb);
-}
-
 int rbt_memtype_check_insert(struct memtype *new,
 			     enum page_cache_mode *ret_type)
 {
@@ -192,14 +110,13 @@ int rbt_memtype_check_insert(struct memtype *new,
 
 	err = memtype_rb_check_conflict(&memtype_rbroot, new->start, new->end,
 						new->type, ret_type);
+	if (err)
+		goto done;
 
-	if (!err) {
-		if (ret_type)
-			new->type = *ret_type;
-
-		new->subtree_max_end = new->end;
-		memtype_rb_insert(&memtype_rbroot, new);
-	}
+	if (ret_type)
+		new->type = *ret_type;
+	memtype_interval_insert(new, &memtype_rbroot);
+done:
 	return err;
 }
 
@@ -225,15 +142,12 @@ struct memtype *rbt_memtype_erase(u64 start, u64 end)
 
 	if (data->start == start) {
 		/* munmap: erase this node */
-		rb_erase_augmented(&data->rb, &memtype_rbroot,
-					&memtype_rb_augment_cb);
+		memtype_interval_remove(data, &memtype_rbroot);
 	} else {
 		/* mremap: update the end value of this node */
-		rb_erase_augmented(&data->rb, &memtype_rbroot,
-					&memtype_rb_augment_cb);
+		memtype_interval_remove(data, &memtype_rbroot);
 		data->end = start;
-		data->subtree_max_end = data->end;
-		memtype_rb_insert(&memtype_rbroot, data);
+		memtype_interval_insert(data, &memtype_rbroot);
 		return NULL;
 	}
 
@@ -242,24 +156,23 @@ struct memtype *rbt_memtype_erase(u64 start, u64 end)
 
 struct memtype *rbt_memtype_lookup(u64 addr)
 {
-	return memtype_rb_lowest_match(&memtype_rbroot, addr, addr + PAGE_SIZE);
+	return memtype_interval_iter_first(&memtype_rbroot, addr, addr + PAGE_SIZE);
 }
 
 #if defined(CONFIG_DEBUG_FS)
 int rbt_memtype_copy_nth_element(struct memtype *out, loff_t pos)
 {
-	struct rb_node *node;
+	struct memtype *match;
 	int i = 1;
 
-	node = rb_first(&memtype_rbroot);
-	while (node && pos != i) {
-		node = rb_next(node);
+	match = memtype_interval_iter_first(&memtype_rbroot, 0, ULONG_MAX);
+	while (match && pos != i) {
+		match = memtype_interval_iter_next(match, 0, ULONG_MAX);
 		i++;
 	}
 
-	if (node) { /* pos == i */
-		struct memtype *this = rb_entry(node, struct memtype, rb);
-		*out = *this;
+	if (match) { /* pos == i */
+		*out = *match;
 		return 0;
 	} else {
 		return 1;
-- 
2.16.4



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (10 preceding siblings ...)
  2019-10-03 20:18 ` [PATCH 11/11] x86/mm, pat: convert pat tree to generic interval tree Davidlohr Bueso
@ 2019-10-03 20:32 ` Matthew Wilcox
  2019-10-03 21:10   ` Davidlohr Bueso
  2019-10-04 12:43   ` Michel Lespinasse
  2019-10-04  0:26 ` Jason Gunthorpe
  12 siblings, 2 replies; 38+ messages in thread
From: Matthew Wilcox @ 2019-10-03 20:32 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma

On Thu, Oct 03, 2019 at 01:18:47PM -0700, Davidlohr Bueso wrote:
> It has been discussed[1,2] that almost all users of interval trees would better
> be served if the intervals were actually not [a,b], but instead [a, b). This

So how does a user represent a range from ULONG_MAX to ULONG_MAX now?

I think the problem is that large parts of the kernel just don't consider
integer overflow.  Because we write in C, it's natural to write:

	for (i = start; i < end; i++)

and just assume that we never need to hit ULONG_MAX or UINT_MAX.
If we're storing addresses, that's generally true -- most architectures
don't allow addresses in the -PAGE_SIZE to ULONG_MAX range (or they'd
have trouble with PTR_ERR).  If you're looking at file sizes, that's
not true on 32-bit machines, and we've definitely seen filesystem bugs
with files nudging up on 16TB (on 32 bit with 4k page size).  Or block
driver bugs with similarly sized block devices.

So, yeah, easier to use.  But damning corner cases.
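
To make that concrete (illustrative only): a one-unit interval at the
very top of the space cannot be expressed as [a, b) in the same
unsigned type without wrapping:

	unsigned long start = ULONG_MAX;
	unsigned long end = start + 1;	/* wraps to 0, [start, end) is empty */

whereas the closed form [ULONG_MAX, ULONG_MAX] had no such problem.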


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 08/11] mm: convert vma_interval_tree to half closed intervals
  2019-10-03 20:18 ` [PATCH 08/11] mm: convert vma_interval_tree " Davidlohr Bueso
@ 2019-10-03 20:41   ` Matthew Wilcox
  2019-10-04 12:30   ` Michel Lespinasse
  1 sibling, 0 replies; 38+ messages in thread
From: Matthew Wilcox @ 2019-10-03 20:41 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, walken, peterz, linux-kernel, linux-mm, dri-devel,
	linux-rdma, Davidlohr Bueso

On Thu, Oct 03, 2019 at 01:18:55PM -0700, Davidlohr Bueso wrote:
> +++ b/mm/nommu.c
> @@ -1793,7 +1793,7 @@ int nommu_shrink_inode_mappings(struct inode *inode, size_t size,
>  	size_t r_size, r_top;
>  
>  	low = newsize >> PAGE_SHIFT;
> -	high = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +	high = (size + PAGE_SIZE) >> PAGE_SHIFT;

Uhh ... are you sure about this?  size is in bytes here, and we're rounding
up to the next page size.  So if size is [1-4096], then we add on 4095 and get
the answer 1.  With your patch, if size is [0-4095], we get the answer 1.
I think you meant:

	high = ((size + PAGE_SIZE - 1) >> PAGE_SHIFT) + 1;



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
  2019-10-03 20:32 ` [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Matthew Wilcox
@ 2019-10-03 21:10   ` Davidlohr Bueso
  2019-10-04 12:43   ` Michel Lespinasse
  1 sibling, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-03 21:10 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma

On Thu, 03 Oct 2019, Matthew Wilcox wrote:

>On Thu, Oct 03, 2019 at 01:18:47PM -0700, Davidlohr Bueso wrote:
>> It has been discussed[1,2] that almost all users of interval trees would better
>> be served if the intervals were actually not [a,b], but instead [a, b). This
>
>So how does a user represent a range from ULONG_MAX to ULONG_MAX now?

I would assume that any such lookups would be stab queries (anon/vma interval
tree). So both anon and files. And yeah, I blissfully ignored any overflow scenarios.
This should at least be documented.

>
>I think the problem is that large parts of the kernel just don't consider
>integer overflow.  Because we write in C, it's natural to write:
>
>	for (i = start; i < end; i++)
>
>and just assume that we never need to hit ULONG_MAX or UINT_MAX.

Similarly, I did not adjust queries such as 0 to ULONG_MAX, which do
actually occur; then again, any intersecting ranges will most likely not
even be close to the end.

>If we're storing addresses, that's generally true -- most architectures
>don't allow addresses in the -PAGE_SIZE to ULONG_MAX range (or they'd
>have trouble with PTR_ERR).  If you're looking at file sizes, that's
>not true on 32-bit machines, and we've definitely seen filesystem bugs
>with files nudging up on 16TB (on 32 bit with 4k page size).  Or block
>driver bugs with similarly sized block devices.
>
>So, yeah, easier to use.  But damning corner cases.

I agree.

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 09/11] lib/interval-tree: convert interval_tree to half closed intervals
  2019-10-03 20:18 ` [PATCH 09/11] lib/interval-tree: convert interval_tree " Davidlohr Bueso
@ 2019-10-03 22:50   ` kbuild test robot
  2019-10-04  6:57   ` Koenig, Christian
  1 sibling, 0 replies; 38+ messages in thread
From: kbuild test robot @ 2019-10-03 22:50 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: kbuild-all, akpm, walken, peterz, linux-kernel, linux-mm,
	dri-devel, linux-rdma, dave, Christian König, Alex Deucher,
	David Airlie, Daniel Vetter, Doug Ledford, Joerg Roedel,
	Jérôme Glisse, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 19110 bytes --]

Hi Davidlohr,

I love your patch! Yet something to improve:

[auto build test ERROR on next-20191003]

url:    https://github.com/0day-ci/linux/commits/Davidlohr-Bueso/lib-interval-tree-move-to-half-closed-intervals/20191004-042411
config: arm64-allyesconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=arm64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/swab.h:5:0,
                    from include/uapi/linux/byteorder/big_endian.h:13,
                    from include/linux/byteorder/big_endian.h:5,
                    from arch/arm64/include/uapi/asm/byteorder.h:21,
                    from include/asm-generic/bitops/le.h:6,
                    from arch/arm64/include/asm/bitops.h:29,
                    from include/linux/bitops.h:26,
                    from include/linux/kernel.h:12,
                    from include/linux/clk.h:13,
                    from include/linux/amba/bus.h:14,
                    from drivers//iommu/virtio-iommu.c:10:
   drivers//iommu/virtio-iommu.c: In function 'viommu_replay_mappings':
>> drivers//iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:130:32: note: in definition of macro '__swab64'
     (__builtin_constant_p((__u64)(x)) ? \
                                   ^
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:24:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x00000000000000ffULL) << 56) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:25:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x000000000000ff00ULL) << 40) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:26:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x0000000000ff0000ULL) << 24) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:27:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x00000000ff000000ULL) <<  8) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
>> drivers//iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:28:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x000000ff00000000ULL) >>  8) | \
               ^
--
   In file included from include/linux/swab.h:5:0,
                    from include/uapi/linux/byteorder/big_endian.h:13,
                    from include/linux/byteorder/big_endian.h:5,
                    from arch/arm64/include/uapi/asm/byteorder.h:21,
                    from include/asm-generic/bitops/le.h:6,
                    from arch/arm64/include/asm/bitops.h:29,
                    from include/linux/bitops.h:26,
                    from include/linux/kernel.h:12,
                    from include/linux/clk.h:13,
                    from include/linux/amba/bus.h:14,
                    from drivers/iommu/virtio-iommu.c:10:
   drivers/iommu/virtio-iommu.c: In function 'viommu_replay_mappings':
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:130:32: note: in definition of macro '__swab64'
     (__builtin_constant_p((__u64)(x)) ? \
                                   ^
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:24:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x00000000000000ffULL) << 56) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:25:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x000000000000ff00ULL) << 40) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:26:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x0000000000ff0000ULL) << 24) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:27:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x00000000ff000000ULL) <<  8) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:28:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x000000ff00000000ULL) >>  8) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:29:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x0000ff0000000000ULL) >> 24) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:30:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0x00ff000000000000ULL) >> 40) | \
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:31:12: note: in definition of macro '___constant_swab64'
     (((__u64)(x) & (__u64)0xff00000000000000ULL) >> 56)))
               ^
>> include/uapi/linux/byteorder/big_endian.h:31:43: note: in expansion of macro '__swab64'
    #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
                                              ^~~~~~~~
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:41: error: 'struct interval_tree_node' has no member named 'last'
       .virt_end = cpu_to_le64(mapping->iova.last),
                                            ^
   include/uapi/linux/swab.h:132:12: note: in definition of macro '__swab64'
     __fswab64(x))
               ^
>> include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
    #define cpu_to_le64 __cpu_to_le64
                        ^~~~~~~~~~~~~
   drivers/iommu/virtio-iommu.c:403:16: note: in expansion of macro 'cpu_to_le64'
       .virt_end = cpu_to_le64(mapping->iova.last),
                   ^~~~~~~~~~~

vim +403 drivers//iommu/virtio-iommu.c

edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  379  
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  380  /*
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  381   * viommu_replay_mappings - re-send MAP requests
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  382   *
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  383   * When reattaching a domain that was previously detached from all endpoints,
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  384   * mappings were deleted from the device. Re-create the mappings available in
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  385   * the internal tree.
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  386   */
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  387  static int viommu_replay_mappings(struct viommu_domain *vdomain)
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  388  {
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  389  	int ret = 0;
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  390  	unsigned long flags;
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  391  	struct viommu_mapping *mapping;
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  392  	struct interval_tree_node *node;
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  393  	struct virtio_iommu_req_map map;
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  394  
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  395  	spin_lock_irqsave(&vdomain->mappings_lock, flags);
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  396  	node = interval_tree_iter_first(&vdomain->mappings, 0, -1UL);
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  397  	while (node) {
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  398  		mapping = container_of(node, struct viommu_mapping, iova);
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  399  		map = (struct virtio_iommu_req_map) {
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  400  			.head.type	= VIRTIO_IOMMU_T_MAP,
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  401  			.domain		= cpu_to_le32(vdomain->id),
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  402  			.virt_start	= cpu_to_le64(mapping->iova.start),
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15 @403  			.virt_end	= cpu_to_le64(mapping->iova.last),
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  404  			.phys_start	= cpu_to_le64(mapping->paddr),
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  405  			.flags		= cpu_to_le32(mapping->flags),
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  406  		};
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  407  
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  408  		ret = viommu_send_req_sync(vdomain->viommu, &map, sizeof(map));
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  409  		if (ret)
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  410  			break;
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  411  
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  412  		node = interval_tree_iter_next(node, 0, -1UL);
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  413  	}
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  414  	spin_unlock_irqrestore(&vdomain->mappings_lock, flags);
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  415  
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  416  	return ret;
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  417  }
edcd69ab9a323b Jean-Philippe Brucker 2019-01-15  418  

:::::: The code at line 403 was first introduced by commit
:::::: edcd69ab9a323b7ac7a86e1c44b6c9c46598391f iommu: Add virtio-iommu driver

:::::: TO: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
:::::: CC: Michael S. Tsirkin <mst@redhat.com>
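
For what it's worth, the failure above appears to come from one remaining use of
the old 'last' member in viommu_replay_mappings() that the conversion missed.
A sketch of the presumably missing hunk, assuming virt_end keeps its existing
inclusive meaning (hence the "- 1"); untested, for illustration only:

    -			.virt_end	= cpu_to_le64(mapping->iova.last),
    +			.virt_end	= cpu_to_le64(mapping->iova.end - 1),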

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 67355 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
  2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
                   ` (11 preceding siblings ...)
  2019-10-03 20:32 ` [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Matthew Wilcox
@ 2019-10-04  0:26 ` Jason Gunthorpe
  2019-10-04  2:48   ` Davidlohr Bueso
  2019-10-04 13:15   ` Michel Lespinasse
  12 siblings, 2 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2019-10-04  0:26 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma

On Thu, Oct 03, 2019 at 01:18:47PM -0700, Davidlohr Bueso wrote:
> Hi,
> 
> It has been discussed[1,2] that almost all users of interval trees would better
> be served if the intervals were actually not [a,b], but instead [a, b). This
> series attempts to convert all callers by way of transitioning from using
> "interval_tree_generic.h" to "interval_tree_gen.h". Once all users are converted,
> we remove the former.
> 
> Patch 1: adds a call that will make patch 8 easier to review by introducing stab
>          queries for the vma interval tree.
> 
> Patch 2: adds the new interval_tree_gen.h which is the same as the old one but
>          uses [a,b) intervals.
> 
> Patch 3-9: converts, in baby steps (as much as possible), each interval tree to
> 	   the new [a,b) one. It is done this way also to maintain bisectability.
> 	   Most conversions are pretty straightforward, however, there are some
> 	   creative ways in which some callers use the interval 'end' when going
> 	   through intersecting ranges within a tree. Ie: patch 3, 6 and 9.
> 
> Patch 10: deletes the interval_tree_generic.h header; there are no longer any users.
> 
> Patch 11: finally simplifies x86 pat tree to use the new interval tree machinery.
> 
> This has been lightly tested, and certainly not on driver paths that do non
> trivial conversions. Also needs more eyeballs as conversions can be easily
> missed (even when I've tried mitigating this by renaming the endpoint from 'last'
> to 'end' in each corresponding structure).
> 
> Because this touches a lot of drivers, I'm Cc'ing the whole thing to a couple of
> relevant lists (mm, dri, rdma); sorry if you consider this spam.
> 
> Applies on top of today's linux-next tree. Please consider for v5.5.
> 
> Thanks!
> 
> [1] https://lore.kernel.org/lkml/CANN689HVDJXKEwB80yPAVwvRwnV4HfiucQVAho=dupKM_iKozw@mail.gmail.com/

Hurm, this is not entirely accurate. Most users do actually want
overlapping and multiple ranges. I just studied this extensively:

radeon_mn actually wants overlapping but seems to misunderstand the
interval_tree API and actively tries hard to prevent overlapping at
great cost and complexity. I have a patch to delete all of this and
just be overlapping.

amdgpu_mn copied the wrongness from radeon_mn

All the DRM drivers are basically the same here, tracking userspace
controlled VAs, so overlapping is essential

hfi1/mmu_rb definitely needs overlapping as it is dealing with
userspace VA ranges under control of userspace. As do the other
infiniband users.

vhost probably doesn't overlap in the normal case, but again userspace
could trigger overlap in some pathological case.
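
(As an aside on the overlap point above, nothing in the lib/interval_tree API
itself prevents overlapping nodes; a minimal sketch with made-up ranges:)

    #include <linux/interval_tree.h>

    static void overlap_demo(void)
    {
    	struct rb_root_cached root = RB_ROOT_CACHED;
    	struct interval_tree_node a = { .start = 0x1000, .last = 0x3fff };
    	struct interval_tree_node b = { .start = 0x2000, .last = 0x4fff };
    	struct interval_tree_node *it;

    	interval_tree_insert(&a, &root);
    	interval_tree_insert(&b, &root);	/* overlaps 'a', and that is fine */

    	/* A stab query at 0x2500 visits both nodes. */
    	for (it = interval_tree_iter_first(&root, 0x2500, 0x2500); it;
    	     it = interval_tree_iter_next(it, 0x2500, 0x2500))
    		;	/* per-node handling here */
    }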

The [start,last] allows the interval to cover up to ULONG_MAX. I don't
know if this is needed however. Many users are using userspace VAs
here. Is there any kernel configuration where ULONG_MAX is a valid
userspace pointer? Ie 32 bit 4G userspace? I don't know. 
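
(To illustrate that representational point, not taken from any driver: with the
inclusive endpoint a node can span the whole unsigned long range, which an
exclusive endpoint cannot encode:)

    struct interval_tree_node whole = {
    	.start = 0,
    	.last  = ULONG_MAX,	/* covers [0, ULONG_MAX] */
    };

    /*
     * A half-open [start, end) version of the same range would need
     * end == ULONG_MAX + 1, which wraps to 0, so there is no valid
     * encoding for an interval whose last element is ULONG_MAX.
     */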

Many users seemed to have bugs where they were taking a userspace-controlled
start + length and converting them into a start/end for the interval
tree without overflow protection (whoops).
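
(A hedged sketch of the kind of guard such conversions would need, using
check_add_overflow() from <linux/overflow.h>, roughly along the lines of what
umem_odp.c already does; the helper name and shape are made up for
illustration:)

    #include <linux/overflow.h>

    /*
     * Illustration only: turn a userspace-supplied start + length into an
     * inclusive [start, last] pair without silently wrapping around.
     */
    static int user_range_to_interval(unsigned long start, unsigned long length,
    				      unsigned long *last)
    {
    	unsigned long end;

    	if (!length)
    		return -EINVAL;
    	if (check_add_overflow(start, length - 1, &end))
    		return -EOVERFLOW;	/* start + length - 1 wrapped */

    	*last = end;
    	return 0;
    }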

Also I have a series already cooking to delete several of these
interval tree users, which will terribly conflict with this :\

Is it really necessary to make such churn for such a tiny API change?

Jason


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
  2019-10-04  0:26 ` Jason Gunthorpe
@ 2019-10-04  2:48   ` Davidlohr Bueso
  2019-10-04 13:15   ` Michel Lespinasse
  1 sibling, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-04  2:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: akpm, walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma

On Thu, 03 Oct 2019, Jason Gunthorpe wrote:

>On Thu, Oct 03, 2019 at 01:18:47PM -0700, Davidlohr Bueso wrote:
>> Hi,
>>
>> It has been discussed[1,2] that almost all users of interval trees would better
>> be served if the intervals were actually not [a,b], but instead [a, b). This
>> series attempts to convert all callers by way of transitioning from using
>> "interval_tree_generic.h" to "interval_tree_gen.h". Once all users are converted,
>> we remove the former.
>>
>> Patch 1: adds a call that will make patch 8 easier to review by introducing stab
>>          queries for the vma interval tree.
>>
>> Patch 2: adds the new interval_tree_gen.h which is the same as the old one but
>>          uses [a,b) intervals.
>>
>> Patch 3-9: converts, in baby steps (as much as possible), each interval tree to
>> 	   the new [a,b) one. It is done this way also to maintain bisectability.
>> 	   Most conversions are pretty straightforward, however, there are some
>> 	   creative ways in which some callers use the interval 'end' when going
>> 	   through intersecting ranges within a tree. Ie: patch 3, 6 and 9.
>>
>> Patch 10: deletes the interval_tree_generic.h header; there are no longer any users.
>>
>> Patch 11: finally simplifies x86 pat tree to use the new interval tree machinery.
>>
>> This has been lightly tested, and certainly not on driver paths that do non
>> trivial conversions. Also needs more eyeballs as conversions can be easily
>> missed (even when I've tried mitigating this by renaming the endpoint from 'last'
>> to 'end' in each corresponding structure).
>>
>> Because this touches a lot of drivers, I'm Cc'ing the whole thing to a couple of
>> relevant lists (mm, dri, rdma); sorry if you consider this spam.
>>
>> Applies on top of today's linux-next tree. Please consider for v5.5.
>>
>> Thanks!
>>
>> [1] https://lore.kernel.org/lkml/CANN689HVDJXKEwB80yPAVwvRwnV4HfiucQVAho=dupKM_iKozw@mail.gmail.com/
>
>Hurm, this is not entirely accurate. Most users do actually want
>overlapping and multiple ranges. I just studied this extensively:
>
>radeon_mn actually wants overlapping but seems to misunderstand the
>interval_tree API and actively tries hard to prevent overlapping at
>great cost and complexity. I have a patch to delete all of this and
>just be overlapping.
>
>amdgpu_mn copied the wrongness from radeon_mn
>
>All the DRM drivers are basically the same here, tracking userspace
>controlled VAs, so overlapping is essential
>
>hfi1/mmu_rb definitely needs overlapping as it is dealing with
>userspace VA ranges under control of userspace. As do the other
>infiniband users.
>
>vhost probably doesn't overlap in the normal case, but again userspace
>could trigger overlap in some pathological case.
>
>The [start,last] allows the interval to cover up to ULONG_MAX. I don't
>know if this is needed however. Many users are using userspace VAs
>here. Is there any kernel configuration where ULONG_MAX is a valid
>userspace pointer? Ie 32 bit 4G userspace? I don't know.
>
>Many users seemed to have bugs where they were taking a userspace-controlled
>start + length and converting them into a start/end for the interval
>tree without overflow protection (whoops).
>
>Also I have a series already cooking to delete several of these
>interval tree users, which will terribly conflict with this :\

I have no problem redoing this after your changes, if it's worth it
at all.

>
>Is it really necessary to make such churn for such a tiny API change?

I agree, and was kind of expecting this. In general the diffstat ended
up being larger than I initially hoped for. Maybe after your removals
I can look into this again.

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/11] drm/amdgpu: convert amdgpu_vm_it to half closed intervals
  2019-10-03 20:18 ` [PATCH 03/11] drm/amdgpu: convert amdgpu_vm_it to half closed intervals Davidlohr Bueso
@ 2019-10-04  6:54   ` Koenig, Christian
  2019-10-04 11:36     ` Michel Lespinasse
  0 siblings, 1 reply; 38+ messages in thread
From: Koenig, Christian @ 2019-10-04  6:54 UTC (permalink / raw)
  To: Davidlohr Bueso, akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Jerome Glisse, Deucher, Alexander, Daniel Vetter, amd-gfx,
	Davidlohr Bueso

On 03.10.19 at 22:18, Davidlohr Bueso wrote:
> The amdgpu_vm interval tree really wants [a, b) intervals,

NAK, we explicitly do need an [a, b[ interval here.

Regards,
Christian.

> not fully closed ones. As such convert it to use the new
> interval_tree_gen.h, and also rename the 'last' endpoint
> in the node to 'end', which is both a more suitable name
> for the half closed interval and also reduces the chances
> of missing a conversion when doing insertion or lookup.
>
> Cc: Jerome Glisse <jglisse@redhat.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: amd-gfx@lists.freedesktop.org
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h  | 18 ++++++------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c    |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c    |  3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 46 +++++++++++++++---------------
>   6 files changed, 36 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 49b767b7238f..290bfe820890 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -756,7 +756,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
>   			}
>   
>   			if ((va_start + chunk_ib->ib_bytes) >
> -			    (m->last + 1) * AMDGPU_GPU_PAGE_SIZE) {
> +			    m->end * AMDGPU_GPU_PAGE_SIZE) {
>   				DRM_ERROR("IB va_start+ib_bytes is invalid\n");
>   				return -EINVAL;
>   			}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 7e99f6c58c48..60b73bc4d11a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -51,7 +51,7 @@ struct amdgpu_bo_va_mapping {
>   	struct list_head		list;
>   	struct rb_node			rb;
>   	uint64_t			start;
> -	uint64_t			last;
> +	uint64_t			end;
>   	uint64_t			__subtree_last;
>   	uint64_t			offset;
>   	uint64_t			flags;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> index 8227ebd0f511..c5b0e88d019c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> @@ -247,7 +247,7 @@ TRACE_EVENT(amdgpu_vm_bo_map,
>   	    TP_STRUCT__entry(
>   			     __field(struct amdgpu_bo *, bo)
>   			     __field(long, start)
> -			     __field(long, last)
> +			     __field(long, end)
>   			     __field(u64, offset)
>   			     __field(u64, flags)
>   			     ),
> @@ -255,12 +255,12 @@ TRACE_EVENT(amdgpu_vm_bo_map,
>   	    TP_fast_assign(
>   			   __entry->bo = bo_va ? bo_va->base.bo : NULL;
>   			   __entry->start = mapping->start;
> -			   __entry->last = mapping->last;
> +			   __entry->end = mapping->end;
>   			   __entry->offset = mapping->offset;
>   			   __entry->flags = mapping->flags;
>   			   ),
> -	    TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
> -		      __entry->bo, __entry->start, __entry->last,
> +	    TP_printk("bo=%p, start=%lx, end=%lx, offset=%010llx, flags=%llx",
> +		      __entry->bo, __entry->start, __entry->end,
>   		      __entry->offset, __entry->flags)
>   );
>   
> @@ -271,7 +271,7 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
>   	    TP_STRUCT__entry(
>   			     __field(struct amdgpu_bo *, bo)
>   			     __field(long, start)
> -			     __field(long, last)
> +			     __field(long, end)
>   			     __field(u64, offset)
>   			     __field(u64, flags)
>   			     ),
> @@ -279,12 +279,12 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
>   	    TP_fast_assign(
>   			   __entry->bo = bo_va ? bo_va->base.bo : NULL;
>   			   __entry->start = mapping->start;
> -			   __entry->last = mapping->last;
> +			   __entry->end = mapping->end;
>   			   __entry->offset = mapping->offset;
>   			   __entry->flags = mapping->flags;
>   			   ),
> -	    TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
> -		      __entry->bo, __entry->start, __entry->last,
> +	    TP_printk("bo=%p, start=%lx, end=%lx, offset=%010llx, flags=%llx",
> +		      __entry->bo, __entry->start, __entry->end,
>   		      __entry->offset, __entry->flags)
>   );
>   
> @@ -299,7 +299,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
>   
>   	    TP_fast_assign(
>   			   __entry->soffset = mapping->start;
> -			   __entry->eoffset = mapping->last + 1;
> +			   __entry->eoffset = mapping->end;
>   			   __entry->flags = mapping->flags;
>   			   ),
>   	    TP_printk("soffs=%010llx, eoffs=%010llx, flags=%llx",
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index b2c364b8695f..8094dd0b0332 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -819,7 +819,7 @@ static int amdgpu_uvd_cs_pass2(struct amdgpu_uvd_cs_ctx *ctx)
>   
>   	start = amdgpu_bo_gpu_offset(bo);
>   
> -	end = (mapping->last + 1 - mapping->start);
> +	end = mapping->end - mapping->start;
>   	end = end * AMDGPU_GPU_PAGE_SIZE + start;
>   
>   	addr -= mapping->start * AMDGPU_GPU_PAGE_SIZE;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> index b70b3c45bb29..d6511bf446df 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> @@ -642,8 +642,7 @@ static int amdgpu_vce_cs_reloc(struct amdgpu_cs_parser *p, uint32_t ib_idx,
>   		return r;
>   	}
>   
> -	if ((addr + (uint64_t)size) >
> -	    (mapping->last + 1) * AMDGPU_GPU_PAGE_SIZE) {
> +	if ((addr + (uint64_t)size) > mapping->end * AMDGPU_GPU_PAGE_SIZE) {
>   		DRM_ERROR("BO to small for addr 0x%010Lx %d %d\n",
>   			  addr, lo, hi);
>   		return -EINVAL;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index a2c797e34a29..2f618017617e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -26,7 +26,7 @@
>    *          Jerome Glisse
>    */
>   #include <linux/dma-fence-array.h>
> -#include <linux/interval_tree_generic.h>
> +#include <linux/interval_tree_gen.h>
>   #include <linux/idr.h>
>   
>   #include <drm/amdgpu_drm.h>
> @@ -58,13 +58,13 @@
>    */
>   
>   #define START(node) ((node)->start)
> -#define LAST(node) ((node)->last)
> +#define END(node) ((node)->end)
>   
>   INTERVAL_TREE_DEFINE(struct amdgpu_bo_va_mapping, rb, uint64_t, __subtree_last,
> -		     START, LAST, static, amdgpu_vm_it)
> +		     START, END, static, amdgpu_vm_it)
>   
>   #undef START
> -#undef LAST
> +#undef END
>   
>   /**
>    * struct amdgpu_prt_cb - Helper to disable partial resident texture feature from a fence callback
> @@ -1616,7 +1616,7 @@ static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev,
>   	do {
>   		dma_addr_t *dma_addr = NULL;
>   		uint64_t max_entries;
> -		uint64_t addr, last;
> +		uint64_t addr, end;
>   
>   		if (nodes) {
>   			addr = nodes->start << PAGE_SHIFT;
> @@ -1654,21 +1654,21 @@ static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev,
>   			addr += pfn << PAGE_SHIFT;
>   		}
>   
> -		last = min((uint64_t)mapping->last, start + max_entries - 1);
> +		end = min((uint64_t)mapping->end, start + max_entries);
>   		r = amdgpu_vm_bo_update_mapping(adev, vm, false, exclusive,
> -						start, last, flags, addr,
> +						start, end, flags, addr,
>   						dma_addr, fence);
>   		if (r)
>   			return r;
>   
> -		pfn += (last - start + 1) / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
> +		pfn += (end - start) / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>   		if (nodes && nodes->size == pfn) {
>   			pfn = 0;
>   			++nodes;
>   		}
> -		start = last + 1;
> +		start = end;
>   
> -	} while (unlikely(start != mapping->last + 1));
> +	} while (unlikely(start != mapping->end));
>   
>   	return 0;
>   }
> @@ -1946,7 +1946,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>   			init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>   
>   		r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
> -						mapping->start, mapping->last,
> +						mapping->start, mapping->end,
>   						init_pte_value, 0, NULL, &f);
>   		amdgpu_vm_free_mapping(adev, vm, mapping, f);
>   		if (r) {
> @@ -2129,7 +2129,7 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
>   		return -EINVAL;
>   
>   	/* make sure object fit at this offset */
> -	eaddr = saddr + size - 1;
> +	eaddr = saddr + size;
>   	if (saddr >= eaddr ||
>   	    (bo && offset + size > amdgpu_bo_size(bo)))
>   		return -EINVAL;
> @@ -2142,7 +2142,7 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
>   		/* bo and tmp overlap, invalid addr */
>   		dev_err(adev->dev, "bo %p va 0x%010Lx-0x%010Lx conflict with "
>   			"0x%010Lx-0x%010Lx\n", bo, saddr, eaddr,
> -			tmp->start, tmp->last + 1);
> +			tmp->start, tmp->end);
>   		return -EINVAL;
>   	}
>   
> @@ -2151,7 +2151,7 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
>   		return -ENOMEM;
>   
>   	mapping->start = saddr;
> -	mapping->last = eaddr;
> +	mapping->end = eaddr;
>   	mapping->offset = offset;
>   	mapping->flags = flags;
>   
> @@ -2194,7 +2194,7 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,
>   		return -EINVAL;
>   
>   	/* make sure object fit at this offset */
> -	eaddr = saddr + size - 1;
> +	eaddr = saddr + size;
>   	if (saddr >= eaddr ||
>   	    (bo && offset + size > amdgpu_bo_size(bo)))
>   		return -EINVAL;
> @@ -2214,7 +2214,7 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,
>   	eaddr /= AMDGPU_GPU_PAGE_SIZE;
>   
>   	mapping->start = saddr;
> -	mapping->last = eaddr;
> +	mapping->end = eaddr;
>   	mapping->offset = offset;
>   	mapping->flags = flags;
>   
> @@ -2299,7 +2299,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
>   	LIST_HEAD(removed);
>   	uint64_t eaddr;
>   
> -	eaddr = saddr + size - 1;
> +	eaddr = saddr + size;
>   	saddr /= AMDGPU_GPU_PAGE_SIZE;
>   	eaddr /= AMDGPU_GPU_PAGE_SIZE;
>   
> @@ -2322,7 +2322,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
>   		/* Remember mapping split at the start */
>   		if (tmp->start < saddr) {
>   			before->start = tmp->start;
> -			before->last = saddr - 1;
> +			before->end = saddr;
>   			before->offset = tmp->offset;
>   			before->flags = tmp->flags;
>   			before->bo_va = tmp->bo_va;
> @@ -2330,9 +2330,9 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
>   		}
>   
>   		/* Remember mapping split at the end */
> -		if (tmp->last > eaddr) {
> -			after->start = eaddr + 1;
> -			after->last = tmp->last;
> +		if (tmp->end > eaddr) {
> +			after->start = eaddr;
> +			after->end = tmp->end;
>   			after->offset = tmp->offset;
>   			after->offset += after->start - tmp->start;
>   			after->flags = tmp->flags;
> @@ -2353,8 +2353,8 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
>   
>   		if (tmp->start < saddr)
>   		    tmp->start = saddr;
> -		if (tmp->last > eaddr)
> -		    tmp->last = eaddr;
> +		if (tmp->end > eaddr)
> +		    tmp->end = eaddr;
>   
>   		tmp->bo_va = NULL;
>   		list_add(&tmp->list, &vm->freed);


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 09/11] lib/interval-tree: convert interval_tree to half closed intervals
  2019-10-03 20:18 ` [PATCH 09/11] lib/interval-tree: convert interval_tree " Davidlohr Bueso
  2019-10-03 22:50   ` kbuild test robot
@ 2019-10-04  6:57   ` Koenig, Christian
  2019-10-04  7:20     ` Koenig, Christian
  1 sibling, 1 reply; 38+ messages in thread
From: Koenig, Christian @ 2019-10-04  6:57 UTC (permalink / raw)
  To: Davidlohr Bueso, akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Deucher, Alexander, David Airlie, Daniel Vetter, Doug Ledford,
	Joerg Roedel, Jérôme Glisse, Davidlohr Bueso

On 03.10.19 at 22:18, Davidlohr Bueso wrote:
> The generic interval tree really wants [a, b) intervals, not fully closed.
> As such convert it to use the new interval_tree_gen.h. Most of the
> conversions are straightforward, with the exception of perhaps
> radeon_vm_bo_set_addr(), but the intent has been to leave the semantics
> untouched.

NAK, the whole thing won't work.

See, we need to handle the full device address space, which means we have
values in the range 0x0-0xffffffff.

If you make this a closed interval then the end would wrap around to 0x0
if long is only 32 bits.
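
(A minimal arithmetic illustration of that wrap-around, assuming a 32-bit
unsigned long and made-up values:)

    /* A mapping whose last valid device address is 0xffffffff. */
    unsigned long last = 0xffffffffUL;	/* fine as an inclusive endpoint */
    unsigned long end  = last + 1;	/* an exclusive endpoint wraps to 0x0 */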

Regards,
Christian.

>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Doug Ledford <dledford@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: linux-rdma@vger.kernel.org
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c      | 12 +++---------
>   drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  5 +----
>   drivers/gpu/drm/radeon/radeon_mn.c          | 11 ++++-------
>   drivers/gpu/drm/radeon/radeon_trace.h       |  2 +-
>   drivers/gpu/drm/radeon/radeon_vm.c          | 26 +++++++++++++-------------
>   drivers/infiniband/core/umem_odp.c          | 21 +++++++--------------
>   drivers/iommu/virtio-iommu.c                |  6 +++---
>   include/linux/interval_tree.h               |  2 +-
>   include/rdma/ib_umem_odp.h                  |  4 ++--
>   lib/interval_tree.c                         |  6 +++---
>   10 files changed, 38 insertions(+), 57 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> index 31d4deb5d294..4bbaa9ffa570 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> @@ -205,9 +205,6 @@ amdgpu_mn_sync_pagetables_gfx(struct hmm_mirror *mirror,
>   	bool blockable = mmu_notifier_range_blockable(update);
>   	struct interval_tree_node *it;
>   
> -	/* notification is exclusive, but interval is inclusive */
> -	end -= 1;
> -
>   	/* TODO we should be able to split locking for interval tree and
>   	 * amdgpu_mn_invalidate_node
>   	 */
> @@ -254,9 +251,6 @@ amdgpu_mn_sync_pagetables_hsa(struct hmm_mirror *mirror,
>   	bool blockable = mmu_notifier_range_blockable(update);
>   	struct interval_tree_node *it;
>   
> -	/* notification is exclusive, but interval is inclusive */
> -	end -= 1;
> -
>   	if (amdgpu_mn_read_lock(amn, blockable))
>   		return -EAGAIN;
>   
> @@ -374,7 +368,7 @@ struct amdgpu_mn *amdgpu_mn_get(struct amdgpu_device *adev,
>    */
>   int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
>   {
> -	unsigned long end = addr + amdgpu_bo_size(bo) - 1;
> +	unsigned long end = addr + amdgpu_bo_size(bo);
>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>   	enum amdgpu_mn_type type =
>   		bo->kfd_bo ? AMDGPU_MN_TYPE_HSA : AMDGPU_MN_TYPE_GFX;
> @@ -400,7 +394,7 @@ int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
>   		node = container_of(it, struct amdgpu_mn_node, it);
>   		interval_tree_remove(&node->it, &amn->objects);
>   		addr = min(it->start, addr);
> -		end = max(it->last, end);
> +		end = max(it->end, end);
>   		list_splice(&node->bos, &bos);
>   	}
>   
> @@ -412,7 +406,7 @@ int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
>   	bo->mn = amn;
>   
>   	node->it.start = addr;
> -	node->it.last = end;
> +	node->it.end = end;
>   	INIT_LIST_HEAD(&node->bos);
>   	list_splice(&bos, &node->bos);
>   	list_add(&bo->mn_list, &node->bos);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> index 11b231c187c5..818ff6b33102 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> @@ -99,9 +99,6 @@ userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
>   	if (RB_EMPTY_ROOT(&mn->objects.rb_root))
>   		return 0;
>   
> -	/* interval ranges are inclusive, but invalidate range is exclusive */
> -	end = range->end - 1;
> -
>   	spin_lock(&mn->lock);
>   	it = interval_tree_iter_first(&mn->objects, range->start, end);
>   	while (it) {
> @@ -275,7 +272,7 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
>   	mo->mn = mn;
>   	mo->obj = obj;
>   	mo->it.start = obj->userptr.ptr;
> -	mo->it.last = obj->userptr.ptr + obj->base.size - 1;
> +	mo->it.end = obj->userptr.ptr + obj->base.size;
>   	RB_CLEAR_NODE(&mo->it.rb);
>   
>   	obj->userptr.mmu_object = mo;
> diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
> index dbab9a3a969b..4810421dacc2 100644
> --- a/drivers/gpu/drm/radeon/radeon_mn.c
> +++ b/drivers/gpu/drm/radeon/radeon_mn.c
> @@ -66,12 +66,9 @@ static int radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
>   	struct radeon_mn *rmn = container_of(mn, struct radeon_mn, mn);
>   	struct ttm_operation_ctx ctx = { false, false };
>   	struct interval_tree_node *it;
> -	unsigned long end;
> +	unsigned long end = range->end;
>   	int ret = 0;
>   
> -	/* notification is exclusive, but interval is inclusive */
> -	end = range->end - 1;
> -
>   	/* TODO we should be able to split locking for interval tree and
>   	 * the tear down.
>   	 */
> @@ -174,7 +171,7 @@ static const struct mmu_notifier_ops radeon_mn_ops = {
>    */
>   int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
>   {
> -	unsigned long end = addr + radeon_bo_size(bo) - 1;
> +	unsigned long end = addr + radeon_bo_size(bo);
>   	struct mmu_notifier *mn;
>   	struct radeon_mn *rmn;
>   	struct radeon_mn_node *node = NULL;
> @@ -195,7 +192,7 @@ int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
>   		node = container_of(it, struct radeon_mn_node, it);
>   		interval_tree_remove(&node->it, &rmn->objects);
>   		addr = min(it->start, addr);
> -		end = max(it->last, end);
> +		end = max(it->end, end);
>   		list_splice(&node->bos, &bos);
>   	}
>   
> @@ -210,7 +207,7 @@ int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
>   	bo->mn = rmn;
>   
>   	node->it.start = addr;
> -	node->it.last = end;
> +	node->it.end = end;
>   	INIT_LIST_HEAD(&node->bos);
>   	list_splice(&bos, &node->bos);
>   	list_add(&bo->mn_list, &node->bos);
> diff --git a/drivers/gpu/drm/radeon/radeon_trace.h b/drivers/gpu/drm/radeon/radeon_trace.h
> index c93f3ab3c4e3..4324f3fc5d82 100644
> --- a/drivers/gpu/drm/radeon/radeon_trace.h
> +++ b/drivers/gpu/drm/radeon/radeon_trace.h
> @@ -73,7 +73,7 @@ TRACE_EVENT(radeon_vm_bo_update,
>   
>   	    TP_fast_assign(
>   			   __entry->soffset = bo_va->it.start;
> -			   __entry->eoffset = bo_va->it.last + 1;
> +			   __entry->eoffset = bo_va->it.end;
>   			   __entry->flags = bo_va->flags;
>   			   ),
>   	    TP_printk("soffs=%010llx, eoffs=%010llx, flags=%08x",
> diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
> index e0ad547786e8..b2a54aff21f4 100644
> --- a/drivers/gpu/drm/radeon/radeon_vm.c
> +++ b/drivers/gpu/drm/radeon/radeon_vm.c
> @@ -329,7 +329,7 @@ struct radeon_bo_va *radeon_vm_bo_add(struct radeon_device *rdev,
>   	bo_va->vm = vm;
>   	bo_va->bo = bo;
>   	bo_va->it.start = 0;
> -	bo_va->it.last = 0;
> +	bo_va->it.end = 0;
>   	bo_va->flags = 0;
>   	bo_va->ref_count = 1;
>   	INIT_LIST_HEAD(&bo_va->bo_list);
> @@ -456,7 +456,7 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
>   
>   	if (soffset) {
>   		/* make sure object fit at this offset */
> -		eoffset = soffset + size - 1;
> +		eoffset = soffset + size;
>   		if (soffset >= eoffset) {
>   			r = -EINVAL;
>   			goto error_unreserve;
> @@ -486,14 +486,14 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
>   			/* bo and tmp overlap, invalid offset */
>   			dev_err(rdev->dev, "bo %p va 0x%010Lx conflict with "
>   				"(bo %p 0x%010lx 0x%010lx)\n", bo_va->bo,
> -				soffset, tmp->bo, tmp->it.start, tmp->it.last);
> +				soffset, tmp->bo, tmp->it.start, tmp->it.end);
>   			mutex_unlock(&vm->mutex);
>   			r = -EINVAL;
>   			goto error_unreserve;
>   		}
>   	}
>   
> -	if (bo_va->it.start || bo_va->it.last) {
> +	if (bo_va->it.start || bo_va->it.end) {
>   		/* add a clone of the bo_va to clear the old address */
>   		struct radeon_bo_va *tmp;
>   		tmp = kzalloc(sizeof(struct radeon_bo_va), GFP_KERNEL);
> @@ -503,14 +503,14 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
>   			goto error_unreserve;
>   		}
>   		tmp->it.start = bo_va->it.start;
> -		tmp->it.last = bo_va->it.last;
> +		tmp->it.end = bo_va->it.end;
>   		tmp->vm = vm;
>   		tmp->bo = radeon_bo_ref(bo_va->bo);
>   
>   		interval_tree_remove(&bo_va->it, &vm->va);
>   		spin_lock(&vm->status_lock);
>   		bo_va->it.start = 0;
> -		bo_va->it.last = 0;
> +		bo_va->it.end = 0;
>   		list_del_init(&bo_va->vm_status);
>   		list_add(&tmp->vm_status, &vm->freed);
>   		spin_unlock(&vm->status_lock);
> @@ -519,7 +519,7 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
>   	if (soffset || eoffset) {
>   		spin_lock(&vm->status_lock);
>   		bo_va->it.start = soffset;
> -		bo_va->it.last = eoffset;
> +		bo_va->it.end = eoffset;
>   		list_add(&bo_va->vm_status, &vm->cleared);
>   		spin_unlock(&vm->status_lock);
>   		interval_tree_insert(&bo_va->it, &vm->va);
> @@ -964,7 +964,7 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
>   
>   	trace_radeon_vm_bo_update(bo_va);
>   
> -	nptes = bo_va->it.last - bo_va->it.start + 1;
> +	nptes = bo_va->it.end - bo_va->it.start;
>   
>   	/* reserve space for one command every (1 << BLOCK_SIZE) entries
>   	   or 2k dwords (whatever is smaller) */
> @@ -1010,7 +1010,7 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
>   	}
>   
>   	r = radeon_vm_update_ptes(rdev, vm, &ib, bo_va->it.start,
> -				  bo_va->it.last + 1, addr,
> +				  bo_va->it.end, addr,
>   				  radeon_vm_page_flags(bo_va->flags));
>   	if (r) {
>   		radeon_ib_free(rdev, &ib);
> @@ -1026,7 +1026,7 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
>   		return r;
>   	}
>   	ib.fence->is_vm_update = true;
> -	radeon_vm_fence_pts(vm, bo_va->it.start, bo_va->it.last + 1, ib.fence);
> +	radeon_vm_fence_pts(vm, bo_va->it.start, bo_va->it.end, ib.fence);
>   	radeon_fence_unref(&bo_va->last_pt_update);
>   	bo_va->last_pt_update = radeon_fence_ref(ib.fence);
>   	radeon_ib_free(rdev, &ib);
> @@ -1124,12 +1124,12 @@ void radeon_vm_bo_rmv(struct radeon_device *rdev,
>   	list_del(&bo_va->bo_list);
>   
>   	mutex_lock(&vm->mutex);
> -	if (bo_va->it.start || bo_va->it.last)
> +	if (bo_va->it.start || bo_va->it.end)
>   		interval_tree_remove(&bo_va->it, &vm->va);
>   
>   	spin_lock(&vm->status_lock);
>   	list_del(&bo_va->vm_status);
> -	if (bo_va->it.start || bo_va->it.last) {
> +	if (bo_va->it.start || bo_va->it.end) {
>   		bo_va->bo = radeon_bo_ref(bo_va->bo);
>   		list_add(&bo_va->vm_status, &vm->freed);
>   	} else {
> @@ -1158,7 +1158,7 @@ void radeon_vm_bo_invalidate(struct radeon_device *rdev,
>   	list_for_each_entry(bo_va, &bo->va, bo_list) {
>   		spin_lock(&bo_va->vm->status_lock);
>   		if (list_empty(&bo_va->vm_status) &&
> -		    (bo_va->it.start || bo_va->it.last))
> +		    (bo_va->it.start || bo_va->it.end))
>   			list_add(&bo_va->vm_status, &bo_va->vm->invalidated);
>   		spin_unlock(&bo_va->vm->status_lock);
>   	}
> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> index f67a30fda1ed..9b7ac93223d6 100644
> --- a/drivers/infiniband/core/umem_odp.c
> +++ b/drivers/infiniband/core/umem_odp.c
> @@ -219,26 +219,19 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp)
>   			ALIGN_DOWN(umem_odp->umem.address, page_size);
>   		if (check_add_overflow(umem_odp->umem.address,
>   				       (unsigned long)umem_odp->umem.length,
> -				       &umem_odp->interval_tree.last))
> +				       &umem_odp->interval_tree.end))
>   			return -EOVERFLOW;
> -		umem_odp->interval_tree.last =
> -			ALIGN(umem_odp->interval_tree.last, page_size);
> -		if (unlikely(umem_odp->interval_tree.last < page_size))
> +		umem_odp->interval_tree.end =
> +			ALIGN(umem_odp->interval_tree.end, page_size);
> +		if (unlikely(umem_odp->interval_tree.end < page_size))
>   			return -EOVERFLOW;
>   
> -		pages = (umem_odp->interval_tree.last -
> +		pages = (umem_odp->interval_tree.end -
>   			 umem_odp->interval_tree.start) >>
>   			umem_odp->page_shift;
>   		if (!pages)
>   			return -EINVAL;
>   
> -		/*
> -		 * Note that the representation of the intervals in the
> -		 * interval tree considers the ending point as contained in
> -		 * the interval.
> -		 */
> -		umem_odp->interval_tree.last--;
> -
>   		umem_odp->page_list = kvcalloc(
>   			pages, sizeof(*umem_odp->page_list), GFP_KERNEL);
>   		if (!umem_odp->page_list)
> @@ -777,12 +770,12 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root,
>   	if (unlikely(start == last))
>   		return ret_val;
>   
> -	for (node = interval_tree_iter_first(root, start, last - 1);
> +	for (node = interval_tree_iter_first(root, start, last);
>   			node; node = next) {
>   		/* TODO move the blockable decision up to the callback */
>   		if (!blockable)
>   			return -EAGAIN;
> -		next = interval_tree_iter_next(node, start, last - 1);
> +		next = interval_tree_iter_next(node, start, last);
>   		umem = container_of(node, struct ib_umem_odp, interval_tree);
>   		ret_val = cb(umem, start, last, cookie) || ret_val;
>   	}
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 3ea9d7682999..771740981f43 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -323,7 +323,7 @@ static int viommu_add_mapping(struct viommu_domain *vdomain, unsigned long iova,
>   
>   	mapping->paddr		= paddr;
>   	mapping->iova.start	= iova;
> -	mapping->iova.last	= iova + size - 1;
> +	mapping->iova.end	= iova + size;
>   	mapping->flags		= flags;
>   
>   	spin_lock_irqsave(&vdomain->mappings_lock, irqflags);
> @@ -348,7 +348,7 @@ static size_t viommu_del_mappings(struct viommu_domain *vdomain,
>   {
>   	size_t unmapped = 0;
>   	unsigned long flags;
> -	unsigned long last = iova + size - 1;
> +	unsigned long last = iova + size;
>   	struct viommu_mapping *mapping = NULL;
>   	struct interval_tree_node *node, *next;
>   
> @@ -367,7 +367,7 @@ static size_t viommu_del_mappings(struct viommu_domain *vdomain,
>   		 * Virtio-iommu doesn't allow UNMAP to split a mapping created
>   		 * with a single MAP request, so remove the full mapping.
>   		 */
> -		unmapped += mapping->iova.last - mapping->iova.start + 1;
> +		unmapped += mapping->iova.end - mapping->iova.start;
>   
>   		interval_tree_remove(node, &vdomain->mappings);
>   		kfree(mapping);
> diff --git a/include/linux/interval_tree.h b/include/linux/interval_tree.h
> index 288c26f50732..f3d1ea9e4003 100644
> --- a/include/linux/interval_tree.h
> +++ b/include/linux/interval_tree.h
> @@ -7,7 +7,7 @@
>   struct interval_tree_node {
>   	struct rb_node rb;
>   	unsigned long start;	/* Start of interval */
> -	unsigned long last;	/* Last location _in_ interval */
> +	unsigned long end;	/* Last location _outside_ interval */
>   	unsigned long __subtree_last;
>   };
>   
> diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
> index 253df1a1fa54..d0ae7aa2139b 100644
> --- a/include/rdma/ib_umem_odp.h
> +++ b/include/rdma/ib_umem_odp.h
> @@ -97,7 +97,7 @@ static inline unsigned long ib_umem_start(struct ib_umem_odp *umem_odp)
>   /* Returns the address of the page after the last one of an ODP umem. */
>   static inline unsigned long ib_umem_end(struct ib_umem_odp *umem_odp)
>   {
> -	return umem_odp->interval_tree.last + 1;
> +	return umem_odp->interval_tree.end;
>   }
>   
>   static inline size_t ib_umem_odp_num_pages(struct ib_umem_odp *umem_odp)
> @@ -165,7 +165,7 @@ rbt_ib_umem_lookup(struct rb_root_cached *root, u64 addr, u64 length)
>   {
>   	struct interval_tree_node *node;
>   
> -	node = interval_tree_iter_first(root, addr, addr + length - 1);
> +	node = interval_tree_iter_first(root, addr, addr + length);
>   	if (!node)
>   		return NULL;
>   	return container_of(node, struct ib_umem_odp, interval_tree);
> diff --git a/lib/interval_tree.c b/lib/interval_tree.c
> index 593ce56ece50..336ec5113a28 100644
> --- a/lib/interval_tree.c
> +++ b/lib/interval_tree.c
> @@ -1,15 +1,15 @@
>   // SPDX-License-Identifier: GPL-2.0-only
>   #include <linux/interval_tree.h>
> -#include <linux/interval_tree_generic.h>
> +#include <linux/interval_tree_gen.h>
>   #include <linux/compiler.h>
>   #include <linux/export.h>
>   
>   #define START(node) ((node)->start)
> -#define LAST(node)  ((node)->last)
> +#define END(node)  ((node)->end)
>   
>   INTERVAL_TREE_DEFINE(struct interval_tree_node, rb,
>   		     unsigned long, __subtree_last,
> -		     START, LAST,, interval_tree)
> +		     START, END,, interval_tree)
>   
>   EXPORT_SYMBOL_GPL(interval_tree_insert);
>   EXPORT_SYMBOL_GPL(interval_tree_remove);


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 09/11] lib/interval-tree: convert interval_tree to half closed intervals
  2019-10-04  6:57   ` Koenig, Christian
@ 2019-10-04  7:20     ` Koenig, Christian
  2019-10-08 16:59       ` Davidlohr Bueso
  0 siblings, 1 reply; 38+ messages in thread
From: Koenig, Christian @ 2019-10-04  7:20 UTC (permalink / raw)
  To: Davidlohr Bueso, akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Deucher, Alexander, David Airlie, Daniel Vetter, Doug Ledford,
	Joerg Roedel, Jérôme Glisse, Davidlohr Bueso

On 04.10.19 at 08:57, Christian König wrote:
> On 03.10.19 at 22:18, Davidlohr Bueso wrote:
>> The generic interval tree really wants [a, b) intervals, not fully closed.
>> As such convert it to use the new interval_tree_gen.h. Most of the
>> conversions are straightforward, with the exception of perhaps
>> radeon_vm_bo_set_addr(), but the intent has been to leave the semantics
>> untouched.
>
> NAK, the whole thing won't work.
>
> See we need to handle the full device address space which means we 
> have values in the range of 0x0-0xffffffff.
>
> If you make this a closed interval then the end would wrap around to 
> 0x0 if long is only 32bit.

Well, I've just now re-read the subject line. From that it sounds like
you are actually trying to fix the interval tree to use a half-closed
interval, e.g. something like [a, b[.

But your code changes sometimes don't seem to reflect that.

Regards,
Christian.

>
> Regards,
> Christian.
>
>>
>> Cc: "Christian König" <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: David Airlie <airlied@linux.ie>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> Cc: Doug Ledford <dledford@redhat.com>
>> Cc: Joerg Roedel <joro@8bytes.org>
>> Cc: "Jérôme Glisse" <jglisse@redhat.com>
>> Cc: dri-devel@lists.freedesktop.org
>> Cc: linux-rdma@vger.kernel.org
>> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c      | 12 +++---------
>>   drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  5 +----
>>   drivers/gpu/drm/radeon/radeon_mn.c          | 11 ++++-------
>>   drivers/gpu/drm/radeon/radeon_trace.h       |  2 +-
>>   drivers/gpu/drm/radeon/radeon_vm.c          | 26 
>> +++++++++++++-------------
>>   drivers/infiniband/core/umem_odp.c          | 21 +++++++--------------
>>   drivers/iommu/virtio-iommu.c                |  6 +++---
>>   include/linux/interval_tree.h               |  2 +-
>>   include/rdma/ib_umem_odp.h                  |  4 ++--
>>   lib/interval_tree.c                         |  6 +++---
>>   10 files changed, 38 insertions(+), 57 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> index 31d4deb5d294..4bbaa9ffa570 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> @@ -205,9 +205,6 @@ amdgpu_mn_sync_pagetables_gfx(struct hmm_mirror 
>> *mirror,
>>       bool blockable = mmu_notifier_range_blockable(update);
>>       struct interval_tree_node *it;
>>   -    /* notification is exclusive, but interval is inclusive */
>> -    end -= 1;
>> -
>>       /* TODO we should be able to split locking for interval tree and
>>        * amdgpu_mn_invalidate_node
>>        */
>> @@ -254,9 +251,6 @@ amdgpu_mn_sync_pagetables_hsa(struct hmm_mirror 
>> *mirror,
>>       bool blockable = mmu_notifier_range_blockable(update);
>>       struct interval_tree_node *it;
>>   -    /* notification is exclusive, but interval is inclusive */
>> -    end -= 1;
>> -
>>       if (amdgpu_mn_read_lock(amn, blockable))
>>           return -EAGAIN;
>>   @@ -374,7 +368,7 @@ struct amdgpu_mn *amdgpu_mn_get(struct 
>> amdgpu_device *adev,
>>    */
>>   int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
>>   {
>> -    unsigned long end = addr + amdgpu_bo_size(bo) - 1;
>> +    unsigned long end = addr + amdgpu_bo_size(bo);
>>       struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>       enum amdgpu_mn_type type =
>>           bo->kfd_bo ? AMDGPU_MN_TYPE_HSA : AMDGPU_MN_TYPE_GFX;
>> @@ -400,7 +394,7 @@ int amdgpu_mn_register(struct amdgpu_bo *bo, 
>> unsigned long addr)
>>           node = container_of(it, struct amdgpu_mn_node, it);
>>           interval_tree_remove(&node->it, &amn->objects);
>>           addr = min(it->start, addr);
>> -        end = max(it->last, end);
>> +        end = max(it->end, end);
>>           list_splice(&node->bos, &bos);
>>       }
>>   @@ -412,7 +406,7 @@ int amdgpu_mn_register(struct amdgpu_bo *bo, 
>> unsigned long addr)
>>       bo->mn = amn;
>>         node->it.start = addr;
>> -    node->it.last = end;
>> +    node->it.end = end;
>>       INIT_LIST_HEAD(&node->bos);
>>       list_splice(&bos, &node->bos);
>>       list_add(&bo->mn_list, &node->bos);
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
>> index 11b231c187c5..818ff6b33102 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
>> @@ -99,9 +99,6 @@ userptr_mn_invalidate_range_start(struct 
>> mmu_notifier *_mn,
>>       if (RB_EMPTY_ROOT(&mn->objects.rb_root))
>>           return 0;
>>   -    /* interval ranges are inclusive, but invalidate range is 
>> exclusive */
>> -    end = range->end - 1;
>> -
>>       spin_lock(&mn->lock);
>>       it = interval_tree_iter_first(&mn->objects, range->start, end);
>>       while (it) {
>> @@ -275,7 +272,7 @@ i915_gem_userptr_init__mmu_notifier(struct 
>> drm_i915_gem_object *obj,
>>       mo->mn = mn;
>>       mo->obj = obj;
>>       mo->it.start = obj->userptr.ptr;
>> -    mo->it.last = obj->userptr.ptr + obj->base.size - 1;
>> +    mo->it.end = obj->userptr.ptr + obj->base.size;
>>       RB_CLEAR_NODE(&mo->it.rb);
>>         obj->userptr.mmu_object = mo;
>> diff --git a/drivers/gpu/drm/radeon/radeon_mn.c 
>> b/drivers/gpu/drm/radeon/radeon_mn.c
>> index dbab9a3a969b..4810421dacc2 100644
>> --- a/drivers/gpu/drm/radeon/radeon_mn.c
>> +++ b/drivers/gpu/drm/radeon/radeon_mn.c
>> @@ -66,12 +66,9 @@ static int radeon_mn_invalidate_range_start(struct 
>> mmu_notifier *mn,
>>       struct radeon_mn *rmn = container_of(mn, struct radeon_mn, mn);
>>       struct ttm_operation_ctx ctx = { false, false };
>>       struct interval_tree_node *it;
>> -    unsigned long end;
>> +    unsigned long end = range->end;
>>       int ret = 0;
>>   -    /* notification is exclusive, but interval is inclusive */
>> -    end = range->end - 1;
>> -
>>       /* TODO we should be able to split locking for interval tree and
>>        * the tear down.
>>        */
>> @@ -174,7 +171,7 @@ static const struct mmu_notifier_ops 
>> radeon_mn_ops = {
>>    */
>>   int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
>>   {
>> -    unsigned long end = addr + radeon_bo_size(bo) - 1;
>> +    unsigned long end = addr + radeon_bo_size(bo);
>>       struct mmu_notifier *mn;
>>       struct radeon_mn *rmn;
>>       struct radeon_mn_node *node = NULL;
>> @@ -195,7 +192,7 @@ int radeon_mn_register(struct radeon_bo *bo, 
>> unsigned long addr)
>>           node = container_of(it, struct radeon_mn_node, it);
>>           interval_tree_remove(&node->it, &rmn->objects);
>>           addr = min(it->start, addr);
>> -        end = max(it->last, end);
>> +        end = max(it->end, end);
>>           list_splice(&node->bos, &bos);
>>       }
>>   @@ -210,7 +207,7 @@ int radeon_mn_register(struct radeon_bo *bo, 
>> unsigned long addr)
>>       bo->mn = rmn;
>>         node->it.start = addr;
>> -    node->it.last = end;
>> +    node->it.end = end;
>>       INIT_LIST_HEAD(&node->bos);
>>       list_splice(&bos, &node->bos);
>>       list_add(&bo->mn_list, &node->bos);
>> diff --git a/drivers/gpu/drm/radeon/radeon_trace.h 
>> b/drivers/gpu/drm/radeon/radeon_trace.h
>> index c93f3ab3c4e3..4324f3fc5d82 100644
>> --- a/drivers/gpu/drm/radeon/radeon_trace.h
>> +++ b/drivers/gpu/drm/radeon/radeon_trace.h
>> @@ -73,7 +73,7 @@ TRACE_EVENT(radeon_vm_bo_update,
>>             TP_fast_assign(
>>                  __entry->soffset = bo_va->it.start;
>> -               __entry->eoffset = bo_va->it.last + 1;
>> +               __entry->eoffset = bo_va->it.end;
>>                  __entry->flags = bo_va->flags;
>>                  ),
>>           TP_printk("soffs=%010llx, eoffs=%010llx, flags=%08x",
>> diff --git a/drivers/gpu/drm/radeon/radeon_vm.c 
>> b/drivers/gpu/drm/radeon/radeon_vm.c
>> index e0ad547786e8..b2a54aff21f4 100644
>> --- a/drivers/gpu/drm/radeon/radeon_vm.c
>> +++ b/drivers/gpu/drm/radeon/radeon_vm.c
>> @@ -329,7 +329,7 @@ struct radeon_bo_va *radeon_vm_bo_add(struct 
>> radeon_device *rdev,
>>       bo_va->vm = vm;
>>       bo_va->bo = bo;
>>       bo_va->it.start = 0;
>> -    bo_va->it.last = 0;
>> +    bo_va->it.end = 0;
>>       bo_va->flags = 0;
>>       bo_va->ref_count = 1;
>>       INIT_LIST_HEAD(&bo_va->bo_list);
>> @@ -456,7 +456,7 @@ int radeon_vm_bo_set_addr(struct radeon_device 
>> *rdev,
>>         if (soffset) {
>>           /* make sure object fit at this offset */
>> -        eoffset = soffset + size - 1;
>> +        eoffset = soffset + size;
>>           if (soffset >= eoffset) {
>>               r = -EINVAL;
>>               goto error_unreserve;
>> @@ -486,14 +486,14 @@ int radeon_vm_bo_set_addr(struct radeon_device 
>> *rdev,
>>               /* bo and tmp overlap, invalid offset */
>>               dev_err(rdev->dev, "bo %p va 0x%010Lx conflict with "
>>                   "(bo %p 0x%010lx 0x%010lx)\n", bo_va->bo,
>> -                soffset, tmp->bo, tmp->it.start, tmp->it.last);
>> +                soffset, tmp->bo, tmp->it.start, tmp->it.end);
>>               mutex_unlock(&vm->mutex);
>>               r = -EINVAL;
>>               goto error_unreserve;
>>           }
>>       }
>>   -    if (bo_va->it.start || bo_va->it.last) {
>> +    if (bo_va->it.start || bo_va->it.end) {
>>           /* add a clone of the bo_va to clear the old address */
>>           struct radeon_bo_va *tmp;
>>           tmp = kzalloc(sizeof(struct radeon_bo_va), GFP_KERNEL);
>> @@ -503,14 +503,14 @@ int radeon_vm_bo_set_addr(struct radeon_device 
>> *rdev,
>>               goto error_unreserve;
>>           }
>>           tmp->it.start = bo_va->it.start;
>> -        tmp->it.last = bo_va->it.last;
>> +        tmp->it.end = bo_va->it.end;
>>           tmp->vm = vm;
>>           tmp->bo = radeon_bo_ref(bo_va->bo);
>>             interval_tree_remove(&bo_va->it, &vm->va);
>>           spin_lock(&vm->status_lock);
>>           bo_va->it.start = 0;
>> -        bo_va->it.last = 0;
>> +        bo_va->it.end = 0;
>>           list_del_init(&bo_va->vm_status);
>>           list_add(&tmp->vm_status, &vm->freed);
>>           spin_unlock(&vm->status_lock);
>> @@ -519,7 +519,7 @@ int radeon_vm_bo_set_addr(struct radeon_device 
>> *rdev,
>>       if (soffset || eoffset) {
>>           spin_lock(&vm->status_lock);
>>           bo_va->it.start = soffset;
>> -        bo_va->it.last = eoffset;
>> +        bo_va->it.end = eoffset;
>>           list_add(&bo_va->vm_status, &vm->cleared);
>>           spin_unlock(&vm->status_lock);
>>           interval_tree_insert(&bo_va->it, &vm->va);
>> @@ -964,7 +964,7 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
>>         trace_radeon_vm_bo_update(bo_va);
>>   -    nptes = bo_va->it.last - bo_va->it.start + 1;
>> +    nptes = bo_va->it.end - bo_va->it.start;
>>         /* reserve space for one command every (1 << BLOCK_SIZE) entries
>>          or 2k dwords (whatever is smaller) */
>> @@ -1010,7 +1010,7 @@ int radeon_vm_bo_update(struct radeon_device 
>> *rdev,
>>       }
>>         r = radeon_vm_update_ptes(rdev, vm, &ib, bo_va->it.start,
>> -                  bo_va->it.last + 1, addr,
>> +                  bo_va->it.end, addr,
>>                     radeon_vm_page_flags(bo_va->flags));
>>       if (r) {
>>           radeon_ib_free(rdev, &ib);
>> @@ -1026,7 +1026,7 @@ int radeon_vm_bo_update(struct radeon_device 
>> *rdev,
>>           return r;
>>       }
>>       ib.fence->is_vm_update = true;
>> -    radeon_vm_fence_pts(vm, bo_va->it.start, bo_va->it.last + 1, 
>> ib.fence);
>> +    radeon_vm_fence_pts(vm, bo_va->it.start, bo_va->it.end, ib.fence);
>>       radeon_fence_unref(&bo_va->last_pt_update);
>>       bo_va->last_pt_update = radeon_fence_ref(ib.fence);
>>       radeon_ib_free(rdev, &ib);
>> @@ -1124,12 +1124,12 @@ void radeon_vm_bo_rmv(struct radeon_device 
>> *rdev,
>>       list_del(&bo_va->bo_list);
>>         mutex_lock(&vm->mutex);
>> -    if (bo_va->it.start || bo_va->it.last)
>> +    if (bo_va->it.start || bo_va->it.end)
>>           interval_tree_remove(&bo_va->it, &vm->va);
>>         spin_lock(&vm->status_lock);
>>       list_del(&bo_va->vm_status);
>> -    if (bo_va->it.start || bo_va->it.last) {
>> +    if (bo_va->it.start || bo_va->it.end) {
>>           bo_va->bo = radeon_bo_ref(bo_va->bo);
>>           list_add(&bo_va->vm_status, &vm->freed);
>>       } else {
>> @@ -1158,7 +1158,7 @@ void radeon_vm_bo_invalidate(struct 
>> radeon_device *rdev,
>>       list_for_each_entry(bo_va, &bo->va, bo_list) {
>>           spin_lock(&bo_va->vm->status_lock);
>>           if (list_empty(&bo_va->vm_status) &&
>> -            (bo_va->it.start || bo_va->it.last))
>> +            (bo_va->it.start || bo_va->it.end))
>>               list_add(&bo_va->vm_status, &bo_va->vm->invalidated);
>>           spin_unlock(&bo_va->vm->status_lock);
>>       }
>> diff --git a/drivers/infiniband/core/umem_odp.c 
>> b/drivers/infiniband/core/umem_odp.c
>> index f67a30fda1ed..9b7ac93223d6 100644
>> --- a/drivers/infiniband/core/umem_odp.c
>> +++ b/drivers/infiniband/core/umem_odp.c
>> @@ -219,26 +219,19 @@ static inline int ib_init_umem_odp(struct 
>> ib_umem_odp *umem_odp)
>>               ALIGN_DOWN(umem_odp->umem.address, page_size);
>>           if (check_add_overflow(umem_odp->umem.address,
>>                          (unsigned long)umem_odp->umem.length,
>> -                       &umem_odp->interval_tree.last))
>> +                       &umem_odp->interval_tree.end))
>>               return -EOVERFLOW;
>> -        umem_odp->interval_tree.last =
>> -            ALIGN(umem_odp->interval_tree.last, page_size);
>> -        if (unlikely(umem_odp->interval_tree.last < page_size))
>> +        umem_odp->interval_tree.end =
>> +            ALIGN(umem_odp->interval_tree.end, page_size);
>> +        if (unlikely(umem_odp->interval_tree.end < page_size))
>>               return -EOVERFLOW;
>>   -        pages = (umem_odp->interval_tree.last -
>> +        pages = (umem_odp->interval_tree.end -
>>                umem_odp->interval_tree.start) >>
>>               umem_odp->page_shift;
>>           if (!pages)
>>               return -EINVAL;
>>   -        /*
>> -         * Note that the representation of the intervals in the
>> -         * interval tree considers the ending point as contained in
>> -         * the interval.
>> -         */
>> -        umem_odp->interval_tree.last--;
>> -
>>           umem_odp->page_list = kvcalloc(
>>               pages, sizeof(*umem_odp->page_list), GFP_KERNEL);
>>           if (!umem_odp->page_list)
>> @@ -777,12 +770,12 @@ int rbt_ib_umem_for_each_in_range(struct 
>> rb_root_cached *root,
>>       if (unlikely(start == last))
>>           return ret_val;
>>   -    for (node = interval_tree_iter_first(root, start, last - 1);
>> +    for (node = interval_tree_iter_first(root, start, last);
>>               node; node = next) {
>>           /* TODO move the blockable decision up to the callback */
>>           if (!blockable)
>>               return -EAGAIN;
>> -        next = interval_tree_iter_next(node, start, last - 1);
>> +        next = interval_tree_iter_next(node, start, last);
>>           umem = container_of(node, struct ib_umem_odp, interval_tree);
>>           ret_val = cb(umem, start, last, cookie) || ret_val;
>>       }
>> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
>> index 3ea9d7682999..771740981f43 100644
>> --- a/drivers/iommu/virtio-iommu.c
>> +++ b/drivers/iommu/virtio-iommu.c
>> @@ -323,7 +323,7 @@ static int viommu_add_mapping(struct 
>> viommu_domain *vdomain, unsigned long iova,
>>         mapping->paddr        = paddr;
>>       mapping->iova.start    = iova;
>> -    mapping->iova.last    = iova + size - 1;
>> +    mapping->iova.end    = iova + size;
>>       mapping->flags        = flags;
>>         spin_lock_irqsave(&vdomain->mappings_lock, irqflags);
>> @@ -348,7 +348,7 @@ static size_t viommu_del_mappings(struct 
>> viommu_domain *vdomain,
>>   {
>>       size_t unmapped = 0;
>>       unsigned long flags;
>> -    unsigned long last = iova + size - 1;
>> +    unsigned long last = iova + size;
>>       struct viommu_mapping *mapping = NULL;
>>       struct interval_tree_node *node, *next;
>>   @@ -367,7 +367,7 @@ static size_t viommu_del_mappings(struct 
>> viommu_domain *vdomain,
>>            * Virtio-iommu doesn't allow UNMAP to split a mapping created
>>            * with a single MAP request, so remove the full mapping.
>>            */
>> -        unmapped += mapping->iova.last - mapping->iova.start + 1;
>> +        unmapped += mapping->iova.end - mapping->iova.start;
>>             interval_tree_remove(node, &vdomain->mappings);
>>           kfree(mapping);
>> diff --git a/include/linux/interval_tree.h 
>> b/include/linux/interval_tree.h
>> index 288c26f50732..f3d1ea9e4003 100644
>> --- a/include/linux/interval_tree.h
>> +++ b/include/linux/interval_tree.h
>> @@ -7,7 +7,7 @@
>>   struct interval_tree_node {
>>       struct rb_node rb;
>>       unsigned long start;    /* Start of interval */
>> -    unsigned long last;    /* Last location _in_ interval */
>> +    unsigned long end;    /* Last location _outside_ interval */
>>       unsigned long __subtree_last;
>>   };
>>   diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
>> index 253df1a1fa54..d0ae7aa2139b 100644
>> --- a/include/rdma/ib_umem_odp.h
>> +++ b/include/rdma/ib_umem_odp.h
>> @@ -97,7 +97,7 @@ static inline unsigned long ib_umem_start(struct 
>> ib_umem_odp *umem_odp)
>>   /* Returns the address of the page after the last one of an ODP 
>> umem. */
>>   static inline unsigned long ib_umem_end(struct ib_umem_odp *umem_odp)
>>   {
>> -    return umem_odp->interval_tree.last + 1;
>> +    return umem_odp->interval_tree.end;
>>   }
>>     static inline size_t ib_umem_odp_num_pages(struct ib_umem_odp 
>> *umem_odp)
>> @@ -165,7 +165,7 @@ rbt_ib_umem_lookup(struct rb_root_cached *root, 
>> u64 addr, u64 length)
>>   {
>>       struct interval_tree_node *node;
>>   -    node = interval_tree_iter_first(root, addr, addr + length - 1);
>> +    node = interval_tree_iter_first(root, addr, addr + length);
>>       if (!node)
>>           return NULL;
>>       return container_of(node, struct ib_umem_odp, interval_tree);
>> diff --git a/lib/interval_tree.c b/lib/interval_tree.c
>> index 593ce56ece50..336ec5113a28 100644
>> --- a/lib/interval_tree.c
>> +++ b/lib/interval_tree.c
>> @@ -1,15 +1,15 @@
>>   // SPDX-License-Identifier: GPL-2.0-only
>>   #include <linux/interval_tree.h>
>> -#include <linux/interval_tree_generic.h>
>> +#include <linux/interval_tree_gen.h>
>>   #include <linux/compiler.h>
>>   #include <linux/export.h>
>>     #define START(node) ((node)->start)
>> -#define LAST(node)  ((node)->last)
>> +#define END(node)  ((node)->end)
>>     INTERVAL_TREE_DEFINE(struct interval_tree_node, rb,
>>                unsigned long, __subtree_last,
>> -             START, LAST,, interval_tree)
>> +             START, END,, interval_tree)
>>     EXPORT_SYMBOL_GPL(interval_tree_insert);
>>   EXPORT_SYMBOL_GPL(interval_tree_remove);
>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 02/11] lib/interval-tree: add an equivalent tree with [a,b) intervals
  2019-10-03 20:18 ` [PATCH 02/11] lib/interval-tree: add an equivalent tree with [a,b) intervals Davidlohr Bueso
@ 2019-10-04 11:02   ` Michel Lespinasse
  0 siblings, 0 replies; 38+ messages in thread
From: Michel Lespinasse @ 2019-10-04 11:02 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Davidlohr Bueso

On Thu, Oct 03, 2019 at 01:18:49PM -0700, Davidlohr Bueso wrote:
> +/*									      \
> + * Iterate over intervals intersecting [start;end)			      \
> + *									      \
> + * Note that a node's interval intersects [start;end) iff:		      \
> + *   Cond1: ITSTART(node) < end						      \
> + * and									      \
> + *   Cond2: start < ITEND(node)						      \
> + */									      \
> +									      \
> +static ITSTRUCT *							      \
> +ITPREFIX ## _subtree_search(ITSTRUCT *node, ITTYPE start, ITTYPE end)	      \
> +{									      \
> +	while (true) {							      \
> +		/*							      \
> +		 * Loop invariant: start <= node->ITSUBTREE		      \
Should be start < node->ITSUBTREE
> +		 * (Cond2 is satisfied by one of the subtree nodes)	      \
> +		 */							      \
> +		if (node->ITRB.rb_left) {				      \
> +			ITSTRUCT *left = rb_entry(node->ITRB.rb_left,	      \
> +						  ITSTRUCT, ITRB);	      \
> +			if (start < left->ITSUBTREE) {			      \
> +				/*					      \
> +				 * Some nodes in left subtree satisfy Cond2.  \
> +				 * Iterate to find the leftmost such node N.  \
> +				 * If it also satisfies Cond1, that's the     \
> +				 * match we are looking for. Otherwise, there \
> +				 * is no matching interval as nodes to the    \
> +				 * right of N can't satisfy Cond1 either.     \
> +				 */					      \
> +				node = left;				      \
> +				continue;				      \
> +			}						      \
> +		}							      \
> +		if (ITSTART(node) < end) {		/* Cond1 */	      \
> +			if (start < ITEND(node))	/* Cond2 */	      \
> +				return node;	/* node is leftmost match */  \
> +			if (node->ITRB.rb_right) {			      \
> +				node = rb_entry(node->ITRB.rb_right,	      \
> +						ITSTRUCT, ITRB);	      \
> +				if (start <= node->ITSUBTREE)		      \
Should be start < node->ITSUBTREE
> +					continue;			      \
> +			}						      \
> +		}							      \
> +		return NULL;	/* No match */				      \
> +	}								      \
> +}									      \
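
For readers following along, a stand-alone sketch of what Cond1/Cond2 encode
for half-open intervals, and of why the subtree bound needs the strict '<'
noted above (names and types are illustrative, not the kernel macros):

#include <stdbool.h>

static bool it_overlaps(unsigned long node_start, unsigned long node_end,
			unsigned long start, unsigned long end)
{
	return node_start < end &&	/* Cond1 */
	       start < node_end;	/* Cond2 */
}

/*
 * A subtree can only contain a match if one of its nodes satisfies Cond2,
 * i.e. start < max(ITEND(node)) over the subtree.  When that maximum is
 * exactly 'start', no interval in the subtree can overlap [start;end),
 * hence the strict comparison.
 */
static bool subtree_may_match(unsigned long subtree_end, unsigned long start)
{
	return start < subtree_end;
}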

Other than that, the change looks good to me.

This is something I might use, regardless of the status of converting
other current users.

The name "interval_tree_gen.h" makes it ambiguous wether gen stands
for "generic" or "generator". This may sounds like a criticism,
but it's not - I think it really stands for both :)

Reviewed-by: Michel Lespinasse <walken@google.com>

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/11] drm/amdgpu: convert amdgpu_vm_it to half closed intervals
  2019-10-04  6:54   ` Koenig, Christian
@ 2019-10-04 11:36     ` Michel Lespinasse
  2019-10-04 12:39       ` Christian König
  0 siblings, 1 reply; 38+ messages in thread
From: Michel Lespinasse @ 2019-10-04 11:36 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: Davidlohr Bueso, akpm, peterz, linux-kernel, linux-mm, dri-devel,
	linux-rdma, Jerome Glisse, Deucher, Alexander, Daniel Vetter,
	amd-gfx, Davidlohr Bueso

On Fri, Oct 04, 2019 at 06:54:54AM +0000, Koenig, Christian wrote:
> On 03.10.19 at 22:18, Davidlohr Bueso wrote:
> > The amdgpu_vm interval tree really wants [a, b) intervals,
> 
> NAK, we explicitly do need an [a, b[ interval here.

Hi Christian,

Just wanted to confirm where you stand on this patch, since I think
you reconsidered your initial position after first looking at 9/11
from this series.

I do not know the amdgpu code well, but I think the changes should be
fine - in struct amdgpu_bo_va_mapping, the "end" field will hold what
was previously stored in the "last" field, plus one. The expectation
is that overflows should not be an issue there, as "end" is explicitly
declared as an uint64, and as the code was previously computing
"last + 1" in many places.

Does that seem workable to you ?

Thanks,

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 05/11] IB/hfi1: convert __mmu_int_rb to half closed intervals
  2019-10-03 20:18 ` [PATCH 05/11] IB/hfi1: convert __mmu_int_rb " Davidlohr Bueso
@ 2019-10-04 11:50   ` Michel Lespinasse
  2019-10-04 19:41     ` Davidlohr Bueso
  0 siblings, 1 reply; 38+ messages in thread
From: Michel Lespinasse @ 2019-10-04 11:50 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Davidlohr Bueso

On Thu, Oct 03, 2019 at 01:18:52PM -0700, Davidlohr Bueso wrote:
> diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
> index 14d2a90964c3..fb6382b2d44e 100644
> --- a/drivers/infiniband/hw/hfi1/mmu_rb.c
> +++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
> @@ -47,7 +47,7 @@
>  #include <linux/list.h>
>  #include <linux/rculist.h>
>  #include <linux/mmu_notifier.h>
> -#include <linux/interval_tree_generic.h>
> +#include <linux/interval_tree_gen.h>
>  
>  #include "mmu_rb.h"
>  #include "trace.h"
> @@ -89,7 +89,7 @@ static unsigned long mmu_node_start(struct mmu_rb_node *node)
>  
>  static unsigned long mmu_node_last(struct mmu_rb_node *node)
>  {
> -	return PAGE_ALIGN(node->addr + node->len) - 1;
> +	return PAGE_ALIGN(node->addr + node->len);
>  }

May as well rename the function to mmu_node_end(). I was worried it might
be used elsewhere, but it turned out it's only used when defining the
interval tree.
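
i.e. roughly (a sketch of the rename only, the body is taken verbatim from
the hunk above):

static unsigned long mmu_node_end(struct mmu_rb_node *node)
{
	return PAGE_ALIGN(node->addr + node->len);
}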

I would also suggest moving this function (as well as mmu_node_start)
right before its use, rather than just after, which would allow you to
also remove the function prototype a few lines earlier.

Looks good to me otherwise.

Reviewed-by: Michel Lespinasse <walken@google.com>

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 07/11] vhost: convert vhost_umem_interval_tree to half closed intervals
  2019-10-03 20:18 ` [PATCH 07/11] vhost: convert vhost_umem_interval_tree " Davidlohr Bueso
@ 2019-10-04 12:10   ` Michel Lespinasse
  2019-10-04 19:44     ` Davidlohr Bueso
  2019-10-10  5:49   ` Jason Wang
  1 sibling, 1 reply; 38+ messages in thread
From: Michel Lespinasse @ 2019-10-04 12:10 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Michael, Jason Wang, virtualization, Davidlohr Bueso

On Thu, Oct 03, 2019 at 01:18:54PM -0700, Davidlohr Bueso wrote:
> @@ -1320,15 +1320,14 @@ static bool iotlb_access_ok(struct vhost_virtqueue *vq,
>  {
>  	const struct vhost_umem_node *node;
>  	struct vhost_umem *umem = vq->iotlb;
> -	u64 s = 0, size, orig_addr = addr, last = addr + len - 1;
> +	u64 s = 0, size, orig_addr = addr, last = addr + len;

maybe "end" or "end_addr" instead of "last".

> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index e9ed2722b633..bb36cb9ed5ec 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -53,13 +53,13 @@ struct vhost_log {
>  };
>  
>  #define START(node) ((node)->start)
> -#define LAST(node) ((node)->last)
> +#define END(node) ((node)->end)
>  
>  struct vhost_umem_node {
>  	struct rb_node rb;
>  	struct list_head link;
>  	__u64 start;
> -	__u64 last;
> +	__u64 end;
>  	__u64 size;
>  	__u64 userspace_addr;
>  	__u32 perm;

Preferably also rename __subtree_last to __subtree_end

Looks good to me otherwise.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 08/11] mm: convert vma_interval_tree to half closed intervals
  2019-10-03 20:18 ` [PATCH 08/11] mm: convert vma_interval_tree " Davidlohr Bueso
  2019-10-03 20:41   ` Matthew Wilcox
@ 2019-10-04 12:30   ` Michel Lespinasse
  1 sibling, 0 replies; 38+ messages in thread
From: Michel Lespinasse @ 2019-10-04 12:30 UTC (permalink / raw)
  To: Davidlohr Bueso, Matthew Wilcox
  Cc: akpm, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Davidlohr Bueso

On Thu, Oct 03, 2019 at 01:18:55PM -0700, Davidlohr Bueso wrote:
> The vma and anon vma interval tree really wants [a, b) intervals,
> not fully closed. As such convert it to use the new
> interval_tree_gen.h. Because of vma_last_pgoff(), the conversion
> is quite straightforward.

I am not certain if we need to worry about integer overflow here.
The problem case would be accessing the last block of a file that is
exactly 16TB long, on an arch where long (and thus pgoff_t) is 32-bit.
Maybe FS folks can tell us whether that case is currently supported,
or if we can just not worry about it ?
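
For concreteness, a rough sketch of that corner case (assuming a 4K page
size, so pgoff_t values are page indices; the names are mine):

static void pgoff_corner_case(void)
{
	/* A file of exactly 16TB has 2^32 4K pages, so its last page sits
	 * at pgoff 0xffffffff, i.e. ULONG_MAX when long is 32-bit.
	 */
	unsigned long last_pgoff = 0xffffffffUL;

	unsigned long closed_last   = last_pgoff;	/* representable     */
	unsigned long half_open_end = last_pgoff + 1;	/* wraps around to 0 */
}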

I would also want to rename the fields in struct zap_details into
start_index and end_index so we can verify we don't leave any
off-by-one uses.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/11] drm/amdgpu: convert amdgpu_vm_it to half closed intervals
  2019-10-04 11:36     ` Michel Lespinasse
@ 2019-10-04 12:39       ` Christian König
  0 siblings, 0 replies; 38+ messages in thread
From: Christian König @ 2019-10-04 12:39 UTC (permalink / raw)
  To: Michel Lespinasse, Koenig, Christian
  Cc: Davidlohr Bueso, Davidlohr Bueso, peterz, linux-kernel,
	dri-devel, linux-mm, Jerome Glisse, amd-gfx, Daniel Vetter,
	Deucher, Alexander, akpm, linux-rdma

Hi Michel,

On 04.10.19 at 13:36, Michel Lespinasse wrote:
> On Fri, Oct 04, 2019 at 06:54:54AM +0000, Koenig, Christian wrote:
>> On 03.10.19 at 22:18, Davidlohr Bueso wrote:
>>> The amdgpu_vm interval tree really wants [a, b) intervals,
>> NAK, we explicitly do need an [a, b[ interval here.
> Hi Christian,
>
> Just wanted to confirm where you stand on this patch, since I think
> you reconsidered your initial position after first looking at 9/11
> from this series.
>
> I do not know the amdgpu code well, but I think the changes should be
> fine - in struct amdgpu_bo_va_mapping, the "end" field will hold what
> was previously stored in the "last" field, plus one. The expectation
> is that overflows should not be an issue there, as "end" is explicitly
> declared as an uint64, and as the code was previously computing
> "last + 1" in many places.
>
> Does that seem workable to you ?

No, we computed last + 1 in a couple of debug places where it doesn't 
hurt us and IIRC we currently cheat a bit because we use pfn instead of 
addresses in some other places.

But that is only a leftover from radeon and we need to fix that sooner 
or later, because essentially the physical address space of the device is 
really the full 64 bits, e.g. 0x0-0xffffffffffffffff.

So that only fits into a 64bit int when we use half open/closed 
intervals, but would wrap around to zero if we use a closed interval.

I initially thought that the set was changing the interval tree into 
always using a closed interval, but that seems to have been a 
misunderstanding.

Regards,
Christian.

>
> Thanks,
>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
  2019-10-03 20:32 ` [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Matthew Wilcox
  2019-10-03 21:10   ` Davidlohr Bueso
@ 2019-10-04 12:43   ` Michel Lespinasse
  1 sibling, 0 replies; 38+ messages in thread
From: Michel Lespinasse @ 2019-10-04 12:43 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Davidlohr Bueso, akpm, peterz, linux-kernel, linux-mm, dri-devel,
	linux-rdma

On Thu, Oct 03, 2019 at 01:32:50PM -0700, Matthew Wilcox wrote:
> On Thu, Oct 03, 2019 at 01:18:47PM -0700, Davidlohr Bueso wrote:
> > It has been discussed[1,2] that almost all users of interval trees would better
> > be served if the intervals were actually not [a,b], but instead [a, b). This
> 
> So how does a user represent a range from ULONG_MAX to ULONG_MAX now?
> 
> I think the problem is that large parts of the kernel just don't consider
> integer overflow.  Because we write in C, it's natural to write:
> 
> 	for (i = start; i < end; i++)
> 
> and just assume that we never need to hit ULONG_MAX or UINT_MAX.
> If we're storing addresses, that's generally true -- most architectures
> don't allow addresses in the -PAGE_SIZE to ULONG_MAX range (or they'd
> have trouble with PTR_ERR).  If you're looking at file sizes, that's
> not true on 32-bit machines, and we've definitely seen filesystem bugs
> with files nudging up on 16TB (on 32 bit with 4k page size).  Or block
> driver bugs with similarly sized block devices.
> 
> So, yeah, easier to use.  But damning corner cases.

Yeah, I wanted to ask - is the case where pgoff == ULONG_MAX (i.e.,
last block of a file that is exactly 16TB) currently supported on
32-bit archs ?
I have no idea if I am supposed to care about this or not...

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
  2019-10-04  0:26 ` Jason Gunthorpe
  2019-10-04  2:48   ` Davidlohr Bueso
@ 2019-10-04 13:15   ` Michel Lespinasse
  2019-10-04 16:03     ` Matthew Wilcox
  2019-10-04 17:45     ` Jason Gunthorpe
  1 sibling, 2 replies; 38+ messages in thread
From: Michel Lespinasse @ 2019-10-04 13:15 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Davidlohr Bueso, Andrew Morton, Peter Zijlstra, LKML, linux-mm,
	dri-devel, linux-rdma

Hi Jason,

On Thu, Oct 3, 2019 at 5:26 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> Hurm, this is not entirely accurate. Most users do actually want
> overlapping and multiple ranges. I just studied this extensively:

(Just curious, are you the person we discussed this with after the
Maple Tree talk at LPC 2019 ?)

I think we have two separate API problems there:
- overlapping vs non-overlapping intervals (the interval tree API
supports overlapping intervals, but some users are confused about
this)
- closed vs half-open interval definitions

It looks like you have been looking mostly at the first issue, which I
expect could simplify several interval tree users considerably, while
Davidlohr is addressing the second issue here.

> radeon_mn actually wants overlapping but seems to misunderstand the
> interval_tree API and actively tries hard to prevent overlapping at
> great cost and complexity. I have a patch to delete all of this and
> just be overlapping.
>
> amdgpu_mn copied the wrongness from radeon_mn
>
> All the DRM drivers are basically the same here, tracking userspace
> controlled VAs, so overlapping is essential
>
> hfi1/mmu_rb definitely needs overlapping as it is dealing with
> userspace VA ranges under control of userspace. As do the other
> infiniband users.

Do you have a handle on what usnic is doing with its intervals ?
usnic_uiom_insert_interval() has some complicated logic to avoid
having overlapping intervals, which is very confusing to me.

> vhost probably doesn't overlap in the normal case, but again userspace
> could trigger overlap in some pathological case.
>
> The [start,last] form allows the interval to cover up to ULONG_MAX. I
> don't know if this is needed, however. Many users are using userspace
> VAs here. Is there any kernel configuration where ULONG_MAX is a valid
> userspace pointer? I.e. a 32-bit 4G userspace? I don't know.
>
> Many users seemed to have bugs where they were taking a
> userspace-controlled start + length and converting them into a
> start/end for the interval tree without overflow protection (woops)
>
> Also I have a series already cooking to delete several of these
> interval tree users, which will terribly conflict with this :\
>
> Is it really necessary to make such churn for such a tiny API change?

My take is that this (Davidlohr's) patch series does not necessarily
need to be applied all at once - we could get the first change in
(adding the interval_tree_gen.h header), and convert the first few
users, without getting them all at once, as long as we have a plan for
finishing the work. So, if you have cleanups in progress in some of
the files, just tell us which ones and we can leave them out from the
first pass.

Thanks,

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
  2019-10-04 13:15   ` Michel Lespinasse
@ 2019-10-04 16:03     ` Matthew Wilcox
  2019-10-04 19:35       ` Davidlohr Bueso
  2019-10-04 17:45     ` Jason Gunthorpe
  1 sibling, 1 reply; 38+ messages in thread
From: Matthew Wilcox @ 2019-10-04 16:03 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Jason Gunthorpe, Davidlohr Bueso, Andrew Morton, Peter Zijlstra,
	LKML, linux-mm, dri-devel, linux-rdma

On Fri, Oct 04, 2019 at 06:15:11AM -0700, Michel Lespinasse wrote:
> My take is that this (Davidlohr's) patch series does not necessarily
> need to be applied all at once - we could get the first change in
> (adding the interval_tree_gen.h header), and convert the first few
> users, without getting them all at once, as long as we have a plan for
> finishing the work. So, if you have cleanups in progress in some of
> the files, just tell us which ones and we can leave them out from the
> first pass.

Since we have users which do need to use the full ULONG_MAX range
(as pointed out by Christian Koenig), I don't think adding a second
implementation which is half-open is a good idea.  It'll only lead to
confusion.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
  2019-10-04 13:15   ` Michel Lespinasse
  2019-10-04 16:03     ` Matthew Wilcox
@ 2019-10-04 17:45     ` Jason Gunthorpe
  1 sibling, 0 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2019-10-04 17:45 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Davidlohr Bueso, Andrew Morton, Peter Zijlstra, LKML, linux-mm,
	dri-devel, linux-rdma

On Fri, Oct 04, 2019 at 06:15:11AM -0700, Michel Lespinasse wrote:
> Hi Jason,
> 
> On Thu, Oct 3, 2019 at 5:26 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > Hurm, this is not entirely accurate. Most users do actually want
> > overlapping and multiple ranges. I just studied this extensively:
> 
> (Just curious, are you the person we discussed this with after the
> Maple Tree talk at LPC 2019 ?)

Possibly!
 
> I think we have two separate API problems there:
> - overlapping vs non-overlapping intervals (the interval tree API
> supports overlapping intervals, but some users are confused about
> this)

I think we just have a bunch of confused drivers, ie the two drm
drivers sure look confused to me.

> - closed vs half-open interval definitions

I'm not sure why this is a big problem..

We may actually just have bugs in handling the '-1', as it is supposed
to be written as start + (size-1) so that a range whose closed end is
ULONG_MAX (i.e. where start + size would be ULONG_MAX+1) still works
properly.
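
In other words, a minimal sketch of the overflow-safe form (my naming, not
taken from any particular driver):

static unsigned long range_last(unsigned long start, unsigned long size)
{
	/* Assumes size >= 1 and that the range itself is valid; written
	 * this way the addition cannot overflow, whereas "start + size"
	 * wraps exactly when the range ends at ULONG_MAX.
	 */
	return start + (size - 1);
}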

> > hfi1/mmu_rb definitely needs overlapping as it is dealing with
> > userspace VA ranges under control of userspace. As do the other
> > infiniband users.
> 
> Do you have a handle on what usnic is doing with its intervals ?
> usnic_uiom_insert_interval() has some complicated logic to avoid
> having overlapping intervals, which is very confusing to me.

I don't know why it is so complicated, but I can say that it is
storing userspace VAs in that tree.

I have a feeling this driver is trying to use the IOMMU to create a
mirror of the userspace VA.

Userspace can request the HW be able to access any set of overlapping
regions and so the driver must intersect all the ranges and compute a
list of VA pages to IOMMU map. Just guessing.

Jason


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH -next 00/11] lib/interval-tree: move to half closed intervals
  2019-10-04 16:03     ` Matthew Wilcox
@ 2019-10-04 19:35       ` Davidlohr Bueso
  0 siblings, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-04 19:35 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Michel Lespinasse, Jason Gunthorpe, Andrew Morton,
	Peter Zijlstra, LKML, linux-mm, dri-devel, linux-rdma

On Fri, 04 Oct 2019, Matthew Wilcox wrote:

>On Fri, Oct 04, 2019 at 06:15:11AM -0700, Michel Lespinasse wrote:
>> My take is that this (Davidlohr's) patch series does not necessarily
>> need to be applied all at once - we could get the first change in
>> (adding the interval_tree_gen.h header), and convert the first few
>> users, without getting them all at once, as long as we have a plan for
>> finishing the work. So, if you have cleanups in progress in some of
>> the files, just tell us which ones and we can leave them out from the
>> first pass.
>
>Since we have users which do need to use the full ULONG_MAX range
>(as pointed out by Christian Koenig), I don't think adding a second
>implementation which is half-open is a good idea.  It'll only lead to
>confusion.

Right, we should not have two implementations.

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 05/11] IB/hfi1: convert __mmu_int_rb to half closed intervals
  2019-10-04 11:50   ` Michel Lespinasse
@ 2019-10-04 19:41     ` Davidlohr Bueso
  0 siblings, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-04 19:41 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: akpm, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Davidlohr Bueso

On Fri, 04 Oct 2019, Michel Lespinasse wrote:

>On Thu, Oct 03, 2019 at 01:18:52PM -0700, Davidlohr Bueso wrote:
>> diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
>> index 14d2a90964c3..fb6382b2d44e 100644
>> --- a/drivers/infiniband/hw/hfi1/mmu_rb.c
>> +++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
>> @@ -47,7 +47,7 @@
>>  #include <linux/list.h>
>>  #include <linux/rculist.h>
>>  #include <linux/mmu_notifier.h>
>> -#include <linux/interval_tree_generic.h>
>> +#include <linux/interval_tree_gen.h>
>>
>>  #include "mmu_rb.h"
>>  #include "trace.h"
>> @@ -89,7 +89,7 @@ static unsigned long mmu_node_start(struct mmu_rb_node *node)
>>
>>  static unsigned long mmu_node_last(struct mmu_rb_node *node)
>>  {
>> -	return PAGE_ALIGN(node->addr + node->len) - 1;
>> +	return PAGE_ALIGN(node->addr + node->len);
>>  }
>
>May as well rename the function to mmu_node_end(). I was worried it might
>be used elsewhere, but it turned out it's only used when defining the
>interval tree.

Right.

In general I tried not to rename everything to end because I wanted to
avoid bloating the diffstat, even if that leaves naming discrepancies
within the code (which isn't new either, fwiw).

>
>I would also suggest moving this function (as well as mmu_node_start)
>right before its use, rather than just after, which would allow you to
>also remove the function prototype a few lines earlier.

Indeed, but again I don't want to unnecessarily grow the patch. I have
several notes to come back to once/if this series is settled.

>
>Looks good to me otherwise.
>
>Reviewed-by: Michel Lespinasse <walken@google.com>

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 07/11] vhost: convert vhost_umem_interval_tree to half closed intervals
  2019-10-04 12:10   ` Michel Lespinasse
@ 2019-10-04 19:44     ` Davidlohr Bueso
  0 siblings, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-04 19:44 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: akpm, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Michael, Jason Wang, virtualization, Davidlohr Bueso

On Fri, 04 Oct 2019, Michel Lespinasse wrote:

>On Thu, Oct 03, 2019 at 01:18:54PM -0700, Davidlohr Bueso wrote:
>> @@ -1320,15 +1320,14 @@ static bool iotlb_access_ok(struct vhost_virtqueue *vq,
>>  {
>>  	const struct vhost_umem_node *node;
>>  	struct vhost_umem *umem = vq->iotlb;
>> -	u64 s = 0, size, orig_addr = addr, last = addr + len - 1;
>> +	u64 s = 0, size, orig_addr = addr, last = addr + len;
>
>maybe "end" or "end_addr" instead of "last".
>
>> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
>> index e9ed2722b633..bb36cb9ed5ec 100644
>> --- a/drivers/vhost/vhost.h
>> +++ b/drivers/vhost/vhost.h
>> @@ -53,13 +53,13 @@ struct vhost_log {
>>  };
>>
>>  #define START(node) ((node)->start)
>> -#define LAST(node) ((node)->last)
>> +#define END(node) ((node)->end)
>>
>>  struct vhost_umem_node {
>>  	struct rb_node rb;
>>  	struct list_head link;
>>  	__u64 start;
>> -	__u64 last;
>> +	__u64 end;
>>  	__u64 size;
>>  	__u64 userspace_addr;
>>  	__u32 perm;
>
>Preferably also rename __subtree_last to __subtree_end

Yes, this was another one that I had in mind renaming, but
didn't want to grow the series -- all the custom interval trees
name _last for the subtree field, iirc. As in my previous reply, I'd
rather leave this stuff for a followup series.

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/11] x86/mm, pat: convert pat tree to generic interval tree
  2019-10-03 20:18 ` [PATCH 11/11] x86/mm, pat: convert pat tree to generic interval tree Davidlohr Bueso
@ 2019-10-07 15:33   ` Ingo Molnar
  2019-10-21 23:24     ` Davidlohr Bueso
  0 siblings, 1 reply; 38+ messages in thread
From: Ingo Molnar @ 2019-10-07 15:33 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, walken, peterz, linux-kernel, linux-mm, dri-devel,
	linux-rdma, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Davidlohr Bueso


* Davidlohr Bueso <dave@stgolabs.net> wrote:

> With some considerations, the custom pat_rbtree implementation can be
> simplified to use most of the generic interval_tree machinery.
> 
> o The tree inorder traversal can slightly differ when there are key
> ('start') collisions in the tree due to one going left and another right.
> This, however, only affects the output of debugfs' pat_memtype_list file.
> 
> o Generic interval trees are now semi open [a,b), which suits well with
> what pat wants.
> 
> o Erasing logic must remain untouched as well.
> 
> In order for the types to remain u64, the 'memtype_interval' calls are
> introduced, as opposed to simply using struct interval_tree.
> 
> In addition, the pat tree might also benefit from the fast overlap
> detection for the insertion case when looking up the first overlapping node
> in the tree.
> 
> Finally, I've tested this on various servers, via sanity warnings, running
> side by side with the current version and so far see no differences in the
> returned pointer node when doing memtype_rb_lowest_match() lookups.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: x86@kernel.org
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> ---
>  arch/x86/mm/pat.c        |  22 +++----
>  arch/x86/mm/pat_rbtree.c | 151 ++++++++++-------------------------------------
>  2 files changed, 43 insertions(+), 130 deletions(-)

I suppose this will be carried in -mm?

If so and if this patch is regression free, then:

  Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks,
	
	Ingo


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 09/11] lib/interval-tree: convert interval_tree to half closed intervals
  2019-10-04  7:20     ` Koenig, Christian
@ 2019-10-08 16:59       ` Davidlohr Bueso
  0 siblings, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-08 16:59 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: akpm, walken, peterz, linux-kernel, linux-mm, dri-devel,
	linux-rdma, Deucher, Alexander, David Airlie, Daniel Vetter,
	Doug Ledford, Joerg Roedel, Jérôme Glisse, Davidlohr Bueso

On Fri, 04 Oct 2019, Koenig, Christian wrote:

>On 04.10.19 at 08:57, Christian König wrote:
>> On 03.10.19 at 22:18, Davidlohr Bueso wrote:
>>> The generic interval tree really wants [a, b) intervals, not fully closed.
>>> As such convert it to use the new interval_tree_gen.h. Most of the
>>> conversions are straightforward, with the possible exception of
>>> radeon_vm_bo_set_addr(), though the semantics have been kept untouched
>>> as much as possible.
>>
>> NAK, the whole thing won't work.
>>
>> See we need to handle the full device address space which means we
>> have values in the range of 0x0-0xffffffff.
>>
>> If you make this a closed interval then the end would wrap around to
>> 0x0 if long is only 32bit.
>
>Well I've just now re-read the subject line. From that it sounds like
>you are actually trying to fix the interval tree to use a half closed
>interval, e.g. something like [a, b[

Correct.

>
>But your code changes sometimes doesn't seem to reflect that.

Hmm, the change simply aims at avoiding the end - 1 trick when dealing
with interval_tree insertions and lookups; the rest of the series
converts the other interval tree users in a similar way, albeit with
some errors that will be fixed in the next version. What are your
concerns about this patch?
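
(For reference, the "end - 1 trick" on the generic API, with root/start/end
standing for what any of the converted call sites pass in:

	struct interval_tree_node *node;

	/* today, closed intervals: look up [start, end) by passing end - 1 */
	node = interval_tree_iter_first(root, start, end - 1);

	/* with this series, half-open intervals: pass end directly */
	node = interval_tree_iter_first(root, start, end);

so the callers no longer have to adjust the endpoint themselves.)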

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 07/11] vhost: convert vhost_umem_interval_tree to half closed intervals
  2019-10-03 20:18 ` [PATCH 07/11] vhost: convert vhost_umem_interval_tree " Davidlohr Bueso
  2019-10-04 12:10   ` Michel Lespinasse
@ 2019-10-10  5:49   ` Jason Wang
  1 sibling, 0 replies; 38+ messages in thread
From: Jason Wang @ 2019-10-10  5:49 UTC (permalink / raw)
  To: Davidlohr Bueso, akpm
  Cc: walken, peterz, linux-kernel, linux-mm, dri-devel, linux-rdma,
	Michael S. Tsirkin, virtualization, Davidlohr Bueso


On 2019/10/4 4:18 AM, Davidlohr Bueso wrote:
> The vhost_umem interval tree really wants [a, b) intervals,
> not fully closed as currently. As such convert it to use the
> new interval_tree_gen.h, and also rename the 'last' endpoint
> in the node to 'end', which both a more suitable name for
> the half closed interval and also reduces the chances of some
> caller being missed.
>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> ---
>   drivers/vhost/vhost.c | 19 +++++++++----------
>   drivers/vhost/vhost.h |  4 ++--
>   2 files changed, 11 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 36ca2cf419bf..80c3cca24dc7 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -28,7 +28,7 @@
>   #include <linux/sort.h>
>   #include <linux/sched/mm.h>
>   #include <linux/sched/signal.h>
> -#include <linux/interval_tree_generic.h>
> +#include <linux/interval_tree_gen.h>
>   #include <linux/nospec.h>
>   
>   #include "vhost.h"
> @@ -51,7 +51,7 @@ enum {
>   
>   INTERVAL_TREE_DEFINE(struct vhost_umem_node,
>   		     rb, __u64, __subtree_last,
> -		     START, LAST, static inline, vhost_umem_interval_tree);
> +		     START, END, static inline, vhost_umem_interval_tree);
>   
>   #ifdef CONFIG_VHOST_CROSS_ENDIAN_LEGACY
>   static void vhost_disable_cross_endian(struct vhost_virtqueue *vq)
> @@ -1034,7 +1034,7 @@ static int vhost_new_umem_range(struct vhost_umem *umem,
>   
>   	node->start = start;
>   	node->size = size;
> -	node->last = end;
> +	node->end = end;
>   	node->userspace_addr = userspace_addr;
>   	node->perm = perm;
>   	INIT_LIST_HEAD(&node->link);
> @@ -1112,7 +1112,7 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>   		}
>   		vhost_vq_meta_reset(dev);
>   		if (vhost_new_umem_range(dev->iotlb, msg->iova, msg->size,
> -					 msg->iova + msg->size - 1,
> +					 msg->iova + msg->size,
>   					 msg->uaddr, msg->perm)) {
>   			ret = -ENOMEM;
>   			break;
> @@ -1126,7 +1126,7 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>   		}
>   		vhost_vq_meta_reset(dev);
>   		vhost_del_umem_range(dev->iotlb, msg->iova,
> -				     msg->iova + msg->size - 1);
> +				     msg->iova + msg->size);
>   		break;
>   	default:
>   		ret = -EINVAL;
> @@ -1320,15 +1320,14 @@ static bool iotlb_access_ok(struct vhost_virtqueue *vq,
>   {
>   	const struct vhost_umem_node *node;
>   	struct vhost_umem *umem = vq->iotlb;
> -	u64 s = 0, size, orig_addr = addr, last = addr + len - 1;
> +	u64 s = 0, size, orig_addr = addr, last = addr + len;
>   
>   	if (vhost_vq_meta_fetch(vq, addr, len, type))
>   		return true;
>   
>   	while (len > s) {
>   		node = vhost_umem_interval_tree_iter_first(&umem->umem_tree,
> -							   addr,
> -							   last);
> +							   addr, last);
>   		if (node == NULL || node->start > addr) {
>   			vhost_iotlb_miss(vq, addr, access);
>   			return false;
> @@ -1455,7 +1454,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
>   					 region->guest_phys_addr,
>   					 region->memory_size,
>   					 region->guest_phys_addr +
> -					 region->memory_size - 1,
> +					 region->memory_size,
>   					 region->userspace_addr,
>   					 VHOST_ACCESS_RW))
>   			goto err;
> @@ -2055,7 +2054,7 @@ static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
>   		}
>   
>   		node = vhost_umem_interval_tree_iter_first(&umem->umem_tree,
> -							addr, addr + len - 1);
> +							   addr, addr + len);
>   		if (node == NULL || node->start > addr) {
>   			if (umem != dev->iotlb) {
>   				ret = -EFAULT;
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index e9ed2722b633..bb36cb9ed5ec 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -53,13 +53,13 @@ struct vhost_log {
>   };
>   
>   #define START(node) ((node)->start)
> -#define LAST(node) ((node)->last)
> +#define END(node) ((node)->end)
>   
>   struct vhost_umem_node {
>   	struct rb_node rb;
>   	struct list_head link;
>   	__u64 start;
> -	__u64 last;
> +	__u64 end;
>   	__u64 size;
>   	__u64 userspace_addr;
>   	__u32 perm;


Reviewed-by: Jason Wang <jasowang@redhat.com>




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/11] x86/mm, pat: convert pat tree to generic interval tree
  2019-10-07 15:33   ` Ingo Molnar
@ 2019-10-21 23:24     ` Davidlohr Bueso
  0 siblings, 0 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2019-10-21 23:24 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, walken, peterz, linux-kernel, linux-mm, dri-devel,
	linux-rdma, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Davidlohr Bueso

On Mon, 07 Oct 2019, Ingo Molnar wrote:
>I suppose this will be carried in -mm?

I've just sent out a new patchset for -tip that only modifies the pat
tree. It seems that this series will at least take some more time due to
the mmu_notifier rework being done - and there were some worries about
the corner case details. I think it makes sense to just port the pat
tree to the current interval tree machinery and just be done with all this.
Any future versions of this series would require a trivial two-liner for pat.

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2019-10-21 23:25 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-03 20:18 [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Davidlohr Bueso
2019-10-03 20:18 ` [PATCH 01/11] mm: introduce vma_interval_tree_foreach_stab() Davidlohr Bueso
2019-10-03 20:18 ` [PATCH 02/11] lib/interval-tree: add an equivalent tree with [a,b) intervals Davidlohr Bueso
2019-10-04 11:02   ` Michel Lespinasse
2019-10-03 20:18 ` [PATCH 03/11] drm/amdgpu: convert amdgpu_vm_it to half closed intervals Davidlohr Bueso
2019-10-04  6:54   ` Koenig, Christian
2019-10-04 11:36     ` Michel Lespinasse
2019-10-04 12:39       ` Christian König
2019-10-03 20:18 ` [PATCH 04/11] drm: convert drm_mm_interval_tree " Davidlohr Bueso
2019-10-03 20:18 ` [PATCH 05/11] IB/hfi1: convert __mmu_int_rb " Davidlohr Bueso
2019-10-04 11:50   ` Michel Lespinasse
2019-10-04 19:41     ` Davidlohr Bueso
2019-10-03 20:18 ` [PATCH 06/11] IB,usnic: convert usnic_uiom_interval_tree " Davidlohr Bueso
2019-10-03 20:18 ` [PATCH 07/11] vhost: convert vhost_umem_interval_tree " Davidlohr Bueso
2019-10-04 12:10   ` Michel Lespinasse
2019-10-04 19:44     ` Davidlohr Bueso
2019-10-10  5:49   ` Jason Wang
2019-10-03 20:18 ` [PATCH 08/11] mm: convert vma_interval_tree " Davidlohr Bueso
2019-10-03 20:41   ` Matthew Wilcox
2019-10-04 12:30   ` Michel Lespinasse
2019-10-03 20:18 ` [PATCH 09/11] lib/interval-tree: convert interval_tree " Davidlohr Bueso
2019-10-03 22:50   ` kbuild test robot
2019-10-04  6:57   ` Koenig, Christian
2019-10-04  7:20     ` Koenig, Christian
2019-10-08 16:59       ` Davidlohr Bueso
2019-10-03 20:18 ` [PATCH 10/11] lib: drop interval_tree_generic.h Davidlohr Bueso
2019-10-03 20:18 ` [PATCH 11/11] x86/mm, pat: convert pat tree to generic interval tree Davidlohr Bueso
2019-10-07 15:33   ` Ingo Molnar
2019-10-21 23:24     ` Davidlohr Bueso
2019-10-03 20:32 ` [PATCH -next 00/11] lib/interval-tree: move to half closed intervals Matthew Wilcox
2019-10-03 21:10   ` Davidlohr Bueso
2019-10-04 12:43   ` Michel Lespinasse
2019-10-04  0:26 ` Jason Gunthorpe
2019-10-04  2:48   ` Davidlohr Bueso
2019-10-04 13:15   ` Michel Lespinasse
2019-10-04 16:03     ` Matthew Wilcox
2019-10-04 19:35       ` Davidlohr Bueso
2019-10-04 17:45     ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).