All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-02-07  7:54 ` Konstantin Khlebnikov
  0 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:54 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

This patchset implements common radix-tree iteration routine and
reworks page-cache lookup functions with using it.

radix_tree_gang_lookup_*slot() now mostly unused (the last user somethere in
drivers/sh/intc/virq.c), but they are exported, we cannot remove them for now.

Also there some shmem-related radix-tree hacks can be reworked,
radix_tree_locate_item() can be removed. I already have a few extra patches.

And as usual my lovely bloat-o-meter:

add/remove: 4/3 grow/shrink: 4/4 up/down: 1232/-964 (268)
function                                     old     new   delta
radix_tree_next_chunk                          -     499    +499
static.shmem_find_get_pages_and_swap           -     404    +404
find_get_pages_tag                           354     488    +134
find_get_pages                               362     438     +76
find_get_pages_contig                        345     407     +62
__kstrtab_radix_tree_next_chunk                -      22     +22
shmem_truncate_range                        1633    1652     +19
__ksymtab_radix_tree_next_chunk                -      16     +16
radix_tree_gang_lookup_tag_slot              208     180     -28
radix_tree_gang_lookup_tag                   247     207     -40
radix_tree_gang_lookup_slot                  204     162     -42
radix_tree_gang_lookup                       231     160     -71
__lookup                                     217       -    -217
__lookup_tag                                 242       -    -242
shmem_find_get_pages_and_swap                324       -    -324

---

Konstantin Khlebnikov (4):
      bitops: implement "optimized" __find_next_bit()
      radix-tree: introduce bit-optimized iterator
      radix-tree: rewrite gang lookup with using iterator
      radix-tree: use iterators in find_get_pages* functions


 include/asm-generic/bitops/find.h |   36 +++
 include/linux/radix-tree.h        |  129 +++++++++++
 lib/radix-tree.c                  |  422 +++++++++++++------------------------
 mm/filemap.c                      |   75 ++++---
 mm/shmem.c                        |   23 +-
 5 files changed, 375 insertions(+), 310 deletions(-)

-- 
Signature

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-02-07  7:54 ` Konstantin Khlebnikov
  0 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:54 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

This patchset implements common radix-tree iteration routine and
reworks page-cache lookup functions with using it.

radix_tree_gang_lookup_*slot() now mostly unused (the last user somethere in
drivers/sh/intc/virq.c), but they are exported, we cannot remove them for now.

Also there some shmem-related radix-tree hacks can be reworked,
radix_tree_locate_item() can be removed. I already have a few extra patches.

And as usual my lovely bloat-o-meter:

add/remove: 4/3 grow/shrink: 4/4 up/down: 1232/-964 (268)
function                                     old     new   delta
radix_tree_next_chunk                          -     499    +499
static.shmem_find_get_pages_and_swap           -     404    +404
find_get_pages_tag                           354     488    +134
find_get_pages                               362     438     +76
find_get_pages_contig                        345     407     +62
__kstrtab_radix_tree_next_chunk                -      22     +22
shmem_truncate_range                        1633    1652     +19
__ksymtab_radix_tree_next_chunk                -      16     +16
radix_tree_gang_lookup_tag_slot              208     180     -28
radix_tree_gang_lookup_tag                   247     207     -40
radix_tree_gang_lookup_slot                  204     162     -42
radix_tree_gang_lookup                       231     160     -71
__lookup                                     217       -    -217
__lookup_tag                                 242       -    -242
shmem_find_get_pages_and_swap                324       -    -324

---

Konstantin Khlebnikov (4):
      bitops: implement "optimized" __find_next_bit()
      radix-tree: introduce bit-optimized iterator
      radix-tree: rewrite gang lookup with using iterator
      radix-tree: use iterators in find_get_pages* functions


 include/asm-generic/bitops/find.h |   36 +++
 include/linux/radix-tree.h        |  129 +++++++++++
 lib/radix-tree.c                  |  422 +++++++++++++------------------------
 mm/filemap.c                      |   75 ++++---
 mm/shmem.c                        |   23 +-
 5 files changed, 375 insertions(+), 310 deletions(-)

-- 
Signature

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/4] bitops: implement "optimized" __find_next_bit()
  2012-02-07  7:54 ` Konstantin Khlebnikov
@ 2012-02-07  7:55   ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:55 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

This patch adds  __find_next_bit() -- static-inline variant of find_next_bit()
optimized for small constant size arrays, because find_next_bit() is too heavy
for searching in an array with one/two long elements.
And unlike to find_next_bit() it does not mask tail bits.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 include/asm-generic/bitops/find.h |   36 ++++++++++++++++++++++++++++++++++++
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 71c7780..1dd2495 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -12,6 +12,42 @@ extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
 		size, unsigned long offset);
 #endif
 
+#ifndef __find_next_bit
+/**
+ * __find_next_bit - find the next set bit in a memory region
+ * @addr: The address to base the search on
+ * @size: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Unrollable variant of find_next_bit() for constant size arrays.
+ * Tail bits starting from size to roundup(size, BITS_PER_LONG) must be zero.
+ * Returns next bit offset, or size if nothing found.
+ */
+static inline unsigned long __find_next_bit(const unsigned long *addr,
+		unsigned long size, unsigned long offset)
+{
+	if (!__builtin_constant_p(size))
+		return find_next_bit(addr, size, offset);
+
+	if (offset < size) {
+		unsigned long tmp;
+
+		addr += offset / BITS_PER_LONG;
+		tmp = *addr >> (offset % BITS_PER_LONG);
+		if (tmp)
+			return __ffs(tmp) + offset;
+		offset = (offset + BITS_PER_LONG) & ~(BITS_PER_LONG - 1);
+		while (offset < size) {
+			tmp = *++addr;
+			if (tmp)
+				return __ffs(tmp) + offset;
+			offset += BITS_PER_LONG;
+		}
+	}
+	return size;
+}
+#endif
+
 #ifndef find_next_zero_bit
 /**
  * find_next_zero_bit - find the next cleared bit in a memory region


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 1/4] bitops: implement "optimized" __find_next_bit()
@ 2012-02-07  7:55   ` Konstantin Khlebnikov
  0 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:55 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

This patch adds  __find_next_bit() -- static-inline variant of find_next_bit()
optimized for small constant size arrays, because find_next_bit() is too heavy
for searching in an array with one/two long elements.
And unlike to find_next_bit() it does not mask tail bits.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 include/asm-generic/bitops/find.h |   36 ++++++++++++++++++++++++++++++++++++
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 71c7780..1dd2495 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -12,6 +12,42 @@ extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
 		size, unsigned long offset);
 #endif
 
+#ifndef __find_next_bit
+/**
+ * __find_next_bit - find the next set bit in a memory region
+ * @addr: The address to base the search on
+ * @size: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Unrollable variant of find_next_bit() for constant size arrays.
+ * Tail bits starting from size to roundup(size, BITS_PER_LONG) must be zero.
+ * Returns next bit offset, or size if nothing found.
+ */
+static inline unsigned long __find_next_bit(const unsigned long *addr,
+		unsigned long size, unsigned long offset)
+{
+	if (!__builtin_constant_p(size))
+		return find_next_bit(addr, size, offset);
+
+	if (offset < size) {
+		unsigned long tmp;
+
+		addr += offset / BITS_PER_LONG;
+		tmp = *addr >> (offset % BITS_PER_LONG);
+		if (tmp)
+			return __ffs(tmp) + offset;
+		offset = (offset + BITS_PER_LONG) & ~(BITS_PER_LONG - 1);
+		while (offset < size) {
+			tmp = *++addr;
+			if (tmp)
+				return __ffs(tmp) + offset;
+			offset += BITS_PER_LONG;
+		}
+	}
+	return size;
+}
+#endif
+
 #ifndef find_next_zero_bit
 /**
  * find_next_zero_bit - find the next cleared bit in a memory region

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/4] radix-tree: introduce bit-optimized iterator
  2012-02-07  7:54 ` Konstantin Khlebnikov
@ 2012-02-07  7:55   ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:55 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

This patch implements clean, simple and effective radix-tree iteration routine.
Iterating is divided into two loops: the outer loop iterates though radix tree
chunks on leaf nodes, the nested loop iterates through slots in one chunk.

Main iterator function radix_tree_next_chunk() returns pointer to first slot,
and stores in the struct radix_tree_iter index of next-to-last slot.
For tagged-iterating it also constuct bitmask of tags for retunted chunk.
Thus nested-loop has enough information for iteration.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 include/linux/radix-tree.h |  129 ++++++++++++++++++++++++++++++++++++++++++++
 lib/radix-tree.c           |  107 ++++++++++++++++++++++++++++++++++++
 2 files changed, 236 insertions(+), 0 deletions(-)

diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index 07e360b..c31b895 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -2,6 +2,7 @@
  * Copyright (C) 2001 Momchil Velikov
  * Portions Copyright (C) 2001 Christoph Hellwig
  * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
@@ -256,4 +257,132 @@ static inline void radix_tree_preload_end(void)
 	preempt_enable();
 }
 
+struct radix_tree_iter {
+	unsigned long		index;		/* current slot index */
+	unsigned long		next_index;	/* next chunk first index */
+	unsigned long		tags;		/* bitmask for tag-iterating */
+};
+
+static inline void radix_tree_iter_start(struct radix_tree_iter *iter,
+					 unsigned long start)
+{
+	iter->index = start;
+	iter->next_index = 1; /* any non-zero */
+}
+
+#define RADIX_TREE_ITER_TAG_MASK	0x00FF
+#define RADIX_TREE_ITER_TAGGED		0x0100	/* search tagged */
+#define RADIX_TREE_ITER_CONTIG		0x0200	/* stop at first hole  */
+
+void **radix_tree_next_chunk(struct radix_tree_root *root,
+			     struct radix_tree_iter *iter, unsigned flags);
+
+/**
+ * radix_tree_next_tag - find next tagged slot in chunk
+ *
+ * @slot	pointer to slot pointer
+ * @iter	iterator state
+ *
+ * Eats iter->tags and bumps *slot and iter->index to next tagged slot
+ * Returns true if something found, and false otherwise
+ */
+static inline bool radix_tree_next_tag(void ***slot,
+				       struct radix_tree_iter *iter)
+{
+	unsigned long offset;
+
+	if (likely(iter->tags & 1ul))
+		return true;
+	if (unlikely(!iter->tags)) {
+		iter->index = iter->next_index;
+		return false;
+	}
+	offset = __ffs(iter->tags);
+	iter->tags >>= offset;
+	iter->index += offset;
+	*slot += offset;
+	return true;
+}
+
+/**
+ * radix_tree_for_each_chunk - iterate over all slot chunks
+ *
+ * @slot:	the void** for pointer to first slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ * @flags	RADIX_TREE_ITER_* and tag index
+ */
+#define radix_tree_for_each_chunk(slot, root, iter, start, flags)	\
+	for ( radix_tree_iter_start(iter, start) ;			\
+	      (slot = radix_tree_next_chunk(root, iter, flags)) ; )
+
+/**
+ * radix_tree_for_each_chunk_slot - iterate over all slots in one chunk
+ *
+ * @slot:	the void** for pointer to slot
+ * @iter	the struct radix_tree_iter pointer
+ */
+#define radix_tree_for_each_chunk_slot(slot, iter)	\
+	for ( ; (iter)->index != (iter)->next_index ;	\
+		(iter)->index++, slot++ )
+
+/**
+ * radix_tree_for_each_chunk_tagged_slot - iterate over all tagged slots
+ *					   in one chunk
+ *
+ * @slot:	the void** for pointer to slot
+ * @iter	the struct radix_tree_iter pointer
+ */
+#define radix_tree_for_each_chunk_tagged_slot(slot, iter)	\
+	for ( ; radix_tree_next_tag(&slot, iter) ;		\
+		(iter)->tags >>= 1, (iter)->index++, slot++ )
+
+/**
+ * radix_tree_for_each_slot - iterate over all non-empty slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ *
+ * This is double-loop, break does not work.
+ */
+#define radix_tree_for_each_slot(slot, root, iter, start)	\
+	radix_tree_for_each_chunk(slot, root, iter, start, 0)	\
+		radix_tree_for_each_chunk_slot(slot, iter)	\
+			if (!*slot) { continue; } else
+
+/**
+ * radix_tree_for_each_contig - iterate over all contiguous non-empty slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ *
+ * This is double-loop, break does not work.
+ */
+#define radix_tree_for_each_contig(slot, root, iter, start)	\
+	radix_tree_for_each_chunk(slot, root, iter, start,	\
+					RADIX_TREE_ITER_CONTIG)	\
+		radix_tree_for_each_chunk_slot(slot, iter)	\
+			if (!*slot) { (iter)->next_index = 0; break; } else
+
+/**
+ * radix_tree_for_each_tagged - iterate over all tagged slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ * @tag		tag index
+ *
+ * This is double-loop, break does not work.
+ */
+#define radix_tree_for_each_tagged(slot, root, iter, start, tag)\
+	radix_tree_for_each_chunk(slot, root, iter, start,	\
+				RADIX_TREE_ITER_TAGGED | tag)	\
+		radix_tree_for_each_chunk_tagged_slot(slot, iter)
+
 #endif /* _LINUX_RADIX_TREE_H */
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index dc63d08..32e2bfa 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -3,6 +3,7 @@
  * Portions Copyright (C) 2001 Christoph Hellwig
  * Copyright (C) 2005 SGI, Christoph Lameter
  * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
@@ -613,6 +614,112 @@ int radix_tree_tag_get(struct radix_tree_root *root,
 EXPORT_SYMBOL(radix_tree_tag_get);
 
 /**
+ * radix_tree_next_chunk - find next slots chunk for iteration
+ *
+ * @root:		radix tree root
+ * @iter:		iterator state
+ * @flags		RADIX_TREE_ITER_* flags and tag index
+ *
+ * Returns pointer to first slots, or NULL if there no more left
+ */
+void **radix_tree_next_chunk(struct radix_tree_root *root,
+			     struct radix_tree_iter *iter, unsigned flags)
+{
+	unsigned tag = flags & RADIX_TREE_ITER_TAG_MASK;
+	struct radix_tree_node *rnode, *node;
+	unsigned long index, i;
+	unsigned shift;
+
+	/* Overflow after ~0ul, or break in contiguous iteration */
+	if (!iter->next_index)
+		return NULL;
+
+	if ((flags & RADIX_TREE_ITER_TAGGED) && !root_tag_get(root, tag))
+		return NULL;
+
+	index = iter->index;
+
+	rnode = rcu_dereference_raw(root->rnode);
+	if (radix_tree_is_indirect_ptr(rnode)) {
+		rnode = indirect_to_ptr(rnode);
+	} else if (rnode && !index) {
+		/* Single-slot tree */
+		iter->tags = 1;
+		iter->next_index = 1;
+		return (void **)&root->rnode;
+	} else
+		return NULL;
+
+restart:
+	shift = (rnode->height - 1) * RADIX_TREE_MAP_SHIFT;
+	i = index >> shift;
+
+	/* Index ouside of the tree */
+	if (i >= RADIX_TREE_MAP_SIZE)
+		return NULL;
+
+	node = rnode;
+	while (1) {
+		if ((flags & RADIX_TREE_ITER_TAGGED) ?
+				!test_bit(i, node->tags[tag]) :
+				!node->slots[i]) {
+			/* Hole detected */
+			if (flags & RADIX_TREE_ITER_CONTIG)
+				return NULL;
+
+			if (flags & RADIX_TREE_ITER_TAGGED)
+				i = __find_next_bit(node->tags[tag],
+						RADIX_TREE_MAP_SIZE, i + 1);
+			else
+				while (++i < RADIX_TREE_MAP_SIZE &&
+						!node->slots[i]);
+
+			index &= ~((RADIX_TREE_MAP_SIZE << shift) - 1);
+			index += i << shift;
+			/* Overflow after ~0UL */
+			if (!index)
+				return NULL;
+			if (i == RADIX_TREE_MAP_SIZE)
+				goto restart;
+		}
+
+		/* This is leaf-node */
+		if (!shift)
+			break;
+
+		node = rcu_dereference_raw(node->slots[i]);
+		if (node == NULL)
+			goto restart;
+		shift -= RADIX_TREE_MAP_SHIFT;
+		i = (index >> shift) & RADIX_TREE_MAP_MASK;
+	}
+
+	iter->index = index;
+	iter->next_index = (index | RADIX_TREE_MAP_MASK) + 1;
+
+	/* Construct iter->tags bitmask from node->tags[tag] array */
+	if (flags & RADIX_TREE_ITER_TAGGED) {
+		unsigned tag_long, tag_bit;
+
+		tag_long = i / BITS_PER_LONG;
+		tag_bit  = i % BITS_PER_LONG;
+		iter->tags = node->tags[tag][tag_long] >> tag_bit;
+		/* This never happens if RADIX_TREE_TAG_LONGS == 1 */
+		if (tag_long < RADIX_TREE_TAG_LONGS - 1) {
+			/* Pick tags from next element */
+			if (tag_bit)
+				iter->tags |= node->tags[tag][tag_long + 1] <<
+						(BITS_PER_LONG - tag_bit);
+			/* Clip chunk size, here only BITS_PER_LONG tags */
+			iter->next_index = index + BITS_PER_LONG;
+		}
+	}
+
+	return node->slots + i;
+}
+EXPORT_SYMBOL(radix_tree_next_chunk);
+
+/**
  * radix_tree_range_tag_if_tagged - for each item in given range set given
  *				   tag if item has another tag set
  * @root:		radix tree root


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/4] radix-tree: introduce bit-optimized iterator
@ 2012-02-07  7:55   ` Konstantin Khlebnikov
  0 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:55 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

This patch implements clean, simple and effective radix-tree iteration routine.
Iterating is divided into two loops: the outer loop iterates though radix tree
chunks on leaf nodes, the nested loop iterates through slots in one chunk.

Main iterator function radix_tree_next_chunk() returns pointer to first slot,
and stores in the struct radix_tree_iter index of next-to-last slot.
For tagged-iterating it also constuct bitmask of tags for retunted chunk.
Thus nested-loop has enough information for iteration.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 include/linux/radix-tree.h |  129 ++++++++++++++++++++++++++++++++++++++++++++
 lib/radix-tree.c           |  107 ++++++++++++++++++++++++++++++++++++
 2 files changed, 236 insertions(+), 0 deletions(-)

diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index 07e360b..c31b895 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -2,6 +2,7 @@
  * Copyright (C) 2001 Momchil Velikov
  * Portions Copyright (C) 2001 Christoph Hellwig
  * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
@@ -256,4 +257,132 @@ static inline void radix_tree_preload_end(void)
 	preempt_enable();
 }
 
+struct radix_tree_iter {
+	unsigned long		index;		/* current slot index */
+	unsigned long		next_index;	/* next chunk first index */
+	unsigned long		tags;		/* bitmask for tag-iterating */
+};
+
+static inline void radix_tree_iter_start(struct radix_tree_iter *iter,
+					 unsigned long start)
+{
+	iter->index = start;
+	iter->next_index = 1; /* any non-zero */
+}
+
+#define RADIX_TREE_ITER_TAG_MASK	0x00FF
+#define RADIX_TREE_ITER_TAGGED		0x0100	/* search tagged */
+#define RADIX_TREE_ITER_CONTIG		0x0200	/* stop at first hole  */
+
+void **radix_tree_next_chunk(struct radix_tree_root *root,
+			     struct radix_tree_iter *iter, unsigned flags);
+
+/**
+ * radix_tree_next_tag - find next tagged slot in chunk
+ *
+ * @slot	pointer to slot pointer
+ * @iter	iterator state
+ *
+ * Eats iter->tags and bumps *slot and iter->index to next tagged slot
+ * Returns true if something found, and false otherwise
+ */
+static inline bool radix_tree_next_tag(void ***slot,
+				       struct radix_tree_iter *iter)
+{
+	unsigned long offset;
+
+	if (likely(iter->tags & 1ul))
+		return true;
+	if (unlikely(!iter->tags)) {
+		iter->index = iter->next_index;
+		return false;
+	}
+	offset = __ffs(iter->tags);
+	iter->tags >>= offset;
+	iter->index += offset;
+	*slot += offset;
+	return true;
+}
+
+/**
+ * radix_tree_for_each_chunk - iterate over all slot chunks
+ *
+ * @slot:	the void** for pointer to first slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ * @flags	RADIX_TREE_ITER_* and tag index
+ */
+#define radix_tree_for_each_chunk(slot, root, iter, start, flags)	\
+	for ( radix_tree_iter_start(iter, start) ;			\
+	      (slot = radix_tree_next_chunk(root, iter, flags)) ; )
+
+/**
+ * radix_tree_for_each_chunk_slot - iterate over all slots in one chunk
+ *
+ * @slot:	the void** for pointer to slot
+ * @iter	the struct radix_tree_iter pointer
+ */
+#define radix_tree_for_each_chunk_slot(slot, iter)	\
+	for ( ; (iter)->index != (iter)->next_index ;	\
+		(iter)->index++, slot++ )
+
+/**
+ * radix_tree_for_each_chunk_tagged_slot - iterate over all tagged slots
+ *					   in one chunk
+ *
+ * @slot:	the void** for pointer to slot
+ * @iter	the struct radix_tree_iter pointer
+ */
+#define radix_tree_for_each_chunk_tagged_slot(slot, iter)	\
+	for ( ; radix_tree_next_tag(&slot, iter) ;		\
+		(iter)->tags >>= 1, (iter)->index++, slot++ )
+
+/**
+ * radix_tree_for_each_slot - iterate over all non-empty slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ *
+ * This is double-loop, break does not work.
+ */
+#define radix_tree_for_each_slot(slot, root, iter, start)	\
+	radix_tree_for_each_chunk(slot, root, iter, start, 0)	\
+		radix_tree_for_each_chunk_slot(slot, iter)	\
+			if (!*slot) { continue; } else
+
+/**
+ * radix_tree_for_each_contig - iterate over all contiguous non-empty slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ *
+ * This is double-loop, break does not work.
+ */
+#define radix_tree_for_each_contig(slot, root, iter, start)	\
+	radix_tree_for_each_chunk(slot, root, iter, start,	\
+					RADIX_TREE_ITER_CONTIG)	\
+		radix_tree_for_each_chunk_slot(slot, iter)	\
+			if (!*slot) { (iter)->next_index = 0; break; } else
+
+/**
+ * radix_tree_for_each_tagged - iterate over all tagged slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ * @tag		tag index
+ *
+ * This is double-loop, break does not work.
+ */
+#define radix_tree_for_each_tagged(slot, root, iter, start, tag)\
+	radix_tree_for_each_chunk(slot, root, iter, start,	\
+				RADIX_TREE_ITER_TAGGED | tag)	\
+		radix_tree_for_each_chunk_tagged_slot(slot, iter)
+
 #endif /* _LINUX_RADIX_TREE_H */
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index dc63d08..32e2bfa 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -3,6 +3,7 @@
  * Portions Copyright (C) 2001 Christoph Hellwig
  * Copyright (C) 2005 SGI, Christoph Lameter
  * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
@@ -613,6 +614,112 @@ int radix_tree_tag_get(struct radix_tree_root *root,
 EXPORT_SYMBOL(radix_tree_tag_get);
 
 /**
+ * radix_tree_next_chunk - find next slots chunk for iteration
+ *
+ * @root:		radix tree root
+ * @iter:		iterator state
+ * @flags		RADIX_TREE_ITER_* flags and tag index
+ *
+ * Returns pointer to first slots, or NULL if there no more left
+ */
+void **radix_tree_next_chunk(struct radix_tree_root *root,
+			     struct radix_tree_iter *iter, unsigned flags)
+{
+	unsigned tag = flags & RADIX_TREE_ITER_TAG_MASK;
+	struct radix_tree_node *rnode, *node;
+	unsigned long index, i;
+	unsigned shift;
+
+	/* Overflow after ~0ul, or break in contiguous iteration */
+	if (!iter->next_index)
+		return NULL;
+
+	if ((flags & RADIX_TREE_ITER_TAGGED) && !root_tag_get(root, tag))
+		return NULL;
+
+	index = iter->index;
+
+	rnode = rcu_dereference_raw(root->rnode);
+	if (radix_tree_is_indirect_ptr(rnode)) {
+		rnode = indirect_to_ptr(rnode);
+	} else if (rnode && !index) {
+		/* Single-slot tree */
+		iter->tags = 1;
+		iter->next_index = 1;
+		return (void **)&root->rnode;
+	} else
+		return NULL;
+
+restart:
+	shift = (rnode->height - 1) * RADIX_TREE_MAP_SHIFT;
+	i = index >> shift;
+
+	/* Index ouside of the tree */
+	if (i >= RADIX_TREE_MAP_SIZE)
+		return NULL;
+
+	node = rnode;
+	while (1) {
+		if ((flags & RADIX_TREE_ITER_TAGGED) ?
+				!test_bit(i, node->tags[tag]) :
+				!node->slots[i]) {
+			/* Hole detected */
+			if (flags & RADIX_TREE_ITER_CONTIG)
+				return NULL;
+
+			if (flags & RADIX_TREE_ITER_TAGGED)
+				i = __find_next_bit(node->tags[tag],
+						RADIX_TREE_MAP_SIZE, i + 1);
+			else
+				while (++i < RADIX_TREE_MAP_SIZE &&
+						!node->slots[i]);
+
+			index &= ~((RADIX_TREE_MAP_SIZE << shift) - 1);
+			index += i << shift;
+			/* Overflow after ~0UL */
+			if (!index)
+				return NULL;
+			if (i == RADIX_TREE_MAP_SIZE)
+				goto restart;
+		}
+
+		/* This is leaf-node */
+		if (!shift)
+			break;
+
+		node = rcu_dereference_raw(node->slots[i]);
+		if (node == NULL)
+			goto restart;
+		shift -= RADIX_TREE_MAP_SHIFT;
+		i = (index >> shift) & RADIX_TREE_MAP_MASK;
+	}
+
+	iter->index = index;
+	iter->next_index = (index | RADIX_TREE_MAP_MASK) + 1;
+
+	/* Construct iter->tags bitmask from node->tags[tag] array */
+	if (flags & RADIX_TREE_ITER_TAGGED) {
+		unsigned tag_long, tag_bit;
+
+		tag_long = i / BITS_PER_LONG;
+		tag_bit  = i % BITS_PER_LONG;
+		iter->tags = node->tags[tag][tag_long] >> tag_bit;
+		/* This never happens if RADIX_TREE_TAG_LONGS == 1 */
+		if (tag_long < RADIX_TREE_TAG_LONGS - 1) {
+			/* Pick tags from next element */
+			if (tag_bit)
+				iter->tags |= node->tags[tag][tag_long + 1] <<
+						(BITS_PER_LONG - tag_bit);
+			/* Clip chunk size, here only BITS_PER_LONG tags */
+			iter->next_index = index + BITS_PER_LONG;
+		}
+	}
+
+	return node->slots + i;
+}
+EXPORT_SYMBOL(radix_tree_next_chunk);
+
+/**
  * radix_tree_range_tag_if_tagged - for each item in given range set given
  *				   tag if item has another tag set
  * @root:		radix tree root

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/4] radix-tree: rewrite gang lookup with using iterator
  2012-02-07  7:54 ` Konstantin Khlebnikov
@ 2012-02-07  7:55   ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:55 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

Rewrite radix_tree_gang_lookup_* functions with using radix-tree iterator.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 lib/radix-tree.c |  315 ++++++++----------------------------------------------
 1 files changed, 45 insertions(+), 270 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 32e2bfa..57a93e7 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -924,57 +924,6 @@ unsigned long radix_tree_prev_hole(struct radix_tree_root *root,
 }
 EXPORT_SYMBOL(radix_tree_prev_hole);
 
-static unsigned int
-__lookup(struct radix_tree_node *slot, void ***results, unsigned long *indices,
-	unsigned long index, unsigned int max_items, unsigned long *next_index)
-{
-	unsigned int nr_found = 0;
-	unsigned int shift, height;
-	unsigned long i;
-
-	height = slot->height;
-	if (height == 0)
-		goto out;
-	shift = (height-1) * RADIX_TREE_MAP_SHIFT;
-
-	for ( ; height > 1; height--) {
-		i = (index >> shift) & RADIX_TREE_MAP_MASK;
-		for (;;) {
-			if (slot->slots[i] != NULL)
-				break;
-			index &= ~((1UL << shift) - 1);
-			index += 1UL << shift;
-			if (index == 0)
-				goto out;	/* 32-bit wraparound */
-			i++;
-			if (i == RADIX_TREE_MAP_SIZE)
-				goto out;
-		}
-
-		shift -= RADIX_TREE_MAP_SHIFT;
-		slot = rcu_dereference_raw(slot->slots[i]);
-		if (slot == NULL)
-			goto out;
-	}
-
-	/* Bottom level: grab some items */
-	for (i = index & RADIX_TREE_MAP_MASK; i < RADIX_TREE_MAP_SIZE; i++) {
-		if (slot->slots[i]) {
-			results[nr_found] = &(slot->slots[i]);
-			if (indices)
-				indices[nr_found] = index;
-			if (++nr_found == max_items) {
-				index++;
-				goto out;
-			}
-		}
-		index++;
-	}
-out:
-	*next_index = index;
-	return nr_found;
-}
-
 /**
  *	radix_tree_gang_lookup - perform multiple lookup on a radix tree
  *	@root:		radix tree root
@@ -998,50 +947,21 @@ unsigned int
 radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
 			unsigned long first_index, unsigned int max_items)
 {
-	unsigned long max_index;
-	struct radix_tree_node *node;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
-
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = node;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int nr_found, slots_found, i;
-		unsigned long next_index;	/* Index of next search */
+	if (!max_items)
+		goto out;
 
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup(node, (void ***)results + ret, NULL,
-				cur_index, max_items - ret, &next_index);
-		nr_found = 0;
-		for (i = 0; i < slots_found; i++) {
-			struct radix_tree_node *slot;
-			slot = *(((void ***)results)[ret + i]);
-			if (!slot)
-				continue;
-			results[ret + nr_found] =
-				indirect_to_ptr(rcu_dereference_raw(slot));
-			nr_found++;
-		}
-		ret += nr_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
+	radix_tree_for_each_slot(slot, root, &iter, first_index) {
+		results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+		if (!results[ret])
+			continue;
+		if (++ret == max_items)
+			goto out;
 	}
-
+out:
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup);
@@ -1069,112 +989,25 @@ radix_tree_gang_lookup_slot(struct radix_tree_root *root,
 			void ***results, unsigned long *indices,
 			unsigned long first_index, unsigned int max_items)
 {
-	unsigned long max_index;
-	struct radix_tree_node *node;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
+	if (!max_items)
+		goto out;
 
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = (void **)&root->rnode;
+	radix_tree_for_each_slot(slot, root, &iter, first_index) {
+		results[ret] = slot;
 		if (indices)
-			indices[0] = 0;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int slots_found;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup(node, results + ret,
-				indices ? indices + ret : NULL,
-				cur_index, max_items - ret, &next_index);
-		ret += slots_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
+			indices[ret] = iter.index;
+		if (++ret == max_items)
+			goto out;
 	}
-
+out:
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup_slot);
 
-/*
- * FIXME: the two tag_get()s here should use find_next_bit() instead of
- * open-coding the search.
- */
-static unsigned int
-__lookup_tag(struct radix_tree_node *slot, void ***results, unsigned long index,
-	unsigned int max_items, unsigned long *next_index, unsigned int tag)
-{
-	unsigned int nr_found = 0;
-	unsigned int shift, height;
-
-	height = slot->height;
-	if (height == 0)
-		goto out;
-	shift = (height-1) * RADIX_TREE_MAP_SHIFT;
-
-	while (height > 0) {
-		unsigned long i = (index >> shift) & RADIX_TREE_MAP_MASK ;
-
-		for (;;) {
-			if (tag_get(slot, tag, i))
-				break;
-			index &= ~((1UL << shift) - 1);
-			index += 1UL << shift;
-			if (index == 0)
-				goto out;	/* 32-bit wraparound */
-			i++;
-			if (i == RADIX_TREE_MAP_SIZE)
-				goto out;
-		}
-		height--;
-		if (height == 0) {	/* Bottom level: grab some items */
-			unsigned long j = index & RADIX_TREE_MAP_MASK;
-
-			for ( ; j < RADIX_TREE_MAP_SIZE; j++) {
-				index++;
-				if (!tag_get(slot, tag, j))
-					continue;
-				/*
-				 * Even though the tag was found set, we need to
-				 * recheck that we have a non-NULL node, because
-				 * if this lookup is lockless, it may have been
-				 * subsequently deleted.
-				 *
-				 * Similar care must be taken in any place that
-				 * lookup ->slots[x] without a lock (ie. can't
-				 * rely on its value remaining the same).
-				 */
-				if (slot->slots[j]) {
-					results[nr_found++] = &(slot->slots[j]);
-					if (nr_found == max_items)
-						goto out;
-				}
-			}
-		}
-		shift -= RADIX_TREE_MAP_SHIFT;
-		slot = rcu_dereference_raw(slot->slots[i]);
-		if (slot == NULL)
-			break;
-	}
-out:
-	*next_index = index;
-	return nr_found;
-}
-
 /**
  *	radix_tree_gang_lookup_tag - perform multiple lookup on a radix tree
  *	                             based on a tag
@@ -1193,54 +1026,21 @@ radix_tree_gang_lookup_tag(struct radix_tree_root *root, void **results,
 		unsigned long first_index, unsigned int max_items,
 		unsigned int tag)
 {
-	struct radix_tree_node *node;
-	unsigned long max_index;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
-
-	/* check the root's tag bit */
-	if (!root_tag_get(root, tag))
-		return 0;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
-
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = node;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int nr_found, slots_found, i;
-		unsigned long next_index;	/* Index of next search */
+	if (!max_items)
+		goto out;
 
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup_tag(node, (void ***)results + ret,
-				cur_index, max_items - ret, &next_index, tag);
-		nr_found = 0;
-		for (i = 0; i < slots_found; i++) {
-			struct radix_tree_node *slot;
-			slot = *(((void ***)results)[ret + i]);
-			if (!slot)
-				continue;
-			results[ret + nr_found] =
-				indirect_to_ptr(rcu_dereference_raw(slot));
-			nr_found++;
-		}
-		ret += nr_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
+	radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
+		results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+		if (!results[ret])
+			continue;
+		if (++ret == max_items)
+			goto out;
 	}
-
+out:
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup_tag);
@@ -1263,44 +1063,19 @@ radix_tree_gang_lookup_tag_slot(struct radix_tree_root *root, void ***results,
 		unsigned long first_index, unsigned int max_items,
 		unsigned int tag)
 {
-	struct radix_tree_node *node;
-	unsigned long max_index;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	/* check the root's tag bit */
-	if (!root_tag_get(root, tag))
-		return 0;
-
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
-
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = (void **)&root->rnode;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int slots_found;
-		unsigned long next_index;	/* Index of next search */
+	if (!max_items)
+		goto out;
 
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup_tag(node, results + ret,
-				cur_index, max_items - ret, &next_index, tag);
-		ret += slots_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
+	radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
+		results[ret] = slot;
+		if (++ret == max_items)
+			goto out;
 	}
-
+out:
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup_tag_slot);


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/4] radix-tree: rewrite gang lookup with using iterator
@ 2012-02-07  7:55   ` Konstantin Khlebnikov
  0 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:55 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

Rewrite radix_tree_gang_lookup_* functions with using radix-tree iterator.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 lib/radix-tree.c |  315 ++++++++----------------------------------------------
 1 files changed, 45 insertions(+), 270 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 32e2bfa..57a93e7 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -924,57 +924,6 @@ unsigned long radix_tree_prev_hole(struct radix_tree_root *root,
 }
 EXPORT_SYMBOL(radix_tree_prev_hole);
 
-static unsigned int
-__lookup(struct radix_tree_node *slot, void ***results, unsigned long *indices,
-	unsigned long index, unsigned int max_items, unsigned long *next_index)
-{
-	unsigned int nr_found = 0;
-	unsigned int shift, height;
-	unsigned long i;
-
-	height = slot->height;
-	if (height == 0)
-		goto out;
-	shift = (height-1) * RADIX_TREE_MAP_SHIFT;
-
-	for ( ; height > 1; height--) {
-		i = (index >> shift) & RADIX_TREE_MAP_MASK;
-		for (;;) {
-			if (slot->slots[i] != NULL)
-				break;
-			index &= ~((1UL << shift) - 1);
-			index += 1UL << shift;
-			if (index == 0)
-				goto out;	/* 32-bit wraparound */
-			i++;
-			if (i == RADIX_TREE_MAP_SIZE)
-				goto out;
-		}
-
-		shift -= RADIX_TREE_MAP_SHIFT;
-		slot = rcu_dereference_raw(slot->slots[i]);
-		if (slot == NULL)
-			goto out;
-	}
-
-	/* Bottom level: grab some items */
-	for (i = index & RADIX_TREE_MAP_MASK; i < RADIX_TREE_MAP_SIZE; i++) {
-		if (slot->slots[i]) {
-			results[nr_found] = &(slot->slots[i]);
-			if (indices)
-				indices[nr_found] = index;
-			if (++nr_found == max_items) {
-				index++;
-				goto out;
-			}
-		}
-		index++;
-	}
-out:
-	*next_index = index;
-	return nr_found;
-}
-
 /**
  *	radix_tree_gang_lookup - perform multiple lookup on a radix tree
  *	@root:		radix tree root
@@ -998,50 +947,21 @@ unsigned int
 radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
 			unsigned long first_index, unsigned int max_items)
 {
-	unsigned long max_index;
-	struct radix_tree_node *node;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
-
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = node;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int nr_found, slots_found, i;
-		unsigned long next_index;	/* Index of next search */
+	if (!max_items)
+		goto out;
 
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup(node, (void ***)results + ret, NULL,
-				cur_index, max_items - ret, &next_index);
-		nr_found = 0;
-		for (i = 0; i < slots_found; i++) {
-			struct radix_tree_node *slot;
-			slot = *(((void ***)results)[ret + i]);
-			if (!slot)
-				continue;
-			results[ret + nr_found] =
-				indirect_to_ptr(rcu_dereference_raw(slot));
-			nr_found++;
-		}
-		ret += nr_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
+	radix_tree_for_each_slot(slot, root, &iter, first_index) {
+		results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+		if (!results[ret])
+			continue;
+		if (++ret == max_items)
+			goto out;
 	}
-
+out:
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup);
@@ -1069,112 +989,25 @@ radix_tree_gang_lookup_slot(struct radix_tree_root *root,
 			void ***results, unsigned long *indices,
 			unsigned long first_index, unsigned int max_items)
 {
-	unsigned long max_index;
-	struct radix_tree_node *node;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
+	if (!max_items)
+		goto out;
 
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = (void **)&root->rnode;
+	radix_tree_for_each_slot(slot, root, &iter, first_index) {
+		results[ret] = slot;
 		if (indices)
-			indices[0] = 0;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int slots_found;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup(node, results + ret,
-				indices ? indices + ret : NULL,
-				cur_index, max_items - ret, &next_index);
-		ret += slots_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
+			indices[ret] = iter.index;
+		if (++ret == max_items)
+			goto out;
 	}
-
+out:
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup_slot);
 
-/*
- * FIXME: the two tag_get()s here should use find_next_bit() instead of
- * open-coding the search.
- */
-static unsigned int
-__lookup_tag(struct radix_tree_node *slot, void ***results, unsigned long index,
-	unsigned int max_items, unsigned long *next_index, unsigned int tag)
-{
-	unsigned int nr_found = 0;
-	unsigned int shift, height;
-
-	height = slot->height;
-	if (height == 0)
-		goto out;
-	shift = (height-1) * RADIX_TREE_MAP_SHIFT;
-
-	while (height > 0) {
-		unsigned long i = (index >> shift) & RADIX_TREE_MAP_MASK ;
-
-		for (;;) {
-			if (tag_get(slot, tag, i))
-				break;
-			index &= ~((1UL << shift) - 1);
-			index += 1UL << shift;
-			if (index == 0)
-				goto out;	/* 32-bit wraparound */
-			i++;
-			if (i == RADIX_TREE_MAP_SIZE)
-				goto out;
-		}
-		height--;
-		if (height == 0) {	/* Bottom level: grab some items */
-			unsigned long j = index & RADIX_TREE_MAP_MASK;
-
-			for ( ; j < RADIX_TREE_MAP_SIZE; j++) {
-				index++;
-				if (!tag_get(slot, tag, j))
-					continue;
-				/*
-				 * Even though the tag was found set, we need to
-				 * recheck that we have a non-NULL node, because
-				 * if this lookup is lockless, it may have been
-				 * subsequently deleted.
-				 *
-				 * Similar care must be taken in any place that
-				 * lookup ->slots[x] without a lock (ie. can't
-				 * rely on its value remaining the same).
-				 */
-				if (slot->slots[j]) {
-					results[nr_found++] = &(slot->slots[j]);
-					if (nr_found == max_items)
-						goto out;
-				}
-			}
-		}
-		shift -= RADIX_TREE_MAP_SHIFT;
-		slot = rcu_dereference_raw(slot->slots[i]);
-		if (slot == NULL)
-			break;
-	}
-out:
-	*next_index = index;
-	return nr_found;
-}
-
 /**
  *	radix_tree_gang_lookup_tag - perform multiple lookup on a radix tree
  *	                             based on a tag
@@ -1193,54 +1026,21 @@ radix_tree_gang_lookup_tag(struct radix_tree_root *root, void **results,
 		unsigned long first_index, unsigned int max_items,
 		unsigned int tag)
 {
-	struct radix_tree_node *node;
-	unsigned long max_index;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
-
-	/* check the root's tag bit */
-	if (!root_tag_get(root, tag))
-		return 0;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
-
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = node;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int nr_found, slots_found, i;
-		unsigned long next_index;	/* Index of next search */
+	if (!max_items)
+		goto out;
 
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup_tag(node, (void ***)results + ret,
-				cur_index, max_items - ret, &next_index, tag);
-		nr_found = 0;
-		for (i = 0; i < slots_found; i++) {
-			struct radix_tree_node *slot;
-			slot = *(((void ***)results)[ret + i]);
-			if (!slot)
-				continue;
-			results[ret + nr_found] =
-				indirect_to_ptr(rcu_dereference_raw(slot));
-			nr_found++;
-		}
-		ret += nr_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
+	radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
+		results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+		if (!results[ret])
+			continue;
+		if (++ret == max_items)
+			goto out;
 	}
-
+out:
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup_tag);
@@ -1263,44 +1063,19 @@ radix_tree_gang_lookup_tag_slot(struct radix_tree_root *root, void ***results,
 		unsigned long first_index, unsigned int max_items,
 		unsigned int tag)
 {
-	struct radix_tree_node *node;
-	unsigned long max_index;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	/* check the root's tag bit */
-	if (!root_tag_get(root, tag))
-		return 0;
-
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
-
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = (void **)&root->rnode;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int slots_found;
-		unsigned long next_index;	/* Index of next search */
+	if (!max_items)
+		goto out;
 
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup_tag(node, results + ret,
-				cur_index, max_items - ret, &next_index, tag);
-		ret += slots_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
+	radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
+		results[ret] = slot;
+		if (++ret == max_items)
+			goto out;
 	}
-
+out:
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup_tag_slot);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 4/4] radix-tree: use iterators in find_get_pages* functions
  2012-02-07  7:54 ` Konstantin Khlebnikov
@ 2012-02-07  7:55   ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:55 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

This patch replaces radix_tree_gang_lookup_slot() and
radix_tree_gang_lookup_tag_slot() in page-cache lookup functions with brand-new
radix-tree direct iterating. This avoids the double-scanning and pointers copying.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/filemap.c |   75 ++++++++++++++++++++++++++++++++++------------------------
 mm/shmem.c   |   23 +++++++++++-------
 2 files changed, 58 insertions(+), 40 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index b662757..518223b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -811,23 +811,25 @@ EXPORT_SYMBOL(find_or_create_page);
 unsigned find_get_pages(struct address_space *mapping, pgoff_t start,
 			    unsigned int nr_pages, struct page **pages)
 {
-	unsigned int i;
 	unsigned int ret;
 	unsigned int nr_found, nr_skip;
+	struct radix_tree_iter iter;
+	void **slot;
+
+	if (!nr_pages)
+		return 0;
 
 	rcu_read_lock();
 restart:
-	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
-				(void ***)pages, NULL, start, nr_pages);
-	ret = 0;
-	nr_skip = 0;
-	for (i = 0; i < nr_found; i++) {
+	nr_found = nr_skip = ret = 0;
+	radix_tree_for_each_slot(slot, &mapping->page_tree, &iter, start) {
 		struct page *page;
 repeat:
-		page = radix_tree_deref_slot((void **)pages[i]);
+		page = radix_tree_deref_slot(slot);
 		if (unlikely(!page))
 			continue;
 
+		nr_found++;
 		if (radix_tree_exception(page)) {
 			if (radix_tree_deref_retry(page)) {
 				/*
@@ -835,7 +837,7 @@ repeat:
 				 * when entry at index 0 moves out of or back
 				 * to root: none yet gotten, safe to restart.
 				 */
-				WARN_ON(start | i);
+				WARN_ON(iter.index);
 				goto restart;
 			}
 			/*
@@ -851,13 +853,14 @@ repeat:
 			goto repeat;
 
 		/* Has the page moved? */
-		if (unlikely(page != *((void **)pages[i]))) {
+		if (unlikely(page != *slot)) {
 			page_cache_release(page);
 			goto repeat;
 		}
 
 		pages[ret] = page;
-		ret++;
+		if (++ret == nr_pages)
+			goto out;
 	}
 
 	/*
@@ -866,6 +869,7 @@ repeat:
 	 */
 	if (unlikely(!ret && nr_found > nr_skip))
 		goto restart;
+out:
 	rcu_read_unlock();
 	return ret;
 }
@@ -885,21 +889,23 @@ repeat:
 unsigned find_get_pages_contig(struct address_space *mapping, pgoff_t index,
 			       unsigned int nr_pages, struct page **pages)
 {
-	unsigned int i;
+	struct radix_tree_iter iter;
 	unsigned int ret;
-	unsigned int nr_found;
+	void **slot;
+
+	if (!nr_pages)
+		return 0;
 
 	rcu_read_lock();
 restart:
-	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
-				(void ***)pages, NULL, index, nr_pages);
 	ret = 0;
-	for (i = 0; i < nr_found; i++) {
+	radix_tree_for_each_contig(slot, &mapping->page_tree, &iter, index) {
 		struct page *page;
 repeat:
-		page = radix_tree_deref_slot((void **)pages[i]);
+		page = radix_tree_deref_slot(slot);
+		/* The hole, there no reason to continue */
 		if (unlikely(!page))
-			continue;
+			goto out;
 
 		if (radix_tree_exception(page)) {
 			if (radix_tree_deref_retry(page)) {
@@ -915,14 +921,14 @@ repeat:
 			 * here as an exceptional entry: so stop looking for
 			 * contiguous pages.
 			 */
-			break;
+			goto out;
 		}
 
 		if (!page_cache_get_speculative(page))
 			goto repeat;
 
 		/* Has the page moved? */
-		if (unlikely(page != *((void **)pages[i]))) {
+		if (unlikely(page != *slot)) {
 			page_cache_release(page);
 			goto repeat;
 		}
@@ -932,15 +938,16 @@ repeat:
 		 * otherwise we can get both false positives and false
 		 * negatives, which is just confusing to the caller.
 		 */
-		if (page->mapping == NULL || page->index != index) {
+		if (page->mapping == NULL || page->index != iter.index) {
 			page_cache_release(page);
-			break;
+			goto out;
 		}
 
 		pages[ret] = page;
-		ret++;
-		index++;
+		if (++ret == nr_pages)
+			goto out;
 	}
+out:
 	rcu_read_unlock();
 	return ret;
 }
@@ -960,22 +967,26 @@ EXPORT_SYMBOL(find_get_pages_contig);
 unsigned find_get_pages_tag(struct address_space *mapping, pgoff_t *index,
 			int tag, unsigned int nr_pages, struct page **pages)
 {
-	unsigned int i;
 	unsigned int ret;
 	unsigned int nr_found;
+	struct radix_tree_iter iter;
+	void **slot;
+
+	if (!nr_pages)
+		return 0;
 
 	rcu_read_lock();
 restart:
-	nr_found = radix_tree_gang_lookup_tag_slot(&mapping->page_tree,
-				(void ***)pages, *index, nr_pages, tag);
-	ret = 0;
-	for (i = 0; i < nr_found; i++) {
+	nr_found = ret = 0;
+	radix_tree_for_each_tagged(slot, &mapping->page_tree,
+				   &iter, *index, tag) {
 		struct page *page;
 repeat:
-		page = radix_tree_deref_slot((void **)pages[i]);
+		page = radix_tree_deref_slot(slot);
 		if (unlikely(!page))
 			continue;
 
+		nr_found++;
 		if (radix_tree_exception(page)) {
 			if (radix_tree_deref_retry(page)) {
 				/*
@@ -996,13 +1007,14 @@ repeat:
 			goto repeat;
 
 		/* Has the page moved? */
-		if (unlikely(page != *((void **)pages[i]))) {
+		if (unlikely(page != *slot)) {
 			page_cache_release(page);
 			goto repeat;
 		}
 
 		pages[ret] = page;
-		ret++;
+		if (++ret == nr_pages)
+			goto out;
 	}
 
 	/*
@@ -1011,6 +1023,7 @@ repeat:
 	 */
 	if (unlikely(!ret && nr_found))
 		goto restart;
+out:
 	rcu_read_unlock();
 
 	if (ret)
diff --git a/mm/shmem.c b/mm/shmem.c
index 269d049..5033319 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -316,21 +316,24 @@ static unsigned shmem_find_get_pages_and_swap(struct address_space *mapping,
 					pgoff_t start, unsigned int nr_pages,
 					struct page **pages, pgoff_t *indices)
 {
-	unsigned int i;
 	unsigned int ret;
 	unsigned int nr_found;
+	struct radix_tree_iter iter;
+	void **slot;
+
+	if (!nr_pages)
+		return 0;
 
 	rcu_read_lock();
 restart:
-	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
-				(void ***)pages, indices, start, nr_pages);
-	ret = 0;
-	for (i = 0; i < nr_found; i++) {
+	nr_found = ret = 0;
+	radix_tree_for_each_slot(slot, &mapping->page_tree, &iter, start) {
 		struct page *page;
 repeat:
-		page = radix_tree_deref_slot((void **)pages[i]);
+		page = radix_tree_deref_slot(slot);
 		if (unlikely(!page))
 			continue;
+		nr_found++;
 		if (radix_tree_exception(page)) {
 			if (radix_tree_deref_retry(page))
 				goto restart;
@@ -345,17 +348,19 @@ repeat:
 			goto repeat;
 
 		/* Has the page moved? */
-		if (unlikely(page != *((void **)pages[i]))) {
+		if (unlikely(page != *slot)) {
 			page_cache_release(page);
 			goto repeat;
 		}
 export:
-		indices[ret] = indices[i];
+		indices[ret] = iter.index;
 		pages[ret] = page;
-		ret++;
+		if (++ret == nr_pages)
+			goto out;
 	}
 	if (unlikely(!ret && nr_found))
 		goto restart;
+out:
 	rcu_read_unlock();
 	return ret;
 }


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 4/4] radix-tree: use iterators in find_get_pages* functions
@ 2012-02-07  7:55   ` Konstantin Khlebnikov
  0 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-07  7:55 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

This patch replaces radix_tree_gang_lookup_slot() and
radix_tree_gang_lookup_tag_slot() in page-cache lookup functions with brand-new
radix-tree direct iterating. This avoids the double-scanning and pointers copying.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/filemap.c |   75 ++++++++++++++++++++++++++++++++++------------------------
 mm/shmem.c   |   23 +++++++++++-------
 2 files changed, 58 insertions(+), 40 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index b662757..518223b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -811,23 +811,25 @@ EXPORT_SYMBOL(find_or_create_page);
 unsigned find_get_pages(struct address_space *mapping, pgoff_t start,
 			    unsigned int nr_pages, struct page **pages)
 {
-	unsigned int i;
 	unsigned int ret;
 	unsigned int nr_found, nr_skip;
+	struct radix_tree_iter iter;
+	void **slot;
+
+	if (!nr_pages)
+		return 0;
 
 	rcu_read_lock();
 restart:
-	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
-				(void ***)pages, NULL, start, nr_pages);
-	ret = 0;
-	nr_skip = 0;
-	for (i = 0; i < nr_found; i++) {
+	nr_found = nr_skip = ret = 0;
+	radix_tree_for_each_slot(slot, &mapping->page_tree, &iter, start) {
 		struct page *page;
 repeat:
-		page = radix_tree_deref_slot((void **)pages[i]);
+		page = radix_tree_deref_slot(slot);
 		if (unlikely(!page))
 			continue;
 
+		nr_found++;
 		if (radix_tree_exception(page)) {
 			if (radix_tree_deref_retry(page)) {
 				/*
@@ -835,7 +837,7 @@ repeat:
 				 * when entry at index 0 moves out of or back
 				 * to root: none yet gotten, safe to restart.
 				 */
-				WARN_ON(start | i);
+				WARN_ON(iter.index);
 				goto restart;
 			}
 			/*
@@ -851,13 +853,14 @@ repeat:
 			goto repeat;
 
 		/* Has the page moved? */
-		if (unlikely(page != *((void **)pages[i]))) {
+		if (unlikely(page != *slot)) {
 			page_cache_release(page);
 			goto repeat;
 		}
 
 		pages[ret] = page;
-		ret++;
+		if (++ret == nr_pages)
+			goto out;
 	}
 
 	/*
@@ -866,6 +869,7 @@ repeat:
 	 */
 	if (unlikely(!ret && nr_found > nr_skip))
 		goto restart;
+out:
 	rcu_read_unlock();
 	return ret;
 }
@@ -885,21 +889,23 @@ repeat:
 unsigned find_get_pages_contig(struct address_space *mapping, pgoff_t index,
 			       unsigned int nr_pages, struct page **pages)
 {
-	unsigned int i;
+	struct radix_tree_iter iter;
 	unsigned int ret;
-	unsigned int nr_found;
+	void **slot;
+
+	if (!nr_pages)
+		return 0;
 
 	rcu_read_lock();
 restart:
-	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
-				(void ***)pages, NULL, index, nr_pages);
 	ret = 0;
-	for (i = 0; i < nr_found; i++) {
+	radix_tree_for_each_contig(slot, &mapping->page_tree, &iter, index) {
 		struct page *page;
 repeat:
-		page = radix_tree_deref_slot((void **)pages[i]);
+		page = radix_tree_deref_slot(slot);
+		/* The hole, there no reason to continue */
 		if (unlikely(!page))
-			continue;
+			goto out;
 
 		if (radix_tree_exception(page)) {
 			if (radix_tree_deref_retry(page)) {
@@ -915,14 +921,14 @@ repeat:
 			 * here as an exceptional entry: so stop looking for
 			 * contiguous pages.
 			 */
-			break;
+			goto out;
 		}
 
 		if (!page_cache_get_speculative(page))
 			goto repeat;
 
 		/* Has the page moved? */
-		if (unlikely(page != *((void **)pages[i]))) {
+		if (unlikely(page != *slot)) {
 			page_cache_release(page);
 			goto repeat;
 		}
@@ -932,15 +938,16 @@ repeat:
 		 * otherwise we can get both false positives and false
 		 * negatives, which is just confusing to the caller.
 		 */
-		if (page->mapping == NULL || page->index != index) {
+		if (page->mapping == NULL || page->index != iter.index) {
 			page_cache_release(page);
-			break;
+			goto out;
 		}
 
 		pages[ret] = page;
-		ret++;
-		index++;
+		if (++ret == nr_pages)
+			goto out;
 	}
+out:
 	rcu_read_unlock();
 	return ret;
 }
@@ -960,22 +967,26 @@ EXPORT_SYMBOL(find_get_pages_contig);
 unsigned find_get_pages_tag(struct address_space *mapping, pgoff_t *index,
 			int tag, unsigned int nr_pages, struct page **pages)
 {
-	unsigned int i;
 	unsigned int ret;
 	unsigned int nr_found;
+	struct radix_tree_iter iter;
+	void **slot;
+
+	if (!nr_pages)
+		return 0;
 
 	rcu_read_lock();
 restart:
-	nr_found = radix_tree_gang_lookup_tag_slot(&mapping->page_tree,
-				(void ***)pages, *index, nr_pages, tag);
-	ret = 0;
-	for (i = 0; i < nr_found; i++) {
+	nr_found = ret = 0;
+	radix_tree_for_each_tagged(slot, &mapping->page_tree,
+				   &iter, *index, tag) {
 		struct page *page;
 repeat:
-		page = radix_tree_deref_slot((void **)pages[i]);
+		page = radix_tree_deref_slot(slot);
 		if (unlikely(!page))
 			continue;
 
+		nr_found++;
 		if (radix_tree_exception(page)) {
 			if (radix_tree_deref_retry(page)) {
 				/*
@@ -996,13 +1007,14 @@ repeat:
 			goto repeat;
 
 		/* Has the page moved? */
-		if (unlikely(page != *((void **)pages[i]))) {
+		if (unlikely(page != *slot)) {
 			page_cache_release(page);
 			goto repeat;
 		}
 
 		pages[ret] = page;
-		ret++;
+		if (++ret == nr_pages)
+			goto out;
 	}
 
 	/*
@@ -1011,6 +1023,7 @@ repeat:
 	 */
 	if (unlikely(!ret && nr_found))
 		goto restart;
+out:
 	rcu_read_unlock();
 
 	if (ret)
diff --git a/mm/shmem.c b/mm/shmem.c
index 269d049..5033319 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -316,21 +316,24 @@ static unsigned shmem_find_get_pages_and_swap(struct address_space *mapping,
 					pgoff_t start, unsigned int nr_pages,
 					struct page **pages, pgoff_t *indices)
 {
-	unsigned int i;
 	unsigned int ret;
 	unsigned int nr_found;
+	struct radix_tree_iter iter;
+	void **slot;
+
+	if (!nr_pages)
+		return 0;
 
 	rcu_read_lock();
 restart:
-	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
-				(void ***)pages, indices, start, nr_pages);
-	ret = 0;
-	for (i = 0; i < nr_found; i++) {
+	nr_found = ret = 0;
+	radix_tree_for_each_slot(slot, &mapping->page_tree, &iter, start) {
 		struct page *page;
 repeat:
-		page = radix_tree_deref_slot((void **)pages[i]);
+		page = radix_tree_deref_slot(slot);
 		if (unlikely(!page))
 			continue;
+		nr_found++;
 		if (radix_tree_exception(page)) {
 			if (radix_tree_deref_retry(page))
 				goto restart;
@@ -345,17 +348,19 @@ repeat:
 			goto repeat;
 
 		/* Has the page moved? */
-		if (unlikely(page != *((void **)pages[i]))) {
+		if (unlikely(page != *slot)) {
 			page_cache_release(page);
 			goto repeat;
 		}
 export:
-		indices[ret] = indices[i];
+		indices[ret] = iter.index;
 		pages[ret] = page;
-		ret++;
+		if (++ret == nr_pages)
+			goto out;
 	}
 	if (unlikely(!ret && nr_found))
 		goto restart;
+out:
 	rcu_read_unlock();
 	return ret;
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/4] bitops: implement "optimized" __find_next_bit()
  2012-02-07  7:55   ` Konstantin Khlebnikov
@ 2012-02-07 19:36     ` Linus Torvalds
  -1 siblings, 0 replies; 30+ messages in thread
From: Linus Torvalds @ 2012-02-07 19:36 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel

On Mon, Feb 6, 2012 at 11:55 PM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
> This patch adds  __find_next_bit() -- static-inline variant of find_next_bit()
> optimized for small constant size arrays, because find_next_bit() is too heavy
> for searching in an array with one/two long elements.
> And unlike to find_next_bit() it does not mask tail bits.

Does anybody else really want this?  My gut feel is that this
shouldn't be inline at all (the same is largely true of the existing
ones), and that nobody else really wants this. Nor do we want to
introduce yet another helper function that has very subtly different
semantics that will just confuse people.

So I suspect this should be instead a function that is internal to the
iterator code.

                  Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/4] bitops: implement "optimized" __find_next_bit()
@ 2012-02-07 19:36     ` Linus Torvalds
  0 siblings, 0 replies; 30+ messages in thread
From: Linus Torvalds @ 2012-02-07 19:36 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel

On Mon, Feb 6, 2012 at 11:55 PM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
> This patch adds  __find_next_bit() -- static-inline variant of find_next_bit()
> optimized for small constant size arrays, because find_next_bit() is too heavy
> for searching in an array with one/two long elements.
> And unlike to find_next_bit() it does not mask tail bits.

Does anybody else really want this?  My gut feel is that this
shouldn't be inline at all (the same is largely true of the existing
ones), and that nobody else really wants this. Nor do we want to
introduce yet another helper function that has very subtly different
semantics that will just confuse people.

So I suspect this should be instead a function that is internal to the
iterator code.

                  Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
  2012-02-07  7:54 ` Konstantin Khlebnikov
@ 2012-02-07 19:38   ` Linus Torvalds
  -1 siblings, 0 replies; 30+ messages in thread
From: Linus Torvalds @ 2012-02-07 19:38 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel

On Mon, Feb 6, 2012 at 11:54 PM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
> This patchset implements common radix-tree iteration routine and
> reworks page-cache lookup functions with using it.

So what's the advantage? Both the line counts and the bloat-o-meter
seems to imply this is all just bad.

I assume there is some upside to it, but you really don't make that
obvious, so why should anybody ever waste even a second of time
looking at this?

                  Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-02-07 19:38   ` Linus Torvalds
  0 siblings, 0 replies; 30+ messages in thread
From: Linus Torvalds @ 2012-02-07 19:38 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel

On Mon, Feb 6, 2012 at 11:54 PM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
> This patchset implements common radix-tree iteration routine and
> reworks page-cache lookup functions with using it.

So what's the advantage? Both the line counts and the bloat-o-meter
seems to imply this is all just bad.

I assume there is some upside to it, but you really don't make that
obvious, so why should anybody ever waste even a second of time
looking at this?

                  Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
  2012-02-07 19:38   ` Linus Torvalds
@ 2012-02-08  1:30     ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-08  1:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel

Linus Torvalds wrote:
> On Mon, Feb 6, 2012 at 11:54 PM, Konstantin Khlebnikov
> <khlebnikov@openvz.org>  wrote:
>> This patchset implements common radix-tree iteration routine and
>> reworks page-cache lookup functions with using it.
>
> So what's the advantage? Both the line counts and the bloat-o-meter
> seems to imply this is all just bad.

If do not count comments here actually is negative line count change.
And if drop (almost) unused radix_tree_gang_lookup_tag_slot() and
radix_tree_gang_lookup_slot() total bloat-o-meter score becomes negative too.
There also some shrinkable stuff in shmem code.
So, if this is really a problem it is fixable. I just didn't want to bloat patchset.

>
> I assume there is some upside to it, but you really don't make that
> obvious, so why should anybody ever waste even a second of time
> looking at this?

Hmm, from my point of view, this unified iterator code looks cleaner than
current duplicated radix-tree code. That's why I was titled it "cleanup".

There also some simple bit-hacks: find-next-bit instead of dumb loops in tagged-lookup.

Here some benchmark results: there is radix-tree with 1024 slots, I fill and tag every <step> slot,
and run lookup for all slots with radix_tree_gang_lookup() and radix_tree_gang_lookup_tag() in the loop.
old/new rows -- nsec per iteration over whole tree.

tagged-lookup
step	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
old	7035	5248	4742	4308	4217	4133	4030	3920	4038	3933	3914	3796	3851	3755	3819	3582
new	3578	2617	1899	1426	1220	1058	936	822	845	749	695	679	648	575	591	509

so, new tagged-lookup always faster, especially for sparse trees.

normal-lookup
step	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
old	4156	2641	2068	1837	1630	1531	1467	1415	1373	1398	1333	1323	1308	1280	1236	1249
new	3048	2520	2429	2280	2266	2296	2215	2219	2188	2170	2218	2332	2166	2096	2100	2058

New normal lookup works faster for dense trees, on sparse trees it slower.
Looks like it caused by struct radix_tree_iter aliasing,
gcc cannot effectively use its field as loop counter in nested-loop.
But how important is this? Anyway, I'll think how to fix this misfortune.


>
>                    Linus


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-02-08  1:30     ` Konstantin Khlebnikov
  0 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-08  1:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel

Linus Torvalds wrote:
> On Mon, Feb 6, 2012 at 11:54 PM, Konstantin Khlebnikov
> <khlebnikov@openvz.org>  wrote:
>> This patchset implements common radix-tree iteration routine and
>> reworks page-cache lookup functions with using it.
>
> So what's the advantage? Both the line counts and the bloat-o-meter
> seems to imply this is all just bad.

If do not count comments here actually is negative line count change.
And if drop (almost) unused radix_tree_gang_lookup_tag_slot() and
radix_tree_gang_lookup_slot() total bloat-o-meter score becomes negative too.
There also some shrinkable stuff in shmem code.
So, if this is really a problem it is fixable. I just didn't want to bloat patchset.

>
> I assume there is some upside to it, but you really don't make that
> obvious, so why should anybody ever waste even a second of time
> looking at this?

Hmm, from my point of view, this unified iterator code looks cleaner than
current duplicated radix-tree code. That's why I was titled it "cleanup".

There also some simple bit-hacks: find-next-bit instead of dumb loops in tagged-lookup.

Here some benchmark results: there is radix-tree with 1024 slots, I fill and tag every <step> slot,
and run lookup for all slots with radix_tree_gang_lookup() and radix_tree_gang_lookup_tag() in the loop.
old/new rows -- nsec per iteration over whole tree.

tagged-lookup
step	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
old	7035	5248	4742	4308	4217	4133	4030	3920	4038	3933	3914	3796	3851	3755	3819	3582
new	3578	2617	1899	1426	1220	1058	936	822	845	749	695	679	648	575	591	509

so, new tagged-lookup always faster, especially for sparse trees.

normal-lookup
step	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
old	4156	2641	2068	1837	1630	1531	1467	1415	1373	1398	1333	1323	1308	1280	1236	1249
new	3048	2520	2429	2280	2266	2296	2215	2219	2188	2170	2218	2332	2166	2096	2100	2058

New normal lookup works faster for dense trees, on sparse trees it slower.
Looks like it caused by struct radix_tree_iter aliasing,
gcc cannot effectively use its field as loop counter in nested-loop.
But how important is this? Anyway, I'll think how to fix this misfortune.


>
>                    Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
  2012-02-08  1:30     ` Konstantin Khlebnikov
@ 2012-02-08  1:50       ` Linus Torvalds
  -1 siblings, 0 replies; 30+ messages in thread
From: Linus Torvalds @ 2012-02-08  1:50 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel

On Tue, Feb 7, 2012 at 5:30 PM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
>
> If do not count comments here actually is negative line count change.

Ok, fair enough.

> And if drop (almost) unused radix_tree_gang_lookup_tag_slot() and
> radix_tree_gang_lookup_slot() total bloat-o-meter score becomes negative
> too.

Good.

> There also some simple bit-hacks: find-next-bit instead of dumb loops in
> tagged-lookup.
>
> Here some benchmark results: there is radix-tree with 1024 slots, I fill and
> tag every <step> slot,
> and run lookup for all slots with radix_tree_gang_lookup() and
> radix_tree_gang_lookup_tag() in the loop.
> old/new rows -- nsec per iteration over whole tree.
>
> tagged-lookup
> step    1       2       3       4       5       6       7       8       9       10      11      12      13      14      15      16
> old     7035    5248    4742    4308    4217    4133    4030    3920    4038    3933    3914    3796    3851    3755    3819    3582
> new     3578    2617    1899    1426    1220    1058    936     822     845     749     695     679     648     575     591     509
>
> so, new tagged-lookup always faster, especially for sparse trees.

Do you have any benchmarks when it's actually used by higher levels,
though? I guess that will involve find_get_pages(), and we don't have
all that any of them, but it would be lovely to see some real load
(even if it is limited to one of the filesystems that uses this)
numbers too..

> New normal lookup works faster for dense trees, on sparse trees it slower.

I think that should be the common case, so that may be fine. Again, it
would be nice to see numbers that are for something else than just the
lookup - an actual use of it in some real context.

Anyway, the patches themselves looked fine to me, modulo the fact that
I wasn't all that happy with the new __find_next_bit, and I think it's
better to not expose it in a generic header file. But I would really
like to see more "real" numbers for the series

Thanks,

                   Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-02-08  1:50       ` Linus Torvalds
  0 siblings, 0 replies; 30+ messages in thread
From: Linus Torvalds @ 2012-02-08  1:50 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel

On Tue, Feb 7, 2012 at 5:30 PM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
>
> If do not count comments here actually is negative line count change.

Ok, fair enough.

> And if drop (almost) unused radix_tree_gang_lookup_tag_slot() and
> radix_tree_gang_lookup_slot() total bloat-o-meter score becomes negative
> too.

Good.

> There also some simple bit-hacks: find-next-bit instead of dumb loops in
> tagged-lookup.
>
> Here some benchmark results: there is radix-tree with 1024 slots, I fill and
> tag every <step> slot,
> and run lookup for all slots with radix_tree_gang_lookup() and
> radix_tree_gang_lookup_tag() in the loop.
> old/new rows -- nsec per iteration over whole tree.
>
> tagged-lookup
> step    1       2       3       4       5       6       7       8       9       10      11      12      13      14      15      16
> old     7035    5248    4742    4308    4217    4133    4030    3920    4038    3933    3914    3796    3851    3755    3819    3582
> new     3578    2617    1899    1426    1220    1058    936     822     845     749     695     679     648     575     591     509
>
> so, new tagged-lookup always faster, especially for sparse trees.

Do you have any benchmarks when it's actually used by higher levels,
though? I guess that will involve find_get_pages(), and we don't have
all that any of them, but it would be lovely to see some real load
(even if it is limited to one of the filesystems that uses this)
numbers too..

> New normal lookup works faster for dense trees, on sparse trees it slower.

I think that should be the common case, so that may be fine. Again, it
would be nice to see numbers that are for something else than just the
lookup - an actual use of it in some real context.

Anyway, the patches themselves looked fine to me, modulo the fact that
I wasn't all that happy with the new __find_next_bit, and I think it's
better to not expose it in a generic header file. But I would really
like to see more "real" numbers for the series

Thanks,

                   Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
  2012-02-08  1:50       ` Linus Torvalds
@ 2012-02-08 21:31         ` Dave Chinner
  -1 siblings, 0 replies; 30+ messages in thread
From: Dave Chinner @ 2012-02-08 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, Hugh Dickins,
	linux-kernel

On Tue, Feb 07, 2012 at 05:50:59PM -0800, Linus Torvalds wrote:
> On Tue, Feb 7, 2012 at 5:30 PM, Konstantin Khlebnikov
> <khlebnikov@openvz.org> wrote:
> >
> > If do not count comments here actually is negative line count change.
> 
> Ok, fair enough.
> 
> > And if drop (almost) unused radix_tree_gang_lookup_tag_slot() and
> > radix_tree_gang_lookup_slot() total bloat-o-meter score becomes negative
> > too.
> 
> Good.
> 
> > There also some simple bit-hacks: find-next-bit instead of dumb loops in
> > tagged-lookup.
> >
> > Here some benchmark results: there is radix-tree with 1024 slots, I fill and
> > tag every <step> slot,
> > and run lookup for all slots with radix_tree_gang_lookup() and
> > radix_tree_gang_lookup_tag() in the loop.
> > old/new rows -- nsec per iteration over whole tree.
> >
> > tagged-lookup
> > step    1    2     3     4     5     6     7     8     9     10    11    12    13    14    15    16
> > old   7035  5248  4742  4308  4217  4133  4030  3920  4038  3933  3914  3796  3851  3755  3819  3582
> > new   3578  2617  1899  1426  1220  1058  936   822   845   749   695   679   648   575   591   509
> >
> > so, new tagged-lookup always faster, especially for sparse trees.
> 
> Do you have any benchmarks when it's actually used by higher levels,
> though? I guess that will involve find_get_pages(), and we don't have
> all that any of them, but it would be lovely to see some real load
> (even if it is limited to one of the filesystems that uses this)
> numbers too..

It's also a very small tree size to test - 1024 slots is only
4MB of page cache data, but we regularly cache GBs of pages in
a single tree.

> > New normal lookup works faster for dense trees, on sparse trees it slower.

Testing large trees (in the millions of entries) might show
different results - I'd be interested to see the difference there
given that iterating large trees is very common (e.g. in the
writeback code)....

> I think that should be the common case, so that may be fine. Again, it
> would be nice to see numbers that are for something else than just the
> lookup - an actual use of it in some real context.

XFS also uses radix trees for it's inode caches and AG indexes. We
iterate those trees by both normal and tagged lookups in different
contexts, but it is extremely difficult to isolate the tree
traversal from everything else that is going on around them, so I
can't really help with a microbenchmark there...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-02-08 21:31         ` Dave Chinner
  0 siblings, 0 replies; 30+ messages in thread
From: Dave Chinner @ 2012-02-08 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, Hugh Dickins,
	linux-kernel

On Tue, Feb 07, 2012 at 05:50:59PM -0800, Linus Torvalds wrote:
> On Tue, Feb 7, 2012 at 5:30 PM, Konstantin Khlebnikov
> <khlebnikov@openvz.org> wrote:
> >
> > If do not count comments here actually is negative line count change.
> 
> Ok, fair enough.
> 
> > And if drop (almost) unused radix_tree_gang_lookup_tag_slot() and
> > radix_tree_gang_lookup_slot() total bloat-o-meter score becomes negative
> > too.
> 
> Good.
> 
> > There also some simple bit-hacks: find-next-bit instead of dumb loops in
> > tagged-lookup.
> >
> > Here some benchmark results: there is radix-tree with 1024 slots, I fill and
> > tag every <step> slot,
> > and run lookup for all slots with radix_tree_gang_lookup() and
> > radix_tree_gang_lookup_tag() in the loop.
> > old/new rows -- nsec per iteration over whole tree.
> >
> > tagged-lookup
> > step    1    2     3     4     5     6     7     8     9     10    11    12    13    14    15    16
> > old   7035  5248  4742  4308  4217  4133  4030  3920  4038  3933  3914  3796  3851  3755  3819  3582
> > new   3578  2617  1899  1426  1220  1058  936   822   845   749   695   679   648   575   591   509
> >
> > so, new tagged-lookup always faster, especially for sparse trees.
> 
> Do you have any benchmarks when it's actually used by higher levels,
> though? I guess that will involve find_get_pages(), and we don't have
> all that any of them, but it would be lovely to see some real load
> (even if it is limited to one of the filesystems that uses this)
> numbers too..

It's also a very small tree size to test - 1024 slots is only
4MB of page cache data, but we regularly cache GBs of pages in
a single tree.

> > New normal lookup works faster for dense trees, on sparse trees it slower.

Testing large trees (in the millions of entries) might show
different results - I'd be interested to see the difference there
given that iterating large trees is very common (e.g. in the
writeback code)....

> I think that should be the common case, so that may be fine. Again, it
> would be nice to see numbers that are for something else than just the
> lookup - an actual use of it in some real context.

XFS also uses radix trees for it's inode caches and AG indexes. We
iterate those trees by both normal and tagged lookups in different
contexts, but it is extremely difficult to isolate the tree
traversal from everything else that is going on around them, so I
can't really help with a microbenchmark there...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
  2012-02-07  7:54 ` Konstantin Khlebnikov
@ 2012-03-14  7:36   ` Christoph Hellwig
  -1 siblings, 0 replies; 30+ messages in thread
From: Christoph Hellwig @ 2012-03-14  7:36 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

Any updates on this series?

On Tue, Feb 07, 2012 at 11:54:56AM +0400, Konstantin Khlebnikov wrote:
> This patchset implements common radix-tree iteration routine and
> reworks page-cache lookup functions with using it.
> 
> radix_tree_gang_lookup_*slot() now mostly unused (the last user somethere in
> drivers/sh/intc/virq.c), but they are exported, we cannot remove them for now.
> 
> Also there some shmem-related radix-tree hacks can be reworked,
> radix_tree_locate_item() can be removed. I already have a few extra patches.
> 
> And as usual my lovely bloat-o-meter:
> 
> add/remove: 4/3 grow/shrink: 4/4 up/down: 1232/-964 (268)
> function                                     old     new   delta
> radix_tree_next_chunk                          -     499    +499
> static.shmem_find_get_pages_and_swap           -     404    +404
> find_get_pages_tag                           354     488    +134
> find_get_pages                               362     438     +76
> find_get_pages_contig                        345     407     +62
> __kstrtab_radix_tree_next_chunk                -      22     +22
> shmem_truncate_range                        1633    1652     +19
> __ksymtab_radix_tree_next_chunk                -      16     +16
> radix_tree_gang_lookup_tag_slot              208     180     -28
> radix_tree_gang_lookup_tag                   247     207     -40
> radix_tree_gang_lookup_slot                  204     162     -42
> radix_tree_gang_lookup                       231     160     -71
> __lookup                                     217       -    -217
> __lookup_tag                                 242       -    -242
> shmem_find_get_pages_and_swap                324       -    -324
> 
> ---
> 
> Konstantin Khlebnikov (4):
>       bitops: implement "optimized" __find_next_bit()
>       radix-tree: introduce bit-optimized iterator
>       radix-tree: rewrite gang lookup with using iterator
>       radix-tree: use iterators in find_get_pages* functions
> 
> 
>  include/asm-generic/bitops/find.h |   36 +++
>  include/linux/radix-tree.h        |  129 +++++++++++
>  lib/radix-tree.c                  |  422 +++++++++++++------------------------
>  mm/filemap.c                      |   75 ++++---
>  mm/shmem.c                        |   23 +-
>  5 files changed, 375 insertions(+), 310 deletions(-)
> 
> -- 
> Signature
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
---end quoted text---

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-03-14  7:36   ` Christoph Hellwig
  0 siblings, 0 replies; 30+ messages in thread
From: Christoph Hellwig @ 2012-03-14  7:36 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

Any updates on this series?

On Tue, Feb 07, 2012 at 11:54:56AM +0400, Konstantin Khlebnikov wrote:
> This patchset implements common radix-tree iteration routine and
> reworks page-cache lookup functions with using it.
> 
> radix_tree_gang_lookup_*slot() now mostly unused (the last user somethere in
> drivers/sh/intc/virq.c), but they are exported, we cannot remove them for now.
> 
> Also there some shmem-related radix-tree hacks can be reworked,
> radix_tree_locate_item() can be removed. I already have a few extra patches.
> 
> And as usual my lovely bloat-o-meter:
> 
> add/remove: 4/3 grow/shrink: 4/4 up/down: 1232/-964 (268)
> function                                     old     new   delta
> radix_tree_next_chunk                          -     499    +499
> static.shmem_find_get_pages_and_swap           -     404    +404
> find_get_pages_tag                           354     488    +134
> find_get_pages                               362     438     +76
> find_get_pages_contig                        345     407     +62
> __kstrtab_radix_tree_next_chunk                -      22     +22
> shmem_truncate_range                        1633    1652     +19
> __ksymtab_radix_tree_next_chunk                -      16     +16
> radix_tree_gang_lookup_tag_slot              208     180     -28
> radix_tree_gang_lookup_tag                   247     207     -40
> radix_tree_gang_lookup_slot                  204     162     -42
> radix_tree_gang_lookup                       231     160     -71
> __lookup                                     217       -    -217
> __lookup_tag                                 242       -    -242
> shmem_find_get_pages_and_swap                324       -    -324
> 
> ---
> 
> Konstantin Khlebnikov (4):
>       bitops: implement "optimized" __find_next_bit()
>       radix-tree: introduce bit-optimized iterator
>       radix-tree: rewrite gang lookup with using iterator
>       radix-tree: use iterators in find_get_pages* functions
> 
> 
>  include/asm-generic/bitops/find.h |   36 +++
>  include/linux/radix-tree.h        |  129 +++++++++++
>  lib/radix-tree.c                  |  422 +++++++++++++------------------------
>  mm/filemap.c                      |   75 ++++---
>  mm/shmem.c                        |   23 +-
>  5 files changed, 375 insertions(+), 310 deletions(-)
> 
> -- 
> Signature
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
---end quoted text---

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
  2012-03-14  7:36   ` Christoph Hellwig
@ 2012-03-14  7:49     ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-14  7:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

Christoph Hellwig wrote:
> Any updates on this series?

I had sent "[PATCH v2 0/3] radix-tree: general iterator" February 10, there is no more updates after that.
I just checked v2 on top "next-20120314" -- looks like all ok.

>
> On Tue, Feb 07, 2012 at 11:54:56AM +0400, Konstantin Khlebnikov wrote:
>> This patchset implements common radix-tree iteration routine and
>> reworks page-cache lookup functions with using it.
>>
>> radix_tree_gang_lookup_*slot() now mostly unused (the last user somethere in
>> drivers/sh/intc/virq.c), but they are exported, we cannot remove them for now.
>>
>> Also there some shmem-related radix-tree hacks can be reworked,
>> radix_tree_locate_item() can be removed. I already have a few extra patches.
>>
>> And as usual my lovely bloat-o-meter:
>>
>> add/remove: 4/3 grow/shrink: 4/4 up/down: 1232/-964 (268)
>> function                                     old     new   delta
>> radix_tree_next_chunk                          -     499    +499
>> static.shmem_find_get_pages_and_swap           -     404    +404
>> find_get_pages_tag                           354     488    +134
>> find_get_pages                               362     438     +76
>> find_get_pages_contig                        345     407     +62
>> __kstrtab_radix_tree_next_chunk                -      22     +22
>> shmem_truncate_range                        1633    1652     +19
>> __ksymtab_radix_tree_next_chunk                -      16     +16
>> radix_tree_gang_lookup_tag_slot              208     180     -28
>> radix_tree_gang_lookup_tag                   247     207     -40
>> radix_tree_gang_lookup_slot                  204     162     -42
>> radix_tree_gang_lookup                       231     160     -71
>> __lookup                                     217       -    -217
>> __lookup_tag                                 242       -    -242
>> shmem_find_get_pages_and_swap                324       -    -324
>>
>> ---
>>
>> Konstantin Khlebnikov (4):
>>        bitops: implement "optimized" __find_next_bit()
>>        radix-tree: introduce bit-optimized iterator
>>        radix-tree: rewrite gang lookup with using iterator
>>        radix-tree: use iterators in find_get_pages* functions
>>
>>
>>   include/asm-generic/bitops/find.h |   36 +++
>>   include/linux/radix-tree.h        |  129 +++++++++++
>>   lib/radix-tree.c                  |  422 +++++++++++++------------------------
>>   mm/filemap.c                      |   75 ++++---
>>   mm/shmem.c                        |   23 +-
>>   5 files changed, 375 insertions(+), 310 deletions(-)
>>
>> --
>> Signature
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
>> Don't email:<a href=mailto:"dont@kvack.org">  email@kvack.org</a>
> ---end quoted text---


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-03-14  7:49     ` Konstantin Khlebnikov
  0 siblings, 0 replies; 30+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-14  7:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-mm, Andrew Morton, Hugh Dickins, Linus Torvalds, linux-kernel

Christoph Hellwig wrote:
> Any updates on this series?

I had sent "[PATCH v2 0/3] radix-tree: general iterator" February 10, there is no more updates after that.
I just checked v2 on top "next-20120314" -- looks like all ok.

>
> On Tue, Feb 07, 2012 at 11:54:56AM +0400, Konstantin Khlebnikov wrote:
>> This patchset implements common radix-tree iteration routine and
>> reworks page-cache lookup functions with using it.
>>
>> radix_tree_gang_lookup_*slot() now mostly unused (the last user somethere in
>> drivers/sh/intc/virq.c), but they are exported, we cannot remove them for now.
>>
>> Also there some shmem-related radix-tree hacks can be reworked,
>> radix_tree_locate_item() can be removed. I already have a few extra patches.
>>
>> And as usual my lovely bloat-o-meter:
>>
>> add/remove: 4/3 grow/shrink: 4/4 up/down: 1232/-964 (268)
>> function                                     old     new   delta
>> radix_tree_next_chunk                          -     499    +499
>> static.shmem_find_get_pages_and_swap           -     404    +404
>> find_get_pages_tag                           354     488    +134
>> find_get_pages                               362     438     +76
>> find_get_pages_contig                        345     407     +62
>> __kstrtab_radix_tree_next_chunk                -      22     +22
>> shmem_truncate_range                        1633    1652     +19
>> __ksymtab_radix_tree_next_chunk                -      16     +16
>> radix_tree_gang_lookup_tag_slot              208     180     -28
>> radix_tree_gang_lookup_tag                   247     207     -40
>> radix_tree_gang_lookup_slot                  204     162     -42
>> radix_tree_gang_lookup                       231     160     -71
>> __lookup                                     217       -    -217
>> __lookup_tag                                 242       -    -242
>> shmem_find_get_pages_and_swap                324       -    -324
>>
>> ---
>>
>> Konstantin Khlebnikov (4):
>>        bitops: implement "optimized" __find_next_bit()
>>        radix-tree: introduce bit-optimized iterator
>>        radix-tree: rewrite gang lookup with using iterator
>>        radix-tree: use iterators in find_get_pages* functions
>>
>>
>>   include/asm-generic/bitops/find.h |   36 +++
>>   include/linux/radix-tree.h        |  129 +++++++++++
>>   lib/radix-tree.c                  |  422 +++++++++++++------------------------
>>   mm/filemap.c                      |   75 ++++---
>>   mm/shmem.c                        |   23 +-
>>   5 files changed, 375 insertions(+), 310 deletions(-)
>>
>> --
>> Signature
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
>> Don't email:<a href=mailto:"dont@kvack.org">  email@kvack.org</a>
> ---end quoted text---

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
  2012-03-14  7:49     ` Konstantin Khlebnikov
@ 2012-03-14  7:51       ` Christoph Hellwig
  -1 siblings, 0 replies; 30+ messages in thread
From: Christoph Hellwig @ 2012-03-14  7:51 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Christoph Hellwig, linux-mm, Andrew Morton, Hugh Dickins,
	Linus Torvalds, linux-kernel

On Wed, Mar 14, 2012 at 11:49:21AM +0400, Konstantin Khlebnikov wrote:
> Christoph Hellwig wrote:
> >Any updates on this series?
> 
> I had sent "[PATCH v2 0/3] radix-tree: general iterator" February 10, there is no more updates after that.
> I just checked v2 on top "next-20120314" -- looks like all ok.

this was more a question to the MM maintainers if this is getting
merged or if there were any further comments.

We'd really like to have this interface to simplify some code in XFS.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-03-14  7:51       ` Christoph Hellwig
  0 siblings, 0 replies; 30+ messages in thread
From: Christoph Hellwig @ 2012-03-14  7:51 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Christoph Hellwig, linux-mm, Andrew Morton, Hugh Dickins,
	Linus Torvalds, linux-kernel

On Wed, Mar 14, 2012 at 11:49:21AM +0400, Konstantin Khlebnikov wrote:
> Christoph Hellwig wrote:
> >Any updates on this series?
> 
> I had sent "[PATCH v2 0/3] radix-tree: general iterator" February 10, there is no more updates after that.
> I just checked v2 on top "next-20120314" -- looks like all ok.

this was more a question to the MM maintainers if this is getting
merged or if there were any further comments.

We'd really like to have this interface to simplify some code in XFS.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
  2012-03-14  7:51       ` Christoph Hellwig
@ 2012-03-14 19:36         ` Hugh Dickins
  -1 siblings, 0 replies; 30+ messages in thread
From: Hugh Dickins @ 2012-03-14 19:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Konstantin Khlebnikov, Andrew Morton, Linus Torvalds,
	linux-kernel, linux-mm

On Wed, 14 Mar 2012, Christoph Hellwig wrote:
> On Wed, Mar 14, 2012 at 11:49:21AM +0400, Konstantin Khlebnikov wrote:
> > Christoph Hellwig wrote:
> > >Any updates on this series?
> > 
> > I had sent "[PATCH v2 0/3] radix-tree: general iterator" February 10, there is no more updates after that.
> > I just checked v2 on top "next-20120314" -- looks like all ok.
> 
> this was more a question to the MM maintainers if this is getting
> merged or if there were any further comments.

I haven't studied the code at all - I'm afraid Konstantin is rather
more productive than I can keep up with, and other bugs and patches
appeared to be more urgent and important.

But I have included that work in most of my testing for four weeks now,
and observed no problems whatever from it.  And I made a patch for the
radix-tree test harness which akpm curates, to update its radix-tree.c
to Konstantin's: those tests ran perfectly on 64-bit and on 32-bit.
That patch to rtth appended below.

(I do have, or expect to have once I study them, reservations about
his subsequent changes to radix-tree usage in mm/shmem.c; and even
if I end up agreeing with his changes, would prefer to hold them off
until after the tmpfs fallocation mods are in - other work which
had to yield to higher priorities, ready but not yet commented.)

> 
> We'd really like to have this interface to simplify some code in XFS.

That's useful info: let's raise its priority, and hope that someone
faster than me (not a narrow category...) gets to review it.

Hugh

[PATCH] rtth: update to KK's radix-tree iterator

Signed-off-by: Hugh Dickins <hughd@google.com>

--- rtth/Makefile.0	2012-02-17 19:38:02.611537166 -0800
+++ rtth/Makefile	2012-02-17 20:32:31.351614457 -0800
@@ -2,7 +2,8 @@
 CFLAGS += -I. -g -Wall -D_LGPL_SOURCE
 LDFLAGS += -lpthread -lurcu
 TARGETS = main
-OFILES = main.o radix-tree.o linux.o test.o tag_check.o regression1.o regression2.o
+OFILES = main.o radix-tree.o linux.o test.o tag_check.o find_next_bit.o \
+	 regression1.o regression2.o
 
 targets: $(TARGETS)
 
--- rtth/find_next_bit.c.0	2012-02-17 21:07:53.556625068 -0800
+++ rtth/find_next_bit.c	2012-02-17 20:34:54.159617936 -0800
@@ -0,0 +1,57 @@
+/* find_next_bit.c: fallback find next bit implementation
+ *
+ * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/types.h>
+#include <linux/bitops.h>
+
+#define BITOP_WORD(nr)		((nr) / BITS_PER_LONG)
+
+/*
+ * Find the next set bit in a memory region.
+ */
+unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
+			    unsigned long offset)
+{
+	const unsigned long *p = addr + BITOP_WORD(offset);
+	unsigned long result = offset & ~(BITS_PER_LONG-1);
+	unsigned long tmp;
+
+	if (offset >= size)
+		return size;
+	size -= result;
+	offset %= BITS_PER_LONG;
+	if (offset) {
+		tmp = *(p++);
+		tmp &= (~0UL << offset);
+		if (size < BITS_PER_LONG)
+			goto found_first;
+		if (tmp)
+			goto found_middle;
+		size -= BITS_PER_LONG;
+		result += BITS_PER_LONG;
+	}
+	while (size & ~(BITS_PER_LONG-1)) {
+		if ((tmp = *(p++)))
+			goto found_middle;
+		result += BITS_PER_LONG;
+		size -= BITS_PER_LONG;
+	}
+	if (!size)
+		return result;
+	tmp = *p;
+
+found_first:
+	tmp &= (~0UL >> (BITS_PER_LONG - size));
+	if (tmp == 0UL)		/* Are any bits set? */
+		return result + size;	/* Nope. */
+found_middle:
+	return result + __ffs(tmp);
+}
--- rtth/linux/bitops.h.0	2012-02-17 19:38:02.615537110 -0800
+++ rtth/linux/bitops.h	2012-02-17 20:36:19.431619931 -0800
@@ -108,4 +108,43 @@ static inline int test_bit(int nr, const
 	return 1UL & (addr[BITOP_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
 }
 
+/**
+ * __ffs - find first bit in word.
+ * @word: The word to search
+ *
+ * Undefined if no bit exists, so code should check against 0 first.
+ */
+static inline unsigned long __ffs(unsigned long word)
+{
+	int num = 0;
+
+	if ((word & 0xffffffff) == 0) {
+		num += 32;
+		word >>= 32;
+	}
+	if ((word & 0xffff) == 0) {
+		num += 16;
+		word >>= 16;
+	}
+	if ((word & 0xff) == 0) {
+		num += 8;
+		word >>= 8;
+	}
+	if ((word & 0xf) == 0) {
+		num += 4;
+		word >>= 4;
+	}
+	if ((word & 0x3) == 0) {
+		num += 2;
+		word >>= 2;
+	}
+	if ((word & 0x1) == 0)
+		num += 1;
+	return num;
+}
+
+unsigned long find_next_bit(const unsigned long *addr,
+			    unsigned long size,
+			    unsigned long offset);
+
 #endif /* _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
--- rtth/linux/kernel.h.0	2010-08-25 13:30:45.000000000 -0700
+++ rtth/linux/kernel.h	2012-02-17 20:21:04.607598241 -0800
@@ -14,6 +14,7 @@
 #define panic(expr)
 #define printk printf
 #define __force
+#define likely(c) (c)
 #define unlikely(c) (c)
 #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
 
--- rtth/linux/radix-tree.h.0	2012-02-17 19:38:02.615537110 -0800
+++ rtth/linux/radix-tree.h	2012-02-17 20:21:36.911599007 -0800
@@ -2,6 +2,7 @@
  * Copyright (C) 2001 Momchil Velikov
  * Portions Copyright (C) 2001 Christoph Hellwig
  * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
@@ -23,6 +24,7 @@
 #include <linux/preempt.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
+#include <linux/bitops.h>
 #include <linux/rcupdate.h>
 
 /*
@@ -258,4 +260,142 @@ static inline void radix_tree_preload_en
 	preempt_enable();
 }
 
+struct radix_tree_iter {
+	unsigned long	index;		/* current index, do not overflow it */
+	unsigned long	next_index;	/* next-to-last index for this chunk */
+	unsigned long	tags;		/* bitmask for tag-iterating */
+};
+
+#define RADIX_TREE_ITER_TAG_MASK	0x00FF	/* tag index in lower byte */
+#define RADIX_TREE_ITER_TAGGED		0x0100	/* lookup tagged slots */
+#define RADIX_TREE_ITER_CONTIG		0x0200	/* stop at first hole */
+
+void **radix_tree_next_chunk(struct radix_tree_root *root,
+			     struct radix_tree_iter *iter, unsigned flags);
+
+static inline
+void **radix_tree_iter_init(struct radix_tree_iter *iter, unsigned long start)
+{
+	iter->index = 0; /* to bypass next_index overflow protection */
+	iter->next_index = start;
+	return NULL;
+}
+
+static inline unsigned long radix_tree_chunk_size(struct radix_tree_iter *iter)
+{
+	return iter->next_index - iter->index;
+}
+
+/**
+ * radix_tree_next_slot - find next slot in chunk
+ *
+ * @slot	pointer to slot
+ * @iter	iterator state
+ * @flags	RADIX_TREE_ITER_*
+ *
+ * Returns pointer to next slot, or NULL if no more left.
+ */
+static __always_inline
+void **radix_tree_next_slot(void **slot, struct radix_tree_iter *iter,
+			    unsigned flags)
+{
+	unsigned size, offset;
+
+	size = radix_tree_chunk_size(iter) - 1;
+	if (flags & RADIX_TREE_ITER_TAGGED) {
+		iter->tags >>= 1;
+		if (likely(iter->tags & 1ul)) {
+			iter->index++;
+			return slot + 1;
+		}
+		if ((flags & RADIX_TREE_ITER_CONTIG) && size)
+			return NULL;
+		if (likely(iter->tags)) {
+			offset = __ffs(iter->tags);
+			iter->tags >>= offset;
+			iter->index += offset + 1;
+			return slot + offset + 1;
+		}
+	} else {
+		while (size--) {
+			slot++;
+			iter->index++;
+			if (likely(*slot))
+				return slot;
+			if (flags & RADIX_TREE_ITER_CONTIG)
+				return NULL;
+		}
+	}
+	return NULL;
+}
+
+/**
+ * radix_tree_for_each_chunk - iterate over chunks
+ *
+ * @slot:	the void** for pointer to chunk first slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ * @flags	RADIX_TREE_ITER_* and tag index
+ *
+ * Locks can be released and reasquired between iterations.
+ */
+#define radix_tree_for_each_chunk(slot, root, iter, start, flags)	\
+	for ( slot = radix_tree_iter_init(iter, start) ;		\
+	      (slot = radix_tree_next_chunk(root, iter, flags)) ; )
+
+/**
+ * radix_tree_for_each_chunk_slot - iterate over slots in one chunk
+ *
+ * @slot:	the void** for pointer to slot
+ * @iter	the struct radix_tree_iter pointer
+ * @flags	RADIX_TREE_ITER_*
+ */
+#define radix_tree_for_each_chunk_slot(slot, iter, flags)	\
+	for ( ; slot ; slot = radix_tree_next_slot(slot, iter, flags) )
+
+/**
+ * radix_tree_for_each_slot - iterate over all slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ */
+#define radix_tree_for_each_slot(slot, root, iter, start)	\
+	for ( slot = radix_tree_iter_init(iter, start) ;	\
+	      slot || (slot = radix_tree_next_chunk(root, iter, 0)) ; \
+	      slot = radix_tree_next_slot(slot, iter, 0) )
+
+/**
+ * radix_tree_for_each_contig - iterate over all contiguous slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ */
+#define radix_tree_for_each_contig(slot, root, iter, start)		\
+	for ( slot = radix_tree_iter_init(iter, start) ;		\
+	      slot || (slot = radix_tree_next_chunk(root, iter,		\
+				RADIX_TREE_ITER_CONTIG)) ;		\
+	      slot = radix_tree_next_slot(slot, iter,			\
+				RADIX_TREE_ITER_CONTIG) )
+
+/**
+ * radix_tree_for_each_tagged - iterate over all tagged slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ * @tag		tag index
+ */
+#define radix_tree_for_each_tagged(slot, root, iter, start, tag)	\
+	for ( slot = radix_tree_iter_init(iter, start) ;		\
+	      slot || (slot = radix_tree_next_chunk(root, iter,		\
+			      RADIX_TREE_ITER_TAGGED | tag)) ;		\
+	      slot = radix_tree_next_slot(slot, iter,			\
+				RADIX_TREE_ITER_TAGGED) )
+
 #endif /* _LINUX_RADIX_TREE_H */
--- rtth/radix-tree.c.0	2012-02-17 19:40:59.307536275 -0800
+++ rtth/radix-tree.c	2012-02-17 19:44:11.083545685 -0800
@@ -3,6 +3,7 @@
  * Portions Copyright (C) 2001 Christoph Hellwig
  * Copyright (C) 2005 SGI, Christoph Lameter
  * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
@@ -146,6 +147,41 @@ static inline int any_tag_set(struct rad
 	}
 	return 0;
 }
+
+/**
+ * radix_tree_find_next_bit - find the next set bit in a memory region
+ * @addr: The address to base the search on
+ * @size: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Unrollable variant of find_next_bit() for constant size arrays.
+ * Tail bits starting from size to roundup(size, BITS_PER_LONG) must be zero.
+ * Returns next bit offset, or size if nothing found.
+ */
+static inline unsigned long radix_tree_find_next_bit(const unsigned long *addr,
+		unsigned long size, unsigned long offset)
+{
+	if (!__builtin_constant_p(size))
+		return find_next_bit(addr, size, offset);
+
+	if (offset < size) {
+		unsigned long tmp;
+
+		addr += offset / BITS_PER_LONG;
+		tmp = *addr >> (offset % BITS_PER_LONG);
+		if (tmp)
+			return __ffs(tmp) + offset;
+		offset = (offset + BITS_PER_LONG) & ~(BITS_PER_LONG - 1);
+		while (offset < size) {
+			tmp = *++addr;
+			if (tmp)
+				return __ffs(tmp) + offset;
+			offset += BITS_PER_LONG;
+		}
+	}
+	return size;
+}
+
 /*
  * This assumes that the caller has performed appropriate preallocation, and
  * that the caller has pinned this thread of control to the current CPU.
@@ -613,6 +649,117 @@ int radix_tree_tag_get(struct radix_tree
 EXPORT_SYMBOL(radix_tree_tag_get);
 
 /**
+ * radix_tree_next_chunk - find next chunk of slots for iteration
+ *
+ * @root:		radix tree root
+ * @iter:		iterator state
+ * @flags		RADIX_TREE_ITER_* flags and tag index
+ *
+ * Returns pointer to first slots in chunk, or NULL if there no more left
+ */
+void **radix_tree_next_chunk(struct radix_tree_root *root,
+			     struct radix_tree_iter *iter, unsigned flags)
+{
+	unsigned shift, tag = flags & RADIX_TREE_ITER_TAG_MASK;
+	struct radix_tree_node *rnode, *node;
+	unsigned long i, index;
+
+	if ((flags & RADIX_TREE_ITER_TAGGED) && !root_tag_get(root, tag))
+		return NULL;
+
+	/*
+	 * Catch next_index overflow after ~0UL.
+	 * iter->index can be zero only at the beginning.
+	 * Because RADIX_TREE_MAP_SHIFT < BITS_PER_LONG we cannot
+	 * oveflow iter->next_index in single step.
+	 */
+	index = iter->next_index;
+	if (!index && iter->index)
+		return NULL;
+
+	rnode = rcu_dereference_raw(root->rnode);
+	if (radix_tree_is_indirect_ptr(rnode)) {
+		rnode = indirect_to_ptr(rnode);
+	} else if (rnode && !index) {
+		/* Single-slot tree */
+		iter->index = 0;
+		iter->next_index = 1;
+		iter->tags = 1;
+		return (void **)&root->rnode;
+	} else
+		return NULL;
+
+restart:
+	shift = (rnode->height - 1) * RADIX_TREE_MAP_SHIFT;
+	i = index >> shift;
+
+	/* Index ouside of the tree */
+	if (i >= RADIX_TREE_MAP_SIZE)
+		return NULL;
+
+	node = rnode;
+	while (1) {
+		if ((flags & RADIX_TREE_ITER_TAGGED) ?
+				!test_bit(i, node->tags[tag]) :
+				!node->slots[i]) {
+			/* Hole detected */
+			if (flags & RADIX_TREE_ITER_CONTIG)
+				return NULL;
+
+			if (flags & RADIX_TREE_ITER_TAGGED)
+				i = radix_tree_find_next_bit(node->tags[tag],
+						RADIX_TREE_MAP_SIZE, i + 1);
+			else
+				while (++i < RADIX_TREE_MAP_SIZE &&
+						!node->slots[i]);
+
+			index &= ~((RADIX_TREE_MAP_SIZE << shift) - 1);
+			index += i << shift;
+			/* Overflow after ~0UL */
+			if (!index)
+				return NULL;
+			if (i == RADIX_TREE_MAP_SIZE)
+				goto restart;
+		}
+
+		/* This is leaf-node */
+		if (!shift)
+			break;
+
+		node = rcu_dereference_raw(node->slots[i]);
+		if (node == NULL)
+			goto restart;
+		shift -= RADIX_TREE_MAP_SHIFT;
+		i = (index >> shift) & RADIX_TREE_MAP_MASK;
+	}
+
+	/* Update the iterator state */
+	iter->index = index;
+	iter->next_index = (index | RADIX_TREE_MAP_MASK) + 1;
+
+	/* Construct iter->tags bitmask from node->tags[tag] array */
+	if (flags & RADIX_TREE_ITER_TAGGED) {
+		unsigned tag_long, tag_bit;
+
+		tag_long = i / BITS_PER_LONG;
+		tag_bit  = i % BITS_PER_LONG;
+		iter->tags = node->tags[tag][tag_long] >> tag_bit;
+		/* This never happens if RADIX_TREE_TAG_LONGS == 1 */
+		if (tag_long < RADIX_TREE_TAG_LONGS - 1) {
+			/* Pick tags from next element */
+			if (tag_bit)
+				iter->tags |= node->tags[tag][tag_long + 1] <<
+						(BITS_PER_LONG - tag_bit);
+			/* Clip chunk size, here only BITS_PER_LONG tags */
+			iter->next_index = index + BITS_PER_LONG;
+		}
+	}
+
+	return node->slots + i;
+}
+EXPORT_SYMBOL(radix_tree_next_chunk);
+
+/**
  * radix_tree_range_tag_if_tagged - for each item in given range set given
  *				   tag if item has another tag set
  * @root:		radix tree root
@@ -817,57 +964,6 @@ unsigned long radix_tree_prev_hole(struc
 }
 EXPORT_SYMBOL(radix_tree_prev_hole);
 
-static unsigned int
-__lookup(struct radix_tree_node *slot, void ***results, unsigned long *indices,
-	unsigned long index, unsigned int max_items, unsigned long *next_index)
-{
-	unsigned int nr_found = 0;
-	unsigned int shift, height;
-	unsigned long i;
-
-	height = slot->height;
-	if (height == 0)
-		goto out;
-	shift = (height-1) * RADIX_TREE_MAP_SHIFT;
-
-	for ( ; height > 1; height--) {
-		i = (index >> shift) & RADIX_TREE_MAP_MASK;
-		for (;;) {
-			if (slot->slots[i] != NULL)
-				break;
-			index &= ~((1UL << shift) - 1);
-			index += 1UL << shift;
-			if (index == 0)
-				goto out;	/* 32-bit wraparound */
-			i++;
-			if (i == RADIX_TREE_MAP_SIZE)
-				goto out;
-		}
-
-		shift -= RADIX_TREE_MAP_SHIFT;
-		slot = rcu_dereference_raw(slot->slots[i]);
-		if (slot == NULL)
-			goto out;
-	}
-
-	/* Bottom level: grab some items */
-	for (i = index & RADIX_TREE_MAP_MASK; i < RADIX_TREE_MAP_SIZE; i++) {
-		if (slot->slots[i]) {
-			results[nr_found] = &(slot->slots[i]);
-			if (indices)
-				indices[nr_found] = index;
-			if (++nr_found == max_items) {
-				index++;
-				goto out;
-			}
-		}
-		index++;
-	}
-out:
-	*next_index = index;
-	return nr_found;
-}
-
 /**
  *	radix_tree_gang_lookup - perform multiple lookup on a radix tree
  *	@root:		radix tree root
@@ -891,48 +987,19 @@ unsigned int
 radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
 			unsigned long first_index, unsigned int max_items)
 {
-	unsigned long max_index;
-	struct radix_tree_node *node;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
+	if (unlikely(!max_items))
 		return 0;
 
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = node;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int nr_found, slots_found, i;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup(node, (void ***)results + ret, NULL,
-				cur_index, max_items - ret, &next_index);
-		nr_found = 0;
-		for (i = 0; i < slots_found; i++) {
-			struct radix_tree_node *slot;
-			slot = *(((void ***)results)[ret + i]);
-			if (!slot)
-				continue;
-			results[ret + nr_found] =
-				indirect_to_ptr(rcu_dereference_raw(slot));
-			nr_found++;
-		}
-		ret += nr_found;
-		if (next_index == 0)
+	radix_tree_for_each_slot(slot, root, &iter, first_index) {
+		results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+		if (!results[ret])
+			continue;
+		if (++ret == max_items)
 			break;
-		cur_index = next_index;
 	}
 
 	return ret;
@@ -962,112 +1029,25 @@ radix_tree_gang_lookup_slot(struct radix
 			void ***results, unsigned long *indices,
 			unsigned long first_index, unsigned int max_items)
 {
-	unsigned long max_index;
-	struct radix_tree_node *node;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
+	if (unlikely(!max_items))
 		return 0;
 
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = (void **)&root->rnode;
+	radix_tree_for_each_slot(slot, root, &iter, first_index) {
+		results[ret] = slot;
 		if (indices)
-			indices[0] = 0;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int slots_found;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
+			indices[ret] = iter.index;
+		if (++ret == max_items)
 			break;
-		slots_found = __lookup(node, results + ret,
-				indices ? indices + ret : NULL,
-				cur_index, max_items - ret, &next_index);
-		ret += slots_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
 	}
 
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup_slot);
 
-/*
- * FIXME: the two tag_get()s here should use find_next_bit() instead of
- * open-coding the search.
- */
-static unsigned int
-__lookup_tag(struct radix_tree_node *slot, void ***results, unsigned long index,
-	unsigned int max_items, unsigned long *next_index, unsigned int tag)
-{
-	unsigned int nr_found = 0;
-	unsigned int shift, height;
-
-	height = slot->height;
-	if (height == 0)
-		goto out;
-	shift = (height-1) * RADIX_TREE_MAP_SHIFT;
-
-	while (height > 0) {
-		unsigned long i = (index >> shift) & RADIX_TREE_MAP_MASK ;
-
-		for (;;) {
-			if (tag_get(slot, tag, i))
-				break;
-			index &= ~((1UL << shift) - 1);
-			index += 1UL << shift;
-			if (index == 0)
-				goto out;	/* 32-bit wraparound */
-			i++;
-			if (i == RADIX_TREE_MAP_SIZE)
-				goto out;
-		}
-		height--;
-		if (height == 0) {	/* Bottom level: grab some items */
-			unsigned long j = index & RADIX_TREE_MAP_MASK;
-
-			for ( ; j < RADIX_TREE_MAP_SIZE; j++) {
-				index++;
-				if (!tag_get(slot, tag, j))
-					continue;
-				/*
-				 * Even though the tag was found set, we need to
-				 * recheck that we have a non-NULL node, because
-				 * if this lookup is lockless, it may have been
-				 * subsequently deleted.
-				 *
-				 * Similar care must be taken in any place that
-				 * lookup ->slots[x] without a lock (ie. can't
-				 * rely on its value remaining the same).
-				 */
-				if (slot->slots[j]) {
-					results[nr_found++] = &(slot->slots[j]);
-					if (nr_found == max_items)
-						goto out;
-				}
-			}
-		}
-		shift -= RADIX_TREE_MAP_SHIFT;
-		slot = rcu_dereference_raw(slot->slots[i]);
-		if (slot == NULL)
-			break;
-	}
-out:
-	*next_index = index;
-	return nr_found;
-}
-
 /**
  *	radix_tree_gang_lookup_tag - perform multiple lookup on a radix tree
  *	                             based on a tag
@@ -1086,52 +1066,19 @@ radix_tree_gang_lookup_tag(struct radix_
 		unsigned long first_index, unsigned int max_items,
 		unsigned int tag)
 {
-	struct radix_tree_node *node;
-	unsigned long max_index;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	/* check the root's tag bit */
-	if (!root_tag_get(root, tag))
+	if (unlikely(!max_items))
 		return 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
-
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = node;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int nr_found, slots_found, i;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup_tag(node, (void ***)results + ret,
-				cur_index, max_items - ret, &next_index, tag);
-		nr_found = 0;
-		for (i = 0; i < slots_found; i++) {
-			struct radix_tree_node *slot;
-			slot = *(((void ***)results)[ret + i]);
-			if (!slot)
-				continue;
-			results[ret + nr_found] =
-				indirect_to_ptr(rcu_dereference_raw(slot));
-			nr_found++;
-		}
-		ret += nr_found;
-		if (next_index == 0)
+	radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
+		results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+		if (!results[ret])
+			continue;
+		if (++ret == max_items)
 			break;
-		cur_index = next_index;
 	}
 
 	return ret;
@@ -1156,42 +1103,17 @@ radix_tree_gang_lookup_tag_slot(struct r
 		unsigned long first_index, unsigned int max_items,
 		unsigned int tag)
 {
-	struct radix_tree_node *node;
-	unsigned long max_index;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	/* check the root's tag bit */
-	if (!root_tag_get(root, tag))
+	if (unlikely(!max_items))
 		return 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
-
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = (void **)&root->rnode;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int slots_found;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup_tag(node, results + ret,
-				cur_index, max_items - ret, &next_index, tag);
-		ret += slots_found;
-		if (next_index == 0)
+	radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
+		results[ret] = slot;
+		if (++ret == max_items)
 			break;
-		cur_index = next_index;
 	}
 
 	return ret;

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-03-14 19:36         ` Hugh Dickins
  0 siblings, 0 replies; 30+ messages in thread
From: Hugh Dickins @ 2012-03-14 19:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Konstantin Khlebnikov, Andrew Morton, Linus Torvalds,
	linux-kernel, linux-mm

On Wed, 14 Mar 2012, Christoph Hellwig wrote:
> On Wed, Mar 14, 2012 at 11:49:21AM +0400, Konstantin Khlebnikov wrote:
> > Christoph Hellwig wrote:
> > >Any updates on this series?
> > 
> > I had sent "[PATCH v2 0/3] radix-tree: general iterator" February 10, there is no more updates after that.
> > I just checked v2 on top "next-20120314" -- looks like all ok.
> 
> this was more a question to the MM maintainers if this is getting
> merged or if there were any further comments.

I haven't studied the code at all - I'm afraid Konstantin is rather
more productive than I can keep up with, and other bugs and patches
appeared to be more urgent and important.

But I have included that work in most of my testing for four weeks now,
and observed no problems whatever from it.  And I made a patch for the
radix-tree test harness which akpm curates, to update its radix-tree.c
to Konstantin's: those tests ran perfectly on 64-bit and on 32-bit.
That patch to rtth appended below.

(I do have, or expect to have once I study them, reservations about
his subsequent changes to radix-tree usage in mm/shmem.c; and even
if I end up agreeing with his changes, would prefer to hold them off
until after the tmpfs fallocation mods are in - other work which
had to yield to higher priorities, ready but not yet commented.)

> 
> We'd really like to have this interface to simplify some code in XFS.

That's useful info: let's raise its priority, and hope that someone
faster than me (not a narrow category...) gets to review it.

Hugh

[PATCH] rtth: update to KK's radix-tree iterator

Signed-off-by: Hugh Dickins <hughd@google.com>

--- rtth/Makefile.0	2012-02-17 19:38:02.611537166 -0800
+++ rtth/Makefile	2012-02-17 20:32:31.351614457 -0800
@@ -2,7 +2,8 @@
 CFLAGS += -I. -g -Wall -D_LGPL_SOURCE
 LDFLAGS += -lpthread -lurcu
 TARGETS = main
-OFILES = main.o radix-tree.o linux.o test.o tag_check.o regression1.o regression2.o
+OFILES = main.o radix-tree.o linux.o test.o tag_check.o find_next_bit.o \
+	 regression1.o regression2.o
 
 targets: $(TARGETS)
 
--- rtth/find_next_bit.c.0	2012-02-17 21:07:53.556625068 -0800
+++ rtth/find_next_bit.c	2012-02-17 20:34:54.159617936 -0800
@@ -0,0 +1,57 @@
+/* find_next_bit.c: fallback find next bit implementation
+ *
+ * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/types.h>
+#include <linux/bitops.h>
+
+#define BITOP_WORD(nr)		((nr) / BITS_PER_LONG)
+
+/*
+ * Find the next set bit in a memory region.
+ */
+unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
+			    unsigned long offset)
+{
+	const unsigned long *p = addr + BITOP_WORD(offset);
+	unsigned long result = offset & ~(BITS_PER_LONG-1);
+	unsigned long tmp;
+
+	if (offset >= size)
+		return size;
+	size -= result;
+	offset %= BITS_PER_LONG;
+	if (offset) {
+		tmp = *(p++);
+		tmp &= (~0UL << offset);
+		if (size < BITS_PER_LONG)
+			goto found_first;
+		if (tmp)
+			goto found_middle;
+		size -= BITS_PER_LONG;
+		result += BITS_PER_LONG;
+	}
+	while (size & ~(BITS_PER_LONG-1)) {
+		if ((tmp = *(p++)))
+			goto found_middle;
+		result += BITS_PER_LONG;
+		size -= BITS_PER_LONG;
+	}
+	if (!size)
+		return result;
+	tmp = *p;
+
+found_first:
+	tmp &= (~0UL >> (BITS_PER_LONG - size));
+	if (tmp == 0UL)		/* Are any bits set? */
+		return result + size;	/* Nope. */
+found_middle:
+	return result + __ffs(tmp);
+}
--- rtth/linux/bitops.h.0	2012-02-17 19:38:02.615537110 -0800
+++ rtth/linux/bitops.h	2012-02-17 20:36:19.431619931 -0800
@@ -108,4 +108,43 @@ static inline int test_bit(int nr, const
 	return 1UL & (addr[BITOP_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
 }
 
+/**
+ * __ffs - find first bit in word.
+ * @word: The word to search
+ *
+ * Undefined if no bit exists, so code should check against 0 first.
+ */
+static inline unsigned long __ffs(unsigned long word)
+{
+	int num = 0;
+
+	if ((word & 0xffffffff) == 0) {
+		num += 32;
+		word >>= 32;
+	}
+	if ((word & 0xffff) == 0) {
+		num += 16;
+		word >>= 16;
+	}
+	if ((word & 0xff) == 0) {
+		num += 8;
+		word >>= 8;
+	}
+	if ((word & 0xf) == 0) {
+		num += 4;
+		word >>= 4;
+	}
+	if ((word & 0x3) == 0) {
+		num += 2;
+		word >>= 2;
+	}
+	if ((word & 0x1) == 0)
+		num += 1;
+	return num;
+}
+
+unsigned long find_next_bit(const unsigned long *addr,
+			    unsigned long size,
+			    unsigned long offset);
+
 #endif /* _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
--- rtth/linux/kernel.h.0	2010-08-25 13:30:45.000000000 -0700
+++ rtth/linux/kernel.h	2012-02-17 20:21:04.607598241 -0800
@@ -14,6 +14,7 @@
 #define panic(expr)
 #define printk printf
 #define __force
+#define likely(c) (c)
 #define unlikely(c) (c)
 #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
 
--- rtth/linux/radix-tree.h.0	2012-02-17 19:38:02.615537110 -0800
+++ rtth/linux/radix-tree.h	2012-02-17 20:21:36.911599007 -0800
@@ -2,6 +2,7 @@
  * Copyright (C) 2001 Momchil Velikov
  * Portions Copyright (C) 2001 Christoph Hellwig
  * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
@@ -23,6 +24,7 @@
 #include <linux/preempt.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
+#include <linux/bitops.h>
 #include <linux/rcupdate.h>
 
 /*
@@ -258,4 +260,142 @@ static inline void radix_tree_preload_en
 	preempt_enable();
 }
 
+struct radix_tree_iter {
+	unsigned long	index;		/* current index, do not overflow it */
+	unsigned long	next_index;	/* next-to-last index for this chunk */
+	unsigned long	tags;		/* bitmask for tag-iterating */
+};
+
+#define RADIX_TREE_ITER_TAG_MASK	0x00FF	/* tag index in lower byte */
+#define RADIX_TREE_ITER_TAGGED		0x0100	/* lookup tagged slots */
+#define RADIX_TREE_ITER_CONTIG		0x0200	/* stop at first hole */
+
+void **radix_tree_next_chunk(struct radix_tree_root *root,
+			     struct radix_tree_iter *iter, unsigned flags);
+
+static inline
+void **radix_tree_iter_init(struct radix_tree_iter *iter, unsigned long start)
+{
+	iter->index = 0; /* to bypass next_index overflow protection */
+	iter->next_index = start;
+	return NULL;
+}
+
+static inline unsigned long radix_tree_chunk_size(struct radix_tree_iter *iter)
+{
+	return iter->next_index - iter->index;
+}
+
+/**
+ * radix_tree_next_slot - find next slot in chunk
+ *
+ * @slot	pointer to slot
+ * @iter	iterator state
+ * @flags	RADIX_TREE_ITER_*
+ *
+ * Returns pointer to next slot, or NULL if no more left.
+ */
+static __always_inline
+void **radix_tree_next_slot(void **slot, struct radix_tree_iter *iter,
+			    unsigned flags)
+{
+	unsigned size, offset;
+
+	size = radix_tree_chunk_size(iter) - 1;
+	if (flags & RADIX_TREE_ITER_TAGGED) {
+		iter->tags >>= 1;
+		if (likely(iter->tags & 1ul)) {
+			iter->index++;
+			return slot + 1;
+		}
+		if ((flags & RADIX_TREE_ITER_CONTIG) && size)
+			return NULL;
+		if (likely(iter->tags)) {
+			offset = __ffs(iter->tags);
+			iter->tags >>= offset;
+			iter->index += offset + 1;
+			return slot + offset + 1;
+		}
+	} else {
+		while (size--) {
+			slot++;
+			iter->index++;
+			if (likely(*slot))
+				return slot;
+			if (flags & RADIX_TREE_ITER_CONTIG)
+				return NULL;
+		}
+	}
+	return NULL;
+}
+
+/**
+ * radix_tree_for_each_chunk - iterate over chunks
+ *
+ * @slot:	the void** for pointer to chunk first slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ * @flags	RADIX_TREE_ITER_* and tag index
+ *
+ * Locks can be released and reasquired between iterations.
+ */
+#define radix_tree_for_each_chunk(slot, root, iter, start, flags)	\
+	for ( slot = radix_tree_iter_init(iter, start) ;		\
+	      (slot = radix_tree_next_chunk(root, iter, flags)) ; )
+
+/**
+ * radix_tree_for_each_chunk_slot - iterate over slots in one chunk
+ *
+ * @slot:	the void** for pointer to slot
+ * @iter	the struct radix_tree_iter pointer
+ * @flags	RADIX_TREE_ITER_*
+ */
+#define radix_tree_for_each_chunk_slot(slot, iter, flags)	\
+	for ( ; slot ; slot = radix_tree_next_slot(slot, iter, flags) )
+
+/**
+ * radix_tree_for_each_slot - iterate over all slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ */
+#define radix_tree_for_each_slot(slot, root, iter, start)	\
+	for ( slot = radix_tree_iter_init(iter, start) ;	\
+	      slot || (slot = radix_tree_next_chunk(root, iter, 0)) ; \
+	      slot = radix_tree_next_slot(slot, iter, 0) )
+
+/**
+ * radix_tree_for_each_contig - iterate over all contiguous slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ */
+#define radix_tree_for_each_contig(slot, root, iter, start)		\
+	for ( slot = radix_tree_iter_init(iter, start) ;		\
+	      slot || (slot = radix_tree_next_chunk(root, iter,		\
+				RADIX_TREE_ITER_CONTIG)) ;		\
+	      slot = radix_tree_next_slot(slot, iter,			\
+				RADIX_TREE_ITER_CONTIG) )
+
+/**
+ * radix_tree_for_each_tagged - iterate over all tagged slots
+ *
+ * @slot:	the void** for pointer to slot
+ * @root	the struct radix_tree_root pointer
+ * @iter	the struct radix_tree_iter pointer
+ * @start	starting index
+ * @tag		tag index
+ */
+#define radix_tree_for_each_tagged(slot, root, iter, start, tag)	\
+	for ( slot = radix_tree_iter_init(iter, start) ;		\
+	      slot || (slot = radix_tree_next_chunk(root, iter,		\
+			      RADIX_TREE_ITER_TAGGED | tag)) ;		\
+	      slot = radix_tree_next_slot(slot, iter,			\
+				RADIX_TREE_ITER_TAGGED) )
+
 #endif /* _LINUX_RADIX_TREE_H */
--- rtth/radix-tree.c.0	2012-02-17 19:40:59.307536275 -0800
+++ rtth/radix-tree.c	2012-02-17 19:44:11.083545685 -0800
@@ -3,6 +3,7 @@
  * Portions Copyright (C) 2001 Christoph Hellwig
  * Copyright (C) 2005 SGI, Christoph Lameter
  * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
@@ -146,6 +147,41 @@ static inline int any_tag_set(struct rad
 	}
 	return 0;
 }
+
+/**
+ * radix_tree_find_next_bit - find the next set bit in a memory region
+ * @addr: The address to base the search on
+ * @size: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Unrollable variant of find_next_bit() for constant size arrays.
+ * Tail bits starting from size to roundup(size, BITS_PER_LONG) must be zero.
+ * Returns next bit offset, or size if nothing found.
+ */
+static inline unsigned long radix_tree_find_next_bit(const unsigned long *addr,
+		unsigned long size, unsigned long offset)
+{
+	if (!__builtin_constant_p(size))
+		return find_next_bit(addr, size, offset);
+
+	if (offset < size) {
+		unsigned long tmp;
+
+		addr += offset / BITS_PER_LONG;
+		tmp = *addr >> (offset % BITS_PER_LONG);
+		if (tmp)
+			return __ffs(tmp) + offset;
+		offset = (offset + BITS_PER_LONG) & ~(BITS_PER_LONG - 1);
+		while (offset < size) {
+			tmp = *++addr;
+			if (tmp)
+				return __ffs(tmp) + offset;
+			offset += BITS_PER_LONG;
+		}
+	}
+	return size;
+}
+
 /*
  * This assumes that the caller has performed appropriate preallocation, and
  * that the caller has pinned this thread of control to the current CPU.
@@ -613,6 +649,117 @@ int radix_tree_tag_get(struct radix_tree
 EXPORT_SYMBOL(radix_tree_tag_get);
 
 /**
+ * radix_tree_next_chunk - find next chunk of slots for iteration
+ *
+ * @root:		radix tree root
+ * @iter:		iterator state
+ * @flags		RADIX_TREE_ITER_* flags and tag index
+ *
+ * Returns pointer to first slots in chunk, or NULL if there no more left
+ */
+void **radix_tree_next_chunk(struct radix_tree_root *root,
+			     struct radix_tree_iter *iter, unsigned flags)
+{
+	unsigned shift, tag = flags & RADIX_TREE_ITER_TAG_MASK;
+	struct radix_tree_node *rnode, *node;
+	unsigned long i, index;
+
+	if ((flags & RADIX_TREE_ITER_TAGGED) && !root_tag_get(root, tag))
+		return NULL;
+
+	/*
+	 * Catch next_index overflow after ~0UL.
+	 * iter->index can be zero only at the beginning.
+	 * Because RADIX_TREE_MAP_SHIFT < BITS_PER_LONG we cannot
+	 * oveflow iter->next_index in single step.
+	 */
+	index = iter->next_index;
+	if (!index && iter->index)
+		return NULL;
+
+	rnode = rcu_dereference_raw(root->rnode);
+	if (radix_tree_is_indirect_ptr(rnode)) {
+		rnode = indirect_to_ptr(rnode);
+	} else if (rnode && !index) {
+		/* Single-slot tree */
+		iter->index = 0;
+		iter->next_index = 1;
+		iter->tags = 1;
+		return (void **)&root->rnode;
+	} else
+		return NULL;
+
+restart:
+	shift = (rnode->height - 1) * RADIX_TREE_MAP_SHIFT;
+	i = index >> shift;
+
+	/* Index ouside of the tree */
+	if (i >= RADIX_TREE_MAP_SIZE)
+		return NULL;
+
+	node = rnode;
+	while (1) {
+		if ((flags & RADIX_TREE_ITER_TAGGED) ?
+				!test_bit(i, node->tags[tag]) :
+				!node->slots[i]) {
+			/* Hole detected */
+			if (flags & RADIX_TREE_ITER_CONTIG)
+				return NULL;
+
+			if (flags & RADIX_TREE_ITER_TAGGED)
+				i = radix_tree_find_next_bit(node->tags[tag],
+						RADIX_TREE_MAP_SIZE, i + 1);
+			else
+				while (++i < RADIX_TREE_MAP_SIZE &&
+						!node->slots[i]);
+
+			index &= ~((RADIX_TREE_MAP_SIZE << shift) - 1);
+			index += i << shift;
+			/* Overflow after ~0UL */
+			if (!index)
+				return NULL;
+			if (i == RADIX_TREE_MAP_SIZE)
+				goto restart;
+		}
+
+		/* This is leaf-node */
+		if (!shift)
+			break;
+
+		node = rcu_dereference_raw(node->slots[i]);
+		if (node == NULL)
+			goto restart;
+		shift -= RADIX_TREE_MAP_SHIFT;
+		i = (index >> shift) & RADIX_TREE_MAP_MASK;
+	}
+
+	/* Update the iterator state */
+	iter->index = index;
+	iter->next_index = (index | RADIX_TREE_MAP_MASK) + 1;
+
+	/* Construct iter->tags bitmask from node->tags[tag] array */
+	if (flags & RADIX_TREE_ITER_TAGGED) {
+		unsigned tag_long, tag_bit;
+
+		tag_long = i / BITS_PER_LONG;
+		tag_bit  = i % BITS_PER_LONG;
+		iter->tags = node->tags[tag][tag_long] >> tag_bit;
+		/* This never happens if RADIX_TREE_TAG_LONGS == 1 */
+		if (tag_long < RADIX_TREE_TAG_LONGS - 1) {
+			/* Pick tags from next element */
+			if (tag_bit)
+				iter->tags |= node->tags[tag][tag_long + 1] <<
+						(BITS_PER_LONG - tag_bit);
+			/* Clip chunk size, here only BITS_PER_LONG tags */
+			iter->next_index = index + BITS_PER_LONG;
+		}
+	}
+
+	return node->slots + i;
+}
+EXPORT_SYMBOL(radix_tree_next_chunk);
+
+/**
  * radix_tree_range_tag_if_tagged - for each item in given range set given
  *				   tag if item has another tag set
  * @root:		radix tree root
@@ -817,57 +964,6 @@ unsigned long radix_tree_prev_hole(struc
 }
 EXPORT_SYMBOL(radix_tree_prev_hole);
 
-static unsigned int
-__lookup(struct radix_tree_node *slot, void ***results, unsigned long *indices,
-	unsigned long index, unsigned int max_items, unsigned long *next_index)
-{
-	unsigned int nr_found = 0;
-	unsigned int shift, height;
-	unsigned long i;
-
-	height = slot->height;
-	if (height == 0)
-		goto out;
-	shift = (height-1) * RADIX_TREE_MAP_SHIFT;
-
-	for ( ; height > 1; height--) {
-		i = (index >> shift) & RADIX_TREE_MAP_MASK;
-		for (;;) {
-			if (slot->slots[i] != NULL)
-				break;
-			index &= ~((1UL << shift) - 1);
-			index += 1UL << shift;
-			if (index == 0)
-				goto out;	/* 32-bit wraparound */
-			i++;
-			if (i == RADIX_TREE_MAP_SIZE)
-				goto out;
-		}
-
-		shift -= RADIX_TREE_MAP_SHIFT;
-		slot = rcu_dereference_raw(slot->slots[i]);
-		if (slot == NULL)
-			goto out;
-	}
-
-	/* Bottom level: grab some items */
-	for (i = index & RADIX_TREE_MAP_MASK; i < RADIX_TREE_MAP_SIZE; i++) {
-		if (slot->slots[i]) {
-			results[nr_found] = &(slot->slots[i]);
-			if (indices)
-				indices[nr_found] = index;
-			if (++nr_found == max_items) {
-				index++;
-				goto out;
-			}
-		}
-		index++;
-	}
-out:
-	*next_index = index;
-	return nr_found;
-}
-
 /**
  *	radix_tree_gang_lookup - perform multiple lookup on a radix tree
  *	@root:		radix tree root
@@ -891,48 +987,19 @@ unsigned int
 radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
 			unsigned long first_index, unsigned int max_items)
 {
-	unsigned long max_index;
-	struct radix_tree_node *node;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
+	if (unlikely(!max_items))
 		return 0;
 
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = node;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int nr_found, slots_found, i;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup(node, (void ***)results + ret, NULL,
-				cur_index, max_items - ret, &next_index);
-		nr_found = 0;
-		for (i = 0; i < slots_found; i++) {
-			struct radix_tree_node *slot;
-			slot = *(((void ***)results)[ret + i]);
-			if (!slot)
-				continue;
-			results[ret + nr_found] =
-				indirect_to_ptr(rcu_dereference_raw(slot));
-			nr_found++;
-		}
-		ret += nr_found;
-		if (next_index == 0)
+	radix_tree_for_each_slot(slot, root, &iter, first_index) {
+		results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+		if (!results[ret])
+			continue;
+		if (++ret == max_items)
 			break;
-		cur_index = next_index;
 	}
 
 	return ret;
@@ -962,112 +1029,25 @@ radix_tree_gang_lookup_slot(struct radix
 			void ***results, unsigned long *indices,
 			unsigned long first_index, unsigned int max_items)
 {
-	unsigned long max_index;
-	struct radix_tree_node *node;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
+	if (unlikely(!max_items))
 		return 0;
 
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = (void **)&root->rnode;
+	radix_tree_for_each_slot(slot, root, &iter, first_index) {
+		results[ret] = slot;
 		if (indices)
-			indices[0] = 0;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int slots_found;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
+			indices[ret] = iter.index;
+		if (++ret == max_items)
 			break;
-		slots_found = __lookup(node, results + ret,
-				indices ? indices + ret : NULL,
-				cur_index, max_items - ret, &next_index);
-		ret += slots_found;
-		if (next_index == 0)
-			break;
-		cur_index = next_index;
 	}
 
 	return ret;
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup_slot);
 
-/*
- * FIXME: the two tag_get()s here should use find_next_bit() instead of
- * open-coding the search.
- */
-static unsigned int
-__lookup_tag(struct radix_tree_node *slot, void ***results, unsigned long index,
-	unsigned int max_items, unsigned long *next_index, unsigned int tag)
-{
-	unsigned int nr_found = 0;
-	unsigned int shift, height;
-
-	height = slot->height;
-	if (height == 0)
-		goto out;
-	shift = (height-1) * RADIX_TREE_MAP_SHIFT;
-
-	while (height > 0) {
-		unsigned long i = (index >> shift) & RADIX_TREE_MAP_MASK ;
-
-		for (;;) {
-			if (tag_get(slot, tag, i))
-				break;
-			index &= ~((1UL << shift) - 1);
-			index += 1UL << shift;
-			if (index == 0)
-				goto out;	/* 32-bit wraparound */
-			i++;
-			if (i == RADIX_TREE_MAP_SIZE)
-				goto out;
-		}
-		height--;
-		if (height == 0) {	/* Bottom level: grab some items */
-			unsigned long j = index & RADIX_TREE_MAP_MASK;
-
-			for ( ; j < RADIX_TREE_MAP_SIZE; j++) {
-				index++;
-				if (!tag_get(slot, tag, j))
-					continue;
-				/*
-				 * Even though the tag was found set, we need to
-				 * recheck that we have a non-NULL node, because
-				 * if this lookup is lockless, it may have been
-				 * subsequently deleted.
-				 *
-				 * Similar care must be taken in any place that
-				 * lookup ->slots[x] without a lock (ie. can't
-				 * rely on its value remaining the same).
-				 */
-				if (slot->slots[j]) {
-					results[nr_found++] = &(slot->slots[j]);
-					if (nr_found == max_items)
-						goto out;
-				}
-			}
-		}
-		shift -= RADIX_TREE_MAP_SHIFT;
-		slot = rcu_dereference_raw(slot->slots[i]);
-		if (slot == NULL)
-			break;
-	}
-out:
-	*next_index = index;
-	return nr_found;
-}
-
 /**
  *	radix_tree_gang_lookup_tag - perform multiple lookup on a radix tree
  *	                             based on a tag
@@ -1086,52 +1066,19 @@ radix_tree_gang_lookup_tag(struct radix_
 		unsigned long first_index, unsigned int max_items,
 		unsigned int tag)
 {
-	struct radix_tree_node *node;
-	unsigned long max_index;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	/* check the root's tag bit */
-	if (!root_tag_get(root, tag))
+	if (unlikely(!max_items))
 		return 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
-
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = node;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int nr_found, slots_found, i;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup_tag(node, (void ***)results + ret,
-				cur_index, max_items - ret, &next_index, tag);
-		nr_found = 0;
-		for (i = 0; i < slots_found; i++) {
-			struct radix_tree_node *slot;
-			slot = *(((void ***)results)[ret + i]);
-			if (!slot)
-				continue;
-			results[ret + nr_found] =
-				indirect_to_ptr(rcu_dereference_raw(slot));
-			nr_found++;
-		}
-		ret += nr_found;
-		if (next_index == 0)
+	radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
+		results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+		if (!results[ret])
+			continue;
+		if (++ret == max_items)
 			break;
-		cur_index = next_index;
 	}
 
 	return ret;
@@ -1156,42 +1103,17 @@ radix_tree_gang_lookup_tag_slot(struct r
 		unsigned long first_index, unsigned int max_items,
 		unsigned int tag)
 {
-	struct radix_tree_node *node;
-	unsigned long max_index;
-	unsigned long cur_index = first_index;
-	unsigned int ret;
+	struct radix_tree_iter iter;
+	void **slot;
+	unsigned int ret = 0;
 
-	/* check the root's tag bit */
-	if (!root_tag_get(root, tag))
+	if (unlikely(!max_items))
 		return 0;
 
-	node = rcu_dereference_raw(root->rnode);
-	if (!node)
-		return 0;
-
-	if (!radix_tree_is_indirect_ptr(node)) {
-		if (first_index > 0)
-			return 0;
-		results[0] = (void **)&root->rnode;
-		return 1;
-	}
-	node = indirect_to_ptr(node);
-
-	max_index = radix_tree_maxindex(node->height);
-
-	ret = 0;
-	while (ret < max_items) {
-		unsigned int slots_found;
-		unsigned long next_index;	/* Index of next search */
-
-		if (cur_index > max_index)
-			break;
-		slots_found = __lookup_tag(node, results + ret,
-				cur_index, max_items - ret, &next_index, tag);
-		ret += slots_found;
-		if (next_index == 0)
+	radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
+		results[ret] = slot;
+		if (++ret == max_items)
 			break;
-		cur_index = next_index;
 	}
 
 	return ret;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
  2012-03-14 19:36         ` Hugh Dickins
@ 2012-03-15  0:06           ` Andrew Morton
  -1 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2012-03-15  0:06 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Christoph Hellwig, Konstantin Khlebnikov, Linus Torvalds,
	linux-kernel, linux-mm

On Wed, 14 Mar 2012 12:36:11 -0700 (PDT)
Hugh Dickins <hughd@google.com> wrote:

> On Wed, 14 Mar 2012, Christoph Hellwig wrote:
> > On Wed, Mar 14, 2012 at 11:49:21AM +0400, Konstantin Khlebnikov wrote:
> > > Christoph Hellwig wrote:
> > > >Any updates on this series?

Linus took an interest, so I went to sleep.  It seems that a role
reversal is in order ;)

> > > I had sent "[PATCH v2 0/3] radix-tree: general iterator" February 10, there is no more updates after that.
> > > I just checked v2 on top "next-20120314" -- looks like all ok.
> > 
> > this was more a question to the MM maintainers if this is getting
> > merged or if there were any further comments.
> 
> I haven't studied the code at all - I'm afraid Konstantin is rather
> more productive than I can keep up with, and other bugs and patches
> appeared to be more urgent and important.

I'll take a look.

>  And I made a patch for the
> radix-tree test harness which akpm curates, to update its radix-tree.c
> to Konstantin's: those tests ran perfectly on 64-bit and on 32-bit.
> That patch to rtth appended below.

Thanks.

> (I do have, or expect to have once I study them, reservations about
> his subsequent changes to radix-tree usage in mm/shmem.c; and even
> if I end up agreeing with his changes, would prefer to hold them off
> until after the tmpfs fallocation mods are in - other work which
> had to yield to higher priorities, ready but not yet commented.)

OK, I shall forget all about that followup series this time around.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/4] radix-tree: iterating general cleanup
@ 2012-03-15  0:06           ` Andrew Morton
  0 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2012-03-15  0:06 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Christoph Hellwig, Konstantin Khlebnikov, Linus Torvalds,
	linux-kernel, linux-mm

On Wed, 14 Mar 2012 12:36:11 -0700 (PDT)
Hugh Dickins <hughd@google.com> wrote:

> On Wed, 14 Mar 2012, Christoph Hellwig wrote:
> > On Wed, Mar 14, 2012 at 11:49:21AM +0400, Konstantin Khlebnikov wrote:
> > > Christoph Hellwig wrote:
> > > >Any updates on this series?

Linus took an interest, so I went to sleep.  It seems that a role
reversal is in order ;)

> > > I had sent "[PATCH v2 0/3] radix-tree: general iterator" February 10, there is no more updates after that.
> > > I just checked v2 on top "next-20120314" -- looks like all ok.
> > 
> > this was more a question to the MM maintainers if this is getting
> > merged or if there were any further comments.
> 
> I haven't studied the code at all - I'm afraid Konstantin is rather
> more productive than I can keep up with, and other bugs and patches
> appeared to be more urgent and important.

I'll take a look.

>  And I made a patch for the
> radix-tree test harness which akpm curates, to update its radix-tree.c
> to Konstantin's: those tests ran perfectly on 64-bit and on 32-bit.
> That patch to rtth appended below.

Thanks.

> (I do have, or expect to have once I study them, reservations about
> his subsequent changes to radix-tree usage in mm/shmem.c; and even
> if I end up agreeing with his changes, would prefer to hold them off
> until after the tmpfs fallocation mods are in - other work which
> had to yield to higher priorities, ready but not yet commented.)

OK, I shall forget all about that followup series this time around.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2012-03-15  0:06 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-07  7:54 [PATCH 0/4] radix-tree: iterating general cleanup Konstantin Khlebnikov
2012-02-07  7:54 ` Konstantin Khlebnikov
2012-02-07  7:55 ` [PATCH 1/4] bitops: implement "optimized" __find_next_bit() Konstantin Khlebnikov
2012-02-07  7:55   ` Konstantin Khlebnikov
2012-02-07 19:36   ` Linus Torvalds
2012-02-07 19:36     ` Linus Torvalds
2012-02-07  7:55 ` [PATCH 2/4] radix-tree: introduce bit-optimized iterator Konstantin Khlebnikov
2012-02-07  7:55   ` Konstantin Khlebnikov
2012-02-07  7:55 ` [PATCH 3/4] radix-tree: rewrite gang lookup with using iterator Konstantin Khlebnikov
2012-02-07  7:55   ` Konstantin Khlebnikov
2012-02-07  7:55 ` [PATCH 4/4] radix-tree: use iterators in find_get_pages* functions Konstantin Khlebnikov
2012-02-07  7:55   ` Konstantin Khlebnikov
2012-02-07 19:38 ` [PATCH 0/4] radix-tree: iterating general cleanup Linus Torvalds
2012-02-07 19:38   ` Linus Torvalds
2012-02-08  1:30   ` Konstantin Khlebnikov
2012-02-08  1:30     ` Konstantin Khlebnikov
2012-02-08  1:50     ` Linus Torvalds
2012-02-08  1:50       ` Linus Torvalds
2012-02-08 21:31       ` Dave Chinner
2012-02-08 21:31         ` Dave Chinner
2012-03-14  7:36 ` Christoph Hellwig
2012-03-14  7:36   ` Christoph Hellwig
2012-03-14  7:49   ` Konstantin Khlebnikov
2012-03-14  7:49     ` Konstantin Khlebnikov
2012-03-14  7:51     ` Christoph Hellwig
2012-03-14  7:51       ` Christoph Hellwig
2012-03-14 19:36       ` Hugh Dickins
2012-03-14 19:36         ` Hugh Dickins
2012-03-15  0:06         ` Andrew Morton
2012-03-15  0:06           ` Andrew Morton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.