* [PATCH RFC 2/4] lib/radix-tree: remove sibling entries
2016-08-27 14:14 [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range Konstantin Khlebnikov
@ 2016-08-27 14:16 ` Konstantin Khlebnikov
2016-08-27 14:16 ` [PATCH RFC 3/4] testing/radix-tree: replace multi-order with range operations Konstantin Khlebnikov
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Konstantin Khlebnikov @ 2016-08-27 14:16 UTC (permalink / raw)
To: linux-mm; +Cc: Andrew Morton, linux-kernel, Kirill A. Shutemov, Ross Zwisler
The current implementation stores a huge entry as a "canonical" head entry
followed by a tail of "sibling" entries which point backward to the head.
The iterator jumps back and forth when it sees them. This complication is
not required for THP in the page cache, because a struct page can tell us
its start index and size.
This patch removes support for sibling entries but keeps the part that
allows storing data pointers in non-leaf slots. Huge pages will be stored
as a range of slots which all point to the head page.
This simplifies the fast path in radix_tree_next_chunk(): a huge entry
is reported as a single-slot chunk for any index within its range.
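The range-of-slots scheme above can be sketched with a toy model (hypothetical
toy_* names, not the kernel API): every slot in the covered range stores the
head pointer directly, so a lookup at any index inside the range needs a single
dereference and no sibling-entry decoding.

```c
#include <assert.h>
#include <stddef.h>

#define MAP_SIZE 64

/* Toy model of one radix-tree leaf node: a huge entry occupies a
 * contiguous range of slots, each storing the head pointer directly
 * instead of a "sibling" back-reference. */
struct toy_node {
	void *slots[MAP_SIZE];
};

/* Fill [first, last] with the same data pointer; loosely mirrors the
 * radix_tree_fill_range() idea from patch 1/4. */
static void toy_fill_range(struct toy_node *node, unsigned first,
			   unsigned last, void *head)
{
	for (unsigned i = first; i <= last; i++)
		node->slots[i] = head;
}

/* Lookup needs no back-pointer decoding: any index in the range
 * yields the head entry in one step. */
static void *toy_lookup(struct toy_node *node, unsigned index)
{
	return node->slots[index];
}
```

This is why the iterator can report a huge entry as a single-slot chunk: the
slot content is the same for every index in the range.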
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
---
include/linux/radix-tree.h | 75 ++++-----------------
lib/radix-tree.c | 158 +++++++++++++-------------------------------
2 files changed, 60 insertions(+), 173 deletions(-)
diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index af33e8d93ec3..1721ddbf981d 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -37,8 +37,8 @@
* 10 - exceptional entry
* 11 - this bit combination is currently unused/reserved
*
- * The internal entry may be a pointer to the next level in the tree, a
- * sibling entry, or an indicator that the entry in this slot has been moved
+ * The internal entry may be a pointer to the next level in the tree
+ * or an indicator that the entry in this slot has been moved
* to another location in the tree and the lookup should be restarted. While
* NULL fits the 'data pointer' pattern, it means that there is no entry in
* the tree for this index (no matter what level of the tree it is found at).
@@ -265,13 +265,7 @@ static inline void radix_tree_replace_slot(void **pslot, void *item)
int __radix_tree_create(struct radix_tree_root *root, unsigned long index,
unsigned order, struct radix_tree_node **nodep,
void ***slotp);
-int __radix_tree_insert(struct radix_tree_root *, unsigned long index,
- unsigned order, void *);
-static inline int radix_tree_insert(struct radix_tree_root *root,
- unsigned long index, void *entry)
-{
- return __radix_tree_insert(root, index, 0, entry);
-}
+int radix_tree_insert(struct radix_tree_root *, unsigned long index, void *);
void *__radix_tree_lookup(struct radix_tree_root *root, unsigned long index,
struct radix_tree_node **nodep, void ***slotp);
void *radix_tree_lookup(struct radix_tree_root *, unsigned long);
@@ -354,7 +348,6 @@ radix_tree_truncate_range(struct radix_tree_root *root,
* @index: index of current slot
* @next_index: one beyond the last index for this chunk
* @tags: bit-mask for tag-iterating
- * @shift: shift for the node that holds our slots
*
* This radix tree iterator works in terms of "chunks" of slots. A chunk is a
* subinterval of slots contained within one radix tree leaf node. It is
@@ -367,20 +360,8 @@ struct radix_tree_iter {
unsigned long index;
unsigned long next_index;
unsigned long tags;
-#ifdef CONFIG_RADIX_TREE_MULTIORDER
- unsigned int shift;
-#endif
};
-static inline unsigned int iter_shift(struct radix_tree_iter *iter)
-{
-#ifdef CONFIG_RADIX_TREE_MULTIORDER
- return iter->shift;
-#else
- return 0;
-#endif
-}
-
#define RADIX_TREE_ITER_TAG_MASK 0x00FF /* tag index in lower byte */
#define RADIX_TREE_ITER_TAGGED 0x0100 /* lookup tagged slots */
#define RADIX_TREE_ITER_CONTIG 0x0200 /* stop at first hole */
@@ -441,12 +422,6 @@ void **radix_tree_iter_retry(struct radix_tree_iter *iter)
return NULL;
}
-static inline unsigned long
-__radix_tree_iter_add(struct radix_tree_iter *iter, unsigned long slots)
-{
- return iter->index + (slots << iter_shift(iter));
-}
-
/**
* radix_tree_iter_next - resume iterating when the chunk may be invalid
* @iter: iterator state
@@ -458,7 +433,7 @@ __radix_tree_iter_add(struct radix_tree_iter *iter, unsigned long slots)
static inline __must_check
void **radix_tree_iter_next(struct radix_tree_iter *iter)
{
- iter->next_index = __radix_tree_iter_add(iter, 1);
+ iter->next_index = iter->index + 1;
iter->tags = 0;
return NULL;
}
@@ -489,7 +464,7 @@ void **radix_tree_iter_jump(struct radix_tree_iter *iter, unsigned long index)
static __always_inline long
radix_tree_chunk_size(struct radix_tree_iter *iter)
{
- return (iter->next_index - iter->index) >> iter_shift(iter);
+ return iter->next_index - iter->index;
}
static inline struct radix_tree_node *entry_to_node(void *ptr)
@@ -508,6 +483,9 @@ static inline struct radix_tree_node *entry_to_node(void *ptr)
* This function updates @iter->index in the case of a successful lookup.
* For tagged lookup it also eats @iter->tags.
*
+ * Please keep this fast-path as small as possible. Complicated logic is hidden
+ * inside radix_tree_next_chunk() which prepares chunks for this function.
+ *
* There are several cases where 'slot' can be passed in as NULL to this
* function. These cases result from the use of radix_tree_iter_next() or
* radix_tree_iter_retry(). In these cases we don't end up dereferencing
@@ -520,49 +498,24 @@ static __always_inline void **
radix_tree_next_slot(void **slot, struct radix_tree_iter *iter, unsigned flags)
{
if (flags & RADIX_TREE_ITER_TAGGED) {
- void *canon = slot;
-
iter->tags >>= 1;
- if (unlikely(!iter->tags))
- return NULL;
- while (IS_ENABLED(CONFIG_RADIX_TREE_MULTIORDER) &&
- radix_tree_is_internal_node(slot[1])) {
- if (entry_to_node(slot[1]) == canon) {
- iter->tags >>= 1;
- iter->index = __radix_tree_iter_add(iter, 1);
- slot++;
- continue;
- }
- iter->next_index = __radix_tree_iter_add(iter, 1);
- return NULL;
- }
if (likely(iter->tags & 1ul)) {
- iter->index = __radix_tree_iter_add(iter, 1);
+ iter->index++;
return slot + 1;
}
- if (!(flags & RADIX_TREE_ITER_CONTIG)) {
+ if (!(flags & RADIX_TREE_ITER_CONTIG) && likely(iter->tags)) {
unsigned offset = __ffs(iter->tags);
iter->tags >>= offset;
- iter->index = __radix_tree_iter_add(iter, offset + 1);
+ iter->index += offset + 1;
return slot + offset + 1;
}
} else {
- long count = radix_tree_chunk_size(iter);
- void *canon = slot;
+ long size = radix_tree_chunk_size(iter);
- while (--count > 0) {
+ while (--size > 0) {
slot++;
- iter->index = __radix_tree_iter_add(iter, 1);
-
- if (IS_ENABLED(CONFIG_RADIX_TREE_MULTIORDER) &&
- radix_tree_is_internal_node(*slot)) {
- if (entry_to_node(*slot) == canon)
- continue;
- iter->next_index = iter->index;
- break;
- }
-
+ iter->index++;
if (likely(*slot))
return slot;
if (flags & RADIX_TREE_ITER_CONTIG) {
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index c46a60065a77..234f1ddbd7a9 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -77,21 +77,6 @@ static inline void *node_to_entry(void *ptr)
#define RADIX_TREE_RETRY node_to_entry(NULL)
-#ifdef CONFIG_RADIX_TREE_MULTIORDER
-/* Sibling slots point directly to another slot in the same node */
-static inline bool is_sibling_entry(struct radix_tree_node *parent, void *node)
-{
- void **ptr = node;
- return (parent->slots <= ptr) &&
- (ptr < parent->slots + RADIX_TREE_MAP_SIZE);
-}
-#else
-static inline bool is_sibling_entry(struct radix_tree_node *parent, void *node)
-{
- return false;
-}
-#endif
-
static inline unsigned long get_slot_offset(struct radix_tree_node *parent,
void **slot)
{
@@ -104,16 +89,6 @@ static unsigned int radix_tree_descend(struct radix_tree_node *parent,
unsigned int offset = (index >> parent->shift) & RADIX_TREE_MAP_MASK;
void **entry = rcu_dereference_raw(parent->slots[offset]);
-#ifdef CONFIG_RADIX_TREE_MULTIORDER
- if (radix_tree_is_internal_node(entry)) {
- unsigned long siboff = get_slot_offset(parent, entry);
- if (siboff < RADIX_TREE_MAP_SIZE) {
- offset = siboff;
- entry = rcu_dereference_raw(parent->slots[offset]);
- }
- }
-#endif
-
*nodep = (void *)entry;
return offset;
}
@@ -232,12 +207,7 @@ static void dump_node(struct radix_tree_node *node, unsigned long index)
void *entry = node->slots[i];
if (!entry)
continue;
- if (is_sibling_entry(node, entry)) {
- pr_debug("radix sblng %p offset %ld val %p indices %ld-%ld\n",
- entry, i,
- *(void **)entry_to_node(entry),
- first, last);
- } else if (!radix_tree_is_internal_node(entry)) {
+ if (!radix_tree_is_internal_node(entry)) {
pr_debug("radix entry %p offset %ld indices %ld-%ld\n",
entry, i, first, last);
} else {
@@ -596,25 +566,6 @@ int __radix_tree_create(struct radix_tree_root *root, unsigned long index,
slot = &node->slots[offset];
}
-#ifdef CONFIG_RADIX_TREE_MULTIORDER
- /* Insert pointers to the canonical entry */
- if (order > shift) {
- unsigned i, n = 1 << (order - shift);
- offset = offset & ~(n - 1);
- slot = &node->slots[offset];
- child = node_to_entry(slot);
- for (i = 0; i < n; i++) {
- if (slot[i])
- return -EEXIST;
- }
-
- for (i = 1; i < n; i++) {
- rcu_assign_pointer(slot[i], child);
- node->count++;
- }
- }
-#endif
-
if (nodep)
*nodep = node;
if (slotp)
@@ -623,16 +574,15 @@ int __radix_tree_create(struct radix_tree_root *root, unsigned long index,
}
/**
- * __radix_tree_insert - insert into a radix tree
+ * radix_tree_insert - insert into a radix tree
* @root: radix tree root
* @index: index key
- * @order: key covers the 2^order indices around index
* @item: item to insert
*
* Insert an item into the radix tree at position @index.
*/
-int __radix_tree_insert(struct radix_tree_root *root, unsigned long index,
- unsigned order, void *item)
+int radix_tree_insert(struct radix_tree_root *root, unsigned long index,
+ void *item)
{
struct radix_tree_node *node;
void **slot;
@@ -640,7 +590,7 @@ int __radix_tree_insert(struct radix_tree_root *root, unsigned long index,
BUG_ON(radix_tree_is_internal_node(item));
- error = __radix_tree_create(root, index, order, &node, &slot);
+ error = __radix_tree_create(root, index, 0, &node, &slot);
if (error)
return error;
if (*slot != NULL)
@@ -659,7 +609,7 @@ int __radix_tree_insert(struct radix_tree_root *root, unsigned long index,
return 0;
}
-EXPORT_SYMBOL(__radix_tree_insert);
+EXPORT_SYMBOL(radix_tree_insert);
/**
* __radix_tree_lookup - lookup an item in a radix tree
@@ -895,14 +845,6 @@ int radix_tree_tag_get(struct radix_tree_root *root,
}
EXPORT_SYMBOL(radix_tree_tag_get);
-static inline void __set_iter_shift(struct radix_tree_iter *iter,
- unsigned int shift)
-{
-#ifdef CONFIG_RADIX_TREE_MULTIORDER
- iter->shift = shift;
-#endif
-}
-
/**
* radix_tree_next_chunk - find next chunk of slots for iteration
*
@@ -914,9 +856,10 @@ static inline void __set_iter_shift(struct radix_tree_iter *iter,
void **radix_tree_next_chunk(struct radix_tree_root *root,
struct radix_tree_iter *iter, unsigned flags)
{
- unsigned tag = flags & RADIX_TREE_ITER_TAG_MASK;
- struct radix_tree_node *node, *child;
+ unsigned int shift, tag = flags & RADIX_TREE_ITER_TAG_MASK;
unsigned long index, offset, maxindex;
+ struct radix_tree_node *node;
+ void *entry;
if ((flags & RADIX_TREE_ITER_TAGGED) && !root_tag_get(root, tag))
return NULL;
@@ -934,28 +877,27 @@ void **radix_tree_next_chunk(struct radix_tree_root *root,
if (!index && iter->index)
return NULL;
- restart:
- radix_tree_load_root(root, &child, &maxindex);
+restart:
+ shift = radix_tree_load_root(root, &node, &maxindex);
if (index > maxindex)
return NULL;
- if (!child)
- return NULL;
- if (!radix_tree_is_internal_node(child)) {
+ if (!maxindex) {
/* Single-slot tree */
- iter->index = index;
- iter->next_index = maxindex + 1;
- iter->tags = 1;
- __set_iter_shift(iter, 0);
+ iter->index = 0;
+ iter->next_index = 1;
+ iter->tags = 0;
return (void **)&root->rnode;
}
- do {
- node = entry_to_node(child);
- offset = radix_tree_descend(node, &child, index);
+ node = entry_to_node(node);
+ while (1) {
+ shift -= RADIX_TREE_MAP_SHIFT;
+ offset = (index >> shift) & RADIX_TREE_MAP_MASK;
+ entry = rcu_dereference_raw(node->slots[offset]);
if ((flags & RADIX_TREE_ITER_TAGGED) ?
- !tag_get(node, tag, offset) : !child) {
+ !tag_get(node, tag, offset) : !entry) {
/* Hole detected */
if (flags & RADIX_TREE_ITER_CONTIG)
return NULL;
@@ -967,30 +909,42 @@ void **radix_tree_next_chunk(struct radix_tree_root *root,
offset + 1);
else
while (++offset < RADIX_TREE_MAP_SIZE) {
- void *slot = node->slots[offset];
- if (is_sibling_entry(node, slot))
- continue;
- if (slot)
+ if (node->slots[offset])
break;
}
- index &= ~node_maxindex(node);
- index += offset << node->shift;
+
+ index &= ~shift_maxindex(shift);
+ index += offset << shift;
/* Overflow after ~0UL */
if (!index)
return NULL;
if (offset == RADIX_TREE_MAP_SIZE)
goto restart;
- child = rcu_dereference_raw(node->slots[offset]);
+ entry = rcu_dereference_raw(node->slots[offset]);
}
- if ((child == NULL) || (child == RADIX_TREE_RETRY))
+ /* This is a leaf node */
+ if (!shift)
+ break;
+
+ /* Non-leaf data entry */
+ if (!radix_tree_is_internal_node(entry)) {
+ /* Report as a single slot chunk */
+ iter->index = index;
+ iter->next_index = index + 1;
+ iter->tags = 0;
+ return node->slots + offset;
+ }
+
+ node = entry_to_node(entry);
+ /* RADIX_TREE_RETRY */
+ if (!node)
goto restart;
- } while (radix_tree_is_internal_node(child));
+ }
/* Update the iterator state */
- iter->index = (index &~ node_maxindex(node)) | (offset << node->shift);
- iter->next_index = (index | node_maxindex(node)) + 1;
- __set_iter_shift(iter, node->shift);
+ iter->index = index;
+ iter->next_index = (index | RADIX_TREE_MAP_MASK) + 1;
/* Construct iter->tags bit-mask from node->tags[tag] array */
if (flags & RADIX_TREE_ITER_TAGGED) {
@@ -1319,7 +1273,6 @@ unsigned long radix_tree_range_tag_if_tagged(struct radix_tree_root *root,
goto next;
if (!tag_get(node, iftag, offset))
goto next;
- /* Sibling slots never have tags set on them */
if (radix_tree_is_internal_node(child)) {
node = entry_to_node(child);
continue;
@@ -1341,7 +1294,7 @@ unsigned long radix_tree_range_tag_if_tagged(struct radix_tree_root *root,
break;
tag_set(parent, settag, offset);
}
- next:
+next:
/* Go to next entry in node */
index = ((index >> node->shift) + 1) << node->shift;
/* Overflow can happen when last_index is ~0UL... */
@@ -1357,8 +1310,6 @@ unsigned long radix_tree_range_tag_if_tagged(struct radix_tree_root *root,
node = node->parent;
offset = (index >> node->shift) & RADIX_TREE_MAP_MASK;
}
- if (is_sibling_entry(node, node->slots[offset]))
- goto next;
if (tagged >= nr_to_tag)
break;
}
@@ -1574,8 +1525,6 @@ static unsigned long __locate(struct radix_tree_node *slot, void *item,
continue;
}
node = entry_to_node(node);
- if (is_sibling_entry(slot, node))
- continue;
slot = node;
break;
}
@@ -1750,20 +1699,6 @@ bool __radix_tree_delete_node(struct radix_tree_root *root,
return deleted;
}
-static inline void delete_sibling_entries(struct radix_tree_node *node,
- void *ptr, unsigned offset)
-{
-#ifdef CONFIG_RADIX_TREE_MULTIORDER
- int i;
- for (i = 1; offset + i < RADIX_TREE_MAP_SIZE; i++) {
- if (node->slots[offset + i] != ptr)
- break;
- node->slots[offset + i] = NULL;
- node->count--;
- }
-#endif
-}
-
/**
* radix_tree_delete_item - delete an item from a radix tree
* @root: radix tree root
@@ -1803,7 +1738,6 @@ void *radix_tree_delete_item(struct radix_tree_root *root,
for (tag = 0; tag < RADIX_TREE_MAX_TAGS; tag++)
node_tag_clear(root, node, tag, offset);
- delete_sibling_entries(node, node_to_entry(slot), offset);
node->slots[offset] = NULL;
node->count--;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* [PATCH RFC 3/4] testing/radix-tree: replace multi-order with range operations
2016-08-27 14:14 [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range Konstantin Khlebnikov
2016-08-27 14:16 ` [PATCH RFC 2/4] lib/radix-tree: remove sibling entries Konstantin Khlebnikov
@ 2016-08-27 14:16 ` Konstantin Khlebnikov
2016-08-27 14:16 ` [PATCH RFC 4/4] testing/radix-tree: benchmark for iterator Konstantin Khlebnikov
2016-08-30 22:39 ` [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range Ross Zwisler
3 siblings, 0 replies; 5+ messages in thread
From: Konstantin Khlebnikov @ 2016-08-27 14:16 UTC (permalink / raw)
To: linux-mm; +Cc: Andrew Morton, linux-kernel, Kirill A. Shutemov, Ross Zwisler
This patch updates the tests for multi-order operations according to the
changes in the radix tree: a huge entry must now remember its range.
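The new range bookkeeping can be illustrated with a minimal sketch
(hypothetical toy_* names; only the item_order_end() arithmetic is taken from
the patch below): the last index covered by an order-N entry starting at
index is index + 2^N - 1, and each test item records both ends.

```c
#include <assert.h>

/* Mirror of the test-suite change: an item records both ends of its
 * range instead of relying on order-aligned sibling entries. */
struct toy_item {
	unsigned long index, end;
};

/* Last index covered by an entry of the given order starting at
 * index; same arithmetic as item_order_end() in the patch. */
static unsigned long toy_order_end(unsigned long index, unsigned int order)
{
	return index + (1ul << order) - 1;
}
```

Note that, unlike the old sibling-entry scheme, nothing here requires index
to be aligned to the order.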
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
---
tools/testing/radix-tree/linux/bug.h | 2 -
tools/testing/radix-tree/linux/err.h | 31 ++++++++
tools/testing/radix-tree/linux/kernel.h | 3 +
tools/testing/radix-tree/linux/rcupdate.h | 6 ++
tools/testing/radix-tree/multiorder.c | 113 ++++++++++++++++-------------
tools/testing/radix-tree/test.c | 49 ++++++++++---
tools/testing/radix-tree/test.h | 15 ++--
7 files changed, 151 insertions(+), 68 deletions(-)
create mode 100644 tools/testing/radix-tree/linux/err.h
diff --git a/tools/testing/radix-tree/linux/bug.h b/tools/testing/radix-tree/linux/bug.h
index ccbe444977df..7a77fa971e91 100644
--- a/tools/testing/radix-tree/linux/bug.h
+++ b/tools/testing/radix-tree/linux/bug.h
@@ -1 +1 @@
-#define WARN_ON_ONCE(x) assert(x)
+#define WARN_ON_ONCE(x) assert(!(x))
diff --git a/tools/testing/radix-tree/linux/err.h b/tools/testing/radix-tree/linux/err.h
new file mode 100644
index 000000000000..6fd3e608d4d7
--- /dev/null
+++ b/tools/testing/radix-tree/linux/err.h
@@ -0,0 +1,31 @@
+#ifndef _LINUX_ERR_H
+#define _LINUX_ERR_H
+
+#define MAX_ERRNO 4095
+
+#define IS_ERR_VALUE(x) unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)
+
+static inline void *ERR_PTR(long error)
+{
+ return (void *) error;
+}
+
+static inline long PTR_ERR(const void *ptr)
+{
+ return (long) ptr;
+}
+
+static inline bool IS_ERR(const void *ptr)
+{
+ return IS_ERR_VALUE((unsigned long)ptr);
+}
+
+static inline int PTR_ERR_OR_ZERO(const void *ptr)
+{
+ if (IS_ERR(ptr))
+ return PTR_ERR(ptr);
+ else
+ return 0;
+}
+
+#endif /* _LINUX_ERR_H */
diff --git a/tools/testing/radix-tree/linux/kernel.h b/tools/testing/radix-tree/linux/kernel.h
index be98a47b4e1b..52714e86991b 100644
--- a/tools/testing/radix-tree/linux/kernel.h
+++ b/tools/testing/radix-tree/linux/kernel.h
@@ -32,6 +32,9 @@
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
+#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
+#define is_power_of_2(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))
+
#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type, member) );})
diff --git a/tools/testing/radix-tree/linux/rcupdate.h b/tools/testing/radix-tree/linux/rcupdate.h
index f7129ea2a899..8c4ae8173778 100644
--- a/tools/testing/radix-tree/linux/rcupdate.h
+++ b/tools/testing/radix-tree/linux/rcupdate.h
@@ -3,6 +3,12 @@
#include <urcu.h>
+/* urcu.h includes errno.h which undefines ERANGE */
+#ifndef ERANGE
+#define ERANGE 34
+#endif
+
+#define RCU_INIT_POINTER(p, v) rcu_assign_pointer(p, v)
#define rcu_dereference_raw(p) rcu_dereference(p)
#define rcu_dereference_protected(p, cond) rcu_dereference(p)
diff --git a/tools/testing/radix-tree/multiorder.c b/tools/testing/radix-tree/multiorder.c
index 39d9b9568fe2..f73a3bfa83d8 100644
--- a/tools/testing/radix-tree/multiorder.c
+++ b/tools/testing/radix-tree/multiorder.c
@@ -29,7 +29,7 @@ static void __multiorder_tag_test(int index, int order)
unsigned long first = 0;
/* our canonical entry */
- base = index & ~((1 << order) - 1);
+ base = index;
printf("Multiorder tag test with index %d, canonical entry %d\n",
index, base);
@@ -43,7 +43,7 @@ static void __multiorder_tag_test(int index, int order)
* item_insert_order().
*/
for_each_index(i, base, order) {
- err = __radix_tree_insert(&tree, i, order,
+ err = radix_tree_insert(&tree, i,
(void *)(0xA0 | RADIX_TREE_EXCEPTIONAL_ENTRY));
assert(err == -EEXIST);
}
@@ -55,18 +55,18 @@ static void __multiorder_tag_test(int index, int order)
assert(radix_tree_tag_set(&tree, index, 0));
- for_each_index(i, base, order) {
- assert(radix_tree_tag_get(&tree, i, 0));
+ assert(radix_tree_tag_get(&tree, index, 0));
+
+ for_each_index(i, base, order)
assert(!radix_tree_tag_get(&tree, i, 1));
- }
assert(radix_tree_range_tag_if_tagged(&tree, &first, ~0UL, 10, 0, 1) == 1);
assert(radix_tree_tag_clear(&tree, index, 0));
- for_each_index(i, base, order) {
+ assert(radix_tree_tag_get(&tree, index, 1));
+
+ for_each_index(i, base, order)
assert(!radix_tree_tag_get(&tree, i, 0));
- assert(radix_tree_tag_get(&tree, i, 1));
- }
assert(radix_tree_tag_clear(&tree, index, 1));
@@ -122,7 +122,7 @@ static void multiorder_tag_tests(void)
static void multiorder_check(unsigned long index, int order)
{
unsigned long i;
- unsigned long min = index & ~((1UL << order) - 1);
+ unsigned long min = index;
unsigned long max = min + (1UL << order);
RADIX_TREE(tree, GFP_KERNEL);
@@ -145,7 +145,7 @@ static void multiorder_check(unsigned long index, int order)
assert(radix_tree_insert(&tree, i, entry) == -EEXIST);
}
- assert(item_delete(&tree, index) != 0);
+ assert(item_delete_order(&tree, index, order) != 0);
for (i = 0; i < 2*max; i++)
item_check_absent(&tree, i);
@@ -178,7 +178,7 @@ static void multiorder_shrink(unsigned long index, int order)
for (i = max; i < 2*max; i++)
item_check_absent(&tree, i);
- if (!item_delete(&tree, 0)) {
+ if (!item_delete_order(&tree, 0, order)) {
printf("failed to delete index %ld (order %d)\n", index, order); abort();
}
@@ -221,14 +221,12 @@ void multiorder_iteration(void)
break;
radix_tree_for_each_slot(slot, &tree, &iter, j) {
- int height = order[i] / RADIX_TREE_MAP_SHIFT;
- int shift = height * RADIX_TREE_MAP_SHIFT;
int mask = (1 << order[i]) - 1;
assert(iter.index >= (index[i] &~ mask));
assert(iter.index <= (index[i] | mask));
- assert(iter.shift == shift);
- i++;
+ if (iter.index == (index[i] | mask))
+ i++;
}
}
@@ -248,36 +246,48 @@ void multiorder_tagged_iteration(void)
#define MT_NUM_ENTRIES 9
int index[MT_NUM_ENTRIES] = {0, 2, 4, 16, 32, 40, 64, 72, 128};
int order[MT_NUM_ENTRIES] = {1, 0, 2, 4, 3, 1, 3, 0, 7};
-
-#define TAG_ENTRIES 7
- int tag_index[TAG_ENTRIES] = {0, 4, 16, 40, 64, 72, 128};
+ int tag[MT_NUM_ENTRIES] = {1, 0, 1, 1, 0, 1, 1, 1, 1};
for (i = 0; i < MT_NUM_ENTRIES; i++)
assert(!item_insert_order(&tree, index[i], order[i]));
assert(!radix_tree_tagged(&tree, 1));
- for (i = 0; i < TAG_ENTRIES; i++)
- assert(radix_tree_tag_set(&tree, tag_index[i], 1));
+ for (i = 0; i < MT_NUM_ENTRIES; i++) {
+ unsigned long end = item_order_end(index[i], order[i]);
+
+ assert(!radix_tree_tag_get(&tree, index[i], 1));
+ assert(!radix_tree_tag_get(&tree, end, 1));
+ if (tag[i])
+ assert(radix_tree_tag_set(&tree, end, 1));
+ }
+
+ for (i = 0; i < MT_NUM_ENTRIES; i++) {
+ unsigned long end = item_order_end(index[i], order[i]);
+
+ if (tag[i])
+ assert(radix_tree_tag_get(&tree, end, 1));
+ else
+ assert(!radix_tree_tag_get(&tree, end, 1));
+ }
for (j = 0; j < 256; j++) {
- int mask, k;
+ int k;
- for (i = 0; i < TAG_ENTRIES; i++) {
- for (k = i; index[k] < tag_index[i]; k++)
- ;
- if (j <= (index[k] | ((1 << order[k]) - 1)))
+ for (k = 0; k < MT_NUM_ENTRIES; k++)
+ if (tag[k] && j <= item_order_end(index[k], order[k]))
break;
- }
radix_tree_for_each_tagged(slot, &tree, &iter, j, 1) {
- for (k = i; index[k] < tag_index[i]; k++)
- ;
- mask = (1 << order[k]) - 1;
-
- assert(iter.index >= (tag_index[i] &~ mask));
- assert(iter.index <= (tag_index[i] | mask));
- i++;
+ unsigned long end = item_order_end(index[k], order[k]);
+
+ assert(k < MT_NUM_ENTRIES);
+ assert(radix_tree_tag_get(&tree, iter.index, 1));
+ assert(iter.index >= index[k]);
+ assert(iter.index <= end);
+ if (iter.index == end)
+ while (++k < MT_NUM_ENTRIES && !tag[k])
+ ;
}
}
@@ -285,33 +295,36 @@ void multiorder_tagged_iteration(void)
MT_NUM_ENTRIES, 1, 2);
for (j = 0; j < 256; j++) {
- int mask, k;
+ int k;
- for (i = 0; i < TAG_ENTRIES; i++) {
- for (k = i; index[k] < tag_index[i]; k++)
- ;
- if (j <= (index[k] | ((1 << order[k]) - 1)))
+ for (k = 0; k < MT_NUM_ENTRIES; k++)
+ if (tag[k] && j <= item_order_end(index[k], order[k]))
break;
- }
radix_tree_for_each_tagged(slot, &tree, &iter, j, 2) {
- for (k = i; index[k] < tag_index[i]; k++)
- ;
- mask = (1 << order[k]) - 1;
-
- assert(iter.index >= (tag_index[i] &~ mask));
- assert(iter.index <= (tag_index[i] | mask));
- i++;
+ unsigned long end = item_order_end(index[k], order[k]);
+
+ assert(k < MT_NUM_ENTRIES);
+ assert(radix_tree_tag_get(&tree, iter.index, 2));
+ assert(iter.index >= index[k]);
+ assert(iter.index <= end);
+ if (iter.index == end)
+ while (++k < MT_NUM_ENTRIES && !tag[k])
+ ;
}
}
- first = 1;
+ assert(!radix_tree_tagged(&tree, 0));
+
+ first = index[1];
radix_tree_range_tag_if_tagged(&tree, &first, ~0UL,
MT_NUM_ENTRIES, 1, 0);
- i = 0;
+
radix_tree_for_each_tagged(slot, &tree, &iter, 0, 0) {
- assert(iter.index == tag_index[i]);
- i++;
+ assert(radix_tree_tag_get(&tree, iter.index, 0));
+ assert(iter.index >= index[2]);
+ assert(iter.index <= item_order_end(index[2], order[2]));
+ break;
}
item_kill_tree(&tree);
diff --git a/tools/testing/radix-tree/test.c b/tools/testing/radix-tree/test.c
index a6e8099eaf4f..d6761d1a157a 100644
--- a/tools/testing/radix-tree/test.c
+++ b/tools/testing/radix-tree/test.c
@@ -4,6 +4,7 @@
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/bitops.h>
+#include <linux/err.h>
#include "test.h"
@@ -24,21 +25,23 @@ int item_tag_get(struct radix_tree_root *root, unsigned long index, int tag)
return radix_tree_tag_get(root, index, tag);
}
-int __item_insert(struct radix_tree_root *root, struct item *item,
- unsigned order)
+int item_insert(struct radix_tree_root *root, unsigned long index)
{
- return __radix_tree_insert(root, item->index, order, item);
+ return radix_tree_insert(root, index, item_create(index, index));
}
-int item_insert(struct radix_tree_root *root, unsigned long index)
+unsigned long item_order_end(unsigned long index, unsigned int order)
{
- return __item_insert(root, item_create(index), 0);
+ return index + (1ul << order) - 1;
}
int item_insert_order(struct radix_tree_root *root, unsigned long index,
- unsigned order)
+ unsigned int order)
{
- return __item_insert(root, item_create(index), order);
+ unsigned long end = item_order_end(index, order);
+
+ return PTR_ERR_OR_ZERO(radix_tree_fill_range(root, index, end,
+ item_create(index, end), 0));
}
int item_delete(struct radix_tree_root *root, unsigned long index)
@@ -47,17 +50,34 @@ int item_delete(struct radix_tree_root *root, unsigned long index)
if (item) {
assert(item->index == index);
+ assert(item->end == index);
free(item);
return 1;
}
return 0;
}
-struct item *item_create(unsigned long index)
+int item_delete_order(struct radix_tree_root *root, unsigned long index,
+ unsigned int order)
+{
+ struct item *item = radix_tree_lookup(root, index);
+ unsigned long end = item_order_end(index, order);
+
+ if (item) {
+ assert(item->index == index);
+ assert(item->end == end);
+ }
+ radix_tree_truncate_range(root, index, end);
+ free(item);
+ return !!item;
+}
+
+struct item *item_create(unsigned long index, unsigned long end)
{
struct item *ret = malloc(sizeof(*ret));
ret->index = index;
+ ret->end = end;
return ret;
}
@@ -207,10 +227,17 @@ void item_kill_tree(struct radix_tree_root *root)
int i;
for (i = 0; i < nfound; i++) {
- void *ret;
+ void *item;
+
+ if (items[i]->index != items[i]->end) {
+ radix_tree_truncate_range(root, items[i]->index,
+ items[i]->end);
+ free(items[i]);
+ break;
+ }
- ret = radix_tree_delete(root, items[i]->index);
- assert(ret == items[i]);
+ item = radix_tree_delete(root, items[i]->index);
+ assert(item == items[i]);
free(items[i]);
}
}
diff --git a/tools/testing/radix-tree/test.h b/tools/testing/radix-tree/test.h
index 217fb2403f09..93a6ce5e5a59 100644
--- a/tools/testing/radix-tree/test.h
+++ b/tools/testing/radix-tree/test.h
@@ -4,16 +4,19 @@
#include <linux/rcupdate.h>
struct item {
- unsigned long index;
+ unsigned long index, end;
};
-struct item *item_create(unsigned long index);
-int __item_insert(struct radix_tree_root *root, struct item *item,
- unsigned order);
+struct item *item_create(unsigned long index, unsigned long end);
int item_insert(struct radix_tree_root *root, unsigned long index);
-int item_insert_order(struct radix_tree_root *root, unsigned long index,
- unsigned order);
int item_delete(struct radix_tree_root *root, unsigned long index);
+
+unsigned long item_order_end(unsigned long index, unsigned int order);
+int item_insert_order(struct radix_tree_root *root, unsigned long index,
+ unsigned int order);
+int item_delete_order(struct radix_tree_root *root, unsigned long index,
+ unsigned int order);
+
struct item *item_lookup(struct radix_tree_root *root, unsigned long index);
void item_check_present(struct radix_tree_root *root, unsigned long index);
--
* [PATCH RFC 4/4] testing/radix-tree: benchmark for iterator
2016-08-27 14:14 [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range Konstantin Khlebnikov
2016-08-27 14:16 ` [PATCH RFC 2/4] lib/radix-tree: remove sibling entries Konstantin Khlebnikov
2016-08-27 14:16 ` [PATCH RFC 3/4] testing/radix-tree: replace multi-order with range operations Konstantin Khlebnikov
@ 2016-08-27 14:16 ` Konstantin Khlebnikov
2016-08-30 22:39 ` [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range Ross Zwisler
3 siblings, 0 replies; 5+ messages in thread
From: Konstantin Khlebnikov @ 2016-08-27 14:16 UTC (permalink / raw)
To: linux-mm; +Cc: Andrew Morton, linux-kernel, Kirill A. Shutemov, Ross Zwisler
This adds a simple benchmark for the iterator, similar to the one I used for
commit 78c1d78488a3 ("radix-tree: introduce bit-optimized iterator").
Building with 'make BENCHMARK=1' sets the radix-tree order to 6 and adds -O2;
this gives performance comparable to the in-kernel radix tree.
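The calibration trick used by benchmark_iter() below can be sketched in
isolation (hypothetical toy_* name): if a single pass is too short to time
reliably, the loop count is scaled so the whole run takes roughly a quarter
of a second, and the per-pass time is then averaged over those loops.

```c
#include <assert.h>

#define NSEC_PER_SEC 1000000000L

/* Given the measured duration of one pass, return how many loops to
 * run so the total takes ~0.25 s; same logic as the "goto again"
 * calibration in benchmark_iter(), plus a guard against a zero
 * measurement. */
static long long toy_calibrate_loops(long long single_pass_nsec)
{
	if (single_pass_nsec <= 0)
		return 1;
	if (single_pass_nsec * 5 < NSEC_PER_SEC)
		return NSEC_PER_SEC / single_pass_nsec / 4 + 1;
	return 1;
}
```

Averaging over many loops this way keeps clock_gettime() granularity and
scheduling noise from dominating short measurements.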
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
---
tools/testing/radix-tree/Makefile | 6 ++
tools/testing/radix-tree/benchmark.c | 101 +++++++++++++++++++++++++++++++
tools/testing/radix-tree/linux/kernel.h | 4 +
tools/testing/radix-tree/main.c | 2 +
tools/testing/radix-tree/test.h | 1
5 files changed, 113 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/radix-tree/benchmark.c
diff --git a/tools/testing/radix-tree/Makefile b/tools/testing/radix-tree/Makefile
index 6079ec142685..1594335d1ed6 100644
--- a/tools/testing/radix-tree/Makefile
+++ b/tools/testing/radix-tree/Makefile
@@ -4,7 +4,11 @@ LDFLAGS += -lpthread -lurcu
TARGETS = main
OFILES = main.o radix-tree.o linux.o test.o tag_check.o find_next_bit.o \
regression1.o regression2.o regression3.o multiorder.o \
- iteration_check.o
+ iteration_check.o benchmark.o
+
+ifdef BENCHMARK
+ CFLAGS += -DBENCHMARK=1 -O2
+endif
targets: $(TARGETS)
diff --git a/tools/testing/radix-tree/benchmark.c b/tools/testing/radix-tree/benchmark.c
new file mode 100644
index 000000000000..05d46071bf37
--- /dev/null
+++ b/tools/testing/radix-tree/benchmark.c
@@ -0,0 +1,101 @@
+/*
+ * benchmark.c:
+ * Author: Konstantin Khlebnikov <koct9i@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ */
+#include <linux/radix-tree.h>
+#include <linux/slab.h>
+#include <linux/errno.h>
+#include <time.h>
+#include "test.h"
+
+#define NSEC_PER_SEC 1000000000L
+
+static long long benchmark_iter(struct radix_tree_root *root, bool tagged)
+{
+ volatile unsigned long sink = 0;
+ struct radix_tree_iter iter;
+ struct timespec start, finish;
+ long long nsec;
+ int l, loops = 1;
+ void **slot;
+
+#ifdef BENCHMARK
+again:
+#endif
+ clock_gettime(CLOCK_MONOTONIC, &start);
+ for (l = 0; l < loops; l++) {
+ if (tagged) {
+ radix_tree_for_each_tagged(slot, root, &iter, 0, 0)
+ sink ^= (unsigned long)slot;
+ } else {
+ radix_tree_for_each_slot(slot, root, &iter, 0)
+ sink ^= (unsigned long)slot;
+ }
+ }
+ clock_gettime(CLOCK_MONOTONIC, &finish);
+
+ nsec = (finish.tv_sec - start.tv_sec) * NSEC_PER_SEC +
+ (finish.tv_nsec - start.tv_nsec);
+
+#ifdef BENCHMARK
+ if (loops == 1 && nsec * 5 < NSEC_PER_SEC) {
+ loops = NSEC_PER_SEC / nsec / 4 + 1;
+ goto again;
+ }
+#endif
+
+ nsec /= loops;
+ return nsec;
+}
+
+static void benchmark_size(unsigned long size, unsigned long step, int order)
+{
+ RADIX_TREE(tree, GFP_KERNEL);
+ long long normal, tagged;
+ unsigned long index;
+
+ for (index = 0 ; index < size ; index += step) {
+ item_insert_order(&tree, index, order);
+ radix_tree_tag_set(&tree, item_order_end(index, order), 0);
+ }
+
+ tagged = benchmark_iter(&tree, true);
+ normal = benchmark_iter(&tree, false);
+
+ printf("Size %ld, step %6ld, order %d tagged %10lld ns, normal %10lld ns\n",
+ size, step, order, tagged, normal);
+
+ item_kill_tree(&tree);
+}
+
+void benchmark(void)
+{
+ unsigned long size[] = {1 << 10, 1 << 20,
+#ifdef BENCHMARK
+ 1 << 27,
+#endif
+ 0};
+ unsigned long step[] = {1, 2, 7, 15, 63, 64, 65,
+ 128, 256, 512, 12345, 0};
+ int c, s;
+
+ printf("starting benchmarks\n");
+ printf("RADIX_TREE_MAP_SHIFT = %d\n", RADIX_TREE_MAP_SHIFT);
+
+ for (c = 0; size[c]; c++)
+ for (s = 0; step[s]; s++)
+ benchmark_size(size[c], step[s], 0);
+
+ for (c = 0; size[c]; c++)
+ for (s = 0; step[s]; s++)
+ benchmark_size(size[c], step[s] << 9, 9);
+}
diff --git a/tools/testing/radix-tree/linux/kernel.h b/tools/testing/radix-tree/linux/kernel.h
index 52714e86991b..ddabc495423f 100644
--- a/tools/testing/radix-tree/linux/kernel.h
+++ b/tools/testing/radix-tree/linux/kernel.h
@@ -10,7 +10,11 @@
#include "../../include/linux/compiler.h"
#include "../../../include/linux/kconfig.h"
+#ifdef BENCHMARK
+#define RADIX_TREE_MAP_SHIFT 6
+#else
#define RADIX_TREE_MAP_SHIFT 3
+#endif
#ifndef NULL
#define NULL 0
diff --git a/tools/testing/radix-tree/main.c b/tools/testing/radix-tree/main.c
index daa9010693e8..6e9a02cb7b97 100644
--- a/tools/testing/radix-tree/main.c
+++ b/tools/testing/radix-tree/main.c
@@ -335,6 +335,8 @@ int main(int argc, char **argv)
iteration_test();
single_thread_tests(long_run);
+ benchmark();
+
sleep(1);
printf("after sleep(1): %d allocated\n", nr_allocated);
rcu_unregister_thread();
diff --git a/tools/testing/radix-tree/test.h b/tools/testing/radix-tree/test.h
index 93a6ce5e5a59..2529a622f1f2 100644
--- a/tools/testing/radix-tree/test.h
+++ b/tools/testing/radix-tree/test.h
@@ -31,6 +31,7 @@ void item_kill_tree(struct radix_tree_root *root);
void tag_check(void);
void multiorder_checks(void);
void iteration_test(void);
+void benchmark(void);
struct item *
item_tag_set(struct radix_tree_root *root, unsigned long index, int tag);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* Re: [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range
2016-08-27 14:14 [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range Konstantin Khlebnikov
` (2 preceding siblings ...)
2016-08-27 14:16 ` [PATCH RFC 4/4] testing/radix-tree: benchmark for iterator Konstantin Khlebnikov
@ 2016-08-30 22:39 ` Ross Zwisler
3 siblings, 0 replies; 5+ messages in thread
From: Ross Zwisler @ 2016-08-30 22:39 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: linux-mm, Andrew Morton, linux-kernel, Kirill A. Shutemov, Ross Zwisler
On Sat, Aug 27, 2016 at 05:14:34PM +0300, Konstantin Khlebnikov wrote:
> Tags should be set only for last index of THP range: this way iterator
> will find them regardless of starting index.
I don't think this works well for DAX. We really want to have the tags be
consistent for all indices within a multi-order range. Meaning, if I take an
order-9 fault and then get a PTE write fault anywhere within that range, I
want to be able to do a lookup, find the one canonical entry that has my
dirty tag, flush, and eventually clear that one tag.
I agree that it's *possible* to do all of this with your code, but it puts a
lot of onus on the user. I would now need two paths: one for order-0 entries,
and one for multi-order entries where I know to use a specific entry as my
canonical entry and can count on the lock bit, on tags, etc.
This was actually how it was done with the old PMD code: we used the first
aligned index of the PMD as the one source of truth. On every fault I would
first check whether there was a PMD-aligned entry, and if not I would treat
it like a normal 4k fault. The multi-order radix tree with sibling entries
was a huge step forward.
I guess my question is the same as Matthew's: what is the problem you need to
solve with this code, and why can't the current code be made to solve it?