linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5] hashtable: introduce a small and naive hashtable
@ 2012-09-15  9:53 Sasha Levin
  2012-09-15 15:14 ` Mathieu Desnoyers
  0 siblings, 1 reply; 7+ messages in thread
From: Sasha Levin @ 2012-09-15  9:53 UTC (permalink / raw)
  To: torvalds
  Cc: tj, akpm, linux-kernel, rostedt, ebiederm, mathieu.desnoyers,
	neilb, bfields, ejt, snitzer, edumazet, josh, David.Laight,
	rmallon, palves, Sasha Levin

This hashtable implementation is using hlist buckets to provide a simple
hashtable to prevent it from getting reimplemented all over the kernel.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---

Changes from v4:

 - Addressed comments by Mathieu Desnoyers.

 include/linux/hashtable.h | 190 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 190 insertions(+)
 create mode 100644 include/linux/hashtable.h

diff --git a/include/linux/hashtable.h b/include/linux/hashtable.h
new file mode 100644
index 0000000..6d0c6c2
--- /dev/null
+++ b/include/linux/hashtable.h
@@ -0,0 +1,190 @@
+/*
+ * Hash table implementation
+ * (C) 2012  Sasha Levin <levinsasha928@gmail.com>
+ */
+
+#ifndef _LINUX_HASHTABLE_H
+#define _LINUX_HASHTABLE_H
+
+#include <linux/list.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/hash.h>
+#include <linux/rculist.h>
+
+#define DEFINE_HASHTABLE(name, bits)						\
+	struct hlist_head name[HASH_SIZE(bits)] =				\
+			{ [0 ... HASH_SIZE(bits) - 1] = HLIST_HEAD_INIT }
+
+#define DECLARE_HASHTABLE(name, bits)                                   	\
+	struct hlist_head name[1 << (bits)]
+
+#define HASH_SIZE(name) (1 << HASH_BITS(name))
+#define HASH_BITS(name) ilog2(ARRAY_SIZE(name))
+
+/* Use hash_32 when possible to allow for fast 32bit hashing in 64bit kernels. */
+#define hash_min(val, bits)							\
+({										\
+	sizeof(val) <= 4 ?							\
+	hash_32(val, bits) :							\
+	hash_long(val, bits);							\
+})
+
+/**
+ * hash_init - initialize a hash table
+ * @hashtable: hashtable to be initialized
+ *
+ * Calculates the size of the hashtable from the given parameter, otherwise
+ * same as hash_init_size.
+ *
+ * This has to be a macro since HASH_BITS() will not work on pointers since
+ * it calculates the size during preprocessing.
+ */
+#define hash_init(hashtable)							\
+({										\
+	int __i;								\
+										\
+	for (__i = 0; __i < HASH_BITS(hashtable); __i++)			\
+		INIT_HLIST_HEAD(&hashtable[__i]);				\
+})
+
+/**
+ * hash_add - add an object to a hashtable
+ * @hashtable: hashtable to add to
+ * @node: the &struct hlist_node of the object to be added
+ * @key: the key of the object to be added
+ */
+#define hash_add(hashtable, node, key)						\
+	hlist_add_head(node, &hashtable[hash_min(key, HASH_BITS(hashtable))]);
+
+/**
+ * hash_add_rcu - add an object to a rcu enabled hashtable
+ * @hashtable: hashtable to add to
+ * @node: the &struct hlist_node of the object to be added
+ * @key: the key of the object to be added
+ */
+#define hash_add_rcu(hashtable, node, key)					\
+	hlist_add_head_rcu(node, &hashtable[hash_min(key, HASH_BITS(hashtable))]);
+
+/**
+ * hash_hashed - check whether an object is in any hashtable
+ * @node: the &struct hlist_node of the object to be checked
+ */
+#define hash_hashed(node) (!hlist_unhashed(node))
+
+/**
+ * hash_empty - check whether a hashtable is empty
+ * @hashtable: hashtable to check
+ *
+ * This has to be a macro since HASH_BITS() will not work on pointers since
+ * it calculates the size during preprocessing.
+ */
+#define hash_empty(hashtable)							\
+({										\
+	int __i;								\
+	bool __ret = true;							\
+										\
+	for (__i = 0; __i < HASH_SIZE(hashtable); __i++)			\
+		if (!hlist_empty(&hashtable[__i]))				\
+			__ret = false;						\
+										\
+	__ret;									\
+})
+
+/**
+ * hash_del - remove an object from a hashtable
+ * @node: &struct hlist_node of the object to remove
+ */
+static inline void hash_del(struct hlist_node *node)
+{
+	hlist_del_init(node);
+}
+
+/**
+ * hash_del_rcu - remove an object from a rcu enabled hashtable
+ * @node: &struct hlist_node of the object to remove
+ */
+static inline void hash_del_rcu(struct hlist_node *node)
+{
+	hlist_del_init_rcu(node);
+}
+
+/**
+ * hash_for_each - iterate over a hashtable
+ * @name: hashtable to iterate
+ * @bkt: integer to use as bucket loop cursor
+ * @node: the &struct list_head to use as a loop cursor for each entry
+ * @obj: the type * to use as a loop cursor for each entry
+ * @member: the name of the hlist_node within the struct
+ */
+#define hash_for_each(name, bkt, node, obj, member)				\
+	for (bkt = 0, node = NULL; node == NULL && bkt < HASH_SIZE(name); bkt++)\
+		hlist_for_each_entry(obj, node, &name[bkt], member)
+
+/**
+ * hash_for_each_rcu - iterate over a rcu enabled hashtable
+ * @name: hashtable to iterate
+ * @bkt: integer to use as bucket loop cursor
+ * @node: the &struct list_head to use as a loop cursor for each entry
+ * @obj: the type * to use as a loop cursor for each entry
+ * @member: the name of the hlist_node within the struct
+ */
+#define hash_for_each_rcu(name, bkt, node, obj, member)				\
+	for (bkt = 0, node = NULL; node == NULL && bkt < HASH_SIZE(name); bkt++)\
+		hlist_for_each_entry_rcu(obj, node, &name[bkt], member)
+
+/**
+ * hash_for_each_safe - iterate over a hashtable safe against removal of
+ * hash entry
+ * @name: hashtable to iterate
+ * @bkt: integer to use as bucket loop cursor
+ * @node: the &struct list_head to use as a loop cursor for each entry
+ * @tmp: a &struct used for temporary storage
+ * @obj: the type * to use as a loop cursor for each entry
+ * @member: the name of the hlist_node within the struct
+ */
+#define hash_for_each_safe(name, bkt, node, tmp, obj, member)			\
+	for (bkt = 0, node = NULL; node == NULL && bkt < HASH_SIZE(name); bkt++)\
+		hlist_for_each_entry_safe(obj, node, tmp, &name[bkt], member)
+
+/**
+ * hash_for_each_possible - iterate over all possible objects hashing to the
+ * same bucket
+ * @name: hashtable to iterate
+ * @obj: the type * to use as a loop cursor for each entry
+ * @node: the &struct list_head to use as a loop cursor for each entry
+ * @member: the name of the hlist_node within the struct
+ * @key: the key of the objects to iterate over
+ */
+#define hash_for_each_possible(name, obj, node, member, key)			\
+	hlist_for_each_entry(obj, node,	&name[hash_min(key, HASH_BITS(name))], member)
+
+/**
+ * hash_for_each_possible_rcu - iterate over all possible objects hashing to the
+ * same bucket in an rcu enabled hashtable
+ * in a rcu enabled hashtable
+ * @name: hashtable to iterate
+ * @obj: the type * to use as a loop cursor for each entry
+ * @node: the &struct list_head to use as a loop cursor for each entry
+ * @member: the name of the hlist_node within the struct
+ * @key: the key of the objects to iterate over
+ */
+#define hash_for_each_possible_rcu(name, obj, node, member, key)		\
+	hlist_for_each_entry_rcu(obj, node, &name[hash_min(key, HASH_BITS(name))], member)
+
+/**
+ * hash_for_each_possible_safe - iterate over all possible objects hashing to the
+ * same bucket safe against removals
+ * @name: hashtable to iterate
+ * @obj: the type * to use as a loop cursor for each entry
+ * @node: the &struct list_head to use as a loop cursor for each entry
+ * @tmp: a &struct used for temporary storage
+ * @member: the name of the hlist_node within the struct
+ * @key: the key of the objects to iterate over
+ */
+#define hash_for_each_possible_safe(name, obj, node, tmp, member, key)		\
+	hlist_for_each_entry_safe(obj, node, tmp,				\
+		&name[hash_min(key, HASH_BITS(name))], member)
+
+
+#endif
-- 
1.7.12


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v5] hashtable: introduce a small and naive hashtable
  2012-09-15  9:53 [PATCH v5] hashtable: introduce a small and naive hashtable Sasha Levin
@ 2012-09-15 15:14 ` Mathieu Desnoyers
  2012-09-15 15:47   ` Sasha Levin
  0 siblings, 1 reply; 7+ messages in thread
From: Mathieu Desnoyers @ 2012-09-15 15:14 UTC (permalink / raw)
  To: Sasha Levin
  Cc: torvalds, tj, akpm, linux-kernel, rostedt, ebiederm, neilb,
	bfields, ejt, snitzer, edumazet, josh, David.Laight, rmallon,
	palves

* Sasha Levin (levinsasha928@gmail.com) wrote:
> This hashtable implementation is using hlist buckets to provide a simple
> hashtable to prevent it from getting reimplemented all over the kernel.
> 
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> ---
> 
> Changes from v4:
> 
>  - Addressed comments by Mathieu Desnoyers.
> 
>  include/linux/hashtable.h | 190 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 190 insertions(+)
>  create mode 100644 include/linux/hashtable.h
> 
> diff --git a/include/linux/hashtable.h b/include/linux/hashtable.h
> new file mode 100644
> index 0000000..6d0c6c2
> --- /dev/null
> +++ b/include/linux/hashtable.h
> @@ -0,0 +1,190 @@
> +/*
> + * Hash table implementation
> + * (C) 2012  Sasha Levin <levinsasha928@gmail.com>
> + */
> +
> +#ifndef _LINUX_HASHTABLE_H
> +#define _LINUX_HASHTABLE_H
> +
> +#include <linux/list.h>
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/hash.h>
> +#include <linux/rculist.h>
> +
> +#define DEFINE_HASHTABLE(name, bits)						\
> +	struct hlist_head name[HASH_SIZE(bits)] =				\
> +			{ [0 ... HASH_SIZE(bits) - 1] = HLIST_HEAD_INIT }
> +
> +#define DECLARE_HASHTABLE(name, bits)                                   	\
> +	struct hlist_head name[1 << (bits)]
> +
> +#define HASH_SIZE(name) (1 << HASH_BITS(name))
> +#define HASH_BITS(name) ilog2(ARRAY_SIZE(name))
> +
> +/* Use hash_32 when possible to allow for fast 32bit hashing in 64bit kernels. */
> +#define hash_min(val, bits)							\
> +({										\
> +	sizeof(val) <= 4 ?							\
> +	hash_32(val, bits) :							\
> +	hash_long(val, bits);							\
> +})
> +
> +/**
> + * hash_init - initialize a hash table
> + * @hashtable: hashtable to be initialized
> + *
> + * Calculates the size of the hashtable from the given parameter, otherwise
> + * same as hash_init_size.
> + *
> + * This has to be a macro since HASH_BITS() will not work on pointers since
> + * it calculates the size during preprocessing.
> + */
> +#define hash_init(hashtable)							\
> +({										\
> +	int __i;								\
> +										\
> +	for (__i = 0; __i < HASH_BITS(hashtable); __i++)			\

I think this fails to initialize the whole table. You'd need

  HASH_BITS -> HASH_SIZE

Which brings the following question: how did you test this code ? It
would be nice to have a small test module along with this patchset that
stress-tests this simple hash table in various configurations (on stack,
in data, etc).

Also, I don't see how hash_init() can be useful, since DEFINE_HASHTABLE
already initialize the array entries. If it is for dynamically allocated
hash tables, this also won't work, considering the comment "This has to
be a macro since HASH_BITS() will not work on pointers since it
calculates the size during preprocessing."

> +		INIT_HLIST_HEAD(&hashtable[__i]);				\
> +})
> +
> +/**
> + * hash_add - add an object to a hashtable
> + * @hashtable: hashtable to add to
> + * @node: the &struct hlist_node of the object to be added
> + * @key: the key of the object to be added
> + */
> +#define hash_add(hashtable, node, key)						\
> +	hlist_add_head(node, &hashtable[hash_min(key, HASH_BITS(hashtable))]);
> +
> +/**
> + * hash_add_rcu - add an object to a rcu enabled hashtable
> + * @hashtable: hashtable to add to
> + * @node: the &struct hlist_node of the object to be added
> + * @key: the key of the object to be added
> + */
> +#define hash_add_rcu(hashtable, node, key)					\
> +	hlist_add_head_rcu(node, &hashtable[hash_min(key, HASH_BITS(hashtable))]);
> +
> +/**
> + * hash_hashed - check whether an object is in any hashtable
> + * @node: the &struct hlist_node of the object to be checked
> + */
> +#define hash_hashed(node) (!hlist_unhashed(node))
> +
> +/**
> + * hash_empty - check whether a hashtable is empty
> + * @hashtable: hashtable to check
> + *
> + * This has to be a macro since HASH_BITS() will not work on pointers since
> + * it calculates the size during preprocessing.

So I get that hash_empty is only expected to be used for statically
defined hash tables ? Does it support hash tables in dynamically
allocated memory ? On the stack ? If these are never expected to be
supported, this should be documented.

Thanks,

Mathieu

> + */
> +#define hash_empty(hashtable)							\
> +({										\
> +	int __i;								\
> +	bool __ret = true;							\
> +										\
> +	for (__i = 0; __i < HASH_SIZE(hashtable); __i++)			\
> +		if (!hlist_empty(&hashtable[__i]))				\
> +			__ret = false;						\
> +										\
> +	__ret;									\
> +})
> +
> +/**
> + * hash_del - remove an object from a hashtable
> + * @node: &struct hlist_node of the object to remove
> + */
> +static inline void hash_del(struct hlist_node *node)
> +{
> +	hlist_del_init(node);
> +}
> +
> +/**
> + * hash_del_rcu - remove an object from a rcu enabled hashtable
> + * @node: &struct hlist_node of the object to remove
> + */
> +static inline void hash_del_rcu(struct hlist_node *node)
> +{
> +	hlist_del_init_rcu(node);
> +}
> +
> +/**
> + * hash_for_each - iterate over a hashtable
> + * @name: hashtable to iterate
> + * @bkt: integer to use as bucket loop cursor
> + * @node: the &struct list_head to use as a loop cursor for each entry
> + * @obj: the type * to use as a loop cursor for each entry
> + * @member: the name of the hlist_node within the struct
> + */
> +#define hash_for_each(name, bkt, node, obj, member)				\
> +	for (bkt = 0, node = NULL; node == NULL && bkt < HASH_SIZE(name); bkt++)\
> +		hlist_for_each_entry(obj, node, &name[bkt], member)
> +
> +/**
> + * hash_for_each_rcu - iterate over a rcu enabled hashtable
> + * @name: hashtable to iterate
> + * @bkt: integer to use as bucket loop cursor
> + * @node: the &struct list_head to use as a loop cursor for each entry
> + * @obj: the type * to use as a loop cursor for each entry
> + * @member: the name of the hlist_node within the struct
> + */
> +#define hash_for_each_rcu(name, bkt, node, obj, member)				\
> +	for (bkt = 0, node = NULL; node == NULL && bkt < HASH_SIZE(name); bkt++)\
> +		hlist_for_each_entry_rcu(obj, node, &name[bkt], member)
> +
> +/**
> + * hash_for_each_safe - iterate over a hashtable safe against removal of
> + * hash entry
> + * @name: hashtable to iterate
> + * @bkt: integer to use as bucket loop cursor
> + * @node: the &struct list_head to use as a loop cursor for each entry
> + * @tmp: a &struct used for temporary storage
> + * @obj: the type * to use as a loop cursor for each entry
> + * @member: the name of the hlist_node within the struct
> + */
> +#define hash_for_each_safe(name, bkt, node, tmp, obj, member)			\
> +	for (bkt = 0, node = NULL; node == NULL && bkt < HASH_SIZE(name); bkt++)\
> +		hlist_for_each_entry_safe(obj, node, tmp, &name[bkt], member)
> +
> +/**
> + * hash_for_each_possible - iterate over all possible objects hashing to the
> + * same bucket
> + * @name: hashtable to iterate
> + * @obj: the type * to use as a loop cursor for each entry
> + * @node: the &struct list_head to use as a loop cursor for each entry
> + * @member: the name of the hlist_node within the struct
> + * @key: the key of the objects to iterate over
> + */
> +#define hash_for_each_possible(name, obj, node, member, key)			\
> +	hlist_for_each_entry(obj, node,	&name[hash_min(key, HASH_BITS(name))], member)
> +
> +/**
> + * hash_for_each_possible_rcu - iterate over all possible objects hashing to the
> + * same bucket in an rcu enabled hashtable
> + * in a rcu enabled hashtable
> + * @name: hashtable to iterate
> + * @obj: the type * to use as a loop cursor for each entry
> + * @node: the &struct list_head to use as a loop cursor for each entry
> + * @member: the name of the hlist_node within the struct
> + * @key: the key of the objects to iterate over
> + */
> +#define hash_for_each_possible_rcu(name, obj, node, member, key)		\
> +	hlist_for_each_entry_rcu(obj, node, &name[hash_min(key, HASH_BITS(name))], member)
> +
> +/**
> + * hash_for_each_possible_safe - iterate over all possible objects hashing to the
> + * same bucket safe against removals
> + * @name: hashtable to iterate
> + * @obj: the type * to use as a loop cursor for each entry
> + * @node: the &struct list_head to use as a loop cursor for each entry
> + * @tmp: a &struct used for temporary storage
> + * @member: the name of the hlist_node within the struct
> + * @key: the key of the objects to iterate over
> + */
> +#define hash_for_each_possible_safe(name, obj, node, tmp, member, key)		\
> +	hlist_for_each_entry_safe(obj, node, tmp,				\
> +		&name[hash_min(key, HASH_BITS(name))], member)
> +
> +
> +#endif
> -- 
> 1.7.12
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v5] hashtable: introduce a small and naive hashtable
  2012-09-15 15:14 ` Mathieu Desnoyers
@ 2012-09-15 15:47   ` Sasha Levin
  2012-09-16 19:33     ` Mathieu Desnoyers
  0 siblings, 1 reply; 7+ messages in thread
From: Sasha Levin @ 2012-09-15 15:47 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: torvalds, tj, akpm, linux-kernel, rostedt, ebiederm, neilb,
	bfields, ejt, snitzer, edumazet, josh, David.Laight, rmallon,
	palves

On Sat, Sep 15, 2012 at 5:14 PM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> * Sasha Levin (levinsasha928@gmail.com) wrote:
>> This hashtable implementation is using hlist buckets to provide a simple
>> hashtable to prevent it from getting reimplemented all over the kernel.
>>
>> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
>> ---
>>
>> Changes from v4:
>>
>>  - Addressed comments by Mathieu Desnoyers.
>>
>>  include/linux/hashtable.h | 190 ++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 190 insertions(+)
>>  create mode 100644 include/linux/hashtable.h
>>
>> diff --git a/include/linux/hashtable.h b/include/linux/hashtable.h
>> new file mode 100644
>> index 0000000..6d0c6c2
>> --- /dev/null
>> +++ b/include/linux/hashtable.h
>> @@ -0,0 +1,190 @@
>> +/*
>> + * Hash table implementation
>> + * (C) 2012  Sasha Levin <levinsasha928@gmail.com>
>> + */
>> +
>> +#ifndef _LINUX_HASHTABLE_H
>> +#define _LINUX_HASHTABLE_H
>> +
>> +#include <linux/list.h>
>> +#include <linux/types.h>
>> +#include <linux/kernel.h>
>> +#include <linux/hash.h>
>> +#include <linux/rculist.h>
>> +
>> +#define DEFINE_HASHTABLE(name, bits)                                         \
>> +     struct hlist_head name[HASH_SIZE(bits)] =                               \
>> +                     { [0 ... HASH_SIZE(bits) - 1] = HLIST_HEAD_INIT }
>> +
>> +#define DECLARE_HASHTABLE(name, bits)                                        \
>> +     struct hlist_head name[1 << (bits)]
>> +
>> +#define HASH_SIZE(name) (1 << HASH_BITS(name))
>> +#define HASH_BITS(name) ilog2(ARRAY_SIZE(name))
>> +
>> +/* Use hash_32 when possible to allow for fast 32bit hashing in 64bit kernels. */
>> +#define hash_min(val, bits)                                                  \
>> +({                                                                           \
>> +     sizeof(val) <= 4 ?                                                      \
>> +     hash_32(val, bits) :                                                    \
>> +     hash_long(val, bits);                                                   \
>> +})
>> +
>> +/**
>> + * hash_init - initialize a hash table
>> + * @hashtable: hashtable to be initialized
>> + *
>> + * Calculates the size of the hashtable from the given parameter, otherwise
>> + * same as hash_init_size.
>> + *
>> + * This has to be a macro since HASH_BITS() will not work on pointers since
>> + * it calculates the size during preprocessing.
>> + */
>> +#define hash_init(hashtable)                                                 \
>> +({                                                                           \
>> +     int __i;                                                                \
>> +                                                                             \
>> +     for (__i = 0; __i < HASH_BITS(hashtable); __i++)                        \
>
> I think this fails to initialize the whole table. You'd need
>
>   HASH_BITS -> HASH_SIZE

Right.

Unfortunately it's pretty hard catching something like this :/

> Which brings the following question: how did you test this code ? It
> would be nice to have a small test module along with this patchset that
> stress-tests this simple hash table in various configurations (on stack,
> in data, etc).

I do two things:

 - A small userspace test (since this header works just fine from
userspace as well).
 - Build a kernel with all ~20 commits and stresstest it a bit, since
things like workqueue were converted, it finds issues pretty fast.

I agree that there should be something similar to the list sort test
we currently have, but I'm not sure if it should be in the scope of
this initial patch.

> Also, I don't see how hash_init() can be useful, since DEFINE_HASHTABLE
> already initialize the array entries. If it is for dynamically allocated
> hash tables, this also won't work, considering the comment "This has to
> be a macro since HASH_BITS() will not work on pointers since it
> calculates the size during preprocessing."

hash_init() is actually used to clear the hashtable in two of the
commits I've used to send along with this one. Since as you've said we
now have DEFINE_HASHTABLE and DECLARE_HASHTABLE, maybe it should be
now called hash_clear() or something similar. I don't have a strong
opinion about this.

>> +             INIT_HLIST_HEAD(&hashtable[__i]);                               \
>> +})
>> +
>> +/**
>> + * hash_add - add an object to a hashtable
>> + * @hashtable: hashtable to add to
>> + * @node: the &struct hlist_node of the object to be added
>> + * @key: the key of the object to be added
>> + */
>> +#define hash_add(hashtable, node, key)                                               \
>> +     hlist_add_head(node, &hashtable[hash_min(key, HASH_BITS(hashtable))]);
>> +
>> +/**
>> + * hash_add_rcu - add an object to a rcu enabled hashtable
>> + * @hashtable: hashtable to add to
>> + * @node: the &struct hlist_node of the object to be added
>> + * @key: the key of the object to be added
>> + */
>> +#define hash_add_rcu(hashtable, node, key)                                   \
>> +     hlist_add_head_rcu(node, &hashtable[hash_min(key, HASH_BITS(hashtable))]);
>> +
>> +/**
>> + * hash_hashed - check whether an object is in any hashtable
>> + * @node: the &struct hlist_node of the object to be checked
>> + */
>> +#define hash_hashed(node) (!hlist_unhashed(node))
>> +
>> +/**
>> + * hash_empty - check whether a hashtable is empty
>> + * @hashtable: hashtable to check
>> + *
>> + * This has to be a macro since HASH_BITS() will not work on pointers since
>> + * it calculates the size during preprocessing.
>
> So I get that hash_empty is only expected to be used for statically
> defined hash tables ? Does it support hash tables in dynamically
> allocated memory ? On the stack ? If these are never expected to be
> supported, this should be documented.

Like the rest of the code, this hashtable implementation only works
with non-dynamically allocated hashtables (on the stack/statically
defined/etc are supported).

How would you create a dynamic hashtable with this code to begin with?


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v5] hashtable: introduce a small and naive hashtable
  2012-09-15 15:47   ` Sasha Levin
@ 2012-09-16 19:33     ` Mathieu Desnoyers
  2012-09-16 19:48       ` Sasha Levin
  0 siblings, 1 reply; 7+ messages in thread
From: Mathieu Desnoyers @ 2012-09-16 19:33 UTC (permalink / raw)
  To: Sasha Levin
  Cc: torvalds, tj, akpm, linux-kernel, rostedt, ebiederm, neilb,
	bfields, ejt, snitzer, edumazet, josh, David.Laight, rmallon,
	palves

* Sasha Levin (levinsasha928@gmail.com) wrote:
> On Sat, Sep 15, 2012 at 5:14 PM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> > * Sasha Levin (levinsasha928@gmail.com) wrote:
> >> This hashtable implementation is using hlist buckets to provide a simple
> >> hashtable to prevent it from getting reimplemented all over the kernel.
> >>
> >> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> >> ---
> >>
> >> Changes from v4:
> >>
> >>  - Addressed comments by Mathieu Desnoyers.
> >>
> >>  include/linux/hashtable.h | 190 ++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 190 insertions(+)
> >>  create mode 100644 include/linux/hashtable.h
> >>
> >> diff --git a/include/linux/hashtable.h b/include/linux/hashtable.h
> >> new file mode 100644
> >> index 0000000..6d0c6c2
> >> --- /dev/null
> >> +++ b/include/linux/hashtable.h
> >> @@ -0,0 +1,190 @@
> >> +/*
> >> + * Hash table implementation
> >> + * (C) 2012  Sasha Levin <levinsasha928@gmail.com>
> >> + */
> >> +
> >> +#ifndef _LINUX_HASHTABLE_H
> >> +#define _LINUX_HASHTABLE_H
> >> +
> >> +#include <linux/list.h>
> >> +#include <linux/types.h>
> >> +#include <linux/kernel.h>
> >> +#include <linux/hash.h>
> >> +#include <linux/rculist.h>
> >> +
> >> +#define DEFINE_HASHTABLE(name, bits)                                         \
> >> +     struct hlist_head name[HASH_SIZE(bits)] =                               \
> >> +                     { [0 ... HASH_SIZE(bits) - 1] = HLIST_HEAD_INIT }
> >> +
> >> +#define DECLARE_HASHTABLE(name, bits)                                        \
> >> +     struct hlist_head name[1 << (bits)]
> >> +
> >> +#define HASH_SIZE(name) (1 << HASH_BITS(name))
> >> +#define HASH_BITS(name) ilog2(ARRAY_SIZE(name))
> >> +
> >> +/* Use hash_32 when possible to allow for fast 32bit hashing in 64bit kernels. */
> >> +#define hash_min(val, bits)                                                  \
> >> +({                                                                           \
> >> +     sizeof(val) <= 4 ?                                                      \
> >> +     hash_32(val, bits) :                                                    \
> >> +     hash_long(val, bits);                                                   \
> >> +})
> >> +
> >> +/**
> >> + * hash_init - initialize a hash table
> >> + * @hashtable: hashtable to be initialized
> >> + *
> >> + * Calculates the size of the hashtable from the given parameter, otherwise
> >> + * same as hash_init_size.
> >> + *
> >> + * This has to be a macro since HASH_BITS() will not work on pointers since
> >> + * it calculates the size during preprocessing.
> >> + */
> >> +#define hash_init(hashtable)                                                 \
> >> +({                                                                           \
> >> +     int __i;                                                                \
> >> +                                                                             \
> >> +     for (__i = 0; __i < HASH_BITS(hashtable); __i++)                        \
> >
> > I think this fails to initialize the whole table. You'd need
> >
> >   HASH_BITS -> HASH_SIZE
> 
> Right.
> 
> Unfortunately it's pretty hard catching something like this :/
> 
> > Which brings the following question: how did you test this code ? It
> > would be nice to have a small test module along with this patchset that
> > stress-tests this simple hash table in various configurations (on stack,
> > in data, etc).
> 
> I do two things:
> 
>  - A small userspace test (since this header works just fine from
> userspace as well).
>  - Build a kernel with all ~20 commits and stresstest it a bit, since
> things like workqueue were converted, it finds issues pretty fast.
> 
> I agree that there should be something similar to the list sort test
> we currently have, but I'm not sure if it should be in the scope of
> this initial patch.

Fair enough.

> 
> > Also, I don't see how hash_init() can be useful, since DEFINE_HASHTABLE
> > already initialize the array entries. If it is for dynamically allocated
> > hash tables, this also won't work, considering the comment "This has to
> > be a macro since HASH_BITS() will not work on pointers since it
> > calculates the size during preprocessing."
> 
> hash_init() is actually used to clear the hashtable in two of the
> commits I've used to send along with this one. Since as you've said we
> now have DEFINE_HASHTABLE and DECLARE_HASHTABLE, maybe it should be
> now called hash_clear() or something similar. I don't have a strong
> opinion about this.

If it's used somewhere, then it's fine. People are used to "*_init"
functions, so I don't think there is a point in renaming this to
"clear".

> 
> >> +             INIT_HLIST_HEAD(&hashtable[__i]);                               \
> >> +})
> >> +
> >> +/**
> >> + * hash_add - add an object to a hashtable
> >> + * @hashtable: hashtable to add to
> >> + * @node: the &struct hlist_node of the object to be added
> >> + * @key: the key of the object to be added
> >> + */
> >> +#define hash_add(hashtable, node, key)                                               \
> >> +     hlist_add_head(node, &hashtable[hash_min(key, HASH_BITS(hashtable))]);
> >> +
> >> +/**
> >> + * hash_add_rcu - add an object to a rcu enabled hashtable
> >> + * @hashtable: hashtable to add to
> >> + * @node: the &struct hlist_node of the object to be added
> >> + * @key: the key of the object to be added
> >> + */
> >> +#define hash_add_rcu(hashtable, node, key)                                   \
> >> +     hlist_add_head_rcu(node, &hashtable[hash_min(key, HASH_BITS(hashtable))]);
> >> +
> >> +/**
> >> + * hash_hashed - check whether an object is in any hashtable
> >> + * @node: the &struct hlist_node of the object to be checked
> >> + */
> >> +#define hash_hashed(node) (!hlist_unhashed(node))
> >> +
> >> +/**
> >> + * hash_empty - check whether a hashtable is empty
> >> + * @hashtable: hashtable to check
> >> + *
> >> + * This has to be a macro since HASH_BITS() will not work on pointers since
> >> + * it calculates the size during preprocessing.
> >
> > So I get that hash_empty is only expected to be used for statically
> > defined hash tables ? Does it support hash tables in dynamically
> > allocated memory ? On the stack ? If these are never expected to be
> > supported, this should be documented.
> 
> Like the rest of the code, this hashtable implementation only works
> with non-dynamically allocated hashtables (on the stack/statically
> defined/etc are supported).
> 
> How would you create a dynamic hashtable with this code to begin with?

With a var. sized array in a structure, e.g.:

struct hashtable {
        size_t len;
        struct hlist_head t[];
};

Then create a hash table allocation function, and turn both hash_empty
and hash_init into functions that take a struct simplehash pointer as
parameter. The downside is to consume extra space (especially because
the array always has a power of 2 len), but the nice thing is that we
could use this code with a kmalloc'd/vmalloc'd hash table.

I'm not saying we need to do it (due to the space consumption downside),
but that we should at least document this limitation.

Thanks,

Mathieu


> 
> 
> Thanks,
> Sasha

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v5] hashtable: introduce a small and naive hashtable
  2012-09-16 19:33     ` Mathieu Desnoyers
@ 2012-09-16 19:48       ` Sasha Levin
  0 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2012-09-16 19:48 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: torvalds, tj, akpm, linux-kernel, rostedt, ebiederm, neilb,
	bfields, ejt, snitzer, edumazet, josh, David.Laight, rmallon,
	palves

On 09/16/2012 09:33 PM, Mathieu Desnoyers wrote:
>> How would you create a dynamic hashtable with this code to begin with?
> With a var. sized array in a structure, e.g.:
> 
> struct hashtable {
>         size_t len;
>         struct hlist_head t[];
> };
> 
> Then create a hash table allocation function, and turn both hash_empty
> and hash_init into functions that take a struct simplehash pointer as
> parameter. The downside is to consume extra space (especially because
> the array always has a power of 2 len), but the nice thing is that we
> could use this code with a kmalloc'd/vmalloc'd hash table.
> 
> I'm not saying we need to do it (due to the space consumption downside),
> but that we should at least document this limitation.

We've gone through the process of attempting to support dynamic hashtables in
the previous versions of this patch, this turned to complicate the code a lot
beyond simply changing the array of hlist_heads into the struct you proposed, so
it was decided to not support dynamic hashtables at this point at all.

I thought we've agreed that non-dynamically allocated hashtables are widespread
enough to have a dedicated abstraction of their own, and further hashtable
implementations may introduce their own API for that.

We can ofcourse open that for discussion again, but I'd rather not go back and
forth adding support for dynamic hashtables and removing it again.


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v5] hashtable: introduce a small and naive hashtable
  2012-09-15 23:08 Matthias Diener
@ 2012-09-16 11:30 ` Sasha Levin
  0 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2012-09-16 11:30 UTC (permalink / raw)
  To: Matthias Diener; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1746 bytes --]

On 09/16/2012 01:08 AM, Matthias Diener wrote:
> Sasha Levin (levinsasha928 <at> gmail.com) wrote:
>> On Sat, Sep 15, 2012 at 5:14 PM, Mathieu Desnoyers
>> <mathieu.desnoyers <at> efficios.com> wrote:
>>> * Sasha Levin (levinsasha928 <at> gmail.com) wrote:
> [...]
>>>> +#define hash_init(hashtable)                                                 \
>>>> +({                                                                           \
>>>> +     int __i;                                                                \
>>>> +                                                                             \
>>>> +     for (__i = 0; __i < HASH_BITS(hashtable); __i++)                        \
>>>
>>> I think this fails to initialize the whole table. You'd need
>>>
>>>   HASH_BITS -> HASH_SIZE
>>
>> Right.
>>
>> Unfortunately it's pretty hard catching something like this :/
>>
>>> Which brings the following question: how did you test this code ? It
>>> would be nice to have a small test module along with this patchset that
>>> stress-tests this simple hash table in various configurations (on stack,
>>> in data, etc).
>>
>> I do two things:
>>
>> - A small userspace test (since this header works just fine from
>> userspace as well).
> 
> 
> It would be interesting to run some experiments with this hashtable in
> userspace.
> Could you post the test code here?

Sure, I've attached the test code. There are 2 things to remember it:

 1. The code looks like crap :) I've never intended it to be seen by others.
 2. It should be used in the context of "sanitized" kernel headers so it could
be included directly. I usually work in the directory of lkvm, and compile this
code using:

	gcc -Iinclude/ -I../../include/ -O0 -ggdb  hashtest.c



[-- Attachment #2: hashtest.c --]
[-- Type: text/x-csrc, Size: 1352 bytes --]

#define BITS_PER_LONG 64
#include <linux/kernel.h>
#include <linux/hashtable.h>
#include <kvm/util.h>
#include <linux/log2.h>
#include <assert.h>

struct my_data {
	int x;
	struct hlist_node node;
};

static DECLARE_HASHTABLE(test, 10);
static struct my_data *values[1000000];

struct my_data *get_data(int v)
{
	struct my_data *p;
	struct hlist_node *n;

	hash_for_each_possible(test, p, n, node, v)
		if (p->x == v)
			return p;

	return NULL;
}

bool verify(int i)
{
	if (values[i] == NULL && get_data(i))
		printf("No data, but found at %p (%d)\n", get_data(i), i);
	if (values[i])
		return get_data(i) ? true : false;
	else
		return get_data(i) ? false : true;
}

int main(void)
{
	int i;

	hash_init(test);

	printf("Empty? %d\n", hash_empty(test));

	while (1) {
	for (i = 0; i < ARRAY_SIZE(values); i++) {
		int r = rand() % 3;

		if (i % 10000 == 0) {
			printf("\r%d ...", i);
			fflush(stdout);
		}

		switch (r) {
		case 0:
			assert(verify(i));
			if (values[i])
				break;
			values[i] = malloc(sizeof(values[i]));
			values[i]->x = i;
			hash_add(test, &values[i]->node, i);
			assert(verify(i));
			break;
		case 1:
			assert(verify(i));

			break;
		case 2:
			assert(verify(i));
			if (values[i]) {
				hash_del(&values[i]->node);
				free(values[i]);
				values[i] = NULL;
			}
			assert(verify(i));
			break;
		}
	}
	}
	return 0;
}

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v5] hashtable: introduce a small and naive hashtable
@ 2012-09-15 23:08 Matthias Diener
  2012-09-16 11:30 ` Sasha Levin
  0 siblings, 1 reply; 7+ messages in thread
From: Matthias Diener @ 2012-09-15 23:08 UTC (permalink / raw)
  To: linux-kernel

Sasha Levin (levinsasha928 <at> gmail.com) wrote:
>On Sat, Sep 15, 2012 at 5:14 PM, Mathieu Desnoyers
><mathieu.desnoyers <at> efficios.com> wrote:
>> * Sasha Levin (levinsasha928 <at> gmail.com) wrote:
[...]
>>> +#define hash_init(hashtable)                                                 \
>>> +({                                                                           \
>>> +     int __i;                                                                \
>>> +                                                                             \
>>> +     for (__i = 0; __i < HASH_BITS(hashtable); __i++)                        \
>>
>> I think this fails to initialize the whole table. You'd need
>>
>>   HASH_BITS -> HASH_SIZE
>
>Right.
>
>Unfortunately it's pretty hard catching something like this :/
>
>> Which brings the following question: how did you test this code ? It
>> would be nice to have a small test module along with this patchset that
>> stress-tests this simple hash table in various configurations (on stack,
>> in data, etc).
>
>I do two things:
>
> - A small userspace test (since this header works just fine from
>userspace as well).


It would be interesting to run some experiments with this hashtable in
userspace.
Could you post the test code here?

Thanks,
Matthias

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-09-16 19:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-15  9:53 [PATCH v5] hashtable: introduce a small and naive hashtable Sasha Levin
2012-09-15 15:14 ` Mathieu Desnoyers
2012-09-15 15:47   ` Sasha Levin
2012-09-16 19:33     ` Mathieu Desnoyers
2012-09-16 19:48       ` Sasha Levin
2012-09-15 23:08 Matthias Diener
2012-09-16 11:30 ` Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).