linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 1/25] compiler-gcc4.h: Correct verion check for __compiletime_error
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
@ 2012-09-25 23:29 ` Daniel Santos
  2012-09-25 23:30 ` [PATCH v5 2/25] compiler-gcc4.h: Reorder macros based upon gcc ver Daniel Santos
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:29 UTC (permalink / raw)
  To: Daniel Santos, Andrew Morton, David Howells, David Daney,
	David Rientjes, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

__attribute__((error(msg))) was introduced in gcc 4.3 (not 4.4) and as I
was unable to find any gcc bugs pertaining to it, I'm presuming that it
has functioned as advertised since 4.3.0.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc4.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 2f40791..10ce4fa 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -52,7 +52,7 @@
 #if __GNUC_MINOR__ > 0
 #define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
 #endif
-#if __GNUC_MINOR__ >= 4 && !defined(__CHECKER__)
+#if __GNUC_MINOR__ >= 3 && !defined(__CHECKER__)
 #define __compiletime_warning(message) __attribute__((warning(message)))
 #define __compiletime_error(message) __attribute__((error(message)))
 #endif
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 2/25] compiler-gcc4.h: Reorder macros based upon gcc ver
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
  2012-09-25 23:29 ` [PATCH v5 1/25] compiler-gcc4.h: Correct verion check for __compiletime_error Daniel Santos
@ 2012-09-25 23:30 ` Daniel Santos
  2012-09-25 23:30 ` [PATCH v5 3/25] compiler-gcc.h: Add gcc-recommended GCC_VERSION macro Daniel Santos
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:30 UTC (permalink / raw)
  To: Daniel Santos, Andrew Morton, David Howells, David Daney,
	David Rientjes, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

This helps to keep the file from getting confusing, removes one
duplicate version check and should encourage future editors to put new
macros where they belong.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc4.h |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 10ce4fa..a334107 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -13,6 +13,10 @@
 #define __must_check 		__attribute__((warn_unused_result))
 #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
 
+#if __GNUC_MINOR__ > 0
+# define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
+#endif
+
 #if __GNUC_MINOR__ >= 3
 /* Mark functions as cold. gcc will assume any path leading to a call
    to them will be unlikely.  This means a lot of manual unlikely()s
@@ -31,6 +35,12 @@
 
 #define __linktime_error(message) __attribute__((__error__(message)))
 
+#ifndef __CHECKER__
+# define __compiletime_warning(message) __attribute__((warning(message)))
+# define __compiletime_error(message) __attribute__((error(message)))
+#endif /* __CHECKER__ */
+#endif /* __GNUC_MINOR__ >= 3 */
+
 #if __GNUC_MINOR__ >= 5
 /*
  * Mark a position in code as unreachable.  This can be used to
@@ -46,13 +56,5 @@
 /* Mark a function definition as prohibited from being cloned. */
 #define __noclone	__attribute__((__noclone__))
 
-#endif
-#endif
+#endif /* __GNUC_MINOR__ >= 5 */
 
-#if __GNUC_MINOR__ > 0
-#define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
-#endif
-#if __GNUC_MINOR__ >= 3 && !defined(__CHECKER__)
-#define __compiletime_warning(message) __attribute__((warning(message)))
-#define __compiletime_error(message) __attribute__((error(message)))
-#endif
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 3/25] compiler-gcc.h: Add gcc-recommended GCC_VERSION macro
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
  2012-09-25 23:29 ` [PATCH v5 1/25] compiler-gcc4.h: Correct verion check for __compiletime_error Daniel Santos
  2012-09-25 23:30 ` [PATCH v5 2/25] compiler-gcc4.h: Reorder macros based upon gcc ver Daniel Santos
@ 2012-09-25 23:30 ` Daniel Santos
  2012-09-25 23:30 ` [PATCH v5 4/25] compiler-gcc{3,4}.h: Use " Daniel Santos
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:30 UTC (permalink / raw)
  To: Steven Rostedt, Andrew Morton, Joe Perches, Daniel Santos,
	Pavel Pisa, Richard Weinberger, LKML, Michel Lespinasse,
	Andrea Arcangeli, Peter Zijlstra, Rik van Riel

Throughout compiler*.h, many version checks are made.  These can be
simplified by using the macro that gcc's documentation recommends.
However, my primary reason for adding this is that I need bug-check
macros that are enabled at certain gcc versions and it's cleaner to use
this macro than the tradition method:

if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ => 2)

If you add patch level, it gets this ugly:

if __GNUC__ > 4 || (__GNUC__ == 4 && (__GNUC_MINOR__ > 2 || \
   __GNUC_MINOR__ == 2 __GNUC_PATCHLEVEL__ >= 1))

As opposed to:

if GCC_VERSION >= 40201

While having separate headers for gcc 3 & 4 eliminates some of this
verbosity, they can still be cleaned up by this.

See also:
http://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc.h |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 6a6d7ae..24545cd 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -5,6 +5,9 @@
 /*
  * Common definitions for all gcc versions go here.
  */
+#define GCC_VERSION (__GNUC__ * 10000 \
+		   + __GNUC_MINOR__ * 100 \
+		   + __GNUC_PATCHLEVEL__)
 
 
 /* Optimization barrier */
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 4/25] compiler-gcc{3,4}.h: Use GCC_VERSION macro
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (2 preceding siblings ...)
  2012-09-25 23:30 ` [PATCH v5 3/25] compiler-gcc.h: Add gcc-recommended GCC_VERSION macro Daniel Santos
@ 2012-09-25 23:30 ` Daniel Santos
  2012-09-25 23:30 ` [PATCH v5 5/25] compiler{,-gcc4}.h: Remove duplicate macros Daniel Santos
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:30 UTC (permalink / raw)
  To: Daniel Santos, Andrew Morton, David Howells, David Daney,
	David Rientjes, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

Using GCC_VERSION reduces complexity, is easier to read and is GCC's
recommended mechanism for doing version checks. (Just don't ask me why
they didn't define it in the first place.)  This also makes it easy to
merge compiler-gcc{3,4}.h should somebody want to.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc3.h |    8 ++++----
 include/linux/compiler-gcc4.h |   12 ++++++------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/compiler-gcc3.h b/include/linux/compiler-gcc3.h
index 37d4124..7d89feb 100644
--- a/include/linux/compiler-gcc3.h
+++ b/include/linux/compiler-gcc3.h
@@ -2,22 +2,22 @@
 #error "Please don't include <linux/compiler-gcc3.h> directly, include <linux/compiler.h> instead."
 #endif
 
-#if __GNUC_MINOR__ < 2
+#if GCC_VERSION < 30200
 # error Sorry, your compiler is too old - please upgrade it.
 #endif
 
-#if __GNUC_MINOR__ >= 3
+#if GCC_VERSION >= 30300
 # define __used			__attribute__((__used__))
 #else
 # define __used			__attribute__((__unused__))
 #endif
 
-#if __GNUC_MINOR__ >= 4
+#if GCC_VERSION >= 30400
 #define __must_check		__attribute__((warn_unused_result))
 #endif
 
 #ifdef CONFIG_GCOV_KERNEL
-# if __GNUC_MINOR__ < 4
+# if GCC_VERSION < 30400
 #   error "GCOV profiling support for gcc versions below 3.4 not included"
 # endif /* __GNUC_MINOR__ */
 #endif /* CONFIG_GCOV_KERNEL */
diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index a334107..7ad60cd 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -4,7 +4,7 @@
 
 /* GCC 4.1.[01] miscompiles __weak */
 #ifdef __KERNEL__
-# if __GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ <= 1
+# if GCC_VERSION >= 40100 &&  GCC_VERSION <= 40101
 #  error Your version of gcc miscompiles the __weak directive
 # endif
 #endif
@@ -13,11 +13,11 @@
 #define __must_check 		__attribute__((warn_unused_result))
 #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
 
-#if __GNUC_MINOR__ > 0
+#if GCC_VERSION >= 40102
 # define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
 #endif
 
-#if __GNUC_MINOR__ >= 3
+#if GCC_VERSION >= 40300
 /* Mark functions as cold. gcc will assume any path leading to a call
    to them will be unlikely.  This means a lot of manual unlikely()s
    are unnecessary now for any paths leading to the usual suspects
@@ -39,9 +39,9 @@
 # define __compiletime_warning(message) __attribute__((warning(message)))
 # define __compiletime_error(message) __attribute__((error(message)))
 #endif /* __CHECKER__ */
-#endif /* __GNUC_MINOR__ >= 3 */
+#endif /* GCC_VERSION >= 40300 */
 
-#if __GNUC_MINOR__ >= 5
+#if GCC_VERSION >= 40500
 /*
  * Mark a position in code as unreachable.  This can be used to
  * suppress control flow warnings after asm blocks that transfer
@@ -56,5 +56,5 @@
 /* Mark a function definition as prohibited from being cloned. */
 #define __noclone	__attribute__((__noclone__))
 
-#endif /* __GNUC_MINOR__ >= 5 */
+#endif /* GCC_VERSION >= 40500 */
 
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 5/25] compiler{,-gcc4}.h: Remove duplicate macros
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (3 preceding siblings ...)
  2012-09-25 23:30 ` [PATCH v5 4/25] compiler-gcc{3,4}.h: Use " Daniel Santos
@ 2012-09-25 23:30 ` Daniel Santos
  2012-09-25 23:30 ` [PATCH v5 6/25] bug.h: Replace __linktime_error with __compiletime_error Daniel Santos
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:30 UTC (permalink / raw)
  To: Christopher Li, Daniel Santos, Andrew Morton, David Howells,
	David Daney, David Rientjes, linux-sparse, Pavel Pisa,
	Richard Weinberger, LKML, Michel Lespinasse, Andrea Arcangeli,
	Peter Zijlstra, Rik van Riel

__linktime_error() does the same thing as __compiletime_error() and is
only used in bug.h.  Since the macro defines a function attribute that
will cause a failure at compile-time (not link-time), it makes more
sense to keep __compiletime_error(), which is also neatly mated with
__compiletime_warning().

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc4.h |    2 --
 include/linux/compiler.h      |    3 ---
 2 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 7ad60cd..5755e23 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -33,8 +33,6 @@
    the kernel context */
 #define __cold			__attribute__((__cold__))
 
-#define __linktime_error(message) __attribute__((__error__(message)))
-
 #ifndef __CHECKER__
 # define __compiletime_warning(message) __attribute__((warning(message)))
 # define __compiletime_error(message) __attribute__((error(message)))
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 923d093..4d9f353 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -293,9 +293,6 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
 #ifndef __compiletime_error
 # define __compiletime_error(message)
 #endif
-#ifndef __linktime_error
-# define __linktime_error(message)
-#endif
 /*
  * Prevent the compiler from merging or refetching accesses.  The compiler
  * is also forbidden from reordering successive instances of ACCESS_ONCE(),
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 6/25] bug.h: Replace __linktime_error with __compiletime_error
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (4 preceding siblings ...)
  2012-09-25 23:30 ` [PATCH v5 5/25] compiler{,-gcc4}.h: Remove duplicate macros Daniel Santos
@ 2012-09-25 23:30 ` Daniel Santos
  2012-09-25 23:30 ` [PATCH v5 7/25] compiler{,-gcc4}.h: Introduce __flatten function attribute Daniel Santos
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:30 UTC (permalink / raw)
  To: Daniel Santos, Paul Gortmaker, Andrew Morton,
	Konstantin Khlebnikov, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/bug.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index aaac4bb..298a916 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -73,7 +73,7 @@ extern int __build_bug_on_failed;
 #define BUILD_BUG()						\
 	do {							\
 		extern void __build_bug_failed(void)		\
-			__linktime_error("BUILD_BUG failed");	\
+			__compiletime_error("BUILD_BUG failed");\
 		__build_bug_failed();				\
 	} while (0)
 
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 7/25] compiler{,-gcc4}.h: Introduce __flatten function attribute
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (5 preceding siblings ...)
  2012-09-25 23:30 ` [PATCH v5 6/25] bug.h: Replace __linktime_error with __compiletime_error Daniel Santos
@ 2012-09-25 23:30 ` Daniel Santos
  2012-09-25 23:30 ` [PATCH v5 8/25] bug.h: Make BUILD_BUG_ON generate compile-time error Daniel Santos
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:30 UTC (permalink / raw)
  To: Christopher Li, Daniel Santos, Andrew Morton, David Howells,
	David Daney, David Rientjes, linux-sparse, Pavel Pisa,
	Richard Weinberger, LKML, Michel Lespinasse, Andrea Arcangeli,
	Peter Zijlstra, Rik van Riel

For gcc 4.1 & later, expands to __attribute__((flatten)) which forces
the compiler to inline everything it can into the function.  This is
useful in combination with noinline when you want to control the depth
of inlining, or create a single function where inline expansions will
occur. (see
http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html#index-g_t_0040code_007bflatten_007d-function-attribute-2512)

Normally, it's best to leave this type of thing up to the compiler.
However, the generic rbtree code uses inline functions just to be able
to inject compile-time constant data that specifies how the caller wants
the function to behave (via struct rb_relationship).  This data can be
thought of as the template parameters of a C++ templatized function.
Since some of these functions, once expanded, become quite large, gcc
sometimes decides not to perform some important inlining, in one case,
even generating a few bytes more code by not doing so. (Note: I have not
eliminated the possibility that this was an optimization bug, but the
flatten attribute fixes it in either case.)

Combining __flatten and noinline insures that important optimizations
occur in these cases and that the inline expansion occurs in exactly one
place, thus not leading to unnecissary bloat. However, it also can
eliminate some opportunities for optimization should gcc otherwise
decide the function its self is a good candidate for inlining.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc4.h |    7 ++++++-
 include/linux/compiler.h      |    4 ++++
 2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 5755e23..38fb81d 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -15,7 +15,12 @@
 
 #if GCC_VERSION >= 40102
 # define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
-#endif
+
+/* flatten introduced in 4.1, but broken in 4.6.0 (gcc bug #48731)*/
+# if GCC_VERSION != 40600
+#  define __flatten __attribute__((flatten))
+# endif
+#endif /* GCC_VERSION >= 40102 */
 
 #if GCC_VERSION >= 40300
 /* Mark functions as cold. gcc will assume any path leading to a call
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 4d9f353..b26d606 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -244,6 +244,10 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
 #define __always_inline inline
 #endif
 
+#ifndef __flatten
+#define __flatten
+#endif
+
 #endif /* __KERNEL__ */
 
 /*
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 8/25] bug.h: Make BUILD_BUG_ON generate compile-time error
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (6 preceding siblings ...)
  2012-09-25 23:30 ` [PATCH v5 7/25] compiler{,-gcc4}.h: Introduce __flatten function attribute Daniel Santos
@ 2012-09-25 23:30 ` Daniel Santos
  2012-09-25 23:30 ` [PATCH v5 9/25] bug.h: Add BUILD_BUG_ON_NON_CONST macro Daniel Santos
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:30 UTC (permalink / raw)
  To: Daniel Santos, Paul Gortmaker, Andrew Morton,
	Konstantin Khlebnikov, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

Negative sized arrays wont create a compile-time error in some cases
starting with gcc 4.4 (e.g., inlined functions), but gcc 4.3 introduced
the error function attribute that will.  This patch modifies
BUILD_BUG_ON to behave like BUILD_BUG already does, using the error
function attribute so that you don't have to build the entire kernel to
discover that you have a problem, and then enjoy trying to track it down
from a link-time error.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/bug.h |   24 ++++++++++++++----------
 1 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index 298a916..c70b833 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -42,24 +42,28 @@ struct pt_regs;
  * @condition: the condition which the compiler should know is false.
  *
  * If you have some code which relies on certain constants being equal, or
- * other compile-time-evaluated condition, you should use BUILD_BUG_ON to
+ * some other compile-time-evaluated condition, you should use BUILD_BUG_ON to
  * detect if someone changes it.
  *
  * The implementation uses gcc's reluctance to create a negative array, but
  * gcc (as of 4.4) only emits that error for obvious cases (eg. not arguments
- * to inline functions).  So as a fallback we use the optimizer; if it can't
- * prove the condition is false, it will cause a link error on the undefined
- * "__build_bug_on_failed".  This error message can be harder to track down
- * though, hence the two different methods.
+ * to inline functions).  Luckily, in 4.3 they added the "error" function
+ * attribute just for this type of case.  Thus, we use a negative sized array
+ * (should always create an error pre-gcc-4.4) and then call an undefined
+ * function with the error attribute (should always creates an error 4.3+).  If
+ * for some reason, neither creates a compile-time error, we'll still have a
+ * link-time error, which is harder to track down.
  */
 #ifndef __OPTIMIZE__
 #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
 #else
-extern int __build_bug_on_failed;
-#define BUILD_BUG_ON(condition)					\
-	do {							\
-		((void)sizeof(char[1 - 2*!!(condition)]));	\
-		if (condition) __build_bug_on_failed = 1;	\
+#define BUILD_BUG_ON(condition)						\
+	do {								\
+		extern void __build_bug_on_failed(void)			\
+			__compiletime_error("BUILD_BUG_ON failed");	\
+		((void)sizeof(char[1 - 2*!!(condition)]));		\
+		if (condition)						\
+			__build_bug_on_failed();			\
 	} while(0)
 #endif
 
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 9/25] bug.h: Add BUILD_BUG_ON_NON_CONST macro
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (7 preceding siblings ...)
  2012-09-25 23:30 ` [PATCH v5 8/25] bug.h: Make BUILD_BUG_ON generate compile-time error Daniel Santos
@ 2012-09-25 23:30 ` Daniel Santos
  2012-09-25 23:31 ` [PATCH v5 10/25] bug.h: Add gcc 4.2+ versions of BUILD_BUG_ON_* macros Daniel Santos
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:30 UTC (permalink / raw)
  To: Daniel Santos, Paul Gortmaker, Andrew Morton,
	Konstantin Khlebnikov, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

A very common use of __builtin_constant_p is to make sure that a certain
value is a compile time constant and generate a build-time error if it
is not.  However, __builtin_constant_p is broken in a variety of ways in
various situations (on various versions of gcc) and never returns one in
an unoptimized build. This macro provide a mechanism to perform these
build-time checks, but not break unoptimized builds (or modules being
build with -O0), of which there probably aren't many people that care
anyway.

This patch documents all of the relevant quirks I could find in the
"Gory Details" section of the doc-comments.  For almost all cases,
BUILD_BUG_ON_NON_CONST() should never fail on a primitive, non-pointer
type variable declared const.  A subsequent patch provides a separate
macro for performing tests which are known to be broken in older
compilers (pretty much, using __builtin_constant_p on arrays, pointers &
structs as well as testing those values).

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/bug.h |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 48 insertions(+), 0 deletions(-)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index c70b833..e30f600 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -81,6 +81,54 @@ struct pt_regs;
 		__build_bug_failed();				\
 	} while (0)
 
+/**
+ * BUILD_BUG_ON_NON_CONST - break compile if expression cannot be determined
+ *                          to be a compile-time constant.
+ * @exp: value to test for compile-time constness
+ *
+ * __builtin_constant_p() is a work in progress and is broken in various ways
+ * on various versions of gcc and optimization levels. It can fail, even when
+ * gcc otherwise determines that the expression is compile-time constant when
+ * performing actual optimizations and thus, compile out the value anyway. Do
+ * not use this macro for struct members or dereferenced pointers and arrays,
+ * as these are broken in many versions of gcc -- use BUILD_BUG_ON_NON_CONST42
+ * or another gcc-version-checked macro instead.
+ *
+ * As long as you are passing a variable declared const (and not modified),
+ * this macro should never fail (except for floats).  For information on gcc's
+ * behavior in other cases, see below.
+ *
+ * Gory Details:
+ *
+ * Normal primitive variables
+ * - global non-static non-const values are never compile-time constants (but
+ *   you should already know that)
+ * - all const values (global/local, non/static) should never fail this test
+ *   (3.4+) with one exception (below)
+ * - floats (which we wont use anyway) are broken in various ways until 4.2
+ *   (-O1 broken until 4.4)
+ * - local static non-const broken until 4.2 (-O1 broken until 4.3)
+ * - local non-static non-const broken until 4.0
+ *
+ * Dereferencing pointers & arrays
+ * - all static const derefs broken until 4.4 (except arrays at -O2 or better,
+ *   which are fixed in 4.2)
+ * - global non-static const pointer derefs always fail (<=4.7)
+ * - local non-static const derefs broken until 4.3, except for array derefs
+ *   to a zero value, which works from 4.0+
+ * - local static non-const pointers always fail (<=4.7)
+ * - local static non-const arrays broken until 4.4
+ * - local non-static non-const arrays broken until 4.0 (unless zero deref,
+ *   works in 3.4+)
+
+ */
+#ifdef __OPTIMIZE__
+#define BUILD_BUG_ON_NON_CONST(exp) \
+	BUILD_BUG_ON(!__builtin_constant_p(exp))
+#else
+#define BUILD_BUG_ON_NON_CONST(exp)
+#endif
+
 #endif	/* __CHECKER__ */
 
 #ifdef CONFIG_GENERIC_BUG
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 10/25] bug.h: Add gcc 4.2+ versions of BUILD_BUG_ON_* macros
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (8 preceding siblings ...)
  2012-09-25 23:30 ` [PATCH v5 9/25] bug.h: Add BUILD_BUG_ON_NON_CONST macro Daniel Santos
@ 2012-09-25 23:31 ` Daniel Santos
  2012-09-25 23:31 ` [PATCH v5 11/25] rbtree.h: Generic Red-Black Trees Daniel Santos
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:31 UTC (permalink / raw)
  To: Daniel Santos, Paul Gortmaker, Andrew Morton,
	Konstantin Khlebnikov, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

BUILD_BUG_ON42(arg)
BUILD_BUG_ON_CONST42(arg)

Prior to gcc 4.2, the optimizer was unable to determine that many
constant values stored in structs were indeed compile-time constants and
optimize them out.  Sometimes, it will find an intergral value to be a
compile-time constant, but fail to perform a bit-wise AND at
compile-time.  These two macros provide a mechanism to perform these
build-time checks, but not break on older compilers where we already
know they can't be checked at compile time.

For specific details, consult the doc comments for BUILD_BUG_ON_CONST.
These macros are used in the generic rbtree code.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/bug.h |   36 ++++++++++++++++++++++++++++++++++++
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index e30f600..d14c23c 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -2,6 +2,7 @@
 #define _LINUX_BUG_H
 
 #include <asm/bug.h>
+#include <linux/compiler.h>
 
 enum bug_trap_type {
 	BUG_TRAP_TYPE_NONE = 0,
@@ -129,6 +130,41 @@ struct pt_regs;
 #define BUILD_BUG_ON_NON_CONST(exp)
 #endif
 
+
+#if GCC_VERSION >= 40200
+/**
+ * BUILD_BUG_ON_NON_CONST42 - break compile if expression cannot be determined
+ *                            to be a compile-time constant (disabled prior to
+ *                            gcc 4.2)
+ * @exp: value to test for compile-time constness
+ *
+ * Use this macro instead of BUILD_BUG_ON_NON_CONST when testing struct
+ * members or dereferenced arrays and pointers.  Note that the version checks
+ * for this macro are not perfect.  BUILD_BUG_ON_NON_CONST42 expands to nothing
+ * prior to gcc-4.2, after which it is the same as BUILD_BUG_ON_NON_CONST.
+ * However, there are still many checks that will break with this macro (see
+ * the Gory Details section of BUILD_BUG_ON_NON_CONST for more info).
+ *
+ * See also BUILD_BUG_ON_NON_CONST()
+ */
+# define BUILD_BUG_ON_NON_CONST42(exp) BUILD_BUG_ON_NON_CONST(exp)
+
+/**
+ * BUILD_BUG_ON42 - break compile if expression cannot be determined
+ *                   (disabled prior to gcc 4.2)
+ *
+ * This gcc-version check is necessary due to breakages in testing struct
+ * members prior to gcc 4.2.
+ *
+ * See also BUILD_BUG_ON()
+ */
+# define BUILD_BUG_ON42(arg) BUILD_BUG_ON(arg)
+#else
+# define BUILD_BUG_ON_NON_CONST42(exp)
+# define BUILD_BUG_ON42(arg)
+#endif /* GCC_VERSION >= 40200 */
+
+
 #endif	/* __CHECKER__ */
 
 #ifdef CONFIG_GENERIC_BUG
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 11/25] rbtree.h: Generic Red-Black Trees
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (9 preceding siblings ...)
  2012-09-25 23:31 ` [PATCH v5 10/25] bug.h: Add gcc 4.2+ versions of BUILD_BUG_ON_* macros Daniel Santos
@ 2012-09-25 23:31 ` Daniel Santos
  2012-09-25 23:31 ` [PATCH v5 12/25] rbtree.h: include kconfig.h Daniel Santos
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:31 UTC (permalink / raw)
  To: Daniel Santos, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

Add generic red-black tree code to rbtree.h.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/rbtree.h | 1155 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 1153 insertions(+), 2 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 033b507..f1b53d5 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -1,7 +1,8 @@
 /*
   Red Black Trees
   (C) 1999  Andrea Arcangeli <andrea@suse.de>
-  
+  (C) 2012  Daniel Santos <daniel.santos@pobox.com>
+
   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2 of the License, or
@@ -96,6 +97,7 @@ static inline struct page * rb_insert_page_cache(struct inode * inode,
 
 #include <linux/kernel.h>
 #include <linux/stddef.h>
+#include <linux/bug.h>
 
 struct rb_node
 {
@@ -148,6 +150,7 @@ extern void rb_insert_color(struct rb_node *, struct rb_root *);
 extern void rb_erase(struct rb_node *, struct rb_root *);
 
 typedef void (*rb_augment_f)(struct rb_node *node, void *data);
+typedef long (*rb_compare_f)(const void *a, const void *b);
 
 extern void rb_augment_insert(struct rb_node *node,
 			      rb_augment_f func, void *data);
@@ -162,7 +165,7 @@ extern struct rb_node *rb_first(const struct rb_root *);
 extern struct rb_node *rb_last(const struct rb_root *);
 
 /* Fast replacement of a single node without remove/rebalance/add/rebalance */
-extern void rb_replace_node(struct rb_node *victim, struct rb_node *new, 
+extern void rb_replace_node(struct rb_node *victim, struct rb_node *new,
 			    struct rb_root *root);
 
 static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
@@ -174,4 +177,1152 @@ static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
 	*rb_link = node;
 }
 
+
+#define __JUNK junk,
+#define _iff_empty(test_or_junk, t, f) __iff_empty(test_or_junk, t, f)
+#define __iff_empty(__ignored1, __ignored2, result, ...) result
+
+/**
+ * IFF_EMPTY() - Expands to the second argument when the first is empty, the
+ *               third if non-empty.
+ * @test:        An argument to test for emptiness.
+ * @t:           A value to expand to if test is empty.
+ * @f:           A value to expand to if test is non-empty.
+ *
+ * Caveats:
+ * IFF_EMPTY isn't perfect.  The test parameter must either be empty or a valid
+ * pre-processor token as well as result in a valid token when pasted to the
+ * end of a word.
+ *
+ * Valid Examples:
+ * IFF_EMPTY(a, b, c) = c
+ * IFF_EMPTY( , b, c) = b
+ * IFF_EMPTY( ,  , c) = (nothing)
+ *
+ * Invalid Examples:
+ * IFF_EMPTY(.,  b, c)
+ * IFF_EMPTY(+,  b, c)
+ */
+#define IFF_EMPTY(test, t, f) _iff_empty(__JUNK##test, t, f)
+
+/**
+ * IS_EMPTY() - test if a pre-processor argument is empty.
+ * @arg:        An argument (empty or non-empty)
+ *
+ * If empty, expands to 1, 0 otherwise.  See IFF_EMPTY() for caveats &
+ * limitations.
+ */
+#define IS_EMPTY(arg)	IFF_EMPTY(arg, 1, 0)
+
+/**
+ * OPT_OFFSETOF() - return the offsetof for the supplied expression, or zero
+ *                  if m is an empty argument.
+ * @type:           struct/union type
+ * @member:         (optional) struct member name
+ *
+ * Since any offsetof can return zero if the specified member is the first in
+ * the struct/union, you should also check if the argument is empty separately
+ * with IS_EMPTY(m).
+ */
+#define OPT_OFFSETOF(type, member) IFF_EMPTY(member, 0, offsetof(type, member))
+
+/**
+ * enum rb_flags - values for strct rb_relationship's flags
+ * @RB_HAS_LEFTMOST:	The container has a struct rb_node *leftmost member
+ * 			that will receive a pointer to the leftmost (smallest)
+ * 			object in the tree that is updated during inserts &
+ * 			deletions.
+ * @RB_HAS_RIGHTMOST:	Same as above (for right side of tree).
+ * @RB_HAS_COUNT:	The container has an unsigned long field that will
+ * 			receive updates of the object count in the tree.
+ * @RB_UNIQUE_KEYS:	The tree contains only unique values.
+ * @RB_INSERT_REPLACES:	When set, the rb_insert() will replace a value if it
+ * 			matches the supplied one (valid only when
+ * 			RB_UNIQUE_KEYS is set).
+ * @RB_IS_AUGMENTED:	is an augmented tree
+ * @RB_ALL_FLAGS:	(internal use)
+ */
+
+enum rb_flags {
+	RB_HAS_LEFTMOST		= 0x00000001,
+	RB_HAS_RIGHTMOST	= 0x00000002,
+	RB_HAS_COUNT		= 0x00000004,
+	RB_UNIQUE_KEYS		= 0x00000008,
+	RB_INSERT_REPLACES	= 0x00000010,
+	RB_IS_AUGMENTED		= 0x00000040,
+	RB_ALL_FLAGS		= RB_HAS_LEFTMOST | RB_HAS_RIGHTMOST
+			| RB_HAS_COUNT | RB_UNIQUE_KEYS
+			| RB_INSERT_REPLACES | RB_IS_AUGMENTED,
+};
+
+/**
+ * struct rb_relationship - Defines relationship between a container and the
+ *			    objects it contains.
+ * @root_offset:  Offset of container's struct rb_root member.
+ * @left_offset:  (Used only if RB_HAS_LEFTMOST is set) Offset of the
+ *		  container's struct rb_node *leftmost member for storing a
+ *		  pointer to the leftmost node in the tree, which is kept
+ *		  updated as inserts and deletions are made.
+ * @right_offset: Same as left_offset, except for right side of tree.
+ * @count_offset: Offset of container's unsigned long count member.
+ * @node_offset:  Offset of object's struct rb_node member.
+ * @key_offset:   Offset of object's key member.
+ * @flags:        See enum rb_flags.
+ * @compare:      Pointer to key rb_compare_f function to compare keys.
+ *		  Although it will be cast to and called as type long (*)(const
+ *		  void *a, const void *b), you should declare it as accepting
+ *		  pointers to your key members, or sanity checks will fail.
+ *		  Further, it is optimal if the function is declared inline.
+ * @augment:      Pointer to the rb_augment_f or zero if tree is not augmented.
+ *
+ * Instances of struct rb_relationship should be compile-time constants (or
+ * rather, the value of its members).
+ */
+struct rb_relationship {
+	ssize_t root_offset;
+	ssize_t left_offset;
+	ssize_t right_offset;
+	ssize_t count_offset;
+	ssize_t node_offset;
+	ssize_t key_offset;
+	int flags;
+	const rb_compare_f compare;
+	const rb_compare_f ins_compare;
+	const rb_augment_f augment;
+	unsigned key_size;
+};
+
+#define __RB_PTR(type, ptr, offset) ((type *)((char *)(ptr) + (offset)))
+
+/* public conversion functions for use with run-time values */
+static __always_inline
+struct rb_root *rb_to_root(const void *ptr,
+			   const struct rb_relationship *rel)
+{
+	return __RB_PTR(struct rb_root, ptr, rel->root_offset);
+}
+
+static __always_inline
+struct rb_node **rb_root_to_left(struct rb_root *root,
+				 const struct rb_relationship *rel)
+{
+	return __RB_PTR(struct rb_node *, root,
+			rel->left_offset - rel->root_offset);
+}
+
+static __always_inline
+struct rb_node **rb_root_to_right(struct rb_root *root,
+				  const struct rb_relationship *rel)
+{
+	return __RB_PTR(struct rb_node *, root,
+			rel->right_offset - rel->root_offset);
+}
+
+static __always_inline
+unsigned long *rb_root_to_count(struct rb_root *root,
+			       const struct rb_relationship *rel)
+{
+	return __RB_PTR(unsigned long, root,
+			rel->count_offset - rel->root_offset);
+}
+
+static __always_inline
+const void *rb_node_to_key(const struct rb_node *node,
+			   const struct rb_relationship *rel)
+{
+	return __RB_PTR(const void, node,
+			rel->key_offset - rel->node_offset);
+}
+
+static __always_inline
+void *rb_node_to_obj(const struct rb_node *node,
+		     const struct rb_relationship *rel)
+{
+	return __RB_PTR(void, node, -rel->node_offset);
+}
+
+static __always_inline
+struct rb_node *rb_to_node(const void *ptr, const struct rb_relationship *rel)
+{
+	return __RB_PTR(struct rb_node, ptr, rel->node_offset);
+}
+
+static __always_inline
+const void *rb_to_key(const void *ptr, const struct rb_relationship *rel)
+{
+	return __RB_PTR(const void, ptr, rel->key_offset);
+}
+
+
+/* checked conversion functions that will error on run-time values */
+static __always_inline
+struct rb_root *__rb_to_root(const void *ptr,
+			     const struct rb_relationship *rel)
+{
+	BUILD_BUG_ON_NON_CONST42(rel->root_offset);
+	return rb_to_root(ptr, rel);
+}
+
+static __always_inline
+struct rb_node **__rb_root_to_left(struct rb_root *root,
+				   const struct rb_relationship *rel)
+{
+	BUILD_BUG_ON42(!(rel->flags & RB_HAS_LEFTMOST));
+	BUILD_BUG_ON_NON_CONST42(rel->root_offset);
+	BUILD_BUG_ON_NON_CONST42(rel->left_offset);
+	return rb_root_to_left(root, rel);
+}
+
+static __always_inline
+struct rb_node **__rb_root_to_right(struct rb_root *root,
+				    const struct rb_relationship *rel)
+{
+	BUILD_BUG_ON42(!(rel->flags & RB_HAS_RIGHTMOST));
+	BUILD_BUG_ON_NON_CONST42(rel->root_offset);
+	BUILD_BUG_ON_NON_CONST42(rel->right_offset);
+	return rb_root_to_right(root, rel);
+}
+
+static __always_inline
+unsigned long *__rb_root_to_count(struct rb_root *root,
+				  const struct rb_relationship *rel)
+{
+	BUILD_BUG_ON42(!(rel->flags & RB_HAS_COUNT));
+	BUILD_BUG_ON_NON_CONST42(rel->root_offset);
+	BUILD_BUG_ON_NON_CONST42(rel->count_offset);
+	return rb_root_to_count(root, rel);
+}
+
+static __always_inline
+const void *__rb_node_to_key(const struct rb_node *node,
+			     const struct rb_relationship *rel)
+{
+	BUILD_BUG_ON_NON_CONST42(rel->node_offset);
+	BUILD_BUG_ON_NON_CONST42(rel->key_offset);
+	return rb_node_to_key(node, rel);
+}
+
+static __always_inline
+void *__rb_node_to_obj(const struct rb_node *node,
+		       const struct rb_relationship *rel)
+{
+	BUILD_BUG_ON_NON_CONST42(rel->node_offset);
+	return rb_node_to_obj(node, rel);
+}
+
+static __always_inline
+struct rb_node *__rb_to_node(const void *ptr, const struct rb_relationship *rel)
+{
+	BUILD_BUG_ON_NON_CONST42(rel->node_offset);
+	return rb_to_node(ptr, rel);
+}
+
+static __always_inline
+const void *__rb_to_key(const void *ptr, const struct rb_relationship *rel)
+{
+	BUILD_BUG_ON_NON_CONST42(rel->key_offset);
+	return rb_to_key(ptr, rel);
+}
+
+/**
+ * __rb_assert_good_rel() - Perform compile-time sanity checks on a struct
+ *			    rb_relationship.
+ * @rel:	Pointer to a const struct rb_relationship to check.
+ */
+static __always_inline
+void __rb_assert_good_rel(const struct rb_relationship *rel)
+{
+	BUILD_BUG_ON_NON_CONST42(rel->flags);
+	BUILD_BUG_ON42(rel->flags & ~RB_ALL_FLAGS);
+
+	BUILD_BUG_ON_NON_CONST42(rel->root_offset);
+	BUILD_BUG_ON_NON_CONST42(rel->node_offset);
+	BUILD_BUG_ON_NON_CONST42(rel->key_offset);
+
+	if (rel->flags & RB_HAS_LEFTMOST)
+		BUILD_BUG_ON_NON_CONST42(rel->count_offset);
+
+	if (rel->flags & RB_HAS_RIGHTMOST)
+		BUILD_BUG_ON_NON_CONST42(rel->right_offset);
+
+	if (rel->flags & RB_HAS_COUNT)
+		BUILD_BUG_ON_NON_CONST42(rel->left_offset);
+
+	/* Due to a bug in versions of gcc prior to 4.6, the following
+	 * expressions are always evalulated at run-time:
+	 *
+	 * (!(rel->flags & RB_UNIQUE_KEYS) && (rel->flags & RB_INSERT_REPLACES))
+	 *
+	 * The work-around for this bug is separate each bitwise AND test using
+	 * an if/else construct and evaluate only the last test with the
+	 * BUILD_BUG_ON macro.
+	 */
+
+	if (rel->flags & RB_INSERT_REPLACES)
+		BUILD_BUG_ON42(!(rel->flags & RB_UNIQUE_KEYS));
+}
+
+
+/**
+ * __rb_find() - Perform a (normal) search on a Red-Black Tree, starting at the
+ *		 specified node, traversing downward.
+ * @node:	Node (subtree) to start the search from.
+ * @key:	Pointer to a key to search for.
+ * @rel:	Pointer to the relationship definition constant.
+ */
+static __always_inline __flatten
+struct rb_node *__rb_find(
+		struct rb_node *node,
+		const void *key,
+		const struct rb_relationship *rel)
+{
+	__rb_assert_good_rel(rel);
+	while (node) {
+		long diff = rel->compare(key, __rb_node_to_key(node, rel));
+
+		if (diff > 0)
+			node = node->rb_right;
+		else if (diff < 0)
+			node = node->rb_left;
+		else
+			return node;
+	}
+
+	return 0;
+}
+
+/**
+ * rb_find() - Perform a (normal) search on a Red-Black Tree.
+ * @root:	Root of the tree.
+ * @key:	Pointer to a key to search for.
+ * @rel:	Pointer to the relationship definition constant.
+ */
+static __always_inline __flatten
+struct rb_node *rb_find(
+		struct rb_root *root,
+		const void *key,
+		const struct rb_relationship *rel)
+{
+	return __rb_find(root->rb_node, key, rel);
+}
+
+static __always_inline __flatten
+struct rb_node *__rb_find_first_last(
+		struct rb_node *node,
+		const void *key,
+		const struct rb_relationship *rel,
+		const int find_first)
+{
+	__rb_assert_good_rel(rel);
+
+	/* don't use this function on a tree with unique keys */
+	BUILD_BUG_ON42(rel->flags & RB_UNIQUE_KEYS);
+	BUILD_BUG_ON_NON_CONST(find_first);
+
+	while (node) {
+		long diff = rel->compare(key, __rb_node_to_key(node, rel));
+
+		if (diff > 0)
+			node = node->rb_right;
+		else if (diff < 0)
+			node = node->rb_left;
+		else {
+			if (find_first && node->rb_left)
+				node = node->rb_left;
+			else if (!find_first && node->rb_right)
+				node = node->rb_right;
+			else
+				return node;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * rb_find_first() - Search for first occurrence of key in a tree containing
+ * 		     non-unique keys.
+ * @root:	Root of the tree.
+ * @key:	Pointer to a key.
+ * @rel:	Pointer to the relationship definition constant.
+ *
+ * This function is intended for use with trees containing non-unique keys.
+ * When called for trees with unique keys, it maps to __rb_find (a normal
+ * search).
+ */
+static __always_inline __flatten
+struct rb_node *rb_find_first(
+		struct rb_root *root,
+		const void *key,
+		const struct rb_relationship *rel)
+{
+	return __rb_find_first_last(root->rb_node, key, rel, 1);
+}
+
+/**
+ * rb_find_last() - Search for last occurrence of key in a tree containing
+ * 		    non-unique keys.
+ * @root:	Root of the tree.
+ * @key:	Pointer to a key.
+ * @rel:	Pointer to the relationship definition constant.
+ *
+ * This function is intended for use with trees containing non-unique keys.
+ * When called for trees with unique keys, it maps to __rb_find (a normal
+ * search).
+ */
+static __always_inline __flatten
+struct rb_node *rb_find_last(
+		struct rb_root *root,
+		const void *key,
+		const struct rb_relationship *rel)
+{
+	return __rb_find_first_last(root->rb_node, key, rel, 0);
+}
+
+/**
+ * rb_find_next() - Locate the next node in a tree containing non-unique keys,
+ *		    whos key matches the supplied node.
+ * @node:	Node of the current object in the tree.
+ * @rel:	Pointer to the relationship definition constant.
+ *
+ * Generally for use after calling rb_find_first().  Only valid for use with a
+ * tree with non-unique keys.
+ */
+static __always_inline __flatten
+struct rb_node *rb_find_next(
+		const struct rb_node *node,
+		const struct rb_relationship *rel)
+{
+	const void *key      = __rb_node_to_key(node, rel);
+	struct rb_node *next = rb_next(node);
+
+	/* don't use this function on a tree with unique keys */
+	BUILD_BUG_ON42(rel->flags & RB_UNIQUE_KEYS);
+	return (next && !rel->compare(key, __rb_node_to_key(next, rel)))
+	       ? next : NULL;
+}
+
+/**
+ * rb_find_prev() - Locate the previous node in a tree containing non-unique
+ *		    keys, whos key matches the supplied node.
+ * @node:	Node of the current object in the tree.
+ * @rel:	Pointer to the relationship definition constant.
+ *
+ * Generally for use after calling rb_find_last().  Only valid for use with a
+ * tree with non-unique keys.
+ */
+static __always_inline __flatten
+struct rb_node *rb_find_prev(
+		const struct rb_node *node,
+		const struct rb_relationship *rel)
+{
+	const void *key      = __rb_node_to_key(node, rel);
+	struct rb_node *prev = rb_prev(node);
+
+	/* don't use this function on a tree with unique keys */
+	BUILD_BUG_ON42(rel->flags & RB_UNIQUE_KEYS);
+	return (prev && !rel->compare(key, __rb_node_to_key(prev, rel)))
+	       ? prev : NULL;
+}
+
+
+enum rb_find_subtree_match {
+	RB_MATCH_NONE		= 0,
+	RB_MATCH_IMMEDIATE	= 2,
+	RB_MATCH_LEFT		= -1,
+	RB_MATCH_RIGHT		= 1,
+};
+
+/**
+ * __rb_find_subtree() - Locate the subtree that contains the specified key (if
+ *			 it exists) traversing upwards.
+ * @root:	Root of the tree
+ * @start:	Node to start from
+ * @key:	Key to search for
+ * @matched:	Pointer for a result value to be returned to (see enum
+ * 		rb_find_subtree_match)
+ * @ret_link:	Pointer for a link pointer to be returned
+ * @ret_parent:	Pointer for a parent pointer to be returned
+ * @rel:	A constant relationship definition
+ * @doing_insert: Rather or not we're doing an insert.
+ *
+ * Travels up a tree, starting from the specified node, until it locates the
+ * node (representing the subtree) under which the object (specified by key)
+ * can be located, or until that object its self is located.
+ *
+ * This function is used by find_near and insert_near, but behaves differently
+ * for each case (and perhaps could have been implemented as two separate
+ * functions). Specifically, when doing_insert is non-zero, it will set values
+ * in the location provided by populate ret_link & ret_parent.  Unused
+ * functionality when doing_insert is zero should be "compiled-out" in an
+ * optimized build.
+ */
+static __always_inline __flatten
+struct rb_node *__rb_find_subtree(
+		struct rb_root *root,
+		struct rb_node *start,
+		const void *key,
+		int *matched,
+		struct rb_node ***ret_link, /* wow, triple indirection.
+					       Am I smart or just nuts? */
+		struct rb_node **ret_parent,
+		const struct rb_relationship *rel,
+		const int doing_insert)
+{
+	struct rb_node *prev = start;
+	struct rb_node *node = rb_parent(start);
+	long diff;
+
+	__rb_assert_good_rel(rel);
+	BUILD_BUG_ON_NON_CONST(doing_insert);
+	BUG_ON(doing_insert && (!root || !ret_link || !ret_parent));
+
+	/* already at top of tree, so return start value */
+	if (!node) {
+		*matched = RB_MATCH_NONE;
+		if (doing_insert) {
+			*ret_link = &root->rb_node;
+			*ret_parent = **ret_link;
+		}
+		return start;
+	}
+
+	/* The first compare is just to figure out which direction up the tree
+	 * we're traveling.  When compare returns a value with a different
+	 * sign, we'll have found our subtree, or an exact match if zero.
+	 */
+	diff = rel->compare(key, __rb_node_to_key(node, rel));
+
+	if (diff) {
+		int go_left = diff < 0;
+		while (1) {
+			prev = node;
+			node = rb_parent(prev);
+			if (!node)
+				/* Reached top of tree.  In this case. rather
+				 * than having the top down search start from
+				 * the root, we'll start on the prev sibling
+				 * since we've already tested the root node, we
+				 * know that we don't need to go back the way
+				 * we came.
+				 */
+				break;
+
+			diff = rel->compare(key, __rb_node_to_key(node, rel));
+			if (go_left ? diff > 0 : diff < 0)
+				/* found the diverging node, so the child on
+				 * the opposite side (of prev) is the subtree
+				 * that will contain the key
+				 */
+				break;
+			else if (!diff) {
+				/* exact match */
+				*matched = go_left
+					 ? RB_MATCH_LEFT
+					 : RB_MATCH_RIGHT;
+
+				goto find_parent_link;
+			}
+		}
+
+		*matched = RB_MATCH_NONE;
+		if (doing_insert) {
+			*ret_parent = prev;
+			*ret_link = go_left ? &prev->rb_left : &prev->rb_right;
+			return **ret_link;
+		} else {
+			return go_left ? prev->rb_left : prev->rb_right;
+		}
+	}
+
+	/* start node's parent was an exact match */
+	*matched = RB_MATCH_IMMEDIATE;
+
+find_parent_link:
+	if (doing_insert) {
+		struct rb_node *parent = rb_parent(node);
+
+		if (!parent) {
+			*ret_link = &root->rb_node;
+			*ret_parent = **ret_link;
+		} else if (parent->rb_left == node) {
+			*ret_link = &parent->rb_left;
+			*ret_parent = parent;
+		} else if (parent->rb_right == node) {
+			*ret_link = &parent->rb_left;
+			*ret_parent = parent;
+		} else {
+			BUG();
+		}
+	}
+
+	return node;
+}
+
+/**
+ * rb_find_near() - Perform a search starting at the specified node instead of
+ *		    the top of the tree.
+ * @from:	Node to start search from
+ * @key:	Key to search for
+ * @rel:	Pointer to the relationship definition constant.
+ *
+ * Travels up the tree starting from the specified node and then back down
+ * again, searching for the object specified by key.  This function is larger
+ * than a normal search, but can yield better performance if the target object
+ * is near the supplied node. Performance is roughly O(log2(distance / 2) * 2
+ * + 1).
+ */
+static __always_inline __flatten
+struct rb_node *rb_find_near(
+		struct rb_node *from,
+		const void *key,
+		const struct rb_relationship *rel)
+{
+	int matched;
+	struct rb_node *subtree;
+
+	subtree = __rb_find_subtree(NULL, from, key, &matched, NULL, NULL,
+				    rel, 0);
+
+	if (matched)
+		return subtree;
+
+	return __rb_find(subtree, key, rel);
+}
+
+/* common insert epilogue used by rb_insert() and rb_insert_near() */
+static __always_inline __flatten
+struct rb_node *__rb_insert_epilogue(
+		struct rb_root *root,
+		struct rb_node *parent,
+		struct rb_node *node,
+		struct rb_node *found,
+		struct rb_node **rb_link,
+		const struct rb_relationship *rel)
+{
+	if ((rel->flags & RB_UNIQUE_KEYS) && found) {
+		if (rel->flags & RB_INSERT_REPLACES) {
+			/* if we're replacing the entry, we don't increment
+			 * count, but we do still need to do augment
+			 */
+			rb_replace_node(found, node, root);
+			goto do_augment;
+		} else {
+			/* otherwise, we don't do either */
+			goto done;
+		}
+	} else {
+		rb_link_node(node, parent, rb_link);
+		rb_insert_color(node, root);
+	}
+
+	if ((rel->flags & RB_HAS_COUNT))
+		++*__rb_root_to_count(root, rel);
+
+do_augment:
+	if (rel->augment)
+		rb_augment_insert(node, rel->augment, NULL);
+
+done:
+	return found;
+}
+
+
+/**
+ * rb_insert() - Insert a node into a tree.
+ * @root:	Pointer to struct rb_root.
+ * @node:	Pointer to the node of the new object to insert.
+ * @rel:	Pointer to the relationship definition constant.
+ *
+ * If an object with the same key already exists and RB_INSERT_REPLACES is set
+ * then it is replaced with new object node; if RB_INSERT_REPLACES is not set,
+ * then no change is made. In either case, a pointer to the existing object
+ * node is returned.
+ *
+ * If no object with the same key exists, then the new object node is inserted
+ * and NULL is returned.
+ */
+static __always_inline __flatten
+struct rb_node *rb_insert(
+		struct rb_root *root,
+		struct rb_node *node,
+		const struct rb_relationship *rel)
+{
+	struct rb_node **p     = &root->rb_node;
+	struct rb_node *parent = NULL;
+	const void * const key = __rb_node_to_key(node, rel);
+	int leftmost           = 1;
+	int rightmost          = 1;
+
+	/* optimization/hack good on gcc 4.6.0+, when -findirect-inline is able
+	 * to inline the compare function.  This manages to force gcc to put
+	 * the value of the key in a register, instead of retrieving it prior
+	 * to each compare.  The necessity of this hasn't been tested beyond
+	 * gcc 4.7.1.
+	 */
+#if GCC_VERSION >= 40600
+	u16 __maybe_unused key16;
+	u32 __maybe_unused key32;
+	u64 __maybe_unused key64;
+
+	if (rel->key_size == 2)
+		key16 = *(u16*)key;
+	else if (rel->key_size == 4)
+		key32 = *(u32*)key;
+	else if (rel->key_size == 8)
+		key64 = *(u64*)key;
+#endif
+
+	__rb_assert_good_rel(rel);
+
+
+	while (*p) {
+		long diff;
+		const void *cur_key = __rb_node_to_key(*p, rel);
+
+#if GCC_VERSION >= 40600
+		if (rel->key_size == 2)
+			diff = rel->ins_compare(&key16, cur_key);
+		else if (rel->key_size == 4)
+			diff = rel->ins_compare(&key32, cur_key);
+		else if (rel->key_size == 8)
+			diff = rel->ins_compare(&key64, cur_key);
+		else
+#endif
+			/* On gcc 4.5.x & prior, or for other key sizes, we
+			 * pass key ptr as a const void*, which tends to
+			 * optimize more poorly
+			 */
+			diff = rel->ins_compare(key, cur_key);
+
+		parent = *p;
+
+		if (diff > 0) {
+			p = &(*p)->rb_right;
+			if (rel->flags & RB_HAS_LEFTMOST)
+				leftmost = 0;
+		} else if (!(rel->flags & RB_UNIQUE_KEYS) || diff < 0) {
+			p = &(*p)->rb_left;
+			if (rel->flags & RB_HAS_RIGHTMOST)
+				rightmost = 0;
+		} else
+			break;
+	}
+
+	if ((rel->flags & RB_HAS_LEFTMOST) && leftmost) {
+		struct rb_node **left = __rb_root_to_left(root, rel);
+
+		if (!(rel->flags & RB_INSERT_REPLACES) || !(*p) || *left == *p)
+			*left = node;
+	}
+	if ((rel->flags & RB_HAS_RIGHTMOST) && rightmost) {
+		struct rb_node **right = __rb_root_to_right(root, rel);
+
+		if (!(rel->flags & RB_INSERT_REPLACES) || !(*p) || *right == *p)
+			*right = node;
+	}
+
+	return __rb_insert_epilogue(root, parent, node, *p, p, rel);
+}
+
+/**
+ * rb_insert_near() - Perform an insert, but use the supplied start node to
+ *		      find the location for the new node.
+ * @root:	Pointer to struct rb_root.
+ * @start:	Node to start search for insert location from.
+ * @node:	Pointer to the node of the new object to insert.
+ * @rel:	Pointer to the relationship definition constant.
+ *
+ * This function is larger than rb_insert, but can yield better performance
+ * when the position where the new node is being is close to start. Performance
+ * is roughly O(log2(distance / 2) * 2 + 1).
+ */
+static __always_inline __flatten
+struct rb_node *rb_insert_near(
+		struct rb_root *root,
+		struct rb_node *start,
+		struct rb_node *node,
+		const struct rb_relationship *rel)
+{
+	const void *key = __rb_node_to_key(node, rel);
+	struct rb_node **p;
+	struct rb_node *parent;
+	struct rb_node *ret;
+	int matched;
+	long diff;
+
+	BUILD_BUG_ON_NON_CONST42(rel->flags);
+
+	ret = __rb_find_subtree(root, start, key, &matched, &p, &parent, rel,
+				1);
+
+	if (!matched) {
+		while (*p) {
+			diff   = rel->compare(__rb_node_to_key(*p, rel), key);
+			parent = *p;
+
+			if (diff > 0)
+				p = &(*p)->rb_right;
+			else if (!(rel->flags & RB_UNIQUE_KEYS) || diff < 0)
+				p = &(*p)->rb_left;
+			else
+				break;
+		}
+		ret = *p;
+	}
+
+	/* the longer way to see if we're left- or right-most (since we aren't
+	 * starting from the top, we can't use the mechanism rb_insert()
+	 * does.)
+	 */
+	if (rel->flags & RB_HAS_LEFTMOST) {
+		struct rb_node **left = __rb_root_to_left(root, rel);
+		if (!*left || *left == ret ||
+				(*left == parent && &parent->rb_left == p))
+			*left = node;
+	}
+
+	if (rel->flags & RB_HAS_RIGHTMOST) {
+		struct rb_node **right = __rb_root_to_right(root, rel);
+		if (!*right || *right == ret ||
+				(*right == parent && &parent->rb_right == p))
+			*right = node;
+	}
+
+	return __rb_insert_epilogue(root, parent, node, ret, p, rel);
+}
+
+/**
+ * rb_remove() - Remove a node from an rbtree.
+ * @root:	Pointer to struct rb_root.
+ * @node:	Pointer to the node of the object to be removed.
+ * @rel:	Pointer to the relationship definition constant.
+ */
+static __always_inline __flatten
+void rb_remove(
+	struct rb_root *root,
+	struct rb_node *node,
+	const struct rb_relationship *rel)
+{
+	struct rb_node *uninitialized_var(deepest);
+
+	BUILD_BUG_ON_NON_CONST42(rel->flags);
+
+	if (rel->augment)
+		deepest = rb_augment_erase_begin(node);
+
+	if (rel->flags & RB_HAS_LEFTMOST) {
+		struct rb_node **left = __rb_root_to_left(root, rel);
+
+		if (*left == node)
+			*left = rb_next(node);
+	}
+
+	if (rel->flags & RB_HAS_RIGHTMOST) {
+		struct rb_node **right = __rb_root_to_right(root, rel);
+
+		if (*right == node)
+			*right = rb_prev(node);
+	}
+
+	rb_erase(node, root);
+
+	if ((rel->flags & RB_HAS_COUNT))
+		--*__rb_root_to_count(root, rel);
+
+	if (rel->augment)
+		rb_augment_erase_end(deepest, rel->augment, NULL);
+}
+
+
+/**
+ * RB_RELATIONSHIP - Define the relationship between a container with a struct
+ *		    rb_root member, and the objects it contains.
+ * @cont_type: container type
+ * @root:      Container's struct rb_root member name
+ * @left:      (Optional) If the container needs a pointer to the tree's
+ *             leftmost (smallest) object, then specify the container's struct
+ *             rb_node *leftmost member.  Otherwise, leave this parameter
+ *             empty.
+ * @right:     (Optional) Same as left, but for the rightmost (largest)
+ * @count:     (Optional) Name of container's unsigned long member that will be
+ *             updated with the number of objects in the tree.  Note that if
+ *             you add or remove objects from the tree without using the
+ *             generic functions, you must update this value yourself.
+ * @obj_type:  Type of object stored in container
+ * @node:      The struct rb_node member of the object
+ * @key:       The key member of the object
+ * @_flags:    see enum rb_flags.  Note: you do not have to specify
+ *             RB_HAS_LEFTMOST, RB_HAS_RIGHTMOST, RB_HAS_COUNT or
+ *             RB_IS_AUGMENTED as these will be added automatically if their
+ *             respective field is non-empty.
+ * @_compare:  Pointer to key rb_compare_f function to compare keys.
+ *             Although it will be cast to and called as type long (*)(const
+ *             void *a, const void *b), you should declare it as accepting
+ *             pointers to your key members, or sanity checks will fail.
+ *             Further, it is optimal if the function is declared inline.
+ * @_ins_compare:
+ * @_augment:  (optional) pointer to the rb_augment_f or empty if tree is not
+ *             augmented.
+ *
+ * Example:
+ * struct my_container {
+ *     struct rb_root root;
+ *     unsigned int count;
+ *     struct rb_node *left;
+ * };
+ *
+ * struct my_object {
+ *     struct rb_node node;
+ *     int key;
+ * };
+ *
+ * static inline long compare_int(const int *a, const int *b)
+ * {
+ *     return *a - *b;
+ * }
+ *
+ * static inline long greater_int(const int *a, const int *b)
+ * {
+ *     return *a > *b;
+ * }
+ *
+ * static const struct rb_relationship my_rel = RB_RELATIONSHIP(
+ *     struct my_container, root, left, , count, // no rightmost
+ *     struct my_object, node, key,
+ *     0, compare_int, greater_int, );           // no augment
+ */
+#define RB_RELATIONSHIP(						\
+		cont_type, root, left, right, count,			\
+		obj_type, node, key,					\
+		_flags, _compare, _ins_compare, _augment) {		\
+	.root_offset	= offsetof(cont_type, root),			\
+	.left_offset	= OPT_OFFSETOF(cont_type, left),		\
+	.right_offset	= OPT_OFFSETOF(cont_type, right),		\
+	.count_offset	= OPT_OFFSETOF(cont_type, count),		\
+	.node_offset	= offsetof(obj_type, node),			\
+	.key_offset	= offsetof(obj_type, key),			\
+	.flags		= (_flags)					\
+			| IFF_EMPTY(left ,    0,  RB_HAS_LEFTMOST)	\
+			| IFF_EMPTY(right,    0,  RB_HAS_RIGHTMOST)	\
+			| IFF_EMPTY(count,    0,  RB_HAS_COUNT)		\
+			| IFF_EMPTY(_augment, 0,  RB_IS_AUGMENTED),	\
+	.compare	= (const rb_compare_f) (_compare),		\
+	.ins_compare	= (const rb_compare_f) (_ins_compare),		\
+	.augment	= IFF_EMPTY(_augment, 0, _augment),		\
+	.key_size	= sizeof(((obj_type *)0)->key)			\
+}
+
+/* compile-time type-validation functions used by __rb_sanity_check_##prefix */
+static inline void __rb_verify_root(struct rb_root *root) {}
+static inline void __rb_verify_left(struct rb_node * const *left) {}
+static inline void __rb_verify_right(struct rb_node * const *right) {}
+static inline void __rb_verify_count(const unsigned int *count) {}
+static inline void __rb_verify_node(struct rb_node *node) {}
+static inline void __rb_verify_compare_fn_ret(long *diff) {}
+static inline void __rb_verify_ins_compare_fn_ret(long *diff) {}
+
+/**
+ * RB_DEFINE_INTERFACE - Defines a complete interface for a relationship
+ *                       between container and object including a struct
+ *                       rb_relationship and an interface of type-safe wrapper
+ *                       functions.
+ * @prefix:    name for the relationship (see explanation below)
+ * @cont_type:   see RB_RELATIONSHIP
+ * @root:        see RB_RELATIONSHIP
+ * @left:        see RB_RELATIONSHIP
+ * @right:       see RB_RELATIONSHIP
+ * @count:       see RB_RELATIONSHIP
+ * @obj_type:    see RB_RELATIONSHIP
+ * @node:        see RB_RELATIONSHIP
+ * @key:         see RB_RELATIONSHIP
+ * @flags:       see RB_RELATIONSHIP
+ * @compare:     see RB_RELATIONSHIP
+ * @ing_compare: see RB_RELATIONSHIP
+ * @augment:     see RB_RELATIONSHIP
+ * @insert_mod:      (Optional) Function modifiers for insert function,
+ *                   defaults to "static __always_inline" if left empty.
+ * @insert_near_mod: (Optional) Same as above, for insert_near.
+ * @find_mod:        (Optional) Same as above, for find.
+ * @find_near_mod:   (Optional) Same as above, for find_near.
+ *
+ * This macro can be declared in the global scope of either a source or header
+ * file and will generate a static const struct rb_relationship variable named
+ * prefix##_rel as well as similarly named (i.e., prefix##_##func_name)
+ * type-safe wrapper functions for find, find_near, insert, insert_near and
+ * remove.  If these function names are not sufficient, you can use the
+ * __RB_DEFINE_INTERFACE macro to specify them explicitly.
+ *
+ * The compare function will be passed pointers to the key members of two
+ * objects.  If your compare function needs access to other members of your
+ * struct (e.g., compound keys, etc.) , you can use the rb_entry macro to
+ * access other members.  However, if you use this mechanism, your find
+ * function must always pass it's key parameter as a pointer to the key member
+ * of an object of type obj_type, since the compare function is used for both
+ * inserts and lookups (else, you'll be screwed).
+ *
+ * Example:
+ * struct my_container {
+ *     struct rb_root root;
+ *     unsigned int count;
+ *     struct rb_node *left;
+ * };
+ *
+ * struct my_object {
+ *     struct rb_node node;
+ *     int key;
+ * };
+ *
+ * static inline long compare_int(const int *a, const int *b)
+ * {
+ *     return (long)*a - (long)*b;
+ * }
+ *
+ * RB_DEFINE_INTERFACE(
+ *     my_tree,
+ *     struct my_container, root, left, , count, // no rightmost
+ *     struct my_object, node, key,
+ *     0, compare_int,
+ *     ,                     // defaults for find
+ *     static __flatten,     // let gcc decide rather or not to inline insert()
+ *     ,                     // defaults on find_near
+ *     static __flatten noinline) // don't let gcc inline insert_near()
+ */
+#define RB_DEFINE_INTERFACE(						\
+	prefix,								\
+	cont_type, root, left, right, count,				\
+	obj_type, node, key,						\
+	flags, compare, ins_compare, augment,				\
+	find_mod, insert_mod, find_near_mod, insert_near_mod)		\
+									\
+/* Compile-time sanity checks. You need not call this function for	\
+ * validation to occur.  We define __rb_sanity_check function first,	\
+ * so errors will (hopefully) get caught here and produce a more helpful\
+ * error message than a failure in the RB_RELATIONSHIP macro expansion.	\
+ */									\
+static inline __maybe_unused						\
+void __rb_sanity_check_ ## prefix(cont_type *cont, obj_type *obj)	\
+{									\
+	/* {,ins_}compare functions should take ptr to key member */	\
+	typeof((compare)(&obj->key, &obj->key)) _diff =			\
+			(compare)(&obj->key, &obj->key);		\
+	typeof((ins_compare)(&obj->key, &obj->key)) _ins_diff =		\
+			(ins_compare)(&obj->key, &obj->key);		\
+	__rb_verify_compare_fn_ret(&_diff);				\
+	__rb_verify_ins_compare_fn_ret(&_ins_diff);			\
+									\
+	/* validate types of container members */			\
+	__rb_verify_root(&cont->root);					\
+	__rb_verify_left (IFF_EMPTY(left  , 0, &cont->left));		\
+	__rb_verify_right(IFF_EMPTY(right , 0, &cont->right));		\
+	__rb_verify_count(IFF_EMPTY(count , 0, &cont->count));		\
+									\
+	/* validate types of object node */				\
+	__rb_verify_node(&obj->node);					\
+}									\
+									\
+static const struct rb_relationship prefix ## _rel =			\
+RB_RELATIONSHIP(							\
+	cont_type, root, left, right, count,				\
+	obj_type, node, key,						\
+	flags, compare, ins_compare, augment);				\
+									\
+IFF_EMPTY(find_mod, static __always_inline, find_mod)			\
+obj_type *prefix ## _find(cont_type *cont,				\
+			  const typeof(((obj_type *)0)->key) *_key)	\
+{									\
+	struct rb_node *ret = rb_find(					\
+			&cont->root, _key, &prefix ## _rel);		\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+IFF_EMPTY(insert_mod, static __always_inline, insert_mod)		\
+obj_type *prefix ## _insert(cont_type *cont, obj_type *obj)		\
+{									\
+	struct rb_node *ret = rb_insert(				\
+			&cont->root, &obj->node, &prefix ## _rel);	\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+IFF_EMPTY(find_near_mod, static __always_inline, find_near_mod)		\
+obj_type *prefix ## _find_near(obj_type *near,				\
+		    const typeof(((obj_type *)0)->key) *_key)		\
+{									\
+	struct rb_node *ret = rb_find_near(				\
+			&near->node, _key, &prefix ## _rel);		\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+IFF_EMPTY(insert_near_mod, static __always_inline, insert_near_mod)	\
+obj_type *prefix ## _insert_near(cont_type *cont, obj_type *near,	\
+				 obj_type *obj)				\
+{									\
+	struct rb_node *ret = rb_insert_near(				\
+			&cont->root, &near->node, &obj->node,		\
+			&prefix ## _rel);				\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+static __always_inline							\
+void prefix ## _remove(cont_type *cont, obj_type *obj)			\
+{									\
+	rb_remove(&cont->root, &obj->node, &prefix ## _rel);		\
+}									\
+									\
+IFF_EMPTY(find_mod, static __always_inline, find_mod) __maybe_unused	\
+obj_type *prefix ## _find_first(cont_type *cont,			\
+			  const typeof(((obj_type *)0)->key) *_key)	\
+{									\
+	struct rb_node *ret = rb_find_first(				\
+			&cont->root, _key, &prefix ## _rel);		\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+IFF_EMPTY(find_mod, static __always_inline, find_mod) __maybe_unused	\
+obj_type *prefix ## _find_last(cont_type *cont,				\
+			  const typeof(((obj_type *)0)->key) *_key)	\
+{									\
+	struct rb_node *ret = rb_find_last(				\
+			&cont->root, _key, &prefix ## _rel);		\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+static __always_inline							\
+obj_type *prefix ## _find_next(const obj_type *obj)			\
+{									\
+	struct rb_node *ret = rb_find_next(&obj->node, &prefix ## _rel);\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+static __always_inline							\
+obj_type *prefix ## _find_prev(const obj_type *obj)			\
+{									\
+	struct rb_node *ret = rb_find_prev(&obj->node, &prefix ## _rel);\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+static __always_inline obj_type *prefix ## _next(const obj_type *obj)	\
+{									\
+	struct rb_node *ret = rb_next(&obj->node);			\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+static __always_inline obj_type *prefix ## _prev(const obj_type *obj)	\
+{									\
+	struct rb_node *ret = rb_prev(&obj->node);			\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+static __always_inline obj_type *prefix ## _first(cont_type *cont)	\
+{									\
+	struct rb_node *ret = rb_first(&cont->root);			\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}									\
+									\
+static __always_inline obj_type *prefix ## _last(cont_type *cont)	\
+{									\
+	struct rb_node *ret = rb_last(&cont->root);			\
+	return ret ? rb_entry(ret, obj_type, node) : 0;			\
+}
+
 #endif	/* _LINUX_RBTREE_H */
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 12/25] rbtree.h: include kconfig.h
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (10 preceding siblings ...)
  2012-09-25 23:31 ` [PATCH v5 11/25] rbtree.h: Generic Red-Black Trees Daniel Santos
@ 2012-09-25 23:31 ` Daniel Santos
  2012-09-25 23:31 ` [PATCH v5 13/25] fair.c: Use generic rbtree impl in fair scheduler Daniel Santos
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:31 UTC (permalink / raw)
  To: Daniel Santos, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

We shouldn't depend upon kernel.h including this for us.  However, this
also fixes some issues with compiling in userland (coming later).

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/rbtree.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index f1b53d5..66a99fd 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -98,6 +98,7 @@ static inline struct page * rb_insert_page_cache(struct inode * inode,
 #include <linux/kernel.h>
 #include <linux/stddef.h>
 #include <linux/bug.h>
+#include <linux/kconfig.h>
 
 struct rb_node
 {
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 13/25] fair.c: Use generic rbtree impl in fair scheduler
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (11 preceding siblings ...)
  2012-09-25 23:31 ` [PATCH v5 12/25] rbtree.h: include kconfig.h Daniel Santos
@ 2012-09-25 23:31 ` Daniel Santos
  2012-09-25 23:31 ` [PATCH v5 15/25] kernel-doc: bugfix - multi-line macros Daniel Santos
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:31 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Pavel Pisa, Richard Weinberger,
	LKML, Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel
  Cc: Daniel Santos

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 kernel/sched/fair.c |   75 ++++++++++++++------------------------------------
 1 files changed, 21 insertions(+), 54 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c099cc6..8feb4ea 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -447,6 +447,12 @@ static inline int entity_before(struct sched_entity *a,
 	return (s64)(a->vruntime - b->vruntime) < 0;
 }
 
+static inline long greater_vruntime(u64 *a, u64 *b)
+{
+	s64 diff = (s64)(*a - *b);
+	return diff > 0;
+}
+
 static void update_min_vruntime(struct cfs_rq *cfs_rq)
 {
 	u64 vruntime = cfs_rq->min_vruntime;
@@ -472,56 +478,17 @@ static void update_min_vruntime(struct cfs_rq *cfs_rq)
 #endif
 }
 
-/*
- * Enqueue an entity into the rb-tree:
+/* NOTE: we're passing greater_vruntime for both compare & greater because we
+ * don't need to use the find function.
  */
-static void __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
-{
-	struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
-	struct rb_node *parent = NULL;
-	struct sched_entity *entry;
-	int leftmost = 1;
-
-	/*
-	 * Find the right place in the rbtree:
-	 */
-	while (*link) {
-		parent = *link;
-		entry = rb_entry(parent, struct sched_entity, run_node);
-		/*
-		 * We dont care about collisions. Nodes with
-		 * the same key stay together.
-		 */
-		if (entity_before(se, entry)) {
-			link = &parent->rb_left;
-		} else {
-			link = &parent->rb_right;
-			leftmost = 0;
-		}
-	}
-
-	/*
-	 * Maintain a cache of leftmost tree entries (it is frequently
-	 * used):
-	 */
-	if (leftmost)
-		cfs_rq->rb_leftmost = &se->run_node;
-
-	rb_link_node(&se->run_node, parent, link);
-	rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
-}
-
-static void __dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
-{
-	if (cfs_rq->rb_leftmost == &se->run_node) {
-		struct rb_node *next_node;
-
-		next_node = rb_next(&se->run_node);
-		cfs_rq->rb_leftmost = next_node;
-	}
-
-	rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
-}
+RB_DEFINE_INTERFACE(
+	fair_tree,
+	struct cfs_rq, tasks_timeline, rb_leftmost, /* no right or count */, ,
+	struct sched_entity, run_node, vruntime,
+	0, greater_vruntime, greater_vruntime, /* no augment */,
+	/* find unused */ ,
+	static __flatten, /* let gcc decide rather or not to inline insert */
+	/* find_near unused */, /* insert_near unused */)
 
 struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
 {
@@ -1108,7 +1075,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	update_stats_enqueue(cfs_rq, se);
 	check_spread(cfs_rq, se);
 	if (se != cfs_rq->curr)
-		__enqueue_entity(cfs_rq, se);
+		fair_tree_insert(cfs_rq, se);
 	se->on_rq = 1;
 
 	if (cfs_rq->nr_running == 1) {
@@ -1189,7 +1156,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	clear_buddies(cfs_rq, se);
 
 	if (se != cfs_rq->curr)
-		__dequeue_entity(cfs_rq, se);
+		fair_tree_remove(cfs_rq, se);
 	se->on_rq = 0;
 	update_cfs_load(cfs_rq, 0);
 	account_entity_dequeue(cfs_rq, se);
@@ -1260,7 +1227,7 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		 * runqueue.
 		 */
 		update_stats_wait_end(cfs_rq, se);
-		__dequeue_entity(cfs_rq, se);
+		fair_tree_remove(cfs_rq, se);
 	}
 
 	update_stats_curr_start(cfs_rq, se);
@@ -1339,7 +1306,7 @@ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
 	if (prev->on_rq) {
 		update_stats_wait_start(cfs_rq, prev);
 		/* Put 'current' back into the tree. */
-		__enqueue_entity(cfs_rq, prev);
+		fair_tree_insert(cfs_rq, prev);
 	}
 	cfs_rq->curr = NULL;
 }
@@ -3593,7 +3560,7 @@ void update_group_power(struct sched_domain *sd, int cpu)
 		/*
 		 * !SD_OVERLAP domains can assume that child groups
 		 * span the current group.
-		 */ 
+		 */
 
 		group = child->groups;
 		do {
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 15/25] kernel-doc: bugfix - multi-line macros
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (12 preceding siblings ...)
  2012-09-25 23:31 ` [PATCH v5 13/25] fair.c: Use generic rbtree impl in fair scheduler Daniel Santos
@ 2012-09-25 23:31 ` Daniel Santos
  2012-09-25 23:31 ` [PATCH v5 16/25] kernel-doc: bugfix - empty line in Example section Daniel Santos
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:31 UTC (permalink / raw)
  To: Randy Dunlap, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel
  Cc: Daniel Santos

Prior to this patch the following code breaks:

/**
 * multiline_example - this breaks kernel-doc
 */
 #define multiline_example( \
myparam)

Producing this error:

Error(somefile.h:983): cannot understand prototype: 'multiline_example( \ '

This patch fixes the issue by appending all lines ending in a blackslash
(optionally followed by whitespace), removing the backslash and any
whitespace after it prior to appending (just like the C pre-processor
would).

This fixes a break in kerel-doc introduced by the additions to rbtree.h.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 scripts/kernel-doc |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 9b0c0b8..55ab5e4 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -2045,6 +2045,9 @@ sub process_file($) {
 
     $section_counter = 0;
     while (<IN>) {
+	while (s/\\\s*$//) {
+	    $_ .= <IN>;
+	}
 	if ($state == 0) {
 	    if (/$doc_start/o) {
 		$state = 1;		# next line is always the function name
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 16/25] kernel-doc: bugfix - empty line in Example section
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (13 preceding siblings ...)
  2012-09-25 23:31 ` [PATCH v5 15/25] kernel-doc: bugfix - multi-line macros Daniel Santos
@ 2012-09-25 23:31 ` Daniel Santos
  2012-09-25 23:31 ` [PATCH v5 17/25] kernel-doc: Don't mangle whitespace " Daniel Santos
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:31 UTC (permalink / raw)
  To: Randy Dunlap, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel
  Cc: Daniel Santos

If you have a section named "Example" that contains an empty line,
attempting to generate htmldocs give you the error:

/path/Documentation/DocBook/kernel-api.xml:3455: parser error : Opening and ending tag mismatch: programlisting line 3449 and para
   </para><para>
          ^
/path/Documentation/DocBook/kernel-api.xml:3473: parser error : Opening and ending tag mismatch: para line 3467 and programlisting
</programlisting></informalexample>
                 ^
/path/Documentation/DocBook/kernel-api.xml:3678: parser error : Opening and ending tag mismatch: programlisting line 3672 and para
   </para><para>
          ^
/path/Documentation/DocBook/kernel-api.xml:3701: parser error : Opening and ending tag mismatch: para line 3690 and programlisting
</programlisting></informalexample>
                 ^
unable to parse
/path/Documentation/DocBook/kernel-api.xml

Essentially, the script attempts to close a <programlisting> with a
closing tag for a <para> block.  This patch corrects the problem by
simply not outputting anything extra when we're dumping pre-formatted
text, since the empty line will be rendered correctly anyway.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 scripts/kernel-doc |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 55ab5e4..69efb2f 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -230,6 +230,7 @@ my $dohighlight = "";
 
 my $verbose = 0;
 my $output_mode = "man";
+my $output_preformatted = 0;
 my $no_doc_sections = 0;
 my %highlights = %highlights_man;
 my $blankline = $blankline_man;
@@ -460,7 +461,9 @@ sub output_highlight {
 
     foreach $line (split "\n", $contents) {
 	if ($line eq ""){
-	    print $lineprefix, local_unescape($blankline);
+	    if (! $output_preformatted) {
+		print $lineprefix, local_unescape($blankline);
+	    }
 	} else {
 	    $line =~ s/\\\\\\/\&/g;
 	    if ($output_mode eq "man" && substr($line, 0, 1) eq ".") {
@@ -643,10 +646,12 @@ sub output_section_xml(%) {
 	print "<title>$section</title>\n";
 	if ($section =~ m/EXAMPLE/i) {
 	    print "<informalexample><programlisting>\n";
+	    $output_preformatted = 1;
 	} else {
 	    print "<para>\n";
 	}
 	output_highlight($args{'sections'}{$section});
+	$output_preformatted = 0;
 	if ($section =~ m/EXAMPLE/i) {
 	    print "</programlisting></informalexample>\n";
 	} else {
@@ -949,10 +954,12 @@ sub output_blockhead_xml(%) {
 	}
 	if ($section =~ m/EXAMPLE/i) {
 	    print "<example><para>\n";
+	    $output_preformatted = 1;
 	} else {
 	    print "<para>\n";
 	}
 	output_highlight($args{'sections'}{$section});
+	$output_preformatted = 0;
 	if ($section =~ m/EXAMPLE/i) {
 	    print "</para></example>\n";
 	} else {
@@ -1028,10 +1035,12 @@ sub output_function_gnome {
 	print "<simplesect>\n <title>$section</title>\n";
 	if ($section =~ m/EXAMPLE/i) {
 	    print "<example><programlisting>\n";
+	    $output_preformatted = 1;
 	} else {
 	}
 	print "<para>\n";
 	output_highlight($args{'sections'}{$section});
+	$output_preformatted = 0;
 	print "</para>\n";
 	if ($section =~ m/EXAMPLE/i) {
 	    print "</programlisting></example>\n";
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 17/25] kernel-doc: Don't mangle whitespace in Example section
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (14 preceding siblings ...)
  2012-09-25 23:31 ` [PATCH v5 16/25] kernel-doc: bugfix - empty line in Example section Daniel Santos
@ 2012-09-25 23:31 ` Daniel Santos
  2012-09-25 23:31 ` [PATCH v5 19/25] rbtree.h: add doc comments for struct rb_node Daniel Santos
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:31 UTC (permalink / raw)
  To: Randy Dunlap, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel
  Cc: Daniel Santos

A section with the name "Example" (case-insensitive) has a special
meaning to kernel-doc.  These sections are output using mono-type fonts.
However, leading whitespace is stripped, thus robbing a lot of meaning
from this, as indented code examples will be mangled.

This patch preserves the leading whitespace for "Example" sections.
More accurately, it preserves it for all sections, but removes it later
if the section isn't an "Example" section.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 scripts/kernel-doc |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 69efb2f..976e28c 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -281,9 +281,10 @@ my $doc_special = "\@\%\$\&";
 my $doc_start = '^/\*\*\s*$'; # Allow whitespace at end of comment start.
 my $doc_end = '\*/';
 my $doc_com = '\s*\*\s*';
+my $doc_com_body = '\s*\* ?';
 my $doc_decl = $doc_com . '(\w+)';
 my $doc_sect = $doc_com . '([' . $doc_special . ']?[\w\s]+):(.*)';
-my $doc_content = $doc_com . '(.*)';
+my $doc_content = $doc_com_body . '(.*)';
 my $doc_block = $doc_com . 'DOC:\s*(.*)?';
 
 my %constants;
@@ -460,6 +461,9 @@ sub output_highlight {
 #   print STDERR "contents af:$contents\n";
 
     foreach $line (split "\n", $contents) {
+	if (! $output_preformatted) {
+	    $line =~ s/^\s*//;
+	}
 	if ($line eq ""){
 	    if (! $output_preformatted) {
 		print $lineprefix, local_unescape($blankline);
@@ -2084,7 +2088,7 @@ sub process_file($) {
 		    $descr= $1;
 		    $descr =~ s/^\s*//;
 		    $descr =~ s/\s*$//;
-		    $descr =~ s/\s+/ /;
+		    $descr =~ s/\s+/ /g;
 		    $declaration_purpose = xml_escape($descr);
 		    $in_purpose = 1;
 		} else {
@@ -2176,6 +2180,7 @@ sub process_file($) {
 		    # Continued declaration purpose
 		    chomp($declaration_purpose);
 		    $declaration_purpose .= " " . xml_escape($1);
+		    $declaration_purpose =~ s/\s+/ /g;
 		} else {
 		    $contents .= $1 . "\n";
 		}
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 19/25] rbtree.h: add doc comments for struct rb_node
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (15 preceding siblings ...)
  2012-09-25 23:31 ` [PATCH v5 17/25] kernel-doc: Don't mangle whitespace " Daniel Santos
@ 2012-09-25 23:31 ` Daniel Santos
  2012-09-25 23:32 ` [PATCH v5 20/25] selftest: Add generic tree self-test common code Daniel Santos
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:31 UTC (permalink / raw)
  To: Daniel Santos, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/rbtree.h |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 3ef30b9..f1fbdea 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -100,6 +100,19 @@ static inline struct page * rb_insert_page_cache(struct inode * inode,
 #include <linux/bug.h>
 #include <linux/kconfig.h>
 
+/**
+ * struct rb_node
+ * @rb_parent_color: Contains the color in the lower 2 bits (although only bit
+ * 		     zero is currently used) and the address of the parent in
+ * 		     the rest (lower 2 bits of address should always be zero on
+ * 		     any arch supported).  If the node is initialized and not a
+ * 		     member of any tree, the parent point to its self.  If the
+ * 		     node belongs to a tree, but is the root element, the
+ * 		     parent will be NULL.  Otherwise, parent will always
+ * 		     point to the parent node in the tree.
+ * @rb_right:        Pointer to the right element.
+ * @rb_left:         Pointer to the left element.
+ */
 struct rb_node
 {
 	unsigned long  rb_parent_color;
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 20/25] selftest: Add generic tree self-test common code.
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (16 preceding siblings ...)
  2012-09-25 23:31 ` [PATCH v5 19/25] rbtree.h: add doc comments for struct rb_node Daniel Santos
@ 2012-09-25 23:32 ` Daniel Santos
  2012-09-25 23:32 ` [PATCH v5 21/25] selftest: Add userspace test program Daniel Santos
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:32 UTC (permalink / raw)
  To: Daniel Santos, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

Self-test code for both performance and correctness testing. The files
tools/testing/selftests/grbtree/common.{h,c} contain code for use in
both the user- and kernel-space test program/module and depends upon a
few functions being made available by said.

The purpose of these tests is to verify correctness across compilers and
document the performance difference between the generic and hand-coded
red-black tree implementations on various compilers, which is identified
as critical for determining feasibility of adding this this tree
implementation to the kernel, as older compilers optimize the generic
code more poorly than its hand-coded counterpart.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 tools/testing/selftests/grbtree/common.c |  957 ++++++++++++++++++++++++++++++
 tools/testing/selftests/grbtree/common.h |  252 ++++++++
 2 files changed, 1209 insertions(+), 0 deletions(-)
 create mode 100644 tools/testing/selftests/grbtree/common.c
 create mode 100644 tools/testing/selftests/grbtree/common.h

diff --git a/tools/testing/selftests/grbtree/common.c b/tools/testing/selftests/grbtree/common.c
new file mode 100644
index 0000000..9625d60
--- /dev/null
+++ b/tools/testing/selftests/grbtree/common.c
@@ -0,0 +1,957 @@
+/* common.c - generic red-black tree test functions for use in both kernel and
+ * user space.
+ * Copyright (C) 2012  Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include "common.h"
+
+#define _CONCAT2(a, b) a ## b
+#define _CONCAT(a, b) _CONCAT2(a, b)
+
+
+const char *grbtest_type_desc[GRBTEST_TYPE_COUNT] = {
+	"insertion performance",
+	"insertion & deletion performance",
+	"insertion validation"
+};
+
+#if GRBTEST_USE_AUGMENTED
+static void mytree_augment_fn(struct rb_node *node, void *data)
+{
+	// does nothing
+}
+#endif
+
+/* more efficient alternative to rb_init_node */
+static inline void grbtest_init_node(struct rb_node *node)
+{
+	node->rb_parent_color = (unsigned long)node;
+	node->rb_right = NULL;
+	node->rb_left = NULL;
+}
+
+#if GRBTEST_BUILD_GENERIC
+/****************************************************************************
+ * Generic Implementation
+ */
+
+#define __GRBTEST_FLAGS						\
+	((GRBTEST_UNIQUE_KEYS ? RB_UNIQUE_KEYS : 0) |		\
+	 (GRBTEST_INSERT_REPLACES ? RB_INSERT_REPLACES : 0))
+
+
+
+static inline long compare_s32(const s32 *a, const s32 *b) {return *a - *b;}
+static inline long greater_s32(const s32 *a, const s32 *b) {return *a > *b;}
+
+static inline long compare_u32(const u32 *a, const u32 *b) {return (long)*a - (long)*b;}
+static inline long greater_u32(const u32 *a, const u32 *b) {return *a > *b;}
+
+static inline long compare_s64(const s64 *a, const s64 *b) {return *a - *b;}
+static inline long greater_s64(const s64 *a, const s64 *b) {return *a > *b;}
+
+static inline long compare_u64(const u64 *a, const u64 *b) {return *a - *b;}
+static inline long greater_u64(const u64 *a, const u64 *b) {return *a > *b;}
+
+
+RB_DEFINE_INTERFACE(
+	mytree,
+	struct container, tree,
+#if GRBTEST_USE_LEFTMOST
+	leftmost
+#endif
+	,
+#if GRBTEST_USE_RIGHTMOST
+	rightmost
+#endif
+	,
+#if GRBTEST_USE_COUNT
+	count
+#endif
+	,
+	struct object, node, key,
+	__GRBTEST_FLAGS, _CONCAT(compare_, GRBTEST_KEY_TYPE),
+#if GRBTEST_UNIQUE_KEYS
+	_CONCAT(compare_, GRBTEST_KEY_TYPE),
+#else
+	_CONCAT(greater_, GRBTEST_KEY_TYPE),
+#endif
+#if GRBTEST_USE_AUGMENTED
+	mytree_augment_fn
+#endif
+	,
+	static __flatten inline,	/* find */
+	static __flatten inline,	/* insert */
+	static __flatten inline,	/* find_near */
+	static __flatten inline);	/* insert_near */
+
+
+#else /* GRBTEST_BUILD_GENERIC */
+
+/****************************************************************************
+ * Hand-coded Implementation
+ *
+ * This section implements the find, insert & remove functions as one would do
+ * so were they hand-coding it, except that we use pre-processor to include (or
+ * omit) the various features & rules.
+ *
+ * In order to account for compilers that may fail to optimize out a simple
+ * if(0) or if(1) construct, we'll make sure that such extra code is not
+ * generated by using these ugly pre-processor #ifs in this "hand-coded"
+ * section.  This makes the code prety ugly, but is percieved as necessary by
+ * the author for correctness. TODO: Is this overkill?
+ */
+
+static inline struct object *mytree_find(struct container *cont, GRBTEST_KEY_TYPE *key)
+{
+	struct rb_node *node = cont->tree.rb_node;
+
+	while (node) {
+		struct object *obj = container_of(node, struct object, node);
+		if (*key > obj->key) {
+			node = node->rb_right;
+		} else if (*key < obj->key) {
+			node = node->rb_left;
+		} else
+			return obj;
+	}
+
+	return 0;
+}
+
+static inline struct object *mytree_insert(struct container *cont,
+					   struct object *obj)
+{
+	struct rb_root *root   = &cont->tree;
+	struct rb_node **p     = &root->rb_node;
+	struct rb_node *parent = NULL;
+	GRBTEST_KEY_TYPE key   = obj->key;
+#if GRBTEST_UNIQUE_KEYS
+	struct rb_node *found;
+#endif
+#if GRBTEST_USE_LEFTMOST
+	int leftmost           = 1;
+#endif
+#if GRBTEST_USE_RIGHTMOST
+	int rightmost          = 1;
+#endif
+
+	while (*p) {
+		parent = *p;
+		GRBTEST_KEY_TYPE cur_key = container_of(*p, struct object, node)->key;
+
+		if (key < cur_key) {
+			p = &(*p)->rb_left;
+#if GRBTEST_USE_RIGHTMOST
+			rightmost = 0;
+#endif
+/* if not using/enforcing unique keys, the below test is superflorous */
+#if GRBTEST_UNIQUE_KEYS
+		} else if (key > cur_key) {
+#else
+		} else {
+#endif /* GRBTEST_UNIQUE_KEYS */
+			p = &(*p)->rb_right;
+#if GRBTEST_USE_LEFTMOST
+			leftmost = 0;
+#endif
+		}
+/* again, if not using unique keys, the if/else is completed above, and we
+ * never exit the loop until we find a leaf
+ */
+#if GRBTEST_UNIQUE_KEYS
+		  else
+			break;
+#endif
+	}
+
+
+#if GRBTEST_USE_LEFTMOST
+	if (leftmost)
+		cont->leftmost = *p;
+#endif
+
+#if GRBTEST_USE_RIGHTMOST
+	if (rightmost)
+		cont->rightmost = *p;
+#endif
+
+#if GRBTEST_UNIQUE_KEYS
+	found = *p;
+	if (found) {
+#if GRBTEST_INSERT_REPLACES
+		rb_replace_node(found, &obj->node, root);
+#endif
+
+#if GRBTEST_USE_AUGMENTED
+		/* If we're using an augment tree with unique keys, we'll
+		 * re-purpose 'found' for efficiency and make sure the
+		 * augmented function is called.
+		 */
+		found = (struct rb_node*)container_of(found, struct object, node);
+		goto do_augmented;
+#else
+		/* If we're not using an augmented tree, let's not give the
+		 * compiler a chance to generate extra code by storing the
+		 * return value and just return it directly here.
+		 */
+		return container_of(found, struct object, node);
+#endif /* GRBTEST_USE_AUGMENTED */
+	} else
+#endif /* GRBTEST_UNIQUE_KEYS */
+	{
+		rb_link_node(&obj->node, parent, p);
+		rb_insert_color(&obj->node, root);
+	}
+
+#if GRBTEST_USE_COUNT
+	++cont->count;
+#endif
+
+#if GRBTEST_USE_AUGMENTED
+#if GRBTEST_UNIQUE_KEYS
+/* We need the label only if augmented & unique keys, otherwise, we'll generate
+ * a compiler warning or error.
+ */
+do_augmented:
+#endif
+	rb_augment_insert(&obj->node, mytree_augment_fn, NULL);
+	return (struct object*)found;
+#else
+	return NULL;
+#endif /* GRBTEST_USE_AUGMENTED */
+}
+
+static inline void mytree_remove(struct container *cont, struct object *obj)
+{
+	struct rb_node *node = &obj->node;
+#if GRBTEST_USE_AUGMENTED
+	struct rb_node *deepest = rb_augment_erase_begin(node);
+#endif
+
+#if GRBTEST_USE_LEFTMOST
+	if (cont->leftmost == node)
+		cont->leftmost = rb_next(node);
+#endif
+
+#if GRBTEST_USE_RIGHTMOST
+	if (cont->rightmost == node)
+		cont->rightmost = rb_prev(node);
+#endif
+
+	rb_erase(node, &cont->tree);
+
+#if GRBTEST_USE_COUNT
+	--cont->count;
+#endif
+
+#if GRBTEST_USE_AUGMENTED
+	rb_augment_erase_end(deepest, mytree_augment_fn, NULL);
+#endif
+}
+
+
+#endif /* GRBTEST_BUILD_GENERIC */
+
+
+
+/****************************************************************************
+ * Test functions
+ */
+
+#define VALIDATE_PARAM(fmt, param, test)			\
+do {								\
+	if (unlikely(!(test))) {				\
+		print_err(#param " invalid: %" #fmt " (" #test ")\n", \
+				param);				\
+		return -EINVAL;					\
+	}							\
+} while(0)
+
+long grbtest_init(struct grbtest_config *config)
+{
+	long i;
+	void *rnd_state;
+
+	VALIDATE_PARAM(u, config->pool_count, config->pool_count);
+	VALIDATE_PARAM(u, config->object_count, config->object_count);
+
+	config->seed = config->in_seed;
+	rnd_state = rand_init(&config->seed);
+	if (!rnd_state) {
+		print_err("failed to init random number generator\n");
+		return -ENOMEM;
+	}
+
+	/* if already initialized, cleanup and start over */
+	if(objects.pools) {
+		grbtest_cleanup();
+	}
+
+	objects.pools        = malloc(sizeof(struct object *) *
+			       config->pool_count);
+	objects.pool_count   = config->pool_count;
+	objects.object_count = config->object_count;
+	objects.pool_size    = sizeof(struct object) * config->object_count;
+
+	if (0) {
+	print_err("allocating %u pools of %u objects (%lu bytes) each for a "
+		  "total of %lu bytes of memory.\n",
+		  config->pool_count, config->object_count,
+		  (unsigned long)objects.pool_size,
+		  (unsigned long)(config->pool_count * objects.pool_size));
+	}
+
+	for (i = 0; i != config->pool_count; ++i) {
+		objects.pools[i] = mem_alloc(objects.pool_size);
+		if (unlikely(!objects.pools[i])) {
+			print_err("out of memory, you probably did something "
+				  "stupid...\n");
+			objects.pool_count = i;
+			grbtest_cleanup();
+			return -ENOMEM;
+		}
+	}
+
+	/* initialize objects in all pools */
+	for (i = 0; i < config->pool_count; ++i) {
+		struct object *p = objects.pools[i];
+		struct object *end = &p[objects.object_count];
+
+		if (!i) {
+			/* pool zero gets random keys */
+			for (; p != end; ++p)
+				init_object(p, rand() & config->key_mask);
+		} else {
+			/* remaining pools copy keys from pool zero */
+			const struct object *p0 = objects.pools[0];
+			for (; p != end; ++p, ++p0)
+				init_object(p, p0->key);
+		}
+	}
+
+	return 0;
+}
+
+void grbtest_cleanup()
+{
+	BUG_ON(!objects.pools);
+
+	while (objects.pool_count--)
+		mem_free(objects.pools[objects.pool_count]);
+	mem_free(objects.pools);
+
+	memset(&objects, 0, sizeof(objects));
+	objects.pools	     = NULL;
+	objects.pool_count   = 0;
+	objects.object_count = 0;
+	objects.pool_size    = 0;
+}
+
+long reset_objects(unsigned int pool_count) {
+	unsigned i;
+
+	VALIDATE_PARAM(u, pool_count, pool_count <= objects.pool_count);
+
+	for (i = 0; i < pool_count; ++i) {
+		struct object *p = objects.pools[i];
+		struct object *end = &p[objects.object_count];
+
+		for (; p != end; ++p) {
+			grbtest_init_node(&p->node);
+		}
+	}
+
+	return 0;
+}
+
+long grbtest_run_test(
+		struct grbtest_config *config,
+		struct grbtest_result *result,
+		struct container *cont)
+{
+	grbtest_init_results(config, result);
+
+	switch (config->test) {
+		case GRBTEST_TYPE_INSERTION:
+			return grbtest_perftest(config, result, cont, 0);
+
+		case GRBTEST_TYPE_INSERTION_DELETION:
+			return grbtest_perftest(config, result, cont, 1);
+
+		case GRBTEST_TYPE_VALIDATE_INSERTIONS:
+			return grbtest_validate_insertion(config, result, cont);
+
+		default:
+			print_err("Invalid test specified");
+			return -1;
+	}
+}
+
+/* flatten needed for gcc 4.3.x, that otherwise fails to inline mytree_insert */
+__flatten
+long grbtest_perftest(
+		struct grbtest_config *config,
+		struct grbtest_result *result,
+		struct container *cont,
+		int do_deletes)
+{
+	u64 start_time, end_time;
+	struct object *p, *start, *evicted;
+	const struct object *end;
+	unsigned i;
+	unsigned pool = 0;
+
+	VALIDATE_PARAM(u, pool, pool < objects.pool_count);
+	VALIDATE_PARAM(u, config->object_count, config->object_count
+		<= objects.object_count);
+
+	start = objects.pools[pool];
+	end = &objects.pools[pool][config->object_count];
+
+	if (!do_deletes) {
+		/* profile insertions with cheap container purge */
+
+		start_time = getCurTicks();
+		for (i = 0; i < config->reps; ++i) {
+			init_container(cont);
+
+			for (p = start; p != end; ++p) {
+				if (mytree_insert(cont, p))
+					++result->evictions;
+			}
+		}
+
+		result->insertion_time += (u64)(getCurTicks() - start_time);
+		result->insertions += config->reps * config->object_count;
+	} else {
+		/* profile insertions & deletions */
+
+		for (i = 0; i < config->reps; ++i) {
+			init_container(cont);
+
+			/* populate tree */
+			start_time = getCurTicks();
+			for (p = start; p != end; ++p) {
+				if ((evicted = mytree_insert(cont, p))) {
+					evicted->node.rb_parent_color =
+						(unsigned long)&evicted->node;
+					++result->evictions;
+				}
+			}
+
+			end_time = getCurTicks();
+			result->insertion_time += (u64)(end_time - start_time);
+
+			/* remove items from tree */
+			start_time = end_time;
+			for (p = start; p != end; ++p) {
+				if (p->node.rb_parent_color !=
+						(unsigned long)&p->node) {
+					mytree_remove(cont, p);
+					++result->deletions;
+				}
+			}
+
+			end_time = getCurTicks();
+			result->deletion_time += (u64)(end_time - start_time);
+
+			/* tree should now be empty */
+			BUG_ON(cont->tree.rb_node != NULL);
+		}
+		result->insertions = config->reps * config->object_count;
+	}
+
+	return 0;
+}
+
+
+long grbtest_validate_insertion(
+		struct grbtest_config *config,
+		struct grbtest_result *result,
+		struct container *cont)
+{
+	struct object *p, *near;
+	struct object *start = objects.pools[0];
+	const struct object *end;
+	u64 start_time;
+	unsigned num_nodes;
+	unsigned pool = 0;
+
+	VALIDATE_PARAM(u, config->object_count, config->object_count
+		<= objects.object_count);
+	VALIDATE_PARAM(u, pool, pool < objects.pool_count);
+	/* currently ignored, pass zero for now */
+	VALIDATE_PARAM(u, config->reps, !config->reps);
+
+	start_time = getCurTicks();
+	for (num_nodes = 1; num_nodes <= config->object_count; ++num_nodes) {
+		end = &start[num_nodes];
+		init_container(cont);
+		reset_objects(1);
+
+		for (p = start; p != end; ++p) {
+			struct object *evicted = mytree_insert(cont, p);
+
+#if 0
+			if (evicted && (mytree_rel.flags & RB_UNIQUE_KEYS)
+				    && (mytree_rel.flags & RB_INSERT_REPLACES)) {
+#endif
+			if (evicted && GRBTEST_INSERT_REPLACES) {
+				++result->evictions;
+				grbtest_init_node(&evicted->node);
+			}
+		}
+
+		/* make sure we can find them */
+		for (p = start; p != end; ++p) {
+			struct object *found = mytree_find(cont, &p->key);
+
+			/* if the object is inserted, it should be there */
+			if (is_inserted(p)) {
+				BUG_ON(found != p);
+			/* otherwise, it was evicted and shouldn't be there */
+			} else {
+#if GRBTEST_BUILD_GENERIC
+				BUG_ON(!(mytree_rel.flags & RB_UNIQUE_KEYS));
+#else
+				BUG_ON(!GRBTEST_UNIQUE_KEYS);
+#endif
+				BUG_ON(found == p);
+			}
+		}
+
+		/* make sure we can find_near them */
+		for (near = start; near != end; ++near) {
+
+			/* we wont use a non-inserted object for a near search */
+			if (!is_inserted(near)) {
+#if GRBTEST_BUILD_GENERIC
+				BUG_ON(!(mytree_rel.flags & RB_UNIQUE_KEYS));
+#else
+				BUG_ON(!GRBTEST_UNIQUE_KEYS);
+#endif
+				continue;
+			}
+
+/* we're not doing find_near on hand-coded stuff */
+#if GRBTEST_BUILD_GENERIC
+			for (p = start; p != end; ++p) {
+				struct object *found =
+					mytree_find_near(near, &p->key);
+
+				if (is_inserted(p)) {
+					BUG_ON(found != p);
+				} else {
+					BUG_ON(!(mytree_rel.flags & RB_UNIQUE_KEYS));
+					BUG_ON(found == p);
+				}
+			}
+#endif /* GRBTEST_BUILD_GENERIC */
+		}
+
+		/* walk the tree and verify order using compare function */
+		for (near = start; near != end; ++near) {
+			// count objects
+			// count inserted objects using start -> end and make sure they match
+			// verify count field
+
+		}
+
+		/* now we'll delete them and make sure they are removed */
+		for (near = start; near != end; ++near) {
+
+		}
+
+	}
+	result->insertion_time += (u64)(getCurTicks() - start_time);
+
+	return 0;
+}
+
+
+/****************************************************************************
+ * Test result functions
+ */
+
+void grbtest_init_results(const struct grbtest_config *config,
+			  struct grbtest_result *result)
+{
+	memset(result, 0, sizeof(*result));
+	result->node_size   = sizeof(struct rb_node);
+	result->object_size = sizeof(struct object);
+	result->pool_size   = sizeof(struct object) * config->object_count;
+}
+
+void grbtest_print_result_header(const struct grbtest_config *config)
+{
+	const char *d = config->delimiter;
+	print_msg(
+		/* compile-time config */
+		"compiler%s"
+		"key_type%s"
+		"userland%s"
+		"use_generic%s"
+		"use_leftmost%s"
+		"use_rightmost%s"
+		"use_count%s"
+		"unique_keys%s"
+		"insert_replaces%s"
+		"use_augmented%s"
+		"debug%s"
+		"debug_validate%s"
+		"arch%s"
+		"arch_flags%s"
+		"processor%s"
+		"cc%s"
+		"cflags%s"
+
+		/* run-time config */
+		"test%s"
+		"in_seed%s"
+		"seed%s"
+		"key_mask%s"
+		"object_count%s"
+		"pool_count%s"
+		"reps%s"
+		/* fields human_readable, delimiter, text_enclosure
+		 * and print_header omited */
+
+		/* result */
+		"node_size%s"
+		"object_size%s"
+		"pool_size%s"
+		"insertions%s"
+		"insertion_time%s"
+		"evictions%s"
+		"deletions%s"
+		"deletion_time\n",
+		d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d,
+		d, d, d, d, d, d, d, d, d, d);
+}
+
+void grbtest_print_result_row(const struct grbtest_config *config,
+			      const struct grbtest_result *result)
+{
+	const char *d = config->delimiter;
+	const char *q = config->text_enclosure;
+
+	/* FIXME: Will formating u64 as "%llu" wont be correct on all
+	 * architectures?
+	 */
+	print_msg(/* compile-time config */
+		  "%s" GRBTEST_COMPILER "%s%s"
+		  "%s" strize(GRBTEST_KEY_TYPE) "%s%s"
+		  strize(GRBTEST_USERLAND) "%s"
+		  strize(GRBTEST_BUILD_GENERIC) "%s"
+		  strize(GRBTEST_USE_LEFTMOST) "%s"
+		  strize(GRBTEST_USE_RIGHTMOST) "%s"
+		  strize(GRBTEST_USE_COUNT) "%s"
+		  strize(GRBTEST_UNIQUE_KEYS) "%s"
+		  strize(GRBTEST_INSERT_REPLACES) "%s"
+		  strize(GRBTEST_USE_AUGMENTED) "%s"
+		  strize(config_enabled(CONFIG_DEBUG_RBTREE)) "%s"
+		  strize(config_enabled(CONFIG_DEBUG_RBTREE_VALIDATE)) "%s"
+		  "%s" strize(GRBTEST_ARCH) "%s%s"
+		  "%s" strize(GRBTEST_ARCH_FLAGS) "%s%s"
+		  "%s" strize(GRBTEST_PROCESSOR) "%s%s"
+		  "%s" strize(GRBTEST_CC) "%s%s"
+		  "%s" strize(GRBTEST_CFLAGS) "%s%s"
+
+		  /* run-time config */
+		  "%u%s"
+		  "%llu%s"
+		  "%llu%s"
+		  "%u%s"
+		  "%u%s"
+		  "%u%s"
+		  "%u%s"
+
+		  /* result */
+		  "%u%s"
+		  "%u%s"
+		  "%u%s"
+		  "%llu%s"
+		  "%llu%s"
+		  "%llu%s"
+		  "%llu%s"
+		  "%llu\n",
+
+		  q, q, d,
+		  q, q, d,
+		  d, d, d, d, d, d, d, d, d, d,
+		  q, q, d,
+		  q, q, d,
+		  q, q, d,
+		  q, q, d,
+		  q, q, d,
+		  config->test, d,
+		  (unsigned long long)config->in_seed, d,
+		  (unsigned long long)config->seed, d,
+		  config->key_mask, d,
+		  config->object_count, d,
+		  config->pool_count, d,
+		  config->reps, d,
+		  result->node_size, d,
+		  result->object_size, d,
+		  result->pool_size, d,
+		  (unsigned long long)result->insertions, d,
+		  (unsigned long long)result->insertion_time, d,
+		  (unsigned long long)result->evictions, d,
+		  (unsigned long long)result->deletions, d,
+		  (unsigned long long)result->deletion_time);
+}
+
+#if 0
+
+int retrieve_from_empty_container()
+{
+	return 0;
+}
+Requirements for Generic Red-Black Tree tests
+
+Correctness testing
+Input parameters
+max_nodes - maximum number of nodes to use for tests
+
+Create an array of max_nodes objects and initialize their keys to random values.
+Start from a tree with zero objects and perform lookups of non-existant objects (which should all return NULL).
+
+int num_nodes;
+int i;
+
+/* correctness tests */
+for (int num_nodes = 0; i < max_nodes; ++num_nodes) {
+	// clear tree
+	for (i = 0; i < num_nodes; ++ i) {
+		// insert nodes
+	}
+	// insert a node with an existing key and verify outcome
+	for (i = 0; i < num_nodes; ++i) {
+		// find object i and verify outcome
+		for (j = 0; j < num_nodes; ++j) {
+			// perform find_near starting at i, looking for j
+			// and verify outcome
+		}
+		// delete object i & verify
+		// re-insert object i & verify
+	}
+	for (i = 0; i < num_nodes; ++i) {
+		// delete node i
+		// search for node i and make sure we get nothing
+		// search for next node and make sure we get one
+	}
+	// verify that tree is empty
+
+	// insert node zero
+	for (i = 1; i < num_nodes; ++ i) {
+		// insert_near, using previous node as start
+	}
+	// verify tree
+
+	// repeat find tests above
+}
+
+/* performance tests */
+int reps;  /* number of times to repeat a test */
+for (int num_nodes = 0; i < max_nodes; ++num_nodes) {
+	for (rep = 0; rep < reps; ++rep) {
+		// clear tree
+		// get start time
+		for (i = 0; i < num_nodes; ++ i) {
+			// insert nodes
+		}
+		// get end time & add
+	}
+}
+// insertion test
+// insert near test (separate data for each size, for each node distance)
+// insert replace test
+// find test
+// find_near test (again, for each size, for each node distance)
+
+__attribute__((flatten))
+long run_test(unsigned int count) {
+	size_t buf_size = sizeof(struct object) * count;
+	struct container cont;
+	struct object *obj_pools[2];
+	struct object **tree_contents;
+	size_t pool_size;
+	int pool_in_use;
+	long i, j, k;
+	struct object *found;
+	struct rb_node *node;
+	long long start, end;
+
+	long long now = getCurTicks();
+
+	srand((now & 0xffffffff) ^ (now >> 32));
+
+	if (count < 1 || count > 0x1000000) { /* 16.8 million should be a reasonable limit */
+		return -1;
+	}
+
+	print_err("allocating two pools of %u objects each\n", count);
+
+	cont.tree      = RB_ROOT;
+	cont.count     = 0;
+	cont.leftmost  = 0;
+	cont.rightmost = 0;
+	pool_in_use      = 0;
+	obj_pools[0]     = (struct object *)malloc(buf_size);
+	obj_pools[1]     = (struct object *)malloc(buf_size);
+
+	fprintf(stderr, "initializing objects\n");
+
+	/* init a psudo-random using a real-random seed */
+//	get_random_bytes(&seed, sizeof(seed));
+//	prandom32_seed(&grbtest.rand, seed);
+
+	for (i = 0; i < 2; ++i) {
+		struct object *p = obj_pools[i];
+		struct object *end = &p[count];
+
+		for (; p != end; ++p) {
+			p->id = rand() & 0xfffff;
+			rb_init_node(&p->node);
+		}
+	}
+	pool_size = count;
+
+	for (j = 0; j < 2; ++j) {
+		struct object *pool = obj_pools[j];
+		start = getCurTicks();
+
+		for (i = count; i; ) {
+			struct object *new = &pool[--i];
+
+			mytree_insert(&cont, new);
+		}
+		pool_in_use ^= 1;
+		end = getCurTicks();
+		fprintf(stderr, "Inserted %u objects in %llu\n", count, end - start);
+	}
+
+	fprintf(stderr, "walking tree now...\n");
+	tree_contents = (struct object **)malloc(sizeof(void*) * cont.count);
+	start = getCurTicks();
+	for (i = 0, node = cont.leftmost; node; node = rb_next(node), ++i) {
+		tree_contents[i] = rb_entry(node, struct object, node);
+	}
+	end = getCurTicks();
+	fprintf(stderr, "Finished walking tree of %u in %llu\n", cont.count, end - start);
+
+	fprintf(stderr, "root = %p, count = %u\n", cont.tree.rb_node, cont.count);
+	//dumpNode(cont.tree.rb_node);
+
+	//dumpTree(&cont.tree, 16);
+	start = getCurTicks();
+#define NEAR_RANGE 8
+#if 0
+	for (i = 0; i < cont.count; ++i) {
+		for (j = 0; j < cont.count; ++j) {
+			found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			if (found != tree_contents[j]) {
+				fprintf(stderr, "find_near found %p near %p (expected %p)\n",
+					found, tree_contents[i], tree_contents[j]);
+				found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			}
+		}
+	}
+#else
+for (k = 0; k < 8; ++k) {
+	for (i = 0; i < cont.count; ++i) {
+		int max = i + NEAR_RANGE;
+		if (max > cont.count)
+			max = cont.count;
+		for (j = i > NEAR_RANGE ? i - NEAR_RANGE : 0;
+		     j < max; ++j) {
+			found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			if (found != tree_contents[j]) {
+				fprintf(stderr, "find_near found %p near %p (expected %p)\n",
+					found, tree_contents[i], tree_contents[j]);
+				found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			}
+		}
+	}
+}
+
+#endif
+	end = getCurTicks();
+	fprintf(stderr, "find_near duration = %llu\n", end - start);
+
+	start = getCurTicks();
+for (k = 0; k < 8; ++k) {
+	for (i = 0; i < cont.count; ++i) {
+		int max = i + NEAR_RANGE;
+		if (max > cont.count)
+			max = cont.count;
+		for (j = i > NEAR_RANGE ? i - NEAR_RANGE : 0;
+		     j < max; ++j) {
+			found = mytree_find(&cont, &tree_contents[j]->id);
+			if (found != tree_contents[j]) {
+				fprintf(stderr, "find_near found %p near %p (expected %p)\n",
+					found, tree_contents[i], tree_contents[j]);
+				found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			}
+		}
+	}
+}
+
+	end = getCurTicks();
+	fprintf(stderr, "find duration = %llu\n", end - start);
+
+	// cleanup
+
+	fprintf(stderr, "Forward iteration (%u objects)\n", cont.count);
+	for (node = cont.leftmost; node ; node = rb_next(node)) {
+		struct object *obj = (struct object *)__rb_node_to_obj(node, &mytree_rel);
+		//fprintf(stderr, "id = 0x%08x\n", obj->id);
+	}
+
+	fprintf(stderr, "Reverse iteration (%u objects)\n", cont.count);
+	for (node = cont.rightmost; node ; node = rb_prev(node)) {
+		struct object *obj = (struct object *)__rb_node_to_obj(node, &mytree_rel);
+		//fprintf(stderr, "id = 0x%08x\n", obj->id);
+	}
+
+
+	fprintf(stderr, "Starting cleanup, %u objects\n", cont.count);
+	while (cont.leftmost) {
+		struct object *obj = (struct object *)__rb_node_to_obj(cont.leftmost, &mytree_rel);
+		//fprintf(stderr, "Removing object at 0x%p id = 0x%04x\n", obj, obj->id);
+		mytree_remove(&cont, obj);
+		//--cont.count;
+	}
+
+	if(obj_pools[0]) {
+		free(obj_pools[0]);
+		free(obj_pools[1]);
+		obj_pools[0] = 0;
+		obj_pools[1] = 0;
+		free(tree_contents);
+	}
+	pool_in_use = 0;
+	cont.leftmost = 0;
+	cont.rightmost = 0;
+
+
+	fprintf(stderr, "Cleanup complete, %u objects left in container.\n", cont.count);
+
+	return 0;
+}
+#endif
diff --git a/tools/testing/selftests/grbtree/common.h b/tools/testing/selftests/grbtree/common.h
new file mode 100644
index 0000000..62cb5e2
--- /dev/null
+++ b/tools/testing/selftests/grbtree/common.h
@@ -0,0 +1,252 @@
+/* common.h - generic red-black tree test functions for use in both kernel and
+ * user space.
+ * Copyright (C) 2012  Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+/**
+ * The following preprocessor variables should be defined to either 1 or 0 when
+ * compiling common.{c,h}
+ *
+ * GRBTEST_USERLAND		If non-zero, compile for userspace, kernelspace
+ * 				otherwise.
+ * GRBTEST_BUILD_GENERIC	If non-zero, use generic implementation,
+ * 				otherwise, use hand-coded implementation.
+ * GRBTEST_KEY_TYPE
+ * GRBTEST_USE_LEFTMOST		Maintain a pointer to the leftmost (smallest)
+ * 				in all insert & delete operations.
+ * GRBTEST_USE_RIGHTMOST	Same as above, except for rightmost (greatest)
+ * 				value
+ * GRBTEST_USE_COUNT		Maintain a count of objects in tree
+ * GRBTEST_UNIQUE_KEYS		If non-zero, tree contains only unique keys.
+ * GRBTEST_INSERT_REPLACES	Valid only if GRBTEST_UNIQUE_KEYS is non-zero.
+ * 				If non-zero, insert function will replace any
+ * 				existing object with the same key.
+ * GRBTEST_USE_AUGMENTED	Simulate an augmented tree (partially
+ * 				implemented)
+ */
+#ifndef _GRBTESTCOMMON_H
+#define _GRBTESTCOMMON_H
+
+#include <linux/rbtree.h>
+
+#ifndef GRBTEST_USERLAND
+# error GRBTEST_USERLAND not defined
+#endif
+#ifndef GRBTEST_BUILD_GENERIC
+# error GRBTEST_BUILD_GENERIC not defined
+#endif
+#ifndef GRBTEST_USE_LEFTMOST
+# error GRBTEST_USE_LEFTMOST not defined
+#endif
+#ifndef GRBTEST_KEY_TYPE
+# error GRBTEST_KEY_TYPE not defined
+#endif
+#ifndef GRBTEST_USE_RIGHTMOST
+# error GRBTEST_USE_RIGHTMOST not defined
+#endif
+#ifndef GRBTEST_USE_COUNT
+# error GRBTEST_USE_COUNT not defined
+#endif
+#ifndef GRBTEST_UNIQUE_KEYS
+# error GRBTEST_UNIQUE_KEYS not defined
+#endif
+#ifndef GRBTEST_INSERT_REPLACES
+# error GRBTEST_INSERT_REPLACES not defined
+#endif
+#ifndef GRBTEST_USE_AUGMENTED
+# error GRBTEST_INSERT_REPLACES not defined
+#endif
+
+#define strize(arg) strize2(arg)
+#define strize2(arg) #arg
+
+#ifdef __GNUC__
+# define GRBTEST_COMPILER "gcc-" \
+		strize(__GNUC__) "." \
+		strize(__GNUC_MINOR__) "." \
+		strize(__GNUC_PATCHLEVEL__)
+#else
+/* TODO: Add support for other compilers here */
+# define GRBTEST_COMPILER "non-gcc"
+#endif
+
+
+/**
+ * grbtest_config - Returns a string describing the build configuration.
+ */
+static inline const char *grbtest_config(void)
+{
+	return
+		"key type        " strize(GRBTEST_KEY_TYPE) "\n"
+#if GRBTEST_BUILD_GENERIC
+		"type            generic\n"
+#else
+		"type            hand-coded\n"
+#endif
+		"use leftmost    " strize(GRBTEST_USE_LEFTMOST) "\n"
+		"use rightmost   " strize(GRBTEST_USE_RIGHTMOST) "\n"
+		"use count       " strize(GRBTEST_USE_COUNT) "\n"
+		"unique keys     " strize(GRBTEST_UNIQUE_KEYS) "\n"
+		"insert replaces " strize(GRBTEST_INSERT_REPLACES) "\n"
+		"augmented       " strize(GRBTEST_USE_AUGMENTED) "\n"
+		"DEBUG_RBTREE    "
+		strize(config_enabled(CONFIG_DEBUG_RBTREE)) "\n"
+		"DEBUG_RBTREE_VALIDATE "
+		strize(config_enabled(CONFIG_DEBUG_RBTREE_VALIDATE)) "\n"
+		"CFLAGS          " strize(GRBTEST_CFLAGS) "\n"
+		"CC              " strize(GRBTEST_CC) "\n";
+
+}
+
+/* Functions provided by {module,user}/facilities.c for common.c */
+
+extern void facilities_init(void);
+extern u64 getCurTicks(void);
+extern void *rand_init(u64 *seed);
+
+//typedef unsigned int uint;
+
+
+#if GRBTEST_USERLAND
+#   define print_msg(...)	printf(__VA_ARGS__)
+#   define print_err(...)	fprintf(stderr, __VA_ARGS__)
+
+static inline void *mem_alloc(size_t size)	{return malloc(size);}
+static inline void mem_free(void *ptr)		{return free(ptr);}
+static inline u32 rand_get(void *state) 	{return rand();}
+static inline void rand_free(void *state)	{}
+
+#else /* GRBTEST_USERLAND */
+
+#    define print_msg(...) printk("grbtest: " __VA_ARGS__)
+#    define print_err(...) printk(KERN_ALERT "grbtest: " __VA_ARGS__)
+
+static inline void *mem_alloc(size_t size) {return kzalloc(size, GFP_KERNEL);}
+static inline void mem_free(void *ptr)	   {return kfree(ptr);}
+static inline u32 rand_get(struct rnd_state *state) {return prandom32(state);}
+static inline void rand_free(struct rnd_state *state) {kzfree(state);}
+
+#endif
+
+/****************************************************************************
+ * Structures
+ */
+
+struct object {
+	struct rb_node node;
+	GRBTEST_KEY_TYPE key;
+};
+
+struct container {
+	struct rb_root	tree;
+	unsigned int	count;
+	struct rb_node	*leftmost;
+	struct rb_node	*rightmost;
+	unsigned int	pool_in_use;
+};
+
+struct object_pools {
+	struct object	**pools;
+	unsigned int	pool_count;	/**< number of pools */
+	unsigned int	object_count;	/**< num objects in each pool */
+	size_t		pool_size;	/**< size of each pool in bytes */
+};
+
+enum grbtest_type {
+	GRBTEST_TYPE_INSERTION,
+	GRBTEST_TYPE_INSERTION_DELETION,
+	GRBTEST_TYPE_VALIDATE_INSERTIONS,
+	GRBTEST_TYPE_COUNT
+};
+
+extern const char *grbtest_type_desc[GRBTEST_TYPE_COUNT];
+
+struct grbtest_config {
+	enum grbtest_type test;
+	u64		  in_seed;
+	u64		  seed;
+	u32		  key_mask;
+	unsigned	  object_count;
+	unsigned	  pool_count;
+	unsigned	  reps;
+	int		  human_readable;
+	char		 *delimiter;
+	char		 *text_enclosure;
+	int		  print_header;
+};
+
+struct grbtest_result {
+	unsigned	node_size;
+	unsigned	object_size;
+	unsigned	pool_size;
+	u64		insertions;
+	u64		insertion_time;
+	u64		evictions;
+	u64		deletions;
+	u64		deletion_time;
+};
+
+/****************************************************************************
+ * Global data
+ */
+
+extern struct object_pools objects;
+
+
+static void inline init_container(struct container *cont)
+{
+	cont->tree	= RB_ROOT;
+	cont->count	= 0;
+	cont->leftmost	= 0;
+	cont->rightmost	= 0;
+}
+
+static void inline init_object(struct object *obj, GRBTEST_KEY_TYPE key)
+{
+	rb_init_node(&obj->node);
+	obj->key = key;
+}
+
+static int inline is_inserted(struct object *obj)
+{
+	return rb_parent(&obj->node) != &obj->node;
+}
+
+/* exported by common.c */
+extern long grbtest_init(struct grbtest_config *config);
+extern void grbtest_cleanup(void);
+extern long grbtest_run_test(
+		struct grbtest_config *config,
+		struct grbtest_result *result,
+		struct container *cont);
+extern long grbtest_perftest(
+		struct grbtest_config *config,
+		struct grbtest_result *result,
+		struct container *cont,
+		int do_deletes);
+extern long grbtest_validate_insertion(
+		struct grbtest_config *config,
+		struct grbtest_result *result,
+		struct container *cont);
+
+extern void grbtest_init_results(const struct grbtest_config *config,
+				 struct grbtest_result *result);
+extern void grbtest_print_result_header(const struct grbtest_config *config);
+extern void grbtest_print_result_row(const struct grbtest_config *config,
+				     const struct grbtest_result *result);
+
+#endif /* _GRBTESTCOMMON_H */
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 21/25] selftest: Add userspace test program.
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (17 preceding siblings ...)
  2012-09-25 23:32 ` [PATCH v5 20/25] selftest: Add generic tree self-test common code Daniel Santos
@ 2012-09-25 23:32 ` Daniel Santos
  2012-09-25 23:32 ` [PATCH v5 22/25] selftest: Add script to compile & run " Daniel Santos
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:32 UTC (permalink / raw)
  To: Daniel Santos, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

Userspace test program using code from common.{c,h}.  Userspace
compliation is accomplished by ovreriding a few kernel headers in the
overrides directory.  This is an invasive hack (involving many internals
of kernel headers) that is expected to require fairly frequent
maintainence as kernel headers change.

The program grbtest can be run with -h option for help and outputs both
a human-readable format as well as a delimited text for more extensive
processing.

The exact behavior of the grbtest program depends upon the following
pre-processor variables, which the Makefile expects to be -Defined in a
CONFIG variable. Each variable should be assigned a value of 1 or 0, as
documented in common.h. If CONFIG is not set, a default is assigned in
the Makefile.

GRBTEST_BUILD_GENERIC
GRBTEST_USE_LEFTMOST
GRBTEST_USE_RIGHTMOST
GRBTEST_USE_COUNT
GRBTEST_UNIQUE_KEYS
GRBTEST_INSERT_REPLACES
GRBTEST_USE_AUGMENTED

Generation of the resultant executable is also dependent upon the below
.config values.

DEBUG_RBTREE
DEBUG_RBTREE_VALIDATE

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 tools/testing/selftests/grbtree/user/Makefile      |   66 +++
 tools/testing/selftests/grbtree/user/facilities.c  |   58 ++
 tools/testing/selftests/grbtree/user/main.c        |  614 ++++++++++++++++++++
 .../grbtree/user/overrides/linux/export.h          |   31 +
 .../grbtree/user/overrides/linux/kernel.h          |   80 +++
 5 files changed, 849 insertions(+), 0 deletions(-)
 create mode 100644 tools/testing/selftests/grbtree/user/Makefile
 create mode 100644 tools/testing/selftests/grbtree/user/facilities.c
 create mode 100644 tools/testing/selftests/grbtree/user/main.c
 create mode 100644 tools/testing/selftests/grbtree/user/overrides/linux/export.h
 create mode 100644 tools/testing/selftests/grbtree/user/overrides/linux/kernel.h

diff --git a/tools/testing/selftests/grbtree/user/Makefile b/tools/testing/selftests/grbtree/user/Makefile
new file mode 100644
index 0000000..e487a85
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/Makefile
@@ -0,0 +1,66 @@
+# Default configuration (used if CONFIG not supplied)
+# See common.h for docs
+CONFIG ?= -DGRBTEST_KEY_TYPE=u32	\
+	  -DGRBTEST_BUILD_GENERIC=0	\
+	  -DGRBTEST_USE_LEFTMOST=1	\
+	  -DGRBTEST_USE_RIGHTMOST=1	\
+	  -DGRBTEST_USE_COUNT=1		\
+	  -DGRBTEST_UNIQUE_KEYS=1	\
+	  -DGRBTEST_INSERT_REPLACES=1	\
+	  -DGRBTEST_USE_AUGMENTED=0
+
+ifeq ($(KERNELRELEASE),)
+    # Assume the source tree is where the running kernel was built
+    # You should set KERNELDIR in the environment if it's elsewhere
+    KERNELDIR ?= /lib/modules/$(shell uname -r)/build
+endif
+
+PWD := $(shell pwd)
+
+# The below KERNEL_ARCH works on x86_64, but hasn't been tested elsewhere
+# (e.g., x86 32-bit, ARM, etc.)
+KERNEL_ARCH	 = $(shell uname -m | sed 's/x86_64/x86/')
+
+# Kernel include directories
+KERNEL_INCLUDES	 = -I$(KERNELDIR)/include \
+		   -I$(KERNELDIR)/arch/$(KERNEL_ARCH)/include
+CPPFLAGS	+= -DGRBTEST_USERLAND=1 -D__KERNEL__ -I$(PWD)/.. \
+		   -I$(PWD)/overrides $(KERNEL_INCLUDES) $(CONFIG)
+WARN_FLAGS	 = -Wall -Wundef -Wstrict-prototypes -Wno-unused-variable \
+		   -Werror-implicit-function-declaration -Wno-trigraphs \
+		   -Wno-format-security -Wno-unused-variable -Werror
+
+# Standard CFLAGS if not already set
+CFLAGS		?= -O2 -pipe
+# FIXME: breaks glibc
+#CFLAGS		+= -mno-see
+# TODO: Can get tese from KBUILD_CFLAGS in arch/<arch>/Makefile?
+#CFLAGS		+= -march=native -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
+CFLAGS		+= $(WARN_FLAGS) -fno-strict-aliasing -fno-common \
+		   -fno-delete-null-pointer-checks
+CC		?= gcc
+CPPFLAGS	+= -DGRBTEST_CFLAGS="$(CFLAGS)" -DGRBTEST_CONFIG="$(CONFIG)" \
+		   -DGRBTEST_CC="$(CC)" \
+		   -DGRBTEST_ARCH="$(KERNEL_ARCH)" \
+		   -DGRBTEST_ARCH_FLAGS="-march=k8 -m64" \
+		   -DGRBTEST_PROCESSOR="$(shell uname -p)" \
+
+all: grbtest
+
+OBJ_FILES = main.o rbtree.o common.o facilities.o
+HEADER_FILES = $(KERNELDIR)/include/linux/rbtree.h ../common.h
+
+rbtree.c:
+	ln -s $(KERNELDIR)/lib/rbtree.c $(PWD)/rbtree.c
+
+common.c:
+	ln -s ../common.c common.c
+
+rbtree.o: rbtree.c $(KERNELDIR)/include/linux/rbtree.h
+
+grbtest: $(OBJ_FILES) $(HEADER_FILES)
+	$(CC) $(CFLAGS) $(OBJ_FILES) -o grbtest
+
+clean:
+	rm -f grbtest *.o rbtree.c common.c
+
diff --git a/tools/testing/selftests/grbtree/user/facilities.c b/tools/testing/selftests/grbtree/user/facilities.c
new file mode 100644
index 0000000..63c29aa
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/facilities.c
@@ -0,0 +1,58 @@
+/* facilities.c - userspace facilities used by common.c/h
+ * Copyright (C) 2012  Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <sys/time.h>
+#include "common.h"
+
+long run_test(unsigned int count);
+
+static struct timezone tz;
+
+void facilities_init(void)
+{
+	tz.tz_minuteswest = 0;
+	tz.tz_dsttime = 0;
+
+	memset(&objects, 0, sizeof(objects));
+}
+
+u64 getCurTicks(void) {
+	struct timeval now;
+	gettimeofday(&now, &tz);
+	return 1000000LL * now.tv_sec + now.tv_usec;
+}
+
+/* We aren't actually allocating anything here, unlike the kernelspace version
+ * but a null pointer means error, so we'll return one instead.
+ */
+void *rand_init(u64 *seed)
+{
+	BUG_ON(!seed);
+
+	if (!*seed)
+		*seed = getCurTicks();
+
+	/* reduce to 32 bits */
+	*seed = (*seed & 0xffffffff) ^ (*seed >> 32);
+	srand(*seed & 0xffffffff);
+
+	return (void*)1;
+}
diff --git a/tools/testing/selftests/grbtree/user/main.c b/tools/testing/selftests/grbtree/user/main.c
new file mode 100644
index 0000000..212c1ae
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/main.c
@@ -0,0 +1,614 @@
+/* main.c - userspace generic red-black tree test program.
+ * Copyright (C) 2012  Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <sys/time.h>
+#include <getopt.h>
+#include <libgen.h>
+
+#include "common.h"
+
+long run_test(unsigned int count);
+
+struct timezone tz;
+struct object_pools objects;
+
+#define BAIL_ON_ERR(var, expr)			\
+do {						\
+	var = (expr);				\
+	if(IS_ERR_VALUE(var)) return var;	\
+} while(0)
+
+int process_args(struct grbtest_config *config, int argc, char *argv[]);
+void show_usage(void);
+void init_basename(void);
+void print_human_summary(const struct grbtest_config *config);
+void print_human_result(const struct grbtest_result *result);
+
+const char *argv0  = NULL;
+
+int main(int argc, char *argv[]) {
+	long		      ret;
+	struct grbtest_config config;
+	struct grbtest_result result;
+	struct container      cont;
+	char		     *argv0_copy = strdup(argv[0]);
+	argv0				 = basename(argv0_copy);
+
+	if (process_args(&config, argc, argv)) {
+		fprintf(stderr, "\n");
+		show_usage();
+		free(argv0_copy);
+		return -1;
+	}
+
+	facilities_init();
+	init_container(&cont);
+	grbtest_init_results(&config, &result);
+
+	BAIL_ON_ERR(ret, grbtest_init(&config));
+
+	if (config.human_readable)
+		print_human_summary(&config);
+	else if (config.print_header)
+		grbtest_print_result_header(&config);
+
+	grbtest_run_test(&config, &result, &cont);
+
+	if (config.human_readable)
+		print_human_result(&result);
+	else
+		grbtest_print_result_row(&config, &result);
+
+	grbtest_cleanup();
+	free(config.delimiter);
+	free(config.text_enclosure);
+	free(argv0_copy);
+
+	return 0;
+}
+
+void print_human_summary(const struct grbtest_config *config)
+{
+	unsigned long long in_seed = (unsigned long long)config->in_seed;
+	unsigned long long seed = (unsigned long long)config->seed;
+
+	print_msg("Build Configuration\n%s\n", grbtest_config());
+
+	print_msg(
+		"Execution Parameters\n"
+		"test            %u (%s)\n"
+		"in_seed         %llu (0x%llx)\n"
+		"seed            %llu (0x%llx)\n"
+		"key_mask        %u (0x%x)\n"
+		"count           %u (0x%x)\n"
+		"pool_count      %u\n"
+		"reps            %u (0x%x)\n"
+		"human_readable  %u\n"
+		"delimiter       %s\n"
+		"text_enclosure  %s\n"
+		"print_header    %u\n"
+		"\n",
+		config->test, grbtest_type_desc[config->test],
+		in_seed, in_seed,
+		seed, seed,
+		config->key_mask, config->key_mask,
+		config->object_count, config->object_count,
+		config->pool_count,
+		config->reps, config->reps,
+		config->human_readable,
+		config->delimiter,
+		config->text_enclosure,
+		config->print_header);
+
+	print_msg("Running test...");
+}
+
+void print_human_result(const struct grbtest_result *result)
+{
+	print_msg("completed!\n\n");
+	print_msg(
+		"Test Results\n"
+		"insertions       %llu\n"
+		"insertion_time   %llu\n"
+		"evictions        %llu\n"
+		"deletions        %llu\n"
+		"deletion_time    %llu\n",
+		(unsigned long long)result->insertions,
+		(unsigned long long)result->insertion_time,
+		(unsigned long long)result->evictions,
+		(unsigned long long)result->deletions,
+		(unsigned long long)result->deletion_time);
+
+}
+
+/**
+ * Determine base by prefix and offset to number. Uses standard rules
+ * (expressed in regex below):
+ * 0[xX][0-9a-fA-F]+ denotes a hexidecimal number
+ * 0[0-7]+           denotes an octal number
+ * (0|[1-9][0-9]*)   denotes a decimal number
+ */
+int get_param_base_and_start(const char **str)
+{
+	assert(*str);
+
+	/* Parse an "0x", "0X" or "0" -prefixed string, but not the string
+	 * "0" (which will be treated as base ten). */
+	if (**str == '0' && *(*str + 1)) {
+		++*str;
+		if (**str == 'x' || **str == 'X') {
+			++*str;
+			return 16;
+		} else {
+			return 8;
+		}
+	} else {
+		return 10;
+	}
+}
+
+/**
+ * @returns zero on success, non-zero if the string didn't represent a clean
+ *          number
+ *
+ * FIXME: wont be correct for u32 on platforms where sizeof(int) == 2
+ */
+int get_uint_param(const char *str, unsigned int *dest)
+{
+	const char *start = str;
+	char *endptr;
+	int base = get_param_base_and_start(&start);
+
+	*dest = strtoul(start, &endptr, base);
+	if (*endptr) {
+		fprintf(stderr, "bad number: %s\n", str);
+		exit (1);
+	}
+
+	return *endptr;
+}
+
+/**
+ * @returns zero on success, non-zero if the string didn't represent a clean
+ *          number
+ * FIXME: assumes u64 is unsigned long long
+ */
+int get_u64_param(const char *str, u64 *dest)
+{
+	char *endptr;
+	int base = get_param_base_and_start(&str);
+
+	*dest = strtoull(str, &endptr, base);
+	if (*endptr) {
+		fprintf(stderr, "bad number: %s\n", str);
+		exit (1);
+	}
+
+	return *endptr;
+}
+
+void show_usage()
+{
+	fprintf(stderr,
+"Usage: %s [options]\n"
+"Options:\n"
+"  -h,     --help    Show this help\n"
+"  -t=NUM, --test    The test to run\n"
+"                    0 Performance: Insert\n"
+"                    1 Performance: Insert & Delete\n"
+"                    2 Validation\n"
+"  -s=NUM, --seed    Seed for random key generation (zero to use current\n"
+"                    time)\n"
+"  -m=NUM, --keymask Bitmask for keys (key range)\n"
+"  -c=NUM, --count   Number of objects to use\n"
+"  -r=NUM, --reps    Number of times to repeat test(s)\n"
+"  -p=NUM, --pools   Number of pools of objects to use\n"
+"  -u,     --human   Output in human-readable form\n"
+"  -d=STR, --delim   Use the string STR to delimit fields\n"
+"  -H,     --header  Output a row header\n"
+"\n"
+"Fields:\n"
+"  compiler        the compiler used\n"
+"  use_generic     value of GRBTEST_BUILD_GENERIC\n"
+"  use_leftmost    value of GRBTEST_USE_LEFTMOST\n"
+"  use_rightmost   value of GRBTEST_USE_RIGHTMOST\n"
+"  use_count       value of GRBTEST_USE_COUNT\n"
+"  unique_keys     value of GRBTEST_UNIQUE_KEYS\n"
+"  insert_replaces value of GRBTEST_INSERT_REPLACES\n"
+"  use_augmented   value of GRBTEST_USE_AUGMENTED\n"
+"  debug           if CONFIG_DEBUG_RBTREE is enabled (.config)\n"
+"  debug_validate  if CONFIG_DEBUG_RBTREE_VALIDATE is enabled (.config)\n"
+"  test            \n"
+"  in_seed         input seed\n"
+"  seed            result seed (differs if supplied seed is zero)\n"
+"  key_mask        \n"
+"  object_count    number of objects used for test\n"
+"  pool_count      \n"
+"  reps            number of times test is repeated\n"
+"  node_size       sizeof(struct rb_node)\n"
+"  object_size     sizeof(struct object)\n"
+"  pool_size       \n"
+"  time            time in microseconds\n"
+"  insertions      number of insertions\n"
+"  deletions       number of deletions\n"
+"  evictions       number of evictions (always zero unless both \n"
+"                  GRBTEST_UNIQUE_KEYS and GRBTEST_USE_AUGMENTED are\n"
+"                  non-zero)\n"
+"\n"
+"Example:\n"
+"%s -s 1 --reps 0x8000 --count 0x800 --keymask 0xfff\n"
+"\n",
+		argv0, argv0);
+}
+
+int process_args(struct grbtest_config *config, int argc, char *argv[])
+{
+	int c;
+
+	memset(config, 0, sizeof(*config));
+	config->test		= 0;
+	config->in_seed		= 0;
+	config->seed		= 0;
+	config->key_mask	= 0xff;
+	config->object_count	= 0x100;
+	config->pool_count	= 1;
+	config->reps		= 1;
+	config->human_readable	= 0;
+	config->delimiter	= strdup(",");
+	config->text_enclosure	= strdup("'");
+	config->print_header	= 0;
+
+	while (1) {
+		static struct option long_options[] = {
+			{"help",    no_argument,       0, 'h'},
+			{"test",    required_argument, 0, 't'},
+			{"seed",    required_argument, 0, 's'},
+			{"keymask", required_argument, 0, 'm'},
+			{"count",   required_argument, 0, 'c'},
+			{"reps",    required_argument, 0, 'r'},
+			{"pools",   required_argument, 0, 'p'},
+			{"human",   no_argument,       0, 'u'},
+			{"delim",   required_argument, 0, 'd'},
+			{"quote",   required_argument, 0, 'q'},
+			{"header",  no_argument,       0, 'H'},
+			{0,         0,                 0, 0}
+		};
+
+		/* getopt_long stores the option index here. */
+		int option_index = 0;
+		c = getopt_long(argc, argv, "ht:s:c:r:p:m:ud:q:H", long_options, &option_index);
+
+		/* Detect the end of the options. */
+		if (c == -1)
+		break;
+
+		switch (c) {
+		case 'h': show_usage();			exit(1);
+		case 't': get_uint_param(optarg, &config->test);	break;
+		case 's': get_u64_param(optarg, &config->in_seed);	break;
+		case 'm': get_uint_param(optarg, &config->key_mask);	break;
+		case 'c': get_uint_param(optarg, &config->object_count);break;
+		case 'r': get_uint_param(optarg, &config->reps);	break;
+		case 'p': get_uint_param(optarg, &config->pool_count);	break;
+		case 'u': config->human_readable = 1;			break;
+		case 'd': free(config->delimiter);
+			  config->delimiter = strdup(optarg);		break;
+		case 'q': free(config->text_enclosure);
+			  config->text_enclosure = strdup(optarg);	break;
+		case 'H': config->print_header = 1;			break;
+		default : return -1;
+		}
+	}
+
+	if (optind < argc) {
+		fprintf(stderr, "Invalid argument: %s\n", argv[optind]);
+		return -1;
+	}
+
+	if (config->test >= GRBTEST_TYPE_COUNT) {
+		fprintf(stderr, "Invalid test specified.\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+
+
+#if 0
+/****************************************************************************
+ * rbtree & container-related functions
+ */
+
+static __always_inline long compare_long(const long *a, const long *b) {
+	return *a - *b;
+}
+
+static __always_inline long compare_u32(const u32 *a, const u32 *b) {
+	return (long)*a - (long)*b;
+}
+
+static inline void init_container(struct container *cunt) {
+	cunt->tree = RB_ROOT;
+	cunt->count = 0;
+}
+
+/****************************************************************************
+ * rb_relationship definition
+ */
+RB_DEFINE_INTERFACE(
+	mytree,
+	struct container, tree, leftmost, rightmost, count,
+	struct object, node, id,
+	RB_UNIQUE_KEYS | RB_INSERT_REPLACES, compare_u32, ,
+	,
+	static __flatten noinline,
+	static __flatten,
+	static __flatten __maybe_unused);
+
+void dumpNode(struct rb_node *node) {
+	fprintf(stderr, "rb_parent_color = 0x%lu (parent = 0x%p, color = %d)\n",
+		node->rb_parent_color, rb_parent(node), (int)rb_color(node));
+	fprintf(stderr, "rb_right        = %p\n", node->rb_right);
+	fprintf(stderr, "rb_left         = %p\n", node->rb_left);
+}
+
+void printObj(struct object *obj) {
+
+}
+
+struct obj_display {
+	struct object *o;
+	char text[64];
+};
+
+void populateDisp(struct rb_node *node, struct obj_display *disp_arr, unsigned *index, unsigned target_depth) {
+	struct rb_node *left = NULL;
+	struct rb_node *right = NULL;
+
+	if (target_depth == 0) {
+		struct obj_display *disp = &disp_arr[*index];
+
+		if (node) {
+			struct object *obj = rb_entry(node, struct object, node);
+			disp->o = obj;
+			snprintf(disp->text, sizeof(disp->text),
+				"%p=%02d", obj, obj->id);
+		} else {
+			disp->o = 0;
+			snprintf(disp->text, sizeof(disp->text),
+				"empty      ");
+		}
+		++(*index);
+	}
+
+	if (target_depth > 0) {
+		if (node) {
+			left = node->rb_left;
+			right = node->rb_right;
+		}
+		populateDisp(left, disp_arr, index, target_depth - 1);
+		populateDisp(right, disp_arr, index, target_depth - 1);
+	}
+}
+
+void dumpTree(struct rb_root *root, size_t obj_count) {
+	int max_depth = 5;
+	int max_nodes = 1 << max_depth;
+	unsigned index = 0u;
+	int i;
+	size_t item_text_size;
+	struct obj_display *disp = (struct obj_display *)malloc(max_nodes * sizeof(struct obj_display));
+
+	for (i = 0; i < max_depth; ++i) {
+		populateDisp(root->rb_node, disp, &index, i);
+	}
+	item_text_size = strlen(disp[0].text);
+
+	for (i = 0; i < max_nodes; ++i) {
+		if (i == 1 || i == 3 || i == 7 || i == 15 || i == 31) {
+			int bity;
+			fprintf(stderr, "\n");
+			for (bity = i + 1; !(bity & 32); bity <<= 1) {
+				const char spaces[257] = {[0 ... 255] = ' ', [256] = 0};
+				fprintf(stderr, "%s", &spaces[256 - item_text_size]);
+			}
+
+		} else {
+			fprintf(stderr, "  ");
+		}
+		fprintf(stderr, "%s", disp[i].text);
+	}
+	fprintf(stderr, "\n");
+
+}
+
+__attribute__((flatten))
+long run_test(unsigned int count) {
+	size_t buf_size = sizeof(struct object) * count;
+	struct container cont;
+	struct object *obj_pools[2];
+	struct object **tree_contents;
+	size_t pool_size;
+	int pool_in_use;
+	long i, j, k;
+	struct object *found;
+	struct rb_node *node;
+	long long start, end;
+
+	long long now = getCurTicks();
+
+	srand((now & 0xffffffff) ^ (now >> 32));
+
+	if (count < 1 || count > 0x1000000) { /* 16.8 million should be a reasonable limit */
+		return -1;
+	}
+
+	fprintf(stderr, "allocating two pools of %u objects each\n", count);
+
+	cont.tree      = RB_ROOT;
+	cont.count     = 0;
+	cont.leftmost  = 0;
+	cont.rightmost = 0;
+	pool_in_use      = 0;
+	obj_pools[0]     = (struct object *)malloc(buf_size);
+	obj_pools[1]     = (struct object *)malloc(buf_size);
+
+	fprintf(stderr, "initializing objects\n");
+
+	/* init a psudo-random using a real-random seed */
+//	get_random_bytes(&seed, sizeof(seed));
+//	prandom32_seed(&grbtest.rand, seed);
+
+	for (i = 0; i < 2; ++i) {
+		struct object *p = obj_pools[i];
+		struct object *end = &p[count];
+
+		for (; p != end; ++p) {
+			p->id = rand() & 0xfffff;
+			rb_init_node(&p->node);
+		}
+	}
+	pool_size = count;
+
+	for (j = 0; j < 2; ++j) {
+		struct object *pool = obj_pools[j];
+		start = getCurTicks();
+
+		for (i = count; i; ) {
+			struct object *new = &pool[--i];
+
+			mytree_insert(&cont, new);
+		}
+		pool_in_use ^= 1;
+		end = getCurTicks();
+		fprintf(stderr, "Inserted %u objects in %llu\n", count, end - start);
+	}
+
+	fprintf(stderr, "walking tree now...\n");
+	tree_contents = (struct object **)malloc(sizeof(void*) * cont.count);
+	start = getCurTicks();
+	for (i = 0, node = cont.leftmost; node; node = rb_next(node), ++i) {
+		tree_contents[i] = rb_entry(node, struct object, node);
+	}
+	end = getCurTicks();
+	fprintf(stderr, "Finished walking tree of %u in %llu\n", cont.count, end - start);
+
+	fprintf(stderr, "root = %p, count = %u\n", cont.tree.rb_node, cont.count);
+	//dumpNode(cont.tree.rb_node);
+
+	//dumpTree(&cont.tree, 16);
+	start = getCurTicks();
+#define NEAR_RANGE 8
+#if 0
+	for (i = 0; i < cont.count; ++i) {
+		for (j = 0; j < cont.count; ++j) {
+			found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			if (found != tree_contents[j]) {
+				fprintf(stderr, "find_near found %p near %p (expected %p)\n",
+					found, tree_contents[i], tree_contents[j]);
+				found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			}
+		}
+	}
+#else
+for (k = 0; k < 8; ++k) {
+	for (i = 0; i < cont.count; ++i) {
+		int max = i + NEAR_RANGE;
+		if (max > cont.count)
+			max = cont.count;
+		for (j = i > NEAR_RANGE ? i - NEAR_RANGE : 0;
+		     j < max; ++j) {
+			found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			if (found != tree_contents[j]) {
+				fprintf(stderr, "find_near found %p near %p (expected %p)\n",
+					found, tree_contents[i], tree_contents[j]);
+				found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			}
+		}
+	}
+}
+
+#endif
+	end = getCurTicks();
+	fprintf(stderr, "find_near duration = %llu\n", end - start);
+
+	start = getCurTicks();
+for (k = 0; k < 8; ++k) {
+	for (i = 0; i < cont.count; ++i) {
+		int max = i + NEAR_RANGE;
+		if (max > cont.count)
+			max = cont.count;
+		for (j = i > NEAR_RANGE ? i - NEAR_RANGE : 0;
+		     j < max; ++j) {
+			found = mytree_find(&cont, &tree_contents[j]->id);
+			if (found != tree_contents[j]) {
+				fprintf(stderr, "find_near found %p near %p (expected %p)\n",
+					found, tree_contents[i], tree_contents[j]);
+				found = mytree_find_near(tree_contents[i], &tree_contents[j]->id);
+			}
+		}
+	}
+}
+
+	end = getCurTicks();
+	fprintf(stderr, "find duration = %llu\n", end - start);
+
+	// cleanup
+
+	fprintf(stderr, "Forward iteration (%u objects)\n", cont.count);
+	for (node = cont.leftmost; node ; node = rb_next(node)) {
+		struct object *obj = (struct object *)__rb_node_to_obj(node, &mytree_rel);
+		//fprintf(stderr, "id = 0x%08x\n", obj->id);
+	}
+
+	fprintf(stderr, "Reverse iteration (%u objects)\n", cont.count);
+	for (node = cont.rightmost; node ; node = rb_prev(node)) {
+		struct object *obj = (struct object *)__rb_node_to_obj(node, &mytree_rel);
+		//fprintf(stderr, "id = 0x%08x\n", obj->id);
+	}
+
+
+	fprintf(stderr, "Starting cleanup, %u objects\n", cont.count);
+	while (cont.leftmost) {
+		struct object *obj = (struct object *)__rb_node_to_obj(cont.leftmost, &mytree_rel);
+		//fprintf(stderr, "Removing object at 0x%p id = 0x%04x\n", obj, obj->id);
+		mytree_remove(&cont, obj);
+		//--cont.count;
+	}
+
+	if(obj_pools[0]) {
+		free(obj_pools[0]);
+		free(obj_pools[1]);
+		obj_pools[0] = 0;
+		obj_pools[1] = 0;
+		free(tree_contents);
+	}
+	pool_in_use = 0;
+	cont.leftmost = 0;
+	cont.rightmost = 0;
+
+
+	fprintf(stderr, "Cleanup complete, %u objects left in container.\n", cont.count);
+
+	return 0;
+}
+#endif
diff --git a/tools/testing/selftests/grbtree/user/overrides/linux/export.h b/tools/testing/selftests/grbtree/user/overrides/linux/export.h
new file mode 100644
index 0000000..911be60
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/overrides/linux/export.h
@@ -0,0 +1,31 @@
+/* export.h userspace override hack
+ * Copyright (C) 2012  Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * This file is an override hack for the file normally found at
+ * include/linux/export.h in the kernel tree.
+ */
+
+#ifndef _LINUX_EXPORT_H
+#define _LINUX_EXPORT_H
+
+#define EXPORT_SYMBOL(sym)
+#define EXPORT_SYMBOL_GPL(sym)
+#define EXPORT_SYMBOL_GPL_FUTURE(sym)
+#define EXPORT_UNUSED_SYMBOL(sym)
+#define EXPORT_UNUSED_SYMBOL_GPL(sym)
+
+#endif /* _LINUX_EXPORT_H */
diff --git a/tools/testing/selftests/grbtree/user/overrides/linux/kernel.h b/tools/testing/selftests/grbtree/user/overrides/linux/kernel.h
new file mode 100644
index 0000000..3dfd046
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/overrides/linux/kernel.h
@@ -0,0 +1,80 @@
+/* kernel.h userspace override hack
+ * Copyright (C) 2012  Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * This file is an override hack for the file normally found at
+ * include/linux/kernel.h in the kernel tree (not /usr) to allow you to compile
+ * some portions of the Linux kernel code in user-space and is expected to
+ * break fairly often as the internals of the kernel tree change.  It's
+ * designed for use with GNU libc and is generally not expected to work
+ * elsewhere as-is.
+ */
+
+#ifndef _LINUX_KERNEL_H
+#define _LINUX_KERNEL_H
+
+/* avoid warning: __always_inline defined in both /usr/include/sys/cdefs.h as
+ * well as linux/compiler.h.  So we include cdefs.h now, #undef it and let
+ * linux/compiler.h redefine it.
+ */
+#include <sys/cdefs.h>
+#if defined(__always_inline) && !defined(__LINUX_COMPILER_H)
+# undef __always_inline
+#endif
+#include <linux/compiler.h>
+
+#include <assert.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <linux/err.h>
+#include <linux/bug.h>
+
+#ifndef __GNU_LIBRARY__
+# warning This file is a hack written to compile with the GNU C Library and \
+	  is not generally expected to work elsewhere.
+#endif
+
+/* glibc-backed versions of BUG{,_ON} */
+#undef BUG
+#undef BUG_ON
+#define BUG()		assert(0)
+#define BUG_ON(arg)	assert(!(arg))
+
+/**
+ * container_of - cast a member of a structure out to the containing structure
+ * @ptr:	the pointer to the member.
+ * @type:	the type of the container struct this is embedded in.
+ * @member:	the name of the member within the struct.
+ *
+ */
+#define container_of(ptr, type, member) ({			\
+	const typeof( ((type *)0)->member ) *__mptr = (ptr);	\
+	(type *)( (char *)__mptr - offsetof(type,member) );})
+
+typedef int8_t   s8;
+typedef uint8_t  u8;
+typedef int16_t  s16;
+typedef uint16_t u16;
+typedef int32_t  s32;
+typedef uint32_t u32;
+typedef int64_t  s64;
+typedef uint64_t u64;
+
+#endif /* _LINUX_KERNEL_H */
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 22/25] selftest: Add script to compile & run userspace test program
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (18 preceding siblings ...)
  2012-09-25 23:32 ` [PATCH v5 21/25] selftest: Add userspace test program Daniel Santos
@ 2012-09-25 23:32 ` Daniel Santos
  2012-09-25 23:32 ` [PATCH v5 23/25] selftest: Add basic compiler iterator test script Daniel Santos
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:32 UTC (permalink / raw)
  To: Daniel Santos, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

This is a basic script is designed to automate the testing process (for
a single compiler) and generate delimited text output suitable for later
processing. All test parameters can be supplied via the environment, or
defaults will be used for them. The following variables affect the
script and if not supplied, will have the these default values:

CC="gcc"
KERNELDIR="../../../../.."
CFLAGS="-O2 -pipe -march=k8"
key_type="u64"
use_leftmost=0
use_rightmost=0
use_count=0
unique_keys=0
insert_replaces=0
augmented=0
test_num=0

Other variables are hard-coded and will have be tweaked to need.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 tools/testing/selftests/grbtree/user/runtest.sh |  108 +++++++++++++++++++++++
 1 files changed, 108 insertions(+), 0 deletions(-)
 create mode 100755 tools/testing/selftests/grbtree/user/runtest.sh

diff --git a/tools/testing/selftests/grbtree/user/runtest.sh b/tools/testing/selftests/grbtree/user/runtest.sh
new file mode 100755
index 0000000..f371fea
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/runtest.sh
@@ -0,0 +1,108 @@
+#!/bin/bash
+
+# runtest.sh - script to compile and run userspace tests of gerneric red-black
+#              tree implementation for a single compiler.
+#
+# Copyright (C) 2012  Daniel Santos <daniel.santos@pobox.com>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# of the License, or (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+
+# Variables affecting code generation
+CC="${CC:-gcc}"
+KERNELDIR="${KERNELDIR:-../../../../..}"
+CFLAGS="${CFLAGS:-"-O2 -pipe -march=k8"}"
+# CONFIG parameters:
+key_type=${key_type:-u64}
+use_leftmost=${use_leftmost:-0}
+use_rightmost=${use_rightmost:-0}
+use_count=${use_count:-0}
+unique_keys=${unique_keys:-0}
+insert_replaces=${insert_replaces:-0}
+augmented=${augmented:-0}
+# For test_num, see grbtest -h
+test_num=${test_num:-0}
+
+# Variables passed to grbtest program
+reps=0x20000
+count=0x800
+keymask=0xfff
+
+# Output files
+logfile=runtest.log
+datafile=runtest.out
+
+build_desc[0]="hand-coded"
+build_desc[1]="generic"
+
+die() {
+	echo "ERROR${@:+": "}$@" 1>&2
+	exit -1
+}
+
+. /etc/profile || die
+
+do_cpp() {
+	echo "$1" > /tmp/gnucver.$$.c || die
+	${CC} -E /tmp/gnucver.$$.c | grep -v '^#' | tr -d ' '
+	rm /tmp/gnucver.$$.c
+}
+
+gccverstr=$(do_cpp "__GNUC__.__GNUC_MINOR__.__GNUC_PATCHLEVEL__") || die
+
+execute_tests() {
+	for build_type in 1 0; do
+		CONFIG=$(echo						\
+			-DGRBTEST_KEY_TYPE=${key_type}			\
+			-DGRBTEST_BUILD_GENERIC=${build_type}		\
+			-DGRBTEST_USE_LEFTMOST=${use_leftmost}		\
+			-DGRBTEST_USE_RIGHTMOST=${use_rightmost}	\
+			-DGRBTEST_USE_COUNT=${use_count}		\
+			-DGRBTEST_UNIQUE_KEYS=${unique_keys}		\
+			-DGRBTEST_INSERT_REPLACES=${insert_replaces}	\
+			-DGRBTEST_USE_AUGMENTED=${augmented}		\
+		)
+
+		echo "********************************************************"
+		echo "Starting build at $(date '+%Y-%m-%d %H:%M:%S')..."
+		echo "  build_type = ${build_desc[${build_type}]}"
+		echo "  compiler   = ${gccverstr}"
+		echo "  CFLAGS     = ${CFLAGS}"
+		echo "  KERNELDIR  = ${KERNELDIR}"
+		echo
+
+		#set -x
+		CC="${CC}"			\
+		CFLAGS="${CFLAGS}"		\
+		CONFIG="${CONFIG}"		\
+		KERNELDIR="${KERNELDIR}"	\
+		make clean all || die
+		set +x
+
+		echo
+		echo "Executing test..."
+		echo
+		./grbtest --seed 1		\
+			  --reps ${reps}	\
+			  --count ${count}	\
+			  --keymask ${keymask}	\
+			  --delim "|"		\
+			  --quote ""		\
+			  --test ${test_num} | tee -a "${datafile}"
+		echo
+		echo
+	done
+}
+
+execute_tests | tee -a ${logfile}
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 23/25] selftest: Add basic compiler iterator test script
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (19 preceding siblings ...)
  2012-09-25 23:32 ` [PATCH v5 22/25] selftest: Add script to compile & run " Daniel Santos
@ 2012-09-25 23:32 ` Daniel Santos
  2012-09-25 23:32 ` [PATCH v5 24/25] selftest: report generation script for test results Daniel Santos
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:32 UTC (permalink / raw)
  To: Daniel Santos, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

A Gentoo-specific script (to be run by root) to iterate through all
installed compilers, execte runtest.sh script and collect the output
data.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 tools/testing/selftests/grbtree/user/runtests.sh |   30 ++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)
 create mode 100755 tools/testing/selftests/grbtree/user/runtests.sh

diff --git a/tools/testing/selftests/grbtree/user/runtests.sh b/tools/testing/selftests/grbtree/user/runtests.sh
new file mode 100755
index 0000000..0192c10
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/runtests.sh
@@ -0,0 +1,30 @@
+#!/bin/bash
+
+# This script is designed for use on Gentoo systems, using gcc-config to
+# change the compiler and must be run as root. I'm lazy, so alter to fit your
+# system.
+
+user=daniel
+outfile=runtests.$$.out
+
+rm -f runtest.log runtest.out
+
+if [[ -e ${outfile} ]]; then
+	echo "File ${outfile} exists, please move it out of the way."
+	exit
+fi
+
+
+for ((gcc_inst_num = 1; gcc_inst_num < 10; ++gcc_inst_num)); do
+	gcc-config $gcc_inst_num || exit
+	. /etc/profile
+	nice -n -3 sudo -Hu ${user}	\
+		key_type=u32		\
+		use_leftmost=1		\
+		use_rightmost=1		\
+		use_count=1		\
+		unique_keys=1		\
+		insert_replaces=0	\
+		./runtest.sh >> ${outfile}
+
+done
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 24/25] selftest: report generation script for test results
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (20 preceding siblings ...)
  2012-09-25 23:32 ` [PATCH v5 23/25] selftest: Add basic compiler iterator test script Daniel Santos
@ 2012-09-25 23:32 ` Daniel Santos
  2012-09-25 23:32 ` [PATCH v5 25/25] rbtree.h: (optional?) Add RB_INSERT_DUPE_RIGHT flag Daniel Santos
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:32 UTC (permalink / raw)
  To: Daniel Santos, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

A script that uses sqlite to load test results and generates a report
showing differences in performance per compiler used.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 tools/testing/selftests/grbtree/user/gen_report.sh |  118 ++++++++++++++++++++
 1 files changed, 118 insertions(+), 0 deletions(-)
 create mode 100755 tools/testing/selftests/grbtree/user/gen_report.sh

diff --git a/tools/testing/selftests/grbtree/user/gen_report.sh b/tools/testing/selftests/grbtree/user/gen_report.sh
new file mode 100755
index 0000000..d6a1d3d
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/gen_report.sh
@@ -0,0 +1,118 @@
+#!/bin/bash
+
+dbfile=results.$$.db
+datafile=runtest.out
+
+die() {
+	echo "ERROR${@:+": "}$@" 1>&2
+	exit -1
+}
+
+find_sqlite() {
+	for suffix in "" 4 3; do
+		which sqlite${suffix} 2> /dev/null && return 0
+	done
+	return 1
+}
+
+sqlite=$(find_sqlite) || die "failed to find sqlite"
+
+${sqlite} "${dbfile}" << asdf
+/* .echo on */
+.headers on
+create table if not exists grbtest_result (
+	compiler	varchar(255),
+	key_type	varchar(255),
+	userland	tinyint,
+	use_generic	tinyint,
+	use_leftmost	tinyint,
+	use_rightmost	tinyint,
+	use_count	tinyint,
+	unique_keys	tinyint,
+	insert_replaces	tinyint,
+	use_augmented	tinyint,
+	debug		tinyint,
+	debug_validate	tinyint,
+	arch		varchar(255),
+	arch_flags	varchar(255),
+	processor	varchar(255),
+	cc		varchar(255),
+	cflags		varchar(255),
+	test		tinyint,
+	in_seed		bigint,
+	seed		bigint,
+	key_mask	int,
+	object_count	int,
+	pool_count	int,
+	reps		bigint,
+	node_size	int,
+	object_size	int,
+	pool_size	int,
+	insertions	bigint,
+	insertion_time	bigint,
+	evictions	bigint,
+	deletions	bigint,
+	deletion_time	bigint
+);
+.separator |
+.import ${datafile} grbtest_result
+/* .mode column */
+select distinct
+	key_type,
+	userland,
+	use_leftmost,
+	use_rightmost,
+	use_count,
+	unique_keys,
+	insert_replaces,
+	use_augmented,
+	debug,
+	debug_validate,
+	arch,
+	arch_flags,
+	processor,
+	cc,
+	test,
+	in_seed,
+	seed,
+	key_mask,
+	object_count,
+	pool_count,
+	reps,
+	node_size,
+	object_size,
+	pool_size,
+	insertions,
+	evictions,
+	deletions
+from grbtest_result;
+
+select distinct
+	a.compiler as 'Compiler',
+	a.key_type,
+	(case when a.userland then	  'U' else 'K' end) ||
+	(case when a.use_leftmost then	  'L' else '.' end) ||
+	(case when a.use_rightmost then	  'R' else '.' end) ||
+	(case when a.use_count then	  'C' else '.' end) ||
+	(case when a.unique_keys then	  'U' else '.' end) ||
+	(case when a.insert_replaces then 'I' else '.' end) ||
+	(case when a.debug then		  'D' else '.' end) ||
+	(case when a.debug_validate then  'V' else '.' end)
+	as config,
+	a.insertion_time as 'Generic Insert Time',
+	b.insertion_time as 'Hand-Coded Insert Time',
+	1.0 * a.insertion_time / b.insertion_time - 1.0 as 'Insert Diff',
+	a.deletion_time as 'Generic Delete Time',
+	b.deletion_time as 'Hand-Coded Delete Time',
+	1.0 * a.deletion_time / b.deletion_time - 1.0 as 'Delete Diff'
+from
+	grbtest_result as a inner join grbtest_result as b on (
+		a.compiler = b.compiler
+	)
+where
+	a.use_generic == 1
+	and b.use_generic = 0;
+asdf
+
+rm "${dbfile}"
+
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 25/25] rbtree.h: (optional?) Add RB_INSERT_DUPE_RIGHT flag
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (21 preceding siblings ...)
  2012-09-25 23:32 ` [PATCH v5 24/25] selftest: report generation script for test results Daniel Santos
@ 2012-09-25 23:32 ` Daniel Santos
       [not found] ` <1348618742.22822.39.camel@gandalf.local.home>
  2012-09-26  5:07 ` Daniel Santos
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-25 23:32 UTC (permalink / raw)
  To: Daniel Santos, Pavel Pisa, Richard Weinberger, LKML,
	Michel Lespinasse, Andrea Arcangeli, Peter Zijlstra,
	Rik van Riel

I'm not sure if this is needed anywhere in the kernel. If not, it will
cause inserts run on older compilers to slow down very slightly.

This flag affects the behavior of inserts in trees where duplicate keys
are allowed.  It's generally assumed that the new node will be added to
the head of a group of nodes with the same key value.  However, this
behavior may not always be desired and this flag offers the option to
choose them to be inserted at the tail of such a group.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/rbtree.h |   79 ++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index f1fbdea..2ca553b 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -264,6 +264,11 @@ static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
  * @RB_INSERT_REPLACES:	When set, the rb_insert() will replace a value if it
  * 			matches the supplied one (valid only when
  * 			RB_UNIQUE_KEYS is set).
+ * @RB_INSERT_DUPE_RIGHT: Normally, when inserting a duplicate value into a
+ * 			tree with non-unique keys, the new value is inserted at
+ * 			the the head of the group same-value objects.  This
+ * 			flag causes inserts in such cases to put the new value
+ * 			at the tail of the group.
  * @RB_IS_AUGMENTED:	is an augmented tree
  * @RB_VERIFY_USAGE:	Perform checks upon node insertion for a small run-time
  * 			overhead. This has other usage restrictions, so read
@@ -300,12 +305,14 @@ enum rb_flags {
 	RB_HAS_COUNT		= 0x00000004,
 	RB_UNIQUE_KEYS		= 0x00000008,
 	RB_INSERT_REPLACES	= 0x00000010,
+	RB_INSERT_DUPE_RIGHT	= 0x00000020,
 	RB_IS_AUGMENTED		= 0x00000040,
 	RB_VERIFY_USAGE		= 0x00000080 & __RB_DEBUG_ENABLE_MASK,
 	RB_VERIFY_INTEGRITY	= 0x00000100 & __RB_DEBUG_ENABLE_MASK,
 	RB_ALL_FLAGS		= RB_HAS_LEFTMOST | RB_HAS_RIGHTMOST
 			| RB_HAS_COUNT | RB_UNIQUE_KEYS
 			| RB_INSERT_REPLACES | RB_IS_AUGMENTED
+			| RB_INSERT_DUPE_RIGHT
 			| RB_VERIFY_USAGE | RB_VERIFY_INTEGRITY,
 };
 
@@ -504,15 +511,19 @@ void __rb_assert_good_rel(const struct rb_relationship *rel)
 	/* Due to a bug in versions of gcc prior to 4.6, the following
 	 * expressions are always evalulated at run-time:
 	 *
+	 * ((rel->flags & RB_UNIQUE_KEYS) && (rel->flags & RB_INSERT_DUPE_RIGHT))
 	 * (!(rel->flags & RB_UNIQUE_KEYS) && (rel->flags & RB_INSERT_REPLACES))
 	 *
 	 * The work-around for this bug is separate each bitwise AND test using
 	 * an if/else construct and evaluate only the last test with the
 	 * BUILD_BUG_ON macro.
 	 */
-
-	if (rel->flags & RB_INSERT_REPLACES)
-		BUILD_BUG_ON42(!(rel->flags & RB_UNIQUE_KEYS));
+	if (rel->flags & RB_UNIQUE_KEYS)
+		/* only with non-unique keys */
+		BUILD_BUG_ON42(rel->flags & RB_INSERT_DUPE_RIGHT);
+	else
+		/* only with unique keys */
+		BUILD_BUG_ON42(rel->flags & RB_INSERT_REPLACES);
 }
 
 
@@ -840,6 +851,7 @@ struct rb_node *__rb_find_subtree(
 				 */
 				break;
 			else if (!diff) {
+				/* FIXME: non-unique keys broken here on inserts */
 				/* exact match */
 				*matched = go_left
 					 ? RB_MATCH_LEFT
@@ -1022,16 +1034,34 @@ struct rb_node *rb_insert(
 
 		parent = *p;
 
-		if (diff > 0) {
-			p = &(*p)->rb_right;
-			if (rel->flags & RB_HAS_LEFTMOST)
-				leftmost = 0;
-		} else if (!(rel->flags & RB_UNIQUE_KEYS) || diff < 0) {
-			p = &(*p)->rb_left;
-			if (rel->flags & RB_HAS_RIGHTMOST)
-				rightmost = 0;
-		} else
-			break;
+		/* when using non-unique keys, a new same-key objects is
+		 * inserted at the head of any existing same-key objects unless
+		 * RB_INSERT_DUPE_RIGHT is specified, which causes them to be
+		 * inserted at the tail.
+		 */
+		if (rel->flags & RB_INSERT_DUPE_RIGHT) {
+			if (diff < 0) {
+				p = &(*p)->rb_left;
+				if (rel->flags & RB_HAS_RIGHTMOST)
+					rightmost = 0;
+			} else if (!(rel->flags & RB_UNIQUE_KEYS) || diff > 0) {
+				p = &(*p)->rb_right;
+				if (rel->flags & RB_HAS_LEFTMOST)
+					leftmost = 0;
+			} else
+				break;
+		} else {
+			if (diff > 0) {
+				p = &(*p)->rb_right;
+				if (rel->flags & RB_HAS_LEFTMOST)
+					leftmost = 0;
+			} else if (!(rel->flags & RB_UNIQUE_KEYS) || diff < 0) {
+				p = &(*p)->rb_left;
+				if (rel->flags & RB_HAS_RIGHTMOST)
+					rightmost = 0;
+			} else
+				break;
+		}
 	}
 
 	if ((rel->flags & RB_HAS_LEFTMOST) && leftmost) {
@@ -1087,12 +1117,23 @@ struct rb_node *rb_insert_near(
 			diff   = rel->compare(__rb_node_to_key(*p, rel), key);
 			parent = *p;
 
-			if (diff > 0)
-				p = &(*p)->rb_right;
-			else if (!(rel->flags & RB_UNIQUE_KEYS) || diff < 0)
-				p = &(*p)->rb_left;
-			else
-				break;
+			if (rel->flags & RB_INSERT_REPLACES) {
+				if (diff < 0)
+					p = &(*p)->rb_left;
+				else if (!(rel->flags & RB_UNIQUE_KEYS) ||
+						diff > 0)
+					p = &(*p)->rb_right;
+				else
+					break;
+			} else {
+				if (diff > 0)
+					p = &(*p)->rb_right;
+				else if (!(rel->flags & RB_UNIQUE_KEYS) ||
+						diff < 0)
+					p = &(*p)->rb_left;
+				else
+					break;
+			}
 		}
 		ret = *p;
 	}
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 0/25] Generic Red-Black Trees (still WIP)
       [not found] ` <1348618742.22822.39.camel@gandalf.local.home>
@ 2012-09-26  1:02   ` Daniel Santos
  2012-09-26  1:28     ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Santos @ 2012-09-26  1:02 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-doc, linux-sparse, LKML


>> Q&A
>> ===
>> Q: Why did you add BUILD_BUG_ON_NON_CONST() and
>>    BUILD_BUG_ON_NON_CONST42()?
> A: Because BUILD_BUG_ON_NON_CONST42() will crash if it does not result
> in the answer to life, the universe and everything!

By the way, I have a theory before time, God was writing code on some
cosmic computer (beyond our understanding, of course) when he
accidentally tried to divide by zero, resulting in a core dump that we
now know as the universe we live in.  So thus, we are just the excrement
of some celestial computer after it failed to properly execute its code.

Daniel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 0/25] Generic Red-Black Trees (still WIP)
  2012-09-26  1:02   ` [PATCH v5 0/25] Generic Red-Black Trees (still WIP) Daniel Santos
@ 2012-09-26  1:28     ` Steven Rostedt
  0 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2012-09-26  1:28 UTC (permalink / raw)
  To: Daniel Santos; +Cc: linux-doc, linux-sparse, LKML

On Tue, 2012-09-25 at 20:02 -0500, Daniel Santos wrote:
> >> Q&A
> >> ===
> >> Q: Why did you add BUILD_BUG_ON_NON_CONST() and
> >>    BUILD_BUG_ON_NON_CONST42()?
> > A: Because BUILD_BUG_ON_NON_CONST42() will crash if it does not result
> > in the answer to life, the universe and everything!
> 
> By the way, I have a theory before time, God was writing code on some
> cosmic computer (beyond our understanding, of course) when he
> accidentally tried to divide by zero, resulting in a core dump that we
> now know as the universe we live in.  So thus, we are just the excrement
> of some celestial computer after it failed to properly execute its code.
> 

And that operation on which God failed on was:

ans = 42 / 0

Which totally explains the point where... "to understand the answer, you
must first understand the question".

-- Steve


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 0/25] Generic Red-Black Trees (still WIP)
       [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
                   ` (23 preceding siblings ...)
       [not found] ` <1348618742.22822.39.camel@gandalf.local.home>
@ 2012-09-26  5:07 ` Daniel Santos
  24 siblings, 0 replies; 26+ messages in thread
From: Daniel Santos @ 2012-09-26  5:07 UTC (permalink / raw)
  To: LKML

Hmm, looks like I've had some type of mailer problem as this message
didn't appear on LKML :(  I hope this one goes through, but sorry my
patches aren't properly grouped.

On 09/25/2012 06:24 PM, Daniel Santos wrote:
> First I want to apologize for not being able to work on this over most of the
> summer. I see that some other changes are happening with red-black and
> interval trees in the kernel which look good.  This patch set is based on v3.5
> and is not adjusted for many of the changes in Michel Lespinasse's patches.
> This is still WIP as I have added a good deal of new test code and made a fair
> number of performance tweaks, but I needed to get something out for review
> again to keep this thing rolling.
>
> Summary
> =======
> This patch set improves on Andrea Arcangeli's original Red-Black Tree
> implementation by adding generic search and insert functions with
> complete support for:
>
> o leftmost - keeps a pointer to the leftmost (lowest value) node cached
>   in your container struct
> o rightmost - ditto for rightmost (greatest value)
> o count - optionally update an count variable when you perform inserts
>   or deletes
> o unique or non-unique keys
> o find and insert "near" functions - when you already have a node that
>   is likely near another one you want to search for
> o augmented / interval tree support
> o type-safe wrapper interface available via pre-processor macro
>
> Outstanding Issues
> ==================
> General
> -------
> o Need to change comments at head of rbtree.h.
> o Need something in Documents to explain generic rbtrees.
> o Descriptions for new KConfig values incomplete.
> o Due to a bug in gcc's optimizer, extra instructions are generated in various
>   places.  Pavel Pisa has provided me a possible work-around that should be
>   examined more closely to see if it can be working in (Discussed in
>   Performance section).
> o Doc-comments are missing or out of date in some places for the new
>   ins_compare field of struct rb_relationship (including at least one code
>   example).
>
> Selftests
> ---------
> o In-kernel test module not completed, although the option to build it has
>   already been added to KConfig.
> o Userspace selftest's Makefile should run modules_prepare in KERNELDIR.
> o Validation in self-tests doesn't yet cover tests for
>   - insert_near
>   - find_{first,last,next,prev}
> o Selftest scripts need better portability.
> o It would be nice to have some fault-injection in test code to verify that
>   CONFIG_DEBUG_RBTREE and CONFIG_DEBUG_RBTREE_VALIDATE (and it's
>   RB_VERIFY_INTEGRITY counterpart flag) catch the errors they are supposed to.
>
> Undecided (Opinions Requested!)
> -------------------------------
> o With the exception of the rb_node & rb_root structs, "Layer 2" of the code
>   (see below) completely abstracts away the underlying red-black tree
>   mechanism.  The structs rb_node and rb_root can also be abstracted away via
>   a typeset or some other mechanism. Thus, should the "Layer 2" code be
>   separated from "Layer 1" and renamed "Generic Tree (gtree)" or some such,
>   paving the way for an alternate tree implementation in the future?
> o Do we need RB_INSERT_DUPE_RIGHT? (see the last patch)
>
>
> Theory of Operation
> ===================
> Historically, genericity in C meant function pointers, the overhead of a
> function call and the inability of the compiler to optimize code across
> the function call boundary.  GCC has been getting better and better at
> optimization and determining when a value is a compile-time constant and
> compiling it out.  As of gcc 4.6, it has finally reached a point where
> it's possible to have generic search & insert cores that optimize
> exactly as well as if they were hand-coded. (see also gcc man page:
> -findirect-inlining)
>
> This implementation actually consists of two layers written on top of the
> existing rbtree implementation.
>
> Layer 1: Type-Specific (But Not Type-Safe)
> ------------------------------------------
> The first layer consists of enum rb_flags, struct rb_relationship and
> some generic inline functions(see patch for doc comments).
>
> enum rb_flags {
> 	RB_HAS_LEFTMOST		= 0x00000001,
> 	RB_HAS_RIGHTMOST	= 0x00000002,
> 	RB_HAS_COUNT		= 0x00000004,
> 	RB_UNIQUE_KEYS		= 0x00000008,
> 	RB_INSERT_REPLACES	= 0x00000010,
> 	RB_IS_AUGMENTED		= 0x00000040,
> 	RB_VERIFY_USAGE		= 0x00000080,
> 	RB_VERIFY_INTEGRITY	= 0x00000100
> };
>
> struct rb_relationship {
> 	ssize_t root_offset;
> 	ssize_t left_offset;
> 	ssize_t right_offset;
> 	ssize_t count_offset;
> 	ssize_t node_offset;
> 	ssize_t key_offset;
> 	int flags;
> 	const rb_compare_f compare;     /* comparitor for lookups */
> 	const rb_compare_f ins_compare; /* comparitor for inserts */
> 	const rb_augment_f augment;
> 	unsigned key_size;
> };
>
> /* these function for use on all trees */
> struct rb_node *rb_find(
> 		struct rb_root *root,
> 		const void *key,
> 		const struct rb_relationship *rel);
> struct rb_node *rb_find_near(
> 		struct rb_node *from,
> 		const void *key,
> 		const struct rb_relationship *rel);
> struct rb_node *rb_insert(
> 		struct rb_root *root,
> 		struct rb_node *node,
> 		const struct rb_relationship *rel);
> struct rb_node *rb_insert_near(
> 		struct rb_root *root,
> 		struct rb_node *start,
> 		struct rb_node *node,
> 		const struct rb_relationship *rel);
> void rb_remove(	struct rb_root *root,
> 		struct rb_node *node,
> 		const struct rb_relationship *rel);
>
> /* these function for use on trees with non-unique keys */
> struct rb_node *rb_find_first(
> 		struct rb_root *root,
> 		const void *key,
> 		const struct rb_relationship *rel);
> struct rb_node *rb_find_last(
> 		struct rb_root *root,
> 		const void *key,
> 		const struct rb_relationship *rel);
> struct rb_node *rb_find_next(
> 		const struct rb_node *node,
> 		const struct rb_relationship *rel)
> struct rb_node *rb_find_prev(
> 		const struct rb_node *node,
> 		const struct rb_relationship *rel)
>
> Using this layer involves initializing a const struct rb_relationship
> variable with compile-time constant values and feeding its "address" to
> the generic inline functions.  The trick being, that (when gcc behaves
> properly) it never creates a struct rb_relationship variable, stores an
> initializer in the data section of the object file or passes a struct
> rb_relationship pointer.  Instead, gcc "optimizes out" out the struct,
> and uses the compile-time constant values to dictate how the inline
> functions will expand.
>
> Thus, this structure can be thought of both as a database's DDL (data
> definition language), defining the relationship between two entities and the
> template parameters to a C++ templatized function that controls how the
> template function is instantiated.  This creates type-specific functions,
> although type-safety is still not achieved (e.g., you can pass a pointer to
> any rb_node you like).
>
> To simplify usage, you can initialize your struct rb_relationship variable
> with the RB_RELATIONSHIP macro, feeding it your types, member names and flags
> and it will calculate the offsets for you.  See doc comments in patch for
> examples of using this layer (either with or without the RB_RELATIONSHIP
> macro).
>
> Layer 2: Type-Safety
> --------------------
> In order to achieve type-safety of a generic interface in C, we must delve
> deep into the darkened Swamps of The Preprocessor and confront the Prince of
> Darkness himself: Big Ugly Macro.  To be fair, there is an alternative
> solution (discussed in History & Design Goals), the so-called "x-macro" or
> "supermacro" where you #define some pre-processor values and include an
> unguarded header file.  With 17 parameters, I choose this solution for its
> ease of use and brevity, but it's an area worth debate (some of which you can
> find here if you wish: http://lwn.net/Articles/501876).
>
> So this second layer allows you to use a single macro to define your
> relationship as well as type-safe wrapper functions all in one go.
>
> RB_DEFINE_INTERFACE(
> 	prefix,
> 	cont_type, root, left, right, count,
> 	obj_type, node, key,
> 	flags, compare, ins_compare, augment,
> 	find_mod, insert_mod, find_near_mod, insert_near_mod)
>
> To avoid needing multiple versions of the macro, we use a paradigm where
> optional values can be left empty. (See RB_DEFINE_INTERFACE doc comments for
> details.)  Thus, if your container doesn't need to know leftmost, you leave
> the parameter empty.  Here's a quick example:
>
> struct container {
> 	struct rb_root root;
> 	struct rb_node *leftmost;
> 	unsigned long count;
> };
>
> struct object {
> 	struct rb_node node;
> 	long key;
> };
>
> static inline long compare_long(const long *a, const long *b)
> {
> 	return *a - *b;
> }
>
> RB_DEFINE_INTERFACE(
> 	my_objects,
> 	struct container, root, leftmost, /* no rightmost */, count,
> 	struct object, node, key,
> 	RB_UNIQUE_KEYS | RB_INSERT_REPLACES, compare_long, compare_long,
> 	/* no augment */,
> 	,,,)
>
> This will do some type-checking, create the struct rb_relationship and
> the following static __always_inline wrapper functions. (Note that
> "my_objects" is the prefix used in the example above.  It will be
> whatever you pass as the first parameter to the RB_DEFINE_INTERFACE
> macro.)
>
> struct object *my_objects_find(
> 		struct container *cont,
> 		const typeof(((struct object *)0)->key) *_key);
> struct object *my_objects_insert(
> 		struct container *cont,
> 		struct object *obj);
> struct object *my_objects_find_near(
> 		struct object *near,
> 		const typeof(((struct object *)0)->key) *_key);
> struct object *my_objects_insert_near(
> 		struct container *cont,
> 		struct object *near,
> 		struct object *obj);
> void my_objects_remove(struct container *cont, struct object *obj);
> struct object *my_objects_find_first(
> 		struct container *cont,
> 		const typeof(((struct object *)0)->key) *_key);
> struct object *my_objects_find_last(
> 		struct container *cont,
> 		const typeof(((struct object *)0)->key) *_key);
> struct object *my_objects_find_next(const struct object *obj);
> struct object *my_objects_find_last(const struct object *obj);
> struct object *my_objects_next(const struct object *obj);
> struct object *my_objects_prev(const struct object *obj);
> struct object *my_objects_first(struct container *cont);
> struct object *my_objects_last(struct container *cont);
>
> Each of these are each declared static __always_inline. However, you can
> change the modifiers for the first four (find, insert, find_near and
> insert_near) by populating any of the last 4 parameters with the function
> modifiers of the respective function (when empty, they default to static
> __always_inline).
>
> Not only does this layer give you type-safety, it removes almost all of
> the implementation details of the rbtree from the code using it, thus
> making it easier to replace the underlying algorithm at some later
> date.
>
> Compare Functions
> -----------------
> Because equality is unimportant when doing inserts into a tree with duplicate
> keys, struct rb_relationship's ins_compare field can be set to a greater-than
> function for better performance. Using the example in the section above as a
> model, this is what it would look like:
>
> static inline long compare_long(const long *a, const long *b)
> ...
> static inline long greater_long(const long *a, const long *b)
> {
> 	return *a > *b;
> }
>
> RB_DEFINE_INTERFACE(
> 	my_objects,
> 	struct container, root, leftmost, /* no rightmost */, count,
> 	struct object, node, key,
> 	0, compare_long, greater_long,
> 	/* no augment */,
> 	,,,)
>
>
> History & Design Goals
> ======================
> I've been through many iterations of various techniques searching for the
> perfect "clean" implementation and finally settled on having a huge macro
> expand to wrapper functions after exhausting all other alternatives. The trick
> is that what one person considers a "clean" implementation is a bit of a value
> judgment.  So by "clean", I mean balancing these requirements:
>
> 1.) minimal dependence on pre-processor
> 2.) avoiding pre-processor expanded code that will break debug
>     information (backtraces)
> 3.) optimal encapsulation of the details of your rbtree in minimal
>     source code (this is where you define the relationship between your
> 	container and contained objects, their types, keys, rather or not
> 	non-unique objects are allowed, etc.) -- preferably eliminating
> 	duplication of these details entirely.
> 4.) offering a complete feature-set in a single implementation (not
>     multiple functions when various features are used)
> 5.) perfect optimization -- the generic function must be exactly as
>     efficient as the hand-coded version
>
> By those standards, the "cleanest" implementation I had come up with
> actually used a different mechanism: defining an anonymous interface
> struct something like this:
>
> /* generic non-type-safe function */
> static __always_inline void *__generic_func(void *obj);
>
> struct {							\
>         out_type *(*const func)(in_type *obj);			\
> } name = {							\
>         .func = (out_type *(*const)(in_type *obj))__generic_func;\
> }
>
> /* usage looks like this: */
> DEFINE_INTERFACE(solution_a, struct something, struct something_else);
> struct something *s;
> struct something_else *se;
> se = solution_a.func(s);
>
> Sadly, while solution_a.func(s) optimizes perfectly in 4.6, it completely
> bombed in 4.5 and prior -- the call by struct-member-function-pointer is never
> inlined and nothing passed to it is every considered a compile-time constant
> (again, see gcc's docs on -findirect-inline).  Because of the implementation
> of the generic functions, this bloated the code unacceptably (3x larger).
> Thus, I finally settled on the current RB_DEFINE_INTERFACE macro, which is
> massive, but optimizes perfectly in 4.6+ and close enough in 4.5 and prior
> (prior to 4.6, the compare function is never inlined).
>
> The other alternative I briefly considered was to have a header file
> that is only included after #defining all of these parameters, relying
> primarily on cpp rather than cc & compile-time constants to fill in the
> relationship details (the "x-macro" approach).  While this mechanism
> would perform better on older compilers and never break backtraces, in
> the end, I just couldn't stomach it.  Aside from that, it would make
> using the interface almost as verbose as hand-coding it yourself.
>
> Performance
> ===========
> Here are the results of performance tests run on v5 of this patch set (against
> v3.5 kernel) on an AMD Phenom 9850.  This is a reformatted version of what
> tools/testing/selftests/grbtree/user/gen_report.sh outputs.  Test results vary
> quite a bit dependent upon the selected features.
>
> For all of these tests, I used the following parameters:
> key range       0-4095
> key type	u32
> object_count    2048
> repititions     131,072
> node_size       24 bytes
> object_size     32 bytes
> total data size 65,536 bytes
> num insertions	268,435,456
>
> Below is a summary of the performance drop using generic rbtrees on various
> ranges of compilers. (negative values are performance improvements)
>
> GCC versions    Best    Worst
> 3.4 - 4.0	35%	80%
> 4.1 - 4.5	18%	23%
> 4.6 - 4.7	-7%	 5%
>
> The tables below list the time in seconds it took to execute the tests on each
> compiler and the difference between the generic and specific (i.e.,
> hand-coded) test results.
>
> Duplicate keys (no leftmost, rightmost or count)
> Compiler       Generic Specific Performance Loss
> gcc-3.4.6	33.41	18.78	77.94%
> gcc-4.0.4	32.36	17.94	80.37%
> gcc-4.1.2	23.11	17.76	30.14%
> gcc-4.2.4	22.97	17.83	28.84%
> gcc-4.3.6	23.07	17.78	29.79%
> gcc-4.4.7	21.88	17.64	24.03%
> gcc-4.5.4	21.75	17.54	23.99%
> gcc-4.6.3	16.84	16.82	 0.10%
> gcc-4.7.1	16.79	16.68	 0.66%
>
> Duplicate keys, use leftmost (no rightmost or count)
> Compiler       Generic Specific Performance Loss
> gcc-3.4.6	33.54	22.57	48.63%
> gcc-4.0.4	32.82	22.16	48.07%
> gcc-4.1.2	27.30	22.77	19.93%
> gcc-4.2.4	27.41	22.86	19.95%
> gcc-4.3.6	28.65	23.03	24.38%
> gcc-4.4.7	27.03	21.41	26.24%
> gcc-4.5.4	26.69	22.48	18.71%
> gcc-4.6.3	21.58	21.53	 0.24%
> gcc-4.7.1	22.40	22.23	 0.77%
>
> Duplicate keys, use leftmost, rightmost and count
> Compiler       Generic Specific Performance Loss
> gcc-3.4.6	33.49	22.70	47.52%
> gcc-4.0.4	33.19	23.71	39.94%
> gcc-4.1.2	29.03	23.76	22.18%
> gcc-4.2.4	28.59	23.82	20.04%
> gcc-4.3.6	29.69	23.94	24.01%
> gcc-4.4.7	28.62	23.89	19.79%
> gcc-4.5.4	28.73	23.54	22.04%
> gcc-4.6.3	23.82	23.70	 0.51%
> gcc-4.7.1	23.84	23.94	-0.40%
>
> Unique keys (no leftmost, rightmost or count)
> Compiler       Generic Specific Performance Loss
> gcc-3.4.6	29.38	19.94	47.33%
> gcc-4.0.4	28.85	21.14	36.48%
> gcc-4.1.2	25.16	20.30	23.95%
> gcc-4.2.4	25.26	20.50	23.23%
> gcc-4.3.6	25.41	20.82	22.02%
> gcc-4.4.7	26.12	20.68	26.33%
> gcc-4.5.4	25.29	20.31	24.54%
> gcc-4.6.3	21.57	20.35	 6.01%
> gcc-4.7.1	20.98	20.20	 3.88%
>
> Unique keys, use leftmost (no rightmost or count)
> Compiler       Generic Specific Performance Loss
> gcc-3.4.6	29.50	20.96	40.76%
> gcc-4.0.4	28.93	20.90	38.41%
> gcc-4.1.2	26.26	22.29	17.80%
> gcc-4.2.4	25.49	22.05	15.61%
> gcc-4.3.6	26.55	22.25	19.34%
> gcc-4.4.7	28.90	22.24	29.92%
> gcc-4.5.4	26.85	21.86	22.80%
> gcc-4.6.3	22.95	22.06	 4.03%
> gcc-4.7.1	22.56	21.48	 5.01%
>
> Unique keys, use leftmost, rightmost and count
> Compiler       Generic Specific Performance Loss
> gcc-3.4.6	29.48	20.91	40.97%
> gcc-4.0.4	29.37	21.72	35.20%
> gcc-4.1.2	25.25	23.10	 9.29%
> gcc-4.2.4	26.17	22.35	17.13%
> gcc-4.3.6	26.34	22.30	18.10%
> gcc-4.4.7	25.24	22.43	12.51%
> gcc-4.5.4	25.58	23.07	10.89%
> gcc-4.6.3	21.79	23.50	-7.29%
> gcc-4.7.1	23.27	25.08	-7.22%
>
>
> I've done an analysis of the gcc 4.7.1-generated code and discovered the
> following flaws in the generic insert function.
>
> 1. Key of inserted object being read repeatedly. Instead of reading the value
>    of the inserted key once, at the start of the function, the key is read
>    prior to each comparision.  I'm guessing that this is because optimizer
>    makes the faulty assumption that the value could change throughout the
>    course of execution. This costs us one extra instruction each iteration of
>    the loop as we search the tree (32-bit key).
>
>      mov    0x18(%rax),%edx
>
>    A work-around is in place to eliminate this problem on gcc 4.6.0 and later
>    if your key size is 16, 32 or 64 bits, which manages to get gcc to store
>    the key of the supplied object in a regsiter at the start of the function
>    preventing us a performance loss of roughly 4%.
>
> 2. Due to gcc bug 3507 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507),
>    this code:
>
>      long diff = a - b;
>
>      if (diff > 0)
>          do_gt();
>      else if (diff < 0)
>          do_lt();
>      else
>          do_eq();
>
>    Optimizes more poorly than this code:
>
>      if (a > b)
>          do_gt();
>      else if (b < a)
>          do_lt();
>      else
>          do_eq();
>
>    So instead of the key compare happening like this (64-bit key):
>
>      cmp    0x18(%rax),%rsi
>
>    We get this:
>
>      mov    %rsi,%rdx
>      sub    0x18(%rax),%rdx
>      cmp    $0x0,%rdx
>
>    The results can be slightly worse when the key type isn't the same as long.
>    With a signed 32-bit key (s32) on x86_64, gcc thinks it needs to convert
>    the difference to a 64-bit long.
>
>      mov    %esi,%edx
>      sub    0x18(%rax),%edx
>      movslq %edx,%rdx
>      cmp    $0x0,%rdx
>
>    Not only is this 2-3 extra instruction, it also uses one extra register,
>    which in turn forces gcc to use an r8-15 register in other places, which
>    requires larger opcodes. Also, this only occurs when using the normal
>    compare function (doesn't occur when using 'greater').  So this affects
>    inserts on trees with unique keys and all lookups.
>
> Q&A
> ===
> Q: Why did you add BUILD_BUG_ON_NON_CONST() and
>    BUILD_BUG_ON_NON_CONST42()?
> A: There were initially enough BUILD_BUG_ON(!__builtin_constant_p(arg))
>    calls to warrant it having a macro for it.  However, I've since
>    discovered that using __builtin_constant_p on a struct member did not
>    behave very consistently, so after writing some test programs &
>    scripts, and refining 200k+ test results, I graphed out basically
>    where __builtin_constant_p() worked and didn't.  As it turns out,
>    using it on struct members is fragile until gcc 4.2, so
>    BUILD_BUG_ON_NON_CONST42() is intended for use with struct members.
>
> Q: Why empty parameters?
>    What is IFF_EMPTY() for?
>    Why don't you just pass zero instead of an empty parameter?
> A: Support for caching the left- & right-most nodes in the tree as well
>    as maintaining a count variable are all optional.  Passing the offset
>    value directly not only means more characters of code to use the
>    RB_RELATIONSHIP and RB_DEFINE_INTERFACE macros (because now you'll
>    have to invoke the offsetof macro, supplying your struct types
>    again), but the offset may actually be zero, so passing zero as "I'm
>    not using this feature" wont work.  (This is the reason why the flags
>    RB_HAS_LEFTMOST, et. al. exist.)  Thus, you would also need to
>    manually pass the appropriate rb_flag value to specify that you're
>    using the feature.  All of this means more copy, paste & edit code
>    that is error-prone and a maintenance nightmare.  This implementation
>    allows the caller to pass the name of the struct member or leave the
>    parameter empty to mean "I'm not using this feature", thus
>    eliminating all of these other complications.
>
> Q: Using huge macro like RB_DEFINE_INTERFACE prone to usage errors that
>    create crappy error messages and have zero type-safety. (not really a
>    question)
> A: True.  However, much of this is mitigated by creating an
>    __rb_sanity_check_##name function that is never called, but will
>    generate meaningful error messages for most mistakes (incorrect
>    struct member types, etc.)
>
> Q: The traditional boolean comparitor passed to for sorted sets is a less_than
>    function, why are you using 'greater than'?
> A: This decision is purely for optimization purposes, as compare and
>    greather_than are interchangable when we don't care about equality.
>    However, this may become a moot point if we can't get gcc to properly
>    optimize code using the compare function, and switch to a pair of
>    equals/less functions.
>
> Revision History
> ===============
> New in v5
> o Added a ability to specify a different compare function for inserts.  This
>   is more efficient on trees with duplicate keys, since you can use a boolean
>   "greater than" function.
> o Added an optimization to generate better code where key size is 16, 32 or 64
>   bits.
> o Add test & validation framework (CONFIG_DEBUG_RBTREE and
>   CONFIG_DEBUG_RBTREE_VALIDATE)
> o Fixed bugs in kernel-doc so that API documentation generates correctly.
> o Add userspace test program & scripts.
> o Fixed a lot of typos
> o Cleaned up and completed kernel-doc comments
>
> New in v4:
> o Added type-safe wrapper functions for rb_{next,prev,first,last}
>   to RB_DEFINE_INTERFACE.  Naming is the same as other type-safe
>   functions (e.g.,  prefix##_first wraps rb_first). (thanks Pavel Pisa
>   for the suggestion)
> o Added rb_find_{first,next,last,prev} (for non-unique trees) to find
>   the first or last occurrence of a key and iterate through them.
>   Type-safe wrapper functions also added to RB_DEFINE_INTERFACE. (thanks
>   again Pavel Pisa)
> o Added support for an unsigned long count member of the container
>   struct that will be updated upon insertions & deletions.
> o Improve sanity checks performed by RB_DEFINE_INTERFACE -- error
>   messages are now more specific and clearer.  Type safety for compare
>   function is now enforced.
> o Completed implementation of insert_near (still untested).
> o Completed testing for find_near.  Performance is something like
>   O(log distance * 2 + 1), so if your start node is a bit closer than
>   half way across the tree, find_near will be about the same speed as
>   find. If it is further, it will be slower.  Either way, it is larger
>   than a normal find (which should be taken into account), so should
>   only be used when you are fairly certain your target objects is near
>   the start.
> o Added support for specifying modifiers for functions generated by
>   RB_DEFINE_INTERFACE.  This adds 4 more parameters, but is probably
>   better than forcing the user to write their own wrapper functions to
>   macro-generated wrapper functions, just to change their function
>   attributes.
> o Added run-time versions of all of the __rb_xxx_to_xxx inline
>   functions, for use in those conditions where someone may actually need
>   to access these using a run-time struct rb_relatinoship value.
> o Performed compile tests on gcc 3.4.6 - 4.7.0 and tweaked BUILD_BUG_ON*
>   macros to not fail on any of these compilers.
>
> New in v3:
> o Moved compare & augment functions back into struct rb_relationship
>   after discovering that calling them will be inlined in gcc 4.6+ if the
>   function is flattened.
> o Improved doc comments.
> o Solved problem of compare function not being checked for
>   type-correctness by adding a __sanity_check_##name() function to
>   __RB_DEFINE_INTERFACE that generates usable errors when there's a type
>   or member name problem in the macro parameters.  This is helpful since
>   the errors produced when the RB_RELATIONSHIP macro expands were quite
>   terrible.
>
> New in v2:
> o Added RB_RELATIONSHIP macro (thanks Peter Zijlstra for the
>   suggestions).
> o Added RB_DEFINE_INTERFACE macro.
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-09-26  5:07 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1348615467-9850-1-git-send-email-daniel.santos@pobox.com>
2012-09-25 23:29 ` [PATCH v5 1/25] compiler-gcc4.h: Correct verion check for __compiletime_error Daniel Santos
2012-09-25 23:30 ` [PATCH v5 2/25] compiler-gcc4.h: Reorder macros based upon gcc ver Daniel Santos
2012-09-25 23:30 ` [PATCH v5 3/25] compiler-gcc.h: Add gcc-recommended GCC_VERSION macro Daniel Santos
2012-09-25 23:30 ` [PATCH v5 4/25] compiler-gcc{3,4}.h: Use " Daniel Santos
2012-09-25 23:30 ` [PATCH v5 5/25] compiler{,-gcc4}.h: Remove duplicate macros Daniel Santos
2012-09-25 23:30 ` [PATCH v5 6/25] bug.h: Replace __linktime_error with __compiletime_error Daniel Santos
2012-09-25 23:30 ` [PATCH v5 7/25] compiler{,-gcc4}.h: Introduce __flatten function attribute Daniel Santos
2012-09-25 23:30 ` [PATCH v5 8/25] bug.h: Make BUILD_BUG_ON generate compile-time error Daniel Santos
2012-09-25 23:30 ` [PATCH v5 9/25] bug.h: Add BUILD_BUG_ON_NON_CONST macro Daniel Santos
2012-09-25 23:31 ` [PATCH v5 10/25] bug.h: Add gcc 4.2+ versions of BUILD_BUG_ON_* macros Daniel Santos
2012-09-25 23:31 ` [PATCH v5 11/25] rbtree.h: Generic Red-Black Trees Daniel Santos
2012-09-25 23:31 ` [PATCH v5 12/25] rbtree.h: include kconfig.h Daniel Santos
2012-09-25 23:31 ` [PATCH v5 13/25] fair.c: Use generic rbtree impl in fair scheduler Daniel Santos
2012-09-25 23:31 ` [PATCH v5 15/25] kernel-doc: bugfix - multi-line macros Daniel Santos
2012-09-25 23:31 ` [PATCH v5 16/25] kernel-doc: bugfix - empty line in Example section Daniel Santos
2012-09-25 23:31 ` [PATCH v5 17/25] kernel-doc: Don't mangle whitespace " Daniel Santos
2012-09-25 23:31 ` [PATCH v5 19/25] rbtree.h: add doc comments for struct rb_node Daniel Santos
2012-09-25 23:32 ` [PATCH v5 20/25] selftest: Add generic tree self-test common code Daniel Santos
2012-09-25 23:32 ` [PATCH v5 21/25] selftest: Add userspace test program Daniel Santos
2012-09-25 23:32 ` [PATCH v5 22/25] selftest: Add script to compile & run " Daniel Santos
2012-09-25 23:32 ` [PATCH v5 23/25] selftest: Add basic compiler iterator test script Daniel Santos
2012-09-25 23:32 ` [PATCH v5 24/25] selftest: report generation script for test results Daniel Santos
2012-09-25 23:32 ` [PATCH v5 25/25] rbtree.h: (optional?) Add RB_INSERT_DUPE_RIGHT flag Daniel Santos
     [not found] ` <1348618742.22822.39.camel@gandalf.local.home>
2012-09-26  1:02   ` [PATCH v5 0/25] Generic Red-Black Trees (still WIP) Daniel Santos
2012-09-26  1:28     ` Steven Rostedt
2012-09-26  5:07 ` Daniel Santos

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).