linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/25] Generic Red-Black Trees
@ 2012-09-28  1:54 Daniel Santos
  2012-09-28  1:54 ` [PATCH v6 1/25] compiler-gcc4.h: Correct verion check for __compiletime_error Daniel Santos
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Daniel Santos @ 2012-09-28  1:54 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Andrew Morton, Christopher Li, David Daney,
	David Howells, David Rientjes, David Woodhouse, Don Zickus,
	Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin
  Cc: Daniel Santos



This revised patch set is rebased onto linux-mmotm.

I have a new mail provider now, but they seem just as bad as the last, so I'm
going to try to split the recpients in half and pray that this goes through.
I'm sorry to those of you who have gotten partial patch sets and I hope to get
this fixed soon. (If anybody knows of a descent email service for developers
who send patches, please let me know.)

Summary
=======
This patch set improves on Andrea Arcangeli's original Red-Black Tree
implementation by adding generic search and insert functions with
complete support for:

o leftmost - keeps a pointer to the leftmost (lowest value) node cached
  in your container struct
o rightmost - ditto for rightmost (greatest value)
o count - optionally update an count variable when you perform inserts
  or deletes
o unique or non-unique keys
o find and insert "near" functions - when you already have a node that
  is likely near another one you want to search for
o type-safe wrapper interface available via pre-processor macro

Outstanding Issues
==================
General
-------
o Need something in Documents to explain generic rbtrees.
o Due to a bug in gcc's optimizer, extra instructions are generated in various
  places.  Pavel Pisa has provided me a possible work-around that should be
  examined more closely to see if it can be working in (Discussed in
  Performance section).
o Doc-comments are missing or out of date in some places for the new
  ins_compare field of struct rb_relationship (including at least one code
  example).

Selftests
---------
o In-kernel test module not completed.
o Userspace selftest's Makefile should run modules_prepare in KERNELDIR.
o Validation in self-tests doesn't yet cover tests for
  - insert_near
  - find_{first,last,next,prev}
o Selftest scripts need better portability (maybe solved? we'll see)
o It would be nice to have some fault-injection in test code to verify that
  CONFIG_DEBUG_GRBTREE and CONFIG_DEBUG_GRBTREE_VALIDATE (and it's
  RB_VERIFY_INTEGRITY counterpart flag) catch the errors they are supposed to.

Undecided (Opinions Requested!)
-------------------------------
o With the exception of the rb_node & rb_root structs, "Layer 2" of the code
  (see below) completely abstracts away the underlying red-black tree
  mechanism.  The structs rb_node and rb_root can also be abstracted away via
  a typeset or some other mechanism. Thus, should the "Layer 2" code be
  separated from "Layer 1" and renamed "Generic Tree (gtree)" or some such,
  paving the way for an alternate tree implementation in the future?
o Do we need RB_INSERT_DUPE_RIGHT? (see the last patch)


Theory of Operation
===================
Historically, genericity in C meant function pointers, the overhead of a
function call and the inability of the compiler to optimize code across
the function call boundary.  GCC has been getting better and better at
optimization and determining when a value is a compile-time constant and
compiling it out.  As of gcc 4.6, it has finally reached a point where
it's possible to have generic search & insert cores that optimize
exactly as well as if they were hand-coded. (see also gcc man page:
-findirect-inlining)

This implementation actually consists of two layers written on top of the
existing rbtree implementation.

Layer 1: Type-Specific (But Not Type-Safe)
------------------------------------------
The first layer consists of enum rb_flags, struct rb_relationship and
some generic inline functions(see patch for doc comments).

enum rb_flags {
	RB_HAS_LEFTMOST		= 0x00000001,
	RB_HAS_RIGHTMOST	= 0x00000002,
	RB_HAS_COUNT		= 0x00000004,
	RB_UNIQUE_KEYS		= 0x00000008,
	RB_INSERT_REPLACES	= 0x00000010,
	RB_IS_AUGMENTED		= 0x00000040,
	RB_VERIFY_USAGE		= 0x00000080,
	RB_VERIFY_INTEGRITY	= 0x00000100
};

struct rb_relationship {
	ssize_t root_offset;
	ssize_t left_offset;
	ssize_t right_offset;
	ssize_t count_offset;
	ssize_t node_offset;
	ssize_t key_offset;
	int flags;
	const rb_compare_f compare;     /* comparitor for lookups */
	const rb_compare_f ins_compare; /* comparitor for inserts */
	unsigned key_size;
};

/* these function for use on all trees */
struct rb_node *rb_find(
		struct rb_root *root,
		const void *key,
		const struct rb_relationship *rel);
struct rb_node *rb_find_near(
		struct rb_node *from,
		const void *key,
		const struct rb_relationship *rel);
struct rb_node *rb_insert(
		struct rb_root *root,
		struct rb_node *node,
		const struct rb_relationship *rel);
struct rb_node *rb_insert_near(
		struct rb_root *root,
		struct rb_node *start,
		struct rb_node *node,
		const struct rb_relationship *rel);
void rb_remove(	struct rb_root *root,
		struct rb_node *node,
		const struct rb_relationship *rel);

/* these function for use on trees with non-unique keys */
struct rb_node *rb_find_first(
		struct rb_root *root,
		const void *key,
		const struct rb_relationship *rel);
struct rb_node *rb_find_last(
		struct rb_root *root,
		const void *key,
		const struct rb_relationship *rel);
struct rb_node *rb_find_next(
		const struct rb_node *node,
		const struct rb_relationship *rel)
struct rb_node *rb_find_prev(
		const struct rb_node *node,
		const struct rb_relationship *rel)

Using this layer involves initializing a const struct rb_relationship
variable with compile-time constant values and feeding its "address" to
the generic inline functions.  The trick being, that (when gcc behaves
properly) it never creates a struct rb_relationship variable, stores an
initializer in the data section of the object file or passes a struct
rb_relationship pointer.  Instead, gcc "optimizes out" out the struct,
and uses the compile-time constant values to dictate how the inline
functions will expand.

Thus, this structure can be thought of both as a database's DDL (data
definition language), defining the relationship between two entities and the
template parameters to a C++ templatized function that controls how the
template function is instantiated.  This creates type-specific functions,
although type-safety is still not achieved (e.g., you can pass a pointer to
any rb_node you like).

To simplify usage, you can initialize your struct rb_relationship variable
with the RB_RELATIONSHIP macro, feeding it your types, member names and flags
and it will calculate the offsets for you.  See doc comments in patch for
examples of using this layer (either with or without the RB_RELATIONSHIP
macro).

Layer 2: Type-Safety
--------------------
In order to achieve type-safety of a generic interface in C, we must delve
deep into the darkened Swamps of The Preprocessor and confront the Prince of
Darkness himself: Big Ugly Macro.  To be fair, there is an alternative
solution (discussed in History & Design Goals), the so-called "x-macro" or
"supermacro" where you #define some pre-processor values and include an
unguarded header file.  With 17 parameters, I choose this solution for its
ease of use and brevity, but it's an area worth debate (some of which you can
find here if you wish: http://lwn.net/Articles/501876).

So this second layer allows you to use a single macro to define your
relationship as well as type-safe wrapper functions all in one go.

RB_DEFINE_INTERFACE(
	prefix,
	cont_type, root, left, right, count,
	obj_type, node, key,
	flags, compare, ins_compare,
	find_mod, insert_mod, find_near_mod, insert_near_mod)

To avoid needing multiple versions of the macro, we use a paradigm where
optional values can be left empty. (See RB_DEFINE_INTERFACE doc comments for
details.)  Thus, if your container doesn't need to know leftmost, you leave
the parameter empty.  Here's a quick example:

struct container {
	struct rb_root root;
	struct rb_node *leftmost;
	unsigned long count;
};

struct object {
	struct rb_node node;
	long key;
};

static inline long compare_long(const long *a, const long *b)
{
	return *a - *b;
}

RB_DEFINE_INTERFACE(
	my_objects,
	struct container, root, leftmost, /* no rightmost */, count,
	struct object, node, key,
	RB_UNIQUE_KEYS | RB_INSERT_REPLACES, compare_long, compare_long,
	,,,)

This will do some type-checking, create the struct rb_relationship and
the following static __always_inline wrapper functions. (Note that
"my_objects" is the prefix used in the example above.  It will be
whatever you pass as the first parameter to the RB_DEFINE_INTERFACE
macro.)

struct object *my_objects_find(
		struct container *cont,
		const typeof(((struct object *)0)->key) *_key);
struct object *my_objects_insert(
		struct container *cont,
		struct object *obj);
struct object *my_objects_find_near(
		struct object *near,
		const typeof(((struct object *)0)->key) *_key);
struct object *my_objects_insert_near(
		struct container *cont,
		struct object *near,
		struct object *obj);
void my_objects_remove(struct container *cont, struct object *obj);
struct object *my_objects_find_first(
		struct container *cont,
		const typeof(((struct object *)0)->key) *_key);
struct object *my_objects_find_last(
		struct container *cont,
		const typeof(((struct object *)0)->key) *_key);
struct object *my_objects_find_next(const struct object *obj);
struct object *my_objects_find_last(const struct object *obj);
struct object *my_objects_next(const struct object *obj);
struct object *my_objects_prev(const struct object *obj);
struct object *my_objects_first(struct container *cont);
struct object *my_objects_last(struct container *cont);

Each of these are each declared static __always_inline. However, you can
change the modifiers for the first four (find, insert, find_near and
insert_near) by populating any of the last 4 parameters with the function
modifiers of the respective function (when empty, they default to static
__always_inline).

Not only does this layer give you type-safety, it removes almost all of
the implementation details of the rbtree from the code using it, thus
making it easier to replace the underlying algorithm at some later
date.

Compare Functions
-----------------
Because equality is unimportant when doing inserts into a tree with duplicate
keys, struct rb_relationship's ins_compare field can be set to a greater-than
function for better performance. Using the example in the section above as a
model, this is what it would look like:

static inline long compare_long(const long *a, const long *b)
...
static inline long greater_long(const long *a, const long *b)
{
	return *a > *b;
}

RB_DEFINE_INTERFACE(
	my_objects,
	struct container, root, leftmost, /* no rightmost */, count,
	struct object, node, key,
	0, compare_long, greater_long,
	,,,)


History & Design Goals
======================
I've been through many iterations of various techniques searching for the
perfect "clean" implementation and finally settled on having a huge macro
expand to wrapper functions after exhausting all other alternatives. The trick
is that what one person considers a "clean" implementation is a bit of a value
judgment.  So by "clean", I mean balancing these requirements:

1.) minimal dependence on pre-processor
2.) avoiding pre-processor expanded code that will break debug
    information (backtraces)
3.) optimal encapsulation of the details of your rbtree in minimal
    source code (this is where you define the relationship between your
	container and contained objects, their types, keys, rather or not
	non-unique objects are allowed, etc.) -- preferably eliminating
	duplication of these details entirely.
4.) offering a complete feature-set in a single implementation (not
    multiple functions when various features are used)
5.) perfect optimization -- the generic function must be exactly as
    efficient as the hand-coded version

By those standards, the "cleanest" implementation I had come up with
actually used a different mechanism: defining an anonymous interface
struct something like this:

/* generic non-type-safe function */
static __always_inline void *__generic_func(void *obj);

struct {							\
        out_type *(*const func)(in_type *obj);			\
} name = {							\
        .func = (out_type *(*const)(in_type *obj))__generic_func;\
}

/* usage looks like this: */
DEFINE_INTERFACE(solution_a, struct something, struct something_else);
struct something *s;
struct something_else *se;
se = solution_a.func(s);

Sadly, while solution_a.func(s) optimizes perfectly in 4.6, it completely
bombed in 4.5 and prior -- the call by struct-member-function-pointer is never
inlined and nothing passed to it is every considered a compile-time constant
(again, see gcc's docs on -findirect-inline).  Because of the implementation
of the generic functions, this bloated the code unacceptably (3x larger).
Thus, I finally settled on the current RB_DEFINE_INTERFACE macro, which is
massive, but optimizes perfectly in 4.6+ and close enough in 4.5 and prior
(prior to 4.6, the compare function is never inlined).

The other alternative I briefly considered was to have a header file
that is only included after #defining all of these parameters, relying
primarily on cpp rather than cc & compile-time constants to fill in the
relationship details (the "x-macro" approach).  While this mechanism
would perform better on older compilers and never break backtraces, in
the end, I just couldn't stomach it.  Aside from that, it would make
using the interface almost as verbose as hand-coding it yourself.

Performance
===========
Here are the results of performance tests run on v5 of this patch set (against
v3.5 kernel) on an AMD Phenom 9850.  This is a reformatted version of what
tools/testing/selftests/grbtree/user/gen_report.sh outputs.  Test results vary
quite a bit dependent upon the selected features.

For all of these tests, I used the following parameters:
key range       0-4095
key type	u32
object_count    2048
repititions     131,072
node_size       24 bytes
object_size     32 bytes
total data size 65,536 bytes
num insertions	268,435,456

Below is a summary of the performance drop using generic rbtrees on various
ranges of compilers. (negative values are performance improvements)

GCC versions    Best    Worst
3.4 - 4.0	35%	80%
4.1 - 4.5	18%	23%
4.6 - 4.7	-7%	 5%

The tables below list the time in seconds it took to execute the tests on each
compiler and the difference between the generic and specific (i.e.,
hand-coded) test results (from v5 of patches against the 3.5 kernel).

Duplicate keys (no leftmost, rightmost or count)
Compiler       Generic Specific Performance Loss
gcc-3.4.6	33.41	18.78	77.94%
gcc-4.0.4	32.36	17.94	80.37%
gcc-4.1.2	23.11	17.76	30.14%
gcc-4.2.4	22.97	17.83	28.84%
gcc-4.3.6	23.07	17.78	29.79%
gcc-4.4.7	21.88	17.64	24.03%
gcc-4.5.4	21.75	17.54	23.99%
gcc-4.6.3	16.84	16.82	 0.10%
gcc-4.7.1	16.79	16.68	 0.66%

Duplicate keys, use leftmost (no rightmost or count)
Compiler       Generic Specific Performance Loss
gcc-3.4.6	33.54	22.57	48.63%
gcc-4.0.4	32.82	22.16	48.07%
gcc-4.1.2	27.30	22.77	19.93%
gcc-4.2.4	27.41	22.86	19.95%
gcc-4.3.6	28.65	23.03	24.38%
gcc-4.4.7	27.03	21.41	26.24%
gcc-4.5.4	26.69	22.48	18.71%
gcc-4.6.3	21.58	21.53	 0.24%
gcc-4.7.1	22.40	22.23	 0.77%

Duplicate keys, use leftmost, rightmost and count
Compiler       Generic Specific Performance Loss
gcc-3.4.6	33.49	22.70	47.52%
gcc-4.0.4	33.19	23.71	39.94%
gcc-4.1.2	29.03	23.76	22.18%
gcc-4.2.4	28.59	23.82	20.04%
gcc-4.3.6	29.69	23.94	24.01%
gcc-4.4.7	28.62	23.89	19.79%
gcc-4.5.4	28.73	23.54	22.04%
gcc-4.6.3	23.82	23.70	 0.51%
gcc-4.7.1	23.84	23.94	-0.40%

Unique keys (no leftmost, rightmost or count)
Compiler       Generic Specific Performance Loss
gcc-3.4.6	29.38	19.94	47.33%
gcc-4.0.4	28.85	21.14	36.48%
gcc-4.1.2	25.16	20.30	23.95%
gcc-4.2.4	25.26	20.50	23.23%
gcc-4.3.6	25.41	20.82	22.02%
gcc-4.4.7	26.12	20.68	26.33%
gcc-4.5.4	25.29	20.31	24.54%
gcc-4.6.3	21.57	20.35	 6.01%
gcc-4.7.1	20.98	20.20	 3.88%

Unique keys, use leftmost (no rightmost or count)
Compiler       Generic Specific Performance Loss
gcc-3.4.6	29.50	20.96	40.76%
gcc-4.0.4	28.93	20.90	38.41%
gcc-4.1.2	26.26	22.29	17.80%
gcc-4.2.4	25.49	22.05	15.61%
gcc-4.3.6	26.55	22.25	19.34%
gcc-4.4.7	28.90	22.24	29.92%
gcc-4.5.4	26.85	21.86	22.80%
gcc-4.6.3	22.95	22.06	 4.03%
gcc-4.7.1	22.56	21.48	 5.01%

Unique keys, use leftmost, rightmost and count
Compiler       Generic Specific Performance Loss
gcc-3.4.6	29.48	20.91	40.97%
gcc-4.0.4	29.37	21.72	35.20%
gcc-4.1.2	25.25	23.10	 9.29%
gcc-4.2.4	26.17	22.35	17.13%
gcc-4.3.6	26.34	22.30	18.10%
gcc-4.4.7	25.24	22.43	12.51%
gcc-4.5.4	25.58	23.07	10.89%
gcc-4.6.3	21.79	23.50	-7.29%
gcc-4.7.1	23.27	25.08	-7.22%


I've done an analysis of the gcc 4.7.1-generated code and discovered the
following flaws in the generic insert function.

1. Key of inserted object being read repeatedly. Instead of reading the value
   of the inserted key once, at the start of the function, the key is read
   prior to each comparision.  I'm guessing that this is because optimizer
   makes the faulty assumption that the value could change throughout the
   course of execution. This costs us one extra instruction each iteration of
   the loop as we search the tree (32-bit key).

     mov    0x18(%rax),%edx

   A work-around is in place to eliminate this problem on gcc 4.6.0 and later
   if your key size is 16, 32 or 64 bits, which manages to get gcc to store
   the key of the supplied object in a regsiter at the start of the function
   preventing us a performance loss of roughly 4%.

2. Due to gcc bug 3507 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507),
   this code:

     long diff = a - b;

     if (diff > 0)
         do_gt();
     else if (diff < 0)
         do_lt();
     else
         do_eq();

   Optimizes more poorly than this code:

     if (a > b)
         do_gt();
     else if (b < a)
         do_lt();
     else
         do_eq();

   So instead of the key compare happening like this (64-bit key):

     cmp    0x18(%rax),%rsi

   We get this:

     mov    %rsi,%rdx
     sub    0x18(%rax),%rdx
     cmp    $0x0,%rdx

   The results can be slightly worse when the key type isn't the same as long.
   With a signed 32-bit key (s32) on x86_64, gcc thinks it needs to convert
   the difference to a 64-bit long.

     mov    %esi,%edx
     sub    0x18(%rax),%edx
     movslq %edx,%rdx
     cmp    $0x0,%rdx

   Not only is this 2-3 extra instruction, it also uses one extra register,
   which in turn forces gcc to use an r8-15 register in other places, which
   requires larger opcodes. Also, this only occurs when using the normal
   compare function (doesn't occur when using 'greater').  So this affects
   inserts on trees with unique keys and all lookups.

Q&A
===
Q: Why did you add BUILD_BUG_ON_NON_CONST() and
   BUILD_BUG_ON_NON_CONST42()?
A: There were initially enough BUILD_BUG_ON(!__builtin_constant_p(arg))
   calls to warrant it having a macro for it.  However, I've since
   discovered that using __builtin_constant_p on a struct member did not
   behave very consistently, so after writing some test programs &
   scripts, and refining 200k+ test results, I graphed out basically
   where __builtin_constant_p() worked and didn't.  As it turns out,
   using it on struct members is fragile until gcc 4.2, so
   BUILD_BUG_ON_NON_CONST42() is intended for use with struct members.

Q: Why empty parameters?
   What is IFF_EMPTY() for?
   Why don't you just pass zero instead of an empty parameter?
A: Support for caching the left- & right-most nodes in the tree as well
   as maintaining a count variable are all optional.  Passing the offset
   value directly not only means more characters of code to use the
   RB_RELATIONSHIP and RB_DEFINE_INTERFACE macros (because now you'll
   have to invoke the offsetof macro, supplying your struct types
   again), but the offset may actually be zero, so passing zero as "I'm
   not using this feature" wont work.  (This is the reason why the flags
   RB_HAS_LEFTMOST, et. al. exist.)  Thus, you would also need to
   manually pass the appropriate rb_flag value to specify that you're
   using the feature.  All of this means more copy, paste & edit code
   that is error-prone and a maintenance nightmare.  This implementation
   allows the caller to pass the name of the struct member or leave the
   parameter empty to mean "I'm not using this feature", thus
   eliminating all of these other complications.

Q: Using huge macro like RB_DEFINE_INTERFACE prone to usage errors that
   create crappy error messages and have zero type-safety. (not really a
   question)
A: True.  However, much of this is mitigated by creating an
   __rb_sanity_check_##name function that is never called, but will
   generate meaningful error messages for most mistakes (incorrect
   struct member types, etc.)

Q: The traditional boolean comparitor passed to for sorted sets is a less_than
   function, why are you using 'greater than'?
A: This decision is purely for optimization purposes, as compare and
   greather_than are interchangable when we don't care about equality.
   However, this may become a moot point if we can't get gcc to properly
   optimize code using the compare function, and switch to a pair of
   equals/less functions.

Revision History
===============
New in v6:
o Rebased onto linux-next.
o Removed augmented suport which is now redundant.  This should give us a
  little speed back on those older compilers.
o Renamed CONFIG_RBTREE* to CONFIG_GRBTREE* to avoid conflicts with Michel
  Lespinasse's new code.
o Added support to selftests for a payload on the test objects to see how
  performance changes as the size of the objects grow.
o Various other enhancements to test code & scripts.
o Dumped my email provider for silently refusing to send to more than 20
  recipients (but sending emails 20 random recipients anyway).

New in v5:
o Added a ability to specify a different compare function for inserts.  This
  is more efficient on trees with duplicate keys, since you can use a boolean
  "greater than" function.
o Added an optimization to generate better code where key size is 16, 32 or 64
  bits.
o Add test & validation framework (CONFIG_DEBUG_RBTREE and
  CONFIG_DEBUG_RBTREE_VALIDATE)
o Fixed bugs in kernel-doc so that API documentation generates correctly.
o Add userspace test program & scripts.
o Fixed a lot of typos
o Cleaned up and completed kernel-doc comments

New in v4:
o Added type-safe wrapper functions for rb_{next,prev,first,last}
  to RB_DEFINE_INTERFACE.  Naming is the same as other type-safe
  functions (e.g.,  prefix##_first wraps rb_first). (thanks Pavel Pisa
  for the suggestion)
o Added rb_find_{first,next,last,prev} (for non-unique trees) to find
  the first or last occurrence of a key and iterate through them.
  Type-safe wrapper functions also added to RB_DEFINE_INTERFACE. (thanks
  again Pavel Pisa)
o Added support for an unsigned long count member of the container
  struct that will be updated upon insertions & deletions.
o Improve sanity checks performed by RB_DEFINE_INTERFACE -- error
  messages are now more specific and clearer.  Type safety for compare
  function is now enforced.
o Completed implementation of insert_near (still untested).
o Completed testing for find_near.  Performance is something like
  O(log distance * 2 + 1), so if your start node is a bit closer than
  half way across the tree, find_near will be about the same speed as
  find. If it is further, it will be slower.  Either way, it is larger
  than a normal find (which should be taken into account), so should
  only be used when you are fairly certain your target objects is near
  the start.
o Added support for specifying modifiers for functions generated by
  RB_DEFINE_INTERFACE.  This adds 4 more parameters, but is probably
  better than forcing the user to write their own wrapper functions to
  macro-generated wrapper functions, just to change their function
  attributes.
o Added run-time versions of all of the __rb_xxx_to_xxx inline
  functions, for use in those conditions where someone may actually need
  to access these using a run-time struct rb_relatinoship value.
o Performed compile tests on gcc 3.4.6 - 4.7.0 and tweaked BUILD_BUG_ON*
  macros to not fail on any of these compilers.

New in v3:
o Moved compare & augment functions back into struct rb_relationship
  after discovering that calling them will be inlined in gcc 4.6+ if the
  function is flattened.
o Improved doc comments.
o Solved problem of compare function not being checked for
  type-correctness by adding a __sanity_check_##name() function to
  __RB_DEFINE_INTERFACE that generates usable errors when there's a type
  or member name problem in the macro parameters.  This is helpful since
  the errors produced when the RB_RELATIONSHIP macro expands were quite
  terrible.

New in v2:
o Added RB_RELATIONSHIP macro (thanks Peter Zijlstra for the
  suggestions).
o Added RB_DEFINE_INTERFACE macro.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 1/25] compiler-gcc4.h: Correct verion check for __compiletime_error
  2012-09-28  1:54 [PATCH v6 0/25] Generic Red-Black Trees Daniel Santos
@ 2012-09-28  1:54 ` Daniel Santos
  2012-09-28  1:54 ` [PATCH v6 2/25] compiler-gcc4.h: Reorder macros based upon gcc ver Daniel Santos
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Daniel Santos @ 2012-09-28  1:54 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Andrew Morton, Christopher Li, David Daney,
	David Howells, David Rientjes, David Woodhouse, Don Zickus,
	Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin
  Cc: Daniel Santos

__attribute__((error(msg))) was introduced in gcc 4.3 (not 4.4) and as I
was unable to find any gcc bugs pertaining to it, I'm presuming that it
has functioned as advertised since 4.3.0.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc4.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 997fd8a..8721704 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -59,7 +59,7 @@
 #if __GNUC_MINOR__ > 0
 #define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
 #endif
-#if __GNUC_MINOR__ >= 4 && !defined(__CHECKER__)
+#if __GNUC_MINOR__ >= 3 && !defined(__CHECKER__)
 #define __compiletime_warning(message) __attribute__((warning(message)))
 #define __compiletime_error(message) __attribute__((error(message)))
 #endif
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v6 2/25] compiler-gcc4.h: Reorder macros based upon gcc ver
  2012-09-28  1:54 [PATCH v6 0/25] Generic Red-Black Trees Daniel Santos
  2012-09-28  1:54 ` [PATCH v6 1/25] compiler-gcc4.h: Correct verion check for __compiletime_error Daniel Santos
@ 2012-09-28  1:54 ` Daniel Santos
  2012-09-28  1:54 ` [PATCH v6 3/25] compiler-gcc.h: Add gcc-recommended GCC_VERSION macro Daniel Santos
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Daniel Santos @ 2012-09-28  1:54 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Andrew Morton, Christopher Li, David Daney,
	David Howells, David Rientjes, David Woodhouse, Don Zickus,
	Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin
  Cc: Daniel Santos

This helps to keep the file from getting confusing, removes one
duplicate version check and should encourage future editors to put new
macros where they belong.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc4.h |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 8721704..4506d65 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -13,6 +13,10 @@
 #define __must_check 		__attribute__((warn_unused_result))
 #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
 
+#if __GNUC_MINOR__ > 0
+# define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
+#endif
+
 #if __GNUC_MINOR__ >= 3
 /* Mark functions as cold. gcc will assume any path leading to a call
    to them will be unlikely.  This means a lot of manual unlikely()s
@@ -31,6 +35,12 @@
 
 #define __linktime_error(message) __attribute__((__error__(message)))
 
+#ifndef __CHECKER__
+# define __compiletime_warning(message) __attribute__((warning(message)))
+# define __compiletime_error(message) __attribute__((error(message)))
+#endif /* __CHECKER__ */
+#endif /* __GNUC_MINOR__ >= 3 */
+
 #if __GNUC_MINOR__ >= 5
 /*
  * Mark a position in code as unreachable.  This can be used to
@@ -46,8 +56,7 @@
 /* Mark a function definition as prohibited from being cloned. */
 #define __noclone	__attribute__((__noclone__))
 
-#endif
-#endif
+#endif /* __GNUC_MINOR__ >= 5 */
 
 #if __GNUC_MINOR__ >= 6
 /*
@@ -56,10 +65,3 @@
 #define __visible __attribute__((externally_visible))
 #endif
 
-#if __GNUC_MINOR__ > 0
-#define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
-#endif
-#if __GNUC_MINOR__ >= 3 && !defined(__CHECKER__)
-#define __compiletime_warning(message) __attribute__((warning(message)))
-#define __compiletime_error(message) __attribute__((error(message)))
-#endif
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v6 3/25] compiler-gcc.h: Add gcc-recommended GCC_VERSION macro
  2012-09-28  1:54 [PATCH v6 0/25] Generic Red-Black Trees Daniel Santos
  2012-09-28  1:54 ` [PATCH v6 1/25] compiler-gcc4.h: Correct verion check for __compiletime_error Daniel Santos
  2012-09-28  1:54 ` [PATCH v6 2/25] compiler-gcc4.h: Reorder macros based upon gcc ver Daniel Santos
@ 2012-09-28  1:54 ` Daniel Santos
  2012-09-28  1:54 ` [PATCH v6 4/25] compiler-gcc{3,4}.h: Use " Daniel Santos
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Daniel Santos @ 2012-09-28  1:54 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Andrew Morton, Christopher Li, David Daney,
	David Howells, David Rientjes, David Woodhouse, Don Zickus,
	Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin
  Cc: Daniel Santos

Throughout compiler*.h, many version checks are made.  These can be
simplified by using the macro that gcc's documentation recommends.
However, my primary reason for adding this is that I need bug-check
macros that are enabled at certain gcc versions and it's cleaner to use
this macro than the tradition method:

if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ => 2)

If you add patch level, it gets this ugly:

if __GNUC__ > 4 || (__GNUC__ == 4 && (__GNUC_MINOR__ > 2 || \
   __GNUC_MINOR__ == 2 __GNUC_PATCHLEVEL__ >= 1))

As opposed to:

if GCC_VERSION >= 40201

While having separate headers for gcc 3 & 4 eliminates some of this
verbosity, they can still be cleaned up by this.

See also:
http://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc.h |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 6a6d7ae..24545cd 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -5,6 +5,9 @@
 /*
  * Common definitions for all gcc versions go here.
  */
+#define GCC_VERSION (__GNUC__ * 10000 \
+		   + __GNUC_MINOR__ * 100 \
+		   + __GNUC_PATCHLEVEL__)
 
 
 /* Optimization barrier */
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v6 4/25] compiler-gcc{3,4}.h: Use GCC_VERSION macro
  2012-09-28  1:54 [PATCH v6 0/25] Generic Red-Black Trees Daniel Santos
                   ` (2 preceding siblings ...)
  2012-09-28  1:54 ` [PATCH v6 3/25] compiler-gcc.h: Add gcc-recommended GCC_VERSION macro Daniel Santos
@ 2012-09-28  1:54 ` Daniel Santos
  2012-10-01 19:39   ` Andrew Morton
  2012-09-28  1:54 ` [PATCH v6 5/25] compiler{,-gcc4}.h: Remove duplicate macros Daniel Santos
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Daniel Santos @ 2012-09-28  1:54 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Andrew Morton, Christopher Li, David Daney,
	David Howells, David Rientjes, David Woodhouse, Don Zickus,
	Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin
  Cc: Daniel Santos

Using GCC_VERSION reduces complexity, is easier to read and is GCC's
recommended mechanism for doing version checks. (Just don't ask me why
they didn't define it in the first place.)  This also makes it easy to
merge compiler-gcc{3,4}.h should somebody want to.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc3.h |    8 ++++----
 include/linux/compiler-gcc4.h |   14 +++++++-------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/compiler-gcc3.h b/include/linux/compiler-gcc3.h
index 37d4124..7d89feb 100644
--- a/include/linux/compiler-gcc3.h
+++ b/include/linux/compiler-gcc3.h
@@ -2,22 +2,22 @@
 #error "Please don't include <linux/compiler-gcc3.h> directly, include <linux/compiler.h> instead."
 #endif
 
-#if __GNUC_MINOR__ < 2
+#if GCC_VERSION < 30200
 # error Sorry, your compiler is too old - please upgrade it.
 #endif
 
-#if __GNUC_MINOR__ >= 3
+#if GCC_VERSION >= 30300
 # define __used			__attribute__((__used__))
 #else
 # define __used			__attribute__((__unused__))
 #endif
 
-#if __GNUC_MINOR__ >= 4
+#if GCC_VERSION >= 30400
 #define __must_check		__attribute__((warn_unused_result))
 #endif
 
 #ifdef CONFIG_GCOV_KERNEL
-# if __GNUC_MINOR__ < 4
+# if GCC_VERSION < 30400
 #   error "GCOV profiling support for gcc versions below 3.4 not included"
 # endif /* __GNUC_MINOR__ */
 #endif /* CONFIG_GCOV_KERNEL */
diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 4506d65..b44307d 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -4,7 +4,7 @@
 
 /* GCC 4.1.[01] miscompiles __weak */
 #ifdef __KERNEL__
-# if __GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ <= 1
+# if GCC_VERSION >= 40100 &&  GCC_VERSION <= 40101
 //#  error Your version of gcc miscompiles the __weak directive
 # endif
 #endif
@@ -13,11 +13,11 @@
 #define __must_check 		__attribute__((warn_unused_result))
 #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
 
-#if __GNUC_MINOR__ > 0
+#if GCC_VERSION >= 40102
 # define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
 #endif
 
-#if __GNUC_MINOR__ >= 3
+#if GCC_VERSION >= 40300
 /* Mark functions as cold. gcc will assume any path leading to a call
    to them will be unlikely.  This means a lot of manual unlikely()s
    are unnecessary now for any paths leading to the usual suspects
@@ -39,9 +39,9 @@
 # define __compiletime_warning(message) __attribute__((warning(message)))
 # define __compiletime_error(message) __attribute__((error(message)))
 #endif /* __CHECKER__ */
-#endif /* __GNUC_MINOR__ >= 3 */
+#endif /* GCC_VERSION >= 40300 */
 
-#if __GNUC_MINOR__ >= 5
+#if GCC_VERSION >= 40500
 /*
  * Mark a position in code as unreachable.  This can be used to
  * suppress control flow warnings after asm blocks that transfer
@@ -56,9 +56,9 @@
 /* Mark a function definition as prohibited from being cloned. */
 #define __noclone	__attribute__((__noclone__))
 
-#endif /* __GNUC_MINOR__ >= 5 */
+#endif /* GCC_VERSION >= 40500 */
 
-#if __GNUC_MINOR__ >= 6
+#if GCC_VERSION >= 40600
 /*
  * Tell the optimizer that something else uses this function or variable.
  */
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v6 5/25] compiler{,-gcc4}.h: Remove duplicate macros
  2012-09-28  1:54 [PATCH v6 0/25] Generic Red-Black Trees Daniel Santos
                   ` (3 preceding siblings ...)
  2012-09-28  1:54 ` [PATCH v6 4/25] compiler-gcc{3,4}.h: Use " Daniel Santos
@ 2012-09-28  1:54 ` Daniel Santos
  2012-09-28  1:54 ` [PATCH v6 6/25] bug.h: Replace __linktime_error with __compiletime_error Daniel Santos
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Daniel Santos @ 2012-09-28  1:54 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Andrew Morton, Christopher Li, David Daney,
	David Howells, David Rientjes, David Woodhouse, Don Zickus,
	Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin
  Cc: Daniel Santos

__linktime_error() does the same thing as __compiletime_error() and is
only used in bug.h.  Since the macro defines a function attribute that
will cause a failure at compile-time (not link-time), it makes more
sense to keep __compiletime_error(), which is also neatly mated with
__compiletime_warning().

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc4.h |    2 --
 include/linux/compiler.h      |    3 ---
 2 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index b44307d..ad610f2 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -33,8 +33,6 @@
    the kernel context */
 #define __cold			__attribute__((__cold__))
 
-#define __linktime_error(message) __attribute__((__error__(message)))
-
 #ifndef __CHECKER__
 # define __compiletime_warning(message) __attribute__((warning(message)))
 # define __compiletime_error(message) __attribute__((error(message)))
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index f430e41..fd455aa 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -297,9 +297,6 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
 #ifndef __compiletime_error
 # define __compiletime_error(message)
 #endif
-#ifndef __linktime_error
-# define __linktime_error(message)
-#endif
 /*
  * Prevent the compiler from merging or refetching accesses.  The compiler
  * is also forbidden from reordering successive instances of ACCESS_ONCE(),
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v6 6/25] bug.h: Replace __linktime_error with __compiletime_error
  2012-09-28  1:54 [PATCH v6 0/25] Generic Red-Black Trees Daniel Santos
                   ` (4 preceding siblings ...)
  2012-09-28  1:54 ` [PATCH v6 5/25] compiler{,-gcc4}.h: Remove duplicate macros Daniel Santos
@ 2012-09-28  1:54 ` Daniel Santos
  2012-09-28  1:54 ` [PATCH v6 7/25] compiler{,-gcc4}.h: Introduce __flatten function attribute Daniel Santos
  2012-10-01 19:43 ` [PATCH v6 0/25] Generic Red-Black Trees Andrew Morton
  7 siblings, 0 replies; 15+ messages in thread
From: Daniel Santos @ 2012-09-28  1:54 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Andrew Morton, Christopher Li, David Daney,
	David Howells, David Rientjes, David Woodhouse, Don Zickus,
	Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin
  Cc: Daniel Santos

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/bug.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index aaac4bb..298a916 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -73,7 +73,7 @@ extern int __build_bug_on_failed;
 #define BUILD_BUG()						\
 	do {							\
 		extern void __build_bug_failed(void)		\
-			__linktime_error("BUILD_BUG failed");	\
+			__compiletime_error("BUILD_BUG failed");\
 		__build_bug_failed();				\
 	} while (0)
 
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v6 7/25] compiler{,-gcc4}.h: Introduce __flatten function attribute
  2012-09-28  1:54 [PATCH v6 0/25] Generic Red-Black Trees Daniel Santos
                   ` (5 preceding siblings ...)
  2012-09-28  1:54 ` [PATCH v6 6/25] bug.h: Replace __linktime_error with __compiletime_error Daniel Santos
@ 2012-09-28  1:54 ` Daniel Santos
  2012-10-01 19:43 ` [PATCH v6 0/25] Generic Red-Black Trees Andrew Morton
  7 siblings, 0 replies; 15+ messages in thread
From: Daniel Santos @ 2012-09-28  1:54 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Andrew Morton, Christopher Li, David Daney,
	David Howells, David Rientjes, David Woodhouse, Don Zickus,
	Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin
  Cc: Daniel Santos

For gcc 4.1 & later, expands to __attribute__((flatten)) which forces
the compiler to inline everything it can into the function.  This is
useful in combination with noinline when you want to control the depth
of inlining, or create a single function where inline expansions will
occur. (see
http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html#index-g_t_0040code_007bflatten_007d-function-attribute-2512)

Normally, it's best to leave this type of thing up to the compiler.
However, the generic rbtree code uses inline functions just to be able
to inject compile-time constant data that specifies how the caller wants
the function to behave (via struct rb_relationship).  This data can be
thought of as the template parameters of a C++ templatized function.
Since some of these functions, once expanded, become quite large, gcc
sometimes decides not to perform some important inlining, in one case,
even generating a few bytes more code by not doing so. (Note: I have not
eliminated the possibility that this was an optimization bug, but the
flatten attribute fixes it in either case.)

Combining __flatten and noinline insures that important optimizations
occur in these cases and that the inline expansion occurs in exactly one
place, thus not leading to unnecissary bloat. However, it also can
eliminate some opportunities for optimization should gcc otherwise
decide the function its self is a good candidate for inlining.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 include/linux/compiler-gcc4.h |    7 ++++++-
 include/linux/compiler.h      |    4 ++++
 2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index ad610f2..5a0897e 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -15,7 +15,12 @@
 
 #if GCC_VERSION >= 40102
 # define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
-#endif
+
+/* flatten introduced in 4.1, but broken in 4.6.0 (gcc bug #48731)*/
+# if GCC_VERSION != 40600
+#  define __flatten __attribute__((flatten))
+# endif
+#endif /* GCC_VERSION >= 40102 */
 
 #if GCC_VERSION >= 40300
 /* Mark functions as cold. gcc will assume any path leading to a call
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index fd455aa..268aeb6 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -244,6 +244,10 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
 #define __always_inline inline
 #endif
 
+#ifndef __flatten
+#define __flatten
+#endif
+
 #endif /* __KERNEL__ */
 
 /*
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 4/25] compiler-gcc{3,4}.h: Use GCC_VERSION macro
  2012-09-28  1:54 ` [PATCH v6 4/25] compiler-gcc{3,4}.h: Use " Daniel Santos
@ 2012-10-01 19:39   ` Andrew Morton
  2012-10-01 20:32     ` Josh Triplett
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2012-10-01 19:39 UTC (permalink / raw)
  To: Daniel Santos
  Cc: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Christopher Li, David Daney, David Howells,
	David Rientjes, David Woodhouse, Don Zickus, Greg Kroah-Hartman,
	Hidetoshi Seto, H. Peter Anvin

On Thu, 27 Sep 2012 20:54:20 -0500
Daniel Santos <daniel.santos@pobox.com> wrote:

> Using GCC_VERSION reduces complexity, is easier to read and is GCC's
> recommended mechanism for doing version checks. (Just don't ask me why
> they didn't define it in the first place.)  This also makes it easy to
> merge compiler-gcc{3,4}.h should somebody want to.
> 
> Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
> ---
>  include/linux/compiler-gcc3.h |    8 ++++----
>  include/linux/compiler-gcc4.h |   14 +++++++-------
>  2 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/compiler-gcc3.h b/include/linux/compiler-gcc3.h
> index 37d4124..7d89feb 100644
> --- a/include/linux/compiler-gcc3.h
> +++ b/include/linux/compiler-gcc3.h
> @@ -2,22 +2,22 @@
>  #error "Please don't include <linux/compiler-gcc3.h> directly, include <linux/compiler.h> instead."
>  #endif
>  
> -#if __GNUC_MINOR__ < 2
> +#if GCC_VERSION < 30200
>  # error Sorry, your compiler is too old - please upgrade it.
>  #endif
>  
> -#if __GNUC_MINOR__ >= 3
> +#if GCC_VERSION >= 30300
>  # define __used			__attribute__((__used__))
>  #else
>  # define __used			__attribute__((__unused__))
>  #endif
>  
> -#if __GNUC_MINOR__ >= 4
> +#if GCC_VERSION >= 30400
>  #define __must_check		__attribute__((warn_unused_result))
>  #endif
>  
>  #ifdef CONFIG_GCOV_KERNEL
> -# if __GNUC_MINOR__ < 4
> +# if GCC_VERSION < 30400
>  #   error "GCOV profiling support for gcc versions below 3.4 not included"
>  # endif /* __GNUC_MINOR__ */
>  #endif /* CONFIG_GCOV_KERNEL */
> diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
> index 4506d65..b44307d 100644
> --- a/include/linux/compiler-gcc4.h
> +++ b/include/linux/compiler-gcc4.h
> @@ -4,7 +4,7 @@
>  
>  /* GCC 4.1.[01] miscompiles __weak */
>  #ifdef __KERNEL__
> -# if __GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ <= 1
> +# if GCC_VERSION >= 40100 &&  GCC_VERSION <= 40101
>  //#  error Your version of gcc miscompiles the __weak directive
>  # endif
>  #endif
> @@ -13,11 +13,11 @@
>  #define __must_check 		__attribute__((warn_unused_result))
>  #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
>  
> -#if __GNUC_MINOR__ > 0
> +#if GCC_VERSION >= 40102

Is this correct (and clear)?  I'd expect

	#if GCC_VERSION > 40000

>  # define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
>  #endif
>  
> -#if __GNUC_MINOR__ >= 3
> +#if GCC_VERSION >= 40300
>  /* Mark functions as cold. gcc will assume any path leading to a call
>     to them will be unlikely.  This means a lot of manual unlikely()s
>     are unnecessary now for any paths leading to the usual suspects
> @@ -39,9 +39,9 @@
>  # define __compiletime_warning(message) __attribute__((warning(message)))
>  # define __compiletime_error(message) __attribute__((error(message)))
>  #endif /* __CHECKER__ */
> -#endif /* __GNUC_MINOR__ >= 3 */
> +#endif /* GCC_VERSION >= 40300 */
>  
> -#if __GNUC_MINOR__ >= 5
> +#if GCC_VERSION >= 40500
>  /*
>   * Mark a position in code as unreachable.  This can be used to
>   * suppress control flow warnings after asm blocks that transfer
> @@ -56,9 +56,9 @@
>  /* Mark a function definition as prohibited from being cloned. */
>  #define __noclone	__attribute__((__noclone__))
>  
> -#endif /* __GNUC_MINOR__ >= 5 */
> +#endif /* GCC_VERSION >= 40500 */
>  
> -#if __GNUC_MINOR__ >= 6
> +#if GCC_VERSION >= 40600
>  /*
>   * Tell the optimizer that something else uses this function or variable.
>   */
> -- 
> 1.7.3.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 0/25] Generic Red-Black Trees
  2012-09-28  1:54 [PATCH v6 0/25] Generic Red-Black Trees Daniel Santos
                   ` (6 preceding siblings ...)
  2012-09-28  1:54 ` [PATCH v6 7/25] compiler{,-gcc4}.h: Introduce __flatten function attribute Daniel Santos
@ 2012-10-01 19:43 ` Andrew Morton
  2012-10-01 20:41   ` Daniel Santos
  7 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2012-10-01 19:43 UTC (permalink / raw)
  To: Daniel Santos
  Cc: linux-kernel, linux-doc, linux-sparse, Akinobu Mita, Andi Kleen,
	Andrea Arcangeli, Christopher Li, David Daney, David Howells,
	David Rientjes, David Woodhouse, Don Zickus, Greg Kroah-Hartman,
	Hidetoshi Seto, H. Peter Anvin

On Thu, 27 Sep 2012 20:54:16 -0500
Daniel Santos <daniel.santos@pobox.com> wrote:

> This patch set improves on Andrea Arcangeli's original Red-Black Tree
> implementation by adding generic search and insert functions with
> complete support for:

I grabbed patches 1-7, but I don't expect to send them in for 3.7. 
It's not a good time to be merging new material, but I like cleanups.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 4/25] compiler-gcc{3,4}.h: Use GCC_VERSION macro
  2012-10-01 19:39   ` Andrew Morton
@ 2012-10-01 20:32     ` Josh Triplett
  2012-10-01 20:43       ` Daniel Santos
  0 siblings, 1 reply; 15+ messages in thread
From: Josh Triplett @ 2012-10-01 20:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Santos, linux-kernel, linux-doc, linux-sparse,
	Akinobu Mita, Andi Kleen, Andrea Arcangeli, Christopher Li,
	David Daney, David Howells, David Rientjes, David Woodhouse,
	Don Zickus, Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin

On Mon, Oct 01, 2012 at 12:39:44PM -0700, Andrew Morton wrote:
> On Thu, 27 Sep 2012 20:54:20 -0500
> Daniel Santos <daniel.santos@pobox.com> wrote:
> 
> > Using GCC_VERSION reduces complexity, is easier to read and is GCC's
> > recommended mechanism for doing version checks. (Just don't ask me why
> > they didn't define it in the first place.)  This also makes it easy to
> > merge compiler-gcc{3,4}.h should somebody want to.
> > 
> > Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
> > ---
> >  include/linux/compiler-gcc3.h |    8 ++++----
> >  include/linux/compiler-gcc4.h |   14 +++++++-------
> >  2 files changed, 11 insertions(+), 11 deletions(-)
> > 
> > diff --git a/include/linux/compiler-gcc3.h b/include/linux/compiler-gcc3.h
> > index 37d4124..7d89feb 100644
> > --- a/include/linux/compiler-gcc3.h
> > +++ b/include/linux/compiler-gcc3.h
> > @@ -2,22 +2,22 @@
> >  #error "Please don't include <linux/compiler-gcc3.h> directly, include <linux/compiler.h> instead."
> >  #endif
> >  
> > -#if __GNUC_MINOR__ < 2
> > +#if GCC_VERSION < 30200
> >  # error Sorry, your compiler is too old - please upgrade it.
> >  #endif
> >  
> > -#if __GNUC_MINOR__ >= 3
> > +#if GCC_VERSION >= 30300
> >  # define __used			__attribute__((__used__))
> >  #else
> >  # define __used			__attribute__((__unused__))
> >  #endif
> >  
> > -#if __GNUC_MINOR__ >= 4
> > +#if GCC_VERSION >= 30400
> >  #define __must_check		__attribute__((warn_unused_result))
> >  #endif
> >  
> >  #ifdef CONFIG_GCOV_KERNEL
> > -# if __GNUC_MINOR__ < 4
> > +# if GCC_VERSION < 30400
> >  #   error "GCOV profiling support for gcc versions below 3.4 not included"
> >  # endif /* __GNUC_MINOR__ */
> >  #endif /* CONFIG_GCOV_KERNEL */
> > diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
> > index 4506d65..b44307d 100644
> > --- a/include/linux/compiler-gcc4.h
> > +++ b/include/linux/compiler-gcc4.h
> > @@ -4,7 +4,7 @@
> >  
> >  /* GCC 4.1.[01] miscompiles __weak */
> >  #ifdef __KERNEL__
> > -# if __GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ <= 1
> > +# if GCC_VERSION >= 40100 &&  GCC_VERSION <= 40101
> >  //#  error Your version of gcc miscompiles the __weak directive
> >  # endif
> >  #endif
> > @@ -13,11 +13,11 @@
> >  #define __must_check 		__attribute__((warn_unused_result))
> >  #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
> >  
> > -#if __GNUC_MINOR__ > 0
> > +#if GCC_VERSION >= 40102
> 
> Is this correct (and clear)?  I'd expect
> 
> 	#if GCC_VERSION > 40000

GCC_VERSION >= 40100, actually; __GNUC_MINOR__ > 0 implies
__GNUC_MINOR__ >= 1, whereas the version you wrote would accept 4.0.1.

- Josh Triplett

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 0/25] Generic Red-Black Trees
  2012-10-01 19:43 ` [PATCH v6 0/25] Generic Red-Black Trees Andrew Morton
@ 2012-10-01 20:41   ` Daniel Santos
  2012-10-01 20:47     ` Andrew Morton
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Santos @ 2012-10-01 20:41 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, Josh Triplett, Borislav Petkov, Steven Rostedt

Andrew,

I'm really sorry for the debacle of this round of patches.  It turns out
that my patches weren't reaching LKML because my recipient list was too
large
and the server was tagging it as spam, so none of these patches you
committed
ever made it to LKML. :(  To fix that, I broke the 25 patches
into 3 smaller sets.

[PATCH 0/10] Cleanup & new features for compiler*.h and bug.h
[PATCH 0/3] kernel-doc bug fixes
[PATCH v6 0/10] Generic Red-Black Trees

On 10/01/2012 02:43 PM, Andrew Morton wrote:
> On Thu, 27 Sep 2012 20:54:16 -0500
> Daniel Santos <daniel.santos@pobox.com> wrote:
>
>> This patch set improves on Andrea Arcangeli's original Red-Black Tree
>> implementation by adding generic search and insert functions with
>> complete support for:
>
> I grabbed patches 1-7, but I don't expect to send them in for 3.7. 
> It's not a good time to be merging new material, but I like cleanups.
>
I probably should have bumped the version to 7 to reduce the confusion.
Some maintainers have requested some changes in some of the first 10
patches (the compiler*.h & bug.h).  Can you roll them back or is it
better to
just send the corrections?

So one change, which you noted ("[PATCH v6 4/25] compiler-gcc{3,4}.h: Use
GCC_VERSION macro" is now "[PATCH 4/10]..." of the "Cleanup & new
features for
compiler*.h and bug.h" patch set.

>>  /* GCC 4.1.[01] miscompiles __weak */
>>  #ifdef __KERNEL__
>> -# if __GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ <= 1
>> +# if GCC_VERSION >= 40100 &&  GCC_VERSION <= 40101
>>  //#  error Your version of gcc miscompiles the __weak directive
>>  # endif
>>  #endif
>> @@ -13,11 +13,11 @@
>>  #define __must_check 		__attribute__((warn_unused_result))
>>  #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
>>  
>> -#if __GNUC_MINOR__ > 0
>> +#if GCC_VERSION >= 40102
> Is this correct (and clear)?  I'd expect
>
> 	#if GCC_VERSION > 40000
This should actually be gcc 4.1.0 or higher. I was going from the
presumption
that 4.1.0 & 4.1.1 wouldn't compile due to the __weak thing above, but
that's
unrelated (and now commented out), so it should just be >= 4.1.0.

#if GCC_VERSION >= 40100

They also want the order of patches 5 & 6 reversed (breaks build in between
otherwise) and patch notes added to the patch "[PATCH 6/10] bug.h: Replace
__linktime_error with __compiletime_error" and we're going to rework
BUILD_BUG_ON.

I can rebase against whatever you like and send either corrections or an
updated patch set. Just tell me what works please.

Thank you for your patience as I learn the ropes in this project.

Daniel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 4/25] compiler-gcc{3,4}.h: Use GCC_VERSION macro
  2012-10-01 20:32     ` Josh Triplett
@ 2012-10-01 20:43       ` Daniel Santos
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Santos @ 2012-10-01 20:43 UTC (permalink / raw)
  To: Josh Triplett, Andrew Morton
  Cc: Daniel Santos, linux-kernel, linux-doc, linux-sparse,
	Akinobu Mita, Andi Kleen, Andrea Arcangeli, Christopher Li,
	David Daney, David Howells, David Rientjes, David Woodhouse,
	Don Zickus, Greg Kroah-Hartman, Hidetoshi Seto, H. Peter Anvin

On 10/01/2012 03:32 PM, Josh Triplett wrote:
> On Mon, Oct 01, 2012 at 12:39:44PM -0700, Andrew Morton wrote:
>> Is this correct (and clear)?  I'd expect
>>
>> 	#if GCC_VERSION > 40000
> GCC_VERSION >= 40100, actually; __GNUC_MINOR__ > 0 implies
> __GNUC_MINOR__ >= 1, whereas the version you wrote would accept 4.0.1.
>
> - Josh Triplett
Yeah, and now I have two threads for the same set of patches, sorry. :(

Daniel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 0/25] Generic Red-Black Trees
  2012-10-01 20:41   ` Daniel Santos
@ 2012-10-01 20:47     ` Andrew Morton
  2012-10-03 15:18       ` Daniel Santos
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2012-10-01 20:47 UTC (permalink / raw)
  To: Daniel Santos
  Cc: Daniel Santos, LKML, Josh Triplett, Borislav Petkov, Steven Rostedt

On Mon, 01 Oct 2012 15:41:14 -0500
Daniel Santos <danielfsantos@att.net> wrote:

> I can rebase against whatever you like and send either corrections or an
> updated patch set. Just tell me what works please.

I dropped everything - let's start again.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 0/25] Generic Red-Black Trees
  2012-10-01 20:47     ` Andrew Morton
@ 2012-10-03 15:18       ` Daniel Santos
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Santos @ 2012-10-03 15:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: LKML, Josh Triplett, Borislav Petkov, Steven Rostedt, David Rientjes

On 10/01/2012 03:47 PM, Andrew Morton wrote:
> On Mon, 01 Oct 2012 15:41:14 -0500
> Daniel Santos <danielfsantos@att.net> wrote:
>
>> I can rebase against whatever you like and send either corrections or an
>> updated patch set. Just tell me what works please.
> I dropped everything - let's start again.
I would have sent you my updated patch set, but both -mmotm and -next
are broken on my hardware at the moment and I can't properly re-test.  I
posted a bug here: https://bugzilla.kernel.org/show_bug.cgi?id=48241
(with early boot jpg).  I can rebase it against Linus maybe for testing
(just a few changes).

I did use gcc 4.7.1, so I'm going to try using gcc 4.6.3 just to make
sure it's not that.  But it looks like I may have a few more changes to
make anyway.

Daniel

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-10-03 15:18 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-28  1:54 [PATCH v6 0/25] Generic Red-Black Trees Daniel Santos
2012-09-28  1:54 ` [PATCH v6 1/25] compiler-gcc4.h: Correct verion check for __compiletime_error Daniel Santos
2012-09-28  1:54 ` [PATCH v6 2/25] compiler-gcc4.h: Reorder macros based upon gcc ver Daniel Santos
2012-09-28  1:54 ` [PATCH v6 3/25] compiler-gcc.h: Add gcc-recommended GCC_VERSION macro Daniel Santos
2012-09-28  1:54 ` [PATCH v6 4/25] compiler-gcc{3,4}.h: Use " Daniel Santos
2012-10-01 19:39   ` Andrew Morton
2012-10-01 20:32     ` Josh Triplett
2012-10-01 20:43       ` Daniel Santos
2012-09-28  1:54 ` [PATCH v6 5/25] compiler{,-gcc4}.h: Remove duplicate macros Daniel Santos
2012-09-28  1:54 ` [PATCH v6 6/25] bug.h: Replace __linktime_error with __compiletime_error Daniel Santos
2012-09-28  1:54 ` [PATCH v6 7/25] compiler{,-gcc4}.h: Introduce __flatten function attribute Daniel Santos
2012-10-01 19:43 ` [PATCH v6 0/25] Generic Red-Black Trees Andrew Morton
2012-10-01 20:41   ` Daniel Santos
2012-10-01 20:47     ` Andrew Morton
2012-10-03 15:18       ` Daniel Santos

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).