linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
@ 2018-03-09 20:05 Kees Cook
  2018-03-09 21:10 ` Linus Torvalds
  2018-03-10  0:07 ` Andrew Morton
  0 siblings, 2 replies; 51+ messages in thread
From: Kees Cook @ 2018-03-09 20:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, Josh Poimboeuf, Rasmus Villemoes,
	Gustavo A. R. Silva, Tobin C. Harding, Steven Rostedt,
	Jonathan Corbet, Chris Mason, Josef Bacik, David Sterba,
	David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Masahiro Yamada,
	Borislav Petkov, Randy Dunlap, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening

When max() is used in stack array size calculations from literal values
(e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
thinks this is a dynamic calculation due to the single-eval logic, which
is not needed in the literal case. This change removes several accidental
stack VLAs from an x86 allmodconfig build:

$ diff -u before.txt after.txt | grep ^-
-drivers/input/touchscreen/cyttsp4_core.c:871:2: warning: ISO C90 forbids variable length array ‘ids’ [-Wvla]
-fs/btrfs/tree-checker.c:344:4: warning: ISO C90 forbids variable length array ‘namebuf’ [-Wvla]
-lib/vsprintf.c:747:2: warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
-net/ipv4/proc.c:403:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
-net/ipv6/proc.c:198:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
-net/ipv6/proc.c:218:2: warning: ISO C90 forbids variable length array ‘buff64’ [-Wvla]

Based on an earlier patch from Josh Poimboeuf.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
v3:
- drop __builtin_types_compatible_p() (Rasmus, Linus)
v2:
- fix copy/paste-o max1_/max2_ (ijc)
- clarify "compile-time" constant in comment (Rasmus)
- clean up formatting on min_t()/max_t()
---
 include/linux/kernel.h | 48 ++++++++++++++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 3fd291503576..a0fca4deb3ab 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -787,37 +787,55 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
  * strict type-checking.. See the
  * "unnecessary" pointer comparison.
  */
-#define __min(t1, t2, min1, min2, x, y) ({		\
+#define __single_eval_min(t1, t2, min1, min2, x, y) ({	\
 	t1 min1 = (x);					\
 	t2 min2 = (y);					\
 	(void) (&min1 == &min2);			\
 	min1 < min2 ? min1 : min2; })
 
+/*
+ * In the case of compile-time constant values, there is no need to do
+ * the double-evaluation protection, so the raw comparison can be made.
+ * This allows min()/max() to be used in stack array allocations and
+ * avoid the compiler thinking it is a dynamic value leading to an
+ * accidental VLA.
+ */
+#define __min(t1, t2, x, y)						\
+	__builtin_choose_expr(__builtin_constant_p(x) &&		\
+			      __builtin_constant_p(y),			\
+			      (t1)(x) < (t2)(y) ? (t1)(x) : (t2)(y),	\
+			      __single_eval_min(t1, t2,			\
+						__UNIQUE_ID(min1_),	\
+						__UNIQUE_ID(min2_),	\
+						x, y))
+
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define min(x, y)					\
-	__min(typeof(x), typeof(y),			\
-	      __UNIQUE_ID(min1_), __UNIQUE_ID(min2_),	\
-	      x, y)
+#define min(x, y)	__min(typeof(x), typeof(y), x, y)
 
-#define __max(t1, t2, max1, max2, x, y) ({		\
+#define __single_eval_max(t1, t2, max1, max2, x, y) ({	\
 	t1 max1 = (x);					\
 	t2 max2 = (y);					\
 	(void) (&max1 == &max2);			\
 	max1 > max2 ? max1 : max2; })
 
+#define __max(t1, t2, x, y)						\
+	__builtin_choose_expr(__builtin_constant_p(x) &&		\
+			      __builtin_constant_p(y),			\
+			      (t1)(x) > (t2)(y) ? (t1)(x) : (t2)(y),	\
+			      __single_eval_max(t1, t2,			\
+						__UNIQUE_ID(max1_),	\
+						__UNIQUE_ID(max2_),	\
+						x, y))
 /**
  * max - return maximum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define max(x, y)					\
-	__max(typeof(x), typeof(y),			\
-	      __UNIQUE_ID(max1_), __UNIQUE_ID(max2_),	\
-	      x, y)
+#define max(x, y)	__max(typeof(x), typeof(y), x, y)
 
 /**
  * min3 - return minimum of three values
@@ -869,10 +887,7 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
  * @x: first value
  * @y: second value
  */
-#define min_t(type, x, y)				\
-	__min(type, type,				\
-	      __UNIQUE_ID(min1_), __UNIQUE_ID(min2_),	\
-	      x, y)
+#define min_t(type, x, y)	 __min(type, type, x, y)
 
 /**
  * max_t - return maximum of two values, using the specified type
@@ -880,10 +895,7 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
  * @x: first value
  * @y: second value
  */
-#define max_t(type, x, y)				\
-	__max(type, type,				\
-	      __UNIQUE_ID(min1_), __UNIQUE_ID(min2_),	\
-	      x, y)
+#define max_t(type, x, y)	__max(type, type, x, y)
 
 /**
  * clamp_t - return a value clamped to a given range using a given type
-- 
2.7.4


-- 
Kees Cook
Pixel Security

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-09 20:05 [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max() Kees Cook
@ 2018-03-09 21:10 ` Linus Torvalds
  2018-03-09 21:47   ` Kees Cook
                     ` (2 more replies)
  2018-03-10  0:07 ` Andrew Morton
  1 sibling, 3 replies; 51+ messages in thread
From: Linus Torvalds @ 2018-03-09 21:10 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Morton, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, Mar 9, 2018 at 12:05 PM, Kees Cook <keescook@chromium.org> wrote:
> When max() is used in stack array size calculations from literal values
> (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
> thinks this is a dynamic calculation due to the single-eval logic, which
> is not needed in the literal case. This change removes several accidental
> stack VLAs from an x86 allmodconfig build:

Ok, looks good.

I just have a couple of questions about applying it.

In particular, if this will help people working on getting rid of
VLA's in the short term, I can apply it directly. But if people who
are looking at it (anybody else than Kees?) don't much care, then this
might be a 4.17 thing or at least "random -mm queue"?

The other unrelated reaction I had to this was that "we're passing
those types down very deep, even though nobody _cares_ about them all
that much at that deep level".

Honestly, the only case that really cares is the very top level, and
the rest could just take the properly cast versions.

For example, "max_t/min_t" really don't care at all, since they - by
definition - just take the single specified type.

So I'm wondering if we should just drop the types from __max/__min
(and everything they call) entirely, and instead do

    #define __check_type(x,y) ((void)((typeof(x)*)1==(typeof(y)*)1))
    #define min(x,y)   (__check_type(x,y),__min(x,y))
    #define max(x,y)   (__check_type(x,y),__max(x,y))

    #define min_t(t,x,y) __min((t)(x),(t)(y))
    #define max_t(t,x,y) __max((t)(x),(t)(y))

and then __min/__max and friends are much simpler (and can just assume
that the type is already fine, and the casting has been done).

This is technically entirely independent of this VLA cleanup thing,
but the "passing the types around unnecessarily" just becomes more
obvious when there's now another level of macros, _and_ it's a more
complex expression too.

Yes, yes, the __single_eval_xyz() functions still end up wanting the
types for the declaration of the new single-evaluation variables, but
the 'typeof' pattern is the standard pattern, so

#define __single_eval_max(max1, max2, x, y) ({  \
        typeof (x) max1 = (x);                  \
        typeof (y) max2 = (y);                  \
        max1 > max2 ? max1 : max2; })

actually looks more natural to me than passing the two types in as
arguments to the macro.

(That form also is amenable to things like "__auto_type" etc simplifications).

Side note: do we *really* need the unique variable names? That's what
makes those things _really_ illegible. I thgink it's done just for a
sparse warning that we should probably ignore. It's off by default
anyway exactly because it doesn't work that well due to nested macro
expansions like this.

There is very real value to keeping our odd macros legible, I feel.
Even when they are complicated by issues like this, it would be good
to keep them as simple as possible.

Comments?

                Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-09 21:10 ` Linus Torvalds
@ 2018-03-09 21:47   ` Kees Cook
  2018-03-11 22:46   ` Tobin C. Harding
  2018-03-13 13:31   ` David Laight
  2 siblings, 0 replies; 51+ messages in thread
From: Kees Cook @ 2018-03-09 21:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, Mar 9, 2018 at 1:10 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Mar 9, 2018 at 12:05 PM, Kees Cook <keescook@chromium.org> wrote:
>> When max() is used in stack array size calculations from literal values
>> (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
>> thinks this is a dynamic calculation due to the single-eval logic, which
>> is not needed in the literal case. This change removes several accidental
>> stack VLAs from an x86 allmodconfig build:
>
> Ok, looks good.
>
> I just have a couple of questions about applying it.
>
> In particular, if this will help people working on getting rid of
> VLA's in the short term, I can apply it directly.

AFAIK, all the VLAs effected by max() are fixed with this patch. i.e.
no other VLA work I'm aware of depends on this (famous last words).

> But if people who are looking at it (anybody else than Kees?)

FWIW, I've seen at least 6 other people helping out now with VLA clean
up patches. Whee. :)

> don't much care, then this
> might be a 4.17 thing or at least "random -mm queue"?

Andrew has this (v2) queued in -mm for 4.17. I didn't view VLA clean
up (or this macro change) as urgent by any means.

> #define __single_eval_max(max1, max2, x, y) ({  \
>         typeof (x) max1 = (x);                  \
>         typeof (y) max2 = (y);                  \
>         max1 > max2 ? max1 : max2; })
>
> actually looks more natural to me than passing the two types in as
> arguments to the macro.

I (obviously) didn't design this macro originally, but my take on this
was that it's safer to keep the type check together with the
comparison so that it is never possible for someone to run off and use
__single_eval_max() directly and accidentally bypass the type check.
While it does look silly from the max_t()/min_t() perspective, it just
seems more defensive.

> (That form also is amenable to things like "__auto_type" etc simplifications).

Agreed, I think that's the biggest reason to lift the types. However,
since we're still not actually doing anything with the types (i.e.
this change doesn't weaken the type checking), I would think it would
be orthogonal. While it would be nice to resolve differing types
sanely, there doesn't appear to be a driving reason to make this
change. The example case of max(5, sizeof(thing)) doesn't currently
exist in the code and I don't see how making min()/max() _less_ strict
would be generically useful (in fact, it has proven useful to have
strict type checking).

> Side note: do we *really* need the unique variable names? That's what
> makes those things _really_ illegible. I thgink it's done just for a
> sparse warning that we should probably ignore. It's off by default
> anyway exactly because it doesn't work that well due to nested macro
> expansions like this.

That I'm not sure about. I'd be fine to remove them; I left them in
place because it seemed quite intentional. :)

> There is very real value to keeping our odd macros legible, I feel.
> Even when they are complicated by issues like this, it would be good
> to keep them as simple as possible.
>
> Comments?

I'm open to whatever! I'm just glad this gets to kill a handful of
"accidental" stack VLAs. :)

-Kees

-- 
Kees Cook
Pixel Security<div class="gmail_extra"><br><div class="gmail_quote">On
Fri, Mar 9, 2018 at 1:10 PM, Linus Torvalds <span dir="ltr">&lt;<a
href="mailto:torvalds@linux-foundation.org"
target="_blank">torvalds@linux-foundation.org</a>&gt;</span>
wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On
Fri, Mar 9, 2018 at 12:05 PM, Kees Cook &lt;<a
href="mailto:keescook@chromium.org">keescook@chromium.org</a>&gt;
wrote:<br>
&gt; When max() is used in stack array size calculations from literal values<br>
&gt; (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler<br>
&gt; thinks this is a dynamic calculation due to the single-eval
logic, which<br>
&gt; is not needed in the literal case. This change removes several
accidental<br>
&gt; stack VLAs from an x86 allmodconfig build:<br>
<br>
</span>Ok, looks good.<br>
<br>
I just have a couple of questions about applying it.<br>
<br>
In particular, if this will help people working on getting rid of<br>
VLA's in the short term, I can apply it directly. But if people who<br>
are looking at it (anybody else than Kees?) don't much care, then this<br>
might be a 4.17 thing or at least "random -mm queue"?<br>
<br>
The other unrelated reaction I had to this was that "we're passing<br>
those types down very deep, even though nobody _cares_ about them all<br>
that much at that deep level".<br>
<br>
Honestly, the only case that really cares is the very top level, and<br>
the rest could just take the properly cast versions.<br>
<br>
For example, "max_t/min_t" really don't care at all, since they - by<br>
definition - just take the single specified type.<br>
<br>
So I'm wondering if we should just drop the types from __max/__min<br>
(and everything they call) entirely, and instead do<br>
<br>
&nbsp; &nbsp; #define __check_type(x,y)
((void)((typeof(x)*)1==(<wbr>typeof(y)*)1))<br>
&nbsp; &nbsp; #define min(x,y)&nbsp; &nbsp;(__check_type(x,y),__min(x,y))<br>
&nbsp; &nbsp; #define max(x,y)&nbsp; &nbsp;(__check_type(x,y),__max(x,y))<br>
<br>
&nbsp; &nbsp; #define min_t(t,x,y) __min((t)(x),(t)(y))<br>
&nbsp; &nbsp; #define max_t(t,x,y) __max((t)(x),(t)(y))<br>
<br>
and then __min/__max and friends are much simpler (and can just assume<br>
that the type is already fine, and the casting has been done).<br>
<br>
This is technically entirely independent of this VLA cleanup thing,<br>
but the "passing the types around unnecessarily" just becomes more<br>
obvious when there's now another level of macros, _and_ it's a more<br>
complex expression too.<br>
<br>
Yes, yes, the __single_eval_xyz() functions still end up wanting the<br>
types for the declaration of the new single-evaluation variables, but<br>
the 'typeof' pattern is the standard pattern, so<br>
<br>
#define __single_eval_max(max1, max2, x, y) ({&nbsp; \<br>
&nbsp; &nbsp; &nbsp; &nbsp; typeof (x) max1 = (x);&nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; \<br>
&nbsp; &nbsp; &nbsp; &nbsp; typeof (y) max2 = (y);&nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; \<br>
<span class="">&nbsp; &nbsp; &nbsp; &nbsp; max1 &gt; max2 ? max1 : max2; })<br>
<br>
</span>actually looks more natural to me than passing the two types in as<br>
arguments to the macro.<br>
<br>
(That form also is amenable to things like "__auto_type" etc
simplifications).<br>
<br>
Side note: do we *really* need the unique variable names? That's what<br>
makes those things _really_ illegible. I thgink it's done just for a<br>
sparse warning that we should probably ignore. It's off by default<br>
anyway exactly because it doesn't work that well due to nested macro<br>
expansions like this.<br>
<br>
There is very real value to keeping our odd macros legible, I feel.<br>
Even when they are complicated by issues like this, it would be good<br>
to keep them as simple as possible.<br>
<br>
Comments?<br>
<span class="HOEnZb"><font color="#888888"><br>
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Linus<br>
</font></span></blockquote></div><br><br clear="all"><div><br></div>--
<br><div class="gmail_signature" data-smartmail="gmail_signature">Kees
Cook<br>Pixel Security</div>
</div>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-09 20:05 [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max() Kees Cook
  2018-03-09 21:10 ` Linus Torvalds
@ 2018-03-10  0:07 ` Andrew Morton
  2018-03-10  0:28   ` Linus Torvalds
  2018-03-10  3:11   ` Randy Dunlap
  1 sibling, 2 replies; 51+ messages in thread
From: Andrew Morton @ 2018-03-10  0:07 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Josh Poimboeuf, Rasmus Villemoes,
	Gustavo A. R. Silva, Tobin C. Harding, Steven Rostedt,
	Jonathan Corbet, Chris Mason, Josef Bacik, David Sterba,
	David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Masahiro Yamada,
	Borislav Petkov, Randy Dunlap, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening, Linus Torvalds

On Fri, 9 Mar 2018 12:05:36 -0800 Kees Cook <keescook@chromium.org> wrote:

> When max() is used in stack array size calculations from literal values
> (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
> thinks this is a dynamic calculation due to the single-eval logic, which
> is not needed in the literal case. This change removes several accidental
> stack VLAs from an x86 allmodconfig build:
> 
> $ diff -u before.txt after.txt | grep ^-
> -drivers/input/touchscreen/cyttsp4_core.c:871:2: warning: ISO C90 forbids variable length array ‘ids’ [-Wvla]
> -fs/btrfs/tree-checker.c:344:4: warning: ISO C90 forbids variable length array ‘namebuf’ [-Wvla]
> -lib/vsprintf.c:747:2: warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
> -net/ipv4/proc.c:403:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
> -net/ipv6/proc.c:198:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
> -net/ipv6/proc.c:218:2: warning: ISO C90 forbids variable length array ‘buff64’ [-Wvla]
> 
> Based on an earlier patch from Josh Poimboeuf.

v1, v2 and v3 of this patch all fail with gcc-4.4.4:

./include/linux/jiffies.h: In function 'jiffies_delta_to_clock_t':
./include/linux/jiffies.h:444: error: first argument to '__builtin_choose_expr' not a constant

That's with

#define __max(t1, t2, x, y)						\
	__builtin_choose_expr(__builtin_constant_p(x) &&		\
			      __builtin_constant_p(y) &&		\
			      __builtin_types_compatible_p(t1, t2),	\
			      (t1)(x) > (t2)(y) ? (t1)(x) : (t2)(y),	\
			      __single_eval_max(t1, t2,			\
						__UNIQUE_ID(max1_),	\
						__UNIQUE_ID(max2_),	\
						x, y))
/**
 * max - return maximum of two values of the same or compatible types
 * @x: first value
 * @y: second value
 */
#define max(x, y)	__max(typeof(x), typeof(y), x, y)


A brief poke failed to reveal a workaround - gcc-4.4.4 doesn't appear
to know that __builtin_constant_p(x) is a constant.  Or something.

Sigh.  Wasn't there some talk about modernizing our toolchain
requirements?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  0:07 ` Andrew Morton
@ 2018-03-10  0:28   ` Linus Torvalds
  2018-03-10  0:32     ` Andrew Morton
  2018-03-10  3:11   ` Randy Dunlap
  1 sibling, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2018-03-10  0:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kees Cook, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, Mar 9, 2018 at 4:07 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> A brief poke failed to reveal a workaround - gcc-4.4.4 doesn't appear
> to know that __builtin_constant_p(x) is a constant.  Or something.

LOL.

I suspect it might be that it wants to evaluate
__builtin_choose_expr() at an earlier stage than it evaluates
__builtin_constant_p(), so it's not that it doesn't know that
__builtin_constant_p() is a constant, it just might not know it *yet*.

Maybe.

Side note, if it's not that, but just the "complex" expression that
has the logical 'and' etc, maybe the code could just use

  __builtin_constant_p((x)+(y))

or something.

But yeah:

> Sigh.  Wasn't there some talk about modernizing our toolchain
> requirements?

Maybe it's just time to give up on 4.4.  We wanted 4.5 for "asm goto",
and once we upgrade to 4.5 I think Arnd said that no distro actually
ships it, so we might as well go to 4.6.

So maybe this is just the excuse to finally make that official, if
there is no clever workaround any more.

                    Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  0:28   ` Linus Torvalds
@ 2018-03-10  0:32     ` Andrew Morton
  2018-03-10  0:38       ` Linus Torvalds
  0 siblings, 1 reply; 51+ messages in thread
From: Andrew Morton @ 2018-03-10  0:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, 9 Mar 2018 16:28:51 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, Mar 9, 2018 at 4:07 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > A brief poke failed to reveal a workaround - gcc-4.4.4 doesn't appear
> > to know that __builtin_constant_p(x) is a constant.  Or something.
> 
> LOL.
> 
> I suspect it might be that it wants to evaluate
> __builtin_choose_expr() at an earlier stage than it evaluates
> __builtin_constant_p(), so it's not that it doesn't know that
> __builtin_constant_p() is a constant, it just might not know it *yet*.
> 
> Maybe.
> 
> Side note, if it's not that, but just the "complex" expression that
> has the logical 'and' etc, maybe the code could just use
> 
>   __builtin_constant_p((x)+(y))
> 
> or something.

I'll do a bit more poking at it.

> But yeah:
> 
> > Sigh.  Wasn't there some talk about modernizing our toolchain
> > requirements?
> 
> Maybe it's just time to give up on 4.4.  We wanted 4.5 for "asm goto",
> and once we upgrade to 4.5 I think Arnd said that no distro actually
> ships it, so we might as well go to 4.6.
> 
> So maybe this is just the excuse to finally make that official, if
> there is no clever workaround any more.

I wonder which gcc versions actually accept Kees's addition.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  0:32     ` Andrew Morton
@ 2018-03-10  0:38       ` Linus Torvalds
  2018-03-10  1:30         ` Kees Cook
  0 siblings, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2018-03-10  0:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kees Cook, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, Mar 9, 2018 at 4:32 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> I wonder which gcc versions actually accept Kees's addition.

Note that we already do have this pattern, as seen by:

  git grep -2  __builtin_choose_expr | grep -2 __builtin_constant_p

which show drivers/pinctrl/intel/pinctrl-intel.h, so the pattern
already exists current kernels and _works_ - it apparently just
doesn't work in slightly more complicated cases.

It's one reason why I wondered if simplifying the expression to have
just that single __builtin_constant_p() might not end up working..

              Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  0:38       ` Linus Torvalds
@ 2018-03-10  1:30         ` Kees Cook
  2018-03-10  1:31           ` Kees Cook
  2018-03-12 22:55           ` Andrew Morton
  0 siblings, 2 replies; 51+ messages in thread
From: Kees Cook @ 2018-03-10  1:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, Mar 9, 2018 at 4:38 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Mar 9, 2018 at 4:32 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> I wonder which gcc versions actually accept Kees's addition.

Ah, my old nemesis, gcc 4.4.4. *sob*

> Note that we already do have this pattern, as seen by:
>
>   git grep -2  __builtin_choose_expr | grep -2 __builtin_constant_p
>
> which show drivers/pinctrl/intel/pinctrl-intel.h, so the pattern
> already exists current kernels and _works_ - it apparently just
> doesn't work in slightly more complicated cases.

Fun. Yeah, so all the PIN_GROUP() callers have either a literal or an
array name for that argument, so the argument to
__builtin_constant_p() isn't complex.

git grep '\bPIN_GROUP\b' | egrep -v '(1|2|3|true|false)\)'

> It's one reason why I wondered if simplifying the expression to have
> just that single __builtin_constant_p() might not end up working..

Yeah, it seems like it doesn't bail out as "false" for complex
expressions given to __builtin_constant_p().

If no magic solution, then which of these?

- go back to my original "multi-eval max only for constants" macro (meh)
- add gcc version checks around this and similarly for -Wvla in the future (eww)
- raise gcc version (yikes)

-Kees

-- 
Kees Cook
Pixel Security<div class="gmail_extra"><br><div class="gmail_quote">On
Fri, Mar 9, 2018 at 4:38 PM, Linus Torvalds <span dir="ltr">&lt;<a
href="mailto:torvalds@linux-foundation.org"
target="_blank">torvalds@linux-foundation.org</a>&gt;</span>
wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On
Fri, Mar 9, 2018 at 4:32 PM, Andrew Morton &lt;<a
href="mailto:akpm@linux-foundation.org">akpm@linux-foundation.org</a>&gt;
wrote:<br>
&gt;<br>
&gt; I wonder which gcc versions actually accept Kees's addition.<br>
<br>
</span>Note that we already do have this pattern, as seen by:<br>
<br>
&nbsp; git grep -2&nbsp; __builtin_choose_expr | grep -2
__builtin_constant_p<br>
<br>
which show drivers/pinctrl/intel/pinctrl-<wbr>intel.h, so the pattern<br>
already exists current kernels and _works_ - it apparently just<br>
doesn't work in slightly more complicated cases.<br>
<br>
It's one reason why I wondered if simplifying the expression to have<br>
just that single __builtin_constant_p() might not end up working..<br>
<span class="HOEnZb"><font color="#888888"><br>
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Linus<br>
</font></span></blockquote></div><br><br clear="all"><div><br></div>--
<br><div class="gmail_signature" data-smartmail="gmail_signature">Kees
Cook<br>Pixel Security</div>
</div>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  1:30         ` Kees Cook
@ 2018-03-10  1:31           ` Kees Cook
  2018-03-10  2:37             ` Linus Torvalds
  2018-03-12 22:55           ` Andrew Morton
  1 sibling, 1 reply; 51+ messages in thread
From: Kees Cook @ 2018-03-10  1:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, Mar 9, 2018 at 5:30 PM, Kees Cook <keescook@chromium.org> wrote:
> --
> Kees Cook
> Pixel Security<div class="gmail_extra"><br><div class="gmail_quote">On
> [...]

WTF, gmail just blasted HTML into my explicitly plain-text email?! Apologies...

-- 
Kees Cook
Pixel Security<div class="gmail_extra"><br><div class="gmail_quote">On
Fri, Mar 9, 2018 at 5:30 PM, Kees Cook <span dir="ltr">&lt;<a
href="mailto:keescook@chromium.org"
target="_blank">keescook@chromium.org</a>&gt;</span>
wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On
Fri, Mar 9, 2018 at 4:38 PM, Linus Torvalds<br>
&lt;<a href="mailto:torvalds@linux-foundation.org">torvalds@linux-foundation.org</a><wbr>&gt;
wrote:<br>
&gt; On Fri, Mar 9, 2018 at 4:32 PM, Andrew Morton &lt;<a
href="mailto:akpm@linux-foundation.org">akpm@linux-foundation.org</a>&gt;
wrote:<br>
&gt;&gt;<br>
&gt;&gt; I wonder which gcc versions actually accept Kees's addition.<br>
<br>
</span>Ah, my old nemesis, gcc 4.4.4. *sob*<br>
<span class=""><br>
&gt; Note that we already do have this pattern, as seen by:<br>
&gt;<br>
&gt;&nbsp; &nbsp;git grep -2&nbsp; __builtin_choose_expr | grep -2
__builtin_constant_p<br>
&gt;<br>
&gt; which show drivers/pinctrl/intel/pinctrl-<wbr>intel.h, so the pattern<br>
&gt; already exists current kernels and _works_ - it apparently just<br>
&gt; doesn't work in slightly more complicated cases.<br>
<br>
</span>Fun. Yeah, so all the PIN_GROUP() callers have either a literal or an<br>
array name for that argument, so the argument to<br>
__builtin_constant_p() isn't complex.<br>
<br>
git grep '\bPIN_GROUP\b' | egrep -v '(1|2|3|true|false)\)'<br>
<span class=""><br>
&gt; It's one reason why I wondered if simplifying the expression to have<br>
&gt; just that single __builtin_constant_p() might not end up working..<br>
<br>
</span>Yeah, it seems like it doesn't bail out as "false" for complex<br>
expressions given to __builtin_constant_p().<br>
<br>
If no magic solution, then which of these?<br>
<br>
- go back to my original "multi-eval max only for constants" macro (meh)<br>
- add gcc version checks around this and similarly for -Wvla in the
future (eww)<br>
- raise gcc version (yikes)<br>
<br>
-Kees<br>
<br>
--<br>
Kees Cook<br>
Pixel Security&lt;div class="gmail_extra"&gt;&lt;br&gt;&lt;div
class="gmail_quote"&gt;On<br>
Fri, Mar 9, 2018 at 4:38 PM, Linus Torvalds &lt;span
dir="ltr"&gt;&amp;lt;&lt;a<br>
<span class="">href="mailto:<a
href="mailto:torvalds@linux-foundation.org">torvalds@linux-<wbr>foundation.org</a>"<br>
target="_blank"&gt;<a
href="mailto:torvalds@linux-foundation.org">torvalds@<wbr>linux-foundation.org</a>&lt;/a&gt;&amp;gt;&lt;/<wbr>span&gt;<br>
wrote:&lt;br&gt;&lt;blockquote class="gmail_quote" style="margin:0 0 0<br>
</span>.8ex;border-left:1px #ccc solid;padding-left:1ex"&gt;&lt;span
class=""&gt;On<br>
Fri, Mar 9, 2018 at 4:32 PM, Andrew Morton &amp;lt;&lt;a<br>
href="mailto:<a
href="mailto:akpm@linux-foundation.org">akpm@linux-<wbr>foundation.org</a>"&gt;<a
href="mailto:akpm@linux-foundation.org">akpm@linux-<wbr>foundation.org</a>&lt;/a&gt;&amp;gt;<br>
wrote:&lt;br&gt;<br>
&amp;gt;&lt;br&gt;<br>
&amp;gt; I wonder which gcc versions actually accept Kees's
addition.&lt;br&gt;<br>
&lt;br&gt;<br>
&lt;/span&gt;Note that we already do have this pattern, as seen
by:&lt;br&gt;<br>
&lt;br&gt;<br>
&amp;nbsp; git grep -2&amp;nbsp; __builtin_choose_expr | grep -2<br>
__builtin_constant_p&lt;br&gt;<br>
&lt;br&gt;<br>
which show drivers/pinctrl/intel/pinctrl-<wbr>&lt;wbr&gt;intel.h, so
the pattern&lt;br&gt;<br>
already exists current kernels and _works_ - it apparently just&lt;br&gt;<br>
doesn't work in slightly more complicated cases.&lt;br&gt;<br>
&lt;br&gt;<br>
It's one reason why I wondered if simplifying the expression to
have&lt;br&gt;<br>
just that single __builtin_constant_p() might not end up working..&lt;br&gt;<br>
<div class="HOEnZb"><div class="h5">&lt;span
class="HOEnZb"&gt;&lt;font color="#888888"&gt;&lt;br&gt;<br>
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;
&amp;nbsp; Linus&lt;br&gt;<br>
&lt;/font&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;/<wbr>div&gt;&lt;br&gt;&lt;br
clear="all"&gt;&lt;div&gt;&lt;br&gt;&lt;/div&gt;--<br>
&lt;br&gt;&lt;div class="gmail_signature"
data-smartmail="gmail_<wbr>signature"&gt;Kees<br>
Cook&lt;br&gt;Pixel Security&lt;/div&gt;<br>
&lt;/div&gt;<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>--
<br><div class="gmail_signature" data-smartmail="gmail_signature">Kees
Cook<br>Pixel Security</div>
</div>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 0/3] tracing: Rewrite the function filter code
@ 2018-03-10  2:34 Steven Rostedt
  2018-03-10  2:34 ` [PATCH 1/3] tracing: Combine enum and arrays into single macro in " Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 51+ messages in thread
From: Steven Rostedt @ 2018-03-10  2:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi, Namhyung Kim,
	Masami Hiramatsu, Jiri Olsa


Al Viro reviewed the ftrace event filter logic and was horrified. He wrote
Tom and I a lengthy private email with a formal proof on how to do it
simpler as well as more efficient.

Currently the filter logic creates a binary search tree (with some
optimizations), and walks it to get a result.

Say we had the following filter: a && !(!b || c) || d && !e
Where each letter is a predicate. A tree would be made to look something
like:

             ||
            /  \
          &&    &&
         /  |   | \
        a  !||  d  !e
           /  \
          !b   c

If a == 1, b == 1, c == 1, d == 1, e == 0, then all nodes would be walked.

 || -> && -> a -> && -> !|| -> !b -> !|| -> c -> !|| -> && -> || -> && ->
 d -> && -> !e -> && -> || -> return TRUE!

That's 16 steps across vertices.

Al's method would create a program using an array that denotes the predicate
function, when to branch, and a target to branch to if the value is the same
as when to branch. Storing the array as:

 prog[0] = { a, 0, 2}; // predicate, when_to_branch, target
 prog[1] = { b, 0, 2};
 prog[2] = { c, 0, 4};
 prog[3] = { d, 0, 5};
 prog[4] = { e, 1, 5};
 prog[5] = { NULL, 0, 1}; // TRUE
 prog[6] = { NULL, 0, 0}; // FALSE

Where the code to execute the above looks like:

	for (i = 0; prog[i].pred; i++) {
		struct filter_pred *pred = prog[i].pred;
		int match = pred->fn(pred, rec);
		if (match == prog[i].when_to_branch)
			i = prog[i].target;
	}
	return prog[i].target;

Which translates the above program array into:

 n0: if (!a) goto n3;
 n1: if (!b) goto n3;
 n2: if (!c) goto T;
 n3: if (!d) goto F;
 n4: if (e) goto F;
 T: return TRUE;
 F: return FALSE;

And the above example only takes 6 steps to return TRUE.

Special thanks goes out to Al for his patience and his time that he spent in
educating us in a proper logical parser.

Two patches were added to do some initial clean up. The last patch
implements Al's suggestions. I wrote up a very lengthy comment describing
what Al taught us in my own words (hopefully I got it right), and the
implementation is similar to what Al showed with a few changes (again,
hopefully I got that right too). I tested this with lots of different
filters and it looks to be holding up.


 Jiri,

This affects perf as well. I ran some tests and I believe I got
it woking as it does today.

Steven Rostedt (VMware) (3):
      tracing: Combine enum and arrays into single macro in filter code
      tracing: Clean up and document pred_funcs_##type creation and use
      tracing: Rewrite filter logic to be simpler and faster

----
 kernel/trace/trace.h               |   15 +-
 kernel/trace/trace_events_filter.c | 2372 ++++++++++++++++--------------------
 2 files changed, 1083 insertions(+), 1304 deletions(-)

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 1/3] tracing: Combine enum and arrays into single macro in filter code
  2018-03-10  2:34 [PATCH 0/3] tracing: Rewrite the function filter code Steven Rostedt
@ 2018-03-10  2:34 ` Steven Rostedt
  2018-03-12 10:31   ` Masami Hiramatsu
  2018-03-10  2:34 ` [PATCH 2/3] tracing: Clean up and document pred_funcs_##type creation and use Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2018-03-10  2:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi, Namhyung Kim,
	Masami Hiramatsu, Jiri Olsa

[-- Attachment #1: 0001-tracing-Combine-enum-and-arrays-into-single-macro-in.patch --]
[-- Type: text/plain, Size: 4193 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Instead of having a separate enum that is the index into another array, like
a string array, make a single macro that combines them into a single list,
and then the two can not get out of sync. This makes it easier to add and
remove items.

The macro trick is:

 #define DOGS				\
  C( JACK,     "Jack Russell")		\
  C( ITALIAN,  "Italian Greyhound")	\
  C( GERMAN,   "German Shepherd")

 #undef C
 #define C(a, b) a

 enum { DOGS };

 #undef C
 #define C(a, b) b

 static char dogs[] = { DOGS };

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_filter.c | 112 ++++++++++++++++---------------------
 1 file changed, 48 insertions(+), 64 deletions(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index c3c6eee1e4df..a2ef393b3bb2 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -33,22 +33,26 @@
 	"# Only events with the given fields will be affected.\n"	\
 	"# If no events are modified, an error message will be displayed here"
 
-enum filter_op_ids
-{
-	OP_OR,
-	OP_AND,
-	OP_GLOB,
-	OP_NE,
-	OP_EQ,
-	OP_LT,
-	OP_LE,
-	OP_GT,
-	OP_GE,
-	OP_BAND,
-	OP_NOT,
-	OP_NONE,
-	OP_OPEN_PAREN,
-};
+#define OPS					\
+	C( OP_OR,	"||",		1 ),	\
+	C( OP_AND,	"&&",		2 ),	\
+	C( OP_GLOB,	"~",		4 ),	\
+	C( OP_NE,	"!=",		4 ),	\
+	C( OP_EQ,	"==",		4 ),	\
+	C( OP_LT,	"<",		5 ),	\
+	C( OP_LE,	"<=",		5 ),	\
+	C( OP_GT,	">",		5 ),	\
+	C( OP_GE,	">=",		5 ),	\
+	C( OP_BAND,	"&",		6 ),	\
+	C( OP_NOT,	"!",		6 ),	\
+	C( OP_NONE,	"OP_NONE",	0 ),	\
+	C( OP_OPEN_PAREN, "(",		0 ),	\
+	C( OP_MAX,	NULL,		0 )
+
+#undef C
+#define C(a, b, c)	a
+
+enum filter_op_ids { OPS };
 
 struct filter_op {
 	int id;
@@ -56,56 +60,36 @@ struct filter_op {
 	int precedence;
 };
 
-/* Order must be the same as enum filter_op_ids above */
-static struct filter_op filter_ops[] = {
-	{ OP_OR,	"||",		1 },
-	{ OP_AND,	"&&",		2 },
-	{ OP_GLOB,	"~",		4 },
-	{ OP_NE,	"!=",		4 },
-	{ OP_EQ,	"==",		4 },
-	{ OP_LT,	"<",		5 },
-	{ OP_LE,	"<=",		5 },
-	{ OP_GT,	">",		5 },
-	{ OP_GE,	">=",		5 },
-	{ OP_BAND,	"&",		6 },
-	{ OP_NOT,	"!",		6 },
-	{ OP_NONE,	"OP_NONE",	0 },
-	{ OP_OPEN_PAREN, "(",		0 },
-};
+#undef C
+#define C(a, b, c)	{ a, b, c }
 
-enum {
-	FILT_ERR_NONE,
-	FILT_ERR_INVALID_OP,
-	FILT_ERR_UNBALANCED_PAREN,
-	FILT_ERR_TOO_MANY_OPERANDS,
-	FILT_ERR_OPERAND_TOO_LONG,
-	FILT_ERR_FIELD_NOT_FOUND,
-	FILT_ERR_ILLEGAL_FIELD_OP,
-	FILT_ERR_ILLEGAL_INTVAL,
-	FILT_ERR_BAD_SUBSYS_FILTER,
-	FILT_ERR_TOO_MANY_PREDS,
-	FILT_ERR_MISSING_FIELD,
-	FILT_ERR_INVALID_FILTER,
-	FILT_ERR_IP_FIELD_ONLY,
-	FILT_ERR_ILLEGAL_NOT_OP,
-};
+static struct filter_op filter_ops[] = { OPS };
 
-static char *err_text[] = {
-	"No error",
-	"Invalid operator",
-	"Unbalanced parens",
-	"Too many operands",
-	"Operand too long",
-	"Field not found",
-	"Illegal operation for field type",
-	"Illegal integer value",
-	"Couldn't find or set field in one of a subsystem's events",
-	"Too many terms in predicate expression",
-	"Missing field name and/or value",
-	"Meaningless filter expression",
-	"Only 'ip' field is supported for function trace",
-	"Illegal use of '!'",
-};
+#define ERRORS								\
+	C( NONE,	 	"No error"),				\
+	C( INVALID_OP,		"Invalid operator"),			\
+	C( UNBALANCED_PAREN,	"Unbalanced parens"),			\
+	C( TOO_MANY_OPERANDS,	"Too many operands"),			\
+	C( OPERAND_TOO_LONG,	"Operand too long"),			\
+	C( FIELD_NOT_FOUND,	"Field not found"),			\
+	C( ILLEGAL_FIELD_OP,	"Illegal operation for field type"),	\
+	C( ILLEGAL_INTVAL,	"Illegal integer value"),		\
+	C( BAD_SUBSYS_FILTER,	"Couldn't find or set field in one of a subsystem's events"), \
+	C( TOO_MANY_PREDS,	"Too many terms in predicate expression"), \
+	C( MISSING_FIELD,	"Missing field name and/or value"),	\
+	C( INVALID_FILTER,	"Meaningless filter expression"),	\
+	C( IP_FIELD_ONLY,	"Only 'ip' field is supported for function trace"), \
+	C( ILLEGAL_NOT_OP,	"Illegal use of '!'"),
+
+#undef C
+#define C(a, b)		FILT_ERR_##a
+
+enum { ERRORS };
+
+#undef C
+#define C(a, b)		b
+
+static char *err_text[] = { ERRORS };
 
 struct opstack_op {
 	enum filter_op_ids op;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 2/3] tracing: Clean up and document pred_funcs_##type creation and use
  2018-03-10  2:34 [PATCH 0/3] tracing: Rewrite the function filter code Steven Rostedt
  2018-03-10  2:34 ` [PATCH 1/3] tracing: Combine enum and arrays into single macro in " Steven Rostedt
@ 2018-03-10  2:34 ` Steven Rostedt
  2018-03-12 13:42   ` Masami Hiramatsu
  2018-03-10  2:34 ` [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster Steven Rostedt
  2018-03-11 19:54 ` [PATCH 0/3] tracing: Rewrite the function filter code Jiri Olsa
  3 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2018-03-10  2:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi, Namhyung Kim,
	Masami Hiramatsu, Jiri Olsa

[-- Attachment #1: 0002-tracing-Clean-up-and-document-pred_funcs_-type-creat.patch --]
[-- Type: text/plain, Size: 3622 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The pred_funcs_##type arrays consist of five functions that are assigned
based on the ops. The array must be in the same order of the ops each
function represents. The PRED_FUNC_START macro denotes the op enum that
starts the op that maps to the pred_funcs_##type arrays. This is all very
subtle and prone to bugs if the code is changed.

Add comments describing how PRED_FUNC_START and pred_funcs_##type array is
used, and also a PRED_FUNC_MAX that is the maximum number of functions in
the arrays.

Clean up select_comparison_fn() that assigns the predicates to the
pred_funcs_##type array function as well as add protection in case an op is
passed in that does not map correctly to the array.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_filter.c | 46 ++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 14 deletions(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index a2ef393b3bb2..9d383f4383dc 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -65,6 +65,13 @@ struct filter_op {
 
 static struct filter_op filter_ops[] = { OPS };
 
+/*
+ * pred functions are OP_LT, OP_LE, OP_GT, OP_GE, and OP_BAND
+ * pred_funcs_##type below must match the order of them above.
+ */
+#define PRED_FUNC_START			OP_LT
+#define PRED_FUNC_MAX			(OP_BAND - PRED_FUNC_START)
+
 #define ERRORS								\
 	C( NONE,	 	"No error"),				\
 	C( INVALID_OP,		"Invalid operator"),			\
@@ -172,8 +179,6 @@ static const filter_pred_fn_t pred_funcs_##type[] = {			\
 	filter_pred_BAND_##type,					\
 };
 
-#define PRED_FUNC_START			OP_LT
-
 #define DEFINE_EQUALITY_PRED(size)					\
 static int filter_pred_##size(struct filter_pred *pred, void *event)	\
 {									\
@@ -946,39 +951,52 @@ static filter_pred_fn_t select_comparison_fn(enum filter_op_ids op,
 					    int field_size, int field_is_signed)
 {
 	filter_pred_fn_t fn = NULL;
+	int pred_func_index = -1;
+
+	switch (op) {
+	case OP_EQ:
+	case OP_NE:
+		break;
+	default:
+		if (WARN_ON_ONCE(op < PRED_FUNC_START))
+			return NULL;
+		pred_func_index = op - PRED_FUNC_START;
+		if (WARN_ON_ONCE(pred_func_index > PRED_FUNC_MAX))
+			return NULL;
+	}
 
 	switch (field_size) {
 	case 8:
-		if (op == OP_EQ || op == OP_NE)
+		if (pred_func_index < 0)
 			fn = filter_pred_64;
 		else if (field_is_signed)
-			fn = pred_funcs_s64[op - PRED_FUNC_START];
+			fn = pred_funcs_s64[pred_func_index];
 		else
-			fn = pred_funcs_u64[op - PRED_FUNC_START];
+			fn = pred_funcs_u64[pred_func_index];
 		break;
 	case 4:
-		if (op == OP_EQ || op == OP_NE)
+		if (pred_func_index < 0)
 			fn = filter_pred_32;
 		else if (field_is_signed)
-			fn = pred_funcs_s32[op - PRED_FUNC_START];
+			fn = pred_funcs_s32[pred_func_index];
 		else
-			fn = pred_funcs_u32[op - PRED_FUNC_START];
+			fn = pred_funcs_u32[pred_func_index];
 		break;
 	case 2:
-		if (op == OP_EQ || op == OP_NE)
+		if (pred_func_index < 0)
 			fn = filter_pred_16;
 		else if (field_is_signed)
-			fn = pred_funcs_s16[op - PRED_FUNC_START];
+			fn = pred_funcs_s16[pred_func_index];
 		else
-			fn = pred_funcs_u16[op - PRED_FUNC_START];
+			fn = pred_funcs_u16[pred_func_index];
 		break;
 	case 1:
-		if (op == OP_EQ || op == OP_NE)
+		if (pred_func_index < 0)
 			fn = filter_pred_8;
 		else if (field_is_signed)
-			fn = pred_funcs_s8[op - PRED_FUNC_START];
+			fn = pred_funcs_s8[pred_func_index];
 		else
-			fn = pred_funcs_u8[op - PRED_FUNC_START];
+			fn = pred_funcs_u8[pred_func_index];
 		break;
 	}
 
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-10  2:34 [PATCH 0/3] tracing: Rewrite the function filter code Steven Rostedt
  2018-03-10  2:34 ` [PATCH 1/3] tracing: Combine enum and arrays into single macro in " Steven Rostedt
  2018-03-10  2:34 ` [PATCH 2/3] tracing: Clean up and document pred_funcs_##type creation and use Steven Rostedt
@ 2018-03-10  2:34 ` Steven Rostedt
  2018-03-10  3:10   ` Steven Rostedt
                     ` (3 more replies)
  2018-03-11 19:54 ` [PATCH 0/3] tracing: Rewrite the function filter code Jiri Olsa
  3 siblings, 4 replies; 51+ messages in thread
From: Steven Rostedt @ 2018-03-10  2:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi, Namhyung Kim,
	Masami Hiramatsu, Jiri Olsa

[-- Attachment #1: 0003-tracing-Rewrite-filter-logic-to-be-simpler-and-faste.patch --]
[-- Type: text/plain, Size: 73159 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Al Viro reviewed the filter logic of ftrace trace events and found it to be
very troubling. It creates a binary tree based on the logic operators and
walks it during tracing. He sent myself and Tom Zanussi a long explanation
(and formal proof) of how to do the string parsing better and end up with a
program array that can be simply iterated to come up with the correct
results.

I took his ideas and his pseudo code and rewrote the filter logic based on
them. In doing so, I was able to remove a lot of code, and have a much more
condensed filter logic in the process. I wrote a very long comment
describing the methadology that Al proposed in my own words. For more info
on how this works, read the comment above predicate_parse().

Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace.h               |   15 +-
 kernel/trace/trace_events_filter.c | 2308 ++++++++++++++++--------------------
 2 files changed, 1050 insertions(+), 1273 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 9de3e2a2f042..e6f82f74ad65 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1216,12 +1216,11 @@ struct ftrace_event_field {
 	int			is_signed;
 };
 
+struct prog_entry;
+
 struct event_filter {
-	int			n_preds;	/* Number assigned */
-	int			a_preds;	/* allocated */
-	struct filter_pred __rcu	*preds;
-	struct filter_pred __rcu	*root;
-	char				*filter_string;
+	struct prog_entry	*prog;
+	char			*filter_string;
 };
 
 struct event_subsystem {
@@ -1413,12 +1412,8 @@ struct filter_pred {
 	unsigned short		*ops;
 	struct ftrace_event_field *field;
 	int 			offset;
-	int 			not;
+	int			not;
 	int 			op;
-	unsigned short		index;
-	unsigned short		parent;
-	unsigned short		left;
-	unsigned short		right;
 };
 
 static inline bool is_string_field(struct ftrace_event_field *field)
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 9d383f4383dc..8d7da0ded43b 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -33,60 +33,52 @@
 	"# Only events with the given fields will be affected.\n"	\
 	"# If no events are modified, an error message will be displayed here"
 
+/* Due to token parsing '<=' must be before '<' and '>=' must be before '>' */
 #define OPS					\
-	C( OP_OR,	"||",		1 ),	\
-	C( OP_AND,	"&&",		2 ),	\
-	C( OP_GLOB,	"~",		4 ),	\
-	C( OP_NE,	"!=",		4 ),	\
-	C( OP_EQ,	"==",		4 ),	\
-	C( OP_LT,	"<",		5 ),	\
-	C( OP_LE,	"<=",		5 ),	\
-	C( OP_GT,	">",		5 ),	\
-	C( OP_GE,	">=",		5 ),	\
-	C( OP_BAND,	"&",		6 ),	\
-	C( OP_NOT,	"!",		6 ),	\
-	C( OP_NONE,	"OP_NONE",	0 ),	\
-	C( OP_OPEN_PAREN, "(",		0 ),	\
-	C( OP_MAX,	NULL,		0 )
+	C( OP_GLOB,	"~"  ),			\
+	C( OP_NE,	"!=" ),			\
+	C( OP_EQ,	"==" ),			\
+	C( OP_LE,	"<=" ),			\
+	C( OP_LT,	"<"  ),			\
+	C( OP_GE,	">=" ),			\
+	C( OP_GT,	">"  ),			\
+	C( OP_BAND,	"&"  ),			\
+	C( OP_MAX,	NULL )
 
 #undef C
-#define C(a, b, c)	a
+#define C(a, b)	a
 
 enum filter_op_ids { OPS };
 
-struct filter_op {
-	int id;
-	char *string;
-	int precedence;
-};
-
 #undef C
-#define C(a, b, c)	{ a, b, c }
+#define C(a, b)	b
 
-static struct filter_op filter_ops[] = { OPS };
+static const char * ops[] = { OPS };
 
 /*
- * pred functions are OP_LT, OP_LE, OP_GT, OP_GE, and OP_BAND
+ * pred functions are OP_LE, OP_LT, OP_GE, OP_GT, and OP_BAND
  * pred_funcs_##type below must match the order of them above.
  */
-#define PRED_FUNC_START			OP_LT
+#define PRED_FUNC_START			OP_LE
 #define PRED_FUNC_MAX			(OP_BAND - PRED_FUNC_START)
 
 #define ERRORS								\
-	C( NONE,	 	"No error"),				\
-	C( INVALID_OP,		"Invalid operator"),			\
-	C( UNBALANCED_PAREN,	"Unbalanced parens"),			\
-	C( TOO_MANY_OPERANDS,	"Too many operands"),			\
-	C( OPERAND_TOO_LONG,	"Operand too long"),			\
-	C( FIELD_NOT_FOUND,	"Field not found"),			\
-	C( ILLEGAL_FIELD_OP,	"Illegal operation for field type"),	\
-	C( ILLEGAL_INTVAL,	"Illegal integer value"),		\
-	C( BAD_SUBSYS_FILTER,	"Couldn't find or set field in one of a subsystem's events"), \
-	C( TOO_MANY_PREDS,	"Too many terms in predicate expression"), \
-	C( MISSING_FIELD,	"Missing field name and/or value"),	\
-	C( INVALID_FILTER,	"Meaningless filter expression"),	\
-	C( IP_FIELD_ONLY,	"Only 'ip' field is supported for function trace"), \
-	C( ILLEGAL_NOT_OP,	"Illegal use of '!'"),
+	C(NONE,			"No error"),				\
+	C(INVALID_OP,		"Invalid operator"),			\
+	C(TOO_MANY_OPEN,	"Too many '('"),			\
+	C(TOO_MANY_CLOSE,	"Too few '('"),				\
+	C(MISSING_QUOTE,	"Missing matching quote"),		\
+	C(OPERAND_TOO_LONG,	"Operand too long"),			\
+	C(EXPECT_STRING,	"Expecting string field"),		\
+	C(EXPECT_DIGIT,		"Expecting numeric field"),		\
+	C(ILLEGAL_FIELD_OP,	"Illegal operation for field type"),	\
+	C(FIELD_NOT_FOUND,	"Field not found"),			\
+	C(ILLEGAL_INTVAL,	"Illegal integer value"),		\
+	C(BAD_SUBSYS_FILTER,	"Couldn't find or set field in one of a subsystem's events"), \
+	C(TOO_MANY_PREDS,	"Too many terms in predicate expression"), \
+	C(INVALID_FILTER,	"Meaningless filter expression"),	\
+	C(IP_FIELD_ONLY,	"Only 'ip' field is supported for function trace"), \
+	C(INVALID_VALUE,	"Invalid value (did you forget quotes)?"),
 
 #undef C
 #define C(a, b)		FILT_ERR_##a
@@ -98,84 +90,535 @@ enum { ERRORS };
 
 static char *err_text[] = { ERRORS };
 
-struct opstack_op {
-	enum filter_op_ids op;
-	struct list_head list;
-};
+/* Called after a '!' character but "!=" and "!~" are not "not"s */
+static bool is_not(const char *str)
+{
+	switch (str[1]) {
+	case '=':
+	case '~':
+		return false;
+	}
+	return true;
+}
 
-struct postfix_elt {
-	enum filter_op_ids op;
-	char *operand;
-	struct list_head list;
+/**
+ * prog_entry - a singe entry in the filter program
+ * @target:	     Index to jump to on a branch (actually one minus the index)
+ * @when_to_branch:  The value of the result of the predicate to do a branch
+ * @pred:	     The predicate to execute.
+ */
+struct prog_entry {
+	int			target;
+	int			when_to_branch;
+	struct filter_pred	*pred;
 };
 
-struct filter_parse_state {
-	struct filter_op *ops;
-	struct list_head opstack;
-	struct list_head postfix;
+/**
+ * update_preds- assign a program entry a label target
+ * @prog: The program array
+ * @N: The index of the current entry in @prog
+ * @when_to_branch: What to assign a program entry for its branch condition
+ *
+ * The program entry at @N has a target that points to the index of a program
+ * entry that can have its target and when_to_branch fields updated.
+ * Update the current program entry denoted by index @N target field to be
+ * that of the updated entry. This will denote the entry to update if
+ * we are processing an "||" after an "&&"
+ */
+static void update_preds(struct prog_entry *prog, int N, int invert)
+{
+	int t, s;
+
+	t = prog[N].target;
+	s = prog[t].target;
+	prog[t].when_to_branch = invert;
+	prog[t].target = N;
+	prog[N].target = s;
+}
+
+struct filter_parse_error {
 	int lasterr;
 	int lasterr_pos;
-
-	struct {
-		char *string;
-		unsigned int cnt;
-		unsigned int tail;
-	} infix;
-
-	struct {
-		char string[MAX_FILTER_STR_VAL];
-		int pos;
-		unsigned int tail;
-	} operand;
 };
 
-struct pred_stack {
-	struct filter_pred	**preds;
-	int			index;
+static void parse_error(struct filter_parse_error *pe, int err, int pos)
+{
+	pe->lasterr = err;
+	pe->lasterr_pos = pos;
+}
+
+typedef int (*parse_pred_fn)(const char *str, void *data, int pos,
+			     struct filter_parse_error *pe,
+			     struct filter_pred **pred);
+
+enum {
+	INVERT		= 1,
+	PROCESS_AND	= 2,
+	PROCESS_OR	= 4,
 };
 
-/* If not of not match is equal to not of not, then it is a match */
+/*
+ * Without going into a formal proof, this explains the method that is used in
+ * parsing the logical expressions.
+ *
+ * For example, if we have: "a && !(!b || (c && g)) || d || e && !f"
+ * The first pass will convert it into the following program:
+ *
+ * n1: r=a;       l1: if (!r) goto l4;
+ * n2: r=b;       l2: if (!r) goto l4;
+ * n3: r=c; r=!r; l3: if (r) goto l4;
+ * n4: r=g; r=!r; l4: if (r) goto l5;
+ * n5: r=d;       l5: if (r) goto T
+ * n6: r=e;       l6: if (!r) goto l7;
+ * n7: r=f; r=!r; l7: if (!r) goto F
+ * T: return TRUE
+ * F: return FALSE
+ *
+ * To do this, we use a data structure to represent each of the above
+ * predicate and conditions that has:
+ *
+ *  predicate, when_to_branch, invert, target
+ *
+ * The "predicate" will hold the function to determine the result "r".
+ * The "when_to_branch" denotes what "r" should be if a branch is to be taken
+ * "&&" would contain "!r" or (0) and "||" would contain "r" or (1).
+ * The "invert" holds whether the value should be reversed before testing.
+ * The "target" contains the label "l#" to jump to.
+ *
+ * A stack is created to hold values when parentheses are used.
+ *
+ * To simplify the logic, the labels will start at 0 and not 1.
+ *
+ * The possible invert values are 1 and 0. The number of "!"s that are in scope
+ * before the predicate determines the invert value, if the number is odd then
+ * the invert value is 1 and 0 otherwise. This means the invert value only
+ * needs to be toggled when a new "!" is introduced compared to what is stored
+ * on the stack, where parentheses were used.
+ *
+ * The top of the stack and "invert" are initialized to zero.
+ *
+ * ** FIRST PASS **
+ *
+ * #1 A loop through all the tokens is done:
+ *
+ * #2 If the token is an "(", the stack is push, and the current stack value
+ *    gets the current invert value, and the loop continues to the next token.
+ *    The top of the stack saves the "invert" value to keep track of what
+ *    the current inversion is. As "!(a && !b || c)" would require all
+ *    predicates being affected separately by the "!" before the parentheses.
+ *    And that would end up being equivalent to "(!a || b) && !c"
+ *
+ * #3 If the token is an "!", the current "invert" value gets inverted, and
+ *    the loop continues. Note, if the next token is a predicate, then
+ *    this "invert" value is only valid for the current program entry,
+ *    and does not affect other predicates later on.
+ *
+ * The only other acceptable token is the predicate string.
+ *
+ * #4 A new entry into the program is added saving: the predicate and the
+ *    current value of "invert". The target is currently assigned to the
+ *    previous program index (this will not be its final value).
+ *
+ * #5 We now enter another loop and look at the next token. The only valid
+ *    tokens are ")", "&&", "||" or end of the input string "\0".
+ *
+ * #6 The invert variable is reset to the current value saved on the top of
+ *    the stack.
+ *
+ * #7 The top of the stack holds not only the current invert value, but also
+ *    if a "&&" or "||" needs to be processed. Note, the "&&" takes higher
+ *    precedence than "||". That is "a && b || c && d" is equivalent to
+ *    "(a && b) || (c && d)". Thus the first thing to do is to see if "&&" needs
+ *    to be processed. This is the case if an "&&" was the last token. If it was
+ *    then we call update_preds(). This takes the program, the current index in
+ *    the program, and the current value of "invert".  More will be described
+ *    below about this function.
+ *
+ * #8 If the next token is "&&" then we set a flag in the top of the stack
+ *    that denotes that "&&" needs to be processed, break out of this loop
+ *    and continue with the outer loop.
+ *
+ * #9 Otherwise, if a "||" needs to be processed then update_preds() is called.
+ *    This is called with the program, the current index in the program, but
+ *    this time with an inverted value of "invert" (that is !invert). This is
+ *    because the value taken will become the "when_to_branch" value of the
+ *    program.
+ *    Note, this is called when the next token is not an "&&". As stated before,
+ *    "&&" takes higher precedence, and "||" should not be processed yet if the
+ *    next logical operation is "&&".
+ *
+ * #10 If the next token is "||" then we set a flag in the top of the stack
+ *     that denotes that "||" needs to be processed, break out of this loop
+ *     and continue with the outer loop.
+ *
+ * #11 If this is the end of the input string "\0" then we break out of both
+ *     loops.
+ *
+ * #12 Otherwise, the next token is ")", where we pop the stack and continue
+ *     this inner loop.
+ *
+ * Now to discuss the update_pred() function, as that is key to the setting up
+ * of the program. Remember the "target" of the program is initialized to the
+ * previous index and not the "l" label. The target holds the index into the
+ * program that gets affected by the operand. Thus if we have something like
+ *  "a || b && c", when we process "a" the target will be "-1" (undefined).
+ * When we process "b", its target is "0", which is the index of "a", as that's
+ * the predicate that is affected by "||". But because the next token after "b"
+ * is "&&" we don't call update_preds(). Instead continue to "c". As the
+ * next token after "c" is not "&&" but the end of input, we first process the
+ * "&&" by calling update_preds() for the "&&" then we process the "||" by
+ * callin updates_preds() with the values for processing "||".
+ *
+ * What does that mean? What update_preds() does is to first save the "target"
+ * of the program entry indexed by the current program entry's "target"
+ * (remember the "target" is initialized to previous program entry), and then
+ * sets that "target" to the current index which represents the label "l#".
+ * That entry's "when_to_branch" is set to the value passed in (the "invert"
+ * or "!invert"). Then it sets the current program entry's target to the saved
+ * "target" value (the old value of the program that had its "target" updated
+ * to the label).
+ *
+ * Looking back at "a || b && c", we have the following steps:
+ *  "a"  - prog[0] = { "a", X, -1 } // pred, when_to_branch, target
+ *  "||" - flag that we need to process "||"; continue outer loop
+ *  "b"  - prog[1] = { "b", X, 0 }
+ *  "&&" - flag that we need to process "&&"; continue outer loop
+ * (Notice we did not process "||")
+ *  "c"  - prog[2] = { "c", X, 1 }
+ *  update_preds(prog, 2, 0); // invert = 0 as we are processing "&&"
+ *    t = prog[2].target; // t = 1
+ *    s = prog[t].target; // s = 0
+ *    prog[t].target = 2; // Set target to "l2"
+ *    prog[t].when_to_branch = 0;
+ *    prog[2].target = s;
+ * update_preds(prog, 2, 1); // invert = 1 as we are now processing "||"
+ *    t = prog[2].target; // t = 0
+ *    s = prog[t].target; // s = -1
+ *    prog[t].target = 2; // Set target to "l2"
+ *    prog[t].when_to_branch = 1;
+ *    prog[2].target = s;
+ *
+ * #13 Which brings us to the final step of the first pass, which is to set
+ *     the last program entry's when_to_branch and target, which will be
+ *     when_to_branch = 0; target = N; ( the label after the program entry after
+ *     the last program entry processed above).
+ *
+ * If we denote "TRUE" to be the entry after the last program entry processed,
+ * and "FALSE" the program entry after that, we are now done with the first
+ * pass.
+ *
+ * Making the above "a || b && c" have a progam of:
+ *  prog[0] = { "a", 1, 2 }
+ *  prog[1] = { "b", 0, 2 }
+ *  prog[2] = { "c", 0, 3 }
+ *
+ * Which translates into:
+ * n0: r = a; l0: if (r) goto l2;
+ * n1: r = b; l1: if (!r) goto l2;
+ * n2: r = c; l2: if (!r) goto l3;  // Which is the same as "goto F;"
+ * T: return TRUE; l3:
+ * F: return FALSE
+ *
+ * Although, after the first pass, the program is correct, it is
+ * inefficient. The simple sample of "a || b && c" could be easily been
+ * converted into:
+ * n0: r = a; if (r) goto T
+ * n1: r = b; if (!r) goto F
+ * n2: r = c; if (!r) goto F
+ * T: return TRUE;
+ * F: return FALSE;
+ *
+ * The First Pass is over the input string. The next too passes are over
+ * the program itself.
+ *
+ * ** SECOND PASS **
+ *
+ * Which brings us to the second pass. If a jump to a label has the
+ * same condition as that label, it can instead jump to its target.
+ * The original example of "a && !(!b || (c && g)) || d || e && !f"
+ * where the first pass gives us:
+ *
+ * n1: r=a;       l1: if (!r) goto l4;
+ * n2: r=b;       l2: if (!r) goto l4;
+ * n3: r=c; r=!r; l3: if (r) goto l4;
+ * n4: r=g; r=!r; l4: if (r) goto l5;
+ * n5: r=d;       l5: if (r) goto T
+ * n6: r=e;       l6: if (!r) goto l7;
+ * n7: r=f; r=!r; l7: if (!r) goto F:
+ * T: return TRUE;
+ * F: return FALSE
+ *
+ * We can see that "l3: if (r) goto l4;" and at l4, we have "if (r) goto l5;".
+ * And "l5: if (r) goto T", we could optimize this by converting l3 and l4
+ * to go directly to T. To accomplish this, we start from the last
+ * entry in the program and work our way back. If the target of the entry
+ * has the same "when_to_branch" then we could use that entry's target.
+ * Doing this, the above would end up as:
+ *
+ * n1: r=a;       l1: if (!r) goto l4;
+ * n2: r=b;       l2: if (!r) goto l4;
+ * n3: r=c; r=!r; l3: if (r) goto T;
+ * n4: r=g; r=!r; l4: if (r) goto T;
+ * n5: r=d;       l5: if (r) goto T;
+ * n6: r=e;       l6: if (!r) goto F;
+ * n7: r=f; r=!r; l7: if (!r) goto F;
+ * T: return TRUE
+ * F: return FALSE
+ *
+ * In that same pass, if the "when_to_branch" doesn't match, we can simply
+ * go to the program entry after the label. That is, "l2: if (!r) goto l4;"
+ * where "l4: if (r) goto T;", then we can convert l2 to be:
+ * "l2: if (!r) goto n5;".
+ *
+ * This will have the second pass give us:
+ * n1: r=a;       l1: if (!r) goto n5;
+ * n2: r=b;       l2: if (!r) goto n5;
+ * n3: r=c; r=!r; l3: if (r) goto T;
+ * n4: r=g; r=!r; l4: if (r) goto T;
+ * n5: r=d;       l5: if (r) goto T
+ * n6: r=e;       l6: if (!r) goto F;
+ * n7: r=f; r=!r; l7: if (!r) goto F
+ * T: return TRUE
+ * F: return FALSE
+ *
+ * Notice, all the "l#" labels are no longer used, and they can now
+ * be discarded.
+ *
+ * ** THIRD PASS **
+ *
+ * For the third pass we deal with the inverts. As they simply just
+ * make the "when_to_branch" get inverted, a simple loop over the
+ * program to that does: "when_to_branch ^= invert;" will do the
+ * job, leaving us with:
+ * n1: r=a; if (!r) goto n5;
+ * n2: r=b; if (!r) goto n5;
+ * n3: r=c: if (!r) goto T;
+ * n4: r=g; if (!r) goto T;
+ * n5: r=d; if (r) goto T
+ * n6: r=e; if (!r) goto F;
+ * n7: r=f; if (r) goto F
+ * T: return TRUE
+ * F: return FALSE
+ *
+ * As "r = a; if (!r) goto n5;" is obviously the same as
+ * "if (!a) goto n5;" without doing anything we can interperate the
+ * program as:
+ * n1: if (!a) goto n5;
+ * n2: if (!b) goto n5;
+ * n3: if (!c) goto T;
+ * n4: if (!g) goto T;
+ * n5: if (d) goto T
+ * n6: if (!e) goto F;
+ * n7: if (f) goto F
+ * T: return TRUE
+ * F: return FALSE
+ *
+ * Since the inverts are discarded at the end, there's no reason to store
+ * them in the program array (and waste memory). A separate array to hold
+ * the inverts is used and freed at the end.
+ */
+static struct prog_entry *
+predicate_parse(const char *str, int nr_parens, int nr_preds,
+		parse_pred_fn parse_pred, void *data,
+		struct filter_parse_error *pe)
+{
+	struct prog_entry *prog_stack;
+	struct prog_entry *prog;
+	const char *ptr = str;
+	char *inverts = NULL;
+	int *op_stack;
+	int *top;
+	int invert = 0;
+	int ret = -ENOMEM;
+	int len;
+	int N = 0;
+	int i;
+
+	nr_preds += 2; /* For TRUE and FALSE */
+
+	op_stack = kmalloc(sizeof(*op_stack) * nr_parens, GFP_KERNEL);
+	if (!op_stack)
+		return ERR_PTR(-ENOMEM);
+	prog_stack = kmalloc(sizeof(*prog_stack) * nr_preds, GFP_KERNEL);
+	if (!prog_stack) {
+		parse_error(pe, -ENOMEM, 0);
+		goto out_free;
+	}
+	inverts = kmalloc(sizeof(*inverts) * nr_preds, GFP_KERNEL);
+	if (!inverts) {
+		parse_error(pe, -ENOMEM, 0);
+		goto out_free;
+	}
+
+	top = op_stack;
+	prog = prog_stack;
+	*top = 0;
+
+	/* First pass */
+	while (*ptr) {						/* #1 */
+		const char *next = ptr++;
+
+		if (isspace(*next))
+			continue;
+
+		switch (*next) {
+		case '(':					/* #2 */
+			if (top - op_stack > nr_parens)
+				return ERR_PTR(-EINVAL);
+			*(++top) = invert;
+			continue;
+		case '!':					/* #3 */
+			if (!is_not(next))
+				break;
+			invert = !invert;
+			continue;
+		}
+
+		if (N >= nr_preds) {
+			parse_error(pe, FILT_ERR_TOO_MANY_PREDS, next - str);
+			goto out_free;
+		}
+
+		inverts[N] = invert;				/* #4 */
+		prog[N].target = N-1;
+
+		len = parse_pred(next, data, ptr - str, pe, &prog[N].pred);
+		if (len < 0) {
+			ret = len;
+			goto out_free;
+		}
+		ptr = next + len;
+
+		N++;
+
+		ret = -1;
+		while (1) {					/* #5 */
+			next = ptr++;
+			if (isspace(*next))
+				continue;
+
+			switch (*next) {
+			case ')':
+			case '\0':
+				break;
+			case '&':
+			case '|':
+				if (next[1] == next[0]) {
+					ptr++;
+					break;
+				}
+			default:
+				parse_error(pe, FILT_ERR_TOO_MANY_PREDS,
+					    next - str);
+				goto out_free;
+			}
+
+			invert = *top & INVERT;
+
+			if (*top & PROCESS_AND) {		/* #7 */
+				update_preds(prog, N - 1, invert);
+				*top &= ~PROCESS_AND;
+			}
+			if (*next == '&') {			/* #8 */
+				*top |= PROCESS_AND;
+				break;
+			}
+			if (*top & PROCESS_OR) {		/* #9 */
+				update_preds(prog, N - 1, !invert);
+				*top &= ~PROCESS_OR;
+			}
+			if (*next == '|') {			/* #10 */
+				*top |= PROCESS_OR;
+				break;
+			}
+			if (!*next)				/* #11 */
+				goto out;
+
+			if (top == op_stack) {
+				ret = -1;
+				/* Too few '(' */
+				parse_error(pe, FILT_ERR_TOO_MANY_CLOSE, ptr - str);
+				goto out_free;
+			}
+			top--;					/* #12 */
+		}
+	}
+ out:
+	if (top != op_stack) {
+		/* Too many '(' */
+		parse_error(pe, FILT_ERR_TOO_MANY_OPEN, ptr - str);
+		goto out_free;
+	}
+
+	prog[N].pred = NULL;					/* #13 */
+	prog[N].target = 1;		/* TRUE */
+	prog[N+1].pred = NULL;
+	prog[N+1].target = 0;		/* FALSE */
+	prog[N-1].target = N;
+	prog[N-1].when_to_branch = false;
+
+	/* Second Pass */
+	for (i = N-1 ; i--; ) {
+		int target = prog[i].target;
+		if (prog[i].when_to_branch == prog[target].when_to_branch)
+			prog[i].target = prog[target].target;
+	}
+
+	/* Third Pass */
+	for (i = 0; i < N; i++) {
+		invert = inverts[i] ^ prog[i].when_to_branch;
+		prog[i].when_to_branch = invert;
+		/* Make sure the program always moves forward */
+		if (WARN_ON(prog[i].target <= i)) {
+			ret = -EINVAL;
+			goto out_free;
+		}
+	}
+
+	return prog;
+out_free:
+	kfree(op_stack);
+	kfree(prog_stack);
+	kfree(inverts);
+	return ERR_PTR(ret);
+}
+
 #define DEFINE_COMPARISON_PRED(type)					\
 static int filter_pred_LT_##type(struct filter_pred *pred, void *event)	\
 {									\
 	type *addr = (type *)(event + pred->offset);			\
 	type val = (type)pred->val;					\
-	int match = (*addr < val);					\
-	return !!match == !pred->not;					\
+	return *addr < val;						\
 }									\
 static int filter_pred_LE_##type(struct filter_pred *pred, void *event)	\
 {									\
 	type *addr = (type *)(event + pred->offset);			\
 	type val = (type)pred->val;					\
-	int match = (*addr <= val);					\
-	return !!match == !pred->not;					\
+	return *addr <= val;						\
 }									\
 static int filter_pred_GT_##type(struct filter_pred *pred, void *event)	\
 {									\
 	type *addr = (type *)(event + pred->offset);			\
 	type val = (type)pred->val;					\
-	int match = (*addr > val);					\
-	return !!match == !pred->not;					\
+	return *addr > val;					\
 }									\
 static int filter_pred_GE_##type(struct filter_pred *pred, void *event)	\
 {									\
 	type *addr = (type *)(event + pred->offset);			\
 	type val = (type)pred->val;					\
-	int match = (*addr >= val);					\
-	return !!match == !pred->not;					\
+	return *addr >= val;						\
 }									\
 static int filter_pred_BAND_##type(struct filter_pred *pred, void *event) \
 {									\
 	type *addr = (type *)(event + pred->offset);			\
 	type val = (type)pred->val;					\
-	int match = !!(*addr & val);					\
-	return match == !pred->not;					\
+	return !!(*addr & val);						\
 }									\
 static const filter_pred_fn_t pred_funcs_##type[] = {			\
-	filter_pred_LT_##type,						\
 	filter_pred_LE_##type,						\
-	filter_pred_GT_##type,						\
+	filter_pred_LT_##type,						\
 	filter_pred_GE_##type,						\
+	filter_pred_GT_##type,						\
 	filter_pred_BAND_##type,					\
 };
 
@@ -261,44 +704,36 @@ static int filter_pred_strloc(struct filter_pred *pred, void *event)
 static int filter_pred_cpu(struct filter_pred *pred, void *event)
 {
 	int cpu, cmp;
-	int match = 0;
 
 	cpu = raw_smp_processor_id();
 	cmp = pred->val;
 
 	switch (pred->op) {
 	case OP_EQ:
-		match = cpu == cmp;
-		break;
+		return cpu == cmp;
+	case OP_NE:
+		return cpu != cmp;
 	case OP_LT:
-		match = cpu < cmp;
-		break;
+		return cpu < cmp;
 	case OP_LE:
-		match = cpu <= cmp;
-		break;
+		return cpu <= cmp;
 	case OP_GT:
-		match = cpu > cmp;
-		break;
+		return cpu > cmp;
 	case OP_GE:
-		match = cpu >= cmp;
-		break;
+		return cpu >= cmp;
 	default:
-		break;
+		return 0;
 	}
-
-	return !!match == !pred->not;
 }
 
 /* Filter predicate for COMM. */
 static int filter_pred_comm(struct filter_pred *pred, void *event)
 {
-	int cmp, match;
+	int cmp;
 
 	cmp = pred->regex.match(current->comm, &pred->regex,
-				pred->regex.field_len);
-	match = cmp ^ pred->not;
-
-	return match;
+				TASK_COMM_LEN);
+	return cmp ^ pred->not;
 }
 
 static int filter_pred_none(struct filter_pred *pred, void *event)
@@ -355,6 +790,7 @@ static int regex_match_glob(char *str, struct regex *r, int len __maybe_unused)
 		return 1;
 	return 0;
 }
+
 /**
  * filter_parse_regex - parse a basic regex
  * @buff:   the raw regex
@@ -415,10 +851,9 @@ static void filter_build_regex(struct filter_pred *pred)
 	struct regex *r = &pred->regex;
 	char *search;
 	enum regex_type type = MATCH_FULL;
-	int not = 0;
 
 	if (pred->op == OP_GLOB) {
-		type = filter_parse_regex(r->pattern, r->len, &search, &not);
+		type = filter_parse_regex(r->pattern, r->len, &search, &pred->not);
 		r->len = strlen(search);
 		memmove(r->pattern, search, r->len+1);
 	}
@@ -440,210 +875,32 @@ static void filter_build_regex(struct filter_pred *pred)
 		r->match = regex_match_glob;
 		break;
 	}
-
-	pred->not ^= not;
-}
-
-enum move_type {
-	MOVE_DOWN,
-	MOVE_UP_FROM_LEFT,
-	MOVE_UP_FROM_RIGHT
-};
-
-static struct filter_pred *
-get_pred_parent(struct filter_pred *pred, struct filter_pred *preds,
-		int index, enum move_type *move)
-{
-	if (pred->parent & FILTER_PRED_IS_RIGHT)
-		*move = MOVE_UP_FROM_RIGHT;
-	else
-		*move = MOVE_UP_FROM_LEFT;
-	pred = &preds[pred->parent & ~FILTER_PRED_IS_RIGHT];
-
-	return pred;
-}
-
-enum walk_return {
-	WALK_PRED_ABORT,
-	WALK_PRED_PARENT,
-	WALK_PRED_DEFAULT,
-};
-
-typedef int (*filter_pred_walkcb_t) (enum move_type move,
-				     struct filter_pred *pred,
-				     int *err, void *data);
-
-static int walk_pred_tree(struct filter_pred *preds,
-			  struct filter_pred *root,
-			  filter_pred_walkcb_t cb, void *data)
-{
-	struct filter_pred *pred = root;
-	enum move_type move = MOVE_DOWN;
-	int done = 0;
-
-	if  (!preds)
-		return -EINVAL;
-
-	do {
-		int err = 0, ret;
-
-		ret = cb(move, pred, &err, data);
-		if (ret == WALK_PRED_ABORT)
-			return err;
-		if (ret == WALK_PRED_PARENT)
-			goto get_parent;
-
-		switch (move) {
-		case MOVE_DOWN:
-			if (pred->left != FILTER_PRED_INVALID) {
-				pred = &preds[pred->left];
-				continue;
-			}
-			goto get_parent;
-		case MOVE_UP_FROM_LEFT:
-			pred = &preds[pred->right];
-			move = MOVE_DOWN;
-			continue;
-		case MOVE_UP_FROM_RIGHT:
- get_parent:
-			if (pred == root)
-				break;
-			pred = get_pred_parent(pred, preds,
-					       pred->parent,
-					       &move);
-			continue;
-		}
-		done = 1;
-	} while (!done);
-
-	/* We are fine. */
-	return 0;
-}
-
-/*
- * A series of AND or ORs where found together. Instead of
- * climbing up and down the tree branches, an array of the
- * ops were made in order of checks. We can just move across
- * the array and short circuit if needed.
- */
-static int process_ops(struct filter_pred *preds,
-		       struct filter_pred *op, void *rec)
-{
-	struct filter_pred *pred;
-	int match = 0;
-	int type;
-	int i;
-
-	/*
-	 * Micro-optimization: We set type to true if op
-	 * is an OR and false otherwise (AND). Then we
-	 * just need to test if the match is equal to
-	 * the type, and if it is, we can short circuit the
-	 * rest of the checks:
-	 *
-	 * if ((match && op->op == OP_OR) ||
-	 *     (!match && op->op == OP_AND))
-	 *	  return match;
-	 */
-	type = op->op == OP_OR;
-
-	for (i = 0; i < op->val; i++) {
-		pred = &preds[op->ops[i]];
-		if (!WARN_ON_ONCE(!pred->fn))
-			match = pred->fn(pred, rec);
-		if (!!match == type)
-			break;
-	}
-	/* If not of not match is equal to not of not, then it is a match */
-	return !!match == !op->not;
-}
-
-struct filter_match_preds_data {
-	struct filter_pred *preds;
-	int match;
-	void *rec;
-};
-
-static int filter_match_preds_cb(enum move_type move, struct filter_pred *pred,
-				 int *err, void *data)
-{
-	struct filter_match_preds_data *d = data;
-
-	*err = 0;
-	switch (move) {
-	case MOVE_DOWN:
-		/* only AND and OR have children */
-		if (pred->left != FILTER_PRED_INVALID) {
-			/* If ops is set, then it was folded. */
-			if (!pred->ops)
-				return WALK_PRED_DEFAULT;
-			/* We can treat folded ops as a leaf node */
-			d->match = process_ops(d->preds, pred, d->rec);
-		} else {
-			if (!WARN_ON_ONCE(!pred->fn))
-				d->match = pred->fn(pred, d->rec);
-		}
-
-		return WALK_PRED_PARENT;
-	case MOVE_UP_FROM_LEFT:
-		/*
-		 * Check for short circuits.
-		 *
-		 * Optimization: !!match == (pred->op == OP_OR)
-		 *   is the same as:
-		 * if ((match && pred->op == OP_OR) ||
-		 *     (!match && pred->op == OP_AND))
-		 */
-		if (!!d->match == (pred->op == OP_OR))
-			return WALK_PRED_PARENT;
-		break;
-	case MOVE_UP_FROM_RIGHT:
-		break;
-	}
-
-	return WALK_PRED_DEFAULT;
 }
 
 /* return 1 if event matches, 0 otherwise (discard) */
 int filter_match_preds(struct event_filter *filter, void *rec)
 {
-	struct filter_pred *preds;
-	struct filter_pred *root;
-	struct filter_match_preds_data data = {
-		/* match is currently meaningless */
-		.match = -1,
-		.rec   = rec,
-	};
-	int n_preds, ret;
+	struct prog_entry *prog;
+	int i;
 
 	/* no filter is considered a match */
 	if (!filter)
 		return 1;
 
-	n_preds = filter->n_preds;
-	if (!n_preds)
+	prog = rcu_dereference_sched(filter->prog);
+	if (!prog)
 		return 1;
 
-	/*
-	 * n_preds, root and filter->preds are protect with preemption disabled.
-	 */
-	root = rcu_dereference_sched(filter->root);
-	if (!root)
-		return 1;
-
-	data.preds = preds = rcu_dereference_sched(filter->preds);
-	ret = walk_pred_tree(preds, root, filter_match_preds_cb, &data);
-	WARN_ON(ret);
-	return data.match;
+	for (i = 0; prog[i].pred; i++) {
+		struct filter_pred *pred = prog[i].pred;
+		int match = pred->fn(pred, rec);
+		if (match == prog[i].when_to_branch)
+			i = prog[i].target;
+	}
+	return prog[i].target;
 }
 EXPORT_SYMBOL_GPL(filter_match_preds);
 
-static void parse_error(struct filter_parse_state *ps, int err, int pos)
-{
-	ps->lasterr = err;
-	ps->lasterr_pos = pos;
-}
-
 static void remove_filter_string(struct event_filter *filter)
 {
 	if (!filter)
@@ -653,11 +910,11 @@ static void remove_filter_string(struct event_filter *filter)
 	filter->filter_string = NULL;
 }
 
-static void append_filter_err(struct filter_parse_state *ps,
+static void append_filter_err(struct filter_parse_error *pe,
 			      struct event_filter *filter)
 {
 	struct trace_seq *s;
-	int pos = ps->lasterr_pos;
+	int pos = pe->lasterr_pos;
 	char *buf;
 	int len;
 
@@ -671,11 +928,19 @@ static void append_filter_err(struct filter_parse_state *ps,
 
 	len = strlen(filter->filter_string);
 	if (pos > len)
-		len = pos;
+		pos = len;
+
+	/* indexing is off by one */
+	if (pos)
+		pos++;
 
 	trace_seq_puts(s, filter->filter_string);
-	trace_seq_printf(s, "\n%*s", pos, "^");
-	trace_seq_printf(s, "\nparse_error: %s\n", err_text[ps->lasterr]);
+	if (pe->lasterr > 0) {
+		trace_seq_printf(s, "\n%*s", pos, "^");
+		trace_seq_printf(s, "\nparse_error: %s\n", err_text[pe->lasterr]);
+	} else {
+		trace_seq_printf(s, "\nError: (%d)\n", pe->lasterr);
+	}
 	trace_seq_putc(s, 0);
 	buf = kmemdup_nul(s->buffer, s->seq.len, GFP_KERNEL);
 	if (buf) {
@@ -715,108 +980,16 @@ void print_subsystem_event_filter(struct event_subsystem *system,
 	mutex_unlock(&event_mutex);
 }
 
-static int __alloc_pred_stack(struct pred_stack *stack, int n_preds)
-{
-	stack->preds = kcalloc(n_preds + 1, sizeof(*stack->preds), GFP_KERNEL);
-	if (!stack->preds)
-		return -ENOMEM;
-	stack->index = n_preds;
-	return 0;
-}
-
-static void __free_pred_stack(struct pred_stack *stack)
-{
-	kfree(stack->preds);
-	stack->index = 0;
-}
-
-static int __push_pred_stack(struct pred_stack *stack,
-			     struct filter_pred *pred)
-{
-	int index = stack->index;
-
-	if (WARN_ON(index == 0))
-		return -ENOSPC;
-
-	stack->preds[--index] = pred;
-	stack->index = index;
-	return 0;
-}
-
-static struct filter_pred *
-__pop_pred_stack(struct pred_stack *stack)
-{
-	struct filter_pred *pred;
-	int index = stack->index;
-
-	pred = stack->preds[index++];
-	if (!pred)
-		return NULL;
-
-	stack->index = index;
-	return pred;
-}
-
-static int filter_set_pred(struct event_filter *filter,
-			   int idx,
-			   struct pred_stack *stack,
-			   struct filter_pred *src)
-{
-	struct filter_pred *dest = &filter->preds[idx];
-	struct filter_pred *left;
-	struct filter_pred *right;
-
-	*dest = *src;
-	dest->index = idx;
-
-	if (dest->op == OP_OR || dest->op == OP_AND) {
-		right = __pop_pred_stack(stack);
-		left = __pop_pred_stack(stack);
-		if (!left || !right)
-			return -EINVAL;
-		/*
-		 * If both children can be folded
-		 * and they are the same op as this op or a leaf,
-		 * then this op can be folded.
-		 */
-		if (left->index & FILTER_PRED_FOLD &&
-		    ((left->op == dest->op && !left->not) ||
-		     left->left == FILTER_PRED_INVALID) &&
-		    right->index & FILTER_PRED_FOLD &&
-		    ((right->op == dest->op && !right->not) ||
-		     right->left == FILTER_PRED_INVALID))
-			dest->index |= FILTER_PRED_FOLD;
-
-		dest->left = left->index & ~FILTER_PRED_FOLD;
-		dest->right = right->index & ~FILTER_PRED_FOLD;
-		left->parent = dest->index & ~FILTER_PRED_FOLD;
-		right->parent = dest->index | FILTER_PRED_IS_RIGHT;
-	} else {
-		/*
-		 * Make dest->left invalid to be used as a quick
-		 * way to know this is a leaf node.
-		 */
-		dest->left = FILTER_PRED_INVALID;
-
-		/* All leafs allow folding the parent ops. */
-		dest->index |= FILTER_PRED_FOLD;
-	}
-
-	return __push_pred_stack(stack, dest);
-}
-
-static void __free_preds(struct event_filter *filter)
+static void free_prog(struct event_filter *filter)
 {
+	struct prog_entry *prog = filter->prog;
 	int i;
 
-	if (filter->preds) {
-		for (i = 0; i < filter->n_preds; i++)
-			kfree(filter->preds[i].ops);
-		kfree(filter->preds);
-		filter->preds = NULL;
-	}
-	filter->a_preds = 0;
-	filter->n_preds = 0;
+	if (!prog)
+		return;
+
+	for (i = 0; prog[i].pred; i++)
+		kfree(prog[i].pred);
 }
 
 static void filter_disable(struct trace_event_file *file)
@@ -834,7 +1007,7 @@ static void __free_filter(struct event_filter *filter)
 	if (!filter)
 		return;
 
-	__free_preds(filter);
+	free_prog(filter);
 	kfree(filter->filter_string);
 	kfree(filter);
 }
@@ -844,30 +1017,6 @@ void free_event_filter(struct event_filter *filter)
 	__free_filter(filter);
 }
 
-static int __alloc_preds(struct event_filter *filter, int n_preds)
-{
-	struct filter_pred *pred;
-	int i;
-
-	if (filter->preds)
-		__free_preds(filter);
-
-	filter->preds = kcalloc(n_preds, sizeof(*filter->preds), GFP_KERNEL);
-
-	if (!filter->preds)
-		return -ENOMEM;
-
-	filter->a_preds = n_preds;
-	filter->n_preds = 0;
-
-	for (i = 0; i < n_preds; i++) {
-		pred = &filter->preds[i];
-		pred->fn = filter_pred_none;
-	}
-
-	return 0;
-}
-
 static inline void __remove_filter(struct trace_event_file *file)
 {
 	filter_disable(file);
@@ -897,813 +1046,464 @@ static void filter_free_subsystem_filters(struct trace_subsystem_dir *dir,
 {
 	struct trace_event_file *file;
 
-	list_for_each_entry(file, &tr->events, list) {
-		if (file->system != dir)
-			continue;
-		__free_subsystem_filter(file);
-	}
-}
-
-static int filter_add_pred(struct filter_parse_state *ps,
-			   struct event_filter *filter,
-			   struct filter_pred *pred,
-			   struct pred_stack *stack)
-{
-	int err;
-
-	if (WARN_ON(filter->n_preds == filter->a_preds)) {
-		parse_error(ps, FILT_ERR_TOO_MANY_PREDS, 0);
-		return -ENOSPC;
-	}
-
-	err = filter_set_pred(filter, filter->n_preds, stack, pred);
-	if (err)
-		return err;
-
-	filter->n_preds++;
-
-	return 0;
-}
-
-int filter_assign_type(const char *type)
-{
-	if (strstr(type, "__data_loc") && strstr(type, "char"))
-		return FILTER_DYN_STRING;
-
-	if (strchr(type, '[') && strstr(type, "char"))
-		return FILTER_STATIC_STRING;
-
-	return FILTER_OTHER;
-}
-
-static bool is_legal_op(struct ftrace_event_field *field, enum filter_op_ids op)
-{
-	if (is_string_field(field) &&
-	    (op != OP_EQ && op != OP_NE && op != OP_GLOB))
-		return false;
-	if (!is_string_field(field) && op == OP_GLOB)
-		return false;
-
-	return true;
-}
-
-static filter_pred_fn_t select_comparison_fn(enum filter_op_ids op,
-					    int field_size, int field_is_signed)
-{
-	filter_pred_fn_t fn = NULL;
-	int pred_func_index = -1;
-
-	switch (op) {
-	case OP_EQ:
-	case OP_NE:
-		break;
-	default:
-		if (WARN_ON_ONCE(op < PRED_FUNC_START))
-			return NULL;
-		pred_func_index = op - PRED_FUNC_START;
-		if (WARN_ON_ONCE(pred_func_index > PRED_FUNC_MAX))
-			return NULL;
-	}
-
-	switch (field_size) {
-	case 8:
-		if (pred_func_index < 0)
-			fn = filter_pred_64;
-		else if (field_is_signed)
-			fn = pred_funcs_s64[pred_func_index];
-		else
-			fn = pred_funcs_u64[pred_func_index];
-		break;
-	case 4:
-		if (pred_func_index < 0)
-			fn = filter_pred_32;
-		else if (field_is_signed)
-			fn = pred_funcs_s32[pred_func_index];
-		else
-			fn = pred_funcs_u32[pred_func_index];
-		break;
-	case 2:
-		if (pred_func_index < 0)
-			fn = filter_pred_16;
-		else if (field_is_signed)
-			fn = pred_funcs_s16[pred_func_index];
-		else
-			fn = pred_funcs_u16[pred_func_index];
-		break;
-	case 1:
-		if (pred_func_index < 0)
-			fn = filter_pred_8;
-		else if (field_is_signed)
-			fn = pred_funcs_s8[pred_func_index];
-		else
-			fn = pred_funcs_u8[pred_func_index];
-		break;
-	}
-
-	return fn;
-}
-
-static int init_pred(struct filter_parse_state *ps,
-		     struct ftrace_event_field *field,
-		     struct filter_pred *pred)
-
-{
-	filter_pred_fn_t fn = filter_pred_none;
-	unsigned long long val;
-	int ret;
-
-	pred->offset = field->offset;
-
-	if (!is_legal_op(field, pred->op)) {
-		parse_error(ps, FILT_ERR_ILLEGAL_FIELD_OP, 0);
-		return -EINVAL;
-	}
-
-	if (field->filter_type == FILTER_COMM) {
-		filter_build_regex(pred);
-		fn = filter_pred_comm;
-		pred->regex.field_len = TASK_COMM_LEN;
-	} else if (is_string_field(field)) {
-		filter_build_regex(pred);
-
-		if (field->filter_type == FILTER_STATIC_STRING) {
-			fn = filter_pred_string;
-			pred->regex.field_len = field->size;
-		} else if (field->filter_type == FILTER_DYN_STRING)
-			fn = filter_pred_strloc;
-		else
-			fn = filter_pred_pchar;
-	} else if (is_function_field(field)) {
-		if (strcmp(field->name, "ip")) {
-			parse_error(ps, FILT_ERR_IP_FIELD_ONLY, 0);
-			return -EINVAL;
-		}
-	} else {
-		if (field->is_signed)
-			ret = kstrtoll(pred->regex.pattern, 0, &val);
-		else
-			ret = kstrtoull(pred->regex.pattern, 0, &val);
-		if (ret) {
-			parse_error(ps, FILT_ERR_ILLEGAL_INTVAL, 0);
-			return -EINVAL;
-		}
-		pred->val = val;
-
-		if (field->filter_type == FILTER_CPU)
-			fn = filter_pred_cpu;
-		else
-			fn = select_comparison_fn(pred->op, field->size,
-					  field->is_signed);
-		if (!fn) {
-			parse_error(ps, FILT_ERR_INVALID_OP, 0);
-			return -EINVAL;
-		}
-	}
-
-	if (pred->op == OP_NE)
-		pred->not ^= 1;
-
-	pred->fn = fn;
-	return 0;
-}
-
-static void parse_init(struct filter_parse_state *ps,
-		       struct filter_op *ops,
-		       char *infix_string)
-{
-	memset(ps, '\0', sizeof(*ps));
-
-	ps->infix.string = infix_string;
-	ps->infix.cnt = strlen(infix_string);
-	ps->ops = ops;
-
-	INIT_LIST_HEAD(&ps->opstack);
-	INIT_LIST_HEAD(&ps->postfix);
-}
-
-static char infix_next(struct filter_parse_state *ps)
-{
-	if (!ps->infix.cnt)
-		return 0;
-
-	ps->infix.cnt--;
-
-	return ps->infix.string[ps->infix.tail++];
-}
-
-static char infix_peek(struct filter_parse_state *ps)
-{
-	if (ps->infix.tail == strlen(ps->infix.string))
-		return 0;
-
-	return ps->infix.string[ps->infix.tail];
-}
-
-static void infix_advance(struct filter_parse_state *ps)
-{
-	if (!ps->infix.cnt)
-		return;
-
-	ps->infix.cnt--;
-	ps->infix.tail++;
-}
-
-static inline int is_precedence_lower(struct filter_parse_state *ps,
-				      int a, int b)
-{
-	return ps->ops[a].precedence < ps->ops[b].precedence;
-}
-
-static inline int is_op_char(struct filter_parse_state *ps, char c)
-{
-	int i;
-
-	for (i = 0; strcmp(ps->ops[i].string, "OP_NONE"); i++) {
-		if (ps->ops[i].string[0] == c)
-			return 1;
-	}
-
-	return 0;
-}
-
-static int infix_get_op(struct filter_parse_state *ps, char firstc)
-{
-	char nextc = infix_peek(ps);
-	char opstr[3];
-	int i;
-
-	opstr[0] = firstc;
-	opstr[1] = nextc;
-	opstr[2] = '\0';
-
-	for (i = 0; strcmp(ps->ops[i].string, "OP_NONE"); i++) {
-		if (!strcmp(opstr, ps->ops[i].string)) {
-			infix_advance(ps);
-			return ps->ops[i].id;
-		}
-	}
-
-	opstr[1] = '\0';
-
-	for (i = 0; strcmp(ps->ops[i].string, "OP_NONE"); i++) {
-		if (!strcmp(opstr, ps->ops[i].string))
-			return ps->ops[i].id;
-	}
-
-	return OP_NONE;
-}
-
-static inline void clear_operand_string(struct filter_parse_state *ps)
-{
-	memset(ps->operand.string, '\0', MAX_FILTER_STR_VAL);
-	ps->operand.tail = 0;
-}
-
-static inline int append_operand_char(struct filter_parse_state *ps, char c)
-{
-	if (ps->operand.tail == MAX_FILTER_STR_VAL - 1)
-		return -EINVAL;
-
-	ps->operand.string[ps->operand.tail++] = c;
-
-	return 0;
-}
-
-static int filter_opstack_push(struct filter_parse_state *ps,
-			       enum filter_op_ids op)
-{
-	struct opstack_op *opstack_op;
-
-	opstack_op = kmalloc(sizeof(*opstack_op), GFP_KERNEL);
-	if (!opstack_op)
-		return -ENOMEM;
-
-	opstack_op->op = op;
-	list_add(&opstack_op->list, &ps->opstack);
-
-	return 0;
-}
-
-static int filter_opstack_empty(struct filter_parse_state *ps)
-{
-	return list_empty(&ps->opstack);
-}
-
-static int filter_opstack_top(struct filter_parse_state *ps)
-{
-	struct opstack_op *opstack_op;
-
-	if (filter_opstack_empty(ps))
-		return OP_NONE;
-
-	opstack_op = list_first_entry(&ps->opstack, struct opstack_op, list);
-
-	return opstack_op->op;
-}
-
-static int filter_opstack_pop(struct filter_parse_state *ps)
-{
-	struct opstack_op *opstack_op;
-	enum filter_op_ids op;
-
-	if (filter_opstack_empty(ps))
-		return OP_NONE;
-
-	opstack_op = list_first_entry(&ps->opstack, struct opstack_op, list);
-	op = opstack_op->op;
-	list_del(&opstack_op->list);
-
-	kfree(opstack_op);
-
-	return op;
-}
-
-static void filter_opstack_clear(struct filter_parse_state *ps)
-{
-	while (!filter_opstack_empty(ps))
-		filter_opstack_pop(ps);
-}
-
-static char *curr_operand(struct filter_parse_state *ps)
-{
-	return ps->operand.string;
-}
-
-static int postfix_append_operand(struct filter_parse_state *ps, char *operand)
-{
-	struct postfix_elt *elt;
-
-	elt = kmalloc(sizeof(*elt), GFP_KERNEL);
-	if (!elt)
-		return -ENOMEM;
-
-	elt->op = OP_NONE;
-	elt->operand = kstrdup(operand, GFP_KERNEL);
-	if (!elt->operand) {
-		kfree(elt);
-		return -ENOMEM;
-	}
-
-	list_add_tail(&elt->list, &ps->postfix);
-
-	return 0;
-}
-
-static int postfix_append_op(struct filter_parse_state *ps, enum filter_op_ids op)
-{
-	struct postfix_elt *elt;
-
-	elt = kmalloc(sizeof(*elt), GFP_KERNEL);
-	if (!elt)
-		return -ENOMEM;
-
-	elt->op = op;
-	elt->operand = NULL;
-
-	list_add_tail(&elt->list, &ps->postfix);
-
-	return 0;
-}
-
-static void postfix_clear(struct filter_parse_state *ps)
-{
-	struct postfix_elt *elt;
-
-	while (!list_empty(&ps->postfix)) {
-		elt = list_first_entry(&ps->postfix, struct postfix_elt, list);
-		list_del(&elt->list);
-		kfree(elt->operand);
-		kfree(elt);
-	}
-}
-
-static int filter_parse(struct filter_parse_state *ps)
-{
-	enum filter_op_ids op, top_op;
-	int in_string = 0;
-	char ch;
-
-	while ((ch = infix_next(ps))) {
-		if (ch == '"') {
-			in_string ^= 1;
-			continue;
-		}
-
-		if (in_string)
-			goto parse_operand;
-
-		if (isspace(ch))
-			continue;
-
-		if (is_op_char(ps, ch)) {
-			op = infix_get_op(ps, ch);
-			if (op == OP_NONE) {
-				parse_error(ps, FILT_ERR_INVALID_OP, 0);
-				return -EINVAL;
-			}
-
-			if (strlen(curr_operand(ps))) {
-				postfix_append_operand(ps, curr_operand(ps));
-				clear_operand_string(ps);
-			}
-
-			while (!filter_opstack_empty(ps)) {
-				top_op = filter_opstack_top(ps);
-				if (!is_precedence_lower(ps, top_op, op)) {
-					top_op = filter_opstack_pop(ps);
-					postfix_append_op(ps, top_op);
-					continue;
-				}
-				break;
-			}
-
-			filter_opstack_push(ps, op);
-			continue;
-		}
-
-		if (ch == '(') {
-			filter_opstack_push(ps, OP_OPEN_PAREN);
-			continue;
-		}
-
-		if (ch == ')') {
-			if (strlen(curr_operand(ps))) {
-				postfix_append_operand(ps, curr_operand(ps));
-				clear_operand_string(ps);
-			}
-
-			top_op = filter_opstack_pop(ps);
-			while (top_op != OP_NONE) {
-				if (top_op == OP_OPEN_PAREN)
-					break;
-				postfix_append_op(ps, top_op);
-				top_op = filter_opstack_pop(ps);
-			}
-			if (top_op == OP_NONE) {
-				parse_error(ps, FILT_ERR_UNBALANCED_PAREN, 0);
-				return -EINVAL;
-			}
+	list_for_each_entry(file, &tr->events, list) {
+		if (file->system != dir)
 			continue;
-		}
-parse_operand:
-		if (append_operand_char(ps, ch)) {
-			parse_error(ps, FILT_ERR_OPERAND_TOO_LONG, 0);
-			return -EINVAL;
-		}
+		__free_subsystem_filter(file);
 	}
+}
 
-	if (strlen(curr_operand(ps)))
-		postfix_append_operand(ps, curr_operand(ps));
+int filter_assign_type(const char *type)
+{
+	if (strstr(type, "__data_loc") && strstr(type, "char"))
+		return FILTER_DYN_STRING;
 
-	while (!filter_opstack_empty(ps)) {
-		top_op = filter_opstack_pop(ps);
-		if (top_op == OP_NONE)
-			break;
-		if (top_op == OP_OPEN_PAREN) {
-			parse_error(ps, FILT_ERR_UNBALANCED_PAREN, 0);
-			return -EINVAL;
-		}
-		postfix_append_op(ps, top_op);
-	}
+	if (strchr(type, '[') && strstr(type, "char"))
+		return FILTER_STATIC_STRING;
 
-	return 0;
+	return FILTER_OTHER;
 }
 
-static struct filter_pred *create_pred(struct filter_parse_state *ps,
-				       struct trace_event_call *call,
-				       enum filter_op_ids op,
-				       char *operand1, char *operand2)
+static filter_pred_fn_t select_comparison_fn(enum filter_op_ids op,
+					    int field_size, int field_is_signed)
 {
-	struct ftrace_event_field *field;
-	static struct filter_pred pred;
-
-	memset(&pred, 0, sizeof(pred));
-	pred.op = op;
-
-	if (op == OP_AND || op == OP_OR)
-		return &pred;
+	filter_pred_fn_t fn = NULL;
+	int pred_func_index = -1;
 
-	if (!operand1 || !operand2) {
-		parse_error(ps, FILT_ERR_MISSING_FIELD, 0);
-		return NULL;
+	switch (op) {
+	case OP_EQ:
+	case OP_NE:
+		break;
+	default:
+		if (WARN_ON_ONCE(op < PRED_FUNC_START))
+			return NULL;
+		pred_func_index = op - PRED_FUNC_START;
+		if (WARN_ON_ONCE(pred_func_index > PRED_FUNC_MAX))
+			return NULL;
 	}
 
-	field = trace_find_event_field(call, operand1);
-	if (!field) {
-		parse_error(ps, FILT_ERR_FIELD_NOT_FOUND, 0);
-		return NULL;
+	switch (field_size) {
+	case 8:
+		if (pred_func_index < 0)
+			fn = filter_pred_64;
+		else if (field_is_signed)
+			fn = pred_funcs_s64[pred_func_index];
+		else
+			fn = pred_funcs_u64[pred_func_index];
+		break;
+	case 4:
+		if (pred_func_index < 0)
+			fn = filter_pred_32;
+		else if (field_is_signed)
+			fn = pred_funcs_s32[pred_func_index];
+		else
+			fn = pred_funcs_u32[pred_func_index];
+		break;
+	case 2:
+		if (pred_func_index < 0)
+			fn = filter_pred_16;
+		else if (field_is_signed)
+			fn = pred_funcs_s16[pred_func_index];
+		else
+			fn = pred_funcs_u16[pred_func_index];
+		break;
+	case 1:
+		if (pred_func_index < 0)
+			fn = filter_pred_8;
+		else if (field_is_signed)
+			fn = pred_funcs_s8[pred_func_index];
+		else
+			fn = pred_funcs_u8[pred_func_index];
+		break;
 	}
 
-	strcpy(pred.regex.pattern, operand2);
-	pred.regex.len = strlen(pred.regex.pattern);
-	pred.field = field;
-	return init_pred(ps, field, &pred) ? NULL : &pred;
+	return fn;
 }
 
-static int check_preds(struct filter_parse_state *ps)
+/* Called when a predicate is encountered by predicate_parse() */
+static int parse_pred(const char *str, void *data,
+		      int pos, struct filter_parse_error *pe,
+		      struct filter_pred **pred_ptr)
 {
-	int n_normal_preds = 0, n_logical_preds = 0;
-	struct postfix_elt *elt;
-	int cnt = 0;
+	struct trace_event_call *call = data;
+	struct ftrace_event_field *field;
+	struct filter_pred *pred = NULL;
+	char num_buf[24];	/* Big enough to hold an address */
+	char *field_name;
+	char q;
+	u64 val;
+	int len;
+	int ret;
+	int op;
+	int s;
+	int i = 0;
 
-	list_for_each_entry(elt, &ps->postfix, list) {
-		if (elt->op == OP_NONE) {
-			cnt++;
-			continue;
-		}
+	/* First find the field to associate to */
+	while (isspace(str[i]))
+		i++;
+	s = i;
 
-		if (elt->op == OP_AND || elt->op == OP_OR) {
-			n_logical_preds++;
-			cnt--;
-			continue;
-		}
-		if (elt->op != OP_NOT)
-			cnt--;
-		n_normal_preds++;
-		/* all ops should have operands */
-		if (cnt < 0)
-			break;
-	}
+	while (isalnum(str[i]) || str[i] == '_')
+		i++;
+
+	len = i - s;
+
+	if (!len)
+		return -1;
 
-	if (cnt != 1 || !n_normal_preds || n_logical_preds >= n_normal_preds) {
-		parse_error(ps, FILT_ERR_INVALID_FILTER, 0);
+	field_name = kmemdup_nul(str + s, len, GFP_KERNEL);
+	if (!field_name)
+		return -ENOMEM;
+
+	/* Make sure that the field exists */
+
+	field = trace_find_event_field(call, field_name);
+	kfree(field_name);
+	if (!field) {
+		parse_error(pe, FILT_ERR_FIELD_NOT_FOUND, pos + i);
 		return -EINVAL;
 	}
 
-	return 0;
-}
+	while (isspace(str[i]))
+		i++;
 
-static int count_preds(struct filter_parse_state *ps)
-{
-	struct postfix_elt *elt;
-	int n_preds = 0;
+	/* Make sure this op is supported */
+	for (op = 0; ops[op]; op++) {
+		/* This is why '<=' must come before '<' in ops[] */
+		if (strncmp(str + i, ops[op], strlen(ops[op])) == 0)
+			break;
+	}
 
-	list_for_each_entry(elt, &ps->postfix, list) {
-		if (elt->op == OP_NONE)
-			continue;
-		n_preds++;
+	if (!ops[op]) {
+		parse_error(pe, FILT_ERR_INVALID_OP, pos + i);
+		goto err_free;
 	}
 
-	return n_preds;
-}
+	i += strlen(ops[op]);
 
-struct check_pred_data {
-	int count;
-	int max;
-};
+	while (isspace(str[i]))
+		i++;
 
-static int check_pred_tree_cb(enum move_type move, struct filter_pred *pred,
-			      int *err, void *data)
-{
-	struct check_pred_data *d = data;
+	s = i;
 
-	if (WARN_ON(d->count++ > d->max)) {
-		*err = -EINVAL;
-		return WALK_PRED_ABORT;
-	}
-	return WALK_PRED_DEFAULT;
-}
+	pred = kzalloc(sizeof(*pred), GFP_KERNEL);
+	if (!pred)
+		return -ENOMEM;
 
-/*
- * The tree is walked at filtering of an event. If the tree is not correctly
- * built, it may cause an infinite loop. Check here that the tree does
- * indeed terminate.
- */
-static int check_pred_tree(struct event_filter *filter,
-			   struct filter_pred *root)
-{
-	struct check_pred_data data = {
+	pred->field = field;
+	pred->offset = field->offset;
+	pred->op = op;
+
+	if (ftrace_event_is_function(call)) {
 		/*
-		 * The max that we can hit a node is three times.
-		 * Once going down, once coming up from left, and
-		 * once coming up from right. This is more than enough
-		 * since leafs are only hit a single time.
+		 * Perf does things different with function events.
+		 * It only allows an "ip" field, and expects a string.
+		 * But the string does not need to be surrounded by quotes.
+		 * If it is a string, the assigned function as a nop,
+		 * (perf doesn't use it) and grab everything.
 		 */
-		.max   = 3 * filter->n_preds,
-		.count = 0,
-	};
+		if (strcmp(field->name, "ip") != 0) {
+			 parse_error(pe, FILT_ERR_IP_FIELD_ONLY, pos + i);
+			 goto err_free;
+		 }
+		 pred->fn = filter_pred_none;
+
+		 /*
+		  * Quotes are not required, but if they exist then we need
+		  * to read them till we hit a matching one.
+		  */
+		 if (str[i] == '\'' || str[i] == '"')
+			 q = str[i];
+		 else
+			 q = 0;
+
+		 for (i++; str[i]; i++) {
+			 if (q && str[i] == q)
+				 break;
+			 if (!q && (str[i] == ')' || str[i] == '&' ||
+				    str[i] == '|'))
+				 break;
+		 }
+		 /* Skip quotes */
+		 if (q)
+			 s++;
+		len = i - s;
+		if (len >= MAX_FILTER_STR_VAL) {
+			parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
+			goto err_free;
+		}
 
-	return walk_pred_tree(filter->preds, root,
-			      check_pred_tree_cb, &data);
-}
+		pred->regex.len = len;
+		strncpy(pred->regex.pattern, str + s, len);
+		pred->regex.pattern[len] = 0;
+
+	/* This is either a string, or an integer */
+	} else if (str[i] == '\'' || str[i] == '"') {
+		char q = str[i];
+
+		/* Make sure the op is OK for strings */
+		switch (op) {
+		case OP_NE:
+			pred->not = 1;
+			/* Fall through */
+		case OP_GLOB:
+		case OP_EQ:
+			break;
+		default:
+			parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
+			goto err_free;
+		}
 
-static int count_leafs_cb(enum move_type move, struct filter_pred *pred,
-			  int *err, void *data)
-{
-	int *count = data;
+		/* Make sure the field is OK for strings */
+		if (!is_string_field(field)) {
+			parse_error(pe, FILT_ERR_EXPECT_DIGIT, pos + i);
+			goto err_free;
+		}
 
-	if ((move == MOVE_DOWN) &&
-	    (pred->left == FILTER_PRED_INVALID))
-		(*count)++;
+		for (i++; str[i]; i++) {
+			if (str[i] == q)
+				break;
+		}
+		if (!str[i]) {
+			parse_error(pe, FILT_ERR_MISSING_QUOTE, pos + i);
+			goto err_free;
+		}
 
-	return WALK_PRED_DEFAULT;
-}
+		/* Skip quotes */
+		s++;
+		len = i - s;
+		if (len >= MAX_FILTER_STR_VAL) {
+			parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
+			goto err_free;
+		}
 
-static int count_leafs(struct filter_pred *preds, struct filter_pred *root)
-{
-	int count = 0, ret;
+		pred->regex.len = len;
+		strncpy(pred->regex.pattern, str + s, len);
+		pred->regex.pattern[len] = 0;
 
-	ret = walk_pred_tree(preds, root, count_leafs_cb, &count);
-	WARN_ON(ret);
-	return count;
-}
+		filter_build_regex(pred);
 
-struct fold_pred_data {
-	struct filter_pred *root;
-	int count;
-	int children;
-};
+		if (field->filter_type == FILTER_COMM) {
+			pred->fn = filter_pred_comm;
 
-static int fold_pred_cb(enum move_type move, struct filter_pred *pred,
-			int *err, void *data)
-{
-	struct fold_pred_data *d = data;
-	struct filter_pred *root = d->root;
+		} else if (field->filter_type == FILTER_STATIC_STRING) {
+			pred->fn = filter_pred_string;
+			pred->regex.field_len = field->size;
 
-	if (move != MOVE_DOWN)
-		return WALK_PRED_DEFAULT;
-	if (pred->left != FILTER_PRED_INVALID)
-		return WALK_PRED_DEFAULT;
+		} else if (field->filter_type == FILTER_DYN_STRING)
+			pred->fn = filter_pred_strloc;
+		else
+			pred->fn = filter_pred_pchar;
+		/* go past the last quote */
+		i++;
 
-	if (WARN_ON(d->count == d->children)) {
-		*err = -EINVAL;
-		return WALK_PRED_ABORT;
-	}
+	} else if (isdigit(str[i])) {
 
-	pred->index &= ~FILTER_PRED_FOLD;
-	root->ops[d->count++] = pred->index;
-	return WALK_PRED_DEFAULT;
-}
+		/* Make sure the field is not a string */
+		if (is_string_field(field)) {
+			parse_error(pe, FILT_ERR_EXPECT_STRING, pos + i);
+			goto err_free;
+		}
 
-static int fold_pred(struct filter_pred *preds, struct filter_pred *root)
-{
-	struct fold_pred_data data = {
-		.root  = root,
-		.count = 0,
-	};
-	int children;
+		if (op == OP_GLOB) {
+			parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
+			goto err_free;
+		}
 
-	/* No need to keep the fold flag */
-	root->index &= ~FILTER_PRED_FOLD;
+		/* We allow 0xDEADBEEF */
+		while (isalnum(str[i]))
+			i++;
 
-	/* If the root is a leaf then do nothing */
-	if (root->left == FILTER_PRED_INVALID)
-		return 0;
+		len = i - s;
+		/* 0xfeedfacedeadbeef is 18 chars max */
+		if (len >= sizeof(num_buf)) {
+			parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
+			goto err_free;
+		}
 
-	/* count the children */
-	children = count_leafs(preds, &preds[root->left]);
-	children += count_leafs(preds, &preds[root->right]);
+		strncpy(num_buf, str + s, len);
+		num_buf[len] = 0;
 
-	root->ops = kcalloc(children, sizeof(*root->ops), GFP_KERNEL);
-	if (!root->ops)
-		return -ENOMEM;
+		/* Make sure it is a value */
+		if (field->is_signed)
+			ret = kstrtoll(num_buf, 0, &val);
+		else
+			ret = kstrtoull(num_buf, 0, &val);
+		if (ret) {
+			parse_error(pe, FILT_ERR_ILLEGAL_INTVAL, pos + s);
+			goto err_free;
+		}
 
-	root->val = children;
-	data.children = children;
-	return walk_pred_tree(preds, root, fold_pred_cb, &data);
-}
+		pred->val = val;
 
-static int fold_pred_tree_cb(enum move_type move, struct filter_pred *pred,
-			     int *err, void *data)
-{
-	struct filter_pred *preds = data;
+		if (field->filter_type == FILTER_CPU)
+			pred->fn = filter_pred_cpu;
+		else {
+			pred->fn = select_comparison_fn(pred->op, field->size,
+							field->is_signed);
+			if (pred->op == OP_NE)
+				pred->not = 1;
+		}
 
-	if (move != MOVE_DOWN)
-		return WALK_PRED_DEFAULT;
-	if (!(pred->index & FILTER_PRED_FOLD))
-		return WALK_PRED_DEFAULT;
+	} else {
+		parse_error(pe, FILT_ERR_INVALID_VALUE, pos + i);
+		goto err_free;
+	}
 
-	*err = fold_pred(preds, pred);
-	if (*err)
-		return WALK_PRED_ABORT;
+	*pred_ptr = pred;
+	return i;
 
-	/* eveyrhing below is folded, continue with parent */
-	return WALK_PRED_PARENT;
+err_free:
+	kfree(pred);
+	return -EINVAL;
 }
 
+enum {
+	TOO_MANY_CLOSE		= -1,
+	TOO_MANY_OPEN		= -2,
+	MISSING_QUOTE		= -3,
+};
+
 /*
- * To optimize the processing of the ops, if we have several "ors" or
- * "ands" together, we can put them in an array and process them all
- * together speeding up the filter logic.
+ * Read the filter string once to calculate the number of predicates
+ * as well as how deep the parentheses go.
+ *
+ * Returns:
+ *   0 - everything is fine (err is undefined)
+ *  -1 - too many ')'
+ *  -2 - too many '('
+ *  -3 - No matching quote
  */
-static int fold_pred_tree(struct event_filter *filter,
-			   struct filter_pred *root)
-{
-	return walk_pred_tree(filter->preds, root, fold_pred_tree_cb,
-			      filter->preds);
-}
-
-static int replace_preds(struct trace_event_call *call,
-			 struct event_filter *filter,
-			 struct filter_parse_state *ps,
-			 bool dry_run)
-{
-	char *operand1 = NULL, *operand2 = NULL;
-	struct filter_pred *pred;
-	struct filter_pred *root;
-	struct postfix_elt *elt;
-	struct pred_stack stack = { }; /* init to NULL */
-	int err;
-	int n_preds = 0;
-
-	n_preds = count_preds(ps);
-	if (n_preds >= MAX_FILTER_PRED) {
-		parse_error(ps, FILT_ERR_TOO_MANY_PREDS, 0);
-		return -ENOSPC;
-	}
-
-	err = check_preds(ps);
-	if (err)
-		return err;
+static int calc_stack(const char *str, int *parens, int *preds, int *err)
+{
+	bool is_pred = false;
+	int nr_preds = 0;
+	int open = 1; /* Count the expression as "(E)" */
+	int last_quote = 0;
+	int max_open = 1;
+	int quote = 0;
+	int i;
 
-	if (!dry_run) {
-		err = __alloc_pred_stack(&stack, n_preds);
-		if (err)
-			return err;
-		err = __alloc_preds(filter, n_preds);
-		if (err)
-			goto fail;
-	}
+	*err = 0;
 
-	n_preds = 0;
-	list_for_each_entry(elt, &ps->postfix, list) {
-		if (elt->op == OP_NONE) {
-			if (!operand1)
-				operand1 = elt->operand;
-			else if (!operand2)
-				operand2 = elt->operand;
-			else {
-				parse_error(ps, FILT_ERR_TOO_MANY_OPERANDS, 0);
-				err = -EINVAL;
-				goto fail;
-			}
+	for (i = 0; str[i]; i++) {
+		if (isspace(str[i]))
+			continue;
+		if (quote) {
+			if (str[i] == quote)
+			       quote = 0;
 			continue;
 		}
 
-		if (elt->op == OP_NOT) {
-			if (!n_preds || operand1 || operand2) {
-				parse_error(ps, FILT_ERR_ILLEGAL_NOT_OP, 0);
-				err = -EINVAL;
-				goto fail;
+		switch (str[i]) {
+		case '\'':
+		case '"':
+			quote = str[i];
+			last_quote = i;
+			break;
+		case '|':
+		case '&':
+			if (str[i+1] != str[i])
+				break;
+			is_pred = false;
+			continue;
+		case '(':
+			is_pred = false;
+			open++;
+			if (open > max_open)
+				max_open = open;
+			continue;
+		case ')':
+			is_pred = false;
+			if (open == 1) {
+				*err = i;
+				return TOO_MANY_CLOSE;
 			}
-			if (!dry_run)
-				filter->preds[n_preds - 1].not ^= 1;
+			open--;
 			continue;
 		}
-
-		if (WARN_ON(n_preds++ == MAX_FILTER_PRED)) {
-			parse_error(ps, FILT_ERR_TOO_MANY_PREDS, 0);
-			err = -ENOSPC;
-			goto fail;
+		if (!is_pred) {
+			nr_preds++;
+			is_pred = true;
 		}
+	}
 
-		pred = create_pred(ps, call, elt->op, operand1, operand2);
-		if (!pred) {
-			err = -EINVAL;
-			goto fail;
-		}
+	if (quote) {
+		*err = last_quote;
+		return MISSING_QUOTE;
+	}
 
-		if (!dry_run) {
-			err = filter_add_pred(ps, filter, pred, &stack);
-			if (err)
-				goto fail;
-		}
+	if (open != 1) {
+		int level = open;
 
-		operand1 = operand2 = NULL;
+		/* find the bad open */
+		for (i--; i; i--) {
+			if (quote) {
+				if (str[i] == quote)
+					quote = 0;
+				continue;
+			}
+			switch (str[i]) {
+			case '(':
+				if (level == open) {
+					*err = i;
+					return TOO_MANY_OPEN;
+				}
+				level--;
+				break;
+			case ')':
+				level++;
+				break;
+			case '\'':
+			case '"':
+				quote = str[i];
+				break;
+			}
+		}
+		/* First character is the '(' with missing ')' */
+		*err = 0;
+		return TOO_MANY_OPEN;
 	}
 
-	if (!dry_run) {
-		/* We should have one item left on the stack */
-		pred = __pop_pred_stack(&stack);
-		if (!pred)
-			return -EINVAL;
-		/* This item is where we start from in matching */
-		root = pred;
-		/* Make sure the stack is empty */
-		pred = __pop_pred_stack(&stack);
-		if (WARN_ON(pred)) {
-			err = -EINVAL;
-			filter->root = NULL;
-			goto fail;
+	/* Set the size of the required stacks */
+	*parens = max_open;
+	*preds = nr_preds;
+	return 0;
+}
+
+static int process_preds(struct trace_event_call *call,
+			 const char *filter_string,
+			 struct event_filter *filter,
+			 struct filter_parse_error *pe)
+{
+	struct prog_entry *prog;
+	int nr_parens;
+	int nr_preds;
+	int index;
+	int ret;
+
+	ret = calc_stack(filter_string, &nr_parens, &nr_preds, &index);
+	if (ret < 0) {
+		switch (ret) {
+		case MISSING_QUOTE:
+			parse_error(pe, FILT_ERR_MISSING_QUOTE, index);
+			break;
+		case TOO_MANY_OPEN:
+			parse_error(pe, FILT_ERR_TOO_MANY_OPEN, index);
+			break;
+		default:
+			parse_error(pe, FILT_ERR_TOO_MANY_CLOSE, index);
 		}
-		err = check_pred_tree(filter, root);
-		if (err)
-			goto fail;
-
-		/* Optimize the tree */
-		err = fold_pred_tree(filter, root);
-		if (err)
-			goto fail;
-
-		/* We don't set root until we know it works */
-		barrier();
-		filter->root = root;
+		return ret;
 	}
 
-	err = 0;
-fail:
-	__free_pred_stack(&stack);
-	return err;
+	prog = predicate_parse(filter_string, nr_parens, nr_preds,
+			       parse_pred, call, pe);
+	if (IS_ERR(prog))
+		return PTR_ERR(prog);
+
+	filter->prog = prog;
+	return 0;
 }
 
 static inline void event_set_filtered_flag(struct trace_event_file *file)
@@ -1753,9 +1553,9 @@ struct filter_list {
 	struct event_filter	*filter;
 };
 
-static int replace_system_preds(struct trace_subsystem_dir *dir,
+static int process_system_preds(struct trace_subsystem_dir *dir,
 				struct trace_array *tr,
-				struct filter_parse_state *ps,
+				struct filter_parse_error *pe,
 				char *filter_string)
 {
 	struct trace_event_file *file;
@@ -1766,29 +1566,11 @@ static int replace_system_preds(struct trace_subsystem_dir *dir,
 	bool fail = true;
 	int err;
 
-	list_for_each_entry(file, &tr->events, list) {
-		if (file->system != dir)
-			continue;
-
-		/*
-		 * Try to see if the filter can be applied
-		 *  (filter arg is ignored on dry_run)
-		 */
-		err = replace_preds(file->event_call, NULL, ps, true);
-		if (err)
-			event_set_no_set_filter_flag(file);
-		else
-			event_clear_no_set_filter_flag(file);
-	}
-
 	list_for_each_entry(file, &tr->events, list) {
 
 		if (file->system != dir)
 			continue;
 
-		if (event_no_set_filter_flag(file))
-			continue;
-
 		filter = kzalloc(sizeof(*filter), GFP_KERNEL);
 		if (!filter)
 			goto fail_mem;
@@ -1797,11 +1579,11 @@ static int replace_system_preds(struct trace_subsystem_dir *dir,
 		if (!filter->filter_string)
 			goto fail_mem;
 
-		err = replace_preds(file->event_call, filter, ps, false);
+		err = process_preds(file->event_call, filter_string, filter, pe);
 		if (err) {
 			filter_disable(file);
-			parse_error(ps, FILT_ERR_BAD_SUBSYS_FILTER, 0);
-			append_filter_err(ps, filter);
+			parse_error(pe, FILT_ERR_BAD_SUBSYS_FILTER, 0);
+			append_filter_err(pe, filter);
 		} else
 			event_set_filtered_flag(file);
 
@@ -1843,7 +1625,7 @@ static int replace_system_preds(struct trace_subsystem_dir *dir,
 		list_del(&filter_item->list);
 		kfree(filter_item);
 	}
-	parse_error(ps, FILT_ERR_BAD_SUBSYS_FILTER, 0);
+	parse_error(pe, FILT_ERR_BAD_SUBSYS_FILTER, 0);
 	return -EINVAL;
  fail_mem:
 	kfree(filter);
@@ -1859,16 +1641,16 @@ static int replace_system_preds(struct trace_subsystem_dir *dir,
 }
 
 static int create_filter_start(char *filter_string, bool set_str,
-			       struct filter_parse_state **psp,
+			       struct filter_parse_error **pse,
 			       struct event_filter **filterp)
 {
 	struct event_filter *filter;
-	struct filter_parse_state *ps = NULL;
+	struct filter_parse_error *pe = NULL;
 	int err = 0;
 
-	WARN_ON_ONCE(*psp || *filterp);
+	if (WARN_ON_ONCE(*pse || *filterp))
+		return -EINVAL;
 
-	/* allocate everything, and if any fails, free all and fail */
 	filter = kzalloc(sizeof(*filter), GFP_KERNEL);
 	if (filter && set_str) {
 		filter->filter_string = kstrdup(filter_string, GFP_KERNEL);
@@ -1876,32 +1658,24 @@ static int create_filter_start(char *filter_string, bool set_str,
 			err = -ENOMEM;
 	}
 
-	ps = kzalloc(sizeof(*ps), GFP_KERNEL);
+	pe = kzalloc(sizeof(*pe), GFP_KERNEL);
 
-	if (!filter || !ps || err) {
-		kfree(ps);
+	if (!filter || !pe || err) {
+		kfree(pe);
 		__free_filter(filter);
 		return -ENOMEM;
 	}
 
 	/* we're committed to creating a new filter */
 	*filterp = filter;
-	*psp = ps;
+	*pse = pe;
 
-	parse_init(ps, filter_ops, filter_string);
-	err = filter_parse(ps);
-	if (err && set_str)
-		append_filter_err(ps, filter);
-	return err;
+	return 0;
 }
 
-static void create_filter_finish(struct filter_parse_state *ps)
+static void create_filter_finish(struct filter_parse_error *pe)
 {
-	if (ps) {
-		filter_opstack_clear(ps);
-		postfix_clear(ps);
-		kfree(ps);
-	}
+	kfree(pe);
 }
 
 /**
@@ -1921,24 +1695,20 @@ static void create_filter_finish(struct filter_parse_state *ps)
  * freeing it.
  */
 static int create_filter(struct trace_event_call *call,
-			 char *filter_str, bool set_str,
+			 char *filter_string, bool set_str,
 			 struct event_filter **filterp)
 {
+	struct filter_parse_error *pe = NULL;
 	struct event_filter *filter = NULL;
-	struct filter_parse_state *ps = NULL;
 	int err;
 
-	err = create_filter_start(filter_str, set_str, &ps, &filter);
-	if (!err) {
-		err = replace_preds(call, filter, ps, false);
-		if (err && set_str)
-			append_filter_err(ps, filter);
-	}
-	if (err && !set_str) {
-		free_event_filter(filter);
-		filter = NULL;
-	}
-	create_filter_finish(ps);
+	err = create_filter_start(filter_string, set_str, &pe, &filter);
+	if (err)
+		return err;
+
+	err = process_preds(call, filter_string, filter, pe);
+	if (err && set_str)
+		append_filter_err(pe, filter);
 
 	*filterp = filter;
 	return err;
@@ -1965,21 +1735,21 @@ static int create_system_filter(struct trace_subsystem_dir *dir,
 				char *filter_str, struct event_filter **filterp)
 {
 	struct event_filter *filter = NULL;
-	struct filter_parse_state *ps = NULL;
+	struct filter_parse_error *pe = NULL;
 	int err;
 
-	err = create_filter_start(filter_str, true, &ps, &filter);
+	err = create_filter_start(filter_str, true, &pe, &filter);
 	if (!err) {
-		err = replace_system_preds(dir, tr, ps, filter_str);
+		err = process_system_preds(dir, tr, pe, filter_str);
 		if (!err) {
 			/* System filters just show a default message */
 			kfree(filter->filter_string);
 			filter->filter_string = NULL;
 		} else {
-			append_filter_err(ps, filter);
+			append_filter_err(pe, filter);
 		}
 	}
-	create_filter_finish(ps);
+	create_filter_finish(pe);
 
 	*filterp = filter;
 	return err;
@@ -2162,66 +1932,79 @@ static int __ftrace_function_set_filter(int filter, char *buf, int len,
 	return ret;
 }
 
-static int ftrace_function_check_pred(struct filter_pred *pred, int leaf)
+static int ftrace_function_check_pred(struct filter_pred *pred)
 {
 	struct ftrace_event_field *field = pred->field;
 
-	if (leaf) {
-		/*
-		 * Check the leaf predicate for function trace, verify:
-		 *  - only '==' and '!=' is used
-		 *  - the 'ip' field is used
-		 */
-		if ((pred->op != OP_EQ) && (pred->op != OP_NE))
-			return -EINVAL;
+	/*
+	 * Check the predicate for function trace, verify:
+	 *  - only '==' and '!=' is used
+	 *  - the 'ip' field is used
+	 */
+	if ((pred->op != OP_EQ) && (pred->op != OP_NE))
+		return -EINVAL;
 
-		if (strcmp(field->name, "ip"))
-			return -EINVAL;
-	} else {
-		/*
-		 * Check the non leaf predicate for function trace, verify:
-		 *  - only '||' is used
-		*/
-		if (pred->op != OP_OR)
-			return -EINVAL;
-	}
+	if (strcmp(field->name, "ip"))
+		return -EINVAL;
 
 	return 0;
 }
 
-static int ftrace_function_set_filter_cb(enum move_type move,
-					 struct filter_pred *pred,
-					 int *err, void *data)
+static int ftrace_function_set_filter_pred(struct filter_pred *pred,
+					   struct function_filter_data *data)
 {
+	int ret;
+
 	/* Checking the node is valid for function trace. */
-	if ((move != MOVE_DOWN) ||
-	    (pred->left != FILTER_PRED_INVALID)) {
-		*err = ftrace_function_check_pred(pred, 0);
-	} else {
-		*err = ftrace_function_check_pred(pred, 1);
-		if (*err)
-			return WALK_PRED_ABORT;
-
-		*err = __ftrace_function_set_filter(pred->op == OP_EQ,
-						    pred->regex.pattern,
-						    pred->regex.len,
-						    data);
-	}
+	ret = ftrace_function_check_pred(pred);
+	if (ret)
+		return ret;
+
+	return __ftrace_function_set_filter(pred->op == OP_EQ,
+					    pred->regex.pattern,
+					    pred->regex.len,
+					    data);
+}
+
+static bool is_or(struct prog_entry *prog, int i)
+{
+	int target;
 
-	return (*err) ? WALK_PRED_ABORT : WALK_PRED_DEFAULT;
+	/*
+	 * Only "||" is allowed for function events, thus,
+	 * all true branches should jump to true, and any
+	 * false branch should jump to false.
+	 */
+	target = prog[i].target + 1;
+	/* True and false have NULL preds (all prog entries should jump to one */
+	if (prog[target].pred)
+		return false;
+
+	/* prog[target].target is 1 for TRUE, 0 for FALSE */
+	return prog[i].when_to_branch == prog[target].target;
 }
 
 static int ftrace_function_set_filter(struct perf_event *event,
 				      struct event_filter *filter)
 {
+	struct prog_entry *prog = filter->prog;
 	struct function_filter_data data = {
 		.first_filter  = 1,
 		.first_notrace = 1,
 		.ops           = &event->ftrace_ops,
 	};
+	int i;
+
+	for (i = 0; prog[i].pred; i++) {
+		struct filter_pred *pred = prog[i].pred;
+
+		if (!is_or(prog, i))
+			return -EINVAL;
 
-	return walk_pred_tree(filter->preds, filter->root,
-			      ftrace_function_set_filter_cb, &data);
+		if (ftrace_function_set_filter_pred(pred, &data) < 0)
+			return -EINVAL;
+	}
+	return 0;
 }
 #else
 static int ftrace_function_set_filter(struct perf_event *event,
@@ -2364,26 +2147,27 @@ static int test_pred_visited_fn(struct filter_pred *pred, void *event)
 	return 1;
 }
 
-static int test_walk_pred_cb(enum move_type move, struct filter_pred *pred,
-			     int *err, void *data)
+static void update_pred_fn(struct event_filter *filter, char *fields)
 {
-	char *fields = data;
+	struct prog_entry *prog = filter->prog;
+	int i;
 
-	if ((move == MOVE_DOWN) &&
-	    (pred->left == FILTER_PRED_INVALID)) {
+	for (i = 0; prog[i].pred; i++) {
+		struct filter_pred *pred = prog[i].pred;
 		struct ftrace_event_field *field = pred->field;
 
+		WARN_ON_ONCE(!pred->fn);
+
 		if (!field) {
-			WARN(1, "all leafs should have field defined");
-			return WALK_PRED_DEFAULT;
+			WARN_ONCE(1, "all leafs should have field defined %d", i);
+			continue;
 		}
+
 		if (!strchr(fields, *field->name))
-			return WALK_PRED_DEFAULT;
+			continue;
 
-		WARN_ON(!pred->fn);
 		pred->fn = test_pred_visited_fn;
 	}
-	return WALK_PRED_DEFAULT;
 }
 
 static __init int ftrace_test_event_filter(void)
@@ -2413,9 +2197,7 @@ static __init int ftrace_test_event_filter(void)
 		 */
 		preempt_disable();
 		if (*d->not_visited)
-			walk_pred_tree(filter->preds, filter->root,
-				       test_walk_pred_cb,
-				       d->not_visited);
+			update_pred_fn(filter, d->not_visited);
 
 		test_pred_visited = 0;
 		err = filter_match_preds(filter, &d->rec);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  1:31           ` Kees Cook
@ 2018-03-10  2:37             ` Linus Torvalds
  0 siblings, 0 replies; 51+ messages in thread
From: Linus Torvalds @ 2018-03-10  2:37 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Morton, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, Mar 9, 2018 at 5:31 PM, Kees Cook <keescook@chromium.org> wrote:
>
> WTF, gmail just blasted HTML into my explicitly plain-text email?! Apologies...

There's more now in your email, I think maybe it's triggered by your
signature file and some gmail web interface bug. Or it just tries to
force quote using html.

There's some html email disease inside google, where some parts of the
company seems to think that html email is _good_.

The official gmail Android app is a big pain, int hat it doesn't
*have* a plain-text mode, even though it knows how to format the
plain-text part of the email.

You might want to be on the lookout for some bad drugs at the office.
Because the world would thank you if you took them away from the gmail
app people.

             Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-10  2:34 ` [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster Steven Rostedt
@ 2018-03-10  3:10   ` Steven Rostedt
  2018-03-10  3:15     ` Steven Rostedt
  2018-03-10  3:18   ` Steven Rostedt
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2018-03-10  3:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Josh Poimboeuf, Rasmus Villemoes,
	Gustavo A. R. Silva, Tobin C. Harding, Jonathan Corbet,
	Chris Mason, Josef Bacik, David Sterba, David S. Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Masahiro Yamada, Borislav Petkov, Randy Dunlap,
	Ian Abbott, Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, 09 Mar 2018 21:34:45 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

>  2 files changed, 1050 insertions(+), 1273 deletions(-)

BTW, it's a bit bigger impact than 223 deletions. As I added a lot of
comments to explain the logic better. Removing comments and white space
from the modifications we have:

 649 insertions(+), 1035 deletions(-)

That's 386 lines of code removed.

-- Steve

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  0:07 ` Andrew Morton
  2018-03-10  0:28   ` Linus Torvalds
@ 2018-03-10  3:11   ` Randy Dunlap
  2018-03-10  6:10     ` Miguel Ojeda
  1 sibling, 1 reply; 51+ messages in thread
From: Randy Dunlap @ 2018-03-10  3:11 UTC (permalink / raw)
  To: Andrew Morton, Kees Cook
  Cc: linux-kernel, Josh Poimboeuf, Rasmus Villemoes,
	Gustavo A. R. Silva, Tobin C. Harding, Steven Rostedt,
	Jonathan Corbet, Chris Mason, Josef Bacik, David Sterba,
	David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Masahiro Yamada,
	Borislav Petkov, Ian Abbott, Sergey Senozhatsky, Petr Mladek,
	Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening, Linus Torvalds

On 03/09/2018 04:07 PM, Andrew Morton wrote:
> On Fri, 9 Mar 2018 12:05:36 -0800 Kees Cook <keescook@chromium.org> wrote:
> 
>> When max() is used in stack array size calculations from literal values
>> (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
>> thinks this is a dynamic calculation due to the single-eval logic, which
>> is not needed in the literal case. This change removes several accidental
>> stack VLAs from an x86 allmodconfig build:
>>
>> $ diff -u before.txt after.txt | grep ^-
>> -drivers/input/touchscreen/cyttsp4_core.c:871:2: warning: ISO C90 forbids variable length array ‘ids’ [-Wvla]
>> -fs/btrfs/tree-checker.c:344:4: warning: ISO C90 forbids variable length array ‘namebuf’ [-Wvla]
>> -lib/vsprintf.c:747:2: warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
>> -net/ipv4/proc.c:403:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
>> -net/ipv6/proc.c:198:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
>> -net/ipv6/proc.c:218:2: warning: ISO C90 forbids variable length array ‘buff64’ [-Wvla]
>>
>> Based on an earlier patch from Josh Poimboeuf.
> 
> v1, v2 and v3 of this patch all fail with gcc-4.4.4:
> 
> ./include/linux/jiffies.h: In function 'jiffies_delta_to_clock_t':
> ./include/linux/jiffies.h:444: error: first argument to '__builtin_choose_expr' not a constant


I'm seeing that problem with
> gcc --version
gcc (SUSE Linux) 4.8.5

in mmotm.

> That's with
> 
> #define __max(t1, t2, x, y)						\
> 	__builtin_choose_expr(__builtin_constant_p(x) &&		\
> 			      __builtin_constant_p(y) &&		\
> 			      __builtin_types_compatible_p(t1, t2),	\
> 			      (t1)(x) > (t2)(y) ? (t1)(x) : (t2)(y),	\
> 			      __single_eval_max(t1, t2,			\
> 						__UNIQUE_ID(max1_),	\
> 						__UNIQUE_ID(max2_),	\
> 						x, y))
> /**
>  * max - return maximum of two values of the same or compatible types
>  * @x: first value
>  * @y: second value
>  */
> #define max(x, y)	__max(typeof(x), typeof(y), x, y)
> 
> 
> A brief poke failed to reveal a workaround - gcc-4.4.4 doesn't appear
> to know that __builtin_constant_p(x) is a constant.  Or something.
> 
> Sigh.  Wasn't there some talk about modernizing our toolchain
> requirements?


-- 
~Randy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-10  3:10   ` Steven Rostedt
@ 2018-03-10  3:15     ` Steven Rostedt
  2018-03-10  3:22       ` Steven Rostedt
  0 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2018-03-10  3:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Josh Poimboeuf, Rasmus Villemoes,
	Gustavo A. R. Silva, Tobin C. Harding, Jonathan Corbet,
	Chris Mason, Josef Bacik, David Sterba, David S. Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Masahiro Yamada, Borislav Petkov, Randy Dunlap,
	Ian Abbott, Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening


I don't know what the hell happened, but claws mail just inserted a
ton of people into the Cc (I didn't add you). I noticed it just after I
hit send. The added Cc looks like it came from the email right after the
email I was replying to "Subject: Re: [PATCH v3] kernel.h: Skip
single-eval logic on literals in min()/max()".

Sorry for the spam.

-- Steve


On Fri, 9 Mar 2018 22:10:19 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 09 Mar 2018 21:34:45 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> >  2 files changed, 1050 insertions(+), 1273 deletions(-)  
> 
> BTW, it's a bit bigger impact than 223 deletions. As I added a lot of
> comments to explain the logic better. Removing comments and white space
> from the modifications we have:
> 
>  649 insertions(+), 1035 deletions(-)
> 
> That's 386 lines of code removed.
> 
> -- Steve

'

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-10  2:34 ` [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster Steven Rostedt
  2018-03-10  3:10   ` Steven Rostedt
@ 2018-03-10  3:18   ` Steven Rostedt
  2018-03-12 12:42   ` Jiri Olsa
  2018-03-12 15:10   ` Jiri Olsa
  3 siblings, 0 replies; 51+ messages in thread
From: Steven Rostedt @ 2018-03-10  3:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi, Namhyung Kim,
	Masami Hiramatsu, Jiri Olsa

[ This time replying to the email I expected to ]

On Fri, 09 Mar 2018 21:34:45 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

>  2 files changed, 1050 insertions(+), 1273 deletions(-)  

BTW, it's a bit bigger impact than 223 deletions. As I added a lot of
comments to explain the logic better. Removing comments and white space
from the modifications we have:

 649 insertions(+), 1035 deletions(-)

That's 386 lines of code removed.

-- Steve

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-10  3:15     ` Steven Rostedt
@ 2018-03-10  3:22       ` Steven Rostedt
  0 siblings, 0 replies; 51+ messages in thread
From: Steven Rostedt @ 2018-03-10  3:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Josh Poimboeuf, Rasmus Villemoes,
	Gustavo A. R. Silva, Tobin C. Harding, Jonathan Corbet,
	Chris Mason, Josef Bacik, David Sterba, David S. Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Masahiro Yamada, Borislav Petkov, Randy Dunlap,
	Ian Abbott, Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, 9 Mar 2018 22:15:23 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> Sorry for the spam.

A little more spam ;-)

I know what happened, as I'm able to recreate it. For those that use
claws-mail, be careful.

I clicked on the email I wanted to reply to. Then I must have hit the
down arrow key, as it moved the selected email to the next email. I hit
"Reply All", and it showed the email I clicked on (as well as the
subject), but the Cc list belonged to the email that was selected by the
down arrow key.

That's a nasty subtle bug. I think I'll go report this to the claws
folks.

-- Steve

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  3:11   ` Randy Dunlap
@ 2018-03-10  6:10     ` Miguel Ojeda
  2018-03-10  7:03       ` Miguel Ojeda
  2018-03-10 15:33       ` Kees Cook
  0 siblings, 2 replies; 51+ messages in thread
From: Miguel Ojeda @ 2018-03-10  6:10 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Andrew Morton, Kees Cook, linux-kernel, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening, Linus Torvalds

On Sat, Mar 10, 2018 at 4:11 AM, Randy Dunlap <rdunlap@infradead.org> wrote:
> On 03/09/2018 04:07 PM, Andrew Morton wrote:
>> On Fri, 9 Mar 2018 12:05:36 -0800 Kees Cook <keescook@chromium.org> wrote:
>>
>>> When max() is used in stack array size calculations from literal values
>>> (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
>>> thinks this is a dynamic calculation due to the single-eval logic, which
>>> is not needed in the literal case. This change removes several accidental
>>> stack VLAs from an x86 allmodconfig build:
>>>
>>> $ diff -u before.txt after.txt | grep ^-
>>> -drivers/input/touchscreen/cyttsp4_core.c:871:2: warning: ISO C90 forbids variable length array ‘ids’ [-Wvla]
>>> -fs/btrfs/tree-checker.c:344:4: warning: ISO C90 forbids variable length array ‘namebuf’ [-Wvla]
>>> -lib/vsprintf.c:747:2: warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
>>> -net/ipv4/proc.c:403:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
>>> -net/ipv6/proc.c:198:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
>>> -net/ipv6/proc.c:218:2: warning: ISO C90 forbids variable length array ‘buff64’ [-Wvla]
>>>
>>> Based on an earlier patch from Josh Poimboeuf.
>>
>> v1, v2 and v3 of this patch all fail with gcc-4.4.4:
>>
>> ./include/linux/jiffies.h: In function 'jiffies_delta_to_clock_t':
>> ./include/linux/jiffies.h:444: error: first argument to '__builtin_choose_expr' not a constant
>
>
> I'm seeing that problem with
>> gcc --version
> gcc (SUSE Linux) 4.8.5

Same here, 4.8.5 fails. gcc 5.4.1 seems to work. I compiled a minimal
5.1.0 and it seems to work as well.

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  6:10     ` Miguel Ojeda
@ 2018-03-10  7:03       ` Miguel Ojeda
  2018-03-10 16:04         ` Linus Torvalds
  2018-03-10 15:33       ` Kees Cook
  1 sibling, 1 reply; 51+ messages in thread
From: Miguel Ojeda @ 2018-03-10  7:03 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Andrew Morton, Kees Cook, linux-kernel, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening, Linus Torvalds

On Sat, Mar 10, 2018 at 7:10 AM, Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
> On Sat, Mar 10, 2018 at 4:11 AM, Randy Dunlap <rdunlap@infradead.org> wrote:
>> On 03/09/2018 04:07 PM, Andrew Morton wrote:
>>> On Fri, 9 Mar 2018 12:05:36 -0800 Kees Cook <keescook@chromium.org> wrote:
>>>
>>>> When max() is used in stack array size calculations from literal values
>>>> (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
>>>> thinks this is a dynamic calculation due to the single-eval logic, which
>>>> is not needed in the literal case. This change removes several accidental
>>>> stack VLAs from an x86 allmodconfig build:
>>>>
>>>> $ diff -u before.txt after.txt | grep ^-
>>>> -drivers/input/touchscreen/cyttsp4_core.c:871:2: warning: ISO C90 forbids variable length array ‘ids’ [-Wvla]
>>>> -fs/btrfs/tree-checker.c:344:4: warning: ISO C90 forbids variable length array ‘namebuf’ [-Wvla]
>>>> -lib/vsprintf.c:747:2: warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
>>>> -net/ipv4/proc.c:403:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
>>>> -net/ipv6/proc.c:198:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
>>>> -net/ipv6/proc.c:218:2: warning: ISO C90 forbids variable length array ‘buff64’ [-Wvla]
>>>>
>>>> Based on an earlier patch from Josh Poimboeuf.
>>>
>>> v1, v2 and v3 of this patch all fail with gcc-4.4.4:
>>>
>>> ./include/linux/jiffies.h: In function 'jiffies_delta_to_clock_t':
>>> ./include/linux/jiffies.h:444: error: first argument to '__builtin_choose_expr' not a constant
>>
>>
>> I'm seeing that problem with
>>> gcc --version
>> gcc (SUSE Linux) 4.8.5
>
> Same here, 4.8.5 fails. gcc 5.4.1 seems to work. I compiled a minimal
> 5.1.0 and it seems to work as well.
>

Just compiled 4.9.0 and it seems to work -- so that would be the
minimum required.

Sigh...

Some enterprise distros are either already shipping gcc >= 5 or will
probably be shipping it soon (e.g. RHEL 8), so how much does it hurt
to ask for a newer gcc? Are there many users/companies out there using
enterprise distributions' gcc to compile and run the very latest
kernels?

Miguel

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  6:10     ` Miguel Ojeda
  2018-03-10  7:03       ` Miguel Ojeda
@ 2018-03-10 15:33       ` Kees Cook
  2018-03-10 16:11         ` Linus Torvalds
  2018-03-10 16:30         ` Linus Torvalds
  1 sibling, 2 replies; 51+ messages in thread
From: Kees Cook @ 2018-03-10 15:33 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Randy Dunlap, Andrew Morton, linux-kernel, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening, Linus Torvalds

On Fri, Mar 9, 2018 at 10:10 PM, Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
> On Sat, Mar 10, 2018 at 4:11 AM, Randy Dunlap <rdunlap@infradead.org> wrote:
>> On 03/09/2018 04:07 PM, Andrew Morton wrote:
>>> On Fri, 9 Mar 2018 12:05:36 -0800 Kees Cook <keescook@chromium.org> wrote:
>>>
>>>> When max() is used in stack array size calculations from literal values
>>>> (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
>>>> thinks this is a dynamic calculation due to the single-eval logic, which
>>>> is not needed in the literal case. This change removes several accidental
>>>> stack VLAs from an x86 allmodconfig build:
>>>>
>>>> $ diff -u before.txt after.txt | grep ^-
>>>> -drivers/input/touchscreen/cyttsp4_core.c:871:2: warning: ISO C90 forbids variable length array ‘ids’ [-Wvla]
>>>> -fs/btrfs/tree-checker.c:344:4: warning: ISO C90 forbids variable length array ‘namebuf’ [-Wvla]
>>>> -lib/vsprintf.c:747:2: warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
>>>> -net/ipv4/proc.c:403:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
>>>> -net/ipv6/proc.c:198:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
>>>> -net/ipv6/proc.c:218:2: warning: ISO C90 forbids variable length array ‘buff64’ [-Wvla]
>>>>
>>>> Based on an earlier patch from Josh Poimboeuf.
>>>
>>> v1, v2 and v3 of this patch all fail with gcc-4.4.4:
>>>
>>> ./include/linux/jiffies.h: In function 'jiffies_delta_to_clock_t':
>>> ./include/linux/jiffies.h:444: error: first argument to '__builtin_choose_expr' not a constant
>>
>>
>> I'm seeing that problem with
>>> gcc --version
>> gcc (SUSE Linux) 4.8.5
>
> Same here, 4.8.5 fails. gcc 5.4.1 seems to work. I compiled a minimal
> 5.1.0 and it seems to work as well.

And sparse freaks out too:

   drivers/net/ethernet/via/via-velocity.c:97:26: sparse: incorrect
type in initializer (different address spaces) @@    expected void
*addr @@    got struct mac_regs [noderef] <avoid *addr @@
   drivers/net/ethernet/via/via-velocity.c:97:26:    expected void *addr
   drivers/net/ethernet/via/via-velocity.c:97:26:    got struct
mac_regs [noderef] <asn:2>*mac_regs
   drivers/net/ethernet/via/via-velocity.c:100:49: sparse: incorrect
type in argument 2 (different base types) @@    expected restricted
pci_power_t [usertype] state @@    got _t [usertype] state @@

Alright, I'm giving up on fixing max(). I'll go back to STACK_MAX() or
some other name for the simple macro. Bleh.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  7:03       ` Miguel Ojeda
@ 2018-03-10 16:04         ` Linus Torvalds
  0 siblings, 0 replies; 51+ messages in thread
From: Linus Torvalds @ 2018-03-10 16:04 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Randy Dunlap, Andrew Morton, Kees Cook, linux-kernel,
	Josh Poimboeuf, Rasmus Villemoes, Gustavo A. R. Silva,
	Tobin C. Harding, Steven Rostedt, Jonathan Corbet, Chris Mason,
	Josef Bacik, David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening

On Fri, Mar 9, 2018 at 11:03 PM, Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> Just compiled 4.9.0 and it seems to work -- so that would be the
> minimum required.
>
> Sigh...
>
> Some enterprise distros are either already shipping gcc >= 5 or will
> probably be shipping it soon (e.g. RHEL 8), so how much does it hurt
> to ask for a newer gcc? Are there many users/companies out there using
> enterprise distributions' gcc to compile and run the very latest
> kernels?

I wouldn't mind upping the compiler requirements, and we have other
reasons to go to 4.6.

But _this_ particular issue doesn't seem worth it to then go even
further. Annoying.

                   Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10 15:33       ` Kees Cook
@ 2018-03-10 16:11         ` Linus Torvalds
  2018-03-10 16:30         ` Linus Torvalds
  1 sibling, 0 replies; 51+ messages in thread
From: Linus Torvalds @ 2018-03-10 16:11 UTC (permalink / raw)
  To: Kees Cook
  Cc: Miguel Ojeda, Randy Dunlap, Andrew Morton, linux-kernel,
	Josh Poimboeuf, Rasmus Villemoes, Gustavo A. R. Silva,
	Tobin C. Harding, Steven Rostedt, Jonathan Corbet, Chris Mason,
	Josef Bacik, David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening

On Sat, Mar 10, 2018 at 7:33 AM, Kees Cook <keescook@chromium.org> wrote:
>
> And sparse freaks out too:
>
>    drivers/net/ethernet/via/via-velocity.c:97:26: sparse: incorrect
> type in initializer (different address spaces) @@    expected void
> *addr @@    got struct mac_regs [noderef] <avoid *addr @@

Actually, this seems a valid warning - it's assigning a __iomem
pointer to a regular void, and dropping the iomem in the process.

Sparse does *not* like VLA's, so what may be going on is that fixing
something VLA-related in your tree just makes sparse (correctly) check
something that it had given up on before.

So don't worry about the sparse ones, if they are new. They seem correct.

                     Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10 15:33       ` Kees Cook
  2018-03-10 16:11         ` Linus Torvalds
@ 2018-03-10 16:30         ` Linus Torvalds
  2018-03-10 17:34           ` Miguel Ojeda
  1 sibling, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2018-03-10 16:30 UTC (permalink / raw)
  To: Kees Cook
  Cc: Miguel Ojeda, Randy Dunlap, Andrew Morton, linux-kernel,
	Josh Poimboeuf, Rasmus Villemoes, Gustavo A. R. Silva,
	Tobin C. Harding, Steven Rostedt, Jonathan Corbet, Chris Mason,
	Josef Bacik, David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening

On Sat, Mar 10, 2018 at 7:33 AM, Kees Cook <keescook@chromium.org> wrote:
>
> Alright, I'm giving up on fixing max(). I'll go back to STACK_MAX() or
> some other name for the simple macro. Bleh.

Oh, and I'm starting to see the real problem.

It's not that our current "min/max()" are broiken. It's that "-Wvla" is garbage.

Lookie here:

        int array[(1,2)];

results in gcc saying

     warning: ISO C90 forbids variable length array ‘array’ [-Wvla]
       int array[(1,2)];
       ^~~

and that error message - and the name of the flag - is obviously pure garbage.

What is *actually* going on is that ISO C90 requires an array size to
be not a constant value, but a constant *expression*. Those are two
different things.

A constant expression has little to do with "compile-time constant".
It's a more restricted form of it, and has actual syntax requirements.
A comma expression is not a constant expression, for example, which
was why I tested this.

So "-Wvla" is garbage, with a misleading name, and a misleading
warning string. It has nothing to do with "variable length" and
whether the compiler can figure it out at build time, and everything
to do with a _syntax_ rule.

                      Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10 16:30         ` Linus Torvalds
@ 2018-03-10 17:34           ` Miguel Ojeda
  2018-03-10 17:51             ` Linus Torvalds
  0 siblings, 1 reply; 51+ messages in thread
From: Miguel Ojeda @ 2018-03-10 17:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Randy Dunlap, Andrew Morton, linux-kernel,
	Josh Poimboeuf, Rasmus Villemoes, Gustavo A. R. Silva,
	Tobin C. Harding, Steven Rostedt, Jonathan Corbet, Chris Mason,
	Josef Bacik, David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening

On Sat, Mar 10, 2018 at 5:30 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Sat, Mar 10, 2018 at 7:33 AM, Kees Cook <keescook@chromium.org> wrote:
>>
>> Alright, I'm giving up on fixing max(). I'll go back to STACK_MAX() or
>> some other name for the simple macro. Bleh.
>
> Oh, and I'm starting to see the real problem.
>
> It's not that our current "min/max()" are broiken. It's that "-Wvla" is garbage.
>
> Lookie here:
>
>         int array[(1,2)];
>
> results in gcc saying
>
>      warning: ISO C90 forbids variable length array ‘array’ [-Wvla]
>        int array[(1,2)];
>        ^~~
>
> and that error message - and the name of the flag - is obviously pure garbage.
>
> What is *actually* going on is that ISO C90 requires an array size to
> be not a constant value, but a constant *expression*. Those are two
> different things.
>
> A constant expression has little to do with "compile-time constant".
> It's a more restricted form of it, and has actual syntax requirements.
> A comma expression is not a constant expression, for example, which
> was why I tested this.
>
> So "-Wvla" is garbage, with a misleading name, and a misleading
> warning string. It has nothing to do with "variable length" and
> whether the compiler can figure it out at build time, and everything
> to do with a _syntax_ rule.

The warning string is basically the same to the one used for C++, i.e.:

    int size2 = 2;
    constexpr int size3 = 2;

    int array1[(2,2)];
    int array2[(size2, size2)];
    int array3[(size3, size3)];

only warns for array2 with:

    warning: ISO C++ forbids variable length array 'array2' [-Wvla]
     int array2[(size2, size2)];

So the warning is probably implemented to just trigger whenever VLAs
are used but the given standard does not allow them, for all
languages. The problem is why the ISO C90 frontend is not giving an
error for using invalid syntax for array sizes to begin with?

Miguel

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10 17:34           ` Miguel Ojeda
@ 2018-03-10 17:51             ` Linus Torvalds
  2018-03-10 19:08               ` Miguel Ojeda
  2018-03-11 11:05               ` Ingo Molnar
  0 siblings, 2 replies; 51+ messages in thread
From: Linus Torvalds @ 2018-03-10 17:51 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Kees Cook, Randy Dunlap, Andrew Morton, linux-kernel,
	Josh Poimboeuf, Rasmus Villemoes, Gustavo A. R. Silva,
	Tobin C. Harding, Steven Rostedt, Jonathan Corbet, Chris Mason,
	Josef Bacik, David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening

On Sat, Mar 10, 2018 at 9:34 AM, Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> So the warning is probably implemented to just trigger whenever VLAs
> are used but the given standard does not allow them, for all
> languages. The problem is why the ISO C90 frontend is not giving an
> error for using invalid syntax for array sizes to begin with?

So in *historical* context - when a compiler didn't do variable length
arrays at all - the original semantics of C "constant expressions"
actually make a ton of sense.

You can basically think of a constant expression as something that can
be (historically) handled by the front-end without any real
complexity, and no optimization phases - just evaluating a simple
parse tree with purely compile-time constants.

So there's a good and perfectly valid reason for why C limits certain
expressions to just a very particular subset. It's not just array
sizes, it's  case statements etc too. And those are very much part of
the C standard.

So an error message like

   warning: ISO C90 requires array sizes to be constant-expressions

would be technically correct and useful from a portability angle. It
tells you when you're doing something non-portable, and should be
automatically enabled with "-ansi -pedantic", for example.

So what's misleading is actually the name of the warning and the
message, not that it happens. The warning isn't about "variable
length", it's literally about the rules for what a
"constant-expression" is.

And in C, the expression (2,3) has a constant _value_ (namely 3), but
it isn't a constant-expression as specified by the language.

Now, the thing is that once you actually do variable length arrays,
those old front-end rules make no sense any more (outside of the "give
portability warnings" thing).

Because once you do variable length arrays, you obviously _parse_
everything just fine, and you're doing to evaluate much more complex
expressions than some limited constant-expression rule.

And at that point it would make a whole lot more sense to add a *new*
warning that basically says "I have to generate code for a
variable-sized array", if you actually talk about VLA's.

But that's clearly not what gcc actually did.

So the problem really is that -Wvla doesn't actually warn about VLA's,
but about something technically completely different.

And that's why those stupid syntactic issues with min/max matter. It's
not whether the end result is a compile-time constant or not, it's
about completely different issues, like whether there is a
comma-expression in it.

              Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10 17:51             ` Linus Torvalds
@ 2018-03-10 19:08               ` Miguel Ojeda
  2018-03-11 11:05               ` Ingo Molnar
  1 sibling, 0 replies; 51+ messages in thread
From: Miguel Ojeda @ 2018-03-10 19:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Randy Dunlap, Andrew Morton, linux-kernel,
	Josh Poimboeuf, Rasmus Villemoes, Gustavo A. R. Silva,
	Tobin C. Harding, Steven Rostedt, Jonathan Corbet, Chris Mason,
	Josef Bacik, David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Ian Abbott, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening

On Sat, Mar 10, 2018 at 6:51 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So in *historical* context - when a compiler didn't do variable length
> arrays at all - the original semantics of C "constant expressions"
> actually make a ton of sense.
>
> You can basically think of a constant expression as something that can
> be (historically) handled by the front-end without any real
> complexity, and no optimization phases - just evaluating a simple
> parse tree with purely compile-time constants.
>
> So there's a good and perfectly valid reason for why C limits certain
> expressions to just a very particular subset. It's not just array
> sizes, it's  case statements etc too. And those are very much part of
> the C standard.
>
> So an error message like
>
>    warning: ISO C90 requires array sizes to be constant-expressions
>
> would be technically correct and useful from a portability angle. It
> tells you when you're doing something non-portable, and should be
> automatically enabled with "-ansi -pedantic", for example.
>
> So what's misleading is actually the name of the warning and the
> message, not that it happens. The warning isn't about "variable
> length", it's literally about the rules for what a
> "constant-expression" is.
>
> And in C, the expression (2,3) has a constant _value_ (namely 3), but
> it isn't a constant-expression as specified by the language.
>
> Now, the thing is that once you actually do variable length arrays,
> those old front-end rules make no sense any more (outside of the "give
> portability warnings" thing).
>
> Because once you do variable length arrays, you obviously _parse_
> everything just fine, and you're doing to evaluate much more complex
> expressions than some limited constant-expression rule.
>
> And at that point it would make a whole lot more sense to add a *new*
> warning that basically says "I have to generate code for a
> variable-sized array", if you actually talk about VLA's.
>
> But that's clearly not what gcc actually did.
>
> So the problem really is that -Wvla doesn't actually warn about VLA's,
> but about something technically completely different.
>

I *think* I followed your reasoning. For gcc, -Wvla is the "I have to
generate code for a variable-sized array" one; but in this case, the
array size is the actual issue that you would have liked to be warned
about; since people writing:

    int a[(2,3)];

did not really mean to declare a VLA. Therefore, you say warning them
about the "warning: ISO C90 requires array sizes to be
constant-expressions" (let's call it -Wpedantic-array-sizes) would be
more helpful here instead of saying stuff about VLAs.

In my case, I was just expecting gcc to give us both warnings and
that's it, instead of trying to be smart and give only the
-Wpedantic-array-sizes one (which is the one I was wondering in my
previous email about why it was missing). I think it would be clear
enough if both warnings are shown are the same time. And it makes
sense, since if you write that line in ISO C90 it means there really
are 2 things going wrong in the end (fishy syntax while in ISO C90
mode and, due to that, VLA code generated as well), no?

Thanks for taking the time to write about the historical context, by the way!

Miguel

> And that's why those stupid syntactic issues with min/max matter. It's
> not whether the end result is a compile-time constant or not, it's
> about completely different issues, like whether there is a
> comma-expression in it.
>
>               Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10 17:51             ` Linus Torvalds
  2018-03-10 19:08               ` Miguel Ojeda
@ 2018-03-11 11:05               ` Ingo Molnar
  2018-03-11 18:23                 ` Linus Torvalds
  1 sibling, 1 reply; 51+ messages in thread
From: Ingo Molnar @ 2018-03-11 11:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Miguel Ojeda, Kees Cook, Randy Dunlap, Andrew Morton,
	linux-kernel, Josh Poimboeuf, Rasmus Villemoes,
	Gustavo A. R. Silva, Tobin C. Harding, Steven Rostedt,
	Jonathan Corbet, Chris Mason, Josef Bacik, David Sterba,
	David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Peter Zijlstra, Thomas Gleixner, Masahiro Yamada,
	Borislav Petkov, Ian Abbott, Sergey Senozhatsky, Petr Mladek,
	Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> So an error message like
> 
>    warning: ISO C90 requires array sizes to be constant-expressions
> 
> would be technically correct and useful from a portability angle. It
> tells you when you're doing something non-portable, and should be
> automatically enabled with "-ansi -pedantic", for example.
> 
> So what's misleading is actually the name of the warning and the
> message, not that it happens. The warning isn't about "variable
> length", it's literally about the rules for what a
> "constant-expression" is.
> 
> And in C, the expression (2,3) has a constant _value_ (namely 3), but
> it isn't a constant-expression as specified by the language.
> 
> Now, the thing is that once you actually do variable length arrays,
> those old front-end rules make no sense any more (outside of the "give
> portability warnings" thing).
> 
> Because once you do variable length arrays, you obviously _parse_
> everything just fine, and you're doing to evaluate much more complex
> expressions than some limited constant-expression rule.

BTW., while I fully agree with everything you said, it's not entirely correct to 
claim that if a C compiler can generate VLA code it is necessarily able to parse 
and evaluate constant array sizes "just fine".

Constant expressions are typically parsed very early on, at the preprocessing 
stage. They can be used with some preprocessor directives as well, such as '#if' 
(with some further limitations on their syntax).

If VLA support is implemented in a later stage, and results in heavy-handed code 
generation that will technically work for constant value expressions as well but 
results in suboptimal code, then a warning should probably be emitted - and it 
wouldn't be pedantic.

The existing warning is still very misleading:

  warning: ISO C90 forbids variable length array ‘array’ [-Wvla]

... and if my above theory is correct then I think a better warning would be 
something like:

  warning: Array declaration is not a C90 constant expression, resulting in VLA code generation

... and note that in this specific case it's not misleading to talk about VLAs in 
the warning text, because the array size, even if it's constant value, results in 
VLA code generation.

I don't know whether GCC has such a limitation, but a quick experiment with GCC 
7.2 suggests that a (2,3) array size expression results in a lot more code being 
generated than with a constant expression.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-11 11:05               ` Ingo Molnar
@ 2018-03-11 18:23                 ` Linus Torvalds
  0 siblings, 0 replies; 51+ messages in thread
From: Linus Torvalds @ 2018-03-11 18:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Miguel Ojeda, Kees Cook, Randy Dunlap, Andrew Morton,
	linux-kernel, Josh Poimboeuf, Rasmus Villemoes,
	Gustavo A. R. Silva, Tobin C. Harding, Steven Rostedt,
	Jonathan Corbet, Chris Mason, Josef Bacik, David Sterba,
	David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Peter Zijlstra, Thomas Gleixner, Masahiro Yamada,
	Borislav Petkov, Ian Abbott, Sergey Senozhatsky, Petr Mladek,
	Andy Shevchenko, Pantelis Antoniou, Linux Btrfs,
	Network Development, Kernel Hardening

On Sun, Mar 11, 2018 at 4:05 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> BTW., while I fully agree with everything you said, it's not entirely correct to
> claim that if a C compiler can generate VLA code it is necessarily able to parse
> and evaluate constant array sizes "just fine".
>
> Constant expressions are typically parsed very early on, at the preprocessing
> stage. They can be used with some preprocessor directives as well, such as '#if'
> (with some further limitations on their syntax).

Yes. But constant simplification and CSE etc is just a very
fundamental part of a compiler, and anybody who actually implements
VLA's would have to do it anyway.

So no, a message like

  warning: Array declaration is not a C90 constant expression,
resulting in VLA code generation

would be moronic. Only some completely mindless broken shit would do
"oh, it's not a parse-time constant, so it will be variable". The two
just do not follow AT ALL.

So the message might be about _possibly_ resulting in VLA code
generation, but honestly, at that point you should just add the
warning when you actually generate the code to do the stack
allocation.

Because at that point, you know whether it's variable or not.

And trust me, it won't be variable for things like (2,3), or even for
our "max()" thing with other odd builtins. Not unless the compiler
doesn't really support VLA at all (maybe some bolted-on crazy thing
that just turns a VLA at the front-end time into just an alloca), or
the user has explicitly asked to disable some fundamental optimization
phase.

               Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 0/3] tracing: Rewrite the function filter code
  2018-03-10  2:34 [PATCH 0/3] tracing: Rewrite the function filter code Steven Rostedt
                   ` (2 preceding siblings ...)
  2018-03-10  2:34 ` [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster Steven Rostedt
@ 2018-03-11 19:54 ` Jiri Olsa
  3 siblings, 0 replies; 51+ messages in thread
From: Jiri Olsa @ 2018-03-11 19:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Fri, Mar 09, 2018 at 09:34:42PM -0500, Steven Rostedt wrote:

SNIP

> Special thanks goes out to Al for his patience and his time that he spent in
> educating us in a proper logical parser.
> 
> Two patches were added to do some initial clean up. The last patch
> implements Al's suggestions. I wrote up a very lengthy comment describing
> what Al taught us in my own words (hopefully I got it right), and the
> implementation is similar to what Al showed with a few changes (again,
> hopefully I got that right too). I tested this with lots of different
> filters and it looks to be holding up.
> 
> 
>  Jiri,
> 
> This affects perf as well. I ran some tests and I believe I got
> it woking as it does today.

nice, it looks ok to me.. I guess the only concern is the filter for
ftrace:function event, which uses the pred parsing code just to get
the 'ip == ...' pairs and feeds it to ftrace_ops filter api

I'll check and make some tests

I think tracepoint events filters should be working without any special
care because they use the same api as ftrace filters

jirka

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-09 21:10 ` Linus Torvalds
  2018-03-09 21:47   ` Kees Cook
@ 2018-03-11 22:46   ` Tobin C. Harding
  2018-03-13 13:31   ` David Laight
  2 siblings, 0 replies; 51+ messages in thread
From: Tobin C. Harding @ 2018-03-11 22:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Andrew Morton, Linux Kernel Mailing List,
	Josh Poimboeuf, Rasmus Villemoes, Gustavo A. R. Silva,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, Mar 09, 2018 at 01:10:30PM -0800, Linus Torvalds wrote:
> On Fri, Mar 9, 2018 at 12:05 PM, Kees Cook <keescook@chromium.org> wrote:
> > When max() is used in stack array size calculations from literal values
> > (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
> > thinks this is a dynamic calculation due to the single-eval logic, which
> > is not needed in the literal case. This change removes several accidental
> > stack VLAs from an x86 allmodconfig build:
> 
> Ok, looks good.
> 
> I just have a couple of questions about applying it.
> 
> In particular, if this will help people working on getting rid of
> VLA's in the short term, I can apply it directly. But if people who
> are looking at it (anybody else than Kees?) don't much care, then this
> might be a 4.17 thing or at least "random -mm queue"?

It's easy enough to work on the other VLA removals without basing on
this, no rush.


thanks,
Tobin.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/3] tracing: Combine enum and arrays into single macro in filter code
  2018-03-10  2:34 ` [PATCH 1/3] tracing: Combine enum and arrays into single macro in " Steven Rostedt
@ 2018-03-12 10:31   ` Masami Hiramatsu
  0 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2018-03-12 10:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Fri, 09 Mar 2018 21:34:43 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Instead of having a separate enum that is the index into another array, like
> a string array, make a single macro that combines them into a single list,
> and then the two can not get out of sync. This makes it easier to add and
> remove items.
> 
> The macro trick is:
> 
>  #define DOGS				\
>   C( JACK,     "Jack Russell")		\
>   C( ITALIAN,  "Italian Greyhound")	\
>   C( GERMAN,   "German Shepherd")
> 
>  #undef C
>  #define C(a, b) a
> 
>  enum { DOGS };
> 
>  #undef C
>  #define C(a, b) b
> 
>  static char dogs[] = { DOGS };

Looks good to me, and nice idea :)

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>

Thanks,

> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_events_filter.c | 112 ++++++++++++++++---------------------
>  1 file changed, 48 insertions(+), 64 deletions(-)
> 
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index c3c6eee1e4df..a2ef393b3bb2 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -33,22 +33,26 @@
>  	"# Only events with the given fields will be affected.\n"	\
>  	"# If no events are modified, an error message will be displayed here"
>  
> -enum filter_op_ids
> -{
> -	OP_OR,
> -	OP_AND,
> -	OP_GLOB,
> -	OP_NE,
> -	OP_EQ,
> -	OP_LT,
> -	OP_LE,
> -	OP_GT,
> -	OP_GE,
> -	OP_BAND,
> -	OP_NOT,
> -	OP_NONE,
> -	OP_OPEN_PAREN,
> -};
> +#define OPS					\
> +	C( OP_OR,	"||",		1 ),	\
> +	C( OP_AND,	"&&",		2 ),	\
> +	C( OP_GLOB,	"~",		4 ),	\
> +	C( OP_NE,	"!=",		4 ),	\
> +	C( OP_EQ,	"==",		4 ),	\
> +	C( OP_LT,	"<",		5 ),	\
> +	C( OP_LE,	"<=",		5 ),	\
> +	C( OP_GT,	">",		5 ),	\
> +	C( OP_GE,	">=",		5 ),	\
> +	C( OP_BAND,	"&",		6 ),	\
> +	C( OP_NOT,	"!",		6 ),	\
> +	C( OP_NONE,	"OP_NONE",	0 ),	\
> +	C( OP_OPEN_PAREN, "(",		0 ),	\
> +	C( OP_MAX,	NULL,		0 )
> +
> +#undef C
> +#define C(a, b, c)	a
> +
> +enum filter_op_ids { OPS };
>  
>  struct filter_op {
>  	int id;
> @@ -56,56 +60,36 @@ struct filter_op {
>  	int precedence;
>  };
>  
> -/* Order must be the same as enum filter_op_ids above */
> -static struct filter_op filter_ops[] = {
> -	{ OP_OR,	"||",		1 },
> -	{ OP_AND,	"&&",		2 },
> -	{ OP_GLOB,	"~",		4 },
> -	{ OP_NE,	"!=",		4 },
> -	{ OP_EQ,	"==",		4 },
> -	{ OP_LT,	"<",		5 },
> -	{ OP_LE,	"<=",		5 },
> -	{ OP_GT,	">",		5 },
> -	{ OP_GE,	">=",		5 },
> -	{ OP_BAND,	"&",		6 },
> -	{ OP_NOT,	"!",		6 },
> -	{ OP_NONE,	"OP_NONE",	0 },
> -	{ OP_OPEN_PAREN, "(",		0 },
> -};
> +#undef C
> +#define C(a, b, c)	{ a, b, c }
>  
> -enum {
> -	FILT_ERR_NONE,
> -	FILT_ERR_INVALID_OP,
> -	FILT_ERR_UNBALANCED_PAREN,
> -	FILT_ERR_TOO_MANY_OPERANDS,
> -	FILT_ERR_OPERAND_TOO_LONG,
> -	FILT_ERR_FIELD_NOT_FOUND,
> -	FILT_ERR_ILLEGAL_FIELD_OP,
> -	FILT_ERR_ILLEGAL_INTVAL,
> -	FILT_ERR_BAD_SUBSYS_FILTER,
> -	FILT_ERR_TOO_MANY_PREDS,
> -	FILT_ERR_MISSING_FIELD,
> -	FILT_ERR_INVALID_FILTER,
> -	FILT_ERR_IP_FIELD_ONLY,
> -	FILT_ERR_ILLEGAL_NOT_OP,
> -};
> +static struct filter_op filter_ops[] = { OPS };
>  
> -static char *err_text[] = {
> -	"No error",
> -	"Invalid operator",
> -	"Unbalanced parens",
> -	"Too many operands",
> -	"Operand too long",
> -	"Field not found",
> -	"Illegal operation for field type",
> -	"Illegal integer value",
> -	"Couldn't find or set field in one of a subsystem's events",
> -	"Too many terms in predicate expression",
> -	"Missing field name and/or value",
> -	"Meaningless filter expression",
> -	"Only 'ip' field is supported for function trace",
> -	"Illegal use of '!'",
> -};
> +#define ERRORS								\
> +	C( NONE,	 	"No error"),				\
> +	C( INVALID_OP,		"Invalid operator"),			\
> +	C( UNBALANCED_PAREN,	"Unbalanced parens"),			\
> +	C( TOO_MANY_OPERANDS,	"Too many operands"),			\
> +	C( OPERAND_TOO_LONG,	"Operand too long"),			\
> +	C( FIELD_NOT_FOUND,	"Field not found"),			\
> +	C( ILLEGAL_FIELD_OP,	"Illegal operation for field type"),	\
> +	C( ILLEGAL_INTVAL,	"Illegal integer value"),		\
> +	C( BAD_SUBSYS_FILTER,	"Couldn't find or set field in one of a subsystem's events"), \
> +	C( TOO_MANY_PREDS,	"Too many terms in predicate expression"), \
> +	C( MISSING_FIELD,	"Missing field name and/or value"),	\
> +	C( INVALID_FILTER,	"Meaningless filter expression"),	\
> +	C( IP_FIELD_ONLY,	"Only 'ip' field is supported for function trace"), \
> +	C( ILLEGAL_NOT_OP,	"Illegal use of '!'"),
> +
> +#undef C
> +#define C(a, b)		FILT_ERR_##a
> +
> +enum { ERRORS };
> +
> +#undef C
> +#define C(a, b)		b
> +
> +static char *err_text[] = { ERRORS };
>  
>  struct opstack_op {
>  	enum filter_op_ids op;
> -- 
> 2.15.1
> 
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-10  2:34 ` [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster Steven Rostedt
  2018-03-10  3:10   ` Steven Rostedt
  2018-03-10  3:18   ` Steven Rostedt
@ 2018-03-12 12:42   ` Jiri Olsa
  2018-03-12 18:38     ` Steven Rostedt
  2018-03-12 15:10   ` Jiri Olsa
  3 siblings, 1 reply; 51+ messages in thread
From: Jiri Olsa @ 2018-03-12 12:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Fri, Mar 09, 2018 at 09:34:45PM -0500, Steven Rostedt wrote:

SNIP

>  
> -/* If not of not match is equal to not of not, then it is a match */
> +/*
> + * Without going into a formal proof, this explains the method that is used in
> + * parsing the logical expressions.
> + *
> + * For example, if we have: "a && !(!b || (c && g)) || d || e && !f"
> + * The first pass will convert it into the following program:
> + *
> + * n1: r=a;       l1: if (!r) goto l4;
> + * n2: r=b;       l2: if (!r) goto l4;

got stuck in here.. should that be 'goto l5' ?

jirka

> + * n3: r=c; r=!r; l3: if (r) goto l4;
> + * n4: r=g; r=!r; l4: if (r) goto l5;
> + * n5: r=d;       l5: if (r) goto T
> + * n6: r=e;       l6: if (!r) goto l7;
> + * n7: r=f; r=!r; l7: if (!r) goto F
> + * T: return TRUE
> + * F: return FALSE

jirka

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/3] tracing: Clean up and document pred_funcs_##type creation and use
  2018-03-10  2:34 ` [PATCH 2/3] tracing: Clean up and document pred_funcs_##type creation and use Steven Rostedt
@ 2018-03-12 13:42   ` Masami Hiramatsu
  0 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2018-03-12 13:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Fri, 09 Mar 2018 21:34:44 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> The pred_funcs_##type arrays consist of five functions that are assigned
> based on the ops. The array must be in the same order of the ops each
> function represents. The PRED_FUNC_START macro denotes the op enum that
> starts the op that maps to the pred_funcs_##type arrays. This is all very
> subtle and prone to bugs if the code is changed.
> 
> Add comments describing how PRED_FUNC_START and pred_funcs_##type array is
> used, and also a PRED_FUNC_MAX that is the maximum number of functions in
> the arrays.
> 
> Clean up select_comparison_fn() that assigns the predicates to the
> pred_funcs_##type array function as well as add protection in case an op is
> passed in that does not map correctly to the array.
> 

This looks good to me.

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>

Thanks,

> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_events_filter.c | 46 ++++++++++++++++++++++++++------------
>  1 file changed, 32 insertions(+), 14 deletions(-)
> 
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index a2ef393b3bb2..9d383f4383dc 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -65,6 +65,13 @@ struct filter_op {
>  
>  static struct filter_op filter_ops[] = { OPS };
>  
> +/*
> + * pred functions are OP_LT, OP_LE, OP_GT, OP_GE, and OP_BAND
> + * pred_funcs_##type below must match the order of them above.
> + */
> +#define PRED_FUNC_START			OP_LT
> +#define PRED_FUNC_MAX			(OP_BAND - PRED_FUNC_START)
> +
>  #define ERRORS								\
>  	C( NONE,	 	"No error"),				\
>  	C( INVALID_OP,		"Invalid operator"),			\
> @@ -172,8 +179,6 @@ static const filter_pred_fn_t pred_funcs_##type[] = {			\
>  	filter_pred_BAND_##type,					\
>  };
>  
> -#define PRED_FUNC_START			OP_LT
> -
>  #define DEFINE_EQUALITY_PRED(size)					\
>  static int filter_pred_##size(struct filter_pred *pred, void *event)	\
>  {									\
> @@ -946,39 +951,52 @@ static filter_pred_fn_t select_comparison_fn(enum filter_op_ids op,
>  					    int field_size, int field_is_signed)
>  {
>  	filter_pred_fn_t fn = NULL;
> +	int pred_func_index = -1;
> +
> +	switch (op) {
> +	case OP_EQ:
> +	case OP_NE:
> +		break;
> +	default:
> +		if (WARN_ON_ONCE(op < PRED_FUNC_START))
> +			return NULL;
> +		pred_func_index = op - PRED_FUNC_START;
> +		if (WARN_ON_ONCE(pred_func_index > PRED_FUNC_MAX))
> +			return NULL;
> +	}
>  
>  	switch (field_size) {
>  	case 8:
> -		if (op == OP_EQ || op == OP_NE)
> +		if (pred_func_index < 0)
>  			fn = filter_pred_64;
>  		else if (field_is_signed)
> -			fn = pred_funcs_s64[op - PRED_FUNC_START];
> +			fn = pred_funcs_s64[pred_func_index];
>  		else
> -			fn = pred_funcs_u64[op - PRED_FUNC_START];
> +			fn = pred_funcs_u64[pred_func_index];
>  		break;
>  	case 4:
> -		if (op == OP_EQ || op == OP_NE)
> +		if (pred_func_index < 0)
>  			fn = filter_pred_32;
>  		else if (field_is_signed)
> -			fn = pred_funcs_s32[op - PRED_FUNC_START];
> +			fn = pred_funcs_s32[pred_func_index];
>  		else
> -			fn = pred_funcs_u32[op - PRED_FUNC_START];
> +			fn = pred_funcs_u32[pred_func_index];
>  		break;
>  	case 2:
> -		if (op == OP_EQ || op == OP_NE)
> +		if (pred_func_index < 0)
>  			fn = filter_pred_16;
>  		else if (field_is_signed)
> -			fn = pred_funcs_s16[op - PRED_FUNC_START];
> +			fn = pred_funcs_s16[pred_func_index];
>  		else
> -			fn = pred_funcs_u16[op - PRED_FUNC_START];
> +			fn = pred_funcs_u16[pred_func_index];
>  		break;
>  	case 1:
> -		if (op == OP_EQ || op == OP_NE)
> +		if (pred_func_index < 0)
>  			fn = filter_pred_8;
>  		else if (field_is_signed)
> -			fn = pred_funcs_s8[op - PRED_FUNC_START];
> +			fn = pred_funcs_s8[pred_func_index];
>  		else
> -			fn = pred_funcs_u8[op - PRED_FUNC_START];
> +			fn = pred_funcs_u8[pred_func_index];
>  		break;
>  	}
>  
> -- 
> 2.15.1
> 
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-10  2:34 ` [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster Steven Rostedt
                     ` (2 preceding siblings ...)
  2018-03-12 12:42   ` Jiri Olsa
@ 2018-03-12 15:10   ` Jiri Olsa
  2018-03-12 18:40     ` Steven Rostedt
  3 siblings, 1 reply; 51+ messages in thread
From: Jiri Olsa @ 2018-03-12 15:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Fri, Mar 09, 2018 at 09:34:45PM -0500, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Al Viro reviewed the filter logic of ftrace trace events and found it to be
> very troubling. It creates a binary tree based on the logic operators and
> walks it during tracing. He sent myself and Tom Zanussi a long explanation
> (and formal proof) of how to do the string parsing better and end up with a
> program array that can be simply iterated to come up with the correct
> results.
> 
> I took his ideas and his pseudo code and rewrote the filter logic based on
> them. In doing so, I was able to remove a lot of code, and have a much more
> condensed filter logic in the process. I wrote a very long comment
> describing the methadology that Al proposed in my own words. For more info
> on how this works, read the comment above predicate_parse().
> 
> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

got it crashed when clearing the filter via 'echo > filter'

jirka


---
ibm-x3650m4-01 login: [  803.551062] BUG: unable to handle kernel paging request at ffff972acc154c34
[  803.558846] IP: process_preds+0x5e9/0x680
[  803.563318] PGD 251bbf067 P4D 251bbf067 PUD 0 
[  803.568280] Oops: 0000 [#1] SMP PTI
[  803.572172] Modules linked in: intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm igb irqbypass crct10dif_pclmul ptp crc32_pclmul ipmi_ssmi_msghandler intel_rapl_perf wmi i2c_i801 mii lpc_ich xfs libcrc32c mgag200 i2c_algo_bit drm_kms_helper ttm crc32c_intel drm megaraid_sas
[  803.618767] CPU: 13 PID: 1297 Comm: bash Not tainted 4.16.0-rc4+ #96
[  803.625857] Hardware name: IBM System x3650 M4 : -[7915E2G]-/00Y7683, BIOS -[VVE124AUS-1.30]- 11/21/2012
[  803.636441] RIP: 0010:process_preds+0x5e9/0x680
[  803.641496] RSP: 0018:ffffa8aac2b9fd30 EFLAGS: 00010286
[  803.647327] RAX: ffff972375def4e0 RBX: ffff972575b84e08 RCX: 00000000fffffffe
[  803.655289] RDX: ffff972acc154c30 RSI: ffff972375def500 RDI: 0000000000023ee0
[  803.663251] RBP: ffff972575b84198 R08: ffff972377be3ee0 R09: ffff972575b84210
[  803.671213] R10: 0000000000000000 R11: 0000000040000000 R12: ffff972575b84ff0
[  803.679176] R13: 0000000000000001 R14: 00000000fffffff4 R15: ffff972575b84ff0
[  803.687139] FS:  00007f216a513700(0000) GS:ffff972377bc0000(0000) knlGS:0000000000000000
[  803.696169] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  803.702581] CR2: ffff972acc154c34 CR3: 0000000274478001 CR4: 00000000000606e0
[  803.710544] Call Trace:
[  803.713278]  create_filter+0x80/0xc0
[  803.717268]  apply_event_filter+0xb8/0x120
[  803.721837]  event_filter_write+0x5d/0xb0
[  803.726312]  __vfs_write+0x33/0x170
[  803.730207]  ? __inode_security_revalidate+0x4a/0x70
[  803.735740]  ? selinux_file_permission+0xce/0x120
[  803.740989]  ? _cond_resched+0x16/0x40
[  803.745173]  vfs_write+0xb0/0x190
[  803.748870]  SyS_write+0x52/0xc0
[  803.752473]  do_syscall_64+0x6e/0x180
[  803.756560]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  803.762196] RIP: 0033:0x7f2169c00b50
[  803.766182] RSP: 002b:00007ffc35e05db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  803.774630] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f2169c00b50
[  803.782592] RDX: 0000000000000001 RSI: 0000560031449e00 RDI: 0000000000000001
[  803.790555] RBP: 0000560031449e00 R08: 00007f2169ecb740 R09: 00007f216a513700
[  803.798517] R10: 0000560031453a00 R11: 0000000000000246 R12: 0000000000000001
[  803.806479] R13: 0000000000000001 R14: 00007f2169eca5e0 R15: 00007f2169ec63c0
[  803.814443] Code: 02 00 00 00 00 44 89 10 c7 40 04 00 00 00 00 83 e9 01 83 f9 ff 74 22 48 63 c1 48 c1 e0 04 48 01 f0 48 63 10 48 c1 e2 04 48 01 f2 <8b> 5a 04 39 58 04
[  803.835544] RIP: process_preds+0x5e9/0x680 RSP: ffffa8aac2b9fd30
[  803.842247] CR2: ffff972acc154c34
[  803.845945] ---[ end trace 81df1915c389f53d ]---

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-12 12:42   ` Jiri Olsa
@ 2018-03-12 18:38     ` Steven Rostedt
  0 siblings, 0 replies; 51+ messages in thread
From: Steven Rostedt @ 2018-03-12 18:38 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Mon, 12 Mar 2018 13:42:46 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> On Fri, Mar 09, 2018 at 09:34:45PM -0500, Steven Rostedt wrote:
> 
> SNIP
> 
> >  
> > -/* If not of not match is equal to not of not, then it is a match */
> > +/*
> > + * Without going into a formal proof, this explains the method that is used in
> > + * parsing the logical expressions.
> > + *
> > + * For example, if we have: "a && !(!b || (c && g)) || d || e && !f"
> > + * The first pass will convert it into the following program:
> > + *
> > + * n1: r=a;       l1: if (!r) goto l4;
> > + * n2: r=b;       l2: if (!r) goto l4;  
> 
> got stuck in here.. should that be 'goto l5' ?

Nope, this is correct. In fact, my user space implementation of the
code is what generated this output.

 !(!b || (c && g)) is the same as (b && (!c || !g)), so we can rewrite
 the above as:

 a && b && (!c || !g) || d || e && !f

Which makes it more obvious why it is goto l4.

And since we have:

 n1: r=a;	l1: if (!r) goto l4;
 n2: r=b;	l2: if (!r) goto l4;
 n3: r=c; r=!r;	l3: if (r) goto l4;
 n4: r=g; r=!r;	l4: if (r) goto l5;
 n5; r=d;	l5: if (r) goto T;

If a is true, if (!r) is false so we continue to n2.
If b is false, then if (!r) is true, so we take the branch. Let's see
what that does:

 l4: if (r) goto l5;

But since we jumped because r is false, then this if statement is
guaranteed to be false, and we continue to n5. Then we need to test d.

 a && b && ... || d

You can see how that is correct. If a is true and then be is false,
then we ignore the rest of the && statement and jump to testing the
other side of || which is d.

-- Steve

> 
> jirka
> 
> > + * n3: r=c; r=!r; l3: if (r) goto l4;
> > + * n4: r=g; r=!r; l4: if (r) goto l5;
> > + * n5: r=d;       l5: if (r) goto T
> > + * n6: r=e;       l6: if (!r) goto l7;
> > + * n7: r=f; r=!r; l7: if (!r) goto F
> > + * T: return TRUE
> > + * F: return FALSE  
> 
> jirka

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-12 15:10   ` Jiri Olsa
@ 2018-03-12 18:40     ` Steven Rostedt
  2018-03-12 18:54       ` Jiri Olsa
  0 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2018-03-12 18:40 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Mon, 12 Mar 2018 16:10:17 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> got it crashed when clearing the filter via 'echo > filter'

Awesome. I'll go and test this out. Thanks!

Hmm, could you pull my tree and test my branch: ftrace/core. I may have
already fixed this but haven't posted the latest (which I'll do before
pushing to next).

-- Steve

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-12 18:40     ` Steven Rostedt
@ 2018-03-12 18:54       ` Jiri Olsa
  2018-03-12 19:10         ` Steven Rostedt
  2018-03-12 23:52         ` Steven Rostedt
  0 siblings, 2 replies; 51+ messages in thread
From: Jiri Olsa @ 2018-03-12 18:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Mon, Mar 12, 2018 at 02:40:01PM -0400, Steven Rostedt wrote:
> On Mon, 12 Mar 2018 16:10:17 +0100
> Jiri Olsa <jolsa@redhat.com> wrote:
> 
> > got it crashed when clearing the filter via 'echo > filter'
> 
> Awesome. I'll go and test this out. Thanks!
> 
> Hmm, could you pull my tree and test my branch: ftrace/core. I may have
> already fixed this but haven't posted the latest (which I'll do before
> pushing to next).

I couldn't apply your patches from mailbox so I used your ftrace/core already

jirka

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-12 18:54       ` Jiri Olsa
@ 2018-03-12 19:10         ` Steven Rostedt
  2018-03-12 23:52         ` Steven Rostedt
  1 sibling, 0 replies; 51+ messages in thread
From: Steven Rostedt @ 2018-03-12 19:10 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Mon, 12 Mar 2018 19:54:14 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> On Mon, Mar 12, 2018 at 02:40:01PM -0400, Steven Rostedt wrote:
> > On Mon, 12 Mar 2018 16:10:17 +0100
> > Jiri Olsa <jolsa@redhat.com> wrote:
> >   
> > > got it crashed when clearing the filter via 'echo > filter'  
> > 
> > Awesome. I'll go and test this out. Thanks!
> > 
> > Hmm, could you pull my tree and test my branch: ftrace/core. I may have
> > already fixed this but haven't posted the latest (which I'll do before
> > pushing to next).  
> 
> I couldn't apply your patches from mailbox so I used your ftrace/core already
> 

Ug, I'm able to reproduce it. Damn, this passed my test suite. How did
it not trigger this. I need to look at my test cases!

-- Steve

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-10  1:30         ` Kees Cook
  2018-03-10  1:31           ` Kees Cook
@ 2018-03-12 22:55           ` Andrew Morton
  2018-03-12 23:57             ` Linus Torvalds
  1 sibling, 1 reply; 51+ messages in thread
From: Andrew Morton @ 2018-03-12 22:55 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Fri, 9 Mar 2018 17:30:15 -0800 Kees Cook <keescook@chromium.org> wrote:

> > It's one reason why I wondered if simplifying the expression to have
> > just that single __builtin_constant_p() might not end up working..
> 
> Yeah, it seems like it doesn't bail out as "false" for complex
> expressions given to __builtin_constant_p().
> 
> If no magic solution, then which of these?
> 
> - go back to my original "multi-eval max only for constants" macro (meh)
> - add gcc version checks around this and similarly for -Wvla in the future (eww)
> - raise gcc version (yikes)

Replacing the __builtin_choose_expr() with ?: works of course.  What
will be the runtime effects?

I tried replacing

	__builtin_choose_expr(__builtin_constant_p(x) &&
                              __builtin_constant_p(y),

with

	__builtin_choose_expr(__builtin_constant_p(x + y),

but no success.

I'm not sure what else to try to trick gcc into working.

--- a/include/linux/kernel.h~kernelh-skip-single-eval-logic-on-literals-in-min-max-v3-fix
+++ a/include/linux/kernel.h
@@ -804,13 +804,10 @@ static inline void ftrace_dump(enum ftra
  * accidental VLA.
  */
 #define __min(t1, t2, x, y)						\
-	__builtin_choose_expr(__builtin_constant_p(x) &&		\
-			      __builtin_constant_p(y),			\
-			      (t1)(x) < (t2)(y) ? (t1)(x) : (t2)(y),	\
-			      __single_eval_min(t1, t2,			\
-						__UNIQUE_ID(min1_),	\
-						__UNIQUE_ID(min2_),	\
-						x, y))
+	((__builtin_constant_p(x) && __builtin_constant_p(y)) ?		\
+		((t1)(x) < (t2)(y) ? (t1)(x) : (t2)(y)) :		\
+		(__single_eval_min(t1, t2, __UNIQUE_ID(min1_),		\
+				  __UNIQUE_ID(min2_), x, y)))
 
 /**
  * min - return minimum of two values of the same or compatible types
@@ -826,13 +823,11 @@ static inline void ftrace_dump(enum ftra
 	max1 > max2 ? max1 : max2; })
 
 #define __max(t1, t2, x, y)						\
-	__builtin_choose_expr(__builtin_constant_p(x) &&		\
-			      __builtin_constant_p(y),			\
-			      (t1)(x) > (t2)(y) ? (t1)(x) : (t2)(y),	\
-			      __single_eval_max(t1, t2,			\
-						__UNIQUE_ID(max1_),	\
-						__UNIQUE_ID(max2_),	\
-						x, y))
+	((__builtin_constant_p(x) && __builtin_constant_p(y)) ?		\
+		((t1)(x) > (t2)(y) ? (t1)(x) : (t2)(y)) :		\
+		(__single_eval_max(t1, t2, __UNIQUE_ID(max1_),		\
+				  __UNIQUE_ID(max2_), x, y)))
+
 /**
  * max - return maximum of two values of the same or compatible types
  * @x: first value
_

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-12 18:54       ` Jiri Olsa
  2018-03-12 19:10         ` Steven Rostedt
@ 2018-03-12 23:52         ` Steven Rostedt
  2018-03-13 10:14           ` Jiri Olsa
  1 sibling, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2018-03-12 23:52 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Mon, 12 Mar 2018 19:54:14 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> On Mon, Mar 12, 2018 at 02:40:01PM -0400, Steven Rostedt wrote:
> > On Mon, 12 Mar 2018 16:10:17 +0100
> > Jiri Olsa <jolsa@redhat.com> wrote:
> >   
> > > got it crashed when clearing the filter via 'echo > filter'  
> > 
> > Awesome. I'll go and test this out. Thanks!
> > 
> > Hmm, could you pull my tree and test my branch: ftrace/core. I may have
> > already fixed this but haven't posted the latest (which I'll do before
> > pushing to next).  
> 
> I couldn't apply your patches from mailbox so I used your ftrace/core already
>

Jiri, If you apply the below, does it fix it for you?

-- Steve

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index a5a131ec3c9c..4099e141188c 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1498,11 +1498,14 @@ static int process_preds(struct trace_event_call *call,
 		return ret;
 	}
 
-	prog = predicate_parse(filter_string, nr_parens, nr_preds,
+	if (!nr_preds) {
+		prog = NULL;
+	} else {
+		prog = predicate_parse(filter_string, nr_parens, nr_preds,
 			       parse_pred, call, pe);
-	if (IS_ERR(prog))
-		return PTR_ERR(prog);
-
+		if (IS_ERR(prog))
+			return PTR_ERR(prog);
+	}
 	rcu_assign_pointer(filter->prog, prog);
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-12 22:55           ` Andrew Morton
@ 2018-03-12 23:57             ` Linus Torvalds
  2018-03-13  4:28               ` Kees Cook
  0 siblings, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2018-03-12 23:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kees Cook, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Mon, Mar 12, 2018 at 3:55 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> Replacing the __builtin_choose_expr() with ?: works of course.

Hmm. That sounds like the right thing to do. We were so myopically
staring at the __builtin_choose_expr() problem that we overlooked the
obvious solution.

Using __builtin_constant_p() together with a ?: is in fact our common
pattern, so that should be fine. The only real reason to use
__builtin_choose_expr() is if you want to get the *type* to vary
depending on which side you choose, but that's not an issue for
min/max.

> What will be the runtime effects?

There should be none. Gcc will turn the conditional for the ?: into a
constant, and DTRT.

              Linus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-12 23:57             ` Linus Torvalds
@ 2018-03-13  4:28               ` Kees Cook
  2018-03-13 21:02                 ` Andrew Morton
  0 siblings, 1 reply; 51+ messages in thread
From: Kees Cook @ 2018-03-13  4:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Mon, Mar 12, 2018 at 4:57 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Mar 12, 2018 at 3:55 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
>>
>> Replacing the __builtin_choose_expr() with ?: works of course.
>
> Hmm. That sounds like the right thing to do. We were so myopically
> staring at the __builtin_choose_expr() problem that we overlooked the
> obvious solution.
>
> Using __builtin_constant_p() together with a ?: is in fact our common
> pattern, so that should be fine. The only real reason to use
> __builtin_choose_expr() is if you want to get the *type* to vary
> depending on which side you choose, but that's not an issue for
> min/max.

This doesn't solve it for -Wvla, unfortunately. That was the point of
Josh's original suggestion of __builtin_choose_expr().

Try building with KCFLAGS=-Wval and checking net/ipv6/proc.c:

net/ipv6/proc.c: In function ‘snmp6_seq_show_item’:
net/ipv6/proc.c:198:2: warning: ISO C90 forbids array ‘buff’ whose
size can’t be evaluated [-Wvla]
  unsigned long buff[SNMP_MIB_MAX];
  ^~~~~~~~


-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-12 23:52         ` Steven Rostedt
@ 2018-03-13 10:14           ` Jiri Olsa
  2018-03-13 14:12             ` Steven Rostedt
  0 siblings, 1 reply; 51+ messages in thread
From: Jiri Olsa @ 2018-03-13 10:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Mon, Mar 12, 2018 at 07:52:45PM -0400, Steven Rostedt wrote:
> On Mon, 12 Mar 2018 19:54:14 +0100
> Jiri Olsa <jolsa@redhat.com> wrote:
> 
> > On Mon, Mar 12, 2018 at 02:40:01PM -0400, Steven Rostedt wrote:
> > > On Mon, 12 Mar 2018 16:10:17 +0100
> > > Jiri Olsa <jolsa@redhat.com> wrote:
> > >   
> > > > got it crashed when clearing the filter via 'echo > filter'  
> > > 
> > > Awesome. I'll go and test this out. Thanks!
> > > 
> > > Hmm, could you pull my tree and test my branch: ftrace/core. I may have
> > > already fixed this but haven't posted the latest (which I'll do before
> > > pushing to next).  
> > 
> > I couldn't apply your patches from mailbox so I used your ftrace/core already
> >
> 
> Jiri, If you apply the below, does it fix it for you?

yes, the crash is gone and I can set filter ftrace/function,
but I'm still having some issues put that filter through perf

  # perf record -e ftrace:function --filter "ip == 0xffffffffa41e8490" ls

but I might be just missing something.. it's been a while ;-) I'm looking to that

jirka


> 
> -- Steve
> 
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index a5a131ec3c9c..4099e141188c 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -1498,11 +1498,14 @@ static int process_preds(struct trace_event_call *call,
>  		return ret;
>  	}
>  
> -	prog = predicate_parse(filter_string, nr_parens, nr_preds,
> +	if (!nr_preds) {
> +		prog = NULL;
> +	} else {
> +		prog = predicate_parse(filter_string, nr_parens, nr_preds,
>  			       parse_pred, call, pe);
> -	if (IS_ERR(prog))
> -		return PTR_ERR(prog);
> -
> +		if (IS_ERR(prog))
> +			return PTR_ERR(prog);
> +	}
>  	rcu_assign_pointer(filter->prog, prog);
>  	return 0;
>  }

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-09 21:10 ` Linus Torvalds
  2018-03-09 21:47   ` Kees Cook
  2018-03-11 22:46   ` Tobin C. Harding
@ 2018-03-13 13:31   ` David Laight
  2 siblings, 0 replies; 51+ messages in thread
From: David Laight @ 2018-03-13 13:31 UTC (permalink / raw)
  To: 'Linus Torvalds', Kees Cook
  Cc: Andrew Morton, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

The amount of replicated defined could also be reduced by passing > or <
to a min_max() macro.
So you start off with something like:
#define min(x, y) __min_max(x, <, y)
#define max(x, y) __min_max(x, >, y)
then have:
#define __min_max(x, cond, y) ((x) cond (y) ? (x) : (y))
in all its associated flavours.

	David



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-13 10:14           ` Jiri Olsa
@ 2018-03-13 14:12             ` Steven Rostedt
  2018-03-13 14:27               ` Jiri Olsa
  0 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2018-03-13 14:12 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Tue, 13 Mar 2018 11:14:01 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> > Jiri, If you apply the below, does it fix it for you?  
> 
> yes, the crash is gone and I can set filter ftrace/function,

Great!

> but I'm still having some issues put that filter through perf
> 
>   # perf record -e ftrace:function --filter "ip == 0xffffffffa41e8490" ls
> 
> but I might be just missing something.. it's been a while ;-) I'm looking to that

I have to ask. Did that work with the old code? The ftrace filter was
special in the old code and I tried to simulate it in the new code.
I'm not sure I checked if ip can take an address, but from what the code
looked like, it wouldn't. It looked like it required a name of a
function. Something that gets passed into "set_ftrace_filter" which is
not an address.

So instead of doing something like:

 perf record -e ftrace:function --filter "ip == 0xffffffff810ccfa0" ls

You would need to do

 perf record -e ftrace:function --filter "ip == schedule_tail" ls

because perf doesn't use the filter for the function, it uses the
ftrace_ops->hash tables. If it would simply take the address, we could
just use the trace_events_filter logic, and not make it a special case.

-- Steve

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster
  2018-03-13 14:12             ` Steven Rostedt
@ 2018-03-13 14:27               ` Jiri Olsa
  0 siblings, 0 replies; 51+ messages in thread
From: Jiri Olsa @ 2018-03-13 14:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Al Viro, Tom Zanussi,
	Namhyung Kim, Masami Hiramatsu, Jiri Olsa

On Tue, Mar 13, 2018 at 10:12:44AM -0400, Steven Rostedt wrote:
> On Tue, 13 Mar 2018 11:14:01 +0100
> Jiri Olsa <jolsa@redhat.com> wrote:
> 
> > > Jiri, If you apply the below, does it fix it for you?  
> > 
> > yes, the crash is gone and I can set filter ftrace/function,
> 
> Great!
> 
> > but I'm still having some issues put that filter through perf
> > 
> >   # perf record -e ftrace:function --filter "ip == 0xffffffffa41e8490" ls
> > 
> > but I might be just missing something.. it's been a while ;-) I'm looking to that
> 
> I have to ask. Did that work with the old code? The ftrace filter was
> special in the old code and I tried to simulate it in the new code.
> I'm not sure I checked if ip can take an address, but from what the code
> looked like, it wouldn't. It looked like it required a name of a
> function. Something that gets passed into "set_ftrace_filter" which is
> not an address.
> 
> So instead of doing something like:
> 
>  perf record -e ftrace:function --filter "ip == 0xffffffff810ccfa0" ls
> 
> You would need to do
> 
>  perf record -e ftrace:function --filter "ip == schedule_tail" ls
> 
> because perf doesn't use the filter for the function, it uses the
> ftrace_ops->hash tables. If it would simply take the address, we could
> just use the trace_events_filter logic, and not make it a special case.

ok, that's what I've been missing.. for some reason I thought
we need to pass the address.. everything checks out then

I checked few filters and it seems to work properly to me

thanks,
jirka

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-13  4:28               ` Kees Cook
@ 2018-03-13 21:02                 ` Andrew Morton
  2018-03-13 22:14                   ` Kees Cook
  0 siblings, 1 reply; 51+ messages in thread
From: Andrew Morton @ 2018-03-13 21:02 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Mon, 12 Mar 2018 21:28:57 -0700 Kees Cook <keescook@chromium.org> wrote:

> On Mon, Mar 12, 2018 at 4:57 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Mon, Mar 12, 2018 at 3:55 PM, Andrew Morton
> > <akpm@linux-foundation.org> wrote:
> >>
> >> Replacing the __builtin_choose_expr() with ?: works of course.
> >
> > Hmm. That sounds like the right thing to do. We were so myopically
> > staring at the __builtin_choose_expr() problem that we overlooked the
> > obvious solution.
> >
> > Using __builtin_constant_p() together with a ?: is in fact our common
> > pattern, so that should be fine. The only real reason to use
> > __builtin_choose_expr() is if you want to get the *type* to vary
> > depending on which side you choose, but that's not an issue for
> > min/max.
> 
> This doesn't solve it for -Wvla, unfortunately. That was the point of
> Josh's original suggestion of __builtin_choose_expr().
> 
> Try building with KCFLAGS=-Wval and checking net/ipv6/proc.c:
> 
> net/ipv6/proc.c: In function ‘snmp6_seq_show_item’:
> net/ipv6/proc.c:198:2: warning: ISO C90 forbids array ‘buff’ whose
> size can’t be evaluated [-Wvla]
>   unsigned long buff[SNMP_MIB_MAX];
>   ^~~~~~~~

PITA.  Didn't we once have a different way of detecting VLAs?  Some
post-compilation asm parser, iirc.

I suppose the world wouldn't end if we had a gcc version ifdef in
kernel.h.  We'll get to remove it in, oh, ten years.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-13 21:02                 ` Andrew Morton
@ 2018-03-13 22:14                   ` Kees Cook
  2018-03-14 11:35                     ` David Laight
  0 siblings, 1 reply; 51+ messages in thread
From: Kees Cook @ 2018-03-13 22:14 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

On Tue, Mar 13, 2018 at 2:02 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Mon, 12 Mar 2018 21:28:57 -0700 Kees Cook <keescook@chromium.org> wrote:
>
>> On Mon, Mar 12, 2018 at 4:57 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> > On Mon, Mar 12, 2018 at 3:55 PM, Andrew Morton
>> > <akpm@linux-foundation.org> wrote:
>> >>
>> >> Replacing the __builtin_choose_expr() with ?: works of course.
>> >
>> > Hmm. That sounds like the right thing to do. We were so myopically
>> > staring at the __builtin_choose_expr() problem that we overlooked the
>> > obvious solution.
>> >
>> > Using __builtin_constant_p() together with a ?: is in fact our common
>> > pattern, so that should be fine. The only real reason to use
>> > __builtin_choose_expr() is if you want to get the *type* to vary
>> > depending on which side you choose, but that's not an issue for
>> > min/max.
>>
>> This doesn't solve it for -Wvla, unfortunately. That was the point of
>> Josh's original suggestion of __builtin_choose_expr().
>>
>> Try building with KCFLAGS=-Wval and checking net/ipv6/proc.c:
>>
>> net/ipv6/proc.c: In function ‘snmp6_seq_show_item’:
>> net/ipv6/proc.c:198:2: warning: ISO C90 forbids array ‘buff’ whose
>> size can’t be evaluated [-Wvla]
>>   unsigned long buff[SNMP_MIB_MAX];
>>   ^~~~~~~~
>
> PITA.  Didn't we once have a different way of detecting VLAs?  Some
> post-compilation asm parser, iirc.
>
> I suppose the world wouldn't end if we had a gcc version ifdef in
> kernel.h.  We'll get to remove it in, oh, ten years.

For fixing only 6 VLAs, we don't need all this effort. When it looked
like we could get away with just a "better" max(), sure. ;)

I'll send a "const_max()" which will refuse to work on
non-constant-values (so it doesn't get accidentally used on variables
that could be exposed to double-evaluation), and will work for stack
array declarations (to avoid the overly-sensitive -Wvla checks).

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
  2018-03-13 22:14                   ` Kees Cook
@ 2018-03-14 11:35                     ` David Laight
  0 siblings, 0 replies; 51+ messages in thread
From: David Laight @ 2018-03-14 11:35 UTC (permalink / raw)
  To: 'Kees Cook', Andrew Morton
  Cc: Linus Torvalds, Linux Kernel Mailing List, Josh Poimboeuf,
	Rasmus Villemoes, Gustavo A. R. Silva, Tobin C. Harding,
	Steven Rostedt, Jonathan Corbet, Chris Mason, Josef Bacik,
	David Sterba, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Masahiro Yamada, Borislav Petkov, Randy Dunlap, Ian Abbott,
	Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Pantelis Antoniou, Linux Btrfs, Network Development,
	Kernel Hardening

From: Kees Cook
> Sent: 13 March 2018 22:15
...
> I'll send a "const_max()" which will refuse to work on
> non-constant-values (so it doesn't get accidentally used on variables
> that could be exposed to double-evaluation), and will work for stack
> array declarations (to avoid the overly-sensitive -Wvla checks).

ISTR the definitions were of the form:
	char foo[max(sizeof (struct bah), sizeof (struct baz))];
This doesn't generate a 'foo' with the required alignment.
It would be much better to use a union.

	David


^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2018-03-14 11:35 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-09 20:05 [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max() Kees Cook
2018-03-09 21:10 ` Linus Torvalds
2018-03-09 21:47   ` Kees Cook
2018-03-11 22:46   ` Tobin C. Harding
2018-03-13 13:31   ` David Laight
2018-03-10  0:07 ` Andrew Morton
2018-03-10  0:28   ` Linus Torvalds
2018-03-10  0:32     ` Andrew Morton
2018-03-10  0:38       ` Linus Torvalds
2018-03-10  1:30         ` Kees Cook
2018-03-10  1:31           ` Kees Cook
2018-03-10  2:37             ` Linus Torvalds
2018-03-12 22:55           ` Andrew Morton
2018-03-12 23:57             ` Linus Torvalds
2018-03-13  4:28               ` Kees Cook
2018-03-13 21:02                 ` Andrew Morton
2018-03-13 22:14                   ` Kees Cook
2018-03-14 11:35                     ` David Laight
2018-03-10  3:11   ` Randy Dunlap
2018-03-10  6:10     ` Miguel Ojeda
2018-03-10  7:03       ` Miguel Ojeda
2018-03-10 16:04         ` Linus Torvalds
2018-03-10 15:33       ` Kees Cook
2018-03-10 16:11         ` Linus Torvalds
2018-03-10 16:30         ` Linus Torvalds
2018-03-10 17:34           ` Miguel Ojeda
2018-03-10 17:51             ` Linus Torvalds
2018-03-10 19:08               ` Miguel Ojeda
2018-03-11 11:05               ` Ingo Molnar
2018-03-11 18:23                 ` Linus Torvalds
2018-03-10  2:34 [PATCH 0/3] tracing: Rewrite the function filter code Steven Rostedt
2018-03-10  2:34 ` [PATCH 1/3] tracing: Combine enum and arrays into single macro in " Steven Rostedt
2018-03-12 10:31   ` Masami Hiramatsu
2018-03-10  2:34 ` [PATCH 2/3] tracing: Clean up and document pred_funcs_##type creation and use Steven Rostedt
2018-03-12 13:42   ` Masami Hiramatsu
2018-03-10  2:34 ` [PATCH 3/3] tracing: Rewrite filter logic to be simpler and faster Steven Rostedt
2018-03-10  3:10   ` Steven Rostedt
2018-03-10  3:15     ` Steven Rostedt
2018-03-10  3:22       ` Steven Rostedt
2018-03-10  3:18   ` Steven Rostedt
2018-03-12 12:42   ` Jiri Olsa
2018-03-12 18:38     ` Steven Rostedt
2018-03-12 15:10   ` Jiri Olsa
2018-03-12 18:40     ` Steven Rostedt
2018-03-12 18:54       ` Jiri Olsa
2018-03-12 19:10         ` Steven Rostedt
2018-03-12 23:52         ` Steven Rostedt
2018-03-13 10:14           ` Jiri Olsa
2018-03-13 14:12             ` Steven Rostedt
2018-03-13 14:27               ` Jiri Olsa
2018-03-11 19:54 ` [PATCH 0/3] tracing: Rewrite the function filter code Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).