Re: [Cocci] getting rid of implicit boolean expressions

From: Julia Lawall <julia.lawall@inria.fr>
To: Akos Pasztory <akos.pasztory@gmail.com>
Cc: cocci@systeme.lip6.fr
Subject: Re: [Cocci] getting rid of implicit boolean expressions
Date: Wed, 21 Apr 2021 21:16:49 +0200 (CEST)
Message-ID: <alpine.DEB.2.22.394.2104212103160.20674@hadrien>
In-Reply-To: <CAJwHcF6jc_NNGeXpPh0z7upKLXSOuprS=SPmiR-x-QdYxZiEyw@mail.gmail.com>

On Wed, 21 Apr 2021, Akos Pasztory wrote:

> Hi,
>
> I'm trying to do the following kind of transformations:
>
>  int x, y;
>  char *p;
>  bool b, c;
>
> -b = x || !y;
> +b = (x != 0) || (y == 0);
>
> -c = !p;
> +c = (p == NULL);
>
> -if (x & 3)
> +if ((x & 3) != 0)
>  f();
> // etc
>
> That is: trying to eliminate implicit boolean-ness (and add parentheses as well).
>
> I was thinking along the lines of first finding expressions
> that are in "boolean context" (part of a || or && expression,
> or an if/for/while condition, maybe something else too?).
> Then find sub-expressions of those that are not of the form 'E op F'
> where 'op' is a comparison operator (==, !=, <=, ...).
> And finally depending on whether they are pointer or integer and
> whether they are negated, replace them with the above constructs (x != 0, etc.)
>
> Is this the right way to think about this?  That is, does it fit the mental
> model of Coccinelle, or is some other approach needed? (E.g. it crossed my
> mind to maybe match all expressions and try to filter out "unwanted" ones via
> position p != { ... } constraints but that seemed infeasible.)

A simple approach could be:

@@
idexpression *x;
@@

- x
+ (x != NULL)
  || ...

@@
idexpression x;
@@

- x
+ (x != 0)
  || ...
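
As a rough illustration of what these two rules aim for, here is a
hand-written C sketch (not output from an actual spatch run; the function
names are made up for the example).  The int operand gets a != 0 test and
the pointer operand gets a != NULL test:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical input: both operands of || rely on implicit truthiness. */
static bool any_set_implicit(int x, const char *p)
{
    return x || p;
}

/* The shape the rules aim for: the pointer rule turns p into (p != NULL),
 * and the generic rule turns x into (x != 0). */
static bool any_set_explicit(int x, const char *p)
{
    return (x != 0) || (p != NULL);
}
```

Both functions return the same value for every input; only the spelling
of the condition changes.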

If you want to do function calls, you could do

@@
expression *e;
identifier f;
expression list es;
@@

- f(es)@e
+ (f(es) != NULL)
  || ...

@@
identifier f;
expression list es;
@@

- f(es)
+ (f(es) != 0)
  || ...

Some explanation:

* For pattern || ... there is an isomorphism that allows the pattern to
appear anywhere in the top level of a chain of ||s, including an empty
chain.  So it actually should match any expression in a boolean context.

* In the third rule, there is )@e.  That means that e should match the
smallest expression that contains the ), which turns out to be the
function call.  That way you can talk about the return type of the
function call.  A limitation here is that Coccinelle has to be able to
figure out what the type is (this is also a limitation of the first rule
above).  If it can't figure out the type of the variable or the return
type of the function call, then the first/third rule will fail and you
will end up with a != 0 test on a pointer.  To try to avoid this, you can
use the options --recursive-includes --use-headers-for-types
--relax-include-path to try to take into account as many header files as
possible.
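
To show what the type limitation means in practice, here is a small
hand-written C sketch (the helpers find_name and count_items are
hypothetical, and this is not spatch output).  It contrasts the intended
result with the fallback you would get if the pointer return type could
not be inferred:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers; their return types are what the rules depend on. */
static const char *find_name(int id) { return id == 1 ? "one" : NULL; }
static int count_items(int id) { return id > 0 ? id : 0; }

/* Implicit form: both calls are used directly as operands of ||. */
static bool has_data_implicit(int id)
{
    return find_name(id) || count_items(id);
}

/* Intended result when the return types are known. */
static bool has_data_explicit(int id)
{
    return (find_name(id) != NULL) || (count_items(id) != 0);
}

/* Fallback when the pointer type is unknown: the generic rule applies,
 * giving a != 0 test on a pointer. */
static bool has_data_fallback(int id)
{
    return (find_name(id) != 0) || (count_items(id) != 0);
}
```

The fallback is still valid C with the same behavior, since 0 is a null
pointer constant, but it is not the style the transformation is after.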

An alternate approach is indeed to do something with position variables.
So you could do something like:

@ok@
position p;
expression e;
expression x;
@@

 (x != 0)@e@p

@@
position p != ok.p;
expression x;
@@

- x@p
+ (x != 0)
  || ...

But the first rule would have to be extended to consider lots of cases.

A binary operator metavariable could be helpful, eg:

binary operator bop = { ==, !=, <, > };
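
The cases that need exempting are operands that are already comparisons.
A hand-written C sketch of the difference (function names are made up,
and this is what one would expect rather than verified spatch output):

```c
#include <stdbool.h>

/* Hypothetical input: one operand is already a comparison, one is not. */
static bool check_implicit(int a, int b, int c)
{
    return a < b || c;
}

/* Desired output: only the bare operand c is rewritten. */
static bool check_desired(int a, int b, int c)
{
    return a < b || (c != 0);
}

/* What a rule on any "expression x" would give if a < b is not exempted:
 * the same behavior, but with a redundant test around the comparison. */
static bool check_overwrapped(int a, int b, int c)
{
    return ((a < b) != 0) || (c != 0);
}
```

All three behave identically, so the exemption is about keeping the
result readable rather than about correctness.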

julia
