Coccinelle archive on lore.kernel.org
 help / Atom feed
* [Cocci] First coccinelle script, need some help.
@ 2018-10-10 19:38 joel
  2018-10-10 20:23 ` julia.lawall
  0 siblings, 1 reply; 16+ messages in thread
From: joel @ 2018-10-10 19:38 UTC (permalink / raw)
  To: cocci


Hi!

I am trying to determine if a function argument is used across the whole
kernel for a certain kernel function.

I mustered up enough courage to write my first coccinelle script after a few
late nights of reading up about it :)

Here is .cocci script. I am trying to find if address is used at all in any
possible definitions of pte_alloc():

$ cat ~/pte_alloc.cocci
virtual report

@pte_args depends on report@
identifier E1, E2;
type T1, T2;
position p;
@@

 pte_alloc at p(T1 E1, T2 E2)
 {
...
(
...
 E2
...
)
...
 }

@script:python depends on report@
p << pte_args.p;
@@
coccilib.report.print_report(p[0], "WARNING: found definition of
apte_alloc_one with address used in the body")

The above warning does fire on the following test.c program:

struct page *pte_alloc(struct mm_struct *mm, unsigned long address)
{
        address++;
         if (condition()) {
                return NULL;
        }
}

But, *not* if I move 'address' into the if block:

struct page *pte_alloc(struct mm_struct *mm, unsigned long address)
{
         if (condition()) {
              address++;
              return NULL;
        }
}

I could not understand why, In my view the "address" expression should be
matched across the function body even within if blocks. But if I move
"address" into the if block, then the match doesn't occur any longer.

My coccicheck command is as follow:
make coccicheck COCCI=~/pte_alloc.cocci MODE=report  M=test/test.c

What am I missing? Thanks for any help.

thanks,

 - Joel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] First coccinelle script, need some help.
  2018-10-10 19:38 [Cocci] First coccinelle script, need some help joel
@ 2018-10-10 20:23 ` julia.lawall
  2018-10-10 20:45   ` joel
  0 siblings, 1 reply; 16+ messages in thread
From: julia.lawall @ 2018-10-10 20:23 UTC (permalink / raw)
  To: cocci



On Wed, 10 Oct 2018, Joel Fernandes wrote:

>
> Hi!
>
> I am trying to determine if a function argument is used across the whole
> kernel for a certain kernel function.
>
> I mustered up enough courage to write my first coccinelle script after a few
> late nights of reading up about it :)
>
> Here is .cocci script. I am trying to find if address is used at all in any
> possible definitions of pte_alloc():
>
> $ cat ~/pte_alloc.cocci
> virtual report
>
> @pte_args depends on report@
> identifier E1, E2;
> type T1, T2;
> position p;
> @@
>
>  pte_alloc at p(T1 E1, T2 E2)
>  {
> ...
> (
> ...
>  E2
> ...
> )
> ...
>  }


In report mode, by default, the pattern has to match on all paths.  Also
when you have ... before or after E2, there can be no occurrence of E2 in
the code matched by the ...  So your rule requires that on every possible
execution path through the function, there is exactly one occurrence of
E2.

You can try the following instead:

virtual report

@pte_args depends on report exists@
identifier E1, E2;
type T1, T2;
position p;
@@

 pte_alloc at p(T1 E1, T2 E2)
 {
 ... when any
 E2
 ... when any
 }

The exists in the rule header means check one path at a time.  The when
any allows anything, including E2, to occur in the ... part.  You could
also drop the first when any.  The E2 will only match the first one, but
you don't care in this case.

julia

>
> @script:python depends on report@
> p << pte_args.p;
> @@
> coccilib.report.print_report(p[0], "WARNING: found definition of
> apte_alloc_one with address used in the body")
>
> The above warning does fire on the following test.c program:
>
> struct page *pte_alloc(struct mm_struct *mm, unsigned long address)
> {
>         address++;
>          if (condition()) {
>                 return NULL;
>         }
> }
>
> But, *not* if I move 'address' into the if block:
>
> struct page *pte_alloc(struct mm_struct *mm, unsigned long address)
> {
>          if (condition()) {
>               address++;
>               return NULL;
>         }
> }
>
> I could not understand why, In my view the "address" expression should be
> matched across the function body even within if blocks. But if I move
> "address" into the if block, then the match doesn't occur any longer.
>
> My coccicheck command is as follow:
> make coccicheck COCCI=~/pte_alloc.cocci MODE=report  M=test/test.c
>
> What am I missing? Thanks for any help.
>
> thanks,
>
>  - Joel
>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] First coccinelle script, need some help.
  2018-10-10 20:23 ` julia.lawall
@ 2018-10-10 20:45   ` joel
  2018-10-10 20:51     ` julia.lawall
  0 siblings, 1 reply; 16+ messages in thread
From: joel @ 2018-10-10 20:45 UTC (permalink / raw)
  To: cocci

On Wed, Oct 10, 2018 at 10:23:18PM +0200, Julia Lawall wrote:
> 
> 
> On Wed, 10 Oct 2018, Joel Fernandes wrote:
> 
> >
> > Hi!
> >
> > I am trying to determine if a function argument is used across the whole
> > kernel for a certain kernel function.
> >
> > I mustered up enough courage to write my first coccinelle script after a few
> > late nights of reading up about it :)
> >
> > Here is .cocci script. I am trying to find if address is used at all in any
> > possible definitions of pte_alloc():
> >
> > $ cat ~/pte_alloc.cocci
> > virtual report
> >
> > @pte_args depends on report@
> > identifier E1, E2;
> > type T1, T2;
> > position p;
> > @@
> >
> >  pte_alloc at p(T1 E1, T2 E2)
> >  {
> > ...
> > (
> > ...
> >  E2
> > ...
> > )
> > ...
> >  }
> 
> 
> In report mode, by default, the pattern has to match on all paths.  Also
> when you have ... before or after E2, there can be no occurrence of E2 in
> the code matched by the ...  So your rule requires that on every possible
> execution path through the function, there is exactly one occurrence of
> E2.
> 
> You can try the following instead:
> 
> virtual report
> 
> @pte_args depends on report exists@
> identifier E1, E2;
> type T1, T2;
> position p;
> @@
> 
>  pte_alloc at p(T1 E1, T2 E2)
>  {
>  ... when any
>  E2
>  ... when any
>  }

Thanks for the quick reply.
If I just add 'depends on report exists' to the rule, then my original
example works fine now. I did not need to add the 'when any'. Do you mind
taking my original simple test.c example and modify it and let me know under
what situation would it not work?

I even added address = 1 outside of the if block and it works fine, I see the
warning as I expect without 'when any' in pront of the "...".

struct page *pte_alloc(struct mm_struct *mm, unsigned long address)
{
	 address = 1;
         if (condition()) {
		 while (1) {
			address++;
		 }
                return NULL;
        }
}
virtual report
-----
For your reference, I included the .cocci script below again. This time with
the 'depends on report exists' in the rule:

@pte_args depends on report exists@
identifier E1, E2;
type T1, T2;
position p;
@@

 pte_alloc at p(T1 E1, T2 E2)
 {
...
 E2
...
 }

@script:python depends on report@
p << pte_args.p;
@@
coccilib.report.print_report(p[0], "WARNING: found definition of
pte_alloc_one with address used in the body")

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] First coccinelle script, need some help.
  2018-10-10 20:45   ` joel
@ 2018-10-10 20:51     ` julia.lawall
  2018-10-10 21:11       ` joel
  0 siblings, 1 reply; 16+ messages in thread
From: julia.lawall @ 2018-10-10 20:51 UTC (permalink / raw)
  To: cocci



On Wed, 10 Oct 2018, Joel Fernandes wrote:

> On Wed, Oct 10, 2018 at 10:23:18PM +0200, Julia Lawall wrote:
> >
> >
> > On Wed, 10 Oct 2018, Joel Fernandes wrote:
> >
> > >
> > > Hi!
> > >
> > > I am trying to determine if a function argument is used across the whole
> > > kernel for a certain kernel function.
> > >
> > > I mustered up enough courage to write my first coccinelle script after a few
> > > late nights of reading up about it :)
> > >
> > > Here is .cocci script. I am trying to find if address is used at all in any
> > > possible definitions of pte_alloc():
> > >
> > > $ cat ~/pte_alloc.cocci
> > > virtual report
> > >
> > > @pte_args depends on report@
> > > identifier E1, E2;
> > > type T1, T2;
> > > position p;
> > > @@
> > >
> > >  pte_alloc at p(T1 E1, T2 E2)
> > >  {
> > > ...
> > > (
> > > ...
> > >  E2
> > > ...
> > > )
> > > ...
> > >  }
> >
> >
> > In report mode, by default, the pattern has to match on all paths.  Also
> > when you have ... before or after E2, there can be no occurrence of E2 in
> > the code matched by the ...  So your rule requires that on every possible
> > execution path through the function, there is exactly one occurrence of
> > E2.
> >
> > You can try the following instead:
> >
> > virtual report
> >
> > @pte_args depends on report exists@
> > identifier E1, E2;
> > type T1, T2;
> > position p;
> > @@
> >
> >  pte_alloc at p(T1 E1, T2 E2)
> >  {
> >  ... when any
> >  E2
> >  ... when any
> >  }
>
> Thanks for the quick reply.
> If I just add 'depends on report exists' to the rule, then my original
> example works fine now. I did not need to add the 'when any'. Do you mind
> taking my original simple test.c example and modify it and let me know under
> what situation would it not work?
>
> I even added address = 1 outside of the if block and it works fine, I see the
> warning as I expect without 'when any' in pront of the "...".
>
> struct page *pte_alloc(struct mm_struct *mm, unsigned long address)
> {
> 	 address = 1;
>          if (condition()) {
> 		 while (1) {
> 			address++;
> 		 }
>                 return NULL;
>         }
> }

This works, because there exists a path through the function that has only
one use of address, ie the path where condition() is false.  It should
break if you put address = 2; just under address = 1, for example.

julia

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] First coccinelle script, need some help.
  2018-10-10 20:51     ` julia.lawall
@ 2018-10-10 21:11       ` joel
  2018-10-10 21:12         ` julia.lawall
       [not found]         ` <93da55ff-c807-6587-7ef3-3d2af820117d@users.sourceforge.net>
  0 siblings, 2 replies; 16+ messages in thread
From: joel @ 2018-10-10 21:11 UTC (permalink / raw)
  To: cocci

On Wed, Oct 10, 2018 at 10:51:28PM +0200, Julia Lawall wrote:
> 
> 
> On Wed, 10 Oct 2018, Joel Fernandes wrote:
> 
> > On Wed, Oct 10, 2018 at 10:23:18PM +0200, Julia Lawall wrote:
> > >
> > >
> > > On Wed, 10 Oct 2018, Joel Fernandes wrote:
> > >
> > > >
> > > > Hi!
> > > >
> > > > I am trying to determine if a function argument is used across the whole
> > > > kernel for a certain kernel function.
> > > >
> > > > I mustered up enough courage to write my first coccinelle script after a few
> > > > late nights of reading up about it :)
> > > >
> > > > Here is .cocci script. I am trying to find if address is used at all in any
> > > > possible definitions of pte_alloc():
> > > >
> > > > $ cat ~/pte_alloc.cocci
> > > > virtual report
> > > >
> > > > @pte_args depends on report@
> > > > identifier E1, E2;
> > > > type T1, T2;
> > > > position p;
> > > > @@
> > > >
> > > >  pte_alloc at p(T1 E1, T2 E2)
> > > >  {
> > > > ...
> > > > (
> > > > ...
> > > >  E2
> > > > ...
> > > > )
> > > > ...
> > > >  }
> > >
> > >
> > > In report mode, by default, the pattern has to match on all paths.  Also
> > > when you have ... before or after E2, there can be no occurrence of E2 in
> > > the code matched by the ...  So your rule requires that on every possible
> > > execution path through the function, there is exactly one occurrence of
> > > E2.
> > >
> > > You can try the following instead:
> > >
> > > virtual report
> > >
> > > @pte_args depends on report exists@
> > > identifier E1, E2;
> > > type T1, T2;
> > > position p;
> > > @@
> > >
> > >  pte_alloc at p(T1 E1, T2 E2)
> > >  {
> > >  ... when any
> > >  E2
> > >  ... when any
> > >  }
> >
> > Thanks for the quick reply.
> > If I just add 'depends on report exists' to the rule, then my original
> > example works fine now. I did not need to add the 'when any'. Do you mind
> > taking my original simple test.c example and modify it and let me know under
> > what situation would it not work?
> >
> > I even added address = 1 outside of the if block and it works fine, I see the
> > warning as I expect without 'when any' in pront of the "...".
> >
> > struct page *pte_alloc(struct mm_struct *mm, unsigned long address)
> > {
> > 	 address = 1;
> >          if (condition()) {
> > 		 while (1) {
> > 			address++;
> > 		 }
> >                 return NULL;
> >         }
> > }
> 
> This works, because there exists a path through the function that has only
> one use of address, ie the path where condition() is false.  It should
> break if you put address = 2; just under address = 1, for example.
> 
Ok, thanks. This fact really is a bit subtle I'd say but hopefully will not
be once I get past the learning curve. Interestingly, if I do the following
then that works too without needing 'exists':

virtual report

@pte_args depends on report@
identifier E1, E2;
type T1, T2;
position p;
@@

 pte_alloc at p(T1 E1, T2 E2)
 {
<+...
 E2
...+>
 }

@script:python depends on report@
p << pte_args.p;
@@
coccilib.report.print_report(p[0], "WARNING: found definition of pte_alloc with
address used in the body")

thanks,

 - Joel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] First coccinelle script, need some help.
  2018-10-10 21:11       ` joel
@ 2018-10-10 21:12         ` julia.lawall
       [not found]         ` <93da55ff-c807-6587-7ef3-3d2af820117d@users.sourceforge.net>
  1 sibling, 0 replies; 16+ messages in thread
From: julia.lawall @ 2018-10-10 21:12 UTC (permalink / raw)
  To: cocci



On Wed, 10 Oct 2018, Joel Fernandes wrote:

> On Wed, Oct 10, 2018 at 10:51:28PM +0200, Julia Lawall wrote:
> >
> >
> > On Wed, 10 Oct 2018, Joel Fernandes wrote:
> >
> > > On Wed, Oct 10, 2018 at 10:23:18PM +0200, Julia Lawall wrote:
> > > >
> > > >
> > > > On Wed, 10 Oct 2018, Joel Fernandes wrote:
> > > >
> > > > >
> > > > > Hi!
> > > > >
> > > > > I am trying to determine if a function argument is used across the whole
> > > > > kernel for a certain kernel function.
> > > > >
> > > > > I mustered up enough courage to write my first coccinelle script after a few
> > > > > late nights of reading up about it :)
> > > > >
> > > > > Here is .cocci script. I am trying to find if address is used at all in any
> > > > > possible definitions of pte_alloc():
> > > > >
> > > > > $ cat ~/pte_alloc.cocci
> > > > > virtual report
> > > > >
> > > > > @pte_args depends on report@
> > > > > identifier E1, E2;
> > > > > type T1, T2;
> > > > > position p;
> > > > > @@
> > > > >
> > > > >  pte_alloc at p(T1 E1, T2 E2)
> > > > >  {
> > > > > ...
> > > > > (
> > > > > ...
> > > > >  E2
> > > > > ...
> > > > > )
> > > > > ...
> > > > >  }
> > > >
> > > >
> > > > In report mode, by default, the pattern has to match on all paths.  Also
> > > > when you have ... before or after E2, there can be no occurrence of E2 in
> > > > the code matched by the ...  So your rule requires that on every possible
> > > > execution path through the function, there is exactly one occurrence of
> > > > E2.
> > > >
> > > > You can try the following instead:
> > > >
> > > > virtual report
> > > >
> > > > @pte_args depends on report exists@
> > > > identifier E1, E2;
> > > > type T1, T2;
> > > > position p;
> > > > @@
> > > >
> > > >  pte_alloc at p(T1 E1, T2 E2)
> > > >  {
> > > >  ... when any
> > > >  E2
> > > >  ... when any
> > > >  }
> > >
> > > Thanks for the quick reply.
> > > If I just add 'depends on report exists' to the rule, then my original
> > > example works fine now. I did not need to add the 'when any'. Do you mind
> > > taking my original simple test.c example and modify it and let me know under
> > > what situation would it not work?
> > >
> > > I even added address = 1 outside of the if block and it works fine, I see the
> > > warning as I expect without 'when any' in pront of the "...".
> > >
> > > struct page *pte_alloc(struct mm_struct *mm, unsigned long address)
> > > {
> > > 	 address = 1;
> > >          if (condition()) {
> > > 		 while (1) {
> > > 			address++;
> > > 		 }
> > >                 return NULL;
> > >         }
> > > }
> >
> > This works, because there exists a path through the function that has only
> > one use of address, ie the path where condition() is false.  It should
> > break if you put address = 2; just under address = 1, for example.
> >
> Ok, thanks. This fact really is a bit subtle I'd say but hopefully will not
> be once I get past the learning curve. Interestingly, if I do the following
> then that works too without needing 'exists':
>
> virtual report
>
> @pte_args depends on report@
> identifier E1, E2;
> type T1, T2;
> position p;
> @@
>
>  pte_alloc at p(T1 E1, T2 E2)
>  {
> <+...
>  E2
> ...+>
>  }
>
> @script:python depends on report@
> p << pte_args.p;
> @@
> coccilib.report.print_report(p[0], "WARNING: found definition of pte_alloc with
> address used in the body")

Yes, that is another option.  But it may be more expensive.

julia

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
       [not found]         ` <93da55ff-c807-6587-7ef3-3d2af820117d@users.sourceforge.net>
@ 2018-10-11 15:43           ` joel
  2018-10-11 15:51             ` julia.lawall
       [not found]             ` <cd2eb29e-775f-9c48-b839-b2ebdc3e3c56@users.sourceforge.net>
  0 siblings, 2 replies; 16+ messages in thread
From: joel @ 2018-10-11 15:43 UTC (permalink / raw)
  To: cocci

On Thu, Oct 11, 2018 at 03:26:47PM +0200, SF Markus Elfring wrote:
> > Interestingly, if I do the following then that works too without needing 'exists':
> 
> How do you think about the following variant for a SmPL rule in your
> source code analysis approach?
> 
> 
> @pte_args depends on report@
> identifier I;
> type T;
> position p;
> @@
>  pte_alloc at p(..., T I)
>  {
>  <+... I ...+>
>  }

The thing is I know the exact parameter to match, and they are fixed in
number so I don't really care about that. But I am struggling a little with
other things and would like help from you and Julia. There are 3 things (and
sorry if some of this is like, wishful stuff, but I thought its better to ask
than not and may be let you know what features would be useful):

1. How do I match function names partially? For example I want to do,
something like this but I get errors, so I have to write a rule for each
function name variant, like pte_alloc_one, or for __pte_alloc. That can be
error prone if I miss something.

@pte_args depends on report@
identifier E1, E2;
type T1, T2;
position p;
@@

 ...pte_alloc... at p(T1 E1, T2 E2)
 {
<+...
 E2
...+>
 }

2. How do I write a rule that renames function names using the "(" and "|"
syntax? I can do it for callers, but for function names this technique fails
as shown below, so again I have to end up writing separate rules:

// Options: --no-includes --include-headers
@pte_args1 depends on patch exists@
identifier E1, E2;
type T1, T2;
@@

(
- pte_alloc(T1 E1, T2 E2)
+ pte_alloc(T1 E1)
|
- pte_alloc_kernel(T1 E1, T2 E2)
+ pte_alloc_kernel(T1 E1)
)
{ ... }

Also something like the partial match would be even better, so something like:
- ...pte_alloc... at p(T1 E1, T2 E2)
+ ...pte_alloc... at p(T1 E1)
{ ... }

And I want to do the same thing for callers as well:
- ...pte_alloc... at p(T1 E1, T2 E2);
+ ...pte_alloc... at p(T1 E1);
Atleast for callers, the "(" and "|" syntax works, but for function
definitions, nothing works so I have to end up writing separate rules.

3. How do I match macro definitions of pte_alloc defined using #define, and
apply rules on those?  (also perhaps match *pte_alloc*). I tried writing
various rules to no luck making the spatch happy.

thanks,

 - Joel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
  2018-10-11 15:43           ` [Cocci] Searching for parameter usages of pte_alloc() joel
@ 2018-10-11 15:51             ` julia.lawall
  2018-10-11 16:28               ` joel
       [not found]             ` <cd2eb29e-775f-9c48-b839-b2ebdc3e3c56@users.sourceforge.net>
  1 sibling, 1 reply; 16+ messages in thread
From: julia.lawall @ 2018-10-11 15:51 UTC (permalink / raw)
  To: cocci



On Thu, 11 Oct 2018, Joel Fernandes wrote:

> On Thu, Oct 11, 2018 at 03:26:47PM +0200, SF Markus Elfring wrote:
> > > Interestingly, if I do the following then that works too without needing 'exists':
> >
> > How do you think about the following variant for a SmPL rule in your
> > source code analysis approach?
> >
> >
> > @pte_args depends on report@
> > identifier I;
> > type T;
> > position p;
> > @@
> >  pte_alloc at p(..., T I)
> >  {
> >  <+... I ...+>
> >  }
>
> The thing is I know the exact parameter to match, and they are fixed in
> number so I don't really care about that. But I am struggling a little with
> other things and would like help from you and Julia. There are 3 things (and
> sorry if some of this is like, wishful stuff, but I thought its better to ask
> than not and may be let you know what features would be useful):
>
> 1. How do I match function names partially? For example I want to do,
> something like this but I get errors, so I have to write a rule for each
> function name variant, like pte_alloc_one, or for __pte_alloc. That can be
> error prone if I miss something.
>
> @pte_args depends on report@
> identifier E1, E2;
> type T1, T2;
> position p;
> @@
>
>  ...pte_alloc... at p(T1 E1, T2 E2)
>  {
> <+...
>  E2
> ...+>
>  }

You can use regular expressions.  See demos/regexp.cocci.  A more reliable
way though would be to consider what is the common properties of these
functions.  Are they stored in some structure for example?  You can make a
series of rules to address this, eg:

@r@
identifier x,a,pte_alloc;
@@

struct foo x { .a = pte_alloc, };

@@
identifier r.pte_alloc;
@@

pte_alloc(...) { ... } // do whatever you want with the pte_alloc function

>
> 2. How do I write a rule that renames function names using the "(" and "|"
> syntax? I can do it for callers, but for function names this technique fails
> as shown below, so again I have to end up writing separate rules:
>
> // Options: --no-includes --include-headers
> @pte_args1 depends on patch exists@
> identifier E1, E2;
> type T1, T2;
> @@
>
> (
> - pte_alloc(T1 E1, T2 E2)
> + pte_alloc(T1 E1)
> |
> - pte_alloc_kernel(T1 E1, T2 E2)
> + pte_alloc_kernel(T1 E1)
> )
> { ... }
>
> Also something like the partial match would be even better, so something like:
> - ...pte_alloc... at p(T1 E1, T2 E2)
> + ...pte_alloc... at p(T1 E1)
> { ... }
>
> And I want to do the same thing for callers as well:
> - ...pte_alloc... at p(T1 E1, T2 E2);
> + ...pte_alloc... at p(T1 E1);
> Atleast for callers, the "(" and "|" syntax works, but for function
> definitions, nothing works so I have to end up writing separate rules.

Hopefully the answer to question 1 will make this question irrelevant.

> 3. How do I match macro definitions of pte_alloc defined using #define, and
> apply rules on those?  (also perhaps match *pte_alloc*). I tried writing
> various rules to no luck making the spatch happy.

Sorry, I'm not completely sure what you want here.  Please send some
examples of what you want to match.

julia

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
  2018-10-11 15:51             ` julia.lawall
@ 2018-10-11 16:28               ` joel
  2018-10-11 16:40                 ` julia.lawall
  0 siblings, 1 reply; 16+ messages in thread
From: joel @ 2018-10-11 16:28 UTC (permalink / raw)
  To: cocci

On Thu, Oct 11, 2018 at 05:51:41PM +0200, Julia Lawall wrote:
> 
> 
> On Thu, 11 Oct 2018, Joel Fernandes wrote:
> 
> > On Thu, Oct 11, 2018 at 03:26:47PM +0200, SF Markus Elfring wrote:
> > > > Interestingly, if I do the following then that works too without needing 'exists':
> > >
> > > How do you think about the following variant for a SmPL rule in your
> > > source code analysis approach?
> > >
> > >
> > > @pte_args depends on report@
> > > identifier I;
> > > type T;
> > > position p;
> > > @@
> > >  pte_alloc at p(..., T I)
> > >  {
> > >  <+... I ...+>
> > >  }
> >
> > The thing is I know the exact parameter to match, and they are fixed in
> > number so I don't really care about that. But I am struggling a little with
> > other things and would like help from you and Julia. There are 3 things (and
> > sorry if some of this is like, wishful stuff, but I thought its better to ask
> > than not and may be let you know what features would be useful):
> >
> > 1. How do I match function names partially? For example I want to do,
> > something like this but I get errors, so I have to write a rule for each
> > function name variant, like pte_alloc_one, or for __pte_alloc. That can be
> > error prone if I miss something.
> >
> > @pte_args depends on report@
> > identifier E1, E2;
> > type T1, T2;
> > position p;
> > @@
> >
> >  ...pte_alloc... at p(T1 E1, T2 E2)
> >  {
> > <+...
> >  E2
> > ...+>
> >  }
> 
> You can use regular expressions.  See demos/regexp.cocci.  A more reliable
> way though would be to consider what is the common properties of these
> functions.  Are they stored in some structure for example?  You can make a
> series of rules to address this, eg:

Awesome! I'll try this.

> 
> @r@
> identifier x,a,pte_alloc;
> @@
> 
> struct foo x { .a = pte_alloc, };
> 
> @@
> identifier r.pte_alloc;
> @@
> 
> pte_alloc(...) { ... } // do whatever you want with the pte_alloc function

Cool, this identifier stuff seems useful. In this case the functions are just
defined in architecture specific headers and called from the mm/ code. They
are not store in a structure. I am hoping I can using the regex say assign
the match to an identifier, and use it in other rules perhaps. I'll send you
what I have today on the -mm thread so if I'm doing something silly
you can let me know (and much appreciate your help!).

> >
> > 2. How do I write a rule that renames function names using the "(" and "|"
> > syntax? I can do it for callers, but for function names this technique fails
> > as shown below, so again I have to end up writing separate rules:
> >
> > // Options: --no-includes --include-headers
> > @pte_args1 depends on patch exists@
> > identifier E1, E2;
> > type T1, T2;
> > @@
> >
> > (
> > - pte_alloc(T1 E1, T2 E2)
> > + pte_alloc(T1 E1)
> > |
> > - pte_alloc_kernel(T1 E1, T2 E2)
> > + pte_alloc_kernel(T1 E1)
> > )
> > { ... }
> >
> > Also something like the partial match would be even better, so something like:
> > - ...pte_alloc... at p(T1 E1, T2 E2)
> > + ...pte_alloc... at p(T1 E1)
> > { ... }
> >
> > And I want to do the same thing for callers as well:
> > - ...pte_alloc... at p(T1 E1, T2 E2);
> > + ...pte_alloc... at p(T1 E1);
> > Atleast for callers, the "(" and "|" syntax works, but for function
> > definitions, nothing works so I have to end up writing separate rules.
> 
> Hopefully the answer to question 1 will make this question irrelevant.

Yes, I think so.

> > 3. How do I match macro definitions of pte_alloc defined using #define, and
> > apply rules on those?  (also perhaps match *pte_alloc*). I tried writing
> > various rules to no luck making the spatch happy.
> 
> Sorry, I'm not completely sure what you want here.  Please send some
> examples of what you want to match.

Sure, so if I have code like this:
#define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))

I want to have a rule that does:
- #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
+ #define pte_alloc_one(mm) ((pte_t *) page_table_alloc(mm))

So far everything I tried only works for functions so I was wondering how one
do this with macros.

thanks,

- Joel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
  2018-10-11 16:28               ` joel
@ 2018-10-11 16:40                 ` julia.lawall
  2018-10-11 16:49                   ` joel
                                     ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: julia.lawall @ 2018-10-11 16:40 UTC (permalink / raw)
  To: cocci

> Sure, so if I have code like this:
> #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
>
> I want to have a rule that does:
> - #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
> + #define pte_alloc_one(mm) ((pte_t *) page_table_alloc(mm))
>
> So far everything I tried only works for functions so I was wondering how one
> do this with macros.

Maybe

@r@
position p;
identifier i,a,b;
@@

#define i(a,b)@p <+...b...+>

@@
position p != r.p;
identifier i,a,b;
expresssion e;
@@

- #define i(a,b)@p e
+ #define i(a) e

The rule r finds #defines that refer to the second argument, and records
their position in p, while the second rule finds all other #defines with
two arguments.

Of course, you would want to use a regular expression for the macro name,
or do something to avoid changing all two argument macros.

Macros are often defined in header files, so you may want to use the
command line options --no-includes --include-headers.  --no-includes means
ignore header files when they are included into .c files and
--include-headers means treat both .c an .h files.  Otherwise, you only get
.c files.

julia

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
  2018-10-11 16:40                 ` julia.lawall
@ 2018-10-11 16:49                   ` joel
  2018-10-11 16:57                     ` julia.lawall
  2018-10-11 18:06                   ` joel
  2018-10-11 18:07                   ` joel
  2 siblings, 1 reply; 16+ messages in thread
From: joel @ 2018-10-11 16:49 UTC (permalink / raw)
  To: cocci

On Thu, Oct 11, 2018 at 06:40:22PM +0200, Julia Lawall wrote:
> > Sure, so if I have code like this:
> > #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
> >
> > I want to have a rule that does:
> > - #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
> > + #define pte_alloc_one(mm) ((pte_t *) page_table_alloc(mm))
> >
> > So far everything I tried only works for functions so I was wondering how one
> > do this with macros.
> 
> Maybe
> 
> @r@
> position p;
> identifier i,a,b;
> @@
> 
> #define i(a,b)@p <+...b...+>
> 
> @@
> position p != r.p;
> identifier i,a,b;
> expresssion e;
> @@
> 
> - #define i(a,b)@p e
> + #define i(a) e
> 
> The rule r finds #defines that refer to the second argument, and records
> their position in p, while the second rule finds all other #defines with
> two arguments.

Thanks a lot, I will try these.

> Of course, you would want to use a regular expression for the macro name,
> or do something to avoid changing all two argument macros.
> 
> Macros are often defined in header files, so you may want to use the
> command line options --no-includes --include-headers.  --no-includes means
> ignore header files when they are included into .c files and
> --include-headers means treat both .c an .h files.  Otherwise, you only get
> .c files.

 I am using --include-headers. I didn't get the --no-includes use with
 --include-headers though.

What happens if I don't pass --no-includes?  Do the header files included
from C files also get matched then? Then in that case I shouldn't need to
pass anything (neither --include-headers nor --no-includes) and all included
headers from C files should also be matched/parsed/patched - since every
header should atleast be included *somewhere* otherwise its existinence is
pointless. Could you help me understand this better? Thanks a lot!

 - Joel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
  2018-10-11 16:49                   ` joel
@ 2018-10-11 16:57                     ` julia.lawall
  2018-10-11 17:06                       ` joel
  0 siblings, 1 reply; 16+ messages in thread
From: julia.lawall @ 2018-10-11 16:57 UTC (permalink / raw)
  To: cocci



On Thu, 11 Oct 2018, Joel Fernandes wrote:

> On Thu, Oct 11, 2018 at 06:40:22PM +0200, Julia Lawall wrote:
> > > Sure, so if I have code like this:
> > > #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
> > >
> > > I want to have a rule that does:
> > > - #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
> > > + #define pte_alloc_one(mm) ((pte_t *) page_table_alloc(mm))
> > >
> > > So far everything I tried only works for functions so I was wondering how one
> > > do this with macros.
> >
> > Maybe
> >
> > @r@
> > position p;
> > identifier i,a,b;
> > @@
> >
> > #define i(a,b)@p <+...b...+>
> >
> > @@
> > position p != r.p;
> > identifier i,a,b;
> > expresssion e;
> > @@
> >
> > - #define i(a,b)@p e
> > + #define i(a) e
> >
> > The rule r finds #defines that refer to the second argument, and records
> > their position in p, while the second rule finds all other #defines with
> > two arguments.
>
> Thanks a lot, I will try these.
>
> > Of course, you would want to use a regular expression for the macro name,
> > or do something to avoid changing all two argument macros.
> >
> > Macros are often defined in header files, so you may want to use the
> > command line options --no-includes --include-headers.  --no-includes means
> > ignore header files when they are included into .c files and
> > --include-headers means treat both .c an .h files.  Otherwise, you only get
> > .c files.
>
>  I am using --include-headers. I didn't get the --no-includes use with
>  --include-headers though.
>
> What happens if I don't pass --no-includes?  Do the header files included
> from C files also get matched then? Then in that case I shouldn't need to
> pass anything (neither --include-headers nor --no-includes) and all included
> headers from C files should also be matched/parsed/patched - since every
> header should atleast be included *somewhere* otherwise its existinence is
> pointless. Could you help me understand this better? Thanks a lot!

1. Coccinelle desn't know about the make file.  So there are many header
files that it may not be able to find.

2. Many header files are irrelevant to your problem and included in many
.c files, meaning that if you rely on the .c files to get the prcessing of
the .h files, you will be processing the same code over and over.

If you use --no-includes --include-headers then every files will be
processed and it will be processed only once.

Including headers, which can be done with eg --all-includes (all files
mentioned explicitly in the .c file) or --recursive-includes (files
included by other .h files) is useful if you need type information in
processing the .c file and if the processing of the .h file relies on
information present in the .c file.  If you only need type information
then the option --include-headers-for-types is useful, as the headers will
only be taken into account during the type inference phase, and then
ignored during matching and transformation, whih provides a significant
performance benefit.

julia

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
  2018-10-11 16:57                     ` julia.lawall
@ 2018-10-11 17:06                       ` joel
  0 siblings, 0 replies; 16+ messages in thread
From: joel @ 2018-10-11 17:06 UTC (permalink / raw)
  To: cocci

On Thu, Oct 11, 2018 at 06:57:39PM +0200, Julia Lawall wrote:
> > > Of course, you would want to use a regular expression for the macro name,
> > > or do something to avoid changing all two argument macros.
> > >
> > > Macros are often defined in header files, so you may want to use the
> > > command line options --no-includes --include-headers.  --no-includes means
> > > ignore header files when they are included into .c files and
> > > --include-headers means treat both .c an .h files.  Otherwise, you only get
> > > .c files.
> >
> >  I am using --include-headers. I didn't get the --no-includes use with
> >  --include-headers though.
> >
> > What happens if I don't pass --no-includes?  Do the header files included
> > from C files also get matched then? Then in that case I shouldn't need to
> > pass anything (neither --include-headers nor --no-includes) and all included
> > headers from C files should also be matched/parsed/patched - since every
> > header should atleast be included *somewhere* otherwise its existinence is
> > pointless. Could you help me understand this better? Thanks a lot!
> 
> 1. Coccinelle desn't know about the make file.  So there are many header
> files that it may not be able to find.
> 
> 2. Many header files are irrelevant to your problem and included in many
> .c files, meaning that if you rely on the .c files to get the prcessing of
> the .h files, you will be processing the same code over and over.
> 
> If you use --no-includes --include-headers then every files will be
> processed and it will be processed only once.
> 
> Including headers, which can be done with eg --all-includes (all files
> mentioned explicitly in the .c file) or --recursive-includes (files
> included by other .h files) is useful if you need type information in
> processing the .c file and if the processing of the .h file relies on
> information present in the .c file.  If you only need type information
> then the option --include-headers-for-types is useful, as the headers will
> only be taken into account during the type inference phase, and then
> ignored during matching and transformation, whih provides a significant
> performance benefit.

Got it, I understand it now. Thanks!

 - Joel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
  2018-10-11 16:40                 ` julia.lawall
  2018-10-11 16:49                   ` joel
@ 2018-10-11 18:06                   ` joel
  2018-10-11 18:07                   ` joel
  2 siblings, 0 replies; 16+ messages in thread
From: joel @ 2018-10-11 18:06 UTC (permalink / raw)
  To: cocci

On Thu, Oct 11, 2018 at 06:40:22PM +0200, Julia Lawall wrote:
> > Sure, so if I have code like this:
> > #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
> >
> > I want to have a rule that does:
> > - #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
> > + #define pte_alloc_one(mm) ((pte_t *) page_table_alloc(mm))
> >
> > So far everything I tried only works for functions so I was wondering how one
> > do this with macros.
> 
> Maybe
> 
> @r@
> position p;
> identifier i,a,b;
> @@
> 
> #define i(a,b)@p <+...b...+>
> 
> @@
> position p != r.p;
> identifier i,a,b;
> expresssion e;
> @@
> 
> - #define i(a,b)@p e
> + #define i(a) e

So doing this gives me a parse error:

parse error:
  File "../pte_alloc_replace.cocci", line 40, column 17, charpos = 573
  around = '@',
  whole content = - #define i(a, b)@p e

Also should that 'e' be a statement or an expression?

Thanks!
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
  2018-10-11 16:40                 ` julia.lawall
  2018-10-11 16:49                   ` joel
  2018-10-11 18:06                   ` joel
@ 2018-10-11 18:07                   ` joel
  2 siblings, 0 replies; 16+ messages in thread
From: joel @ 2018-10-11 18:07 UTC (permalink / raw)
  To: cocci

On Thu, Oct 11, 2018 at 06:40:22PM +0200, Julia Lawall wrote:
> > Sure, so if I have code like this:
> > #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
> >
> > I want to have a rule that does:
> > - #define pte_alloc_one(mm, vmaddr) ((pte_t *) page_table_alloc(mm))
> > + #define pte_alloc_one(mm) ((pte_t *) page_table_alloc(mm))
> >
> > So far everything I tried only works for functions so I was wondering how one
> > do this with macros.
> 
> Maybe
> 
> @r@
> position p;
> identifier i,a,b;
> @@
> 
> #define i(a,b)@p <+...b...+>
> 
> @@
> position p != r.p;
> identifier i,a,b;
> expresssion e;
> @@
> 
> - #define i(a,b)@p e
> + #define i(a) e

Sorry, it works now. I had not defined 'position p' in the rule. Please
ignore my noise.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cocci] Searching for parameter usages of pte_alloc()
       [not found]             ` <cd2eb29e-775f-9c48-b839-b2ebdc3e3c56@users.sourceforge.net>
@ 2018-10-11 23:54               ` joel
  0 siblings, 0 replies; 16+ messages in thread
From: joel @ 2018-10-11 23:54 UTC (permalink / raw)
  To: cocci

On Thu, Oct 11, 2018 at 09:55:46PM +0200, SF Markus Elfring wrote:
> > The thing is I know the exact parameter to match,
> 
> This is nice.
> 
> 
> > and they are fixed in number so I don't really care about that.
> 
> I got the impression that you care only for the last function argument
> so far, don't you?
> 
> 
> > But I am struggling a little with other things and would like help from you and Julia. There are 3 things (and
> > sorry if some of this is like, wishful stuff, but I thought its better to ask
> > than not and may be let you know what features would be useful):
> > 
> > 1. How do I match function names partially?
> 
> You can use regular expressions for constraints of metavariables
> in the semantic patch language.
> http://coccinelle.lip6.fr/docs/main_grammar016.html#sec28
> 

Ok, it is working.

> > 2. How do I write a rule that renames function names using the "(" and "|" syntax?
> 
> You can specify replacement variants by SmPL disjunctions.
> 
> 
> > (
> > - pte_alloc(T1 E1, T2 E2)
> > + pte_alloc(T1 E1)
> > |
> > - pte_alloc_kernel(T1 E1, T2 E2)
> > + pte_alloc_kernel(T1 E1)
> > )
> > { ... }
> 
> If you would like to delete a parameter from the function signature,
> I guess that the following SmPL code should work.

The thing is now its getting really hard to abstract, sometimes I need remove
3rd arg, sometimes 2nd. So I'm doing some manual work too. I'll CC you on my
patch to give you an idea.

> > 3. How do I match macro definitions of pte_alloc defined using #define,
> > and apply rules on those?
> 
> There are open issues remaining in the support for preprocessor directives.
> https://github.com/coccinelle/coccinelle/issues/139
> 
> How do you think about to pick any related software development
> challenges up?

At the moment I can commit to anything, I am trying to focus on documenting
RCU when I get spare time. Sorry. But I may in the future as I use Coccinelle
more, right now I am just a beginner.

thanks for your help,

 - Joel

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, back to index

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-10 19:38 [Cocci] First coccinelle script, need some help joel
2018-10-10 20:23 ` julia.lawall
2018-10-10 20:45   ` joel
2018-10-10 20:51     ` julia.lawall
2018-10-10 21:11       ` joel
2018-10-10 21:12         ` julia.lawall
     [not found]         ` <93da55ff-c807-6587-7ef3-3d2af820117d@users.sourceforge.net>
2018-10-11 15:43           ` [Cocci] Searching for parameter usages of pte_alloc() joel
2018-10-11 15:51             ` julia.lawall
2018-10-11 16:28               ` joel
2018-10-11 16:40                 ` julia.lawall
2018-10-11 16:49                   ` joel
2018-10-11 16:57                     ` julia.lawall
2018-10-11 17:06                       ` joel
2018-10-11 18:06                   ` joel
2018-10-11 18:07                   ` joel
     [not found]             ` <cd2eb29e-775f-9c48-b839-b2ebdc3e3c56@users.sourceforge.net>
2018-10-11 23:54               ` joel

Coccinelle archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/cocci/0 cocci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 cocci cocci/ https://lore.kernel.org/cocci \
		cocci@systeme.lip6.fr cocci@archiver.kernel.org
	public-inbox-index cocci


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/fr.lip6.systeme.cocci


AGPL code for this site: git clone https://public-inbox.org/ public-inbox