cocci.inria.fr archive mirror
* Re: [Cocci] Python interface
@ 2020-06-08 12:22 Markus Elfring
  0 siblings, 0 replies; 13+ messages in thread
From: Markus Elfring @ 2020-06-08 12:22 UTC (permalink / raw)
  To: Julia Lawall; +Cc: cocci, Denis Efremov

> OK, basically I worry about converting a list of 35 000 file names to python.

Can such development concerns be adjusted?


> But maybe it's not a big deal.

Will additional search filters, dependencies and better algorithms improve
the software situation finally?

Regards,
Markus
_______________________________________________
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci

* Re: [Cocci] Python interface
@ 2020-06-08 12:09 Markus Elfring
  0 siblings, 0 replies; 13+ messages in thread
From: Markus Elfring @ 2020-06-08 12:09 UTC (permalink / raw)
  To: Denis Efremov; +Cc: cocci

> > … But I wonder if the difference
> > between "the file is not in the initial list" and "the file is in the
> > initial list but it is ignored" is important for you?
>
> "the file is in the initial list but it is ignored" is ok to me. I don't know
> how to get it.

Which information are you really missing so far for your data processing?


> The problem is that I need to know that the "mm/util.c" file is in the scope.

Why does this source file matter for you here?


> We know that a pattern should match a function in the "mm/util.c" file
> and report only in case it doesn't.

Do you check a system property here?


>                                     We don't need to report if the tool
> is not processing the file "mm/util.c" at all. That is why we need the full
> list of files.

I suggest reconsidering your expectations.
Would alternative solutions be of interest?

Regards,
Markus

* Re: [Cocci] Python interface
  2020-06-08 11:00   ` Denis Efremov
@ 2020-06-08 11:21     ` Julia Lawall
  0 siblings, 0 replies; 13+ messages in thread
From: Julia Lawall @ 2020-06-08 11:21 UTC (permalink / raw)
  To: Denis Efremov; +Cc: cocci



On Mon, 8 Jun 2020, Denis Efremov wrote:

>
> > Is this self-check functionality planned for a patch in the Linux kernel,
> > or for some other use?  Because the python script that I suggested for
> > collecting the names of all of the files will imply parsing all of those
> > files, which will have a major negative impact on performance.
>
> Yes, I've almost prepared it. It's more like a PoC, of course you are free not
> to take it. I just find it interesting to implement. I hope you enjoy it.
> The check will depend on additional "virtual selfcheck", so I expect that the
> performance will not degrade much.
>
> > Perhaps it
> > could be possible to have the complete list of files available in the
> > initialize rule, like you expected.  But I wonder if the difference
> > between "the file is not in the initial list" and "the file is in the
> > initial list but it is ignored" is important for you?
>
> "the file is in the initial list but it is ignored" is ok to me. I don't know
> how to get it.
>
> The problem is that I need to know that the "mm/util.c" file is in the scope.
> We know that a pattern should match a function in the "mm/util.c" file
> and report only in case it doesn't. We don't need to report if the tool
> is not processing the file "mm/util.c" at all. That is why we need the full
> list of files.

OK, basically I worry about converting a list of 35 000 file names to
python.  But maybe it's not a big deal.

julia

>
> Thanks,
> Denis

* Re: [Cocci] Python interface
  2020-06-08 10:31 ` Julia Lawall
@ 2020-06-08 11:09   ` Markus Elfring
  0 siblings, 0 replies; 13+ messages in thread
From: Markus Elfring @ 2020-06-08 11:09 UTC (permalink / raw)
  To: Julia Lawall; +Cc: cocci, Denis Efremov

>> Should the software be able to determine just the amount of script code
>> between the curly brackets?
>
> What if there is a string or comment inside the script code, and it
> contains only a }?

I suggest considering corresponding escaping (or quoting) of questionable text.


> Anyway the problem is that the Coccinelle lexer doesn't know that it is
> looking at script code, or e.g. {} around a selection of types.

Can the parser component become better accordingly?


>> Will a SmPL child process get a chance to perform customised finalisation code?
>> Would you like to continue the clarification according to a topic
>> like “Complete support for fork-join work flows”?
>> https://github.com/coccinelle/coccinelle/issues/50
>
> Those issues are addressed by the use of the merge functionality.

I still have difficulty understanding the previously mentioned test script.
https://github.com/coccinelle/coccinelle/blob/175de16bc7e535b6a89a62b81a673b0d0cd7075c/tests/merge_vars_python.cocci#L1

How will the software documentation finally become complete for this issue as well?

Regards,
Markus

* Re: [Cocci] Python interface
  2020-06-08 10:27 ` Julia Lawall
@ 2020-06-08 11:00   ` Denis Efremov
  2020-06-08 11:21     ` Julia Lawall
  0 siblings, 1 reply; 13+ messages in thread
From: Denis Efremov @ 2020-06-08 11:00 UTC (permalink / raw)
  To: cocci


> Is this self-check functionality planned for a patch in the Linux kernel,
> or for some other use?  Because the python script that I suggested for
> collecting the names of all of the files will imply parsing all of those
> files, which will have a major negative impact on performance.

Yes, I've almost prepared it. It's more like a PoC, of course you are free not
to take it. I just find it interesting to implement. I hope you enjoy it.
The check will depend on additional "virtual selfcheck", so I expect that the
performance will not degrade much.

> Perhaps it
> could be possible to have the complete list of files available in the
> initialize rule, like you expected.  But I wonder if the difference
> between "the file is not in the initial list" and "the file is in the
> initial list but it is ignored" is important for you?

"the file is in the initial list but it is ignored" is ok to me. I don't know
how to get it.

The problem is that I need to know that the "mm/util.c" file is in the scope.
We know that a pattern should match a function in the "mm/util.c" file
and report only in case it doesn't. We don't need to report if the tool
is not processing the file "mm/util.c" at all. That is why we need the full
list of files.
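
The reporting rule described above can be written down as a small decision function. This is only a sketch in plain Python, assuming the set of processed files and the set of files with matches have already been collected somehow; the function name, the status strings, and the `drivers/net/foo.c` path are made up for illustration.

```python
def self_check_status(files_in_scope, matched_files, expected_file="mm/util.c"):
    """Classify the self-check outcome for one expected file.

    "skipped": the file was never handed to the tool, so nothing is reported.
    "ok":      the file was processed and the pattern matched in it.
    "warn":    the file was processed but the pattern no longer matches.
    """
    if expected_file not in files_in_scope:
        return "skipped"
    if expected_file in matched_files:
        return "ok"
    return "warn"


# Hypothetical runs: a subset run (M=drivers/net) and two whole-tree runs.
subset_run = self_check_status({"drivers/net/foo.c"}, set())
good_run = self_check_status({"mm/util.c", "drivers/net/foo.c"}, {"mm/util.c"})
stale_run = self_check_status({"mm/util.c", "drivers/net/foo.c"}, set())
```

Only the "warn" case should print anything, which is why the list of matched files alone is not enough: it cannot distinguish "skipped" from "warn".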

Thanks,
Denis

* Re: [Cocci] Python interface
  2020-06-08  7:21 Markus Elfring
@ 2020-06-08 10:31 ` Julia Lawall
  2020-06-08 11:09   ` Markus Elfring
  0 siblings, 1 reply; 13+ messages in thread
From: Julia Lawall @ 2020-06-08 10:31 UTC (permalink / raw)
  To: Markus Elfring; +Cc: cocci, Denis Efremov


On Mon, 8 Jun 2020, Markus Elfring wrote:

> > > > @r depends on !patch@
> > > // It doesn't work. Is it normal?
> > > //position p: script:python() { matches.extend(p); relevant(p) };
> >
> > "Doesn't work" means you get a parse error?  The parser of the code inside
> > the {} is pretty fragile.
>
> I find such information also interesting.
>
>
> > Perhaps this could be improved somewhat, but it is limited by the fact
> > that Coccinelle doesn't know how to parse python properly.
>
> Should the software be able to determine just the amount of script code
> between the curly brackets?

What if there is a string or comment inside the script code, and it
contains only a }?

Anyway the problem is that the Coccinelle lexer doesn't know that it is
looking at script code, or e.g. {} around a selection of types.
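
To make the string case concrete, here is a rough sketch in plain Python of what pure brace counting would do. It is not how the Coccinelle lexer works; `naive_script_end` and the `check` call in the second input are invented for the illustration.

```python
def naive_script_end(text, start):
    """Return the index of the '}' that balances the '{' at text[start].

    Only braces are counted; Python strings and comments are not understood.
    Returns -1 if no balancing brace is found.
    """
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1


simple = "{ relevant(p) }"
tricky = "{ check(p, '}') }"   # the '}' inside the string literal is not a delimiter

simple_end = naive_script_end(simple, 0)
tricky_end = naive_script_end(tricky, 0)
```

For `simple` the counter stops at the real closing brace, but for `tricky` it stops at the brace inside the string literal, so the extracted script text would be cut short.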

> > which is not run in parallel.
>
> Will a SmPL child process get a chance to perform customised finalisation code?
> Would you like to continue the clarification according to a topic
> like “Complete support for fork-join work flows”?
> https://github.com/coccinelle/coccinelle/issues/50

Those issues are addressed by the use of the merge functionality.

julia


* Re: [Cocci] Python interface
  2020-06-07 15:05 Denis Efremov
  2020-06-07 20:12 ` Julia Lawall
@ 2020-06-08 10:27 ` Julia Lawall
  2020-06-08 11:00   ` Denis Efremov
  1 sibling, 1 reply; 13+ messages in thread
From: Julia Lawall @ 2020-06-08 10:27 UTC (permalink / raw)
  To: Denis Efremov; +Cc: Coccinelle



On Sun, 7 Jun 2020, Denis Efremov wrote:

> I've got a couple of questions about python interface.
> Let us suppose that I want to suppress a couple of matches because they are false-positives.
> However, I still want to check they exist in the finalize block and print a warning otherwise.
> This is some kind of self-check for a rule.
>
> For example, there is "test.c" file with:
> extern int function1(void);
> extern int function2(void);
>
> int test(void)
> {
>         return function1();
> }
>
> And the rule test.cocci with:
>
> virtual context
> virtual org
> virtual patch
> virtual report
>
> @initialize:python@
> @@
> matches = [] # global list of all positions to check in finalize
> blacklist = frozenset(['test'])
>
> # Always prints []. Is it normal?
> #print(cocci.files())
>
> def relevant(p): # suppress functions from blacklist
> 	matches.extend(p) # It doesn't work in position script, so I do it here
> 	return False if blacklist & { el.current_element for el in p } else True # intersection
>
> @r depends on !patch@
> // It doesn't work. Is it normal?
> //position p: script:python() { matches.extend(p); relevant(p) };
> position p: script:python() { relevant(p) };
> @@
>
> * function1@p();
>
> @rp depends on patch@
> position p: script:python() { relevant(p) };
> @@
>
> - function1@p();
> + function2();
>
> // Self-check for the rule
> @finalize:python depends on !patch@
> @@
>
> # Always prints []. Is it normal?
> #print(cocci.files())
>
> if 'test.c' in cocci.files(): # I know that we should match test definition in test.c
> 	not_matched = blacklist - { el.current_element for el in matches }; # set difference
> 	if not_matched:
> 		print('SELF-CHECK: patterns no longer match definitions for: ' + ','.join(not_matched))
>
> I want to implement this kind of self-check for the memdup_user function. I need to check that the patterns
> match the function definition, but suppress these diagnostics, and print a warning about a changed
> implementation if there are no matches for the patterns in mm/util.c.

Is this self-check functionality planned for a patch in the Linux kernel,
or for some other use?  Because the python script that I suggested for
collecting the names of all of the files will imply parsing all of those
files, which will have a major negative impact on performance.  Perhaps it
could be possible to have the complete list of files available in the
initialize rule, like you expected.  But I wonder if the difference
between "the file is not in the initial list" and "the file is in the
initial list but it is ignored" is important for you?

julia

* Re: [Cocci] Python interface
@ 2020-06-08  7:21 Markus Elfring
  2020-06-08 10:31 ` Julia Lawall
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Elfring @ 2020-06-08  7:21 UTC (permalink / raw)
  To: Julia Lawall; +Cc: cocci, Denis Efremov

> > > @r depends on !patch@
> > // It doesn't work. Is it normal?
> > //position p: script:python() { matches.extend(p); relevant(p) };
>
> "Doesn't work" means you get a parse error?  The parser of the code inside
> the {} is pretty fragile.

I find such information also interesting.


> Perhaps this could be improved somewhat, but it is limited by the fact
> that Coccinelle doesn't know how to parse python properly.

Should the software be able to determine just the amount of script code
between the curly brackets?


> If you tried this and it didn't work, it could be because of parallelism.
> When you use the -j option, each child process has its own address space,

There are the usual concerns to consider around multi-programming.


> and by default they are not combined for the finalize,

Are you looking for another combination approach?


> which is not run in parallel.

Will a SmPL child process get a chance to perform customised finalisation code?
Would you like to continue the clarification according to a topic
like “Complete support for fork-join work flows”?
https://github.com/coccinelle/coccinelle/issues/50

Regards,
Markus

* Re: [Cocci] Python interface
  2020-06-07 21:20   ` Denis Efremov
@ 2020-06-07 21:23     ` Julia Lawall
  0 siblings, 0 replies; 13+ messages in thread
From: Julia Lawall @ 2020-06-07 21:23 UTC (permalink / raw)
  To: Denis Efremov; +Cc: Coccinelle



On Mon, 8 Jun 2020, Denis Efremov wrote:

>
> >> @r depends on !patch@
> >> // It doesn't work. Is it normal?
> >> //position p: script:python() { matches.extend(p); relevant(p) };
> >
> > "Doesn't work" means you get a parse error?  The parser of the code inside
> > the {} is pretty fragile.  Perhaps this could be improved somewhat, but it
> > is limited by the fact that Coccinelle doesn't know how to parse python
> > properly.
>
> It prints "hd" and exits.

OK, it's a form of parse error.

>
> > This seems entirely reasonable.  You can collect the places that are
> > matched in a variable declared in the initialize, and then look at that
> > variable in the finalize.
>
> I need a list of all files spatch tries to process. A list of files in which
> spatch finds some matches is not enough. Otherwise the approach will work
> incorrectly when the cocci script runs on a subset of kernel files, e.g.,
> make coccicheck M=drivers/net

OK, just make a python rule that will run on every file.

@script:python@
@@

do something with cocci.files()

---
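
The bookkeeping behind such a rule could look roughly like this. It is a plain Python sketch that runs outside Coccinelle: the set would be declared in the initialize rule, and `record_files` (a made-up helper) would be called from the per-file python rule with whatever cocci.files() returns there, assumed here to be an iterable of file name strings. The `drivers/net/foo.c` path is invented.

```python
seen_files = set()   # in SmPL this would be declared in the initialize rule


def record_files(names):
    """Add the names of the files currently being processed."""
    seen_files.update(names)


# Simulate the per-file rule running three times, once per processed file.
for batch in (["mm/util.c"], ["drivers/net/foo.c"], ["mm/util.c"]):
    record_files(batch)
```

A set keeps the later membership test cheap and ignores a file that is seen more than once.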

julia

* Re: [Cocci] Python interface
  2020-06-07 20:12 ` Julia Lawall
@ 2020-06-07 21:20   ` Denis Efremov
  2020-06-07 21:23     ` Julia Lawall
  0 siblings, 1 reply; 13+ messages in thread
From: Denis Efremov @ 2020-06-07 21:20 UTC (permalink / raw)
  To: Julia Lawall; +Cc: Coccinelle


>> @r depends on !patch@
>> // It doesn't work. Is it normal?
>> //position p: script:python() { matches.extend(p); relevant(p) };
> 
> "Doesn't work" means you get a parse error?  The parser of the code inside
> the {} is pretty fragile.  Perhaps this could be improved somewhat, but it
> is limited by the fact that Coccinelle doesn't know how to parse python
> properly.

It prints "hd" and exits.

> This seems entirely reasonable.  You can collect the places that are
> matched in a variable declared in the initialize, and then look at that
> variable in the finalize.

I need a list of all files spatch tries to process. A list of files in which
spatch finds some matches is not enough. Otherwise the approach will work
incorrectly when the cocci script runs on a subset of kernel files, e.g.,
make coccicheck M=drivers/net

Thanks,
Denis

* Re: [Cocci] Python interface
  2020-06-07 15:05 Denis Efremov
@ 2020-06-07 20:12 ` Julia Lawall
  2020-06-07 21:20   ` Denis Efremov
  2020-06-08 10:27 ` Julia Lawall
  1 sibling, 1 reply; 13+ messages in thread
From: Julia Lawall @ 2020-06-07 20:12 UTC (permalink / raw)
  To: Denis Efremov; +Cc: Coccinelle



On Sun, 7 Jun 2020, Denis Efremov wrote:

> I've got a couple of questions about python interface.
> Let us suppose that I want to suppress a couple of matches because they are false-positives.
> However, I still want to check they exist in the finalize block and print a warning otherwise.
> This is some kind of self-check for a rule.
>
> For example, there is "test.c" file with:
> extern int function1(void);
> extern int function2(void);
>
> int test(void)
> {
>         return function1();
> }
>
> And the rule test.cocci with:
>
> virtual context
> virtual org
> virtual patch
> virtual report
>
> @initialize:python@
> @@
> matches = [] # global list of all positions to check in finalize
> blacklist = frozenset(['test'])
>
> # Always prints []. Is it normal?
> #print(cocci.files())

At this point yes.  It will give some information when you are in a rule
that is applied to files.

>
> def relevant(p): # suppress functions from blacklist
> 	matches.extend(p) # It doesn't work in position script, so I do it here
> 	return False if blacklist & { el.current_element for el in p } else True # intersection
>
> @r depends on !patch@
> // It doesn't work. Is it normal?
> //position p: script:python() { matches.extend(p); relevant(p) };

"Doesn't work" means you get a parse error?  The parser of the code inside
the {} is pretty fragile.  Perhaps this could be improved somewhat, but it
is limited by the fact that Coccinelle doesn't know how to parse python
properly.

> position p: script:python() { relevant(p) };
> @@
>
> * function1@p();
>
> @rp depends on patch@
> position p: script:python() { relevant(p) };
> @@
>
> - function1@p();
> + function2();
>
> // Self-check for the rule
> @finalize:python depends on !patch@
> @@
>
> # Always prints []. Is it normal?
> #print(cocci.files())

Again, a finalize is not applied to any files.

> if 'test.c' in cocci.files(): # I know that we should match test definition in test.c
> 	not_matched = blacklist - { el.current_element for el in matches }; # set difference
> 	if not_matched:
> 		print('SELF-CHECK: patterns no longer match definitions for: ' + ','.join(not_matched))
>
> I want to implement this kind of self-check for the memdup_user function. I need to check that the patterns
> match the function definition, but suppress these diagnostics, and print a warning about a changed
> implementation if there are no matches for the patterns in mm/util.c.

This seems entirely reasonable.  You can collect the places that are
matched in a variable declared in the initialize, and then look at that
variable in the finalize.

If you tried this and it didn't work, it could be because of parallelism.
When you use the -j option, each child process has its own address
space, and by default they are not combined for the finalize, which is not
run in parallel.  You can see coccinelle/tests/merge_vars_python.cocci.
That doesn't look like a great example though...  Basically, l1 <<
merge.v1 will cause l1 to be a list of the v1 values for the different
cores.  So if you have a list of matched files, then merge will give you a
list of lists of matched files.
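
What the finalize code would then have to do with such a merged value is roughly the following. This is a plain Python sketch, assuming the merged variable really arrives as one inner list per child process; `combine` and the file names are invented for the illustration.

```python
from itertools import chain


def combine(per_process_lists):
    """Flatten the per-process lists into one set of file names."""
    return set(chain.from_iterable(per_process_lists))


# Hypothetical merged value: one inner list per child process,
# including a process that matched nothing.
merged = [["mm/util.c", "mm/slab.c"], [], ["drivers/net/foo.c", "mm/util.c"]]
all_files = combine(merged)
```

The self-check can then test membership in `all_files` without caring which process saw which file.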

julia



> Thanks,
> Denis

* [Cocci] Python interface
@ 2020-06-07 15:05 Denis Efremov
  2020-06-07 20:12 ` Julia Lawall
  2020-06-08 10:27 ` Julia Lawall
  0 siblings, 2 replies; 13+ messages in thread
From: Denis Efremov @ 2020-06-07 15:05 UTC (permalink / raw)
  To: Coccinelle

I've got a couple of questions about python interface.
Let us suppose that I want to suppress a couple of matches because they are false-positives.
However, I still want to check they exist in the finalize block and print a warning otherwise.
This is some kind of self-check for a rule.

For example, there is "test.c" file with:
extern int function1(void);
extern int function2(void);

int test(void)
{
        return function1();
}

And the rule test.cocci with:

virtual context
virtual org
virtual patch
virtual report

@initialize:python@
@@
matches = [] # global list of all positions to check in finalize
blacklist = frozenset(['test'])

# Always prints []. Is it normal?
#print(cocci.files())

def relevant(p): # suppress functions from blacklist
	matches.extend(p) # It doesn't work in position script, so I do it here
	return False if blacklist & { el.current_element for el in p } else True # intersection

@r depends on !patch@
// It doesn't work. Is it normal?
//position p: script:python() { matches.extend(p); relevant(p) };
position p: script:python() { relevant(p) };
@@

* function1@p();

@rp depends on patch@
position p: script:python() { relevant(p) };
@@

- function1@p();
+ function2();

// Self-check for the rule
@finalize:python depends on !patch@
@@

# Always prints []. Is it normal?
#print(cocci.files())

if 'test.c' in cocci.files(): # I know that we should match test definition in test.c
	not_matched = blacklist - { el.current_element for el in matches }; # set difference
	if not_matched:
		print('SELF-CHECK: patterns no longer match definitions for: ' + ','.join(not_matched))

I want to implement this kind of self-check for the memdup_user function. I need to check that the patterns
match the function definition, but suppress these diagnostics, and print a warning about a changed
implementation if there are no matches for the patterns in mm/util.c.
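
The suppress-and-record logic of the rule above can be tried outside Coccinelle. The following is a minimal sketch in plain Python, assuming a hypothetical `Position` stand-in that carries only the `current_element` attribute used above; it is not the Coccinelle position class, and `self_check` is a made-up helper name.

```python
from collections import namedtuple

# Hypothetical stand-in for a Coccinelle position; only the attribute
# used in the rule above is modelled.
Position = namedtuple("Position", ["file", "current_element"])

matches = []                        # every position seen, suppressed or not
blacklist = frozenset(["test"])     # functions whose matches are suppressed


def relevant(p):
    """Record all positions, then reject those inside blacklisted functions."""
    matches.extend(p)
    return not (blacklist & {el.current_element for el in p})


def self_check():
    """Return the blacklisted functions that the pattern no longer matches."""
    return sorted(blacklist - {el.current_element for el in matches})


# One match inside the blacklisted function, one elsewhere.
suppressed = relevant([Position("test.c", "test")])
reported = relevant([Position("test.c", "other")])
missing = self_check()
```

Here the first match is recorded but filtered out, the second is kept, and `missing` stays empty because the blacklisted function was still matched.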

Thanks,
Denis

end of thread, other threads:[~2020-06-08 12:22 UTC | newest]

Thread overview: 13+ messages
2020-06-08 12:22 [Cocci] Python interface Markus Elfring
  -- strict thread matches above, loose matches on Subject: below --
2020-06-08 12:09 Markus Elfring
2020-06-08 12:09 Markus Elfring
2020-06-08  7:21 Markus Elfring
2020-06-08 10:31 ` Julia Lawall
2020-06-08 11:09   ` Markus Elfring
2020-06-07 15:05 Denis Efremov
2020-06-07 20:12 ` Julia Lawall
2020-06-07 21:20   ` Denis Efremov
2020-06-07 21:23     ` Julia Lawall
2020-06-08 10:27 ` Julia Lawall
2020-06-08 11:00   ` Denis Efremov
2020-06-08 11:21     ` Julia Lawall
