cocci.inria.fr archive mirror
* [Cocci] Checking uniqueness for source code positions during SmPL data processing
@ 2019-04-22  7:49 Markus Elfring
  2019-04-22  7:55 ` Julia Lawall
  0 siblings, 1 reply; 5+ messages in thread
From: Markus Elfring @ 2019-04-22  7:49 UTC
  To: Coccinelle

[-- Attachment #1: Type: text/plain, Size: 2126 bytes --]

Hello,

I reported that I am trying out a specific source code analysis again.
The information can also be imported into database tables for this purpose.
I observed a primary key constraint violation during my data processing attempt.
Useful background information can be found under a topic like
“Checking the handling of unique keys/indexes”.
https://groups.google.com/d/msg/sqlalchemy/klmUwiirIQw/LDeeRTcshQ4J

Such a violation can trigger a usual development challenge:
the transaction fails if questionable data are detected, and it seems to be hard
to find the single offending record through SQL programming interfaces.
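
A minimal sketch of one way to locate the offending records before any
insertion is attempted, assuming the data are available as (key, value)
pairs in plain Python. The function name find_duplicate_keys and the
sample records are hypothetical and only serve as an illustration.

```python
from collections import defaultdict


def find_duplicate_keys(records):
    """Group the records by key and return only the keys seen more than once."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)
    return {key: values for key, values in grouped.items() if len(values) > 1}


# Hypothetical sample records: ((function, file, line, column), payload)
records = [
    (("f", "a.c", "10", 4), "first"),
    (("f", "a.c", "12", 4), "second"),
    (("f", "a.c", "10", 4), "third"),
]

duplicates = find_duplicate_keys(records)
for key, values in duplicates.items():
    print(key, values)
```

Collecting all values per key shows every conflicting record at once,
instead of stopping at the first violation as a failing transaction does.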

Thus I developed the attached script variant for the semantic patch language.
Another test result points out interesting details, doesn't it?


elfring@Sonne:~/Projekte/Linux/next-patched> time spatch ~/Projekte/Coccinelle/janitor/list_duplicate_statement_pairs_from_if_branches5.cocci drivers/media/dvb-frontends/stv0297.c
…
A duplicate key was passed.
function: stv0297_readreg
file: drivers/media/dvb-frontends/stv0297.c
line: 87
column: 4
Traceback (most recent call last):
  File "<string>", line 4, in <module>
  File "<string>", line 26, in store_statements
RuntimeError
exn while in timeout_function
Error in Python script, line 34, file …

real	0m0,606s
user	0m0,541s
sys	0m0,037s


By the way: I would like to point out once more that the code from
the SmPL rule “initialize” is 18 lines long and that the definition of
the function “store_statements” originally starts at line 4, so the line
number 26 reported in the traceback does not match the script as written.


The implementation of the function “stv0297_readreg” contains two statements
which are repeated in three if branches for the desired exception handling.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/media/dvb-frontends/stv0297.c?id=085b7755808aa11f78ab9377257e1dad2e6fa4bb#n66
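
The repetition described above can be counted with a few lines of Python.
The following is a minimal sketch modelled on the “finalize” rule of the
attached script. The line and column numbers, the abbreviated statement
texts and the name other_function are illustrative assumptions, not values
taken from stv0297.c.

```python
from collections import Counter

# Illustrative records only:
# (function, file, line, column) -> (statement1, statement2).
# The positions and the abbreviated statement texts are assumptions.
mapping = {
    ("stv0297_readreg", "stv0297.c", "10", 3): ("dprintk(...);", "return -1;"),
    ("stv0297_readreg", "stv0297.c", "20", 3): ("dprintk(...);", "return -1;"),
    ("stv0297_readreg", "stv0297.c", "30", 3): ("dprintk(...);", "return -1;"),
    ("other_function", "stv0297.c", "50", 3): ("x = 1;", "return x;"),
}

# Count each statement pair per function and file, ignoring the position.
counts = Counter(
    (s1, s2, fun, path)
    for (fun, path, _line, _column), (s1, s2) in mapping.items()
)

# Keep only the pairs which occur more than once.
repeated = {key: number for key, number in counts.items() if number > 1}

for (s1, s2, fun, path), number in repeated.items():
    print("|".join([s1, s2, fun, path, str(number)]))
```

The position is part of the dictionary key but not of the counter key,
so three distinct positions with the same statement pair yield one entry
with an incidence of 3.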


I wonder about the shown software behaviour because each source code
position should be unique for the specified data fields (function, file,
line and column).
How can the affected software areas be improved further?

Regards,
Markus

[-- Attachment #2: list_duplicate_statement_pairs_from_if_branches5.cocci --]
[-- Type: text/plain, Size: 2191 bytes --]

@initialize:python@
@@
import sys
mapping = {}

def store_statements(fun, source, s1, s2):
    """Add data to an internal table."""
    for place in source:
       key = (fun, place.file, place.line, int(place.column) + 1)
       if key in mapping:
          sys.stderr.write("""A duplicate key was passed.
function: %s
file: %s
line: %s
column: %d
""" % key)
          raise RuntimeError
       else:
          mapping[key] = (s1, s2)

@searching@
identifier work;
statement s1, s2;
position pos;
type T;
@@
 T work(...)
 {
 ... when any
 if (...)
 {
 ... when any
 s1@pos
 s2
 }
 ... when any
 }

@script:python collection@
fun << searching.work;
s1 << searching.s1;
s2 << searching.s2;
place << searching.pos;
@@
store_statements(fun, place, s1, s2)

@finalize:python@
@@
entries = len(mapping)

if entries > 0:
   from collections import Counter
   counts = Counter()

   for k, v in mapping.items():
      counts[(v[0], v[1], k[0], k[1])] += 1

   delimiter = "|"
   duplicates = {}

   for k, v in counts.items():
      if v > 1:
         duplicates[k] = v

   if len(duplicates.keys()) > 0:
      sys.stdout.write(delimiter.join(["statement1",
                                       "statement2",
                                       '"function name"',
                                       '"source file"',
                                       "incidence"]))
      sys.stdout.write("\r\n")

      for k, v in duplicates.items():
         sys.stdout.write(delimiter.join([k[0], k[1], k[2], k[3], str(v)]))
         sys.stdout.write("\r\n")
   else:
      sys.stderr.write("Duplicate statements were not determined from "
                       + str(entries) + " records.\n")
      sys.stderr.write(delimiter.join(["statement1",
                                       "statement2",
                                       '"function name"',
                                       '"source file"',
                                       "line"]))
      sys.stderr.write("\r\n")

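      # List every record: k = (function, file, line, column),
      # v = (statement1, statement2). The counter values are plain integers
      # and cannot be indexed, so the mapping is iterated instead.
      for k, v in mapping.items():
         sys.stderr.write(delimiter.join([v[0], v[1], k[0], k[1], str(k[2])]))
         sys.stderr.write("\r\n")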
else:
   sys.stderr.write("No result for this analysis!\n")

[-- Attachment #3: Type: text/plain, Size: 136 bytes --]

_______________________________________________
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


* Re: [Cocci] Checking uniqueness for source code positions during SmPL data processing
  2019-04-22  7:49 [Cocci] Checking uniqueness for source code positions during SmPL data processing Markus Elfring
@ 2019-04-22  7:55 ` Julia Lawall
  2019-04-22  8:55   ` Markus Elfring
  0 siblings, 1 reply; 5+ messages in thread
From: Julia Lawall @ 2019-04-22  7:55 UTC
  To: Markus Elfring; +Cc: Coccinelle

[-- Attachment #1: Type: text/plain, Size: 2531 bytes --]



On Mon, 22 Apr 2019, Markus Elfring wrote:

> Hello,
>
> I reported that I am trying out a specific source code analysis again.
> The information can also be imported into database tables for this purpose.
> I observed a primary key constraint violation during my data processing attempt.
> Useful background information can be found under a topic like
> “Checking the handling of unique keys/indexes”.
> https://groups.google.com/d/msg/sqlalchemy/klmUwiirIQw/LDeeRTcshQ4J
>
> Such a violation can trigger a usual development challenge:
> the transaction fails if questionable data are detected, and it seems to be hard
> to find the single offending record through SQL programming interfaces.

I'm not going to debug anything that involves external tools, ie your
database.

Note however that by converting from * to printing, you have converted the
...s in your searching rule from "exists" to "forall" as the quantifier
over the paths.  You may want to put exists in the header of the searching
rule.
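
As a rough mental model only (Coccinelle matches against the control-flow
graph and does not enumerate paths like this), the two quantifiers behave
like Python's any() and all() applied to the paths through a function.
The path lists and the helper names below are hypothetical.

```python
def has_pair(path):
    """Return True if "s1" is immediately followed by "s2" on this path."""
    return any(a == "s1" and b == "s2" for a, b in zip(path, path[1:]))


def matches_exists(paths, predicate):
    """Quantifier "exists": at least one path satisfies the pattern."""
    return any(predicate(path) for path in paths)


def matches_forall(paths, predicate):
    """Quantifier "forall": every path satisfies the pattern."""
    return all(predicate(path) for path in paths)


# Hypothetical control-flow paths through one function, as statement labels:
# one path enters the if branch, the other one skips it.
paths = [
    ["entry", "if", "s1", "s2", "exit"],
    ["entry", "exit"],
]

found_on_some_path = matches_exists(paths, has_pair)
found_on_all_paths = matches_forall(paths, has_pair)
print(found_on_some_path, found_on_all_paths)
```

With exists in the rule header, written as @searching exists@, one matching
path is enough for the rule to report a position.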

julia




* Re: [Cocci] Checking uniqueness for source code positions during SmPL data processing
  2019-04-22  7:55 ` Julia Lawall
@ 2019-04-22  8:55   ` Markus Elfring
  2019-04-22  9:05     ` Julia Lawall
  0 siblings, 1 reply; 5+ messages in thread
From: Markus Elfring @ 2019-04-22  8:55 UTC
  To: Julia Lawall; +Cc: Coccinelle

> I'm not going to debug anything that involves external tools,
> ie your database.

* Will such a restriction also become relevant for further clarifications?

* Did you notice that the script variant “list_duplicate_statement_pairs_from_if_branches5.cocci”
  uses only a simple combination of SmPL and Python code
  (without an extra dependency on the software “SQLAlchemy”)?
  The data are just stored in an ordinary Python dictionary here.


> Note however that by converting from * to printing, you have converted the
> ...s in your searching rule from "exists" to "forall" as the quantifier
> over the paths.

Thanks for this reminder of consequences around the asterisk functionality
and SmPL ellipses.


> You may want to put exists in the header of the searching rule.

I can also try this setting out.

Would you like to clarify the following test result?

elfring@Sonne:~/Projekte/Linux/next-patched> time spatch ~/Projekte/Coccinelle/janitor/list_duplicate_statement_pairs_from_if_branches6.cocci drivers/media/dvb-frontends/stv0297.c
…
statement1|statement2|"function name"|"source file"|incidence
dprintk ( "%s: readreg error (reg == 0x%02x, ret == %i)\n" , __func__ , reg , ret ) ;|return - 1 ;|stv0297_readreg|drivers/media/dvb-frontends/stv0297.c|3
dprintk ( "%s: readreg error (reg == 0x%02x, ret == %i)\n" , __func__ , reg1 , ret ) ;|return - 1 ;|stv0297_readregs|drivers/media/dvb-frontends/stv0297.c|3

real	0m0,272s
user	0m0,219s
sys	0m0,052s


Where does the added number come from for the identifier “reg1”?

Regards,
Markus

* Re: [Cocci] Checking uniqueness for source code positions during SmPL data processing
  2019-04-22  8:55   ` Markus Elfring
@ 2019-04-22  9:05     ` Julia Lawall
  2019-04-22  9:26       ` Markus Elfring
  0 siblings, 1 reply; 5+ messages in thread
From: Julia Lawall @ 2019-04-22  9:05 UTC
  To: Markus Elfring; +Cc: Coccinelle

[-- Attachment #1: Type: text/plain, Size: 1755 bytes --]



On Mon, 22 Apr 2019, Markus Elfring wrote:

> > I'm not going to debug anything that involves external tools,
> > ie your database.
>
> * Will such a restriction also become relevant for further clarifications?
>
> * Did you notice that the script variant “list_duplicate_statement_pairs_from_if_branches5.cocci”
>   uses only a simple combination of SmPL and Python code
>   (without an extra dependency on the software “SQLAlchemy”)?
>   The data are just stored in an ordinary Python dictionary here.
>
>
> > Note however that by converting from * to printing, you have converted the
> > ...s in your searching rule from "exists" to "forall" as the quantifier
> > over the paths.
>
> Thanks for this reminder of consequences around the asterisk functionality
> and SmPL ellipses.
>
>
> > You may want to put exists in the header of the searching rule.
>
> I can also try this setting out.
>
> Would you like to clarify the following test result?
>
> elfring@Sonne:~/Projekte/Linux/next-patched> time spatch ~/Projekte/Coccinelle/janitor/list_duplicate_statement_pairs_from_if_branches6.cocci drivers/media/dvb-frontends/stv0297.c
> …
> statement1|statement2|"function name"|"source file"|incidence
> dprintk ( "%s: readreg error (reg == 0x%02x, ret == %i)\n" , __func__ , reg , ret ) ;|return - 1 ;|stv0297_readreg|drivers/media/dvb-frontends/stv0297.c|3
> dprintk ( "%s: readreg error (reg == 0x%02x, ret == %i)\n" , __func__ , reg1 , ret ) ;|return - 1 ;|stv0297_readregs|drivers/media/dvb-frontends/stv0297.c|3
>
> real	0m0,272s
> user	0m0,219s
> sys	0m0,052s
>
>
> Where does the added number come from for the identifier “reg1”?

It's in the source code, at a different position than the reg result.

julia


* Re: [Cocci] Checking uniqueness for source code positions during SmPL data processing
  2019-04-22  9:05     ` Julia Lawall
@ 2019-04-22  9:26       ` Markus Elfring
  0 siblings, 0 replies; 5+ messages in thread
From: Markus Elfring @ 2019-04-22  9:26 UTC
  To: Julia Lawall; +Cc: Coccinelle

>> Where does the added number come from for the identifier “reg1”?
>
> It's in the source code, at a different position than the reg result.

You (and the script variant “list_duplicate_statement_pairs_from_if_branches6.cocci”)
are right. A bit of exception handling code is repeated too often,
according to the Linux coding style, in two function implementations
from the source file “stv0297.c”, isn't it?

Will the software development challenges around the shown source code
analysis approach be reconsidered together with the SmPL setting “forall”?

Regards,
Markus
