All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 10:12 ` Nicholas Mc Guire
  0 siblings, 0 replies; 25+ messages in thread
From: Nicholas Mc Guire @ 2015-05-05 10:12 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Gilles Muller, Nicolas Palix, Michal Marek, cocci, linux-kernel,
	Nicholas Mc Guire

Check if the signature of a function and the return value type match. This
is currently not the case in more than 2300 functions. In many cases this
mismatch will have no side-effects but in some cases it may lead to hard
to locate problems - and for readability and code understanding it is also
helpful when types match.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
---

Did not see any false positives in a scan of 4.0-rc2 (but did not check
all results either).

Not sure if scripts/coccinelle/tests/ is the right place for this
It did not seem to trigger any false positives but in some cases (notably
void pointers) the warning might be unnecessary.

The output is a bit lengthy - not sure if that is too much but it seemed
useful to me to see the non-matching types explicitly in the warning
message.

Patch is against 4.1-rc2 (localversion-next is -next-20150505)

 .../coccinelle/tests/signature_matches_ret.cocci   |   31 ++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 scripts/coccinelle/tests/signature_matches_ret.cocci

diff --git a/scripts/coccinelle/tests/signature_matches_ret.cocci b/scripts/coccinelle/tests/signature_matches_ret.cocci
new file mode 100644
index 0000000..2dd1086
--- /dev/null
+++ b/scripts/coccinelle/tests/signature_matches_ret.cocci
@@ -0,0 +1,31 @@
+// Find functions where return type and signature do not match
+// Comments:
+// Options: --no-includes --include-headers
+
+virtual context
+virtual org
+virtual report
+
+@match@
+identifier f,ret;
+position p;
+type T1,T2;
+@@
+
+T1 f(...) {
+ T2 ret;
+<+...
+* return@p ret
+;
+...+>
+}
+
+@script:python@
+p << match.p;
+fn << match.f;
+T1 << match.T1;
+T2 << match.T2;
+@@
+
+if T1 != T2:
+   print "%s:%s,%s WARNING: return of wrong type (%s != %s)" % (p[0].file,fn,p[0].line,T1,T2)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 10:12 ` Nicholas Mc Guire
  0 siblings, 0 replies; 25+ messages in thread
From: Nicholas Mc Guire @ 2015-05-05 10:12 UTC (permalink / raw)
  To: cocci

Check if the signature of a function and the return value type match. This
is currently not the case in more than 2300 functions. In many cases this
mismatch will have no side-effects but in some cases it may lead to hard
to locate problems - and for readability and code understanding it is also
helpful when types match.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
---

Did not see any false positives in a scan of 4.0-rc2 (but did not check
all results either).

Not sure if scripts/coccinelle/tests/ is the right place for this
It did not seem to trigger any false positives but in some cases (notably
void pointers) the warning might be unnecessary.

The output is a bit lengthy - not sure if that is too much but it seemed
useful to me to see the non-matching types explicitly in the warning
message.

Patch is against 4.1-rc2 (localversion-next is -next-20150505)

 .../coccinelle/tests/signature_matches_ret.cocci   |   31 ++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 scripts/coccinelle/tests/signature_matches_ret.cocci

diff --git a/scripts/coccinelle/tests/signature_matches_ret.cocci b/scripts/coccinelle/tests/signature_matches_ret.cocci
new file mode 100644
index 0000000..2dd1086
--- /dev/null
+++ b/scripts/coccinelle/tests/signature_matches_ret.cocci
@@ -0,0 +1,31 @@
+// Find functions where return type and signature do not match
+// Comments:
+// Options: --no-includes --include-headers
+
+virtual context
+virtual org
+virtual report
+
+ at match@
+identifier f,ret;
+position p;
+type T1,T2;
+@@
+
+T1 f(...) {
+ T2 ret;
+<+...
+* return@p ret
+;
+...+>
+}
+
+ at script:python@
+p << match.p;
+fn << match.f;
+T1 << match.T1;
+T2 << match.T2;
+@@
+
+if T1 != T2:
+   print "%s:%s,%s WARNING: return of wrong type (%s != %s)" % (p[0].file,fn,p[0].line,T1,T2)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 10:12 ` [Cocci] " Nicholas Mc Guire
@ 2015-05-05 12:32   ` SF Markus Elfring
  -1 siblings, 0 replies; 25+ messages in thread
From: SF Markus Elfring @ 2015-05-05 12:32 UTC (permalink / raw)
  To: Nicholas Mc Guire; +Cc: cocci, linux-kernel

> Check if the signature of a function and the return value type match.

Is this a task that is usually performed by a compiler?


> In many cases this mismatch will have no side-effects
> but in some cases it may lead to hard to locate problems

It is another software development challenge to find concrete
open issues there.


> - and for readability and code understanding it is also helpful
> when types match.

How would you like to check for compatible data types here?


> The output is a bit lengthy - not sure if that is too much
> but it seemed useful to me to see the non-matching types explicitly
> in the warning message.

How do you think about to import the result list into a database table?


> +if T1 != T2:
> +   print "%s:%s,%s WARNING: return of wrong type (%s != %s)" % (p[0].file,fn,p[0].line,T1,T2)

Is such a check a bit too simple?

Regards,
Markus


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 12:32   ` SF Markus Elfring
  0 siblings, 0 replies; 25+ messages in thread
From: SF Markus Elfring @ 2015-05-05 12:32 UTC (permalink / raw)
  To: cocci

> Check if the signature of a function and the return value type match.

Is this a task that is usually performed by a compiler?


> In many cases this mismatch will have no side-effects
> but in some cases it may lead to hard to locate problems

It is another software development challenge to find concrete
open issues there.


> - and for readability and code understanding it is also helpful
> when types match.

How would you like to check for compatible data types here?


> The output is a bit lengthy - not sure if that is too much
> but it seemed useful to me to see the non-matching types explicitly
> in the warning message.

How do you think about to import the result list into a database table?


> +if T1 != T2:
> +   print "%s:%s,%s WARNING: return of wrong type (%s != %s)" % (p[0].file,fn,p[0].line,T1,T2)

Is such a check a bit too simple?

Regards,
Markus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 12:32   ` [Cocci] " SF Markus Elfring
@ 2015-05-05 13:04     ` Nicholas Mc Guire
  -1 siblings, 0 replies; 25+ messages in thread
From: Nicholas Mc Guire @ 2015-05-05 13:04 UTC (permalink / raw)
  To: SF Markus Elfring; +Cc: Nicholas Mc Guire, cocci, linux-kernel

On Tue, 05 May 2015, SF Markus Elfring wrote:

> > Check if the signature of a function and the return value type match.
> 
> Is this a task that is usually performed by a compiler?
> 
> 
> > In many cases this mismatch will have no side-effects
> > but in some cases it may lead to hard to locate problems
> 
> It is another software development challenge to find concrete
> open issues there.
> 
> 
> > - and for readability and code understanding it is also helpful
> > when types match.
> 
> How would you like to check for compatible data types here?
>

coccinelle knows the type so all you need to do is comare them in
phython .
 
> 
> > The output is a bit lengthy - not sure if that is too much
> > but it seemed useful to me to see the non-matching types explicitly
> > in the warning message.
> 
> How do you think about to import the result list into a database table?
> 

working on that "re-cycling" your parameter count example
top 10:
    488 ssize_t != int
    195 int != unsigned int
    183 long != int
    113 int != u32
     55 int != unsigned long
     48 int != u8
     45 int != u16
     44 unsigned int != int
     37 int != s32
     30 int != long


> 
> > +if T1 != T2:
> > +   print "%s:%s,%s WARNING: return of wrong type (%s != %s)" % (p[0].file,fn,p[0].line,T1,T2)
> 
> Is such a check a bit too simple?
>
Nop - why ?
Cocci "knwow" C so the assignment of types is reliable - 
flaging a s32 != int is fine with respect to readability
even if they are technically the same.

thx!
hofrat

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 13:04     ` Nicholas Mc Guire
  0 siblings, 0 replies; 25+ messages in thread
From: Nicholas Mc Guire @ 2015-05-05 13:04 UTC (permalink / raw)
  To: cocci

On Tue, 05 May 2015, SF Markus Elfring wrote:

> > Check if the signature of a function and the return value type match.
> 
> Is this a task that is usually performed by a compiler?
> 
> 
> > In many cases this mismatch will have no side-effects
> > but in some cases it may lead to hard to locate problems
> 
> It is another software development challenge to find concrete
> open issues there.
> 
> 
> > - and for readability and code understanding it is also helpful
> > when types match.
> 
> How would you like to check for compatible data types here?
>

coccinelle knows the type so all you need to do is comare them in
phython .
 
> 
> > The output is a bit lengthy - not sure if that is too much
> > but it seemed useful to me to see the non-matching types explicitly
> > in the warning message.
> 
> How do you think about to import the result list into a database table?
> 

working on that "re-cycling" your parameter count example
top 10:
    488 ssize_t != int
    195 int != unsigned int
    183 long != int
    113 int != u32
     55 int != unsigned long
     48 int != u8
     45 int != u16
     44 unsigned int != int
     37 int != s32
     30 int != long


> 
> > +if T1 != T2:
> > +   print "%s:%s,%s WARNING: return of wrong type (%s != %s)" % (p[0].file,fn,p[0].line,T1,T2)
> 
> Is such a check a bit too simple?
>
Nop - why ?
Cocci "knwow" C so the assignment of types is reliable - 
flaging a s32 != int is fine with respect to readability
even if they are technically the same.

thx!
hofrat

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 13:04     ` [Cocci] " Nicholas Mc Guire
@ 2015-05-05 13:29       ` Julia Lawall
  -1 siblings, 0 replies; 25+ messages in thread
From: Julia Lawall @ 2015-05-05 13:29 UTC (permalink / raw)
  To: Nicholas Mc Guire
  Cc: SF Markus Elfring, linux-kernel, cocci, Nicholas Mc Guire

On Tue, 5 May 2015, Nicholas Mc Guire wrote:

> On Tue, 05 May 2015, SF Markus Elfring wrote:
>
> > > Check if the signature of a function and the return value type match.
> >
> > Is this a task that is usually performed by a compiler?
> >
> >
> > > In many cases this mismatch will have no side-effects
> > > but in some cases it may lead to hard to locate problems
> >
> > It is another software development challenge to find concrete
> > open issues there.
> >
> >
> > > - and for readability and code understanding it is also helpful
> > > when types match.
> >
> > How would you like to check for compatible data types here?
> >
>
> coccinelle knows the type so all you need to do is comare them in
> phython .
>
> >
> > > The output is a bit lengthy - not sure if that is too much
> > > but it seemed useful to me to see the non-matching types explicitly
> > > in the warning message.
> >
> > How do you think about to import the result list into a database table?
> >
>
> working on that "re-cycling" your parameter count example
> top 10:
>     488 ssize_t != int
>     195 int != unsigned int
>     183 long != int
>     113 int != u32
>      55 int != unsigned long
>      48 int != u8
>      45 int != u16
>      44 unsigned int != int
>      37 int != s32
>      30 int != long

Thanks for the specific results.  That looks very useful.

julia

>
>
> >
> > > +if T1 != T2:
> > > +   print "%s:%s,%s WARNING: return of wrong type (%s != %s)" % (p[0].file,fn,p[0].line,T1,T2)
> >
> > Is such a check a bit too simple?
> >
> Nop - why ?
> Cocci "knwow" C so the assignment of types is reliable -
> flaging a s32 != int is fine with respect to readability
> even if they are technically the same.
>
> thx!
> hofrat
> _______________________________________________
> Cocci mailing list
> Cocci@systeme.lip6.fr
> https://systeme.lip6.fr/mailman/listinfo/cocci
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 13:29       ` Julia Lawall
  0 siblings, 0 replies; 25+ messages in thread
From: Julia Lawall @ 2015-05-05 13:29 UTC (permalink / raw)
  To: cocci

On Tue, 5 May 2015, Nicholas Mc Guire wrote:

> On Tue, 05 May 2015, SF Markus Elfring wrote:
>
> > > Check if the signature of a function and the return value type match.
> >
> > Is this a task that is usually performed by a compiler?
> >
> >
> > > In many cases this mismatch will have no side-effects
> > > but in some cases it may lead to hard to locate problems
> >
> > It is another software development challenge to find concrete
> > open issues there.
> >
> >
> > > - and for readability and code understanding it is also helpful
> > > when types match.
> >
> > How would you like to check for compatible data types here?
> >
>
> coccinelle knows the type so all you need to do is comare them in
> phython .
>
> >
> > > The output is a bit lengthy - not sure if that is too much
> > > but it seemed useful to me to see the non-matching types explicitly
> > > in the warning message.
> >
> > How do you think about to import the result list into a database table?
> >
>
> working on that "re-cycling" your parameter count example
> top 10:
>     488 ssize_t != int
>     195 int != unsigned int
>     183 long != int
>     113 int != u32
>      55 int != unsigned long
>      48 int != u8
>      45 int != u16
>      44 unsigned int != int
>      37 int != s32
>      30 int != long

Thanks for the specific results.  That looks very useful.

julia

>
>
> >
> > > +if T1 != T2:
> > > +   print "%s:%s,%s WARNING: return of wrong type (%s != %s)" % (p[0].file,fn,p[0].line,T1,T2)
> >
> > Is such a check a bit too simple?
> >
> Nop - why ?
> Cocci "knwow" C so the assignment of types is reliable -
> flaging a s32 != int is fine with respect to readability
> even if they are technically the same.
>
> thx!
> hofrat
> _______________________________________________
> Cocci mailing list
> Cocci at systeme.lip6.fr
> https://systeme.lip6.fr/mailman/listinfo/cocci
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 13:04     ` [Cocci] " Nicholas Mc Guire
@ 2015-05-05 13:45       ` SF Markus Elfring
  -1 siblings, 0 replies; 25+ messages in thread
From: SF Markus Elfring @ 2015-05-05 13:45 UTC (permalink / raw)
  To: Nicholas Mc Guire; +Cc: Nicholas Mc Guire, cocci, linux-kernel

>> How do you think about to import the result list into a database table?
> 
> working on that "re-cycling" your parameter count example
> top 10:
>     488 ssize_t != int
>     195 int != unsigned int
>     183 long != int

…

Would you like to provide a static source code analysis
directly in such a report format?
Will a drill-down into corresponding implementation details
become more interesting for the detection of further open issues?

Regards,
Markus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 13:45       ` SF Markus Elfring
  0 siblings, 0 replies; 25+ messages in thread
From: SF Markus Elfring @ 2015-05-05 13:45 UTC (permalink / raw)
  To: cocci

>> How do you think about to import the result list into a database table?
> 
> working on that "re-cycling" your parameter count example
> top 10:
>     488 ssize_t != int
>     195 int != unsigned int
>     183 long != int

?

Would you like to provide a static source code analysis
directly in such a report format?
Will a drill-down into corresponding implementation details
become more interesting for the detection of further open issues?

Regards,
Markus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 10:12 ` [Cocci] " Nicholas Mc Guire
@ 2015-05-05 14:40   ` Julia Lawall
  -1 siblings, 0 replies; 25+ messages in thread
From: Julia Lawall @ 2015-05-05 14:40 UTC (permalink / raw)
  To: Nicholas Mc Guire
  Cc: Julia Lawall, Gilles Muller, Nicolas Palix, Michal Marek, cocci,
	linux-kernel

> +@match@
> +identifier f,ret;
> +position p;
> +type T1,T2;
> +@@
> +
> +T1 f(...) {
> + T2 ret;
> +<+...
> +* return@p ret
> +;
> +...+>
> +}

Given the number of results, it may seem surprising, but I think that you
are actually missing a lot of results.  Becaue you require that ret be the
first variable that is declared in the function.  Also, you require that
ret be an identifier.  If you want to keep the restriction about being an
identifier, you could put:

@match exists@
type T1,T2;
idexpression T2 ret;
identifier f;
@@

T1 f(...) {
<+...
return@p ret;
...+>
}

If you don't care about the identifier constraint, then you can just put
T2 ret.  Note also the addition of exists.  There is a problem if only one
path has this property.  Another thing you can do is the following:

@match exists@
type T1,T2;
expression T1 ok;
idexpression T2 ret;
identifier f;
@@

T1 f(...) {
<+...
(
return ok;
|
return@p ret;
)
...+>
}

Then Coccinelle will find the cases where the types are wrong, rather than
requiring a test in python.

(I haven't tested any of this)

julia

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 14:40   ` Julia Lawall
  0 siblings, 0 replies; 25+ messages in thread
From: Julia Lawall @ 2015-05-05 14:40 UTC (permalink / raw)
  To: cocci

> + at match@
> +identifier f,ret;
> +position p;
> +type T1,T2;
> +@@
> +
> +T1 f(...) {
> + T2 ret;
> +<+...
> +* return at p ret
> +;
> +...+>
> +}

Given the number of results, it may seem surprising, but I think that you
are actually missing a lot of results.  Becaue you require that ret be the
first variable that is declared in the function.  Also, you require that
ret be an identifier.  If you want to keep the restriction about being an
identifier, you could put:

@match exists@
type T1,T2;
idexpression T2 ret;
identifier f;
@@

T1 f(...) {
<+...
return@p ret;
...+>
}

If you don't care about the identifier constraint, then you can just put
T2 ret.  Note also the addition of exists.  There is a problem if only one
path has this property.  Another thing you can do is the following:

@match exists@
type T1,T2;
expression T1 ok;
idexpression T2 ret;
identifier f;
@@

T1 f(...) {
<+...
(
return ok;
|
return@p ret;
)
...+>
}

Then Coccinelle will find the cases where the types are wrong, rather than
requiring a test in python.

(I haven't tested any of this)

julia

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 14:40   ` [Cocci] " Julia Lawall
@ 2015-05-05 16:00     ` Nicholas Mc Guire
  -1 siblings, 0 replies; 25+ messages in thread
From: Nicholas Mc Guire @ 2015-05-05 16:00 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Nicholas Mc Guire, Gilles Muller, Nicolas Palix, Michal Marek,
	cocci, linux-kernel

On Tue, 05 May 2015, Julia Lawall wrote:

> > +@match@
> > +identifier f,ret;
> > +position p;
> > +type T1,T2;
> > +@@
> > +
> > +T1 f(...) {
> > + T2 ret;
> > +<+...
> > +* return@p ret
> > +;
> > +...+>
> > +}
> 
> Given the number of results, it may seem surprising, but I think that you
> are actually missing a lot of results.  Becaue you require that ret be the
> first variable that is declared in the function.  Also, you require that
> ret be an identifier.  If you want to keep the restriction about being an
> identifier, you could put:
> 
> @match exists@
> type T1,T2;
> idexpression T2 ret;
> identifier f;
> @@
> 
> T1 f(...) {
> <+...
> return@p ret;
> ...+>
> }
>

this is depressing - I now like by wrong solution even more ...
unfortunately you are right - I missed most - its now at 25146
 
> If you don't care about the identifier constraint, then you can just put
> T2 ret.  Note also the addition of exists.  There is a problem if only one
> path has this property.  Another thing you can do is the following:
> 
> @match exists@
> type T1,T2;
  idexpression T1 ok;
> idexpression T2 ret;
> identifier f;
  position p;
> @@
> 
> T1 f(...) {
> <+...
> (
> return ok;
> |
> return@p ret;
> )
> ...+>
> }
> 
> Then Coccinelle will find the cases where the types are wrong, rather than
> requiring a test in python.
> 
> (I haven't tested any of this)

also works - I had naively expected this to be faster - but it does not
seem to be.

will check results did not expect 10% of the kernel functions
to have missmatching return types in atleast one of their paths.

thx!
hofrat

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 16:00     ` Nicholas Mc Guire
  0 siblings, 0 replies; 25+ messages in thread
From: Nicholas Mc Guire @ 2015-05-05 16:00 UTC (permalink / raw)
  To: cocci

On Tue, 05 May 2015, Julia Lawall wrote:

> > + at match@
> > +identifier f,ret;
> > +position p;
> > +type T1,T2;
> > +@@
> > +
> > +T1 f(...) {
> > + T2 ret;
> > +<+...
> > +* return at p ret
> > +;
> > +...+>
> > +}
> 
> Given the number of results, it may seem surprising, but I think that you
> are actually missing a lot of results.  Becaue you require that ret be the
> first variable that is declared in the function.  Also, you require that
> ret be an identifier.  If you want to keep the restriction about being an
> identifier, you could put:
> 
> @match exists@
> type T1,T2;
> idexpression T2 ret;
> identifier f;
> @@
> 
> T1 f(...) {
> <+...
> return at p ret;
> ...+>
> }
>

this is depressing - I now like by wrong solution even more ...
unfortunately you are right - I missed most - its now at 25146
 
> If you don't care about the identifier constraint, then you can just put
> T2 ret.  Note also the addition of exists.  There is a problem if only one
> path has this property.  Another thing you can do is the following:
> 
> @match exists@
> type T1,T2;
  idexpression T1 ok;
> idexpression T2 ret;
> identifier f;
  position p;
> @@
> 
> T1 f(...) {
> <+...
> (
> return ok;
> |
> return at p ret;
> )
> ...+>
> }
> 
> Then Coccinelle will find the cases where the types are wrong, rather than
> requiring a test in python.
> 
> (I haven't tested any of this)

also works - I had naively expected this to be faster - but it does not
seem to be.

will check results did not expect 10% of the kernel functions
to have missmatching return types in atleast one of their paths.

thx!
hofrat

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 14:40   ` [Cocci] " Julia Lawall
@ 2015-05-05 16:25     ` SF Markus Elfring
  -1 siblings, 0 replies; 25+ messages in thread
From: SF Markus Elfring @ 2015-05-05 16:25 UTC (permalink / raw)
  To: Julia Lawall, Nicholas Mc Guire; +Cc: Coccinelle, LKML, Nicholas Mc Guire

> Then Coccinelle will find the cases where the types are wrong, rather than
> requiring a test in python.

Do you need a minus character in the first text column of the SmPL scripts then
to mark corresponding update candidates (instead of the mentioned warning printing)?

Regards,
Markus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 16:25     ` SF Markus Elfring
  0 siblings, 0 replies; 25+ messages in thread
From: SF Markus Elfring @ 2015-05-05 16:25 UTC (permalink / raw)
  To: cocci

> Then Coccinelle will find the cases where the types are wrong, rather than
> requiring a test in python.

Do you need a minus character in the first text column of the SmPL scripts then
to mark corresponding update candidates (instead of the mentioned warning printing)?

Regards,
Markus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 16:00     ` [Cocci] " Nicholas Mc Guire
@ 2015-05-05 21:24       ` Julia Lawall
  -1 siblings, 0 replies; 25+ messages in thread
From: Julia Lawall @ 2015-05-05 21:24 UTC (permalink / raw)
  To: Nicholas Mc Guire
  Cc: Julia Lawall, Nicholas Mc Guire, Gilles Muller, Nicolas Palix,
	Michal Marek, cocci, linux-kernel



On Tue, 5 May 2015, Nicholas Mc Guire wrote:

> On Tue, 05 May 2015, Julia Lawall wrote:
>
> > > +@match@
> > > +identifier f,ret;
> > > +position p;
> > > +type T1,T2;
> > > +@@
> > > +
> > > +T1 f(...) {
> > > + T2 ret;
> > > +<+...
> > > +* return@p ret
> > > +;
> > > +...+>
> > > +}
> >
> > Given the number of results, it may seem surprising, but I think that you
> > are actually missing a lot of results.  Becaue you require that ret be the
> > first variable that is declared in the function.  Also, you require that
> > ret be an identifier.  If you want to keep the restriction about being an
> > identifier, you could put:
> >
> > @match exists@
> > type T1,T2;
> > idexpression T2 ret;

I was think ing that you don't want expression in general, because for all
contansts that will give you int.

You can of course put return C; for constant metavariable C in the
disjunction to avoid that possibility.

julia

> > identifier f;
> > @@
> >
> > T1 f(...) {
> > <+...
> > return@p ret;
> > ...+>
> > }
> >
>
> this is depressing - I now like by wrong solution even more ...
> unfortunately you are right - I missed most - its now at 25146
>
> > If you don't care about the identifier constraint, then you can just put
> > T2 ret.  Note also the addition of exists.  There is a problem if only one
> > path has this property.  Another thing you can do is the following:
> >
> > @match exists@
> > type T1,T2;
>   idexpression T1 ok;
> > idexpression T2 ret;
> > identifier f;
>   position p;
> > @@
> >
> > T1 f(...) {
> > <+...
> > (
> > return ok;
> > |
> > return@p ret;
> > )
> > ...+>
> > }
> >
> > Then Coccinelle will find the cases where the types are wrong, rather than
> > requiring a test in python.
> >
> > (I haven't tested any of this)
>
> also works - I had naively expected this to be faster - but it does not
> seem to be.
>
> will check results did not expect 10% of the kernel functions
> to have missmatching return types in atleast one of their paths.
>
> thx!
> hofrat
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 21:24       ` Julia Lawall
  0 siblings, 0 replies; 25+ messages in thread
From: Julia Lawall @ 2015-05-05 21:24 UTC (permalink / raw)
  To: cocci



On Tue, 5 May 2015, Nicholas Mc Guire wrote:

> On Tue, 05 May 2015, Julia Lawall wrote:
>
> > > + at match@
> > > +identifier f,ret;
> > > +position p;
> > > +type T1,T2;
> > > +@@
> > > +
> > > +T1 f(...) {
> > > + T2 ret;
> > > +<+...
> > > +* return at p ret
> > > +;
> > > +...+>
> > > +}
> >
> > Given the number of results, it may seem surprising, but I think that you
> > are actually missing a lot of results.  Becaue you require that ret be the
> > first variable that is declared in the function.  Also, you require that
> > ret be an identifier.  If you want to keep the restriction about being an
> > identifier, you could put:
> >
> > @match exists@
> > type T1,T2;
> > idexpression T2 ret;

I was think ing that you don't want expression in general, because for all
contansts that will give you int.

You can of course put return C; for constant metavariable C in the
disjunction to avoid that possibility.

julia

> > identifier f;
> > @@
> >
> > T1 f(...) {
> > <+...
> > return at p ret;
> > ...+>
> > }
> >
>
> this is depressing - I now like by wrong solution even more ...
> unfortunately you are right - I missed most - its now at 25146
>
> > If you don't care about the identifier constraint, then you can just put
> > T2 ret.  Note also the addition of exists.  There is a problem if only one
> > path has this property.  Another thing you can do is the following:
> >
> > @match exists@
> > type T1,T2;
>   idexpression T1 ok;
> > idexpression T2 ret;
> > identifier f;
>   position p;
> > @@
> >
> > T1 f(...) {
> > <+...
> > (
> > return ok;
> > |
> > return at p ret;
> > )
> > ...+>
> > }
> >
> > Then Coccinelle will find the cases where the types are wrong, rather than
> > requiring a test in python.
> >
> > (I haven't tested any of this)
>
> also works - I had naively expected this to be faster - but it does not
> seem to be.
>
> will check results did not expect 10% of the kernel functions
> to have missmatching return types in atleast one of their paths.
>
> thx!
> hofrat
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 16:25     ` [Cocci] " SF Markus Elfring
@ 2015-05-05 21:46       ` Julia Lawall
  -1 siblings, 0 replies; 25+ messages in thread
From: Julia Lawall @ 2015-05-05 21:46 UTC (permalink / raw)
  To: SF Markus Elfring
  Cc: Julia Lawall, Nicholas Mc Guire, Coccinelle, LKML, Nicholas Mc Guire

On Tue, 5 May 2015, SF Markus Elfring wrote:

> > Then Coccinelle will find the cases where the types are wrong, rather than
> > requiring a test in python.
>
> Do you need a minus character in the first text column of the SmPL scripts then
> to mark corresponding update candidates (instead of the mentioned warning printing)?

He has a *, as he should.

julia

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-05 21:46       ` Julia Lawall
  0 siblings, 0 replies; 25+ messages in thread
From: Julia Lawall @ 2015-05-05 21:46 UTC (permalink / raw)
  To: cocci

On Tue, 5 May 2015, SF Markus Elfring wrote:

> > Then Coccinelle will find the cases where the types are wrong, rather than
> > requiring a test in python.
>
> Do you need a minus character in the first text column of the SmPL scripts then
> to mark corresponding update candidates (instead of the mentioned warning printing)?

He has a *, as he should.

julia

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 10:12 ` [Cocci] " Nicholas Mc Guire
  (?)
@ 2015-05-06  7:15   ` SF Markus Elfring
  -1 siblings, 0 replies; 25+ messages in thread
From: SF Markus Elfring @ 2015-05-06  7:15 UTC (permalink / raw)
  To: Nicholas Mc Guire; +Cc: Nicholas Mc Guire, cocci, linux-kernel, kernel-janitors

> +virtual context
> +virtual org
> +virtual report

Where do you want to reuse these variables in your SmPL scripts?


> +@match@
> +identifier f,ret;
> +position p;
> +type T1,T2;
> +@@
> +
> +T1 f(...) {
> + T2 ret;

Will it be more helpful to mark only such variable declarations
where the specified data type should be reconsidered
by a minus character or an asterisk (instead of the following
return statement)?

> +<+...
> +* return@p ret
> +;
> +...+>
> +}


Regards,
Markus


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-06  7:15   ` SF Markus Elfring
  0 siblings, 0 replies; 25+ messages in thread
From: SF Markus Elfring @ 2015-05-06  7:15 UTC (permalink / raw)
  To: cocci

> +virtual context
> +virtual org
> +virtual report

Where do you want to reuse these variables in your SmPL scripts?


> +@match@
> +identifier f,ret;
> +position p;
> +type T1,T2;
> +@@
> +
> +T1 f(...) {
> + T2 ret;

Will it be more helpful to mark only such variable declarations
where the specified data type should be reconsidered
by a minus character or an asterisk (instead of the following
return statement)?

> +<+...
> +* return@p ret
> +;
> +...+>
> +}


Regards,
Markus


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-06  7:15   ` SF Markus Elfring
  0 siblings, 0 replies; 25+ messages in thread
From: SF Markus Elfring @ 2015-05-06  7:15 UTC (permalink / raw)
  To: cocci

> +virtual context
> +virtual org
> +virtual report

Where do you want to reuse these variables in your SmPL scripts?


> + at match@
> +identifier f,ret;
> +position p;
> +type T1,T2;
> +@@
> +
> +T1 f(...) {
> + T2 ret;

Will it be more helpful to mark only such variable declarations
where the specified data type should be reconsidered
by a minus character or an asterisk (instead of the following
return statement)?

> +<+...
> +* return at p ret
> +;
> +...+>
> +}


Regards,
Markus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
  2015-05-05 21:24       ` [Cocci] " Julia Lawall
@ 2015-05-08  6:59         ` Nicholas Mc Guire
  -1 siblings, 0 replies; 25+ messages in thread
From: Nicholas Mc Guire @ 2015-05-08  6:59 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Nicholas Mc Guire, Gilles Muller, Nicolas Palix, Michal Marek,
	cocci, linux-kernel

On Tue, 05 May 2015, Julia Lawall wrote:

> 
> 
> On Tue, 5 May 2015, Nicholas Mc Guire wrote:
> 
> > On Tue, 05 May 2015, Julia Lawall wrote:
> >
> > > > +@match@
> > > > +identifier f,ret;
> > > > +position p;
> > > > +type T1,T2;
> > > > +@@
> > > > +
> > > > +T1 f(...) {
> > > > + T2 ret;
> > > > +<+...
> > > > +* return@p ret
> > > > +;
> > > > +...+>
> > > > +}
> > >
> > > Given the number of results, it may seem surprising, but I think that you
> > > are actually missing a lot of results.  Becaue you require that ret be the
> > > first variable that is declared in the function.  Also, you require that
> > > ret be an identifier.  If you want to keep the restriction about being an
> > > identifier, you could put:
> > >
> > > @match exists@
> > > type T1,T2;
> > > idexpression T2 ret;
> 
> I was think ing that you don't want expression in general, because for all
> contansts that will give you int.
> 
> You can of course put return C; for constant metavariable C in the
> disjunction to avoid that possibility.
>
looks a lot better and removed a lot of false positives - the main problem 
now is managing classification of the kernels type "system" - seems like there
are atleast 5 ways to describe every type (except for enum) and infinitely
many possible assignments for ssize_t ...

here a little summary of the outputs - might be motivation to put some 
quite simple scanners into mainline to catch such issues.

comparison of return types in functions to the functions signature for 
kernel 4.1-rc2, glibc-2.9 and busybox 1.2.2.1 - no particular reason for
that glibc/busybox versions they just happend to be on my harddrive.

This is using the version that was fixed by Julia Lawal
<snip>
@match exists@                                                                 
type T1,T2;
idexpression T1 ok;
idexpression T2 ret;
identifier f;
constant C;
position p;
@@

T1 f(...) {
<+...
(
return ok;
|
return C;
|
return@p ret;
)
...+>
}
<snip>

component  Nr funcs != type    %
kernel   :  374600   10727   2.85 
glibc    :    9184     268   2.92
busybox  :    3645      43   1.18

                         kernel  glibc busybox  criticality
wrong ?                :     8     4    0       not sure
sign missmatch         :  2279    30    9       critical
down sized             :   435    49    4       critical
up sized               :   910    20    3       ugly
declaration missmatch  :  7095   165   27       wishlist

wrong: seems plain wrong like float assigned to int (did not check details yet)
sign missmatch: assigning signed types to unsiggned or vice versa
down sized:  some form of possible truncation like u64 being assigned u32
up sized: non-critical as it was correct type and it fit
declaration missmatch: means that they were named differently s32/int

Some limitations:
The glibc runs produced some error cases (spatch level) that were ignored
for now e.g.:
<snip>
EXN:Failure("match: node 194: return ...[1,2,90] in rec_dirsearch reachable
by inconsistent control-flow paths")
<snip>
The kernel numbers are a bit inaccurate because not all types can be 
checked reliably - e.g. when they are config dependent also due to the
enourmous type-"system" in the kernel not all assignments are sure
but that does not change the overall result.
I did not yet manage to automate the classification - just too many types
where its hard to say due to config dependencies - probably need to put
thos into a "don't know" category. Also all assignments of pointers of
any type on one side to void * on the other side was counted as legitimate.
Some results were mangled probably because of inacurate filtering resulting 
in things like "_EXTERN_INLINE != mp_limb_t" just dropped those for now.

Conclusion:
* atleast the sign missmatch cases (2279) and potentially truncating 
  assignments (435) are problematic. 
* the scripts needs a lot more cleanup in the classification of the reported
  types to be useful
* probably not realistic to cleanup all currently present tupe mismatches
  but scanning continuously and reporting before it goes into mainline or
  integrating such a check in the routine submission process seems
  worthwhile

 Once the classifier is working properly I'll post the next version.

thx!
hofrat

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Cocci] [PATCH RFC] Coccinelle: Check for return not matching function signature
@ 2015-05-08  6:59         ` Nicholas Mc Guire
  0 siblings, 0 replies; 25+ messages in thread
From: Nicholas Mc Guire @ 2015-05-08  6:59 UTC (permalink / raw)
  To: cocci

On Tue, 05 May 2015, Julia Lawall wrote:

> 
> 
> On Tue, 5 May 2015, Nicholas Mc Guire wrote:
> 
> > On Tue, 05 May 2015, Julia Lawall wrote:
> >
> > > > + at match@
> > > > +identifier f,ret;
> > > > +position p;
> > > > +type T1,T2;
> > > > +@@
> > > > +
> > > > +T1 f(...) {
> > > > + T2 ret;
> > > > +<+...
> > > > +* return at p ret
> > > > +;
> > > > +...+>
> > > > +}
> > >
> > > Given the number of results, it may seem surprising, but I think that you
> > > are actually missing a lot of results.  Becaue you require that ret be the
> > > first variable that is declared in the function.  Also, you require that
> > > ret be an identifier.  If you want to keep the restriction about being an
> > > identifier, you could put:
> > >
> > > @match exists@
> > > type T1,T2;
> > > idexpression T2 ret;
> 
> I was think ing that you don't want expression in general, because for all
> contansts that will give you int.
> 
> You can of course put return C; for constant metavariable C in the
> disjunction to avoid that possibility.
>
looks a lot better and removed a lot of false positives - the main problem 
now is managing classification of the kernels type "system" - seems like there
are atleast 5 ways to describe every type (except for enum) and infinitely
many possible assignments for ssize_t ...

here a little summary of the outputs - might be motivation to put some 
quite simple scanners into mainline to catch such issues.

comparison of return types in functions to the functions signature for 
kernel 4.1-rc2, glibc-2.9 and busybox 1.2.2.1 - no particular reason for
that glibc/busybox versions they just happend to be on my harddrive.

This is using the version that was fixed by Julia Lawal
<snip>
@match exists@                                                                 
type T1,T2;
idexpression T1 ok;
idexpression T2 ret;
identifier f;
constant C;
position p;
@@

T1 f(...) {
<+...
(
return ok;
|
return C;
|
return@p ret;
)
...+>
}
<snip>

component  Nr funcs != type    %
kernel   :  374600   10727   2.85 
glibc    :    9184     268   2.92
busybox  :    3645      43   1.18

                         kernel  glibc busybox  criticality
wrong ?                :     8     4    0       not sure
sign missmatch         :  2279    30    9       critical
down sized             :   435    49    4       critical
up sized               :   910    20    3       ugly
declaration missmatch  :  7095   165   27       wishlist

wrong: seems plain wrong like float assigned to int (did not check details yet)
sign missmatch: assigning signed types to unsiggned or vice versa
down sized:  some form of possible truncation like u64 being assigned u32
up sized: non-critical as it was correct type and it fit
declaration missmatch: means that they were named differently s32/int

Some limitations:
The glibc runs produced some error cases (spatch level) that were ignored
for now e.g.:
<snip>
EXN:Failure("match: node 194: return ...[1,2,90] in rec_dirsearch reachable
by inconsistent control-flow paths")
<snip>
The kernel numbers are a bit inaccurate because not all types can be 
checked reliably - e.g. when they are config dependent also due to the
enourmous type-"system" in the kernel not all assignments are sure
but that does not change the overall result.
I did not yet manage to automate the classification - just too many types
where its hard to say due to config dependencies - probably need to put
thos into a "don't know" category. Also all assignments of pointers of
any type on one side to void * on the other side was counted as legitimate.
Some results were mangled probably because of inacurate filtering resulting 
in things like "_EXTERN_INLINE != mp_limb_t" just dropped those for now.

Conclusion:
* atleast the sign missmatch cases (2279) and potentially truncating 
  assignments (435) are problematic. 
* the scripts needs a lot more cleanup in the classification of the reported
  types to be useful
* probably not realistic to cleanup all currently present tupe mismatches
  but scanning continuously and reporting before it goes into mainline or
  integrating such a check in the routine submission process seems
  worthwhile

 Once the classifier is working properly I'll post the next version.

thx!
hofrat

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-05-08  6:59 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-05 10:12 [PATCH RFC] Coccinelle: Check for return not matching function signature Nicholas Mc Guire
2015-05-05 10:12 ` [Cocci] " Nicholas Mc Guire
2015-05-05 12:32 ` SF Markus Elfring
2015-05-05 12:32   ` [Cocci] " SF Markus Elfring
2015-05-05 13:04   ` Nicholas Mc Guire
2015-05-05 13:04     ` [Cocci] " Nicholas Mc Guire
2015-05-05 13:29     ` Julia Lawall
2015-05-05 13:29       ` Julia Lawall
2015-05-05 13:45     ` SF Markus Elfring
2015-05-05 13:45       ` [Cocci] " SF Markus Elfring
2015-05-05 14:40 ` Julia Lawall
2015-05-05 14:40   ` [Cocci] " Julia Lawall
2015-05-05 16:00   ` Nicholas Mc Guire
2015-05-05 16:00     ` [Cocci] " Nicholas Mc Guire
2015-05-05 21:24     ` Julia Lawall
2015-05-05 21:24       ` [Cocci] " Julia Lawall
2015-05-08  6:59       ` Nicholas Mc Guire
2015-05-08  6:59         ` [Cocci] " Nicholas Mc Guire
2015-05-05 16:25   ` SF Markus Elfring
2015-05-05 16:25     ` [Cocci] " SF Markus Elfring
2015-05-05 21:46     ` Julia Lawall
2015-05-05 21:46       ` [Cocci] " Julia Lawall
2015-05-06  7:15 ` SF Markus Elfring
2015-05-06  7:15   ` [Cocci] " SF Markus Elfring
2015-05-06  7:15   ` SF Markus Elfring

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.