All of lore.kernel.org
 help / color / mirror / Atom feed
From: Julia Lawall <julia.lawall@lip6.fr>
To: Denis Efremov <efremov@linux.com>
Cc: Coccinelle <cocci@systeme.lip6.fr>,
	Gilles Muller <Gilles.Muller@lip6.fr>,
	Nicolas Palix <nicolas.palix@imag.fr>,
	Michal Marek <michal.lkml@markovi.net>,
	linux-kernel@vger.kernel.org, tacingiht@gmail.com
Subject: Re: [RFC PATCH] coccinelle: check for integer overflow in binary search
Date: Thu, 5 Sep 2019 09:51:28 +0200 (CEST)	[thread overview]
Message-ID: <alpine.DEB.2.21.1909050950100.2357@hadrien> (raw)
In-Reply-To: <8916c9d9-bdf5-51c2-b5cb-49898e14a00c@linux.com>



On Thu, 5 Sep 2019, Denis Efremov wrote:

>
>
> On 05.09.2019 09:20, Julia Lawall wrote:
> >
> >
> > On Thu, 5 Sep 2019, Denis Efremov wrote:
> >
> >> This is an RFC. I will resend the patch after feedback. Currently
> >> I'm preparing big patchset with bsearch warnings fixed. The rule will
> >> be a part of this patchset if it will be considered good enough for
> >> checking.
> >>
> >> There is a known integer overflow error [1] in the binary search
> >> algorithm. Google faced it in 2006 [2]. This rule checks midpoint
> >> calculation in binary search for overflow, i.e., (l + h) / 2.
> >> Not every match is an actual error since the array could be small
> >> enough. However, a custom implementation of binary search is
> >> error-prone and it's better to use the library function (lib/bsearch.c)
> >> or to apply defensive programming for midpoint calculation.
> >>
> >> [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues
> >> [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
> >>
> >> Signed-off-by: Denis Efremov <efremov@linux.com>
> >> ---
> >>  scripts/coccinelle/misc/bsearch.cocci | 80 +++++++++++++++++++++++++++
> >>  1 file changed, 80 insertions(+)
> >>  create mode 100644 scripts/coccinelle/misc/bsearch.cocci
> >>
> >> diff --git a/scripts/coccinelle/misc/bsearch.cocci b/scripts/coccinelle/misc/bsearch.cocci
> >> new file mode 100644
> >> index 000000000000..a99d9a8d3ee5
> >> --- /dev/null
> >> +++ b/scripts/coccinelle/misc/bsearch.cocci
> >> @@ -0,0 +1,80 @@
> >> +// SPDX-License-Identifier: GPL-2.0-only
> >> +/// Check midpoint calculation in binary search algorithm for integer overflow
> >> +/// error [1]. Google faced it in 2006 [2]. Not every match is an actual error
> >> +/// since the array can be small enough. However, a custom implementation of
> >> +/// binary search is error-prone and it's better to use the library function
> >> +/// (lib/bsearch.c) or to apply defensive programming for midpoint calculation.
> >> +///
> >> +/// [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues
> >> +/// [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
> >> +//
> >> +// Confidence: Medium
> >> +// Copyright: (C) 2019 Denis Efremov, ISPRAS
> >> +// Comments:
> >> +// Options: --no-includes --include-headers
> >> +
> >> +virtual report
> >> +virtual org
> >> +
> >> +@r depends on org || report@
> >> +identifier l, h, m;
> >> +statement S;
> >> +position p;
> >> +// to match 1 in <<
> >> +// to match 2 in /
> >> +// Can't use exact values, e.g. 2, because it fails to match 2L.
> >> +// TODO: Is there an isomorphism for 2, 2L, 2U, 2UL, 2ULL, etc?
> >> +constant c;
> >
> > As far as I can see, you aren't checking for 2 at all at the moment?
>
> Yes, there are no false positives even without pinning constants to 1, 2.
> However, it's better to express this in the rule.
>
> > You
> > should be able to say constant c = {2, 2L, etc};.  Actually, we do
> > consider several variants of 0, so it could be reasonable to allow eg 2 to
> > match other variants as well.
>
> It looks like integer literals aren't fully supported. When I'm trying to write
> 'constant c = {2L}; ' it fails with int_of_string error.

Oops.  I'll fix it, but since people may be using older versions of
Coccinelle, perhaps it is not worth taking this strategy in this case.
Could you make a disjunction, or check for the proper value in the python
code?

julia

WARNING: multiple messages have this Message-ID (diff)
From: Julia Lawall <julia.lawall@lip6.fr>
To: Denis Efremov <efremov@linux.com>
Cc: Michal Marek <michal.lkml@markovi.net>,
	Nicolas Palix <nicolas.palix@imag.fr>,
	linux-kernel@vger.kernel.org, Coccinelle <cocci@systeme.lip6.fr>
Subject: Re: [Cocci] [RFC PATCH] coccinelle: check for integer overflow in binary search
Date: Thu, 5 Sep 2019 09:51:28 +0200 (CEST)	[thread overview]
Message-ID: <alpine.DEB.2.21.1909050950100.2357@hadrien> (raw)
In-Reply-To: <8916c9d9-bdf5-51c2-b5cb-49898e14a00c@linux.com>



On Thu, 5 Sep 2019, Denis Efremov wrote:

>
>
> On 05.09.2019 09:20, Julia Lawall wrote:
> >
> >
> > On Thu, 5 Sep 2019, Denis Efremov wrote:
> >
> >> This is an RFC. I will resend the patch after feedback. Currently
> >> I'm preparing big patchset with bsearch warnings fixed. The rule will
> >> be a part of this patchset if it will be considered good enough for
> >> checking.
> >>
> >> There is a known integer overflow error [1] in the binary search
> >> algorithm. Google faced it in 2006 [2]. This rule checks midpoint
> >> calculation in binary search for overflow, i.e., (l + h) / 2.
> >> Not every match is an actual error since the array could be small
> >> enough. However, a custom implementation of binary search is
> >> error-prone and it's better to use the library function (lib/bsearch.c)
> >> or to apply defensive programming for midpoint calculation.
> >>
> >> [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues
> >> [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
> >>
> >> Signed-off-by: Denis Efremov <efremov@linux.com>
> >> ---
> >>  scripts/coccinelle/misc/bsearch.cocci | 80 +++++++++++++++++++++++++++
> >>  1 file changed, 80 insertions(+)
> >>  create mode 100644 scripts/coccinelle/misc/bsearch.cocci
> >>
> >> diff --git a/scripts/coccinelle/misc/bsearch.cocci b/scripts/coccinelle/misc/bsearch.cocci
> >> new file mode 100644
> >> index 000000000000..a99d9a8d3ee5
> >> --- /dev/null
> >> +++ b/scripts/coccinelle/misc/bsearch.cocci
> >> @@ -0,0 +1,80 @@
> >> +// SPDX-License-Identifier: GPL-2.0-only
> >> +/// Check midpoint calculation in binary search algorithm for integer overflow
> >> +/// error [1]. Google faced it in 2006 [2]. Not every match is an actual error
> >> +/// since the array can be small enough. However, a custom implementation of
> >> +/// binary search is error-prone and it's better to use the library function
> >> +/// (lib/bsearch.c) or to apply defensive programming for midpoint calculation.
> >> +///
> >> +/// [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues
> >> +/// [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
> >> +//
> >> +// Confidence: Medium
> >> +// Copyright: (C) 2019 Denis Efremov, ISPRAS
> >> +// Comments:
> >> +// Options: --no-includes --include-headers
> >> +
> >> +virtual report
> >> +virtual org
> >> +
> >> +@r depends on org || report@
> >> +identifier l, h, m;
> >> +statement S;
> >> +position p;
> >> +// to match 1 in <<
> >> +// to match 2 in /
> >> +// Can't use exact values, e.g. 2, because it fails to match 2L.
> >> +// TODO: Is there an isomorphism for 2, 2L, 2U, 2UL, 2ULL, etc?
> >> +constant c;
> >
> > As far as I can see, you aren't checking for 2 at all at the moment?
>
> Yes, there are no false positives even without pinning constants to 1, 2.
> However, it's better to express this in the rule.
>
> > You
> > should be able to say constant c = {2, 2L, etc};.  Actually, we do
> > consider several variants of 0, so it could be reasonable to allow eg 2 to
> > match other variants as well.
>
> It looks like integer literals aren't fully supported. When I'm trying to write
> 'constant c = {2L}; ' it fails with int_of_string error.

Oops.  I'll fix it, but since people may be using older versions of
Coccinelle, perhaps it is not worth taking this strategy in this case.
Could you make a disjunction, or check for the proper value in the python
code?

julia
_______________________________________________
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci

  reply	other threads:[~2019-09-05  7:51 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-04 22:12 [RFC PATCH] coccinelle: check for integer overflow in binary search Denis Efremov
2019-09-04 22:12 ` [Cocci] " Denis Efremov
2019-09-05  6:20 ` Julia Lawall
2019-09-05  6:20   ` [Cocci] " Julia Lawall
2019-09-05  7:36   ` Denis Efremov
2019-09-05  7:36     ` [Cocci] " Denis Efremov
2019-09-05  7:51     ` Julia Lawall [this message]
2019-09-05  7:51       ` Julia Lawall
2019-09-05 12:32 ` Markus Elfring
2019-09-05 12:32   ` [Cocci] " Markus Elfring
2019-09-05 12:32   ` Markus Elfring
2019-09-05 14:23 ` Markus Elfring
2019-09-05 14:23   ` [Cocci] " Markus Elfring
2019-09-05 14:23   ` Markus Elfring

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.1909050950100.2357@hadrien \
    --to=julia.lawall@lip6.fr \
    --cc=Gilles.Muller@lip6.fr \
    --cc=cocci@systeme.lip6.fr \
    --cc=efremov@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michal.lkml@markovi.net \
    --cc=nicolas.palix@imag.fr \
    --cc=tacingiht@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.