From: Julia Lawall <julia.lawall@lip6.fr>
To: Denis Efremov <efremov@linux.com>
Cc: Michal Marek <michal.lkml@markovi.net>,
Nicolas Palix <nicolas.palix@imag.fr>,
linux-kernel@vger.kernel.org, Coccinelle <cocci@systeme.lip6.fr>
Subject: Re: [Cocci] [RFC PATCH] coccinelle: check for integer overflow in binary search
Date: Thu, 5 Sep 2019 08:20:02 +0200 (CEST) [thread overview]
Message-ID: <alpine.DEB.2.21.1909050816370.2815@hadrien> (raw)
In-Reply-To: <20190904221223.5281-1-efremov@linux.com>
On Thu, 5 Sep 2019, Denis Efremov wrote:
> This is an RFC. I will resend the patch after feedback. Currently
> I'm preparing big patchset with bsearch warnings fixed. The rule will
> be a part of this patchset if it will be considered good enough for
> checking.
>
> There is a known integer overflow error [1] in the binary search
> algorithm. Google faced it in 2006 [2]. This rule checks midpoint
> calculation in binary search for overflow, i.e., (l + h) / 2.
> Not every match is an actual error since the array could be small
> enough. However, a custom implementation of binary search is
> error-prone and it's better to use the library function (lib/bsearch.c)
> or to apply defensive programming for midpoint calculation.
>
> [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues
> [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
>
> Signed-off-by: Denis Efremov <efremov@linux.com>
> ---
> scripts/coccinelle/misc/bsearch.cocci | 80 +++++++++++++++++++++++++++
> 1 file changed, 80 insertions(+)
> create mode 100644 scripts/coccinelle/misc/bsearch.cocci
>
> diff --git a/scripts/coccinelle/misc/bsearch.cocci b/scripts/coccinelle/misc/bsearch.cocci
> new file mode 100644
> index 000000000000..a99d9a8d3ee5
> --- /dev/null
> +++ b/scripts/coccinelle/misc/bsearch.cocci
> @@ -0,0 +1,80 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/// Check midpoint calculation in binary search algorithm for integer overflow
> +/// error [1]. Google faced it in 2006 [2]. Not every match is an actual error
> +/// since the array can be small enough. However, a custom implementation of
> +/// binary search is error-prone and it's better to use the library function
> +/// (lib/bsearch.c) or to apply defensive programming for midpoint calculation.
> +///
> +/// [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues
> +/// [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
> +//
> +// Confidence: Medium
> +// Copyright: (C) 2019 Denis Efremov, ISPRAS
> +// Comments:
> +// Options: --no-includes --include-headers
> +
> +virtual report
> +virtual org
> +
> +@r depends on org || report@
> +identifier l, h, m;
> +statement S;
> +position p;
> +// to match 1 in <<
> +// to match 2 in /
> +// Can't use exact values, e.g. 2, because it fails to match 2L.
> +// TODO: Is there an isomorphism for 2, 2L, 2U, 2UL, 2ULL, etc?
> +constant c;
As far as I can see, you aren't checking for 2 at all at the moment? You
should be able to say constant c = {2, 2L, etc};. Actually, we do
consider several variants of 0, so it could be reasonable to allow eg 2 to
match other variants as well.
> +// TODO: Is there an isomorphism for (a / 2) === (a >> 1)?
No. I'm not sure it's common enough to be worth it. But it could be
added if it seems like a good idea.
> +@@
> +(
> + while (\(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\)) {
> + ...
> +(
> + ((l + h)@p / c)
> +|
> + ((l + h)@p >> c)
> +)
> + ...
> + }
> +//TODO: Is it possible to match "do {} while ();" loops?
Not at the moment. There is some work on it, but it is not complete
(adding Evan in CC).
julia
> +// |
> +// do {
> +// ...
> +// (
> +// ((l + h)@p / c)
> +// |
> +// ((l + h)@p >> c)
> +// )
> +// ...
> +// } while (\(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\));
> +|
> + for (...; \(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\);
> + m = \(((l + h)@p / c)\|((l + h)@p >> c)\)) S
> +|
> + for (...; \(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\); ...) {
> + ...
> +(
> + ((l + h)@p / c)
> +|
> + ((l + h)@p >> c)
> +)
> + ...
> + }
> +)
> +
> +@script:python depends on org@
> +p << r.p;
> +@@
> +
> +coccilib.org.print_todo(p[0], "WARNING opportunity for bsearch() or 'm = l + (h - l) / 2;'")
> +
> +@script:python depends on report@
> +p << r.p;
> +@@
> +
> +msg="WARNING: custom implementation of bsearch is error-prone. "
> +msg+="Consider using lib/bsearch.c or fix the midpoint calculation "
> +msg+="as 'm = l + (h - l) / 2;' to prevent the arithmetic overflow."
> +coccilib.report.print_report(p[0], msg)
> +
> --
> 2.21.0
>
>
_______________________________________________
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci
next prev parent reply other threads:[~2019-09-05 6:29 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-04 22:12 [Cocci] [RFC PATCH] coccinelle: check for integer overflow in binary search Denis Efremov
2019-09-05 6:20 ` Julia Lawall [this message]
2019-09-05 7:36 ` Denis Efremov
2019-09-05 7:51 ` Julia Lawall
2019-09-05 12:32 ` Markus Elfring
2019-09-05 14:23 ` Markus Elfring
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.DEB.2.21.1909050816370.2815@hadrien \
--to=julia.lawall@lip6.fr \
--cc=cocci@systeme.lip6.fr \
--cc=efremov@linux.com \
--cc=linux-kernel@vger.kernel.org \
--cc=michal.lkml@markovi.net \
--cc=nicolas.palix@imag.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).