From: Denis Efremov <efremov@linux.com> To: Coccinelle <cocci@systeme.lip6.fr> Cc: Denis Efremov <efremov@linux.com>, Julia Lawall <julia.lawall@lip6.fr>, Gilles Muller <Gilles.Muller@lip6.fr>, Nicolas Palix <nicolas.palix@imag.fr>, Michal Marek <michal.lkml@markovi.net>, linux-kernel@vger.kernel.org Subject: [RFC PATCH] coccinelle: check for integer overflow in binary search Date: Thu, 5 Sep 2019 01:12:23 +0300 [thread overview] Message-ID: <20190904221223.5281-1-efremov@linux.com> (raw) This is an RFC. I will resend the patch after feedback. Currently I'm preparing big patchset with bsearch warnings fixed. The rule will be a part of this patchset if it will be considered good enough for checking. There is a known integer overflow error [1] in the binary search algorithm. Google faced it in 2006 [2]. This rule checks midpoint calculation in binary search for overflow, i.e., (l + h) / 2. Not every match is an actual error since the array could be small enough. However, a custom implementation of binary search is error-prone and it's better to use the library function (lib/bsearch.c) or to apply defensive programming for midpoint calculation. [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html Signed-off-by: Denis Efremov <efremov@linux.com> --- scripts/coccinelle/misc/bsearch.cocci | 80 +++++++++++++++++++++++++++ 1 file changed, 80 insertions(+) create mode 100644 scripts/coccinelle/misc/bsearch.cocci diff --git a/scripts/coccinelle/misc/bsearch.cocci b/scripts/coccinelle/misc/bsearch.cocci new file mode 100644 index 000000000000..a99d9a8d3ee5 --- /dev/null +++ b/scripts/coccinelle/misc/bsearch.cocci @@ -0,0 +1,80 @@ +// SPDX-License-Identifier: GPL-2.0-only +/// Check midpoint calculation in binary search algorithm for integer overflow +/// error [1]. Google faced it in 2006 [2]. Not every match is an actual error +/// since the array can be small enough. However, a custom implementation of +/// binary search is error-prone and it's better to use the library function +/// (lib/bsearch.c) or to apply defensive programming for midpoint calculation. +/// +/// [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues +/// [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html +// +// Confidence: Medium +// Copyright: (C) 2019 Denis Efremov, ISPRAS +// Comments: +// Options: --no-includes --include-headers + +virtual report +virtual org + +@r depends on org || report@ +identifier l, h, m; +statement S; +position p; +// to match 1 in << +// to match 2 in / +// Can't use exact values, e.g. 2, because it fails to match 2L. +// TODO: Is there an isomorphism for 2, 2L, 2U, 2UL, 2ULL, etc? +constant c; +// TODO: Is there an isomorphism for (a / 2) === (a >> 1)? +@@ +( + while (\(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\)) { + ... +( + ((l + h)@p / c) +| + ((l + h)@p >> c) +) + ... + } +//TODO: Is it possible to match "do {} while ();" loops? +// | +// do { +// ... +// ( +// ((l + h)@p / c) +// | +// ((l + h)@p >> c) +// ) +// ... +// } while (\(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\)); +| + for (...; \(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\); + m = \(((l + h)@p / c)\|((l + h)@p >> c)\)) S +| + for (...; \(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\); ...) { + ... +( + ((l + h)@p / c) +| + ((l + h)@p >> c) +) + ... + } +) + +@script:python depends on org@ +p << r.p; +@@ + +coccilib.org.print_todo(p[0], "WARNING opportunity for bsearch() or 'm = l + (h - l) / 2;'") + +@script:python depends on report@ +p << r.p; +@@ + +msg="WARNING: custom implementation of bsearch is error-prone. " +msg+="Consider using lib/bsearch.c or fix the midpoint calculation " +msg+="as 'm = l + (h - l) / 2;' to prevent the arithmetic overflow." +coccilib.report.print_report(p[0], msg) + -- 2.21.0
WARNING: multiple messages have this Message-ID (diff)
From: Denis Efremov <efremov@linux.com> To: Coccinelle <cocci@systeme.lip6.fr> Cc: Michal Marek <michal.lkml@markovi.net>, Nicolas Palix <nicolas.palix@imag.fr>, linux-kernel@vger.kernel.org Subject: [Cocci] [RFC PATCH] coccinelle: check for integer overflow in binary search Date: Thu, 5 Sep 2019 01:12:23 +0300 [thread overview] Message-ID: <20190904221223.5281-1-efremov@linux.com> (raw) This is an RFC. I will resend the patch after feedback. Currently I'm preparing big patchset with bsearch warnings fixed. The rule will be a part of this patchset if it will be considered good enough for checking. There is a known integer overflow error [1] in the binary search algorithm. Google faced it in 2006 [2]. This rule checks midpoint calculation in binary search for overflow, i.e., (l + h) / 2. Not every match is an actual error since the array could be small enough. However, a custom implementation of binary search is error-prone and it's better to use the library function (lib/bsearch.c) or to apply defensive programming for midpoint calculation. [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html Signed-off-by: Denis Efremov <efremov@linux.com> --- scripts/coccinelle/misc/bsearch.cocci | 80 +++++++++++++++++++++++++++ 1 file changed, 80 insertions(+) create mode 100644 scripts/coccinelle/misc/bsearch.cocci diff --git a/scripts/coccinelle/misc/bsearch.cocci b/scripts/coccinelle/misc/bsearch.cocci new file mode 100644 index 000000000000..a99d9a8d3ee5 --- /dev/null +++ b/scripts/coccinelle/misc/bsearch.cocci @@ -0,0 +1,80 @@ +// SPDX-License-Identifier: GPL-2.0-only +/// Check midpoint calculation in binary search algorithm for integer overflow +/// error [1]. Google faced it in 2006 [2]. Not every match is an actual error +/// since the array can be small enough. However, a custom implementation of +/// binary search is error-prone and it's better to use the library function +/// (lib/bsearch.c) or to apply defensive programming for midpoint calculation. +/// +/// [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues +/// [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html +// +// Confidence: Medium +// Copyright: (C) 2019 Denis Efremov, ISPRAS +// Comments: +// Options: --no-includes --include-headers + +virtual report +virtual org + +@r depends on org || report@ +identifier l, h, m; +statement S; +position p; +// to match 1 in << +// to match 2 in / +// Can't use exact values, e.g. 2, because it fails to match 2L. +// TODO: Is there an isomorphism for 2, 2L, 2U, 2UL, 2ULL, etc? +constant c; +// TODO: Is there an isomorphism for (a / 2) === (a >> 1)? +@@ +( + while (\(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\)) { + ... +( + ((l + h)@p / c) +| + ((l + h)@p >> c) +) + ... + } +//TODO: Is it possible to match "do {} while ();" loops? +// | +// do { +// ... +// ( +// ((l + h)@p / c) +// | +// ((l + h)@p >> c) +// ) +// ... +// } while (\(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\)); +| + for (...; \(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\); + m = \(((l + h)@p / c)\|((l + h)@p >> c)\)) S +| + for (...; \(l < h\|l <= h\|(h - l) > 1\|(l + 1) < h\|l < (h - 1)\); ...) { + ... +( + ((l + h)@p / c) +| + ((l + h)@p >> c) +) + ... + } +) + +@script:python depends on org@ +p << r.p; +@@ + +coccilib.org.print_todo(p[0], "WARNING opportunity for bsearch() or 'm = l + (h - l) / 2;'") + +@script:python depends on report@ +p << r.p; +@@ + +msg="WARNING: custom implementation of bsearch is error-prone. " +msg+="Consider using lib/bsearch.c or fix the midpoint calculation " +msg+="as 'm = l + (h - l) / 2;' to prevent the arithmetic overflow." +coccilib.report.print_report(p[0], msg) + -- 2.21.0 _______________________________________________ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
next reply other threads:[~2019-09-04 22:12 UTC|newest] Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-09-04 22:12 Denis Efremov [this message] 2019-09-04 22:12 ` [Cocci] [RFC PATCH] coccinelle: check for integer overflow in binary search Denis Efremov 2019-09-05 6:20 ` Julia Lawall 2019-09-05 6:20 ` [Cocci] " Julia Lawall 2019-09-05 7:36 ` Denis Efremov 2019-09-05 7:36 ` [Cocci] " Denis Efremov 2019-09-05 7:51 ` Julia Lawall 2019-09-05 7:51 ` [Cocci] " Julia Lawall 2019-09-05 12:32 ` Markus Elfring 2019-09-05 12:32 ` [Cocci] " Markus Elfring 2019-09-05 12:32 ` Markus Elfring 2019-09-05 14:23 ` Markus Elfring 2019-09-05 14:23 ` [Cocci] " Markus Elfring 2019-09-05 14:23 ` Markus Elfring
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20190904221223.5281-1-efremov@linux.com \ --to=efremov@linux.com \ --cc=Gilles.Muller@lip6.fr \ --cc=cocci@systeme.lip6.fr \ --cc=julia.lawall@lip6.fr \ --cc=linux-kernel@vger.kernel.org \ --cc=michal.lkml@markovi.net \ --cc=nicolas.palix@imag.fr \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.