All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Nieder <jrnieder@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Ramsay Jones <ramsay@ramsayjones.plus.com>
Subject: [PATCH v2] xsize_t: avoid implementation defined behavior when len < 0
Date: Tue, 18 May 2021 18:52:56 -0700	[thread overview]
Message-ID: <YKRveGhNqPMjmJs8@google.com> (raw)
In-Reply-To: <xmqqy2cbzeqx.fsf@gitster.g>

The xsize_t helper aims to safely convert an off_t to a size_t,
erroring out when a file offset is too large to fit into a memory
address.  It does this by using two casts:

	size_t size = (size_t) len;
	if (len != (off_t) size)
		... error out ...

On a platform with sizeof(size_t) < sizeof(off_t), this check is safe
and correct.  The first cast truncates to a size_t by finding the
remainder modulo SIZE_MAX+1 (see C99 section 6.3.1.3 Signed and
unsigned integers) and the second promotes to an off_t, meaning the
result is true if and only if len is representable as a size_t.

On other platforms, this two-casts strategy still works well (always
succeeds) for len >= 0.  But for len < 0, when the first cast succeeds
and produces SIZE_MAX + 1 + len, the resulting value is too large to
be represented as an off_t, so the second cast produces implementation
defined behavior.  In practice, it is likely to produce a result of
true despite len not being representable as size_t.

Simplify by replacing with a more straightforward check: compare len
to the relevant bounds and then cast it.  (To avoid a -Wsign-compare
warning, after checking that len >= 0, we explicitly convert to a
sufficiently-large unsigned type before comparing to SIZE_MAX.)

In practice, this is not likely to come up since typical callers use
nonnegative len.  Still, it's helpful to handle this case to make the
behavior easy to reason about.

Historical note: the original bounds-checking in 46be82dfd0 (xsize_t:
check whether we lose bits, 2010-07-28) did not produce this
implementation-defined behavior, though it still did not handle
negative offsets.  It was not until 73560c793a (git-compat-util.h:
xsize_t() - avoid -Wsign-compare warnings, 2017-09-21) introduced the
double cast that the implementation-defined behavior was triggered.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:

>> -	if (len != (off_t) size)
>> +	if (len < 0 || len > SIZE_MAX)
>>  		die("Cannot handle files this big");
>
> OK, so negative offset or offset that cannot be represented as size_t
> are rejected.  That is much easier to read than the original ;-)
>
> SIZE_MAX is associated with size_t so it presumably is an unsigned
> constant; would it again trigger a sign-compare warning?

Alas, on platforms with sizeof(size_t) == sizeof(off_t), I believe it
does:

	$ gcc --version
	gcc (Debian 10.2.1-6+build2) 10.2.1 20210110
	Copyright (C) 2020 Free Software Foundation, Inc.
	This is free software; see the source for copying conditions.  There is NO
	warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

	$ cat sign-compare-test.c
	#define X (1U)

	extern int signed_int();

	int main(void)
	{
	  int v = signed_int();
	  return v < 0 || v > X;
	}
	$ gcc -c -Wall -W -Wsign-compare sign-compare-test.c
	sign-compare-test.c: In function ‘main’:
	sign-compare-test.c:8:21: warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare]
	    8 |   return v < 0 || v > X;
	      |                     ^

That can be worked around by reintroducing a cast, to an unsigned type
this time, like this.

Thanks,
Jonathan

 git-compat-util.h | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index a508dbe5a3..fb6e9af76b 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -986,11 +986,9 @@ static inline char *xstrdup_or_null(const char *str)
 
 static inline size_t xsize_t(off_t len)
 {
-	size_t size = (size_t) len;
-
-	if (len != (off_t) size)
+	if (len < 0 || (uintmax_t) len > SIZE_MAX)
 		die("Cannot handle files this big");
-	return size;
+	return (size_t) len;
 }
 
 __attribute__((format (printf, 3, 4)))
-- 
2.31.1.818.g46aad6cb9e


      reply	other threads:[~2021-05-19  1:53 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-18 15:03 [PATCH] xsize_t: avoid implementation defined behavior when len < 0 Jonathan Nieder
2021-05-19  1:36 ` Junio C Hamano
2021-05-19  1:52   ` Jonathan Nieder [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YKRveGhNqPMjmJs8@google.com \
    --to=jrnieder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ramsay@ramsayjones.plus.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.