From: "Torsten Bögershausen" <tboegi@web.de>
To: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Cc: "Johannes Sixt" <j6t@kdbg.org>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	git@vger.kernel.org, "René Scharfe" <l.s.r@web.de>
Subject: Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers
Date: Fri, 15 Jul 2016 15:37:32 +0200
Message-ID: <5b99a4bb-9b8e-e8c6-e214-e041209cb6e6@web.de>
In-Reply-To: <20160714223843.GA22196@sigill.intra.peff.net>



On 07/15/2016 12:38 AM, Jeff King wrote:
> On Thu, Jul 14, 2016 at 03:30:58PM -0700, Junio C Hamano wrote:
>
>>> If we move to time_t everywhere, I think we'll need an extra
>>> TIME_T_IS_64BIT, but we can cross that bridge when we come to it.
>>>
>>> Likewise I think we'll need SIZE_T_IS_64BIT eventually (for real 32-bit
>>> systems; LLP64 systems like Windows will then be able to run the test).
>>
>> I guess I wrote essentially the same thing before refreshing my
>> Inbox.
>>
>> I am a bit fuzzy between off_t and size_t; the former is for the
>> size of things you see on the filesystem, while the latter is for
>> you to give malloc(3).  I would have thought that off_t is the type
>> we would want at the end of the raw object header, denoting the size
>> of a blob object when deflated, which could be larger than the size
>> of a region of memory we can get from malloc(3), in which case we
>> would use the streaming interface.
>
> Yeah, your understanding is right (s/deflated/inflated/). I agree that
> off_t is probably a better size for blobs. Traditionally git assumed any
> object could fit in memory. The streaming interface helps that somewhat,
> but I think there are cases where we still must load a blob (e.g., if it
> is stored as a delta). In theory that never happens because of
> core.bigfilethreshold, but you may get a packfile from somebody with a
> higher threshold than you.
>
> I wouldn't be surprised if there are other cases that aren't smart
> enough to use the streaming interface yet, but the solution there is to
> make them smarter. :)
>
> So off_t is probably better. We do need to be careful, though, when
> allocating objects. E.g., this:
>
>   off_t size;
>   struct git_istream *stream;
>   void *buf;
>
>   stream = open_istream(sha1, &type, &size, NULL);
>   buf = xmalloc(size);
>   while (1) {
> 	/* read stream into buf */
>   }
>
> is a security hole when size_t is less than off_t (it gets truncated in
> the call to xmalloc, which allocates too few bytes). This is a toy
> example, obviously, but it's something to watch out for.
>
> -Peff
That code is "illegal"; it should be
  buf = xmalloc(xsize_t(size));

And the conversion from off_t to size_t
should always go via xsize_t():

static inline size_t xsize_t(off_t len)
{
	if (len > (size_t) len)
		die("Cannot handle files this big");
	return (size_t)len;
}
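
To make the truncation concrete, here is a minimal sketch (not git code).
On a 64-bit host size_t is as wide as off_t, so the problem does not
show up there. The sketch therefore uses two hypothetical stand-in
types: wide_off_t for a 64-bit off_t and narrow_size_t for a 32-bit
size_t. checked_size() is also an assumption of mine: it follows the
idea of xsize_t() but returns an error instead of calling die().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins so the effect is visible on any host:
 * wide_off_t plays the role of a 64-bit off_t, narrow_size_t the
 * role of a 32-bit size_t.
 */
typedef int64_t wide_off_t;
typedef uint32_t narrow_size_t;

/* Unchecked conversion: what passing an off_t straight to xmalloc() amounts to. */
static narrow_size_t unchecked_size(wide_off_t len)
{
	return (narrow_size_t)len; /* the high 32 bits are silently dropped */
}

/*
 * Checked conversion in the spirit of xsize_t(), but reporting the
 * failure to the caller. Returns 0 on success, -1 if len is negative
 * or does not fit into narrow_size_t.
 */
static int checked_size(wide_off_t len, narrow_size_t *out)
{
	if (len < 0 || (uint64_t)len > UINT32_MAX)
		return -1;
	*out = (narrow_size_t)len;
	return 0;
}
```

With a length of 4 GiB + 16 bytes, unchecked_size() yields 16, so an
allocation based on it would be far too small, while checked_size()
rejects the value.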

There are some more things to be done, in the long run:
- convert "unsigned long" into either off_t or size_t in e.g. convert.c
- Use the streaming interface to analyze if blobs are binary
   (That is already on my list, the old "stream and early out"
   from the olc 10/10, gmane/$293010 or so can be reused)
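
As a rough illustration of the "stream and early out" idea, here is a
sketch that reads a stream in fixed-size chunks and stops at the first
NUL byte instead of loading the whole blob into memory. It is not git's
streaming API: stream_has_nul() is a hypothetical helper built on plain
stdio, and treating a NUL byte as the sign of binary content is an
assumption made for the example. The real heuristics may differ.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: scan a stream chunk by chunk
 * and stop at the first NUL byte.
 * Returns 1 if a NUL byte was seen, 0 if the stream ended without
 * one, and -1 on a read error.
 */
static int stream_has_nul(FILE *in)
{
	char buf[4096];
	size_t n;

	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		if (memchr(buf, '\0', n))
			return 1; /* early out: the rest need not be read */
	}
	return ferror(in) ? -1 : 0;
}
```

Memory use stays bounded by the chunk size no matter how large the
input is, which is the point of going through a streaming interface.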

