All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gummerer <t.gummerer@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Jonathan Nieder <jrnieder@gmail.com>, git@vger.kernel.org
Subject: Re: playing with MSan, was Re: [PATCH 0/3] fixes for running the test suite with --valgrind
Date: Wed, 4 Oct 2017 20:30:05 +0100	[thread overview]
Message-ID: <20171004193005.GD30301@hank> (raw)
In-Reply-To: <20171004101932.pai6wzcv2eohsicr@sigill.intra.peff.net>

On 10/04, Jeff King wrote:
> On Tue, Oct 03, 2017 at 07:41:54PM -0400, Jeff King wrote:
> 
> > I think using SANITIZE=memory would catch these, but it needs some
> > suppressions tuning. The weird "zlib reads uninitialized memory" error
> > is a problem (valgrind sees this, too, but we have suppressions).
> 
> I dug into this a little more. You can blacklist certain functions from
> getting MSan treatment, but that's not quite what we want. We want to
> mark bytes from certain _sources_ as being initialized, even if MSan
> doesn't agree.
> 
> And indeed, you can do that. As far as I can tell, MSan works by keeping
> a shadow map of memory and setting flags when it believes it has been
> initialized, and then checking that map when we make decisions based on
> the memory. But it can only do that if it instruments all writes. So the
> MSan documentation recommends that you build _everything_, including
> libraries, with it. Which obviously we don't do if we're using a system
> zlib. Or a system libc for that matter (though they intercept many
> common libc functions to handle this).
> 
> So one strategy is to "cheat" a bit at the library interfaces, and claim
> whatever they send us is properly initialized. The patch below tries
> that with zlib, and it does seem to work. It would fail to notice a real
> problem with any input we send _to_ the library (since the library isn't
> instrumented, and we claim that whatever comes out of it is legitimate).
> I could probably live with that.
> 
> But there are quite a few test failures that would still need
> investigating and annotating:
> 
>   - Certainly it's confused by looking at regmatch_t results from
>     regexec(). We can fix that by building with NO_REGEX. But pcre has
>     a similar problem.
> 
>   - Ditto curl and openssl, whose exit points would need annotations.
> 
>   - For some reason test-sigchain segfaults when it raise()s in the
>     signal handler and recurses. Not sure if this is an MSan bug or
>     what.
> 
> So I dunno. This approach is a _lot_ more convenient than trying to
> rebuild all the dependencies from scratch, and it runs way faster than
> valgrind.  It did find the cases that led to the patches in this
> series, and at least one more: if the lstat() at the end of
> entry.c:write_entry() fails, we write nonsense into the cache_entry.

Yeah valgrind found that one too, as I tried (and apparently failed :))
to explain in the cover letter.  I just haven't found the time yet to
actually try and go fix that one.

> I think we could probably get it to zero false positives without _too_
> much effort. I'll stop here for tonight, but I may pick it up again
> later (of course anybody else is welcome to fool around with it, too).
> 
> Below is the patch that let me run:
> 
>   make SANITIZE=memory CC=clang-6.0 NO_REGEX=1
> 
> and get a tractable number of errors.
> 
> -- >8 --
> diff --git a/Makefile b/Makefile
> index b143e4eea3..1da5c01211 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1047,6 +1047,9 @@ endif
>  ifneq ($(filter leak,$(SANITIZERS)),)
>  BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
>  endif
> +ifneq ($(filter memory,$(SANITIZERS)),)
> +BASIC_CFLAGS += -DENABLE_MSAN_UNPOISON
> +endif
>  endif
>  
>  ifndef sysconfdir
> diff --git a/git-compat-util.h b/git-compat-util.h
> index cedad4d581..836a4c0b54 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -1191,4 +1191,11 @@ extern void unleak_memory(const void *ptr, size_t len);
>  #define UNLEAK(var) do {} while (0)
>  #endif
>  
> +#ifdef ENABLE_MSAN_UNPOISON
> +#include <sanitizer/msan_interface.h>
> +#define msan_unpoison(ptr, len) __msan_unpoison(ptr, len)
> +#else
> +#define msan_unpoison(ptr, len) do {} while (0)
> +#endif
> +
>  #endif
> diff --git a/zlib.c b/zlib.c
> index 4223f1a8c5..5fa8f12507 100644
> --- a/zlib.c
> +++ b/zlib.c
> @@ -56,6 +56,8 @@ static void zlib_post_call(git_zstream *s)
>  	if (s->z.total_in != s->total_in + bytes_consumed)
>  		die("BUG: total_in mismatch");
>  
> +	msan_unpoison(s->next_out, bytes_produced);
> +
>  	s->total_out = s->z.total_out;
>  	s->total_in = s->z.total_in;
>  	s->next_in = s->z.next_in;

  reply	other threads:[~2017-10-04 19:29 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-03 19:57 [PATCH 0/3] fixes for running the test suite with --valgrind Thomas Gummerer
2017-10-03 19:57 ` [PATCH 1/3] path.c: fix uninitialized memory access Thomas Gummerer
2017-10-03 22:45   ` Jonathan Nieder
2017-10-03 23:30     ` Jeff King
2017-10-03 23:37       ` Jonathan Nieder
2017-10-04  4:47         ` Junio C Hamano
2017-10-04  5:21           ` Jeff King
2017-10-04 19:22           ` Thomas Gummerer
2017-10-04 19:36           ` Jonathan Nieder
2017-10-03 19:57 ` [PATCH 2/3] http-push: fix construction of hex value from path Thomas Gummerer
2017-10-03 22:53   ` Jonathan Nieder
2017-10-03 23:36     ` Jeff King
2017-10-04  4:48       ` Junio C Hamano
2017-10-04  5:20         ` Junio C Hamano
2017-10-04  5:26           ` Jeff King
2017-10-04  6:26             ` Junio C Hamano
2017-10-03 19:57 ` [PATCH 3/3] sub-process: allocate argv on the heap Thomas Gummerer
2017-10-03 20:24   ` Johannes Sixt
2017-10-04  4:59     ` Junio C Hamano
2017-10-04  5:32       ` Jeff King
2017-10-04  5:58       ` Johannes Sixt
2017-10-04 19:31       ` Thomas Gummerer
2017-10-03 20:25   ` Stefan Beller
2017-10-03 23:41 ` [PATCH 0/3] fixes for running the test suite with --valgrind Jeff King
2017-10-03 23:50   ` Jonathan Nieder
2017-10-03 23:54     ` Jeff King
2017-10-04 10:19   ` playing with MSan, was " Jeff King
2017-10-04 19:30     ` Thomas Gummerer [this message]
2017-10-05  3:46       ` lstat-ing delayed-filter output, was Re: playing with MSan Jeff King
2017-10-05 10:47         ` Lars Schneider

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171004193005.GD30301@hank \
    --to=t.gummerer@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.