All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC][PATCH] index-pack: add testcases found using AFL
       [not found] <20170310151556.18490-1-vegard.nossum@oracle.com>
@ 2017-03-10 16:00 ` Vegard Nossum
  2017-03-10 19:06 ` Jeff King
  1 sibling, 0 replies; 13+ messages in thread
From: Vegard Nossum @ 2017-03-10 16:00 UTC (permalink / raw)
  To: gitster; +Cc: peff, git

On 10/03/2017 16:15, Vegard Nossum wrote:
> I've used AFL to generate a corpus of pack files that maximises the edge
> coverage for 'git index-pack'.
>
> This is a supplement to (and not a replacement for) the regular test cases
> where we know exactly what each test is checking for. These testcases are
> more useful for avoiding regressions in edge cases or as a starting point
> for future fuzzing efforts.
>
> To see the output of running 'git index-pack' on each file, you can do
> something like this:
>
>   make -C t GIT_TEST_OPTS="--run=34 --verbose" t5300-pack-object.sh
>
> I observe the following coverage changes (for t5300 only):
>
>   path                  old%  new%    pp
>   ----------------------------------------
>   builtin/index-pack.c  74.3  76.6   2.3
>   pack-write.c          79.8  80.4    .6
>   patch-delta.c         67.4  81.4  14.0
>   usage.c               26.6  35.5   8.9
>   wrapper.c             42.0  46.1   4.1
>   zlib.c                58.7  64.1   5.4

And if you add this simple patch on top (sorry, I didn't think of it
until after I'd sent the previous e-mail):

diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 19e02ffc2..db705ba5c 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -425,8 +425,10 @@ test_expect_success 'index-pack <pack> works in 
non-repo' '
  test_expect_success 'index-pack edge coverage' '
         for pack in "$TEST_DIRECTORY"/t5300/*.pack
         do
-               rm -rf "${pack%.pack}.idx" &&
-               test_might_fail git index-pack $pack
+               rm -rf "${pack%.pack}.idx" tmp.pack tmp.idx &&
+               test_might_fail git index-pack $pack &&
+               test_might_fail git index-pack --strict $pack &&
+               test_might_fail git index-pack --stdin --fix-thin 
tmp.pack < $pack
         done
  '


you get this change to the coverage profile instead:

path                  old%  new%    pp
----------------------------------------

alloc.c               58.1  67.4   9.3
builtin/index-pack.c  74.3  80.7   6.4
commit.c              13.9  17.4   3.5
date.c                 3.5   4.2    .7
fsck.c                15.7  33.7  18.0
object.c              56.0  58.7   2.7
pack-write.c          79.8  81.4   1.6
patch-delta.c         67.4  81.4  14.0
path.c                31.6  32.1    .5
sha1_file.c           48.9  49.6    .7
tag.c                  3.7  16.8  13.1
tree.c                36.6  37.5    .9
usage.c               26.6  35.5   8.9
wrapper.c             42.0  46.1   4.1
zlib.c                58.7  64.1   5.4

Of course, it's likely some of those gains can be found in other
testcases outside t5300 -- also, coverage isn't everything. Still seems
like a nice gain with very little effort.


Vegard

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
       [not found] <20170310151556.18490-1-vegard.nossum@oracle.com>
  2017-03-10 16:00 ` [RFC][PATCH] index-pack: add testcases found using AFL Vegard Nossum
@ 2017-03-10 19:06 ` Jeff King
  2017-03-10 19:34   ` Vegard Nossum
  2017-03-10 22:58   ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 13+ messages in thread
From: Jeff King @ 2017-03-10 19:06 UTC (permalink / raw)
  To: Vegard Nossum; +Cc: gitster, git

[Note: your original email didn't make it to the list because it's over
100K; I'll quote liberally].

On Fri, Mar 10, 2017 at 04:15:56PM +0100, Vegard Nossum wrote:

> I've used AFL to generate a corpus of pack files that maximises the edge
> coverage for 'git index-pack'.
> 
> This is a supplement to (and not a replacement for) the regular test cases
> where we know exactly what each test is checking for. These testcases are
> more useful for avoiding regressions in edge cases or as a starting point
> for future fuzzing efforts.
> 
> To see the output of running 'git index-pack' on each file, you can do
> something like this:
> 
>   make -C t GIT_TEST_OPTS="--run=34 --verbose" t5300-pack-object.sh
> 
> I observe the following coverage changes (for t5300 only):
> 
>   path                  old%  new%    pp
>   ----------------------------------------
>   builtin/index-pack.c  74.3  76.6   2.3
>   pack-write.c          79.8  80.4    .6
>   patch-delta.c         67.4  81.4  14.0
>   usage.c               26.6  35.5   8.9
>   wrapper.c             42.0  46.1   4.1
>   zlib.c                58.7  64.1   5.4

I'm not sure how I feel about this. More coverage is good, I guess, but
we don't have any idea what these packfiles are doing, or whether
index-pack is behaving sanely in the new lines. The most we can say is
that we tested more lines of code and that nothing segfaulted or
triggered something like ASAN.

That's something I guess, but I'm not enthused by the idea of just
dumping a bunch of binary test cases that nobody, not even the author,
understands.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-10 19:06 ` Jeff King
@ 2017-03-10 19:34   ` Vegard Nossum
  2017-03-10 19:42     ` Jeff King
  2017-03-10 22:58   ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 13+ messages in thread
From: Vegard Nossum @ 2017-03-10 19:34 UTC (permalink / raw)
  To: Jeff King; +Cc: gitster, git

On 10/03/2017 20:06, Jeff King wrote:
> On Fri, Mar 10, 2017 at 04:15:56PM +0100, Vegard Nossum wrote:
>
>> I've used AFL to generate a corpus of pack files that maximises the edge
>> coverage for 'git index-pack'.
>>
>> This is a supplement to (and not a replacement for) the regular test cases
>> where we know exactly what each test is checking for. These testcases are
>> more useful for avoiding regressions in edge cases or as a starting point
>> for future fuzzing efforts.
>>
>> To see the output of running 'git index-pack' on each file, you can do
>> something like this:
>>
>>   make -C t GIT_TEST_OPTS="--run=34 --verbose" t5300-pack-object.sh
>>
>> I observe the following coverage changes (for t5300 only):
>>
>>   path                  old%  new%    pp
>>   ----------------------------------------
>>   builtin/index-pack.c  74.3  76.6   2.3
>>   pack-write.c          79.8  80.4    .6
>>   patch-delta.c         67.4  81.4  14.0
>>   usage.c               26.6  35.5   8.9
>>   wrapper.c             42.0  46.1   4.1
>>   zlib.c                58.7  64.1   5.4
>
> I'm not sure how I feel about this. More coverage is good, I guess, but
> we don't have any idea what these packfiles are doing, or whether
> index-pack is behaving sanely in the new lines. The most we can say is
> that we tested more lines of code and that nothing segfaulted or
> triggered something like ASAN.
>
> That's something I guess, but I'm not enthused by the idea of just
> dumping a bunch of binary test cases that nobody, not even the author,
> understands.

I understand your concern. This is how I see it:

Negatives:

  - 'make test' takes 1 second longer to run

  - 548K data added to git.git

Positives:

  - regularly exercising more of the code, especially some of the corner
cases which are not caught by the rest of the test suite, possibly
catching bugs in a security-critical bit of git before it makes it into
a release

  - no impact to existing code, everything self-contained in 1 directory

  - giving more people access to the testcases I discovered without
having to repeat the effort of setting up AFL, fixing up SHA1 checksums,
minimising the corpus, running AFL for a week, etc. (each step by itself
is pretty small, but taken altogether I think it's worthwhile not
to have to repeat that).

Then I guess you have to weigh the negatives and positives. For me it's
a clear net win, but others may see it differently.

For sure, I (or somebody else) can go through each testcase and figure
out what it's doing, what it's doing differently from the existing
manual testcases in t5300, and what its expected output should be. It's
not that I couldn't understand what each testcase is doing if I tried,
but I don't think it's worth the effort. I've run everything under
valgrind and the only thing that turned up were some suspicious-looking
allocations (which AFAICT should be safe to ignore because of git's
built-in limits). Otherwise it's mostly hitting sanity checks:

error: inflate: data stream error (incorrect header check)
fatal: pack has 4 unresolved deltas
fatal: pack has bad object at offset 12: delta base offset is out of bound
fatal: invalid tag
...

These are errors you wouldn't see normally and which the existing
testcases don't check for (or not exhaustively or systematically, in any
case).

If somebody were to look at the code and say: "hey, that check looks a
bit off" (which is something I personally do all the time), then being
able to quickly find an input to execute exactly that line of code is
extremely valuable -- and you can do that simply by running the
testcases through gcov.

Anyway, the patch/data is there, use it or don't.


Vegard

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-10 19:34   ` Vegard Nossum
@ 2017-03-10 19:42     ` Jeff King
  2017-03-10 21:18       ` Vegard Nossum
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2017-03-10 19:42 UTC (permalink / raw)
  To: Vegard Nossum; +Cc: gitster, git

On Fri, Mar 10, 2017 at 08:34:45PM +0100, Vegard Nossum wrote:

> > That's something I guess, but I'm not enthused by the idea of just
> > dumping a bunch of binary test cases that nobody, not even the author,
> > understands.
> 
> I understand your concern. This is how I see it:
> 
> Negatives:
> 
>  - 'make test' takes 1 second longer to run
> 
>  - 548K data added to git.git

My real concern is that this is the tip of the ice berg. So we increased
coverage in one program by a few percent. But wouldn't this procedure be
applicable to lots of _other_ parts of Git, too?

IOW, I'm worried about a day when we've added dozens or hundreds of
seconds to the test suite. Sure, we can quit adding at any time, but I
feel like it's easier to make a decision from the outset.

I'm tempted to say these should go into a different test-suite, or be
marked with a special flag or something. But then I guess nobody runs
them.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-10 19:42     ` Jeff King
@ 2017-03-10 21:18       ` Vegard Nossum
  2017-03-12 12:24         ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Vegard Nossum @ 2017-03-10 21:18 UTC (permalink / raw)
  To: Jeff King; +Cc: gitster, git

On 10/03/2017 20:42, Jeff King wrote:
>>> That's something I guess, but I'm not enthused by the idea of just
>>> dumping a bunch of binary test cases that nobody, not even the author,
>>> understands.
[...]

> My real concern is that this is the tip of the ice berg. So we increased
> coverage in one program by a few percent. But wouldn't this procedure be
> applicable to lots of _other_ parts of Git, too?

I think that index-pack is in a special position given its role as the
verifier for packs received over the network, which you also wrote here:
https://www.spinics.net/lists/git/msg265118.html

I also think increased coverage for other parts of git which are not
considered security-sensitive is less valuable without testing for an
actual expected result.


Vegard

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-10 19:06 ` Jeff King
  2017-03-10 19:34   ` Vegard Nossum
@ 2017-03-10 22:58   ` Ævar Arnfjörð Bjarmason
  2017-03-12 12:32     ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-03-10 22:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Vegard Nossum, Junio C Hamano, Git Mailing List

On Fri, Mar 10, 2017 at 8:06 PM, Jeff King <peff@peff.net> wrote:
> [Note: your original email didn't make it to the list because it's over
> 100K; I'll quote liberally].
>
> On Fri, Mar 10, 2017 at 04:15:56PM +0100, Vegard Nossum wrote:
>
>> I've used AFL to generate a corpus of pack files that maximises the edge
>> coverage for 'git index-pack'.
>>
>> This is a supplement to (and not a replacement for) the regular test cases
>> where we know exactly what each test is checking for. These testcases are
>> more useful for avoiding regressions in edge cases or as a starting point
>> for future fuzzing efforts.
>>
>> To see the output of running 'git index-pack' on each file, you can do
>> something like this:
>>
>>   make -C t GIT_TEST_OPTS="--run=34 --verbose" t5300-pack-object.sh
>>
>> I observe the following coverage changes (for t5300 only):
>>
>>   path                  old%  new%    pp
>>   ----------------------------------------
>>   builtin/index-pack.c  74.3  76.6   2.3
>>   pack-write.c          79.8  80.4    .6
>>   patch-delta.c         67.4  81.4  14.0
>>   usage.c               26.6  35.5   8.9
>>   wrapper.c             42.0  46.1   4.1
>>   zlib.c                58.7  64.1   5.4
>
> I'm not sure how I feel about this. More coverage is good, I guess, but
> we don't have any idea what these packfiles are doing, or whether
> index-pack is behaving sanely in the new lines. The most we can say is
> that we tested more lines of code and that nothing segfaulted or
> triggered something like ASAN.

Isn't the main value with these sorts of tests that they make up the
difference in the current manually maintained coverage & some
randomized coverage. So when you change the code in the future and the
randomized coverage changes, we don't know if that's a good or a bad
thing, but at least we're more likely to know that it changed, and at
that point someone's likely to actually investigate the root cause,
which'll turn some AFL blob testcase into an isolated testcase?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-10 21:18       ` Vegard Nossum
@ 2017-03-12 12:24         ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-03-12 12:24 UTC (permalink / raw)
  To: Vegard Nossum; +Cc: gitster, git

On Fri, Mar 10, 2017 at 10:18:23PM +0100, Vegard Nossum wrote:

> On 10/03/2017 20:42, Jeff King wrote:
> > > > That's something I guess, but I'm not enthused by the idea of just
> > > > dumping a bunch of binary test cases that nobody, not even the author,
> > > > understands.
> [...]
> 
> > My real concern is that this is the tip of the ice berg. So we increased
> > coverage in one program by a few percent. But wouldn't this procedure be
> > applicable to lots of _other_ parts of Git, too?
> 
> I think that index-pack is in a special position given its role as the
> verifier for packs received over the network, which you also wrote here:
> https://www.spinics.net/lists/git/msg265118.html

That's true. If we take that attitude, my "slippery slope" concerns go
away.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-10 22:58   ` Ævar Arnfjörð Bjarmason
@ 2017-03-12 12:32     ` Jeff King
  2017-03-12 13:44       ` Vegard Nossum
  2017-03-12 18:14       ` Junio C Hamano
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff King @ 2017-03-12 12:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Vegard Nossum, Junio C Hamano, Git Mailing List

On Fri, Mar 10, 2017 at 11:58:13PM +0100, Ævar Arnfjörð Bjarmason wrote:

> >> I observe the following coverage changes (for t5300 only):
> >>
> >>   path                  old%  new%    pp
> >>   ----------------------------------------
> >>   builtin/index-pack.c  74.3  76.6   2.3
> >>   pack-write.c          79.8  80.4    .6
> >>   patch-delta.c         67.4  81.4  14.0
> >>   usage.c               26.6  35.5   8.9
> >>   wrapper.c             42.0  46.1   4.1
> >>   zlib.c                58.7  64.1   5.4
> >
> > I'm not sure how I feel about this. More coverage is good, I guess, but
> > we don't have any idea what these packfiles are doing, or whether
> > index-pack is behaving sanely in the new lines. The most we can say is
> > that we tested more lines of code and that nothing segfaulted or
> > triggered something like ASAN.
> 
> Isn't the main value with these sorts of tests that they make up the
> difference in the current manually maintained coverage & some
> randomized coverage. So when you change the code in the future and the
> randomized coverage changes, we don't know if that's a good or a bad
> thing, but at least we're more likely to know that it changed, and at
> that point someone's likely to actually investigate the root cause,
> which'll turn some AFL blob testcase into an isolated testcase?

I think you may be more optimistic than me in people actually looking at
our coverage numbers. There is a "make coverage" target, but I don't
think its output is useful for anybody trying to make comparison
metrics.

One further devil's advocate:

If people really _do_ care about coverage, arguably the AFL tests are a
pollution of that concept. Because they are running the code, but doing
a very perfunctory job of testing it. IOW, our coverage of "code that
doesn't segfault or trigger ASAN" is improved, but our coverage of "code
that has been tested to be correct" is not (and since the tests are
lumped together, it's hard to get anything but one number).

So I dunno. I remain on the fence about the patch.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-12 12:32     ` Jeff King
@ 2017-03-12 13:44       ` Vegard Nossum
  2017-03-12 18:14       ` Junio C Hamano
  1 sibling, 0 replies; 13+ messages in thread
From: Vegard Nossum @ 2017-03-12 13:44 UTC (permalink / raw)
  To: Jeff King, Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 1157 bytes --]

On 12/03/2017 13:32, Jeff King wrote:
> If people really _do_ care about coverage, arguably the AFL tests are a
> pollution of that concept. Because they are running the code, but doing
> a very perfunctory job of testing it. IOW, our coverage of "code that
> doesn't segfault or trigger ASAN" is improved, but our coverage of "code
> that has been tested to be correct" is not (and since the tests are
> lumped together, it's hard to get anything but one number).

It wouldn't be hard to separate out the testcases found by fuzzing
I've attached a patch that does just that -- none of the new testcases
are run unless you pass -f/--fuzzing in GIT_TEST_OPTS.

$ make -C t GIT_TEST_OPTS="--run=34" t5300-pack-object.sh
make: Entering directory '/home/vegard/git/git/t'
*** t5300-pack-object.sh ***
[...]
ok 34 # skip index-pack edge coverage (missing FUZZING)
[...]

$ make -C t GIT_TEST_OPTS="--run=34 -f" t5300-pack-object.sh
make: Entering directory '/home/vegard/git/git/t'
*** t5300-pack-object.sh ***
[...]
ok 34 - index-pack edge coverage
[...]

I assume automatic testing like e.g. Travis would want to enable this.

Would that help at all?


Vegard

[-- Attachment #2: 0001-test-lib-add-fuzzing-option.patch --]
[-- Type: text/x-patch, Size: 2533 bytes --]

From 04446ce562eee129588f2c92c4eef2c82ed4bb4f Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@oracle.com>
Date: Sun, 12 Mar 2017 14:35:25 +0100
Subject: [PATCH] test-lib: add --fuzzing option

From t/README:

	This causes additional testcases found by fuzzing to be run,
	for more exhaustive testing. Please note that these testcases
	have not been vetted for correctness, but they may uncover
	bugs introduced in code paths which are not otherwise run
	in other tests.

The -f/--fuzzing/FUZZING name is up for discussion, I just couldn't think
of anything more descriptive.
---
 t/README               | 8 ++++++++
 t/t5300-pack-object.sh | 2 +-
 t/test-lib.sh          | 6 ++++++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/t/README b/t/README
index 4982d1c52..2c56567b1 100644
--- a/t/README
+++ b/t/README
@@ -110,6 +110,14 @@ appropriately before running "make".
 	This causes additional long-running tests to be run (where
 	available), for more exhaustive testing.
 
+-f::
+--fuzzing::
+	This causes additional testcases found by fuzzing to be run,
+	for more exhaustive testing. Please note that these testcases
+	have not been vetted for correctness, but they may uncover
+	bugs introduced in code paths which are not otherwise run
+	in other tests.
+
 -r::
 --run=<test-selector>::
 	Run only the subset of tests indicated by
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 19e02ffc2..f58d0d4bf 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -422,7 +422,7 @@ test_expect_success 'index-pack <pack> works in non-repo' '
 '
 
 # These pack files were generated using AFL
-test_expect_success 'index-pack edge coverage' '
+test_expect_success FUZZING 'index-pack edge coverage' '
 	for pack in "$TEST_DIRECTORY"/t5300/*.pack
 	do
 		rm -rf "${pack%.pack}.idx" &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 86d77c16d..35df2bd6c 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -209,6 +209,8 @@ do
 		immediate=t; shift ;;
 	-l|--l|--lo|--lon|--long|--long-|--long-t|--long-te|--long-tes|--long-test|--long-tests)
 		GIT_TEST_LONG=t; export GIT_TEST_LONG; shift ;;
+	-f|--f|--fuzzing)
+		GIT_TEST_FUZZING=t; export GIT_TEST_FUZZING; shift ;;
 	-r)
 		shift; test "$#" -ne 0 || {
 			echo 'error: -r requires an argument' >&2;
@@ -1098,6 +1100,10 @@ test_lazy_prereq EXPENSIVE '
 	test -n "$GIT_TEST_LONG"
 '
 
+test_lazy_prereq FUZZING '
+	test -n "$GIT_TEST_FUZZING"
+'
+
 test_lazy_prereq USR_BIN_TIME '
 	test -x /usr/bin/time
 '
-- 
2.12.0.rc0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-12 12:32     ` Jeff King
  2017-03-12 13:44       ` Vegard Nossum
@ 2017-03-12 18:14       ` Junio C Hamano
  2017-03-13 11:07         ` Vegard Nossum
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-03-12 18:14 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Vegard Nossum, Git Mailing List

Jeff King <peff@peff.net> writes:

> One further devil's advocate:
>
> If people really _do_ care about coverage, arguably the AFL tests are a
> pollution of that concept. Because they are running the code, but doing
> a very perfunctory job of testing it. IOW, our coverage of "code that
> doesn't segfault or trigger ASAN" is improved, but our coverage of "code
> that has been tested to be correct" is not (and since the tests are
> lumped together, it's hard to get anything but one number).
>
> So I dunno. I remain on the fence about the patch.

Yeah, I have been disturbed by your earlier remark "binary test
cases that nobody, not even the author, understands", and the above
summarizes it more clearly.

Continuously running fuzzer tests on the codebase would have value,
but how exactly are these fuzzballs generated?  Don't they depend on
the code being tested?  IOW, how effective is a set of fuzzballs
that forces the code to take more branches in the current codepath
for the purpose of testing new code that updates the codepath,
changing the structure of the codeflow?  Unless a new set of
fuzzballs to match the updated codeflow is generated, wouldn't the
test coverage with these fuzzballs erode over time, making them less
and less useful baggage we carry around, without nobody noticing that
they no longer are effective to help test coverage?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-12 18:14       ` Junio C Hamano
@ 2017-03-13 11:07         ` Vegard Nossum
  2017-03-13 17:11           ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Vegard Nossum @ 2017-03-13 11:07 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

On 12/03/2017 19:14, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
>> One further devil's advocate:
>>
>> If people really _do_ care about coverage, arguably the AFL tests are a
>> pollution of that concept. Because they are running the code, but doing
>> a very perfunctory job of testing it. IOW, our coverage of "code that
>> doesn't segfault or trigger ASAN" is improved, but our coverage of "code
>> that has been tested to be correct" is not (and since the tests are
>> lumped together, it's hard to get anything but one number).
>>
>> So I dunno. I remain on the fence about the patch.
>
> Yeah, I have been disturbed by your earlier remark "binary test
> cases that nobody, not even the author, understands", and the above
> summarizes it more clearly.
>
> Continuously running fuzzer tests on the codebase would have value,
> but how exactly are these fuzzballs generated?  Don't they depend on
> the code being tested?  IOW, how effective is a set of fuzzballs
> that forces the code to take more branches in the current codepath
> for the purpose of testing new code that updates the codepath,
> changing the structure of the codeflow?  Unless a new set of
> fuzzballs to match the updated codeflow is generated, wouldn't the
> test coverage with these fuzzballs erode over time, making them less
> and less useful baggage we carry around, without nobody noticing that
> they no longer are effective to help test coverage?

Yes, the testcases are generated based on the feedback from the
instrumented code, so they are very clearly shaped by the code itself.

However, I think it's more useful to think of these testcases not as
"binary test that nobody knows what they are doing", but as "(sometimes
invalid) packfiles which tickle interesting code paths in the packfile
parser".

With this perspective it becomes clearer that while they were generated
from the code, they also in a sense describe the packfile format itself.

I did a few experiments in changing the code of the packfile reader in
various small ways (e.g. deleting a check, reordering some code) to see
the effects of the testcases found by fuzzing, and I have to admit it
was fairly disappointing. The testcases I added did not catch a single
buggy change, whereas the other testcases did catch many of them.

So I guess you are right -- these testcases are maybe really not that
useful as part of the test suite without explicit comparison between the
expected and actual results.

Forget about the patch, I will just put the testcases in a separate repo
instead. Maybe I will convert some of them into "real" tests.

Thanks, and sorry for the noise.


Vegard

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-13 11:07         ` Vegard Nossum
@ 2017-03-13 17:11           ` Junio C Hamano
  2017-03-13 19:13             ` Vegard Nossum
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-03-13 17:11 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Jeff King, Ævar Arnfjörð Bjarmason, Git Mailing List

Vegard Nossum <vegard.nossum@oracle.com> writes:

> However, I think it's more useful to think of these testcases not as
> "binary test that nobody knows what they are doing", but as "(sometimes
> invalid) packfiles which tickle interesting code paths in the packfile
> parser".
>
> With this perspective it becomes clearer that while they were generated
> from the code, they also in a sense describe the packfile format itself.

I do agree with these two paragraphs (that is why I said that
continuously running fuzzer tests on the codebase would have value),
and I really appreciate the effort.

> I did a few experiments in changing the code of the packfile reader in
> various small ways (e.g. deleting a check, reordering some code) to see
> the effects of the testcases found by fuzzing, and I have to admit it
> was fairly disappointing. The testcases I added did not catch a single
> buggy change, whereas the other testcases did catch many of them.

In short, the summary of the above three paragraphs is that we still
do believe the general approach of using fuzzer has value, but your
experiment indicates that data presented in the patch in this thread
weren't particularly good examples to demonstrate the merit?



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC][PATCH] index-pack: add testcases found using AFL
  2017-03-13 17:11           ` Junio C Hamano
@ 2017-03-13 19:13             ` Vegard Nossum
  0 siblings, 0 replies; 13+ messages in thread
From: Vegard Nossum @ 2017-03-13 19:13 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Ævar Arnfjörð Bjarmason, Git Mailing List

On 13/03/2017 18:11, Junio C Hamano wrote:
> Vegard Nossum <vegard.nossum@oracle.com> writes:
>
>> However, I think it's more useful to think of these testcases not as
>> "binary test that nobody knows what they are doing", but as "(sometimes
>> invalid) packfiles which tickle interesting code paths in the packfile
>> parser".
>>
>> With this perspective it becomes clearer that while they were generated
>> from the code, they also in a sense describe the packfile format itself.
>
> I do agree with these two paragraphs (that is why I said that
> continuously running fuzzer tests on the codebase would have value),
> and I really appreciate the effort.
>
>> I did a few experiments in changing the code of the packfile reader in
>> various small ways (e.g. deleting a check, reordering some code) to see
>> the effects of the testcases found by fuzzing, and I have to admit it
>> was fairly disappointing. The testcases I added did not catch a single
>> buggy change, whereas the other testcases did catch many of them.
>
> In short, the summary of the above three paragraphs is that we still
> do believe the general approach of using fuzzer has value, but your
> experiment indicates that data presented in the patch in this thread
> weren't particularly good examples to demonstrate the merit?

Correct.

I thought a priori that the testcases found by AFL would work well as a
regression suite in the face of buggy code changes, but this turned out
not to be the case in practice when I tried introducing bugs on purpose.

The testcases would still have value for the following purposes:

- as a seed for continued fuzzing (as the fuzzing effort would not have
to start over from scratch)

- as a way to quickly find an input that reaches a specific line of code
without having to manually poke at a packfile

- as a basis for writing new testcases with specific expected results


Vegard

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-03-13 19:13 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20170310151556.18490-1-vegard.nossum@oracle.com>
2017-03-10 16:00 ` [RFC][PATCH] index-pack: add testcases found using AFL Vegard Nossum
2017-03-10 19:06 ` Jeff King
2017-03-10 19:34   ` Vegard Nossum
2017-03-10 19:42     ` Jeff King
2017-03-10 21:18       ` Vegard Nossum
2017-03-12 12:24         ` Jeff King
2017-03-10 22:58   ` Ævar Arnfjörð Bjarmason
2017-03-12 12:32     ` Jeff King
2017-03-12 13:44       ` Vegard Nossum
2017-03-12 18:14       ` Junio C Hamano
2017-03-13 11:07         ` Vegard Nossum
2017-03-13 17:11           ` Junio C Hamano
2017-03-13 19:13             ` Vegard Nossum

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.