Re: Feature request: Exponential search in git bisect

From: Christian Couder <christian.couder@gmail.com>
To: "Manuel Bärenz" <manuel@enigmage.de>
Cc: git <git@vger.kernel.org>
Subject: Re: Feature request: Exponential search in git bisect
Date: Sat, 10 Oct 2020 11:22:49 +0200	[thread overview]
Message-ID: <CAP8UFD2dWrUXJUuiFtgCu6krbnbGGqJZ7K+6KqstGTcNmSc_xQ@mail.gmail.com> (raw)
In-Reply-To: <945ab20e-dcde-540e-83a5-83992c2187b1@enigmage.de>

On Fri, Oct 9, 2020 at 9:35 PM Manuel Bärenz <manuel@enigmage.de> wrote:
>
> This feature was requested 8 years ago and briefly discussed:
> https://public-inbox.org/git/20120318212957.GS1219@chaosreigns.com/
>
>
>   TL;DR
>
> Before doing git bisect, I want to use exponential search to
> automatically find a good commit, in logarithmic time.

Ok, but the conclusion of the above discussion is that the problem
with this idea is being able to distinguish between a commit that is
bad and a commit where the feature that you want to test cannot be
tested for example because it hasn't been implemented yet.

>   Scenario
>
>   * I have a bug in HEAD.
>   * I strongly suspect that it was introduced some time ago, but I don't
>     know when exactly.
>   * I have an automated test that will find the bug if the test can run
>     properly.
>   * Most of the commits in the repository are not testable, i.e. the
>     test doesn't run properly. (E.g. because the feature it tests wasn't
>     introduced yet, refactoring, etc.)
>   * I have no idea what a good commit might be, because I don't know
>     when the first /testable/ good commit is.

Ok, so your test cannot distinguish between a bad commit and a commit
where the feature hasn't been implemented.

When you say that most of the commits in the repository are not
testable, it usually means that the feature has been implemented
relatively recently compared to the history of the project.

> This sounds like a standard application for git bisect. No matter how
> big the repo, with binary search we would expect to find the first bad
> commit in logarithmic time.

Not necessarily. If you cannot distinguish in any way between a bad
commit and a commit where the feature hasn't been implemented, then it
might be very difficult to find a good commit. And you need a good
commit before you can properly bisect.

Suppose for example that the first bad commit (the commit that
introduced the bug you are looking for) is a direct child of the
commit that introduced the feature. Then unless you are very lucky any
binary search using your script will only test either bad commits or
commits where the feature hasn't been implemented yet, and it is
unable to distinguish between them in your scenario.

>   Failed attempt
>
> The zeroth idea might be trying to find the good commit by hand, by
> reading changelogs, by trying some commits, whatever. In some
> situations, this is not feasible. In fact, such situations occur
> frequently for me, for example for undocumented features, unversioned
> rolling releases, incidental complexity leading to older commits not
> being testable, etc.

I understand that it's not always easy to find a good commit.

> The first idea that comes to mind - and it was recommended 8 years agos,
> and I've tried it a few times already - is to simply mark the root
> commit as good. (Now, there might be several roots, but that's a puzzle
> you typically only have to figure out once per repo.) This sounds great
> in theory because binary search should get through the good old commits
> in logarithmic time.

It is not necessarily a good idea to do that. And in the thread, what
was actually suggested by Peff (Jeff King) was to test released
versions (for example 1.6.0, 1.7.0, etc in the Git code base).

> The problem with this approach is that if most older commits are
> untestable, I have to git bisect skip them.

Skipping untestable commits is not always the right thing to do. If
you know that they are untestable because the feature has not been
implemented yet, it might be better to mark them as good instead.

This is by the way what an ideal script should do. It should mark
commits where the feature has not been implemented yet as good, and
should "skip" only the commits where the feature has already been
implemented but that are not testable for another reason, like another
"temporary" bug or "temporary" compile failures.

> This basically kills the
> logarithmic performance, because bisect skip doesn't do binary search,
> but something rather random.

I would say that the actual reason here is that bisect skip is used
when it shouldn't be used.

> Just yesterday I killed a bisect search
> that took hours because it kept skipping and didn't find actual good
> commits.

Ok.

> You might say that instead of skipping old commits, one should mark them
> as good.

Yes, they should be marked as good when the feature has not been
implemented yet.

> That's problematic because then I might accidentally mark a
> commit as good that was already untestable bad. Given that bisect has no
> undo functionality, that can quickly mess up my search. Distinguishing
> untestable good from untestable bad is really hard automatically. I
> shouldn't have to do that.

Sometimes it's not very difficult to test if the feature has been
implemented yet or not. For example with Git you could check if an
option for a command has been implemented by just checking if it
appears in the doc. So the bisection script could first check that and
exit 0 (which means good) if the option doesn't appear in the doc. If
it appears in the doc, then it could do the regular test: "skip" if
untestable, "good" if there is no bug, "bad" otherwise.

> Long story short: Going from the root commit typically isn't feasible.
> I've tried it.

It seems that you might not have tried in the best possible way.

>   Proposal: Exponential search
>
> Instead of going from the root commit, what I do manually before
> starting git bisect is this:
>
> git checkout HEAD~10
> ./test.sh # Says: "Bug is present"
> git checkout HEAD~20
> ./test.sh # Says: "Bug is still present"
> git checkout HEAD~40
> ./test.sh # Says: "Bug is still present"
> [...] # Proceed exponentially
> git checkout HEAD~640
> ./test.sh # Says: "Bug is GONE!"
> git bisect good

If your script cannot distinguish between a bad commit and a commit
where the feature hasn't been implemented, then you were lucky that
that HEAD~640 was good. If the feature had been introduced between
HEAD~639 and HEAD~321 then your script would have said  "Bug is still
present".

> This technique is known as
> https://en.wikipedia.org/wiki/Exponential_search, and it works very well
> in practice. I find a good commit long before I enter the "untestable
> good" region.

If you are lucky, yes, you find a good commit long before you enter
the "untestable good" region.

> But it's tedious to do this manually. In this example, I
> needed to run the script 8 times manually, but of course it can be more
> often, and compiling and running the test may take time. This is ok for
> a one-off search, but it's tedious for regular usages.
>
> Yes, I could wrap this up in a shell script, but I guess there are
> caveats that I didn't think of when the history isn't linear. Maybe
> someone even already has, and I'm unaware of that. But it feels like
> this could be a proper git bisect feature, and a very useful one.

I agree that it could be useful. It could be especially useful if
people have a script that can actually distinguish between a bad
commit and a commit where the feature hasn't been implemented.

So if someone plans to implement that in Git, particular care in the
documentation should be taken to explain the issues related to testing
a feature when it might not be implemented yet.