All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG] bisecting miscounts revisions left to test
@ 2007-03-16 16:14 Uwe Kleine-König
  2007-03-17  0:50 ` Johannes Schindelin
  0 siblings, 1 reply; 18+ messages in thread
From: Uwe Kleine-König @ 2007-03-16 16:14 UTC (permalink / raw)
  To: Git Mailing List

Hello,

I'm just bisecting in the kernel, the last commands + output are:

zeisberg@cassiopeia:~/gsrc/linux-2.6$ git bisect good
Bisecting: 2 revisions left to test after this
[e7b0d26a86943370c04d6833c6edba2a72a6e240] sysfs: reinstate exclusion between method calls and attribute unregistration

zeisberg@cassiopeia:~/gsrc/linux-2.6$ git bisect good
Bisecting: 2 revisions left to test after this
[b810cdfcf91d76f603fd48023aede48ced8e6bed] Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

So after having 2 revisions left to test and an addional test I still
have to test 2 revisions.

For reproducibility:

zeisberg@cassiopeia:~/gsrc/linux-2.6$ git bisect log
git-bisect start
# good: [08e15e81a40e3241ce93b4a43886f3abda184aa6] Linux 2.6.21-rc3
git-bisect good 08e15e81a40e3241ce93b4a43886f3abda184aa6
# bad: [bac6eefe96204d0ad67d144f2511a6fc487aa594] Linux 2.6.21-rc4
git-bisect bad bac6eefe96204d0ad67d144f2511a6fc487aa594
# good: [bdf3aaf9519ddd8a026b5e04e713d2fa673532e5] Pull bugzilla-8066 into release branch
git-bisect good bdf3aaf9519ddd8a026b5e04e713d2fa673532e5
# good: [2cb8a57b9851805883dfe92cf5d88a726134a384] Fix vmi time header bug
git-bisect good 2cb8a57b9851805883dfe92cf5d88a726134a384
# good: [a7c999114ecd0c69bd3970272b64d8842b765b21] BLK_DEV_IDE_CELLEB dependency fix
git-bisect good a7c999114ecd0c69bd3970272b64d8842b765b21
# good: [0bdd0f385a44344f83409b9e00797bfe2596faf8] Merge master.kernel.org:/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
git-bisect good 0bdd0f385a44344f83409b9e00797bfe2596faf8
# good: [6ab27c6bf38d5ff71dafeca77b79e7c284804b75] Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid
git-bisect good 6ab27c6bf38d5ff71dafeca77b79e7c284804b75
# good: [069f8256362b7a17da532f0631cee73b4cfee65b] natsemi: Fix NAPI for interrupt sharing
git-bisect good 069f8256362b7a17da532f0631cee73b4cfee65b
# good: [e7b0d26a86943370c04d6833c6edba2a72a6e240] sysfs: reinstate exclusion between method calls and attribute unregistration
git-bisect good e7b0d26a86943370c04d6833c6edba2a72a6e240

zeisberg@cassiopeia:~/gsrc/linux-2.6$ git version
git version 1.5.0.3

This is git from the Debian package.

Best regards
Uwe

-- 
Uwe Kleine-König

cal 9 1752 | grep 10

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] bisecting miscounts revisions left to test
  2007-03-16 16:14 [BUG] bisecting miscounts revisions left to test Uwe Kleine-König
@ 2007-03-17  0:50 ` Johannes Schindelin
  2007-03-17 13:46   ` Uwe Kleine-König
  2007-03-17 14:12   ` [PATCH] calculate the maximal number of revisions " Uwe Kleine-König
  0 siblings, 2 replies; 18+ messages in thread
From: Johannes Schindelin @ 2007-03-17  0:50 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: Git Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1293 bytes --]

Hi,

On Fri, 16 Mar 2007, Uwe Kleine-König wrote:

> zeisberg@cassiopeia:~/gsrc/linux-2.6$ git bisect good
> Bisecting: 2 revisions left to test after this
> [e7b0d26a86943370c04d6833c6edba2a72a6e240] sysfs: reinstate exclusion between method calls and attribute unregistration
> 
> zeisberg@cassiopeia:~/gsrc/linux-2.6$ git bisect good
> Bisecting: 2 revisions left to test after this
> [b810cdfcf91d76f603fd48023aede48ced8e6bed] Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

The problem is that after the first git-bisect good, it looks like this:

g1 - b2 - b1 - M - B
             /
     g2 - b3

where both g1 and g2 are good, and B is bad. The bisection point is b1. 
(Which makes sense, absolutely.)

After the second git-bisect good, it looks like this:

     g3 - M - B
        /
g2 - b3

and the bisection point is M.

Your problem, of course, is that it is not possible to pick a commit 
_exactly_ in the middle. So, the number is just an _expected_ number of 
revisions to test. I am not sure if it is necessary to make this clearer, 
though, as I fully expect that you just said "What the heck" and continued 
bisecting just fine.

Ciao,
Dscho

P.S.: if Momo's turtle thinks that your name contains no non-ASCIIs, why 
should I?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] bisecting miscounts revisions left to test
  2007-03-17  0:50 ` Johannes Schindelin
@ 2007-03-17 13:46   ` Uwe Kleine-König
  2007-03-17 14:12   ` [PATCH] calculate the maximal number of revisions " Uwe Kleine-König
  1 sibling, 0 replies; 18+ messages in thread
From: Uwe Kleine-König @ 2007-03-17 13:46 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List

Hello,

Johannes Schindelin wrote:
> Hi,
> 
> On Fri, 16 Mar 2007, Uwe Kleine-König wrote:
> 
> > zeisberg@cassiopeia:~/gsrc/linux-2.6$ git bisect good
> > Bisecting: 2 revisions left to test after this
> > [e7b0d26a86943370c04d6833c6edba2a72a6e240] sysfs: reinstate exclusion between method calls and attribute unregistration
> > 
> > zeisberg@cassiopeia:~/gsrc/linux-2.6$ git bisect good
> > Bisecting: 2 revisions left to test after this
> > [b810cdfcf91d76f603fd48023aede48ced8e6bed] Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
> 
> The problem is that after the first git-bisect good, it looks like this:
> 
> g1 - b2 - b1 - M - B
>              /
>      g2 - b3
I wonder if bisect really knows that B is bad.  git bisect visualize
doesn't mark B (i.e. bac6eefe96204d0ad67d144f2511a6fc487aa594) bad.
Maybe the problem is, that this is a tag and not a commit?

After 2 more "bisect good"s I once more get B to test.  If I mark that
as good, I get "... was both good and bad"?

The commandline is:

	git-rev-list --bisect '^069f8256362b7a17da532f0631cee73b4cfee65b' '^08e15e81a40e3241ce93b4a43886f3abda184aa6' '^0bdd0f385a44344f83409b9e00797bfe2596faf8' '^2cb8a57b9851805883dfe92cf5d88a726134a384' '^6ab27c6bf38d5ff71dafeca77b79e7c284804b75' '^a7c999114ecd0c69bd3970272b64d8842b765b21' '^b810cdfcf91d76f603fd48023aede48ced8e6bed' '^bdf3aaf9519ddd8a026b5e04e713d2fa673532e5' '^e7b0d26a86943370c04d6833c6edba2a72a6e240' bac6eefe96204d0ad67d144f2511a6fc487aa594 --

Maybe better "git-rev-list --bisect $good $bad^ --" should be used to
find the bisection point?

> P.S.: if Momo's turtle thinks that your name contains no non-ASCIIs, why 
> should I?
You should differentiate between my user name and my real name.

Best regards
Uwe

-- 
Uwe Kleine-König

$ dc << EOF
[d1-d1<a]sa99d1<a1[rdn555760928P*pz1<a]salax
EOF

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] calculate the maximal number of revisions to test
  2007-03-17  0:50 ` Johannes Schindelin
  2007-03-17 13:46   ` Uwe Kleine-König
@ 2007-03-17 14:12   ` Uwe Kleine-König
  2007-03-17 17:49     ` Johannes Schindelin
  1 sibling, 1 reply; 18+ messages in thread
From: Uwe Kleine-König @ 2007-03-17 14:12 UTC (permalink / raw)
  To: git

Up to now the number printed was correct given that the current revision to
test is bad.

Moreover I think the number printed was always one to high, this is fixed, too.

Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
---
 git-bisect.sh |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/git-bisect.sh b/git-bisect.sh
index b1c3a6b..a5b4fdd 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -150,8 +150,14 @@ bisect_next() {
 	    git-diff-tree --pretty $rev
 	    exit 0
 	fi
-	nr=$(eval "git-rev-list $rev $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
-	echo "Bisecting: $nr revisions left to test after this"
+	nr_bad=$(eval "git-rev-list $rev^ $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
+	nr_good=$(eval "git-rev-list $bad^ ^$rev $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
+	if test "$nr_bad" -ge "$nr_good"; then
+		nr="$nr_bad";
+	else
+		nr="$nr_good";
+	fi;
+	echo "Bisecting: maximal $nr revisions left to test after this"
 	echo "$rev" > "$GIT_DIR/refs/heads/new-bisect"
 	git checkout -q new-bisect || exit
 	mv "$GIT_DIR/refs/heads/new-bisect" "$GIT_DIR/refs/heads/bisect" &&
-- 
1.5.0.2.260.g2eb065


-- 
Uwe Kleine-König

http://www.google.com/search?q=Planck%27s+constant%3D

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] calculate the maximal number of revisions to test
  2007-03-17 14:12   ` [PATCH] calculate the maximal number of revisions " Uwe Kleine-König
@ 2007-03-17 17:49     ` Johannes Schindelin
  2007-03-17 19:58       ` Uwe Kleine-König
  0 siblings, 1 reply; 18+ messages in thread
From: Johannes Schindelin @ 2007-03-17 17:49 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1255 bytes --]

Hi zeisberg,

the subject really could use a "bisect:" prefix, and maybe be a little 
clearer to begin with? Imagine how much sense it makes to read the commit 
messages, and see that some maximal number of revisions is calculated.

On Sat, 17 Mar 2007, Uwe Kleine-König wrote:

> diff --git a/git-bisect.sh b/git-bisect.sh
> index b1c3a6b..a5b4fdd 100755
> --- a/git-bisect.sh
> +++ b/git-bisect.sh
> @@ -150,8 +150,14 @@ bisect_next() {
>  	    git-diff-tree --pretty $rev
>  	    exit 0
>  	fi
> -	nr=$(eval "git-rev-list $rev $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
> -	echo "Bisecting: $nr revisions left to test after this"
> +	nr_bad=$(eval "git-rev-list $rev^ $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
> +	nr_good=$(eval "git-rev-list $bad^ ^$rev $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
> +	if test "$nr_bad" -ge "$nr_good"; then
> +		nr="$nr_bad";
> +	else
> +		nr="$nr_good";
> +	fi;
> +	echo "Bisecting: maximal $nr revisions left to test after this"

How about this instead:

-	echo "Bisecting: $nr revisions left to test after this"
+	echo "Bisecting: approx. $nr revisions left to test after this"

since your version is an approximation (although a conservative one) 
anyway. Hmm?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] calculate the maximal number of revisions to test
  2007-03-17 17:49     ` Johannes Schindelin
@ 2007-03-17 19:58       ` Uwe Kleine-König
  2007-03-21 21:04         ` [PATCH] Bisect: fix calculation of the number of suspicious revisions Uwe Kleine-König
  0 siblings, 1 reply; 18+ messages in thread
From: Uwe Kleine-König @ 2007-03-17 19:58 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Hi gene099,

> the subject really could use a "bisect:" prefix, and maybe be a little 
> clearer to begin with? Imagine how much sense it makes to read the commit 
> messages, and see that some maximal number of revisions is calculated.
OK, you're right.  I think the patch is not as complete as it could be.
I will fix that and resend with a better log.
 
> > diff --git a/git-bisect.sh b/git-bisect.sh
> > index b1c3a6b..a5b4fdd 100755
> > --- a/git-bisect.sh
> > +++ b/git-bisect.sh
> > @@ -150,8 +150,14 @@ bisect_next() {
> >  	    git-diff-tree --pretty $rev
> >  	    exit 0
> >  	fi
> > -	nr=$(eval "git-rev-list $rev $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
> > -	echo "Bisecting: $nr revisions left to test after this"
> > +	nr_bad=$(eval "git-rev-list $rev^ $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
> > +	nr_good=$(eval "git-rev-list $bad^ ^$rev $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
> > +	if test "$nr_bad" -ge "$nr_good"; then
> > +		nr="$nr_bad";
> > +	else
> > +		nr="$nr_good";
> > +	fi;
> > +	echo "Bisecting: maximal $nr revisions left to test after this"
> 
> How about this instead:
> 
> -	echo "Bisecting: $nr revisions left to test after this"
> +	echo "Bisecting: approx. $nr revisions left to test after this"
> 
> since your version is an approximation (although a conservative one) 
> anyway. Hmm?
I prefer a wording that points out a maximum.  Earlier today I thougt
about this, too, but currently I don't remember my alternative.

Best regards
Uwe

-- 
Uwe Kleine-König

http://www.google.com/search?q=12+divided+by+3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] Bisect: fix calculation of the number of suspicious revisions
  2007-03-17 19:58       ` Uwe Kleine-König
@ 2007-03-21 21:04         ` Uwe Kleine-König
  2007-03-21 21:23           ` Junio C Hamano
                             ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Uwe Kleine-König @ 2007-03-21 21:04 UTC (permalink / raw)
  To: git

Up to now the number printed was calculated assuming that the current revision
to test is bad.  Given that it's not possible that this always matches the
number of suspicious revs if the current one is good, the maximum of both is
taken now.

Moreover I think the number printed was always one to high, this is fixed, too.

Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
---
In the mail before I wrote that the former version of this patch was not
complete.  This turned out to be a thinko.  So now I only used a better
Subject, a more verbose log and a hopefully more clear output.

 git-bisect.sh |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/git-bisect.sh b/git-bisect.sh
index b1c3a6b..aeff732 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -150,8 +150,14 @@ bisect_next() {
 	    git-diff-tree --pretty $rev
 	    exit 0
 	fi
-	nr=$(eval "git-rev-list $rev $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
-	echo "Bisecting: $nr revisions left to test after this"
+	nr_bad=$(eval "git-rev-list $rev^ $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
+	nr_good=$(eval "git-rev-list $bad^ ^$rev $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
+	if test "$nr_bad" -ge "$nr_good"; then
+		nr="$nr_bad";
+	else
+		nr="$nr_good";
+	fi;
+	echo "Bisecting: up to $nr suspicious revisions left after this test"
 	echo "$rev" > "$GIT_DIR/refs/heads/new-bisect"
 	git checkout -q new-bisect || exit
 	mv "$GIT_DIR/refs/heads/new-bisect" "$GIT_DIR/refs/heads/bisect" &&
-- 
1.5.0.3

-- 
Uwe Kleine-König

http://www.google.com/search?q=1+hertz+in+sec**-1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] Bisect: fix calculation of the number of suspicious revisions
  2007-03-21 21:04         ` [PATCH] Bisect: fix calculation of the number of suspicious revisions Uwe Kleine-König
@ 2007-03-21 21:23           ` Junio C Hamano
  2007-03-21 21:39             ` Uwe Kleine-König
  2007-03-21 21:34           ` Uwe Kleine-König
  2007-03-21 22:27           ` Linus Torvalds
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2007-03-21 21:23 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: git

Uwe Kleine-König  <ukleinek@informatik.uni-freiburg.de> writes:

> Up to now the number printed was calculated assuming that the
> current revision to test is bad.  Given that it's not possible
> that this always matches the number of suspicious revs if the
> current one is good, the maximum of both is taken now.
>
> Moreover I think the number printed was always one to high,
> this is fixed, too.

I know you mean well, but is it really worth an extra rev-list
for this off-by-one, I wonder?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Bisect: fix calculation of the number of suspicious revisions
  2007-03-21 21:04         ` [PATCH] Bisect: fix calculation of the number of suspicious revisions Uwe Kleine-König
  2007-03-21 21:23           ` Junio C Hamano
@ 2007-03-21 21:34           ` Uwe Kleine-König
  2007-03-21 21:49             ` Junio C Hamano
  2007-03-21 22:27           ` Linus Torvalds
  2 siblings, 1 reply; 18+ messages in thread
From: Uwe Kleine-König @ 2007-03-21 21:34 UTC (permalink / raw)
  To: git

Hello,

Uwe Kleine-König wrote:
>  	mv "$GIT_DIR/refs/heads/new-bisect" "$GIT_DIR/refs/heads/bisect" &&
independant of my suggested patch, I wonder if that line should better
use update-ref.  (This line is older than update-ref.)

Best regards
Uwe

-- 
Uwe Kleine-König

$ dc -e "5735816763073014741799356604682P"

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Bisect: fix calculation of the number of suspicious revisions
  2007-03-21 21:23           ` Junio C Hamano
@ 2007-03-21 21:39             ` Uwe Kleine-König
  0 siblings, 0 replies; 18+ messages in thread
From: Uwe Kleine-König @ 2007-03-21 21:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hello Junio,

Junio C Hamano wrote:
> Uwe Kleine-König  <ukleinek@informatik.uni-freiburg.de> writes:
> 
> > Up to now the number printed was calculated assuming that the
> > current revision to test is bad.  Given that it's not possible
> > that this always matches the number of suspicious revs if the
> > current one is good, the maximum of both is taken now.
> >
> > Moreover I think the number printed was always one to high,
> > this is fixed, too.
> 
> I know you mean well, but is it really worth an extra rev-list
> for this off-by-one, I wonder?
It may be more than one.  E.g. 
   b
  / \
 /   \
a--c--e
 \   /
  \ /
   d

Given a is bad, e is bad, b is rev to test.

I didn't verify it, but I think the current code gives one (as it
assumes b is bad and counts one to much).  If b is good there are two
suspicious left.  I'm sure you can construct an example where it differs
still more.

Best regards
Uwe

-- 
Uwe Kleine-König

http://www.google.com/search?q=half+a+cup+in+teaspoons

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Bisect: fix calculation of the number of suspicious revisions
  2007-03-21 21:34           ` Uwe Kleine-König
@ 2007-03-21 21:49             ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2007-03-21 21:49 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: git

Uwe Kleine-König  <ukleinek@informatik.uni-freiburg.de> writes:

> Uwe Kleine-König wrote:
>>  	mv "$GIT_DIR/refs/heads/new-bisect" "$GIT_DIR/refs/heads/bisect" &&
> independant of my suggested patch, I wonder if that line should better
> use update-ref.  (This line is older than update-ref.)

Yup.  If we can clean things up to use update-ref and
symbolic-ref properly that would be nice.  Re-counting is an
independent issue as you say.

Also we may perhaps want to stop using bisect branch but use
detached HEAD (but still refs/bisect hierarchy)?  This one I am
not so sure, but one less reserved name.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Bisect: fix calculation of the number of suspicious revisions
  2007-03-21 21:04         ` [PATCH] Bisect: fix calculation of the number of suspicious revisions Uwe Kleine-König
  2007-03-21 21:23           ` Junio C Hamano
  2007-03-21 21:34           ` Uwe Kleine-König
@ 2007-03-21 22:27           ` Linus Torvalds
  2007-03-22  1:19             ` Junio C Hamano
  2007-03-22  1:43             ` [PATCH] bisect: show the maximal number of commits to be tested Johannes Schindelin
  2 siblings, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2007-03-21 22:27 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1105 bytes --]



On Wed, 21 Mar 2007, Uwe Kleine-König wrote:
>
> Up to now the number printed was calculated assuming that the current revision
> to test is bad.  Given that it's not possible that this always matches the
> number of suspicious revs if the current one is good, the maximum of both is
> taken now.

How about adding a new flag to "git-rev-list", to make it count both ways? 
Doing this whole

	nr = $(eval "git-rev-list ... | wc -l")

was ugly to begin with, and you just made it doubly ugly.

And the thing is, "git-rev-list --bisect" will obviously already have 
calculated these numbers just to _pick_ the revision in the first place, 
so it's a bit sad then execute it twice more (giving it back the result 
*it* gave us in the first place!).

So we could perhaps change the original

	rev=$(eval "git-rev-list --bisect $good $bad -- $(cat $GIT_DIR/BISECT_NAMES)")

with something nicer.

(In fact, I would also suggest we drop or try to fix BISECT_NAMES support, 
while at it - it never really worked, and iirc it was partly exactly 
*because* of the end-condition not being handled right).

		Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Bisect: fix calculation of the number of suspicious revisions
  2007-03-21 22:27           ` Linus Torvalds
@ 2007-03-22  1:19             ` Junio C Hamano
  2007-03-22  1:43             ` [PATCH] bisect: show the maximal number of commits to be tested Johannes Schindelin
  1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2007-03-22  1:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Uwe Kleine-König, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> (In fact, I would also suggest we drop or try to fix BISECT_NAMES support, 
> while at it - it never really worked, and iirc it was partly exactly 
> *because* of the end-condition not being handled right).

I am not opposed to dropping bisect-names.

I was having fun with this patch, which made an experiment to
bisect between v1.0.0 and v1.5.0 in git.git repository by
alternately saying good and bad go from 3.68 seconds to 1.26
seconds.  Bisecting the kernel between v2.6.18 and v2.6.20 with
the same test script takes 21.84 seconds vs 4.22 seconds.

The idea is that count_distance() is expensive and we would want
to avoid it when we can.  When a commit has only one relevant
parent commit, then the number of commits that commit can reach
is exactly one more than the number of commits that the parent
can reach, so we can avoid running count_distance() on commits
that are on straight single strand of pearls.  On the other
hand, for a merge commit, because the commits reachable from one
parent can be reachable from another parent, you cannot just add
the parents' up (plus one for yourself).

So the patch runs count_distance() on merges (and uses util
field to remember them), and then builds the rest by going
backwards, finding each commit whose distance is not yet known
but the distance of its (sole) parent is known, add one to that,
until we count distance for everybody.

Another small optimization is whenever we find a half-way (that
is, a commit whose distance is half of the number of all
commits), we say that is good enough.

---

Here is the test script that I ran /usr/bin/time on to
benchmark.  It takes three params: path to git-rev-list, a good
commit (i.e. older one), and a bad commit.  The script
alternately says "bisect good" and "bisect bad" and finishes
when it narrows down the suspects to one.

Even with the "half is good enough" optimization in place,
somehow I am getting the same bisect sequence to narrow down
between v2.6.18 and v2.6.20 (the kernel) and v1.0.0 and v1.5.0
(git).

-- >8 -- test script -- >8 --
#!/bin/sh
cmd=$1
good=$(git rev-parse --verify "$2")
bad=$(git rev-parse --verify "$3")
flip=good

run_one () {
	bisect=$($cmd --bisect $bad --not $good)
	ifgood=$(git-rev-list $bad --not $good $bisect | wc -l)
	ifbad=$(git-rev-list $bisect --not $good | wc -l)
	echo "bad $bad"
	echo "good $good"
	echo "ifgood $ifgood"
	echo "ifbad $ifbad"
	echo

	if test "$bisect" = "$bad"
	then
		false
	else
		case "$flip" in
		good)
			good="$good $bisect"
			flip=bad
			;;
		bad)
			bad="$bisect"
			flip=good
			;;
		esac
		true
	fi
}


while run_one
do
	:
done
-- >8 -- end of test script -- >8 --

 builtin-rev-list.c |  169 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 167 insertions(+), 2 deletions(-)

diff --git a/builtin-rev-list.c b/builtin-rev-list.c
index c2db5a5..c43d88d 100644
--- a/builtin-rev-list.c
+++ b/builtin-rev-list.c
@@ -203,6 +203,167 @@ static struct commit_list *find_bisection(struct commit_list *list)
 	return best;
 }
 
+static inline int commit_interesting(struct commit_list *elem)
+{
+	unsigned flags = elem->item->object.flags;
+	if (flags & UNINTERESTING)
+		return 0;
+	return (!revs.prune_fn || (flags & TREECHANGE));
+}
+
+static inline int weight(struct commit_list *elem)
+{
+	return *((int*)(elem->item->util));
+}
+
+static inline void weight_set(struct commit_list *elem, int weight)
+{
+	*((int*)(elem->item->util)) = weight;
+}
+
+static int count_interesting_parents(struct commit_list *elem)
+{
+	int cnt = 0;
+	if (!elem->item->parents)
+		return cnt;
+	for (elem = elem->item->parents; elem; elem = elem->next) {
+		if (commit_interesting(elem))
+			cnt++;
+	}
+	return cnt;
+}
+
+static struct commit_list *find_bisection_2(struct commit_list *list)
+{
+	int n, nr, last_counted, counted, distance;
+	struct commit_list *p, *best;
+	int *weights;
+
+	for (nr = 0, p = list; p; p = p->next) {
+		if (commit_interesting(p))
+			nr++;
+	}
+	weights = xcalloc(nr, sizeof(int*));
+	counted = 0;
+
+	for (n = 0, p = list; p; p = p->next) {
+		if (!commit_interesting(p))
+			continue;
+		if (commit_interesting(p)) {
+			/*
+			 * positive weight is the number of interesting
+			 * commits it can reach, including itself.
+			 * weight = 0 means it has one parent and
+			 * its distance is unknown.
+			 * weight < 0 means it has more than one
+			 * parent and its distance is unknown.
+			 */
+			p->item->util = &weights[n++];
+			switch (count_interesting_parents(p)) {
+			case 0:
+				weight_set(p, 1);
+				counted++;
+				break;
+			case 1:
+				weight_set(p, 0);
+				break;
+			default:
+				weight_set(p, -1);
+				break;
+			}
+		}
+	}
+
+	/*
+	 * If you have only one parent in the resulting set
+	 * then you can reach one commit more than that parent
+	 * can reach.  So we do not have to run the expensive
+	 * count_distance() for single strand of pearls.
+	 *
+	 * However, if you have more than one parents, you cannot
+	 * just add their distance and one for yourself, since
+	 * they usually reach the same ancestor and you would
+	 * end up counting them twice that way.
+	 *
+	 * So we will first count distance of merges the usual
+	 * way, and then fill the blanks using cheaper algorithm.
+	 */
+	for (p = list; p; p = p->next) {
+		if (!commit_interesting(p))
+			continue;
+		n = weight(p);
+		if (0 <= n)
+			continue;
+		distance = count_distance(p);
+		clear_distance(p);
+		weight_set(p, distance);
+		distance = weight(p) * 2;
+		if (nr == distance || (nr+1) == distance) {
+			p->next = NULL;
+			free(weights);
+			return p;
+		}
+		counted++;
+	}
+
+	while (counted < nr) {
+		do {
+			last_counted = counted;
+			for (p = list; p; p = p->next) {
+				struct commit_list *q;
+
+				if (!commit_interesting(p) || 0 < weight(p))
+					continue;
+				if (weight(p) < 0)
+					die("OOPS");
+				for (q = p->item->parents;
+				     q;
+				     q = q->next)
+					if (commit_interesting(q) &&
+					    0 < weight(q))
+						break;
+				if (!q)
+					continue;
+				weight_set(p, weight(q)+1);
+				counted++;
+
+				/*
+				 * Do we happen to have found one that
+				 * has the desired half-weight?
+				 */
+				distance = weight(p) * 2;
+				if (nr == distance || (nr+1) == distance) {
+					p->next = NULL;
+					free(weights);
+					return p;
+				}
+			}
+		} while (counted != last_counted);
+
+	}
+
+	/* Then find the best one */
+	counted = 0;
+	best = list;
+	for (p = list; p; p = p->next) {
+		if (!commit_interesting(p))
+			continue;
+		distance = weight(p);
+		if (nr - distance < distance)
+			distance = nr - distance;
+		if (distance > counted) {
+			best = p;
+			counted = distance;
+		}
+	}
+	if (best) {
+		best->next = NULL;
+	}
+	free(weights);
+	return best;
+}
+
+
 static void read_revisions_from_stdin(struct rev_info *revs)
 {
 	char line[1000];
@@ -285,8 +446,12 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (revs.tree_objects)
 		mark_edges_uninteresting(revs.commits, &revs, show_edge);
 
-	if (bisect_list)
-		revs.commits = find_bisection(revs.commits);
+	if (bisect_list) {
+		if (!revs.prune_fn)
+			revs.commits = find_bisection_2(revs.commits);
+		else
+			revs.commits = find_bisection(revs.commits);
+	}
 
 	traverse_commit_list(&revs, show_commit, show_object);
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH] bisect: show the maximal number of commits to be tested
  2007-03-21 22:27           ` Linus Torvalds
  2007-03-22  1:19             ` Junio C Hamano
@ 2007-03-22  1:43             ` Johannes Schindelin
  2007-03-22  2:05               ` Junio C Hamano
                                 ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Johannes Schindelin @ 2007-03-22  1:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Uwe Kleine-König, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5577 bytes --]


Since git-bisect already asks rev-list to find the midpoint (and rev-list
consequently counts the number of commits), rev-list can pass it the
maximal number of commits.

As a bonus, this avoids an extra call to rev-list.

Miscalculation noticed by Uwe, implementation suggested by Linus.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	On Wed, 21 Mar 2007, Linus Torvalds wrote:

	> On Wed, 21 Mar 2007, Uwe Kleine-König wrote:
	> >
	> > Up to now the number printed was calculated assuming that the 
	> > current revision to test is bad.  Given that it's not possible 
	> > that this always matches the number of suspicious revs if the 
	> > current one is good, the maximum of both is taken now.
	> 
	> How about adding a new flag to "git-rev-list", to make it count 
	> both ways?

	Did I understand you correctly?

 Documentation/git-rev-list.txt |    8 ++++++++
 builtin-rev-list.c             |   40 +++++++++++++++++++++++++++++++++-------
 git-bisect.sh                  |    7 +++----
 3 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index 4f145ea..5f6f2a3 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -26,6 +26,7 @@ SYNOPSIS
 	     [ [\--objects | \--objects-edge] [ \--unpacked ] ]
 	     [ \--pretty | \--header ]
 	     [ \--bisect ]
+	     [ \--bisect-vars ]
 	     [ \--merge ]
 	     [ \--reverse ]
 	     [ \--walk-reflogs ]
@@ -249,6 +250,13 @@ introduces a regression is thus reduced to a binary search: repeatedly
 generate and test new 'midpoint's until the commit chain is of length
 one.
 
+--bisect-vars::
+
+This calculates the same as --bisect, but outputs a line ready to be
+eval'ed by the shell. This line will assign the name of the midpoint
+revision to the variable 'rev', and the maximal number of commits to
+be tested after this revision to 'nr'.
+
 --
 
 Commit Ordering
diff --git a/builtin-rev-list.c b/builtin-rev-list.c
index c2db5a5..b653975 100644
--- a/builtin-rev-list.c
+++ b/builtin-rev-list.c
@@ -36,12 +36,12 @@ static const char rev_list_usage[] =
 "    --abbrev=nr | --no-abbrev\n"
 "    --abbrev-commit\n"
 "  special purpose:\n"
-"    --bisect"
+"    --bisect\n"
+"    --bisect-vars"
 ;
 
 static struct rev_info revs;
 
-static int bisect_list;
 static int show_timestamp;
 static int hdr_termination;
 static const char *header_prefix;
@@ -168,7 +168,8 @@ static void clear_distance(struct commit_list *list)
 	}
 }
 
-static struct commit_list *find_bisection(struct commit_list *list)
+static struct commit_list *find_bisection(struct commit_list *list,
+	int *nr_bad, int *nr_good)
 {
 	int nr, closest;
 	struct commit_list *p, *best;
@@ -198,8 +199,20 @@ static struct commit_list *find_bisection(struct commit_list *list)
 			closest = distance;
 		}
 	}
-	if (best)
+	if (best) {
 		best->next = NULL;
+		/*
+		 * The variable nr_bad holds the number of revisions
+		 * to be tested if "best" is marked as bad, and nr_good
+		 * the number if "best" is marked as good.
+		 *
+		 * Since the given range is <good>..<bad>, we have to
+		 * subtract one, because both <good> and <bad> were
+		 * already tested.
+		 */
+		*nr_bad = nr - closest - 1;
+		*nr_good = closest - 1;
+	}
 	return best;
 }
 
@@ -224,7 +237,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 {
 	struct commit_list *list;
 	int i;
-	int read_from_stdin = 0;
+	int read_from_stdin = 0, bisect_list = 0, bisect_show_vars = 0;
 
 	git_config(git_default_config);
 	init_revisions(&revs, prefix);
@@ -247,6 +260,11 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			bisect_list = 1;
 			continue;
 		}
+		if (!strcmp(arg, "--bisect-vars")) {
+			bisect_list = 1;
+			bisect_show_vars = 1;
+			continue;
+		}
 		if (!strcmp(arg, "--stdin")) {
 			if (read_from_stdin++)
 				die("--stdin given twice?");
@@ -285,8 +303,16 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (revs.tree_objects)
 		mark_edges_uninteresting(revs.commits, &revs, show_edge);
 
-	if (bisect_list)
-		revs.commits = find_bisection(revs.commits);
+	if (bisect_list) {
+		int nr_bad = 0, nr_good = 0;
+		revs.commits = find_bisection(revs.commits, &nr_bad, &nr_good);
+		if (bisect_show_vars) {
+			printf("rev=%s;nr=%d;\n", !revs.commits ? "" :
+				sha1_to_hex(revs.commits->item->object.sha1),
+				nr_bad > nr_good ? nr_bad : nr_good);
+			return 0;
+		}
+	}
 
 	traverse_commit_list(&revs, show_commit, show_object);
 
diff --git a/git-bisect.sh b/git-bisect.sh
index b1c3a6b..cd5e3c9 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -138,9 +138,9 @@ bisect_next() {
 	bisect_autostart
 	bisect_next_check fail
 	bad=$(git-rev-parse --verify refs/bisect/bad) &&
-	good=$(git-rev-parse --sq --revs-only --not \
-		$(cd "$GIT_DIR" && ls refs/bisect/good-*)) &&
-	rev=$(eval "git-rev-list --bisect $good $bad -- $(cat $GIT_DIR/BISECT_NAMES)") || exit
+	good=$(git-rev-parse --revs-only --not \
+		$(cd "$GIT_DIR" && ls refs/bisect/good-*)) || exit
+	eval "$(git-rev-list --bisect-vars $good $bad -- $(cat $GIT_DIR/BISECT_NAMES))" || exit
 	if [ -z "$rev" ]; then
 	    echo "$bad was both good and bad"
 	    exit 1
@@ -150,7 +150,6 @@ bisect_next() {
 	    git-diff-tree --pretty $rev
 	    exit 0
 	fi
-	nr=$(eval "git-rev-list $rev $good -- $(cat $GIT_DIR/BISECT_NAMES)" | wc -l) || exit
 	echo "Bisecting: $nr revisions left to test after this"
 	echo "$rev" > "$GIT_DIR/refs/heads/new-bisect"
 	git checkout -q new-bisect || exit
-- 
1.5.1.rc1.2306.g3d2f

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] bisect: show the maximal number of commits to be tested
  2007-03-22  1:43             ` [PATCH] bisect: show the maximal number of commits to be tested Johannes Schindelin
@ 2007-03-22  2:05               ` Junio C Hamano
  2007-03-22  2:15                 ` Johannes Schindelin
  2007-03-22  6:36               ` [PATCH 1/2] git-rev-list: add --bisect-vars option Junio C Hamano
  2007-03-22  6:42               ` [PATCH 2/2] git-rev-list --bisect: optimization Junio C Hamano
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2007-03-22  2:05 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, Uwe Kleine-König, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Since git-bisect already asks rev-list to find the midpoint (and rev-list
> consequently counts the number of commits), rev-list can pass it the
> maximal number of commits.
>
> As a bonus, this avoids an extra call to rev-list.
>
> Miscalculation noticed by Uwe, implementation suggested by Linus.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>
> 	On Wed, 21 Mar 2007, Linus Torvalds wrote:
>
> 	> On Wed, 21 Mar 2007, Uwe Kleine-König wrote:
> 	> >
> 	> > Up to now the number printed was calculated assuming that the 
> 	> > current revision to test is bad.  Given that it's not possible 
> 	> > that this always matches the number of suspicious revs if the 
> 	> > current one is good, the maximum of both is taken now.
> 	> 
> 	> How about adding a new flag to "git-rev-list", to make it count 
> 	> both ways?
>
> 	Did I understand you correctly?

Why not show both, or at least total?

This and the earlier optimization patch steps on each others
toes, ouch ;-).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] bisect: show the maximal number of commits to be tested
  2007-03-22  2:05               ` Junio C Hamano
@ 2007-03-22  2:15                 ` Johannes Schindelin
  0 siblings, 0 replies; 18+ messages in thread
From: Johannes Schindelin @ 2007-03-22  2:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Uwe Kleine-König, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1259 bytes --]

Hi,

On Wed, 21 Mar 2007, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Since git-bisect already asks rev-list to find the midpoint (and rev-list
> > consequently counts the number of commits), rev-list can pass it the
> > maximal number of commits.
> >
> > As a bonus, this avoids an extra call to rev-list.
> >
> > Miscalculation noticed by Uwe, implementation suggested by Linus.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >
> > 	On Wed, 21 Mar 2007, Linus Torvalds wrote:
> >
> > 	> On Wed, 21 Mar 2007, Uwe Kleine-König wrote:
> > 	> >
> > 	> > Up to now the number printed was calculated assuming that the 
> > 	> > current revision to test is bad.  Given that it's not possible 
> > 	> > that this always matches the number of suspicious revs if the 
> > 	> > current one is good, the maximum of both is taken now.
> > 	> 
> > 	> How about adding a new flag to "git-rev-list", to make it count 
> > 	> both ways?
> >
> > 	Did I understand you correctly?
> 
> Why not show both, or at least total?
> 
> This and the earlier optimization patch steps on each others
> toes, ouch ;-).

Yes... on both accounts. I'll let your patches settle first, and resubmit.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] git-rev-list: add --bisect-vars option.
  2007-03-22  1:43             ` [PATCH] bisect: show the maximal number of commits to be tested Johannes Schindelin
  2007-03-22  2:05               ` Junio C Hamano
@ 2007-03-22  6:36               ` Junio C Hamano
  2007-03-22  6:42               ` [PATCH 2/2] git-rev-list --bisect: optimization Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2007-03-22  6:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, Uwe Kleine-König, git

This adds --bisect-vars option to rev-list.  The output is suitable
for `eval` in shell and defines five variables:

 - bisect_rev is the next revision to test.
 - bisect_nr is the expected number of commits to test after
   bisect_rev is tested.
 - bisect_good is the expected number of commits to test
   if bisect_rev turns out to be good.
 - bisect_bad is the expected number of commits to test
   if bisect_rev turns out to be bad.
 - bisect_all is the number of commits we are bisecting right now.

The documentation text was partly stolen from Joahnnes
Schindelin's patch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * This comes first, on top of 'master', without any of my
   optimization hack.

 Documentation/git-rev-list.txt |   13 +++++++++
 builtin-rev-list.c             |   54 +++++++++++++++++++++++++++++++++++----
 2 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index 4f145ea..3fa45b8 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -26,6 +26,7 @@ SYNOPSIS
 	     [ [\--objects | \--objects-edge] [ \--unpacked ] ]
 	     [ \--pretty | \--header ]
 	     [ \--bisect ]
+	     [ \--bisect-vars ]
 	     [ \--merge ]
 	     [ \--reverse ]
 	     [ \--walk-reflogs ]
@@ -249,6 +250,18 @@ introduces a regression is thus reduced to a binary search: repeatedly
 generate and test new 'midpoint's until the commit chain is of length
 one.
 
+--bisect-vars::
+
+This calculates the same as `--bisect`, but outputs text ready
+to be eval'ed by the shell. These lines will assign the name of
+the midpoint revision to the variable `bisect_rev`, and the
+expected number of commits to be tested after `bisect_rev` is
+tested to `bisect_nr`, the expected number of commits to be
+tested if `bisect_rev` turns out to be good to `bisect_good`,
+the expected number of commits to be tested if `bisect_rev`
+turns out to be bad to `bisect_bad`, and the number of commits
+we are bisecting right now to `bisect_all`.
+
 --
 
 Commit Ordering
diff --git a/builtin-rev-list.c b/builtin-rev-list.c
index c2db5a5..723e4d4 100644
--- a/builtin-rev-list.c
+++ b/builtin-rev-list.c
@@ -36,7 +36,8 @@ static const char rev_list_usage[] =
 "    --abbrev=nr | --no-abbrev\n"
 "    --abbrev-commit\n"
 "  special purpose:\n"
-"    --bisect"
+"    --bisect\n"
+"    --bisect-vars"
 ;
 
 static struct rev_info revs;
@@ -168,7 +169,8 @@ static void clear_distance(struct commit_list *list)
 	}
 }
 
-static struct commit_list *find_bisection(struct commit_list *list)
+static struct commit_list *find_bisection(struct commit_list *list,
+					  int *reaches, int *all)
 {
 	int nr, closest;
 	struct commit_list *p, *best;
@@ -180,21 +182,23 @@ static struct commit_list *find_bisection(struct commit_list *list)
 			nr++;
 		p = p->next;
 	}
+	*all = nr;
 	closest = 0;
 	best = list;
 
 	for (p = list; p; p = p->next) {
-		int distance;
+		int distance, reach;
 
 		if (revs.prune_fn && !(p->item->object.flags & TREECHANGE))
 			continue;
 
-		distance = count_distance(p);
+		distance = reach = count_distance(p);
 		clear_distance(list);
 		if (nr - distance < distance)
 			distance = nr - distance;
 		if (distance > closest) {
 			best = p;
+			*reaches = reach;
 			closest = distance;
 		}
 	}
@@ -225,6 +229,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	struct commit_list *list;
 	int i;
 	int read_from_stdin = 0;
+	int bisect_show_vars = 0;
 
 	git_config(git_default_config);
 	init_revisions(&revs, prefix);
@@ -247,6 +252,11 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			bisect_list = 1;
 			continue;
 		}
+		if (!strcmp(arg, "--bisect-vars")) {
+			bisect_list = 1;
+			bisect_show_vars = 1;
+			continue;
+		}
 		if (!strcmp(arg, "--stdin")) {
 			if (read_from_stdin++)
 				die("--stdin given twice?");
@@ -285,8 +295,40 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (revs.tree_objects)
 		mark_edges_uninteresting(revs.commits, &revs, show_edge);
 
-	if (bisect_list)
-		revs.commits = find_bisection(revs.commits);
+	if (bisect_list) {
+		int reaches = reaches, all = all;
+
+		revs.commits = find_bisection(revs.commits,
+					      &reaches, &all);
+		if (bisect_show_vars) {
+			int cnt;
+			if (!revs.commits)
+				return 1;
+			/*
+			 * revs.commits can reach "reaches" commits among
+			 * "all" commits.  If it is good, then there are
+			 * (all-reaches) commits left to be bisected.
+			 * On the other hand, if it is bad, then the set
+			 * to bisect is "reaches".
+			 * A bisect set of size N has (N-1) commits further
+			 * to test, as we already know one bad one.
+			 */
+			cnt = all-reaches;
+			if (cnt < reaches)
+				cnt = reaches;
+			printf("bisect_rev=%s\n"
+			       "bisect_nr=%d\n"
+			       "bisect_good=%d\n"
+			       "bisect_bad=%d\n"
+			       "bisect_all=%d\n",
+			       sha1_to_hex(revs.commits->item->object.sha1),
+			       cnt - 1,
+			       all - reaches - 1,
+			       reaches - 1,
+			       all);
+			return 0;
+		}
+	}
 
 	traverse_commit_list(&revs, show_commit, show_object);
 
-- 
1.5.1.rc1.610.g3ba7

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] git-rev-list --bisect: optimization
  2007-03-22  1:43             ` [PATCH] bisect: show the maximal number of commits to be tested Johannes Schindelin
  2007-03-22  2:05               ` Junio C Hamano
  2007-03-22  6:36               ` [PATCH 1/2] git-rev-list: add --bisect-vars option Junio C Hamano
@ 2007-03-22  6:42               ` Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2007-03-22  6:42 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, Uwe Kleine-König, git

This improves the performance of revision bisection.

The idea is to avoid rather expensive count_distance() function,
which counts the number of commits that are reachable from any
given commit (including itself) in the set.  When a commit has
only one relevant parent commit, the number of commits the
commit can reach is exactly the number of commits that the
parent can reach plus one; instead of running count_distance()
on commits that are on straight single strand of pearls, we can
just add one to the parents' count.

On the other hand, for a merge commit, because the commits
reachable from one parent can be reachable from another parent,
you cannot just add the parents' counts up plus one for the
commit itself; that would overcount ancestors that are reachable
from more than one parents.

The algorithm used in the patch runs count_distance() on merge
commits, and uses the util field of commit objects to remember
them.  After that, the number of commits reachable from each of
the remaining commits is counted by finding a commit whose count
is not yet known but the count for its (sole) parent is known,
and adding one to the parent's count, until we assign numbers to
everybody.

Another small optimization is whenever we find a half-way (that
is, a commit that can reach exactly half of the commits in the
set), we stop giving counts to remaining commits.

The performance to bisect between v1.0.0 and v1.5.0 in git.git
repository was improved by saying good and bad in turns from
3.68 seconds down to 1.26 seconds.  Bisecting the kernel between
v2.6.18 and v2.6.20 was sped up from 21.84 seconds down to 4.22
seconds.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * This is a rebase on top of updated Johannes's patch, which is
   [1/2] of this series.

   I am not enabling this for path limited case, so it is still
   in a WIP state.  I am not sure if parent rewriting is doing
   the right thing (if so, then we should be able to assume that
   "interesting" commits are already made contiguous and we can
   keep giving the number one higher than the parent to commits
   on a single-strand-of-pearls like this patch does).

 builtin-rev-list.c |  162 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 160 insertions(+), 2 deletions(-)

diff --git a/builtin-rev-list.c b/builtin-rev-list.c
index 723e4d4..111ba60 100644
--- a/builtin-rev-list.c
+++ b/builtin-rev-list.c
@@ -207,6 +207,160 @@ static struct commit_list *find_bisection(struct commit_list *list,
 	return best;
 }
 
+static inline int commit_interesting(struct commit_list *elem)
+{
+	unsigned flags = elem->item->object.flags;
+	if (flags & UNINTERESTING)
+		return 0;
+	return (!revs.prune_fn || (flags & TREECHANGE));
+}
+
+static inline int weight(struct commit_list *elem)
+{
+	return *((int*)(elem->item->util));
+}
+
+static inline void weight_set(struct commit_list *elem, int weight)
+{
+	*((int*)(elem->item->util)) = weight;
+}
+
+static int count_interesting_parents(struct commit_list *elem)
+{
+	int cnt = 0;
+	if (!elem->item->parents)
+		return cnt;
+	for (elem = elem->item->parents; elem; elem = elem->next) {
+		if (commit_interesting(elem))
+			cnt++;
+	}
+	return cnt;
+}
+
+static struct commit_list *find_bisection_2(struct commit_list *list,
+					    int *reaches, int *all)
+{
+	int n, nr, counted, distance;
+	struct commit_list *p, *best;
+	int *weights;
+
+	for (nr = 0, p = list; p; p = p->next) {
+		if (commit_interesting(p))
+			nr++;
+	}
+	*all = nr;
+	weights = xcalloc(nr, sizeof(int*));
+	counted = 0;
+
+	for (n = 0, p = list; p; p = p->next) {
+		if (!commit_interesting(p))
+			continue;
+		if (commit_interesting(p)) {
+			/*
+			 * positive weight is the number of interesting
+			 * commits it can reach, including itself.
+			 * weight = 0 means it has one parent and
+			 * its distance is unknown.
+			 * weight < 0 means it has more than one
+			 * parent and its distance is unknown.
+			 */
+			p->item->util = &weights[n++];
+			switch (count_interesting_parents(p)) {
+			case 0:
+				weight_set(p, 1);
+				counted++;
+				break;
+			case 1:
+				weight_set(p, 0);
+				break;
+			default:
+				weight_set(p, -1);
+				break;
+			}
+		}
+	}
+
+	/*
+	 * If you have only one parent in the resulting set
+	 * then you can reach one commit more than that parent
+	 * can reach.  So we do not have to run the expensive
+	 * count_distance() for single strand of pearls.
+	 *
+	 * However, if you have more than one parents, you cannot
+	 * just add their distance and one for yourself, since
+	 * they usually reach the same ancestor and you would
+	 * end up counting them twice that way.
+	 *
+	 * So we will first count distance of merges the usual
+	 * way, and then fill the blanks using cheaper algorithm.
+	 */
+	for (p = list; p; p = p->next) {
+		if (!commit_interesting(p))
+			continue;
+		n = weight(p);
+		if (0 <= n)
+			continue;
+		distance = count_distance(p);
+		clear_distance(p);
+		weight_set(p, distance);
+
+		/* Is it happen to be at exactly half-way? */
+		distance *= 2;
+		if (nr == distance || (nr+1) == distance) {
+			p->next = NULL;
+			*reaches = weight(p);
+			free(weights);
+			return p;
+		}
+		counted++;
+	}
+
+	while (counted < nr) {
+		for (p = list; p; p = p->next) {
+			struct commit_list *q;
+
+			if (!commit_interesting(p) || 0 < weight(p))
+				continue;
+			for (q = p->item->parents; q; q = q->next)
+				if (commit_interesting(q) && 0 < weight(q))
+					break;
+			if (!q)
+				continue;
+			weight_set(p, weight(q)+1);
+			counted++;
+
+			/* Is it happen to be at exactly half-way? */
+			distance = weight(p) * 2;
+			if (nr == distance || (nr+1) == distance) {
+				p->next = NULL;
+				*reaches = weight(p);
+				free(weights);
+				return p;
+			}
+		}
+	}
+
+	/* Then find the best one */
+	counted = 0;
+	best = list;
+	for (p = list; p; p = p->next) {
+		if (!commit_interesting(p))
+			continue;
+		distance = weight(p);
+		if (nr - distance < distance)
+			distance = nr - distance;
+		if (distance > counted) {
+			best = p;
+			counted = distance;
+			*reaches = weight(p);
+		}
+	}
+	if (best)
+		best->next = NULL;
+	free(weights);
+	return best;
+}
+
 static void read_revisions_from_stdin(struct rev_info *revs)
 {
 	char line[1000];
@@ -298,8 +452,12 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (bisect_list) {
 		int reaches = reaches, all = all;
 
-		revs.commits = find_bisection(revs.commits,
-					      &reaches, &all);
+		if (!revs.prune_fn)
+			revs.commits = find_bisection_2(revs.commits,
+							&reaches, &all);
+		else
+			revs.commits = find_bisection(revs.commits,
+						      &reaches, &all);
 		if (bisect_show_vars) {
 			int cnt;
 			if (!revs.commits)
-- 
1.5.1.rc1.610.g3ba7

^ permalink raw reply related	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-03-22  6:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-03-16 16:14 [BUG] bisecting miscounts revisions left to test Uwe Kleine-König
2007-03-17  0:50 ` Johannes Schindelin
2007-03-17 13:46   ` Uwe Kleine-König
2007-03-17 14:12   ` [PATCH] calculate the maximal number of revisions " Uwe Kleine-König
2007-03-17 17:49     ` Johannes Schindelin
2007-03-17 19:58       ` Uwe Kleine-König
2007-03-21 21:04         ` [PATCH] Bisect: fix calculation of the number of suspicious revisions Uwe Kleine-König
2007-03-21 21:23           ` Junio C Hamano
2007-03-21 21:39             ` Uwe Kleine-König
2007-03-21 21:34           ` Uwe Kleine-König
2007-03-21 21:49             ` Junio C Hamano
2007-03-21 22:27           ` Linus Torvalds
2007-03-22  1:19             ` Junio C Hamano
2007-03-22  1:43             ` [PATCH] bisect: show the maximal number of commits to be tested Johannes Schindelin
2007-03-22  2:05               ` Junio C Hamano
2007-03-22  2:15                 ` Johannes Schindelin
2007-03-22  6:36               ` [PATCH 1/2] git-rev-list: add --bisect-vars option Junio C Hamano
2007-03-22  6:42               ` [PATCH 2/2] git-rev-list --bisect: optimization Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.