All of lore.kernel.org
 help / color / mirror / Atom feed
* Question about 'branch -d' safety
@ 2009-12-29 21:54 Nanako Shiraishi
  2009-12-29 22:31 ` Nicolas Sebrecht
  0 siblings, 1 reply; 42+ messages in thread
From: Nanako Shiraishi @ 2009-12-29 21:54 UTC (permalink / raw)
  To: git

'git branch -d other' refuses to remove 'other' branch when it isn't merged to the current branch, and it is a safety feature.

But I often work in the following way in a project with a master branch and some topic branches:

	% git checkout -b topic origin/topic
	% work work work
	% git push origin topic
	% git checkout master
	% git branch -d topic

Because 'topic' is shared with other people who work on the project, and I don't work on any particular topic all the time, I want to keep my local branches small by removing unused branches. As you can guess, after checking out the master branch, the safety of "branch -d" is based on a wrong commit (namely, 'master', that is often behind 'topic') and deletion is refused.

I think the safety feature should check if the branch to be deleted is merged to the remote tracking branch it was forked from, instead of checking the current branch.

What do you think?

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2009-12-29 21:54 Question about 'branch -d' safety Nanako Shiraishi
@ 2009-12-29 22:31 ` Nicolas Sebrecht
  2009-12-30  3:12   ` Nanako Shiraishi
  0 siblings, 1 reply; 42+ messages in thread
From: Nicolas Sebrecht @ 2009-12-29 22:31 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: git, Nicolas Sebrecht

The 30/12/09, Nanako Shiraishi wrote:

> I think the safety feature should check if the branch to be deleted is merged to the remote tracking branch it was forked from, instead of checking the current branch.
> 
> What do you think?

I think we shouldn't. At first, every repository don't have a remote.
This may easily be passed by a "double check" with a logical OR between
the two statements.

But even with it, we would hit some foreign workflow. Think: Bob
directly push to Alice and Alice does the same to Bob. I don't use this
kind of workflow myself but I consider them to be sensible enough to
have our attention.

Now, I'm talking about what users may expect from the default behaviour.
I'm not against a new configuration variable. It would certainly give
more granularity to the expectation of "what is safe for me".


-- 
Nicolas Sebrecht

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2009-12-29 22:31 ` Nicolas Sebrecht
@ 2009-12-30  3:12   ` Nanako Shiraishi
  2009-12-30  6:43     ` Junio C Hamano
                       ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Nanako Shiraishi @ 2009-12-30  3:12 UTC (permalink / raw)
  To: Nicolas Sebrecht; +Cc: git

Quoting Nicolas Sebrecht <nicolas.s.dev@gmx.fr>

> The 30/12/09, Nanako Shiraishi wrote:
>
>> I think the safety feature should check if the branch to be deleted is merged to the remote tracking branch it was forked from, instead of checking the current branch.
>> 
>> What do you think?
>
> I think we shouldn't. At first, every repository don't have a remote.
> This may easily be passed by a "double check" with a logical OR between
> the two statements.

Sorry, I was unclear. What I meant was that checking with the branch the branch to be deleted was forked from, if and only if such a branch exists. Otherwise, we can keep using the old default behavior to check with the current branch.

> But even with it, we would hit some foreign workflow. Think: Bob
> directly push to Alice and Alice does the same to Bob. I don't use this
> kind of workflow myself but I consider them to be sensible enough to
> have our attention.

Here is what I think about your scenario.

Bob directly pushes to Alice and Alice does the same to Bob, both to their refs/remotes/<other person>/ tracking branches, and their own local branches fork from the remote tracking branches that keep track of other person's work. You would still want the check based on that tracking branch, not based on 'master' that has no relationship with the branches they are exchanging.

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2009-12-30  3:12   ` Nanako Shiraishi
@ 2009-12-30  6:43     ` Junio C Hamano
  2009-12-30 21:08     ` Nicolas Sebrecht
  2010-07-10  6:55     ` Clemens Buchacher
  2 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2009-12-30  6:43 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: Nicolas Sebrecht, git

Nanako Shiraishi <nanako3@lavabit.com> writes:

> Quoting Nicolas Sebrecht <nicolas.s.dev@gmx.fr>
>
>> The 30/12/09, Nanako Shiraishi wrote:
>>
>>> I think the safety feature should check if the branch to be deleted is merged to the remote tracking branch it was forked from, instead of checking the current branch.
>>> 
>>> What do you think?
>>
>> I think we shouldn't. At first, every repository don't have a remote.
>> This may easily be passed by a "double check" with a logical OR between
>> the two statements.
>
> Sorry, I was unclear. What I meant was that checking with the branch the
> branch to be deleted was forked from, if and only if such a branch
> exists. Otherwise, we can keep using the old default behavior to check
> with the current branch.

Back when the original "safety valve" was added, there was no mechanical
"this branch _always_ merges from/rebases on that other one" settings.
The users were supposed to keep track of the correspondence, and the
canonical workflow was "checkout this && merge that && branch -d that".

But I actually think it is quite a natural thing to do in year 2010 to
change the safety valve as suggested.

A patch to do so would look like this.

-- >8 --
branch -d: base the "already-merged" safety on the branch it merges with

When a branch is marked to merge with another ref (e.g. local 'next' that
merges from and pushes back to origin's 'next', with 'branch.next.merge'
set to 'refs/heads/next'), it makes little sense to base the "branch -d"
safety, whose purpose is not to lose commits that are not merged to other
branches, on the current branch.  It is much more sensible to check if it
is merged with the other branch it merges with.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-branch.c  |   64 ++++++++++++++++++++++++++++++++++++++++++++--------
 t/t3200-branch.sh |   26 +++++++++++++++++++++
 2 files changed, 80 insertions(+), 10 deletions(-)

diff --git a/builtin-branch.c b/builtin-branch.c
index c87e63b..d2a35fe 100644
--- a/builtin-branch.c
+++ b/builtin-branch.c
@@ -93,9 +93,60 @@ static const char *branch_get_color(enum color_branch ix)
 	return "";
 }
 
+static int branch_merged(int kind, const char *name,
+			 struct commit *rev, struct commit *head_rev)
+{
+	/*
+	 * This checks whether the merge bases of branch and HEAD (or
+	 * the other branch this branch builds upon) contains the
+	 * branch, which means that the branch has already been merged
+	 * safely to HEAD (or the other branch).
+	 */
+	struct commit *reference_rev = NULL;
+	const char *reference_name = NULL;
+	int merged;
+
+	if (kind == REF_LOCAL_BRANCH) {
+		struct branch *branch = branch_get(name);
+		unsigned char sha1[20];
+
+		if (branch &&
+		    branch->merge &&
+		    branch->merge[0] &&
+		    branch->merge[0]->dst &&
+		    (reference_name =
+		     resolve_ref(branch->merge[0]->dst, sha1, 1, NULL)) != NULL)
+			reference_rev = lookup_commit_reference(sha1);
+	}
+	if (!reference_rev)
+		reference_rev = head_rev;
+
+	merged = in_merge_bases(rev, &reference_rev, 1);
+
+	/*
+	 * After the safety valve is fully redefined to "check with
+	 * upstream, if any, otherwise with HEAD", we should just
+	 * return the result of the in_merge_bases() above without
+	 * any of the following code, but during the transition period,
+	 * a gentle reminder is in order.
+	 */
+	if ((head_rev != reference_rev) &&
+	    in_merge_bases(rev, &head_rev, 1) != merged) {
+		if (merged)
+			warning("deleting branch '%s' that has been merged to\n"
+				"         '%s', but it is not yet merged to HEAD.",
+				name, reference_name);
+		else
+			warning("not deleting branch '%s' that is not yet merged to\n"
+				"         '%s', even though it is merged to HEAD.",
+				name, reference_name);
+	}
+	return merged;
+}
+
 static int delete_branches(int argc, const char **argv, int force, int kinds)
 {
-	struct commit *rev, *head_rev = head_rev;
+	struct commit *rev, *head_rev = NULL;
 	unsigned char sha1[20];
 	char *name = NULL;
 	const char *fmt, *remote;
@@ -148,15 +199,8 @@ static int delete_branches(int argc, const char **argv, int force, int kinds)
 			continue;
 		}
 
-		/* This checks whether the merge bases of branch and
-		 * HEAD contains branch -- which means that the HEAD
-		 * contains everything in both.
-		 */
-
-		if (!force &&
-		    !in_merge_bases(rev, &head_rev, 1)) {
-			error("The branch '%s' is not an ancestor of "
-			      "your current HEAD.\n"
+		if (!force && !branch_merged(kinds, bname.buf, rev, head_rev)) {
+			error("The branch '%s' is not fully merged.\n"
 			      "If you are sure you want to delete it, "
 			      "run 'git branch -D %s'.", bname.buf, bname.buf);
 			ret = 1;
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index d59a9b4..e0b7605 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -468,4 +468,30 @@ test_expect_success 'detect misconfigured autosetuprebase (no value)' '
 	git config --unset branch.autosetuprebase
 '
 
+test_expect_success 'attempt to delete a branch without base and unmerged to HEAD' '
+	git checkout my9 &&
+	git config --unset branch.my8.merge &&
+	test_must_fail git branch -d my8
+'
+
+test_expect_success 'attempt to delete a branch merged to its base' '
+	# we are on my9 which is the initial commit; traditionally
+	# we would not have allowed deleting my8 that is not merged
+	# to my9, but it is set to track master that already has my8
+	git config branch.my8.merge refs/heads/master &&
+	git branch -d my8
+'
+
+test_expect_success 'attempt to delete a branch merged to its base' '
+	git checkout master &&
+	echo Third >>A &&
+	git commit -m "Third commit" A &&
+	git branch -t my10 my9 &&
+	git branch -f my10 HEAD^ &&
+	# we are on master which is at the third commit, and my10
+	# is behind us, so traditionally we would have allowed deleting
+	# it; but my10 is set to track my9 that is further behind.
+	test_must_fail git branch -d my10
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2009-12-30  3:12   ` Nanako Shiraishi
  2009-12-30  6:43     ` Junio C Hamano
@ 2009-12-30 21:08     ` Nicolas Sebrecht
  2010-07-10  6:55     ` Clemens Buchacher
  2 siblings, 0 replies; 42+ messages in thread
From: Nicolas Sebrecht @ 2009-12-30 21:08 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: Nicolas Sebrecht, git

The 30/12/09, Nanako Shiraishi wrote:
> Quoting Nicolas Sebrecht <nicolas.s.dev@gmx.fr>
> 
> > But even with it, we would hit some foreign workflow. Think: Bob
> > directly push to Alice and Alice does the same to Bob. I don't use this
> > kind of workflow myself but I consider them to be sensible enough to
> > have our attention.
> 
> Here is what I think about your scenario.
> 
> Bob directly pushes to Alice and Alice does the same to Bob, both to their refs/remotes/<other person>/ tracking branches

We can't say. They both may have refs/remotes/<same_id> .

Bob:
  $ git branch -d a_branch

Now, Bob has the "I don't want to loose" commit known in
refs/remotes/<same_id> only.

Alice, some time later:
  $ git push -f <same_id> <foo>:<a_branch>

Bob "lose" his commit.


I admit it is a very uncommon use case and Bob can still use the reflog.

-- 
Nicolas Sebrecht

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2009-12-30  3:12   ` Nanako Shiraishi
  2009-12-30  6:43     ` Junio C Hamano
  2009-12-30 21:08     ` Nicolas Sebrecht
@ 2010-07-10  6:55     ` Clemens Buchacher
  2010-07-10 21:40       ` Jonathan Nieder
  2 siblings, 1 reply; 42+ messages in thread
From: Clemens Buchacher @ 2010-07-10  6:55 UTC (permalink / raw)
  To: git; +Cc: Nicolas Sebrecht, Nanako Shiraishi, Jonathan Nieder

[-- Attachment #1: Type: text/plain, Size: 2175 bytes --]

Hi,

I'm sorry to revive this old thread. But I recently hit this
behavior myself, using the workflow described below, and I ended up
deleting an unmerged branch with git branch -d.

This discussion is about v1.7.0-rc0~18^2 (branch -d: base the
"already-merged" safety on the branch it merges with, 2009-12-29),
which teaches 'git branch' to delete a branch with -d, if it is
merged into the branch it is tracking, even if it is not merged
into any local branch.

On Wed, Dec 30, 2009 at 12:12:38PM +0900, Nanako Shiraishi wrote:
> Quoting Nicolas Sebrecht <nicolas.s.dev@gmx.fr>
> 
> > But even with it, we would hit some foreign workflow. Think: Bob
> > directly push to Alice and Alice does the same to Bob. I don't use this
> > kind of workflow myself but I consider them to be sensible enough to
> > have our attention.
> 
> Here is what I think about your scenario.
> 
> Bob directly pushes to Alice and Alice does the same to Bob, both
> to their refs/remotes/<other person>/ tracking branches, and
> their own local branches fork from the remote tracking branches
> that keep track of other person's work. You would still want the
> check based on that tracking branch, not based on 'master' that
> has no relationship with the branches they are exchanging.

So I had a branch 'topic' in two repositories, neither of which was
in any way authoritative. They were both upstreams to each other.
And the 'topic' branch in each tracked the 'topic' branch in the
other.

At some point, I used branch -d to delete branch 'topic' in one
repo. Later, I did the same in the other. I then realized that
after I do git remote prune origin, the branch will be gone,
without ever having being merged into another branch.

I often use branch -d simply to check if I still forgot something
about this branch, or if it has become redundant. With the above
scenario, I can not rely on branch -d safety any more.

On the other hand, I do think the new behavior is quite useful in
general. Maybe the real solution is a reflog for deleted branch
heads, rather than being too careful about whether or not a branch
can be deleted.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-10  6:55     ` Clemens Buchacher
@ 2010-07-10 21:40       ` Jonathan Nieder
  2010-07-10 21:57         ` Jakub Narebski
  0 siblings, 1 reply; 42+ messages in thread
From: Jonathan Nieder @ 2010-07-10 21:40 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: git, Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

Clemens Buchacher wrote:

>          Maybe the real solution is a reflog for deleted branch
> heads, rather than being too careful about whether or not a branch
> can be deleted.

Yes, I think so.  Would there be any bad side effects to this?

diff --git a/refs.c b/refs.c
index b540067..7ed0154 100644
--- a/refs.c
+++ b/refs.c
@@ -1084,7 +1084,6 @@ int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
 	 */
 	ret |= repack_without_ref(refname);
 
-	unlink_or_warn(git_path("logs/%s", lock->ref_name));
 	invalidate_cached_refs();
 	unlock_ref(lock);
 	return ret;
-- 

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-10 21:40       ` Jonathan Nieder
@ 2010-07-10 21:57         ` Jakub Narebski
  2010-07-10 22:17           ` Jonathan Nieder
  2010-07-11  6:55           ` Clemens Buchacher
  0 siblings, 2 replies; 42+ messages in thread
From: Jakub Narebski @ 2010-07-10 21:57 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Clemens Buchacher, git, Nicolas Sebrecht, Nanako Shiraishi,
	Junio C Hamano

Jonathan Nieder <jrnieder@gmail.com> writes:
> Clemens Buchacher wrote:
> 
> >          Maybe the real solution is a reflog for deleted branch
> > heads, rather than being too careful about whether or not a branch
> > can be deleted.
> 
> Yes, I think so.  Would there be any bad side effects to this?
> 
> diff --git a/refs.c b/refs.c
> index b540067..7ed0154 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1084,7 +1084,6 @@ int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
>  	 */
>  	ret |= repack_without_ref(refname);
>  
> -	unlink_or_warn(git_path("logs/%s", lock->ref_name));
>  	invalidate_cached_refs();
>  	unlock_ref(lock);
>  	return ret;

It's unfortunately not so easy.  The problem you have to solve is D/F
conflict: if you have 'foo/bar' branch, you can't create 'foo' branch,
but after deleting 'foo/bar' you want to be able to create 'foo'
branch and reflog for 'foo' branch.

Therefore there were ideas for various "Attic"-like areas for reflogs
for deleted branches...

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-10 21:57         ` Jakub Narebski
@ 2010-07-10 22:17           ` Jonathan Nieder
  2010-07-11  6:55           ` Clemens Buchacher
  1 sibling, 0 replies; 42+ messages in thread
From: Jonathan Nieder @ 2010-07-10 22:17 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Clemens Buchacher, git, Nicolas Sebrecht, Nanako Shiraishi,
	Junio C Hamano

Jakub Narebski wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:

>> +++ b/refs.c
>> @@ -1084,7 +1084,6 @@ int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
>>  	 */
>>  	ret |= repack_without_ref(refname);
>>  
>> -	unlink_or_warn(git_path("logs/%s", lock->ref_name));
>>  	invalidate_cached_refs();
>>  	unlock_ref(lock);
>>  	return ret;
>
> It's unfortunately not so easy.  The problem you have to solve is D/F
> conflict: if you have 'foo/bar' branch, you can't create 'foo' branch,
> but after deleting 'foo/bar' you want to be able to create 'foo'
> branch and reflog for 'foo' branch.

Thanks for the pointer.  Here are some more, for the interested.

 - http://thread.gmane.org/gmane.comp.version-control.git/144250/focus=145366
   the dead_refs namespace

 - http://thread.gmane.org/gmane.comp.version-control.git/73838/focus=73877 
   why to make and why not to make the change

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-10 21:57         ` Jakub Narebski
  2010-07-10 22:17           ` Jonathan Nieder
@ 2010-07-11  6:55           ` Clemens Buchacher
  2010-07-11  7:16             ` Jakub Narebski
  1 sibling, 1 reply; 42+ messages in thread
From: Clemens Buchacher @ 2010-07-11  6:55 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 708 bytes --]

On Sat, Jul 10, 2010 at 02:57:35PM -0700, Jakub Narebski wrote:
> 
> It's unfortunately not so easy.  The problem you have to solve is D/F
> conflict: if you have 'foo/bar' branch, you can't create 'foo' branch,
> but after deleting 'foo/bar' you want to be able to create 'foo'
> branch and reflog for 'foo' branch.

I'm going to read up on Jonathan's pointers. But I do not really
see the problem above. If the reflog already exists, the new branch
simpliy continues using it. So if the branch is re-created, it's as
if the branch had never been deleted.

Or possibly we add a special reflog entry which points the branch
to a zero sha to signify that the branch had been deleted.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-11  6:55           ` Clemens Buchacher
@ 2010-07-11  7:16             ` Jakub Narebski
  2010-07-11  8:48               ` Julian Phillips
  2010-07-11 13:37               ` Clemens Buchacher
  0 siblings, 2 replies; 42+ messages in thread
From: Jakub Narebski @ 2010-07-11  7:16 UTC (permalink / raw)
  To: Clemens Buchacher
  Cc: Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

Dnia niedziela 11. lipca 2010 08:55, Clemens Buchacher napisał:
> On Sat, Jul 10, 2010 at 02:57:35PM -0700, Jakub Narebski wrote:
> > 
> > It's unfortunately not so easy.  The problem you have to solve is D/F
> > conflict: if you have 'foo/bar' branch, you can't create 'foo' branch,
> > but after deleting 'foo/bar' you want to be able to create 'foo'
> > branch and reflog for 'foo' branch.
> 
> I'm going to read up on Jonathan's pointers. But I do not really
> see the problem above. If the reflog already exists, the new branch
> simpliy continues using it. So if the branch is re-created, it's as
> if the branch had never been deleted.

The problem is, that when you have 'foo/bar' branch, then you have
'foo/bar' reflog.  When you delete branch 'foo/bar', but do not delete
'foo/bar' reflog (only add to it branch deletion event), and then you
want to create 'foo' branch, git wouldn't be able to create reflog
fo 'foo' because of directory / file (D/F) conflict: there is 'foo/'
directory preventing file 'foo' from being created.
 
> Or possibly we add a special reflog entry which points the branch
> to a zero sha to signify that the branch had been deleted.

That is a good idea anyway.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-11  7:16             ` Jakub Narebski
@ 2010-07-11  8:48               ` Julian Phillips
  2010-07-11 13:37               ` Clemens Buchacher
  1 sibling, 0 replies; 42+ messages in thread
From: Julian Phillips @ 2010-07-11  8:48 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Clemens Buchacher, Jonathan Nieder, git, Nicolas Sebrecht,
	Nanako Shiraishi, Junio C Hamano

On Sun, 11 Jul 2010 09:16:26 +0200, Jakub Narebski <jnareb@gmail.com>
wrote:
> Dnia niedziela 11. lipca 2010 08:55, Clemens Buchacher napisał:
>> On Sat, Jul 10, 2010 at 02:57:35PM -0700, Jakub Narebski wrote:
>> > 
>> > It's unfortunately not so easy.  The problem you have to solve is D/F
>> > conflict: if you have 'foo/bar' branch, you can't create 'foo'
branch,
>> > but after deleting 'foo/bar' you want to be able to create 'foo'
>> > branch and reflog for 'foo' branch.
>> 
>> I'm going to read up on Jonathan's pointers. But I do not really
>> see the problem above. If the reflog already exists, the new branch
>> simpliy continues using it. So if the branch is re-created, it's as
>> if the branch had never been deleted.
> 
> The problem is, that when you have 'foo/bar' branch, then you have
> 'foo/bar' reflog.  When you delete branch 'foo/bar', but do not delete
> 'foo/bar' reflog (only add to it branch deletion event), and then you
> want to create 'foo' branch, git wouldn't be able to create reflog
> fo 'foo' because of directory / file (D/F) conflict: there is 'foo/'
> directory preventing file 'foo' from being created.

You could just change the reflog code so that if foo is a directory it
looks for/creates foo/.reflog (the dot is to prevent a clash with a valid
branch name)?

>> Or possibly we add a special reflog entry which points the branch
>> to a zero sha to signify that the branch had been deleted.
> 
> That is a good idea anyway.

-- 
Julian

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-11  7:16             ` Jakub Narebski
  2010-07-11  8:48               ` Julian Phillips
@ 2010-07-11 13:37               ` Clemens Buchacher
  2010-07-11 18:41                 ` Junio C Hamano
  2010-07-17  9:30                 ` Clemens Buchacher
  1 sibling, 2 replies; 42+ messages in thread
From: Clemens Buchacher @ 2010-07-11 13:37 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 5518 bytes --]

On Sun, Jul 11, 2010 at 09:16:26AM +0200, Jakub Narebski wrote:
> 
> The problem is, that when you have 'foo/bar' branch, then you have
> 'foo/bar' reflog.  When you delete branch 'foo/bar', but do not delete
> 'foo/bar' reflog (only add to it branch deletion event), and then you
> want to create 'foo' branch, git wouldn't be able to create reflog
> fo 'foo' because of directory / file (D/F) conflict: there is 'foo/'
> directory preventing file 'foo' from being created.

Right, of course. So how about this?

Clemens

-->o--
Date: Sun, 11 Jul 2010 12:37:06 +0200
Subject: [PATCH/RFC] graveyard for reflogs

Instead of removing reflogs together with refs, keep them around
and reuse them as a starting point, if a ref of the same name is
created again.

When a ref is deleted, a ~ is appended to each directory of the
corresponding reflog entry, e.g., refs/heads/branch is renamed to
refs~/heads~/branch. Since ~ must not be used for ref names,
directory/file conflicts are thus avoided.

Known issues:

 - The reflog cannot be accessed while the ref does not exist.

 - Older git versions will not resurrect the reflog, and therefore
   leave the renamed reflog behind.

 - Breaks t7701, because git-expire tries to lock log entries,
   which fails because ~ is an illegal character for refs.

 - Breaks t9300.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---
 reflog-walk.c   |    9 +++++--
 refs.c          |   67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 t/t1450-fsck.sh |    1 +
 3 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 4879615..9415ac8 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -228,15 +228,18 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 		return;
 	}
 
-	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+	if (commit_reflog->recno < 0) {
 		commit->parents = NULL;
 		return;
 	}
 
+	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
+	commit_info->commit = (struct commit *)parse_object(reflog->nsha1);
+	if (!commit_info->commit)
+		die("Invalid reflog entry");
+
 	commit->parents = xcalloc(sizeof(struct commit_list), 1);
 	commit->parents->item = commit_info->commit;
 	commit->object.flags &= ~(ADDED | SEEN | SHOWN);
diff --git a/refs.c b/refs.c
index b540067..78c48ad 100644
--- a/refs.c
+++ b/refs.c
@@ -1052,6 +1052,70 @@ static int repack_without_ref(const char *refname)
 	return commit_lock_file(&packlock);
 }
 
+static void pronounce_dead(struct strbuf *dead_ref, const char *ref)
+{
+	const char *p = ref;
+	while (*p) {
+		if (*p == '/') {
+			int len = p - ref;
+			strbuf_add(dead_ref, ref, len);
+			if (len > 0)
+				strbuf_addstr(dead_ref, "~/");
+			ref = p + 1;
+		}
+		p++;
+	}
+	strbuf_addstr(dead_ref, ref);
+}
+
+static int bury_log(const char *ref)
+{
+	struct strbuf dead_ref = STRBUF_INIT;
+	struct stat loginfo;
+
+	if (lstat(git_path("logs/%s", ref), &loginfo))
+		return 0;
+
+	pronounce_dead(&dead_ref, ref);
+
+	if (S_ISLNK(loginfo.st_mode))
+		return error("reflog for %s is a symlink", ref);
+
+	if (safe_create_leading_directories(git_path("logs/%s", dead_ref.buf)))
+		return error("unable to create directory for %s", dead_ref.buf);
+
+	if (rename(git_path("logs/%s", ref), git_path("logs/%s", dead_ref.buf)))
+		return error("unable to move logfile to logs/%s: %s",
+			dead_ref.buf, strerror(errno));
+
+	strbuf_release(&dead_ref);
+	return 0;
+}
+
+static int resurrect_log(const char *ref)
+{
+	struct strbuf dead_ref = STRBUF_INIT;
+	struct stat loginfo;
+
+	pronounce_dead(&dead_ref, ref);
+
+	if (lstat(git_path("logs/%s", dead_ref.buf), &loginfo))
+		return 0;
+
+	if (S_ISLNK(loginfo.st_mode))
+		return error("reflog for %s is a symlink", dead_ref.buf);
+
+	if (safe_create_leading_directories(git_path("logs/%s", ref)))
+		return error("unable to create directory for %s", ref);
+
+	if (rename(git_path("logs/%s", dead_ref.buf), git_path("logs/%s", ref)))
+		return error("unable to move logfile to logs/%s: %s",
+			ref, strerror(errno));
+
+	strbuf_release(&dead_ref);
+	return 0;
+}
+
 int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
 {
 	struct ref_lock *lock;
@@ -1084,7 +1148,7 @@ int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
 	 */
 	ret |= repack_without_ref(refname);
 
-	unlink_or_warn(git_path("logs/%s", lock->ref_name));
+	bury_log(lock->ref_name);
 	invalidate_cached_refs();
 	unlock_ref(lock);
 	return ret;
@@ -1384,6 +1448,7 @@ int write_ref_sha1(struct ref_lock *lock,
 		unlock_ref(lock);
 		return -1;
 	}
+	resurrect_log(lock->ref_name);
 	invalidate_cached_refs();
 	if (log_ref_write(lock->ref_name, lock->old_sha1, sha1, logmsg) < 0 ||
 	    (strcmp(lock->ref_name, lock->orig_ref_name) &&
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 759cf12..65f160e 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -55,6 +55,7 @@ test_expect_success 'object with bad sha1' '
 	grep "$sha.*corrupt" out &&
 	rm -f .git/objects/$new &&
 	git update-ref -d refs/heads/bogus &&
+	rm -f .git/logs/refs~/heads~/bogus &&
 	git read-tree -u --reset HEAD
 '
 
-- 
1.7.1.571.gba4d01


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-11 13:37               ` Clemens Buchacher
@ 2010-07-11 18:41                 ` Junio C Hamano
  2010-07-11 19:05                   ` Jakub Narebski
                                     ` (2 more replies)
  2010-07-17  9:30                 ` Clemens Buchacher
  1 sibling, 3 replies; 42+ messages in thread
From: Junio C Hamano @ 2010-07-11 18:41 UTC (permalink / raw)
  To: Clemens Buchacher
  Cc: Jakub Narebski, Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi

Clemens Buchacher <drizzd@aon.at> writes:

> Known issues:
>
>  - The reflog cannot be accessed while the ref does not exist.
>
>  - Older git versions will not resurrect the reflog, and therefore
>    leave the renamed reflog behind.
>
>  - Breaks t7701, because git-expire tries to lock log entries,
>    which fails because ~ is an illegal character for refs.
>
>  - Breaks t9300.

Perhaps a few obvious ones are missing?

 - It is no longer possible to get rid of objects associated with the
   history of a branch by deleting the branch and then running gc.

 - It is no longer possible to trust git that you would start a history of
   a branch afresh when you create one.  If you happened to have an
   unrelated branch with the same name in the past, the new branch
   inherits reflog entries when it shouldn't.

What problem are you guys really trying to solve?


> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
> index 759cf12..65f160e 100755
> --- a/t/t1450-fsck.sh
> +++ b/t/t1450-fsck.sh
> @@ -55,6 +55,7 @@ test_expect_success 'object with bad sha1' '
>  	grep "$sha.*corrupt" out &&
>  	rm -f .git/objects/$new &&
>  	git update-ref -d refs/heads/bogus &&
> +	rm -f .git/logs/refs~/heads~/bogus &&
>  	git read-tree -u --reset HEAD
>  '

What is this about???

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-11 18:41                 ` Junio C Hamano
@ 2010-07-11 19:05                   ` Jakub Narebski
  2010-07-11 22:02                   ` Will Palmer
  2010-07-12 18:47                   ` Clemens Buchacher
  2 siblings, 0 replies; 42+ messages in thread
From: Jakub Narebski @ 2010-07-11 19:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Clemens Buchacher, Jonathan Nieder, git, Nicolas Sebrecht,
	Nanako Shiraishi

Dnia niedziela 11. lipca 2010 20:41, Junio C Hamano napisał:
> Clemens Buchacher <drizzd@aon.at> writes:
> 
> > Known issues:
> >
> >  - The reflog cannot be accessed while the ref does not exist.
> >
> >  - Older git versions will not resurrect the reflog, and therefore
> >    leave the renamed reflog behind.
> >
> >  - Breaks t7701, because git-expire tries to lock log entries,
> >    which fails because ~ is an illegal character for refs.
> >
> >  - Breaks t9300.
> 
> Perhaps a few obvious ones are missing?
> 
>  - It is no longer possible to get rid of objects associated with the
>    history of a branch by deleting the branch and then running gc.

I think that reflog for deleted branches would still follow gc.reflogExpire
and gr.reflogExpireUnreachable, and that it would be automatically deleted
when it becomes empty.  I don't know if provided implementation assures
this.

>  - It is no longer possible to trust git that you would start a history of
>    a branch afresh when you create one.  If you happened to have an
>    unrelated branch with the same name in the past, the new branch
>    inherits reflog entries when it shouldn't.

That's a valid concern.

> 
> What problem are you guys really trying to solve?

The problem is that if you delete branch by accident, it might be hard
to restore it, and its reflog vanishes forever.  And with relaxed
protection of 'git branch -d' this might happen more frequently (in the
case of circular origin's).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-11 18:41                 ` Junio C Hamano
  2010-07-11 19:05                   ` Jakub Narebski
@ 2010-07-11 22:02                   ` Will Palmer
  2010-07-12 18:47                   ` Clemens Buchacher
  2 siblings, 0 replies; 42+ messages in thread
From: Will Palmer @ 2010-07-11 22:02 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Clemens Buchacher, Jakub Narebski, Jonathan Nieder, git,
	Nicolas Sebrecht, Nanako Shiraishi

On Sun, 2010-07-11 at 11:41 -0700, Junio C Hamano wrote:
> What problem are you guys really trying to solve?

Almost every porcelain git operation is, in general, "safe". Safety is
defined as "will not lose data". I tried to come up with a list of
"unsafe" commands recently, and others in #git pointed out that a lot of
commands could be made safe by inserting "wait 90 days" into various
points. Other than that, there's reset --hard. That's pretty damn safe,
in my opinion.

The reflog is the safety-net which allows us to give users help without
worrying about (as we might in other VCSs) someone typing a command
slightly wrong and blowing away their repository. "Don't worry, you
aren't going to break it" is a git truism which helps us to encourage
git newbies to experiment a little. Even history rewriting is a "safe"
operation.

Personally, I thought "git branch -d" was quite safe enough, as the HEAD
reflog would still have a copy of the tip, but as was pointed out to me,
this would only be there if the branch in question had been checked-out
recently, and of course the reflog for that branch would also be
destroyed. I don't really mind the second half of that, but if we're
going to solve one problem, why not solve both?

The main thing which is lost when a ref is deleted (assuming no gc,
which doesn't happen during a delete without anything else happening) is
the association between the various objects which will show up during
"git fsck" and the name of the ref which was deleted. Without that loss
of association the conversation can move from:
	"don't panic! If you haven't done anything else yet, you can type "git
fsck" and go through the list of dangling commits one by one until you
find it. Disable automatic garbage collection in the mean-time just to
be safe."

to (wishful thinking):
	"don't panic! try typing "git restore-ref -i" and select the branch you
deleted"


What's really problem I'm really trying to solve? I suppose part is the
"I want porcelain commands to be safe, at least for 30 days", and the
rest is that I like having more ways to have a conversation which starts
off in panic end with "oh, thanks! git is awesome!"

-- 
-- Will

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-11 18:41                 ` Junio C Hamano
  2010-07-11 19:05                   ` Jakub Narebski
  2010-07-11 22:02                   ` Will Palmer
@ 2010-07-12 18:47                   ` Clemens Buchacher
  2010-07-12 23:50                     ` Junio C Hamano
  2 siblings, 1 reply; 42+ messages in thread
From: Clemens Buchacher @ 2010-07-12 18:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jakub Narebski, Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi

[-- Attachment #1: Type: text/plain, Size: 1416 bytes --]

On Sun, Jul 11, 2010 at 11:41:54AM -0700, Junio C Hamano wrote:
> 
>  - It is no longer possible to get rid of objects associated with the
>    history of a branch by deleting the branch and then running gc.

Yes, this should be fixed in the final implementation. The reflog
should be accessible as usual, and should expire normally, even if
the branch has been deleted.

>  - It is no longer possible to trust git that you would start a history of
>    a branch afresh when you create one.  If you happened to have an
>    unrelated branch with the same name in the past, the new branch
>    inherits reflog entries when it shouldn't.

The reflog does not guarantee any such relatedness with previous
entries anyways. Doing "git reset --hard <unrelated-branch>" is not
much different from deleting the branch and creating a new
unrelated branch of the same name.

And if it's really an issue, we can always make the reflog stop
listing entries at the latest branch creation entry, and make
listing beyound that optional. (I actually had to change the reflog
walker to make it continue beyound branch creation.)

> What problem are you guys really trying to solve?

The reflog protects you from almost all involuntary loss of
information. Except for recovery after git branch -D and even git
branch -d, as per the scenario I described in the original posting
of this thread.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-12 18:47                   ` Clemens Buchacher
@ 2010-07-12 23:50                     ` Junio C Hamano
  2010-07-13  7:13                       ` Clemens Buchacher
  0 siblings, 1 reply; 42+ messages in thread
From: Junio C Hamano @ 2010-07-12 23:50 UTC (permalink / raw)
  To: Clemens Buchacher
  Cc: Jakub Narebski, Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi

Clemens Buchacher <drizzd@aon.at> writes:

>> What problem are you guys really trying to solve?
>
> The reflog protects you from almost all involuntary loss of
> information.

And you call "git branch -[dD]" involuntary?

I am not entirely unsympathetic to add "git branch --undelete frotz" to our
vocabulary, but then we should leave users an easy way to really remove
things, and it shouldn't be "git branch -d --i-really-mean-it frotz".

It would probably be more like "git branch -d frotz" followed by "git
branch --purge frotz" or "git branch --purge \*" (if we name the operation
to remove cruft for undelete "purge", that is).

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-12 23:50                     ` Junio C Hamano
@ 2010-07-13  7:13                       ` Clemens Buchacher
  2010-07-13  8:00                         ` Will Palmer
  0 siblings, 1 reply; 42+ messages in thread
From: Clemens Buchacher @ 2010-07-13  7:13 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jakub Narebski, Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi

[-- Attachment #1: Type: text/plain, Size: 885 bytes --]

On Mon, Jul 12, 2010 at 04:50:32PM -0700, Junio C Hamano wrote:
> 
> I am not entirely unsympathetic to add "git branch --undelete frotz" to our
> vocabulary, but then we should leave users an easy way to really remove
> things, and it shouldn't be "git branch -d --i-really-mean-it frotz".
> 
> It would probably be more like "git branch -d frotz" followed by "git
> branch --purge frotz" or "git branch --purge \*" (if we name the operation
> to remove cruft for undelete "purge", that is).

Why would we need that? Even now, it is not enough to do "git
branch -D frotz; git gc --prune". You need to expire the reflog,
since HEAD may still have a reference to it. So instead you would
need

 git branch -D frotz
 git reflog expire --expire=now HEAD frotz
 git gc --prune

Purging a branch becomes just a special case of purging anything
from history.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-13  7:13                       ` Clemens Buchacher
@ 2010-07-13  8:00                         ` Will Palmer
  2010-07-13  8:30                           ` Johannes Sixt
  2010-07-13 22:21                           ` Clemens Buchacher
  0 siblings, 2 replies; 42+ messages in thread
From: Will Palmer @ 2010-07-13  8:00 UTC (permalink / raw)
  To: Clemens Buchacher
  Cc: Junio C Hamano, Jakub Narebski, Jonathan Nieder, git,
	Nicolas Sebrecht, Nanako Shiraishi

On Tue, 2010-07-13 at 09:13 +0200, Clemens Buchacher wrote:
> Why would we need that? Even now, it is not enough to do "git
> branch -D frotz; git gc --prune". You need to expire the reflog,
> since HEAD may still have a reference to it. So instead you would
> need
> 
>  git branch -D frotz
>  git reflog expire --expire=now HEAD frotz
>  git gc --prune
> 
> Purging a branch becomes just a special case of purging anything
> from history.
> 
> Clemens

With that in mind, would it not be enough to simply remove the ref, but
not the reflog, and change the logic to let the reflog live on until
either pruned or all of its entries have expired normally? On branch
creation, if a reflog already exists, it would remain untouched. Reflog
entries would ALWAYS expire using the same expire-time rules, regardless
of whether or not the ref itself exists, and everything just makes
sense. Simple to follow, simple to prune, branch -d becomes just as safe
as any other operation. I like this.

I understand the argument for "but you may wind up creating an unrelated
branch with the same name", but I can't really think of any cases where
it would matter, and it would be impossible to guess whether or not a
new branch were related to the old one. If we'd really want to take care
of that sort of thing, we'd need to move into two-dimensional reflog
territory, with something like somebranch@{3,5} to mean "3 reflogs ago,
5 states ago", which goes too far and looks ugly.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-13  8:00                         ` Will Palmer
@ 2010-07-13  8:30                           ` Johannes Sixt
  2010-07-13  9:00                             ` Will Palmer
  2010-07-13 22:21                           ` Clemens Buchacher
  1 sibling, 1 reply; 42+ messages in thread
From: Johannes Sixt @ 2010-07-13  8:30 UTC (permalink / raw)
  To: Will Palmer
  Cc: Clemens Buchacher, Junio C Hamano, Jakub Narebski,
	Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi

Am 7/13/2010 10:00, schrieb Will Palmer:
> On Tue, 2010-07-13 at 09:13 +0200, Clemens Buchacher wrote:
>> Why would we need that? Even now, it is not enough to do "git
>> branch -D frotz; git gc --prune". You need to expire the reflog,
>> since HEAD may still have a reference to it. So instead you would
>> need
>>
>>  git branch -D frotz
>>  git reflog expire --expire=now HEAD frotz
>>  git gc --prune
>>
>> Purging a branch becomes just a special case of purging anything
>> from history.
> 
> With that in mind, would it not be enough to simply remove the ref, but
> not the reflog,

With that in mind, shouldn't it be exactly the other way around, i.e.,
dump the reflog (the objects are still referenced from HEAD's reflog), but
keep the ref around in some attic, just in case the branch is so old that
its reflog was empty and its objects would otherwise be pruned right away?

-- Hannes

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-13  8:30                           ` Johannes Sixt
@ 2010-07-13  9:00                             ` Will Palmer
  0 siblings, 0 replies; 42+ messages in thread
From: Will Palmer @ 2010-07-13  9:00 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Clemens Buchacher, Junio C Hamano, Jakub Narebski,
	Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi

On Tue, 2010-07-13 at 10:30 +0200, Johannes Sixt wrote:
> With that in mind, shouldn't it be exactly the other way around, i.e.,
> dump the reflog (the objects are still referenced from HEAD's reflog), but
> keep the ref around in some attic, just in case the branch is so old that
> its reflog was empty and its objects would otherwise be pruned right away?
> 
> -- Hannes

It would probably be more sensible to add a "deleted" entry to the
reflog just prior to removing the ref (yes, this will make HEAD@{1}
equal to HEAD@{0} in many cases). Keeping the ref itself around in an
attic doesn't make sense - the reflog can act as a better "attic"
anyway, if we stop deleting nonempty reflogs just because they don't
have a live ref associated with them.

Having a separate "attic" just runs into problems of "what if you delete
the ref twice?" and adds an entirely separate mechanism for tracking
something which we already have a perfectly good method of tracking: the
previous state of a ref.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-13  8:00                         ` Will Palmer
  2010-07-13  8:30                           ` Johannes Sixt
@ 2010-07-13 22:21                           ` Clemens Buchacher
  1 sibling, 0 replies; 42+ messages in thread
From: Clemens Buchacher @ 2010-07-13 22:21 UTC (permalink / raw)
  To: Will Palmer
  Cc: Junio C Hamano, Jakub Narebski, Jonathan Nieder, git,
	Nicolas Sebrecht, Nanako Shiraishi

[-- Attachment #1: Type: text/plain, Size: 846 bytes --]

On Tue, Jul 13, 2010 at 09:00:23AM +0100, Will Palmer wrote:
> 
> With that in mind, would it not be enough to simply remove the ref, but
> not the reflog, and change the logic to let the reflog live on until
> either pruned or all of its entries have expired normally?

That's the idea. The only reason why this patch renames the reflog
after deletion is due to the current directory style layout of refs
and reflogs. In particular, it is not possible to have refs foo/bar
and foo at the same time, since that would be a directory/file
conflict.

I therefore rename reflogs of deleted refs to foo~/bar, for
example, which for all intents and purposes is still the reflog of
foo/bar, only now it's not in the way of creating a new ref foo.
The fact that we rename the ref internally should be transparent to
the user.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-11 13:37               ` Clemens Buchacher
  2010-07-11 18:41                 ` Junio C Hamano
@ 2010-07-17  9:30                 ` Clemens Buchacher
  2010-07-18  0:43                   ` Jonathan Nieder
  2010-07-19 18:06                   ` Junio C Hamano
  1 sibling, 2 replies; 42+ messages in thread
From: Clemens Buchacher @ 2010-07-17  9:30 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 2513 bytes --]

Hi,

I am not sure how to proceed with this. It seems like there are
many arguments against it, and few for it. So in order to get a
better idea where we stand, I have summarized the arguments so far
below. 

I do not know which solution would make everyone happy. But I think
there are a few advantages to be considered, and the remaining
issues are not insurmountable.

Clemens
---

Pros and cons for "undeleting branches":

+ safety net

It should not be easy to lose information with git. In most cases
branches can be restored from the HEAD reflog, but it can be
complicated if the branch has never been checked out. Even the
trivial case is not as obvious as searching for a "Branch deleted"
entry in the reflog.

+ less dependant on git branch -d

Since git branch -d deletes branches which have been merged to a
remote tracking branch, it does no longer guarantee that the branch
is still available in history locally, and if the branch is also
deleted remotely, running git remote prune removes it entirely. In
that sense, it cannot be considered safe any more. Being able to
restore deleted branches would make this a non-issue.

+ automatically prune remote tracking branches

Once it is easy to restore deleted branches, there is no need to
keep around remote tracking branches which have been deleted on the
remote. They can be pruned automatically on git fetch.

- only a convenience

Considering the many potential issues for this corner case, why
bother implementing it? Deleted branches can be restored from the
HEAD reflog anyways.

- implementation complexity

Due to the possibility of D/F conflicts, "deleted reflogs" have to
be renamed internally. This makes the reflog implementation more
complex.

- user interface complexity

How to prune deleted branches? Currently, it is enough to do "git
branch -D branch; git gc --prune" in order to get rid of the branch
objects, at least if the HEAD reflog does not contain it or has
expired. Consider for example adding a remote, and removing it
again. This operation would leave a bunch of deleted branches,
which potentially occupy a lot of disk space.

How to find and restore deleted branches? If the reflog is used to
record deleted branches, and a new branch of the same name is
created, the reflog contains entries from unrelated branches. [1]
If the deleted reflogs are stored in an attic, how do we reference
those?

[1] Also, what happens if that new branch is renamed?

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-17  9:30                 ` Clemens Buchacher
@ 2010-07-18  0:43                   ` Jonathan Nieder
  2010-07-18 11:55                     ` Jakub Narebski
  2010-07-19 18:06                   ` Junio C Hamano
  1 sibling, 1 reply; 42+ messages in thread
From: Jonathan Nieder @ 2010-07-18  0:43 UTC (permalink / raw)
  To: Clemens Buchacher
  Cc: Jakub Narebski, git, Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

Clemens Buchacher wrote:

> I am not sure how to proceed with this.

Maybe: try out the patch locally for a while.  Such an experiment
would make it easier to see whether this feature because indispensible
and how the details should be to work best in practice.

But be careful!  Once you have learned to rely on the
master@{before.deletion} facility, you might end up trigger-happy with
‘branch -d’ and get into trouble when faced with copies of git without
the safety valve.

To respond to your philosophical points:

> + safety net

I consider it very good that git is willing to _lose_ information
rather than get overloaded with it.  On this laptop, if I try

 linux$ git fetch git://git.kernel.org/pub/scm/linux/kernel/git/sfr/linux-next.git
 linux$ git checkout FETCH_HEAD
        ... hack away ...

in a repository with quickly-expiring HEAD reflog, then the pack size
stays barely comfortable.  Perhaps a person that regularly builds
short-lived topic branches off such an unstable branch would be
similarly grateful that the objects do not need to stay long.

So: this aspect is a minus for me.

> + less dependant on git branch -d

I have not find “log -g” too useful for protection against fat-finger
accidents (“fsck --lost-found” and “push” take care of that) so it
seems odd to see safety touted as its main benefit.  What the reflog
and HEAD@{4} syntax prevent me from losing is a train of thought.
After deleting a branch I have just merged, if I start to wonder about
“that branch I just deleted”, reading reflog entries to find it
is the last thing I want to be doing.

In other words, it is not enough that “branch -d” be safe.  Yes, the
old branch tip is an ancestor of origin/master, but that does not
bring me much closer to finding it.  To be able to refer to
master@{5.minutes.ago} directly would be much more convenient.

+.

> + automatically prune remote tracking branches

I’m not comfortable with the idea yet.

0.

> - only a convenience
> - implementation complexity

Your implementation seemed simple.  I don’t think implementation
complexity is the fatal flaw.

0.

> - user interface complexity

Keeping the reflog would arguably make the syntax for specifying
revisions more consistent.

On the other hand, there are some hard policy questions:

 * Should pre-deletion commits expire more quickly?

 * What happens when you rename one branch to take on the
   name of a deleted one?

 * What happens when you create a new branch to take the
   place of a deleted one?

 * Should refs outside the refs/heads/ namespace
   (like refs/bisect/ and refs/original/) use the same
   policy as refs/heads/?

0.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-18  0:43                   ` Jonathan Nieder
@ 2010-07-18 11:55                     ` Jakub Narebski
  2010-07-18 20:27                       ` Will Palmer
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Narebski @ 2010-07-18 11:55 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Clemens Buchacher, git, Nicolas Sebrecht, Nanako Shiraishi,
	Junio C Hamano

On Sun, 18 Jul 2010, Jonathan Nieder wrote:

> On the other hand, there are some hard policy questions:
> 
>  * Should pre-deletion commits expire more quickly?

I think you should have an option to configure expiration of reflogs
of deleted branches separately, just like there is currently a way to
configure longer expiration time of HEAD reflog, or for stash.

With this on low disk size / low quota machine you would be able to
turn this feature (this protection) off.
 
>  * What happens when you rename one branch to take on the
>    name of a deleted one?

The same as with D/F conflict.  If you rename branch 'foo' to 'bar',
you also rename its reflog, but logs/refs/heads/bar would not conflict
with reflog for deleted branch, logs/refs~/heads~/bar (if you had 
deleted branch 'bar').

>  * What happens when you create a new branch to take the
>    place of a deleted one?

It would either be the same as above, or it would get reflog from 
deleted branch.  The former would be better, if there was in git some 
command to resurrect deleted branch.
 
>  * Should refs outside the refs/heads/ namespace
>    (like refs/bisect/ and refs/original/) use the same
>    policy as refs/heads/?

I guess that expire policy should be configured per ref, or per 
namespace.  Perhaps gc.<pattern>.reflogExpireDeleted ?  But then
gc.<pattern>.reflogExpireUnreachableDeleted would be long name, as
if compensating for two-letter section name ;-)

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-18 11:55                     ` Jakub Narebski
@ 2010-07-18 20:27                       ` Will Palmer
  2010-07-18 23:19                         ` Jakub Narebski
  0 siblings, 1 reply; 42+ messages in thread
From: Will Palmer @ 2010-07-18 20:27 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jonathan Nieder, Clemens Buchacher, git, Nicolas Sebrecht,
	Nanako Shiraishi, Junio C Hamano

On Sun, 2010-07-18 at 13:55 +0200, Jakub Narebski wrote:
> On Sun, 18 Jul 2010, Jonathan Nieder wrote:
> The same as with D/F conflict.  If you rename branch 'foo' to 'bar',
> you also rename its reflog, but logs/refs/heads/bar would not conflict
> with reflog for deleted branch, logs/refs~/heads~/bar (if you had 
> deleted branch 'bar').

having any kind of suffix like refs~/heads~/bar is just asking for
someone to delete a branch twice.

Regardless of whether or not it would be difficult to implement, I think
the ideal (for me) would be:
 1) existing syntax should work as-is. I like my reflog and don't want
people screwing with it ;)
 2) new syntax should be added for "some/ref@{..even if it's been
renamed or deleted..}", perhaps an entry in the reflog which points to
the "old name" / fact that it's been resurrected, for
moves/resurrections
 3) getting rid of something "for real" should be a simple command away.
If the steps are getting too numerous (delete, expire, /then/ prune?
Anything else?) then perhaps we just need a "git shred <ref>" which
takes care of listing out what will be involved, giving you lots of
chances to abort, etc, and which maybe is less of a sledgehammer than
the current method.


>From the discussion, I think the things we agree that we all want are:
 1) For git to not lose data by accident. Maybe there is disagreement as
to whether or not this already happens. I think the information
currently lost is: a) the name (ie, ease of finding the "lost" commit)
b) the reflog (which I think is utterly lost on delete at the moment?).
Both of these are often useless after a delete, but sometimes wanted.
 2) A straightforward way to restore information which has not been lost
(again, perhaps there is disagreement as to whether this already exists)
 3) A way to distinguish between "the reflog entries of a deleted ref
with the same name as our new ref" and "the new ref's entries" (these
are the "attic" discussions, etc) (not applicable to the current
situation)
 4) A way to really get rid of things which are no longer wanted. This
should be straightforward and have sane defaults so that, as mentioned,
adding then removing the wrong remote doesn't leave you with an extra
repos' worth of data for the next six months. (obviously this one
already exists)

Did I miss anything?

-- 
-- Will

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-18 20:27                       ` Will Palmer
@ 2010-07-18 23:19                         ` Jakub Narebski
  2010-07-19  7:12                           ` Will Palmer
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Narebski @ 2010-07-18 23:19 UTC (permalink / raw)
  To: wmpalmer
  Cc: Jonathan Nieder, Clemens Buchacher, git, Nicolas Sebrecht,
	Nanako Shiraishi, Junio C Hamano

On Sun, 18 Jul 2010, Will Palmer wrote:
> On Sun, 2010-07-18 at 13:55 +0200, Jakub Narebski wrote:
> > On Sun, 18 Jul 2010, Jonathan Nieder wrote:

> > The same as with D/F conflict.  If you rename branch 'foo' to 'bar',
> > you also rename its reflog, but logs/refs/heads/bar would not conflict
> > with reflog for deleted branch, logs/refs~/heads~/bar (if you had 
> > deleted branch 'bar').
> 
> having any kind of suffix like refs~/heads~/bar is just asking for
> someone to delete a branch twice.

I don't understand what you wanted to say here.  Using the

  $GIT_DIR/logs/refs~/heads~/bar

(and not $GIT_DIR/refs~/heads~/bar) as a reflog for a deleted branch
'bar' is an implementation detail.  You wouldn't see refs~/heads~/bar
when listing branches... well, perhaps 'git branch --list-deleted'
could be used to list deleted branches (by scanning for reflogs).
 
> Regardless of whether or not it would be difficult to implement, I think
> the ideal (for me) would be:
>
> 1) existing syntax should work as-is. I like my reflog and don't want
>    people screwing with it ;)

Nobody proposes anything different.

> 2) new syntax should be added for "some/ref@{..even if it's been
>    renamed or deleted..}", perhaps an entry in the reflog which points
>    to the "old name" / fact that it's been resurrected, for
>    moves/resurrections 

Here comes hard part: naming ;-)  Or rather chosing API for refering to
deleted branches and their reflogs.

> 3) getting rid of something "for real" should be a simple command away.
>    If the steps are getting too numerous (delete, expire, /then/ prune?
>    Anything else?) then perhaps we just need a "git shred <ref>" which
>    takes care of listing out what will be involved, giving you lots of
>    chances to abort, etc, and which maybe is less of a sledgehammer than
>    the current method.

Depending on default safety net, i.e. on default expire time for
reflogs of deleted branches (perhaps just 1 day, or at most 7 days?),
it could be as simple as 'git gc'.

> 
> From the discussion, I think the things we agree that we all want are:
>
> 1) For git to not lose data by accident. Maybe there is disagreement
>    as to whether or not this already happens. I think the information
>    currently lost is: 
>      a) the name (i.e. ease of finding the "lost" commit)

Usually HEAD reflog would be enough at least when "safely" deleting
with 'git branch -d' (and not '-D'), but it is not the case after
relaxing rules a bit.
      
>      b) the reflog (which I think is utterly lost on delete at the
>         moment?). 

Yes it is.  So even if you manage to resurrect branch thanks to HEAD
reflog or tools like contrib/git-resurrect.sh, the data in reflog would
be lost.

> Both of these are often useless after a delete, but sometimes wanted.

True.

> 2) A straightforward way to restore information which has not been
>    lost (again, perhaps there is disagreement as to whether this
>    already exists)

'git branch <old name> <old SHA-1>' is good, though it is perhaps not
quite that straighforward.

> 3) A way to distinguish between "the reflog entries of a deleted ref
>    with the same name as our new ref" and "the new ref's entries"
>    (these are the "attic" discussions, etc) (not applicable to the
>    current situation)

This is a problem of API design.  Probably overloading @{...} yet again
(@{<n>}, @{<time>}, @{-<n>}, @{upstream}), though using '~' as suffix
(or ~@{<n>} when refering to reflog of deleted branch) could be
a solution.

> 4) A way to really get rid of things which are no longer wanted. This
>    should be straightforward and have sane defaults so that, as mentioned,
>    adding then removing the wrong remote doesn't leave you with an extra
>    repos' worth of data for the next six months. (obviously this one
>    already exists)

Well, 'git -c gc.reflogexpire=0 -c gc.reflogexpireunreachable=0 gc --prune=now'
is a bit mouthfull ;-)

> 
> Did I miss anything?

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-18 23:19                         ` Jakub Narebski
@ 2010-07-19  7:12                           ` Will Palmer
  2010-07-19 11:01                             ` Jakub Narebski
  2010-07-19 17:16                             ` Joshua Jensen
  0 siblings, 2 replies; 42+ messages in thread
From: Will Palmer @ 2010-07-19  7:12 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jonathan Nieder, Clemens Buchacher, git, Nicolas Sebrecht,
	Nanako Shiraishi, Junio C Hamano

On Mon, 2010-07-19 at 01:19 +0200, Jakub Narebski wrote:
> On Sun, 18 Jul 2010, Will Palmer wrote:
> > On Sun, 2010-07-18 at 13:55 +0200, Jakub Narebski wrote:
> > having any kind of suffix like refs~/heads~/bar is just asking for
> > someone to delete a branch twice.
> 
> I don't understand what you wanted to say here.  Using the
> 
>   $GIT_DIR/logs/refs~/heads~/bar
> 
> (and not $GIT_DIR/refs~/heads~/bar) as a reflog for a deleted branch
> 'bar' is an implementation detail.  You wouldn't see refs~/heads~/bar
> when listing branches... well, perhaps 'git branch --list-deleted'
> could be used to list deleted branches (by scanning for reflogs).
>  

git branch -d integration
# git renames refs/heads/integration to refs~/heads~/integration
git co -b integration sometopic
# git creates refs/heads/integration, unrelated to the old one
(do some work)
(merge into the main branch)
git branch -d integration

Now what?
git renames refs/heads/integration to ... what?
- does the old refs~/heads~/integration get clobbered? If that's ever
okay, why are we even having this discussion?
- does the "old reflog" stuff get combined? If that's ever okay, why
even have an extra reflog, instead of just using the reflog we already
have?
- do we move everything else one step down, so refs~/heads~/integration
becomes refs~2/heads~2/integration? (ie: 2-dimensional reflog, which
sounds rather too fancy, to me)


-- 
-- Will

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19  7:12                           ` Will Palmer
@ 2010-07-19 11:01                             ` Jakub Narebski
  2010-07-19 17:16                             ` Joshua Jensen
  1 sibling, 0 replies; 42+ messages in thread
From: Jakub Narebski @ 2010-07-19 11:01 UTC (permalink / raw)
  To: wmpalmer
  Cc: Jonathan Nieder, Clemens Buchacher, git, Nicolas Sebrecht,
	Nanako Shiraishi, Junio C Hamano

On Mon, 19 Jul 2010, Will Palmer wrote:
> On Mon, 2010-07-19 at 01:19 +0200, Jakub Narebski wrote:
> > On Sun, 18 Jul 2010, Will Palmer wrote:
> > > On Sun, 2010-07-18 at 13:55 +0200, Jakub Narebski wrote:

> > > having any kind of suffix like refs~/heads~/bar is just asking for
> > > someone to delete a branch twice.
> > 
> > I don't understand what you wanted to say here.  Using the
> > 
> >   $GIT_DIR/logs/refs~/heads~/bar
> > 
> > (and not $GIT_DIR/refs~/heads~/bar) as a reflog for a deleted branch
> > 'bar' is an implementation detail.  You wouldn't see refs~/heads~/bar
> > when listing branches... well, perhaps 'git branch --list-deleted'
> > could be used to list deleted branches (by scanning for reflogs).
> 
> git branch -d integration
> # git renames logs/refs/heads/integration to logs/refs~/heads~/integration
  # git adds deletion event to logs/refs~/heads~/integration reflog
    (similar to putting creation or rename event in reflog)
> git co -b integration sometopic
> # git creates refs/heads/integration, unrelated to the old one
> (do some work)
> (merge into the main branch)
> git branch -d integration

You touch interesting and important issue.

> 
> Now what?
> git renames refs/heads/integration to ... what?
> - does the old refs~/heads~/integration get clobbered? If that's ever
>   okay, why are we even having this discussion?

That's one solution.  We give some safety net, but not too much of
safety net.

> - does the "old reflog" stuff get combined? If that's ever okay, why
>   even have an extra reflog, instead of just using the reflog we
>   already have?

That's another solution.  We have deletion events in reflog to find
events for first 'integration', and distinguish them from events for
second (unrelated) 'integration' branch.

We can't just deal with reflogs of deleted branches by leaving them
as they are, not moved to logs/refs~/heads~/..., and combining them
because of possibility of D/F conflict (and yet another issue,
described below).  The reflog for branch 'foo' would block creating
reflog for branch 'foo/bar'.  Besides I think it's O.K. to require
more work to access reflogs for deleted branches, but not good to
force more work for reflogs for normal branches.


There is also a variant of the situation you described that makes
it harder for the 'concatenate reflogs of deleted branches' solution
to work well, namely:

 $ git branch -d foo
 $ git branch -m bar foo   
 $ git branch -d foo

where branch 'bar', renamed to 'foo' and then deleted has reflog from
before deletion of first 'foo' branch.

> - do we move everything else one step down, so refs~/heads~/integration
>   becomes refs~2/heads~2/integration? (ie: 2-dimensional reflog, which
>   sounds rather too fancy, to me)

This is yet another solution, although I think the better naming would
be to borrow concept of numbered backups, i.e. have

  logs/refs~/heads~/integration~1~
  logs/refs~/heads~/integration~2~
  ...

BTW., we can even make git branch ask (prompt for answer) whether to
delete old reflog of first 'integration' branch, or keep it in some
form... unless confiogured to always choose one solution.


Sigh... this makes eventual solution complicated, but the problem is
also complicated...


P.S. Wouldn't 'refs~' and 'foo~1~' filenames/pathnames be a problem
on MS Windows / on MS Windows filesystems: NTFS or FAT28^W FAT32?
Or on HFS+ on MacOS X?

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19  7:12                           ` Will Palmer
  2010-07-19 11:01                             ` Jakub Narebski
@ 2010-07-19 17:16                             ` Joshua Jensen
  2010-07-19 19:34                               ` Clemens Buchacher
                                                 ` (2 more replies)
  1 sibling, 3 replies; 42+ messages in thread
From: Joshua Jensen @ 2010-07-19 17:16 UTC (permalink / raw)
  To: wmpalmer
  Cc: Jakub Narebski, Jonathan Nieder, Clemens Buchacher, git,
	Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

  ----- Original Message -----
From: Will Palmer
Date: 7/19/2010 1:12 AM
> git branch -d integration
> # git renames refs/heads/integration to refs~/heads~/integration
> git co -b integration sometopic
> # git creates refs/heads/integration, unrelated to the old one
> (do some work)
> (merge into the main branch)
> git branch -d integration
>
> Now what?
> git renames refs/heads/integration to ... what?
> - does the old refs~/heads~/integration get clobbered? If that's ever
> okay, why are we even having this discussion?
> - does the "old reflog" stuff get combined? If that's ever okay, why
> even have an extra reflog, instead of just using the reflog we already
> have?
> - do we move everything else one step down, so refs~/heads~/integration
> becomes refs~2/heads~2/integration? (ie: 2-dimensional reflog, which
> sounds rather too fancy, to me
I was bit by this last week.  I deleted a branch.  A few days later, I 
realized I needed the branch.  It wasn't in the reflog, so I had to look 
through the "lost" objects to find it.

My brain has become muddied with all the ~2 stuff.  Explain again why it 
can't be as simple as this?

git branch -d integration

git reflog

0000001 HEAD@{0}: (fake)checkout: moving from integration to master 
(0000001)
8000000 HEAD@{1}: branch -d: Deleting integration
0000001 HEAD@{2}: (fake)checkout: moving from master to integration 
(8000000)

git checkout -b integration HEAD@{1}  (or 8000000)

-Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-17  9:30                 ` Clemens Buchacher
  2010-07-18  0:43                   ` Jonathan Nieder
@ 2010-07-19 18:06                   ` Junio C Hamano
  2010-07-19 19:22                     ` Clemens Buchacher
                                       ` (2 more replies)
  1 sibling, 3 replies; 42+ messages in thread
From: Junio C Hamano @ 2010-07-19 18:06 UTC (permalink / raw)
  To: Clemens Buchacher
  Cc: Jakub Narebski, Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi

Clemens Buchacher <drizzd@aon.at> writes:

> Pros and cons for "undeleting branches":
>
> + safety net
> It should not be easy to lose information with git.

I am personally not very convinced by this argument when it comes to the
cases where the user actively asks us to remove something.

> + less dependant on git branch -d
>
> Since git branch -d deletes branches which have been merged to a
> remote tracking branch, it does no longer guarantee that the branch
> is still available in history locally, and if the branch is also
> deleted remotely, running git remote prune removes it entirely.

Sorry, I have no idea what you are talking about in this paragraph.  I
cannot read "Who depends on git branch -d to achieve what" from your
description.  All I read in the above description is that what the command
does is a good thing.  If you want to keep the remote tracking branches
around, don't run "branch -d".  If you do want to remove the history of a
branch, run "branch -d".  Isn't it that simple?

Perhaps the missing "achieve what" may be that "output from 'git branch'
is too cluttered if I don't remove already merged branches as early as
possible, and forces me to run 'git branch -d' to clean things up too
often"?  If that is the issue you are trying to address, perhaps we can
solve that in a more direct way, e.g. "git branch --no-merged" easier to
access?

> - user interface complexity
>
> How to prune deleted branches? Currently, it is enough to do "git
> branch -D branch; git gc --prune" in order to get rid of the branch
> objects, at least if the HEAD reflog does not contain it or has
> expired. Consider for example adding a remote, and removing it
> again. This operation would leave a bunch of deleted branches,
> which potentially occupy a lot of disk space.
>
> How to find and restore deleted branches? If the reflog is used to
> record deleted branches, and a new branch of the same name is
> created, the reflog contains entries from unrelated branches. [1]
> If the deleted reflogs are stored in an attic, how do we reference
> those?

This is the area I am most concerned about.

If you take an analogy in say a file server implementation, yes, it should
not be easy to lose information there.  But it is and should be easy to
say "rm junk".  How would people recover from a mistake if they typed a
wrong filename, "rm junio" to lose a precious file "junio", when they
meant to lose "junk"?  They go to backups.  Can't git users do the same?
After all, .git directory is stored on a filesystem of some sort, and
taking a backup (you do take backups, don't you?) and picking the stuff
you lost from there should be a standard procedure that can be learned
outside of the context of git.

This is pretty-much a tangent, but I recall from time to time people
wonder why the branch namespace is not flat.  If that is a common wish,
your "tilde-suffix all the intermediate path components" trick could be
used in later versions of git (perhaps 1.8.X series) to improve the system
to allow "maint-1.7.0" branch and topic branches that forked from it
e.g. "maint-1.7.0/fix-frotz" and "maint-1.7.0/fix-nitfol" peacefully
co-exist.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19 18:06                   ` Junio C Hamano
@ 2010-07-19 19:22                     ` Clemens Buchacher
  2010-07-19 20:49                     ` Jakub Narebski
  2010-07-20 13:19                     ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 42+ messages in thread
From: Clemens Buchacher @ 2010-07-19 19:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jakub Narebski, Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi

[-- Attachment #1: Type: text/plain, Size: 4699 bytes --]

On Mon, Jul 19, 2010 at 11:06:23AM -0700, Junio C Hamano wrote:
> Clemens Buchacher <drizzd@aon.at> writes:
> 
> > + less dependant on git branch -d
> >
> > Since git branch -d deletes branches which have been merged to a
> > remote tracking branch, it does no longer guarantee that the branch
> > is still available in history locally, and if the branch is also
> > deleted remotely, running git remote prune removes it entirely.
> 
> Sorry, I have no idea what you are talking about in this paragraph.  I
> cannot read "Who depends on git branch -d to achieve what" from your
> description.  All I read in the above description is that what the command
> does is a good thing.  If you want to keep the remote tracking branches
> around, don't run "branch -d".  If you do want to remove the history of a
> branch, run "branch -d".  Isn't it that simple?

So why do we have "branch -d" in the first place? Exactly because I
do _not_ want to reomve the history of a branch. I just want to
remove the reference to it, because it has already been merged and
is therefore of no interest any more.

> Perhaps the missing "achieve what" may be that "output from 'git branch'
> is too cluttered if I don't remove already merged branches as early as
> possible, and forces me to run 'git branch -d' to clean things up too
> often"?  If that is the issue you are trying to address, perhaps we can
> solve that in a more direct way, e.g. "git branch --no-merged" easier to
> access?

That would only delay the need for cleanup. And when I do the
cleanup, I have typically forgotten what the branch contains
exactly. So I like to use "branch -d" to find out if the branch
potentially contains something interesting, and get rid of it
immediately if it does not. If that does not work I have no choice
but to look at the branch before deleting it blindly.

And I used to consider both "branch -d" and "remote prune" to be
safe, because I knew that at some point I would have had to use
"branch -D" in order to actually lose history. Now this is not true
any more due to the scenario described above. In that sense,
"branch -d" is now pointless, because it's hardly safer than
"branch -D".

So this process is where a cannot depend on "branch -d" any more.

> > - user interface complexity
> >
> > How to prune deleted branches? Currently, it is enough to do "git
> > branch -D branch; git gc --prune" in order to get rid of the branch
> > objects, at least if the HEAD reflog does not contain it or has
> > expired. Consider for example adding a remote, and removing it
> > again. This operation would leave a bunch of deleted branches,
> > which potentially occupy a lot of disk space.
> >
> > How to find and restore deleted branches? If the reflog is used to
> > record deleted branches, and a new branch of the same name is
> > created, the reflog contains entries from unrelated branches. [1]
> > If the deleted reflogs are stored in an attic, how do we reference
> > those?
> 
> This is the area I am most concerned about.

Below you explain that we not need this feature, because we can do
backups instead. I disagree. But even if I were to agree, why are
you concerned about that? Even if you do not use the feature
personally, does it do any harm?

The effort necessary to permanently delete a branch (delete,
expire, prune) will stay exactly the same. The reflog syntax can
also stay the same. We only really need new commands to list and
recreate deleted branches.

> If you take an analogy in say a file server implementation, yes, it should
> not be easy to lose information there.  But it is and should be easy to
> say "rm junk".  How would people recover from a mistake if they typed a > wrong filename, "rm junio" to lose a precious file "junio", when they
> meant to lose "junk"?  They go to backups.

Backups are a good measure to protect against loss of large amounts
of data. They do not protect very well from an accidental "rm
junio", because that only works if junio was already in the latest
backup. Protecting against small-scale changes is exactly what a
version control system is good at. And they are much superior to
backups, because they keep history.

> Can't git users do the same?
> After all, .git directory is stored on a filesystem of some sort, and
> taking a backup (you do take backups, don't you?) and picking the stuff
> you lost from there should be a standard procedure that can be learned
> outside of the context of git.

I usually use git for my more frequent backups. I put everything
that changes a lot and contains much personal work under version
control.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19 17:16                             ` Joshua Jensen
@ 2010-07-19 19:34                               ` Clemens Buchacher
  2010-07-19 19:45                               ` Will Palmer
  2010-07-19 20:36                               ` Jakub Narebski
  2 siblings, 0 replies; 42+ messages in thread
From: Clemens Buchacher @ 2010-07-19 19:34 UTC (permalink / raw)
  To: Joshua Jensen
  Cc: wmpalmer, Jakub Narebski, Jonathan Nieder, git, Nicolas Sebrecht,
	Nanako Shiraishi, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 577 bytes --]

On Mon, Jul 19, 2010 at 11:16:41AM -0600, Joshua Jensen wrote:
> 
> 0000001 HEAD@{0}: (fake)checkout: moving from integration to master
> (0000001)
> 8000000 HEAD@{1}: branch -d: Deleting integration
> 0000001 HEAD@{2}: (fake)checkout: moving from master to integration
> (8000000)

That does not fit very well with the current definition of the HEAD
reflog.

But I do like the idea. We could append branch deletions to
.git/log/deleted, which otherwise behaves exactly like the HEAD
reflog.  And we simply do not keep the reflogs of deleted branches.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19 17:16                             ` Joshua Jensen
  2010-07-19 19:34                               ` Clemens Buchacher
@ 2010-07-19 19:45                               ` Will Palmer
  2010-07-19 20:40                                 ` Jakub Narebski
  2010-07-20  3:05                                 ` Joshua Jensen
  2010-07-19 20:36                               ` Jakub Narebski
  2 siblings, 2 replies; 42+ messages in thread
From: Will Palmer @ 2010-07-19 19:45 UTC (permalink / raw)
  To: Joshua Jensen
  Cc: Jakub Narebski, Jonathan Nieder, Clemens Buchacher, git,
	Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

On Mon, 2010-07-19 at 11:16 -0600, Joshua Jensen wrote:
> My brain has become muddied with all the ~2 stuff.  Explain again why it 
> can't be as simple as this?
...snip...
> git checkout -b integration HEAD@{1}  (or 8000000)
> 
> -Josh

Because:
 1) The HEAD reflog is the wrong place to stick things which weren't
recently checked-out.
 and 2) the previous tip is currently the easiest-to-recover part of a
deleted branch. What's lost is all the reflog data: order of states, and
how they were reached.

However, I /do/ think it's as simple as "don't delete the reflog right
away when you delete a branch", and other edge-cases and niceties in
terms of UI (such as ref renaming, resurrection of refs for tracking
unrelated data, etc) can be taken care of later, if there's actually a
need for them.

-- 
-- Will

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19 17:16                             ` Joshua Jensen
  2010-07-19 19:34                               ` Clemens Buchacher
  2010-07-19 19:45                               ` Will Palmer
@ 2010-07-19 20:36                               ` Jakub Narebski
  2 siblings, 0 replies; 42+ messages in thread
From: Jakub Narebski @ 2010-07-19 20:36 UTC (permalink / raw)
  To: Joshua Jensen
  Cc: wmpalmer, Jonathan Nieder, Clemens Buchacher, git,
	Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

On Mon, 19 Jul 2010, Joshua Jensen wrote:

> My brain has become muddied with all the ~2 stuff.  Explain again why it 
> can't be as simple as this?
> 
> git branch -d integration
> 
> git reflog
> 
> 0000001 HEAD@{0}: (fake)checkout: moving from integration to master 
> (0000001)
> 8000000 HEAD@{1}: branch -d: Deleting integration
> 0000001 HEAD@{2}: (fake)checkout: moving from master to integration 
> (8000000)

Because 'integration' branch nowadays might have been never checked out,
and not be in HEAD, even though 'git branch -d integration' works.
Stuffing unrelated events in HEAD reflog, instead of having it be action
history of HEAD symref is IMVHO a bad idea.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19 19:45                               ` Will Palmer
@ 2010-07-19 20:40                                 ` Jakub Narebski
  2010-07-20  3:05                                 ` Joshua Jensen
  1 sibling, 0 replies; 42+ messages in thread
From: Jakub Narebski @ 2010-07-19 20:40 UTC (permalink / raw)
  To: Will Palmer
  Cc: Joshua Jensen, Jonathan Nieder, Clemens Buchacher, git,
	Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

On Mon, 19 Jul 2010, Will Palmer wrote:
> On Mon, 2010-07-19 at 11:16 -0600, Joshua Jensen wrote:

> > My brain has become muddied with all the ~2 stuff.  Explain again why it 
> > can't be as simple as this?
> ...snip...
> > git checkout -b integration HEAD@{1}  (or 8000000)
> > 
> > -Josh
> 
> Because:
>  1) The HEAD reflog is the wrong place to stick things which weren't
> recently checked-out.
>  and 2) the previous tip is currently the easiest-to-recover part of a
> deleted branch. What's lost is all the reflog data: order of states, and
> how they were reached.
> 
> However, I /do/ think it's as simple as "don't delete the reflog right
> away when you delete a branch", and other edge-cases and niceties in
> terms of UI (such as ref renaming, resurrection of refs for tracking
> unrelated data, etc) can be taken care of later, if there's actually a
> need for them.

There are at least two issues, which are not niceties, but requirements:
1.) not deleting reflog for 'foo' when deleting 'foo' branch blocks
    creating reflog for 'foo/bar' branch because of D/F conflict.
2.) there is issue of 'git branch -m bar foo', i.e. renaming some other
    branch to 'foo', a branch which has its own reflog.

But you are right in that "moving reflog to attic" can be postponed till
needed.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19 18:06                   ` Junio C Hamano
  2010-07-19 19:22                     ` Clemens Buchacher
@ 2010-07-19 20:49                     ` Jakub Narebski
  2010-07-20 13:19                     ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 42+ messages in thread
From: Jakub Narebski @ 2010-07-19 20:49 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Clemens Buchacher, Jonathan Nieder, git, Nicolas Sebrecht,
	Nanako Shiraishi

On Mon, 19 Jul 2010, Junio C Hamano wrote:

> If you take an analogy in say a file server implementation, yes, it should
> not be easy to lose information there.  But it is and should be easy to
> say "rm junk".  How would people recover from a mistake if they typed a
> wrong filename, "rm junio" to lose a precious file "junio", when they
> meant to lose "junk"?  They go to backups.

First, I guess that many people have 'rm' aliased to 'rm -i', so it
asks for confirmation... though it only moves safety a bit, as 
'rm -f junio' would be still dangerous.

Second, desktop environment have the notion of trashcan, where files
to be deleted are moved, and kept for some time or untill trashcan is
emptied (though it moves safety a bit to people using 'hard delete'
instead of moving to trashcan).  Or use undeletefs,

>                                            Can't git users do the same? 
> After all, .git directory is stored on a filesystem of some sort, and
> taking a backup (you do take backups, don't you?) and picking the stuff
> you lost from there should be a standard procedure that can be learned
> outside of the context of git.

People expect version control system to protect them more from mistakes.

Sidenote: there aere requests in last Git User's Survey (from 2009) to
have safety net against losing changes due to "git reset --hard", in the
form of automatically stashing blobs or something.

> 
> This is pretty-much a tangent, but I recall from time to time people
> wonder why the branch namespace is not flat.  If that is a common wish,
> your "tilde-suffix all the intermediate path components" trick could be
> used in later versions of git (perhaps 1.8.X series) to improve the system
> to allow "maint-1.7.0" branch and topic branches that forked from it
> e.g. "maint-1.7.0/fix-frotz" and "maint-1.7.0/fix-nitfol" peacefully
> co-exist.

Hmmmmm... true, this would make having 'foo' and 'foo/bar' branches 
coexists, whether they are loose or packed, and whether they have reflogs
or not.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19 19:45                               ` Will Palmer
  2010-07-19 20:40                                 ` Jakub Narebski
@ 2010-07-20  3:05                                 ` Joshua Jensen
  2010-07-20  6:31                                   ` Will Palmer
  1 sibling, 1 reply; 42+ messages in thread
From: Joshua Jensen @ 2010-07-20  3:05 UTC (permalink / raw)
  To: wmpalmer
  Cc: Jakub Narebski, Jonathan Nieder, Clemens Buchacher, git,
	Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

  ----- Original Message -----
From: Will Palmer
Date: 7/19/2010 1:45 PM
> On Mon, 2010-07-19 at 11:16 -0600, Joshua Jensen wrote:
>> My brain has become muddied with all the ~2 stuff.  Explain again why it
>> can't be as simple as this?
> ...snip...
>> git checkout -b integration HEAD@{1}  (or 8000000)
>>
>> -Josh
> Because:
>   1) The HEAD reflog is the wrong place to stick things which weren't
> recently checked-out.
>   and 2) the previous tip is currently the easiest-to-recover part of a
> deleted branch. What's lost is all the reflog data: order of states, and
> how they were reached.
>
> However, I /do/ think it's as simple as "don't delete the reflog right
> away when you delete a branch", and other edge-cases and niceties in
> terms of UI (such as ref renaming, resurrection of refs for tracking
> unrelated data, etc) can be taken care of later, if there's actually a
> need for them.
Thanks for the extra clarification.

For my sake, how is the previous tip the easiest-to-recover part of a 
deleted branch?  How do you go about finding the lost object?

Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-20  3:05                                 ` Joshua Jensen
@ 2010-07-20  6:31                                   ` Will Palmer
  0 siblings, 0 replies; 42+ messages in thread
From: Will Palmer @ 2010-07-20  6:31 UTC (permalink / raw)
  To: Joshua Jensen
  Cc: Jakub Narebski, Jonathan Nieder, Clemens Buchacher, git,
	Nicolas Sebrecht, Nanako Shiraishi, Junio C Hamano

On Mon, 2010-07-19 at 21:05 -0600, Joshua Jensen wrote:
> For my sake, how is the previous tip the easiest-to-recover part of a 
> deleted branch?  How do you go about finding the lost object?
> 
> Josh

git fsck | grep '^dangling commit' | while read dangling commit hash; do
	git log -1 --format='%at %h %s' $hash
done | sort -nr | less -FRSX


-- 
-- Will

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-19 18:06                   ` Junio C Hamano
  2010-07-19 19:22                     ` Clemens Buchacher
  2010-07-19 20:49                     ` Jakub Narebski
@ 2010-07-20 13:19                     ` Ævar Arnfjörð Bjarmason
  2010-07-20 13:34                       ` Matthieu Moy
  2 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2010-07-20 13:19 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Clemens Buchacher, Jakub Narebski, Jonathan Nieder, git,
	Nicolas Sebrecht, Nanako Shiraishi, Joshua Jensen

On Mon, Jul 19, 2010 at 18:06, Junio C Hamano <gitster@pobox.com> wrote:
> Clemens Buchacher <drizzd@aon.at> writes:
>
>> Pros and cons for "undeleting branches":
>>
>> + safety net
>> It should not be easy to lose information with git.
>
> I am personally not very convinced by this argument when it comes to the
> cases where the user actively asks us to remove something.

I think it's mainly about user interface consistency. When you
"delete" stuff in with git reset (i.e. move the HEAD) it's recorded in
the reflog, but this isn't the case with branch deletions.

As Will Palmer pointed out, being able to tell newbies "Don't worry,
you aren't going to break it" is a very powerful thing to be able to
tell newbies and experienced users using Git.

It encourages experimentation, because you know that even if you botch
the rewrite and delete the wrong branch it's easy to recover your work
from the reflog in the morning.

I'd really like something like Joshua Jensen's suggestion for
recording a fake branch deletion commit in the reflog.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Question about 'branch -d' safety
  2010-07-20 13:19                     ` Ævar Arnfjörð Bjarmason
@ 2010-07-20 13:34                       ` Matthieu Moy
  0 siblings, 0 replies; 42+ messages in thread
From: Matthieu Moy @ 2010-07-20 13:34 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Clemens Buchacher, Jakub Narebski,
	Jonathan Nieder, git, Nicolas Sebrecht, Nanako Shiraishi,
	Joshua Jensen

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Mon, Jul 19, 2010 at 18:06, Junio C Hamano <gitster@pobox.com> wrote:
>> Clemens Buchacher <drizzd@aon.at> writes:
>>
>>> Pros and cons for "undeleting branches":
>>>
>>> + safety net
>>> It should not be easy to lose information with git.
>>
>> I am personally not very convinced by this argument when it comes to the
>> cases where the user actively asks us to remove something.
>
> [...]
>
> As Will Palmer pointed out, being able to tell newbies "Don't worry,
> you aren't going to break it" is a very powerful thing to be able to
> tell newbies and experienced users using Git.

Also, one of the _very_ cool things with Git is that advance users can
tell newbies "don't worry, if you break it, I'll be able to repair
it". And advanced users are almost always able to repair beginner's
mistake after the fact (in my case, it's often about teachers helping
students, but it works also in other situations).

"branch -d" (and even more "branch -D") deleting the reflog is one of
the few exceptions to this rule: dumb users can shoot themselves in
the foot (which is normal) but then the doctor cannot undo it easily.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2010-07-20 13:39 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-12-29 21:54 Question about 'branch -d' safety Nanako Shiraishi
2009-12-29 22:31 ` Nicolas Sebrecht
2009-12-30  3:12   ` Nanako Shiraishi
2009-12-30  6:43     ` Junio C Hamano
2009-12-30 21:08     ` Nicolas Sebrecht
2010-07-10  6:55     ` Clemens Buchacher
2010-07-10 21:40       ` Jonathan Nieder
2010-07-10 21:57         ` Jakub Narebski
2010-07-10 22:17           ` Jonathan Nieder
2010-07-11  6:55           ` Clemens Buchacher
2010-07-11  7:16             ` Jakub Narebski
2010-07-11  8:48               ` Julian Phillips
2010-07-11 13:37               ` Clemens Buchacher
2010-07-11 18:41                 ` Junio C Hamano
2010-07-11 19:05                   ` Jakub Narebski
2010-07-11 22:02                   ` Will Palmer
2010-07-12 18:47                   ` Clemens Buchacher
2010-07-12 23:50                     ` Junio C Hamano
2010-07-13  7:13                       ` Clemens Buchacher
2010-07-13  8:00                         ` Will Palmer
2010-07-13  8:30                           ` Johannes Sixt
2010-07-13  9:00                             ` Will Palmer
2010-07-13 22:21                           ` Clemens Buchacher
2010-07-17  9:30                 ` Clemens Buchacher
2010-07-18  0:43                   ` Jonathan Nieder
2010-07-18 11:55                     ` Jakub Narebski
2010-07-18 20:27                       ` Will Palmer
2010-07-18 23:19                         ` Jakub Narebski
2010-07-19  7:12                           ` Will Palmer
2010-07-19 11:01                             ` Jakub Narebski
2010-07-19 17:16                             ` Joshua Jensen
2010-07-19 19:34                               ` Clemens Buchacher
2010-07-19 19:45                               ` Will Palmer
2010-07-19 20:40                                 ` Jakub Narebski
2010-07-20  3:05                                 ` Joshua Jensen
2010-07-20  6:31                                   ` Will Palmer
2010-07-19 20:36                               ` Jakub Narebski
2010-07-19 18:06                   ` Junio C Hamano
2010-07-19 19:22                     ` Clemens Buchacher
2010-07-19 20:49                     ` Jakub Narebski
2010-07-20 13:19                     ` Ævar Arnfjörð Bjarmason
2010-07-20 13:34                       ` Matthieu Moy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.