git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH/RFC] commit: new option to abort -a something is already staged
@ 2018-08-20 15:41 Nguyễn Thái Ngọc Duy
  2018-08-20 15:55 ` Junio C Hamano
                   ` (4 more replies)
  0 siblings, 5 replies; 28+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-08-20 15:41 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

So many times I have carefully cherry picked changes to the index with
`git add -p` then accidentally did `git commit -am ....` (usually by
retrieving a command from history and pressing Enter too quickly)
which destroyed beautiful index.

One way to deal with this is some form of `git undo` that allows me to
retrieve the old index. That's not a lot of work by itself. The problem
is designing that `git undo` interface because there are more undo
options that this.

Instead, let's handle just this problem for now. This new option
commit.preciousDirtyIndex, if set to false, will not allow `commit -a`
to continue if the final index is different from the existing one. I
don't think this can be achieved with hooks because the hooks don't
know about "-a" or different commit modes.

Or is there a better way to handle this?

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/config.txt |  5 +++++
 builtin/commit.c         | 19 +++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 95ad715a44..3937681ee9 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1417,6 +1417,11 @@ commit.gpgSign::
 	convenient to use an agent to avoid typing your GPG passphrase
 	several times.
 
+commit.preciousDirtyIndex::
+	If some changes are partially staged in the index (i.e.
+	"git commit -a" and "git commit" commit different changes), reject
+	"git commit -a".
+
 commit.status::
 	A boolean to enable/disable inclusion of status information in the
 	commit message template when using an editor to prepare the commit
diff --git a/builtin/commit.c b/builtin/commit.c
index 213fca2d8e..489e4b9f50 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -98,6 +98,7 @@ static const char *author_message, *author_message_buffer;
 static char *edit_message, *use_message;
 static char *fixup_message, *squash_message;
 static int all, also, interactive, patch_interactive, only, amend, signoff;
+static int allow_dirty_index = 1;
 static int edit_flag = -1; /* unspecified */
 static int quiet, verbose, no_verify, allow_empty, dry_run, renew_authorship;
 static int config_commit_verbose = -1; /* unspecified */
@@ -385,10 +386,24 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 	 * (B) on failure, rollback the real index.
 	 */
 	if (all || (also && pathspec.nr)) {
+		int compare_oid = all && !allow_dirty_index;
+		struct object_id previous_oid;
+
+		if (compare_oid) {
+			if (update_main_cache_tree(0) || !the_index.cache_tree)
+				die(_("error building trees"));
+			if (the_index.cache_tree->entry_count >= 0)
+				oidcpy(&previous_oid, &the_index.cache_tree->oid);
+			else
+				oidclr(&previous_oid);
+		}
 		hold_locked_index(&index_lock, LOCK_DIE_ON_ERROR);
 		add_files_to_cache(also ? prefix : NULL, &pathspec, 0);
 		refresh_cache_or_die(refresh_flags);
 		update_main_cache_tree(WRITE_TREE_SILENT);
+		if (compare_oid && the_index.cache_tree &&
+		    oidcmp(&previous_oid, &the_index.cache_tree->oid))
+			die(_("staged content is different, aborting"));
 		if (write_locked_index(&the_index, &index_lock, 0))
 			die(_("unable to write new_index file"));
 		commit_style = COMMIT_NORMAL;
@@ -1413,6 +1428,10 @@ static int git_commit_config(const char *k, const char *v, void *cb)
 		config_commit_verbose = git_config_bool_or_int(k, v, &is_bool);
 		return 0;
 	}
+	if (!strcmp(k, "commit.preciousdirtyindex")) {
+		allow_dirty_index = !git_config_bool(k, v);
+		return 0;
+	}
 
 	status = git_gpg_config(k, v, NULL);
 	if (status)
-- 
2.18.0.1003.g5e2e2c8169


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-20 15:41 [PATCH/RFC] commit: new option to abort -a something is already staged Nguyễn Thái Ngọc Duy
@ 2018-08-20 15:55 ` Junio C Hamano
  2018-08-20 17:48 ` Eric Sunshine
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2018-08-20 15:55 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> +commit.preciousDirtyIndex::
> +	If some changes are partially staged in the index (i.e.
> +	"git commit -a" and "git commit" commit different changes), reject
> +	"git commit -a".

Or perhaps better yet, go into yes/no interaction to confirm if you
have access to interactive terminal at fd #0/#1, perhaps?

> diff --git a/builtin/commit.c b/builtin/commit.c
> index 213fca2d8e..489e4b9f50 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -98,6 +98,7 @@ static const char *author_message, *author_message_buffer;
>  static char *edit_message, *use_message;
>  static char *fixup_message, *squash_message;
>  static int all, also, interactive, patch_interactive, only, amend, signoff;
> +static int allow_dirty_index = 1;
>  static int edit_flag = -1; /* unspecified */
>  static int quiet, verbose, no_verify, allow_empty, dry_run, renew_authorship;
>  static int config_commit_verbose = -1; /* unspecified */
> @@ -385,10 +386,24 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
>  	 * (B) on failure, rollback the real index.
>  	 */
>  	if (all || (also && pathspec.nr)) {
> +		int compare_oid = all && !allow_dirty_index;
> +		struct object_id previous_oid;
> +
> +		if (compare_oid) {
> +			if (update_main_cache_tree(0) || !the_index.cache_tree)
> +				die(_("error building trees"));

Hmph, when we conclude a conflicted merge with "commit -a", wouldn't
we fail to compute the "before" picture, with higher-stage entries
stil in the index?

> +			if (the_index.cache_tree->entry_count >= 0)
> +				oidcpy(&previous_oid, &the_index.cache_tree->oid);
> +			else
> +				oidclr(&previous_oid);

The cache-tree covers no entry, meaning the index has no cache
entries?  Shouldn't we setting EMPTY_TREE_SHA1_BIN or something
instead?

> +		}
>  		hold_locked_index(&index_lock, LOCK_DIE_ON_ERROR);
>  		add_files_to_cache(also ? prefix : NULL, &pathspec, 0);
>  		refresh_cache_or_die(refresh_flags);
>  		update_main_cache_tree(WRITE_TREE_SILENT);
> +		if (compare_oid && the_index.cache_tree &&
> +		    oidcmp(&previous_oid, &the_index.cache_tree->oid))
> +			die(_("staged content is different, aborting"));

I was hoping that it is an easy change to teach add_files_to_cache()
to report how many paths it actually has "added" (modifications and
removals are of course also counted), which allows us not to waste
computing a throw-away tree object once more.

There only are three existing callers to the function, and only one
of them rely on the current "non-zero is error, zero is good"
semantics, so updating that caller (and perhaps vetting the other
callers to see if they also _should_ pay attention to the return
value, at least to detect errors of not number of paths updated)
shouldn't be that much more effort, and would be a good update to
the API regardless of this new feature, I would think.

>  		if (write_locked_index(&the_index, &index_lock, 0))
>  			die(_("unable to write new_index file"));
>  		commit_style = COMMIT_NORMAL;
> @@ -1413,6 +1428,10 @@ static int git_commit_config(const char *k, const char *v, void *cb)
>  		config_commit_verbose = git_config_bool_or_int(k, v, &is_bool);
>  		return 0;
>  	}
> +	if (!strcmp(k, "commit.preciousdirtyindex")) {
> +		allow_dirty_index = !git_config_bool(k, v);
> +		return 0;
> +	}
>  
>  	status = git_gpg_config(k, v, NULL);
>  	if (status)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-20 15:41 [PATCH/RFC] commit: new option to abort -a something is already staged Nguyễn Thái Ngọc Duy
  2018-08-20 15:55 ` Junio C Hamano
@ 2018-08-20 17:48 ` Eric Sunshine
  2018-08-20 19:30 ` Jonathan Nieder
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 28+ messages in thread
From: Eric Sunshine @ 2018-08-20 17:48 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: Git List

On Mon, Aug 20, 2018 at 11:41 AM Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> One way to deal with this is some form of `git undo` that allows me to
> retrieve the old index. That's not a lot of work by itself. The problem
> is designing that `git undo` interface because there are more undo
> options that this.

s/that/than/

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-20 15:41 [PATCH/RFC] commit: new option to abort -a something is already staged Nguyễn Thái Ngọc Duy
  2018-08-20 15:55 ` Junio C Hamano
  2018-08-20 17:48 ` Eric Sunshine
@ 2018-08-20 19:30 ` Jonathan Nieder
  2018-08-21 14:43   ` Duy Nguyen
  2018-08-24  2:59 ` Jacob Keller
  2018-09-16  6:31 ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Nguyễn Thái Ngọc Duy
  4 siblings, 1 reply; 28+ messages in thread
From: Jonathan Nieder @ 2018-08-20 19:30 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Hi,

Nguyễn Thái Ngọc Duy wrote:

> So many times I have carefully cherry picked changes to the index with
> `git add -p` then accidentally did `git commit -am ....` (usually by
> retrieving a command from history and pressing Enter too quickly)
> which destroyed beautiful index.
>
> One way to deal with this is some form of `git undo` that allows me to
> retrieve the old index.

Yes, I would love such an undo feature!

How would you imagine that this information would get stored?  We can
start with adding that and a low-level (plumbing) interface to it, to
avoid being blocked on getting the user-facing (porcelain) "git undo"
interface right.  (Or we can go straight for the porcelain.  It's fine
for it to start minimal and gain switches later.)

[...]
> Instead, let's handle just this problem for now. This new option
> commit.preciousDirtyIndex, if set to false, will not allow `commit -a`
> to continue if the final index is different from the existing one. I
> don't think this can be achieved with hooks because the hooks don't
> know about "-a" or different commit modes.

I frequently use "git commit -a" this way intentionally, so I would be
unlikely to turn this config on.  That leads me to suspect it's not a
good candidate for configuration:

- it's not configuration for the sake of a transition period, since some
  people would keep it on forever

- it's not configuration based on different project needs, either

So configuration doesn't feel like a good fit.

It might be that I can switch my muscle memory to "git add -u && git
commit", in which case this could work as a new default without
configuration (or with configuration for a transition period).  A
separate commandline option "git commit -a --no-clobber-index" might
make sense as well, since then I could pass --clobber-index to keep my
current workflow.  That would also follow the usual progression:
introduce commandline option first, then config, then change default.

That said, I lean toward your initial thought, that this is papering
over a missing undo feature.  Can you say more about how you'd imagine
undo working?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-20 19:30 ` Jonathan Nieder
@ 2018-08-21 14:43   ` Duy Nguyen
  2018-08-23  2:11     ` Jonathan Nieder
  0 siblings, 1 reply; 28+ messages in thread
From: Duy Nguyen @ 2018-08-21 14:43 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Git Mailing List

On Mon, Aug 20, 2018 at 9:30 PM Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> Hi,
>
> Nguyễn Thái Ngọc Duy wrote:
>
> > So many times I have carefully cherry picked changes to the index with
> > `git add -p` then accidentally did `git commit -am ....` (usually by
> > retrieving a command from history and pressing Enter too quickly)
> > which destroyed beautiful index.
> >
> > One way to deal with this is some form of `git undo` that allows me to
> > retrieve the old index.
>
> Yes, I would love such an undo feature!
>
> How would you imagine that this information would get stored?  We can
> start with adding that and a low-level (plumbing) interface to it, to
> avoid being blocked on getting the user-facing (porcelain) "git undo"
> interface right.  (Or we can go straight for the porcelain.  It's fine
> for it to start minimal and gain switches later.)

Yeah I'd love to see "git undo" too, but unfortunately I don't have a
clear model how it should operate. Which is why I still hesitate to
implement one. :P

All I have is something close to how undo is done in an editor, where
we could save a list of actions and corresponding undo ones, but
editors are wysiwyg and we can't just let the user undo, see the
result elsewhere and redo if they're not happy.

> > Instead, let's handle just this problem for now. This new option
> > commit.preciousDirtyIndex, if set to false, will not allow `commit -a`
> > to continue if the final index is different from the existing one. I
> > don't think this can be achieved with hooks because the hooks don't
> > know about "-a" or different commit modes.
>
> I frequently use "git commit -a" this way intentionally, so I would be
> unlikely to turn this config on.  That leads me to suspect it's not a
> good candidate for configuration:
>
> - it's not configuration for the sake of a transition period, since some
>   people would keep it on forever
>
> - it's not configuration based on different project needs, either
>
> So configuration doesn't feel like a good fit.

I think it falls under personal preference (yes some people like me
will keep it on forever in fear of losing staged changes).

> That said, I lean toward your initial thought, that this is papering
> over a missing undo feature.  Can you say more about how you'd imagine
> undo working?

There is not much to say. But at least to be able to undo changes in
the index, I made two attempts in the past [1] [2] (and forgot about
them until I got bitten by the lack of undo again this time). I guess
I could push for one of those approaches again because it will help me
and also lay the foundation for any git-undo in the future.

Once we have can restore index, undoing "git reset --hard" is also
possible (by hashing everything and putting them in a temporary index
first).

[1] https://public-inbox.org/git/1375597720-13236-1-git-send-email-pclouds@gmail.com/
[2] https://public-inbox.org/git/1375966270-10968-1-git-send-email-pclouds@gmail.com/
-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-21 14:43   ` Duy Nguyen
@ 2018-08-23  2:11     ` Jonathan Nieder
  2018-08-23  2:15       ` Jonathan Nieder
  0 siblings, 1 reply; 28+ messages in thread
From: Jonathan Nieder @ 2018-08-23  2:11 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

Duy Nguyen wrote:
> On Mon, Aug 20, 2018 at 9:30 PM Jonathan Nieder <jrnieder@gmail.com> wrote:

>> I frequently use "git commit -a" this way intentionally, so I would be
>> unlikely to turn this config on.  That leads me to suspect it's not a
>> good candidate for configuration:
>>
>> - it's not configuration for the sake of a transition period, since some
>>   people would keep it on forever
>>
>> - it's not configuration based on different project needs, either
>>
>> So configuration doesn't feel like a good fit.
>
> I think it falls under personal preference (yes some people like me
> will keep it on forever in fear of losing staged changes).

Sorry for the lack of clarity.  I meant "some people would keep it off
forever".

>> That said, I lean toward your initial thought, that this is papering
>> over a missing undo feature.  Can you say more about how you'd imagine
>> undo working?
[...]
> [1] https://public-inbox.org/git/1375597720-13236-1-git-send-email-pclouds@gmail.com/
> [2] https://public-inbox.org/git/1375966270-10968-1-git-send-email-pclouds@gmail.com/

Thanks for the links!  That's very helpful.

I'm starting to lean toward having this on unconditionally, with a
message that points the user who really doesn't want to clobber their
index toward "git add -u", as a good idea.  I think that for humans,
that would be okay and that configuration doesn't really help much
for this.

The remaining question becomes scripts.  A script might do

	... modify old-file and create new-file ...
	git add new-file
	git commit -m "some great message"

which would trip this error.  For that matter, humans might do that,
too.  Could the check detect this case (where the only changes in the
index are additions of new files) and treat it as non-destructive?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-23  2:11     ` Jonathan Nieder
@ 2018-08-23  2:15       ` Jonathan Nieder
  2018-08-23 14:49         ` Duy Nguyen
  0 siblings, 1 reply; 28+ messages in thread
From: Jonathan Nieder @ 2018-08-23  2:15 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

A few quick corrections:

Jonathan Nieder wrote:

> I'm starting to lean toward having this on unconditionally, with a
> message that points the user who really doesn't want to clobber their

... who really *does* want to clobber their index ...

> index toward "git add -u", as a good idea.  I think that for humans,
> that would be okay and that configuration doesn't really help much
> for this.
>
> The remaining question becomes scripts.  A script might do
>
>	... modify old-file and create new-file ...
> 	git add new-file
> 	git commit -m "some great message"
>
> which would trip this error.  For that matter, humans might do that,
> too.  Could the check detect this case (where the only changes in the
> index are additions of new files) and treat it as non-destructive?

(where the only changes in the index are additions of new files *that
match the worktree*)

Sorry for the noise,
Jonathan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-23  2:15       ` Jonathan Nieder
@ 2018-08-23 14:49         ` Duy Nguyen
  2018-08-23 15:28           ` Junio C Hamano
  0 siblings, 1 reply; 28+ messages in thread
From: Duy Nguyen @ 2018-08-23 14:49 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Git Mailing List

On Thu, Aug 23, 2018 at 4:15 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
> > The remaining question becomes scripts.  A script might do
> >
> >       ... modify old-file and create new-file ...
> >       git add new-file
> >       git commit -m "some great message"
> >
> > which would trip this error.  For that matter, humans might do that,
> > too.  Could the check detect this case (where the only changes in the
> > index are additions of new files) and treat it as non-destructive?
>
> (where the only changes in the index are additions of new files *that
> match the worktree*)

I don't think my change would affect this case. If "git commit" does
not change the index by itself, there's no point in stopping it.
-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-23 14:49         ` Duy Nguyen
@ 2018-08-23 15:28           ` Junio C Hamano
  2018-08-24  3:02             ` Jacob Keller
  0 siblings, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2018-08-23 15:28 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Jonathan Nieder, Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

> On Thu, Aug 23, 2018 at 4:15 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
>> > The remaining question becomes scripts.  A script might do
>> >
>> >       ... modify old-file and create new-file ...
>> >       git add new-file
>> >       git commit -m "some great message"
>> >
>> > which would trip this error.  For that matter, humans might do that,
>> > too.  Could the check detect this case (where the only changes in the
>> > index are additions of new files) and treat it as non-destructive?
>>
>> (where the only changes in the index are additions of new files *that
>> match the worktree*)
>
> I don't think my change would affect this case. If "git commit" does
> not change the index by itself, there's no point in stopping it.

I think the above example forgets "-a" on the final "git commit"
step.  With it added, I can understand the concern (and I am sure
you would, too).

The user is trying to add everything done in the working tree, and
"commit -a" would catch all changes to paths that were already
tracked, but a separate "add" is necessary for newly created paths.
But adding a new path means the index no longer matches HEAD, and
the "commit -a" at the final step sweeps the changes to already
tracked paths---failing that because there "already is something
staged" will break the workflow.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-20 15:41 [PATCH/RFC] commit: new option to abort -a something is already staged Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2018-08-20 19:30 ` Jonathan Nieder
@ 2018-08-24  2:59 ` Jacob Keller
  2018-09-16  6:31 ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Nguyễn Thái Ngọc Duy
  4 siblings, 0 replies; 28+ messages in thread
From: Jacob Keller @ 2018-08-24  2:59 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git mailing list

On Mon, Aug 20, 2018 at 8:43 AM Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> Instead, let's handle just this problem for now. This new option
> commit.preciousDirtyIndex, if set to false, will not allow `commit -a`
> to continue if the final index is different from the existing one. I
> don't think this can be achieved with hooks because the hooks don't
> know about "-a" or different commit modes.
>

Here you say setting it to "false" enables this check but...

> +commit.preciousDirtyIndex::
> +       If some changes are partially staged in the index (i.e.
> +       "git commit -a" and "git commit" commit different changes), reject
> +       "git commit -a".
> +

Here, sounds like it is enabled by setting it to true.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-23 15:28           ` Junio C Hamano
@ 2018-08-24  3:02             ` Jacob Keller
  2018-08-24 14:42               ` Duy Nguyen
  0 siblings, 1 reply; 28+ messages in thread
From: Jacob Keller @ 2018-08-24  3:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Jonathan Nieder, Git mailing list

On Thu, Aug 23, 2018 at 9:28 AM Junio C Hamano <gitster@pobox.com> wrote:
> I think the above example forgets "-a" on the final "git commit"
> step.  With it added, I can understand the concern (and I am sure
> you would, too).
>
> The user is trying to add everything done in the working tree, and
> "commit -a" would catch all changes to paths that were already
> tracked, but a separate "add" is necessary for newly created paths.
> But adding a new path means the index no longer matches HEAD, and
> the "commit -a" at the final step sweeps the changes to already
> tracked paths---failing that because there "already is something
> staged" will break the workflow.

Right. I think this would need to be able to understand the case of
"different only by new files".

Thanks,
Jake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-24  3:02             ` Jacob Keller
@ 2018-08-24 14:42               ` Duy Nguyen
  2018-08-24 23:23                 ` Jacob Keller
  0 siblings, 1 reply; 28+ messages in thread
From: Duy Nguyen @ 2018-08-24 14:42 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Junio C Hamano, Jonathan Nieder, Git Mailing List

On Fri, Aug 24, 2018 at 5:02 AM Jacob Keller <jacob.keller@gmail.com> wrote:
>
> On Thu, Aug 23, 2018 at 9:28 AM Junio C Hamano <gitster@pobox.com> wrote:
> > I think the above example forgets "-a" on the final "git commit"
> > step.  With it added, I can understand the concern (and I am sure
> > you would, too).
> >
> > The user is trying to add everything done in the working tree, and
> > "commit -a" would catch all changes to paths that were already
> > tracked, but a separate "add" is necessary for newly created paths.
> > But adding a new path means the index no longer matches HEAD, and
> > the "commit -a" at the final step sweeps the changes to already
> > tracked paths---failing that because there "already is something
> > staged" will break the workflow.
>
> Right. I think this would need to be able to understand the case of
> "different only by new files".

OK so the rules I'm going to try to implement is, if the  version in
the index is not the same as one in HEAD or one in worktree, then "git
commit -a" fails.

The unwritten line is, if the path in question does not exist in HEAD,
then the first condition "as one in HEAD" is dropped, naturally. This
addresses the "git add new-file" problem, but if you have made more
changes in new-file in worktree after "git add" and do "git commit -a"
then you still get rejected because it fails the second condition.

File removal should be considered as well. But I don't foresee any
problem there. Resolving merges, replacing higher stage entries with
stage 0 will be accepted at "git commit -a" as usual.

Let me know if we should tweak these rules (and how).
-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH/RFC] commit: new option to abort -a something is already staged
  2018-08-24 14:42               ` Duy Nguyen
@ 2018-08-24 23:23                 ` Jacob Keller
  0 siblings, 0 replies; 28+ messages in thread
From: Jacob Keller @ 2018-08-24 23:23 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Junio C Hamano, Jonathan Nieder, Git mailing list

On Fri, Aug 24, 2018 at 7:42 AM Duy Nguyen <pclouds@gmail.com> wrote:
>
> On Fri, Aug 24, 2018 at 5:02 AM Jacob Keller <jacob.keller@gmail.com> wrote:
> >
> > On Thu, Aug 23, 2018 at 9:28 AM Junio C Hamano <gitster@pobox.com> wrote:
> > > I think the above example forgets "-a" on the final "git commit"
> > > step.  With it added, I can understand the concern (and I am sure
> > > you would, too).
> > >
> > > The user is trying to add everything done in the working tree, and
> > > "commit -a" would catch all changes to paths that were already
> > > tracked, but a separate "add" is necessary for newly created paths.
> > > But adding a new path means the index no longer matches HEAD, and
> > > the "commit -a" at the final step sweeps the changes to already
> > > tracked paths---failing that because there "already is something
> > > staged" will break the workflow.
> >
> > Right. I think this would need to be able to understand the case of
> > "different only by new files".
>
> OK so the rules I'm going to try to implement is, if the  version in
> the index is not the same as one in HEAD or one in worktree, then "git
> commit -a" fails.
>
> The unwritten line is, if the path in question does not exist in HEAD,
> then the first condition "as one in HEAD" is dropped, naturally. This
> addresses the "git add new-file" problem, but if you have made more
> changes in new-file in worktree after "git add" and do "git commit -a"
> then you still get rejected because it fails the second condition.
>
> File removal should be considered as well. But I don't foresee any
> problem there. Resolving merges, replacing higher stage entries with
> stage 0 will be accepted at "git commit -a" as usual.
>
> Let me know if we should tweak these rules (and how).
> --
> Duy

This seems reasonable to me. It might trigger a few issues with people
doing "git add new_file ; $EDITOR new_file ; git commit -a" but... I
know I've accidentally done "git commit -a" and had problems.

What about "git commit <file>"? That's another case where I've
accidentally lost some work for a similar reason.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-08-20 15:41 [PATCH/RFC] commit: new option to abort -a something is already staged Nguyễn Thái Ngọc Duy
                   ` (3 preceding siblings ...)
  2018-08-24  2:59 ` Jacob Keller
@ 2018-09-16  6:31 ` Nguyễn Thái Ngọc Duy
  2018-09-16  6:31   ` [PATCH v2 1/1] commit: do not clobber the index Nguyễn Thái Ngọc Duy
  2018-09-17 17:09   ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Junio C Hamano
  4 siblings, 2 replies; 28+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-09-16  6:31 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jonathan Nieder, Eric Sunshine, jacob.keller,
	Nguyễn Thái Ngọc Duy

This is about mixing "git add -p" and "git commit -a" (or "git commit
<path>") where you may accidentally lose staged changes. After the
discussion with Jonathan, I'm going with a bit different approach than
v1, this behavior now becomes default, and if the user wants the old
behavior back, they can use --clobber-index.

Another change is "git commit <path>" is covered as well, as pointed
out by Jacob.

I will need to add some test cases of course, if this direction is
still promising. One thing I'm not sure about is whether want to
deliberately clobber the index often, then perhaps we should add a
config key to bring the old behavior back.

Nguyễn Thái Ngọc Duy (1):
  commit: do not clobber the index

 Documentation/git-commit.txt     |  11 +++-
 builtin/commit.c                 | 105 ++++++++++++++++++++++++++++---
 cache.h                          |   1 +
 read-cache.c                     |  11 ++++
 t/t2201-add-update-typechange.sh |   2 +-
 t/t4015-diff-whitespace.sh       |   2 +-
 t/t7102-reset.sh                 |   2 +-
 t/t7500-commit.sh                |   2 +-
 t/t7502-commit.sh                |   4 +-
 t/t9350-fast-export.sh           |   2 +-
 10 files changed, 126 insertions(+), 16 deletions(-)

-- 
2.19.0.rc0.337.ge906d732e7


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/1] commit: do not clobber the index
  2018-09-16  6:31 ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Nguyễn Thái Ngọc Duy
@ 2018-09-16  6:31   ` Nguyễn Thái Ngọc Duy
  2018-09-17 17:09   ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Junio C Hamano
  1 sibling, 0 replies; 28+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-09-16  6:31 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jonathan Nieder, Eric Sunshine, jacob.keller,
	Nguyễn Thái Ngọc Duy

"git commit" can be used in two different workflows:

 - the user never bothers with "git add" and uses "git commit" for
   both adding content to the index and committing it

 - the user uses "git add" to carefully prepare what they want to
   commit, and "git commit" creates a new commit out of the current
   index. In this case "git commit" does not modify the index at all.

The user could switch between the two workflows of course, and
mistakes can happen that lead to content loss. Imagine the user has
prepared the index with 'git add -p' (second workflow), which makes
the index content different from both HEAD and worktree. If they
accidentally do "git commit -a" (first workflow) after that, all the
preparation is gone because "git commit" clobbers the index.

Prevent this by checking the staged content when runnning "commit -a"
or "commit <path>". If the staged content is different from both HEAD
and the worktree version, abort the commit. The user can still
override this by giving "--clobber-index" option.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-commit.txt     |  11 +++-
 builtin/commit.c                 | 105 ++++++++++++++++++++++++++++---
 cache.h                          |   1 +
 read-cache.c                     |  11 ++++
 t/t2201-add-update-typechange.sh |   2 +-
 t/t4015-diff-whitespace.sh       |   2 +-
 t/t7102-reset.sh                 |   2 +-
 t/t7500-commit.sh                |   2 +-
 t/t7502-commit.sh                |   4 +-
 t/t9350-fast-export.sh           |   2 +-
 10 files changed, 126 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index f970a43422..2d128527ec 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -31,8 +31,9 @@ The content to be added can be specified in several ways:
 
 3. by listing files as arguments to the 'commit' command
    (without --interactive or --patch switch), in which
-   case the commit will ignore changes staged in the index, and instead
-   record the current content of the listed files (which must already
+   case the commit will ignore changes staged in the index if
+   --clobber-index is specified, and instead record the current
+   content of the listed files (which must already
    be known to Git);
 
 4. by using the -a switch with the 'commit' command to automatically
@@ -274,6 +275,12 @@ FROM UPSTREAM REBASE" section in linkgit:git-rebase[1].)
 	already been staged. If used together with `--allow-empty`
 	paths are also not required, and an empty commit will be created.
 
+--clobber-index::
+	By default 'git commit' will not update the index if
+	non-conflicted staged content of a path is different from both
+	HEAD and the working tree. This option skips this check and
+	makes 'git commit' ignore content changes in the index.
+
 -u[<mode>]::
 --untracked-files[=<mode>]::
 	Show untracked files.
diff --git a/builtin/commit.c b/builtin/commit.c
index 0d9828e29e..a179b8a9e7 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -101,6 +101,7 @@ static int all, also, interactive, patch_interactive, only, amend, signoff;
 static int edit_flag = -1; /* unspecified */
 static int quiet, verbose, no_verify, allow_empty, dry_run, renew_authorship;
 static int config_commit_verbose = -1; /* unspecified */
+static int clobber_index;
 static int no_post_rewrite, allow_empty_message;
 static char *untracked_files_arg, *force_date, *ignore_submodule_arg, *ignored_arg;
 static char *sign_commit;
@@ -263,11 +264,78 @@ static int list_paths(struct string_list *list, const char *with_tree,
 	return ret;
 }
 
-static void add_remove_files(struct string_list *list)
+static void mark_staged_paths_cb(struct diff_queue_struct *q,
+				 struct diff_options *options,
+				 void *data)
+{
+	struct index_state *istate = data;
+	int i;
+
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p;
+		int pos;
+
+		p = q->queue[i];
+
+		switch (p->status) {
+		case DIFF_STATUS_ADDED:
+			break;
+
+		case DIFF_STATUS_MODIFIED:
+			/*
+			 * _MODIFIED could be either because of content or
+			 * mode change. Ignore mode change.
+			 */
+			if (oidcmp(&p->one->oid, &p->two->oid))
+				break;
+			continue;
+
+		default:
+			continue;
+		}
+
+		pos = index_name_pos(istate, p->two->path,
+				     strlen(p->two->path));
+		if (pos < 0)
+			BUG("entry '%s' => '%s' status '%c' not found in the index",
+			    p->one->path ? p->one->path : "(empty)",
+			    p->two->path ? p->two->path : "(empty)",
+			    p->status);
+		istate->cache[pos]->ce_flags |= CE_MATCHED;
+	}
+}
+
+static void mark_staged_paths(struct index_state *istate,
+			      const char *prefix,
+			      struct pathspec *pathspec)
+{
+	struct rev_info rev;
+	struct setup_revision_opt opt;
+	int i;
+
+	init_revisions(&rev, NULL);
+	memset(&opt, 0, sizeof(opt));
+	opt.def = "HEAD";
+	setup_revisions(0, NULL, &rev, &opt);
+
+	rev.diffopt.ita_invisible_in_index = 1;
+	rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
+	rev.diffopt.format_callback = mark_staged_paths_cb;
+	rev.diffopt.format_callback_data = istate;
+	copy_pathspec(&rev.prune_data, pathspec);
+
+	for (i = 0; i < istate->cache_nr; i++)
+		istate->cache[i]->ce_flags &= ~CE_MATCHED;
+
+	run_diff_index(&rev, 1);
+}
+
+static void add_remove_files(struct string_list *list, int flags)
 {
 	int i;
 	for (i = 0; i < list->nr; i++) {
 		struct stat st;
+		int pos;
 		struct string_list_item *p = &(list->items[i]);
 
 		/* p->util is skip-worktree */
@@ -275,10 +343,18 @@ static void add_remove_files(struct string_list *list)
 			continue;
 
 		if (!lstat(p->string, &st)) {
-			if (add_to_cache(p->string, &st, 0))
+			if (add_to_cache(p->string, &st, flags))
 				die(_("updating files failed"));
-		} else
-			remove_file_from_cache(p->string);
+			continue;
+		}
+
+		if (flags & ADD_CACHE_KEEP_CE_MATCHED &&
+		    (pos = index_name_pos(&the_index, p->string,
+					  strlen(p->string))) >= 0 &&
+		    the_index.cache[pos]->ce_flags & CE_MATCHED)
+			die(_("clobbering '%s' is not allowed"),
+			    p->string);
+		remove_file_from_cache(p->string);
 	}
 }
 
@@ -326,6 +402,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 	struct string_list partial = STRING_LIST_INIT_DUP;
 	struct pathspec pathspec;
 	int refresh_flags = REFRESH_QUIET;
+	unsigned int add_flags = 0;
 	const char *ret;
 
 	if (is_status)
@@ -372,6 +449,9 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 		goto out;
 	}
 
+	if (!clobber_index && !current_head)
+		clobber_index = 1;
+
 	/*
 	 * Non partial, non as-is commit.
 	 *
@@ -385,8 +465,14 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 	 * (B) on failure, rollback the real index.
 	 */
 	if (all || (also && pathspec.nr)) {
+		const char *add_prefix = also ? prefix : NULL;
+
 		hold_locked_index(&index_lock, LOCK_DIE_ON_ERROR);
-		add_files_to_cache(also ? prefix : NULL, &pathspec, 0);
+		if (!clobber_index) {
+			mark_staged_paths(&the_index, add_prefix, &pathspec);
+			add_flags |= ADD_CACHE_KEEP_CE_MATCHED;
+		}
+		add_files_to_cache(add_prefix, &pathspec, add_flags);
 		refresh_cache_or_die(refresh_flags);
 		update_main_cache_tree(WRITE_TREE_SILENT);
 		if (write_locked_index(&the_index, &index_lock, 0))
@@ -455,7 +541,11 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 		die(_("cannot read the index"));
 
 	hold_locked_index(&index_lock, LOCK_DIE_ON_ERROR);
-	add_remove_files(&partial);
+	if (!clobber_index) {
+		mark_staged_paths(&the_index, NULL, &pathspec);
+		add_flags |= ADD_CACHE_KEEP_CE_MATCHED;
+	}
+	add_remove_files(&partial, add_flags);
 	refresh_cache(REFRESH_QUIET);
 	update_main_cache_tree(WRITE_TREE_SILENT);
 	if (write_locked_index(&the_index, &index_lock, 0))
@@ -467,7 +557,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 				  LOCK_DIE_ON_ERROR);
 
 	create_base_index(current_head);
-	add_remove_files(&partial);
+	add_remove_files(&partial, 0);
 	refresh_cache(REFRESH_QUIET);
 
 	if (write_locked_index(&the_index, &false_lock, 0))
@@ -1477,6 +1567,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 		OPT_BOOL('o', "only", &only, N_("commit only specified files")),
 		OPT_BOOL('n', "no-verify", &no_verify, N_("bypass pre-commit and commit-msg hooks")),
 		OPT_BOOL(0, "dry-run", &dry_run, N_("show what would be committed")),
+		OPT_BOOL(0, "clobber-index", &clobber_index, N_("clobber the index")),
 		OPT_SET_INT(0, "short", &status_format, N_("show status concisely"),
 			    STATUS_FORMAT_SHORT),
 		OPT_BOOL(0, "branch", &s.show_branch, N_("show branch information")),
diff --git a/cache.h b/cache.h
index b1fd3d58ab..8718cc636b 100644
--- a/cache.h
+++ b/cache.h
@@ -738,6 +738,7 @@ extern int index_name_pos(const struct index_state *, const char *name, int name
 #define ADD_CACHE_JUST_APPEND 8		/* Append only; tree.c::read_tree() */
 #define ADD_CACHE_NEW_ONLY 16		/* Do not replace existing ones */
 #define ADD_CACHE_KEEP_CACHE_TREE 32	/* Do not invalidate cache-tree */
+#define ADD_CACHE_KEEP_CE_MATCHED 64	/* Do not change CE_MATCHED files */
 extern int add_index_entry(struct index_state *, struct cache_entry *ce, int option);
 extern void rename_index_entry_at(struct index_state *, int pos, const char *new_name);
 
diff --git a/read-cache.c b/read-cache.c
index 7b1354d759..6b912225d5 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -757,6 +757,17 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 			discard_cache_entry(ce);
 			return error("unable to index file %s", path);
 		}
+		if (flags & ADD_CACHE_KEEP_CE_MATCHED) {
+			struct cache_entry *ent;
+			int pos = index_name_pos(istate, path, namelen);
+
+			ent = (0 <= pos) ? istate->cache[pos] : NULL;
+			if (ent && ent->ce_flags & CE_MATCHED &&
+			    oidcmp(&ent->oid, &ce->oid)) {
+				discard_cache_entry(ce);
+				return error(_("clobbering '%s' is not allowed"), path);
+			}
+		}
 	} else
 		set_object_name_for_intent_to_add_entry(ce);
 
diff --git a/t/t2201-add-update-typechange.sh b/t/t2201-add-update-typechange.sh
index a4eec0a346..99bf3264bf 100755
--- a/t/t2201-add-update-typechange.sh
+++ b/t/t2201-add-update-typechange.sh
@@ -136,7 +136,7 @@ test_expect_success 'commit -a' '
 		rm -f ".git/index" &&
 		mv ".git/saved-index" ".git/index"
 	fi &&
-	git commit -m "second" -a &&
+	git commit --clobber-index -m "second" -a &&
 	git ls-files -s >actual &&
 	test_cmp expect-final actual &&
 	rm -f .git/index &&
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 35fc8b5c2a..e85789b95a 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -616,7 +616,7 @@ test_expect_success 'check staged with space before tab in indent (diff-index)'
 
 test_expect_success 'check with no whitespace errors (diff-tree)' '
 	echo "foo();" >x &&
-	git commit -m "new commit" x &&
+	git commit --clobber-index -m "new commit" x &&
 	git diff-tree --check HEAD^ HEAD
 '
 
diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
index 97be0d968d..dcb773edf4 100755
--- a/t/t7102-reset.sh
+++ b/t/t7102-reset.sh
@@ -223,7 +223,7 @@ EOF
 test_expect_success \
 	'changing files and redo the last commit should succeed' '
 	echo "3rd line 2nd file" >>secondfile &&
-	git commit -a -C ORIG_HEAD &&
+	git commit --clobber-index -a -C ORIG_HEAD &&
 	head4=$(git rev-parse --verify HEAD) &&
 	check_changes $head4 &&
 	test "$(git rev-parse ORIG_HEAD)" = \
diff --git a/t/t7500-commit.sh b/t/t7500-commit.sh
index 170b4810e0..6864c396d3 100755
--- a/t/t7500-commit.sh
+++ b/t/t7500-commit.sh
@@ -174,7 +174,7 @@ EOF
 
 test_expect_success '--signoff' '
 	echo "yet another content *narf*" >> foo &&
-	echo "zort" | git commit -s -F - foo &&
+	echo "zort" | git commit --clobber-index -s -F - foo &&
 	git cat-file commit HEAD | sed "1,/^\$/d" > output &&
 	test_cmp expect output
 '
diff --git a/t/t7502-commit.sh b/t/t7502-commit.sh
index ca4a740da0..c4f2423ec3 100755
--- a/t/t7502-commit.sh
+++ b/t/t7502-commit.sh
@@ -93,7 +93,7 @@ test_expect_success 'the basics' '
 	echo update added "commit is" file >"commit is" &&
 	echo also update another >not/forbid &&
 	test_tick &&
-	git commit -a -m "initial with -a" &&
+	git commit --clobber-index -a -m "initial with -a" &&
 
 	git cat-file blob HEAD:"commit is" >current.1 &&
 	git cat-file blob HEAD:not/forbid >current.2 &&
@@ -188,7 +188,7 @@ test_expect_success 'prepare file with comment line and trailing newlines'  '
 test_expect_success 'cleanup commit messages (verbatim option,-t)' '
 
 	echo >>negative &&
-	git commit --cleanup=verbatim --no-status -t expect -a &&
+	git commit --cleanup=verbatim --no-status -t expect -a --clobber-index &&
 	git cat-file -p HEAD |sed -e "1,/^\$/d" >actual &&
 	test_cmp expect actual
 
diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 6a392e87bc..c7aeccec96 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -480,7 +480,7 @@ test_expect_success 'avoid uninteresting refs' '
 	git tag v1.0 &&
 	git branch uninteresting &&
 	echo bump > file &&
-	git commit -a -m bump &&
+	git commit --clobber-index -a -m bump &&
 	git fast-export --import-marks=tmp-marks \
 		--export-marks=tmp-marks ^uninteresting ^v1.0 master > actual &&
 	test_cmp expected actual
-- 
2.19.0.rc0.337.ge906d732e7


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-16  6:31 ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Nguyễn Thái Ngọc Duy
  2018-09-16  6:31   ` [PATCH v2 1/1] commit: do not clobber the index Nguyễn Thái Ngọc Duy
@ 2018-09-17 17:09   ` Junio C Hamano
  2018-09-17 17:29     ` Duy Nguyen
  2018-09-18 19:33     ` Jacob Keller
  1 sibling, 2 replies; 28+ messages in thread
From: Junio C Hamano @ 2018-09-17 17:09 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Jonathan Nieder, Eric Sunshine, jacob.keller

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> This is about mixing "git add -p" and "git commit -a" (or "git commit
> <path>") where you may accidentally lose staged changes. After the
> discussion with Jonathan, I'm going with a bit different approach than
> v1, this behavior now becomes default, and if the user wants the old
> behavior back, they can use --clobber-index.
>
> Another change is "git commit <path>" is covered as well, as pointed
> out by Jacob.
>
> I will need to add some test cases of course, if this direction is
> still promising. One thing I'm not sure about is whether want to
> deliberately clobber the index often, then perhaps we should add a
> config key to bring the old behavior back.

It usually is safer (simply because you do not have to think about
it) to start a behaviour change like this as a strict opt-in to gain
confidence.

As I often see myself futzing with the same file, adding changes to
it incrementally, so that I can view progress in "diff --cached" and
"diff" output, it would be a serious usability regression if the
last step in the following sequence is rejected by default:

	edit X
	git add X
	git diff --cached X
	git diff X
	... repeat the above number of times ...
	git commit X ;# or "git commit -a" to finally conclude

On the other hand, if I am keeping some change that should never be
in a commit in the working tree file, and building the contents in
the index using "add -p" to incrementally, it would be the same
disaster as you are trying to prevent if I by mistake did a whole
path 'add', even if I catch myself doing so before running 'commit'
i.e.

	edit X
	git add -p X
	git diff --cached X
	git diff X
	... repeat the above number of times ...
	git add X ;# OOPS!
	git add . ;# OOPS! even worse!

Even though this does not involve "git commit -a" or "git commit X",
an unrecoverable damage that requires redoing the manual work is
already done.

The approach to check if the contents in the index matches that in
the HEAD per-path (i.e. "The contents we are adding to the index is
whole working tree contents for that path.  But the index already
has contents different from HEAD for the path---are we losing
information by doing this?"), is a very good one.  But for the
protection to be effective, I think "git commit" and "git add"
should be covered the same way, ideally with the same code and
possibly the same configuration knob and/or command line option to
control the behaviour.

If the information loss caused by the "add/commit X" or "add
-u/commit -a" is so serious that this new feature deserves to become
the default (which I do not yet think it is the case, by the way),
then we could even forbid "commit X" or "commit -a" when the paths
involved has difference between the index and the HEAD, without any
configuration knob or command line override for "commit", and then
tell the users to use "git add/rm" _with_ the override before coming
back to "git commit".

How should this new check intract with paths added with "add -N", by
the way?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-17 17:09   ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Junio C Hamano
@ 2018-09-17 17:29     ` Duy Nguyen
  2018-09-17 18:15       ` Jeff King
  2018-09-17 19:26       ` Junio C Hamano
  2018-09-18 19:33     ` Jacob Keller
  1 sibling, 2 replies; 28+ messages in thread
From: Duy Nguyen @ 2018-09-17 17:29 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jonathan Nieder, Eric Sunshine, Jacob Keller

On Mon, Sep 17, 2018 at 7:09 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
> > This is about mixing "git add -p" and "git commit -a" (or "git commit
> > <path>") where you may accidentally lose staged changes. After the
> > discussion with Jonathan, I'm going with a bit different approach than
> > v1, this behavior now becomes default, and if the user wants the old
> > behavior back, they can use --clobber-index.
> >
> > Another change is "git commit <path>" is covered as well, as pointed
> > out by Jacob.
> >
> > I will need to add some test cases of course, if this direction is
> > still promising. One thing I'm not sure about is whether want to
> > deliberately clobber the index often, then perhaps we should add a
> > config key to bring the old behavior back.
>
> It usually is safer (simply because you do not have to think about
> it) to start a behaviour change like this as a strict opt-in to gain
> confidence.

Heh. The RFC was opt-in. Jonathan suggested changing default behavior
and I went along just to see how far I could push it :)

> As I often see myself futzing with the same file, adding changes to
> it incrementally, so that I can view progress in "diff --cached" and
> "diff" output, it would be a serious usability regression if the
> last step in the following sequence is rejected by default:
>
>         edit X
>         git add X
>         git diff --cached X
>         git diff X
>         ... repeat the above number of times ...
>         git commit X ;# or "git commit -a" to finally conclude

But yes if there's a valid use case where this behavior change becomes
a problem, then opt-in makes more sense.

> On the other hand, if I am keeping some change that should never be
> in a commit in the working tree file, and building the contents in
> the index using "add -p" to incrementally, it would be the same
> disaster as you are trying to prevent if I by mistake did a whole
> path 'add', even if I catch myself doing so before running 'commit'
> i.e.
>
>         edit X
>         git add -p X
>         git diff --cached X
>         git diff X
>         ... repeat the above number of times ...
>         git add X ;# OOPS!
>         git add . ;# OOPS! even worse!
>
> Even though this does not involve "git commit -a" or "git commit X",
> an unrecoverable damage that requires redoing the manual work is
> already done.

I don't see a good way to get to recover this situation. I could go
back to the "index log" idea, where we keep a log of index changes (or
just "interesting" changes). That way there's no behavior change at
all. The user who accidentally updates/deletes something can always
retrieve the old content back (assuming that they realize quickly
since we can't keep very long log).

I've been thinking about allowing to undo worktree changes too (e.g.
accidental "git reset --hard") and this log can cover it as well.

The only downside is we need a new command for the UI (or perhaps I
can just add "git add --log" or something like that).

Should I just drop this patch and go that direction instead? More
work, but maybe better end result.

> How should this new check intract with paths added with "add -N", by
> the way?

Good point. The check here is basically "git diff --cached". I would
need to turn it to "git diff --cached --ita-invisible-in-index" to
make ita entries disappear.
-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-17 17:29     ` Duy Nguyen
@ 2018-09-17 18:15       ` Jeff King
  2018-09-17 18:41         ` Duy Nguyen
  2018-09-18 19:36         ` Jacob Keller
  2018-09-17 19:26       ` Junio C Hamano
  1 sibling, 2 replies; 28+ messages in thread
From: Jeff King @ 2018-09-17 18:15 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, Git Mailing List, Jonathan Nieder, Eric Sunshine,
	Jacob Keller

On Mon, Sep 17, 2018 at 07:29:26PM +0200, Duy Nguyen wrote:

> > On the other hand, if I am keeping some change that should never be
> > in a commit in the working tree file, and building the contents in
> > the index using "add -p" to incrementally, it would be the same
> > disaster as you are trying to prevent if I by mistake did a whole
> > path 'add', even if I catch myself doing so before running 'commit'
> > i.e.
> >
> >         edit X
> >         git add -p X
> >         git diff --cached X
> >         git diff X
> >         ... repeat the above number of times ...
> >         git add X ;# OOPS!
> >         git add . ;# OOPS! even worse!
> >
> > Even though this does not involve "git commit -a" or "git commit X",
> > an unrecoverable damage that requires redoing the manual work is
> > already done.
> 
> I don't see a good way to get to recover this situation. I could go
> back to the "index log" idea, where we keep a log of index changes (or
> just "interesting" changes). That way there's no behavior change at
> all. The user who accidentally updates/deletes something can always
> retrieve the old content back (assuming that they realize quickly
> since we can't keep very long log).

FWIW, I like that approach much better, since:

  1. It does not bother or restrict anybody in their workflow; instead,
     they pay the complexity price only when they know they have made a
     mistake.

  2. It covers many more cases (e.g., just doing the wrong thing via
     "add -p").

A naive index log would be pretty cheap in CPU, at least for POSIX-ish
systems. You could just hard link "index" to "index.N" before renaming
"index.lock" over "index". But I guess if you have a gigantic index,
that's less appealing. So maybe storing the equivalent of a "--raw" diff
between the two index states would make more sense (and after all, you
don't really need the stat-cache or cache-tree). It would cost more to
reconstruct the index on the fly, but then the point is that you would
create these logs a lot more than you access them.

> I've been thinking about allowing to undo worktree changes too (e.g.
> accidental "git reset --hard") and this log can cover it as well.

I like that, too. It's a little more costly just because it may involve
object-db writes, but I think in most cases it would be fine. I almost
always "git stash" away discarded changes these days instead of "git
reset --hard", because it effectively provides this kind of log.

> The only downside is we need a new command for the UI (or perhaps I
> can just add "git add --log" or something like that).

I think the reflog approach has been successful: give these intermediate
states a name. So in theory I could do something like:

  git checkout -p :@{2.minutes.ago}

though it would probably be useful to be able to walk the states, too,
just like we have "log --reflog-walk".

The syntax above is off-the-cuff (and based on the ":<stage>" index
syntax). I guess if we had a separate log for "stuff in the worktree
that got thrown away by reset" we might need a separate syntax.

> Should I just drop this patch and go that direction instead? More
> work, but maybe better end result.

I guess my whole response is a long-winded "yes". ;)

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-17 18:15       ` Jeff King
@ 2018-09-17 18:41         ` Duy Nguyen
  2018-09-18 17:35           ` Jeff King
  2018-09-18 19:36         ` Jacob Keller
  1 sibling, 1 reply; 28+ messages in thread
From: Duy Nguyen @ 2018-09-17 18:41 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Git Mailing List, Jonathan Nieder, Eric Sunshine,
	Jacob Keller

On Mon, Sep 17, 2018 at 8:15 PM Jeff King <peff@peff.net> wrote:
> On Mon, Sep 17, 2018 at 07:29:26PM +0200, Duy Nguyen wrote:
> > I don't see a good way to get to recover this situation. I could go
> > back to the "index log" idea, where we keep a log of index changes (or
> > just "interesting" changes). That way there's no behavior change at
> > all. The user who accidentally updates/deletes something can always
> > retrieve the old content back (assuming that they realize quickly
> > since we can't keep very long log).
>
> FWIW, I like that approach much better, since:
>
>   1. It does not bother or restrict anybody in their workflow; instead,
>      they pay the complexity price only when they know they have made a
>      mistake.
>
>   2. It covers many more cases (e.g., just doing the wrong thing via
>      "add -p").
>
> A naive index log would be pretty cheap in CPU, at least for POSIX-ish
> systems. You could just hard link "index" to "index.N" before renaming
> "index.lock" over "index". But I guess if you have a gigantic index,
> that's less appealing. So maybe storing the equivalent of a "--raw" diff
> between the two index states would make more sense (and after all, you
> don't really need the stat-cache or cache-tree). It would cost more to
> reconstruct the index on the fly, but then the point is that you would
> create these logs a lot more than you access them.

Yeah. The problem though is extracting the information out of these
index.N files.

> > I've been thinking about allowing to undo worktree changes too (e.g.
> > accidental "git reset --hard") and this log can cover it as well.
>
> I like that, too. It's a little more costly just because it may involve
> object-db writes, but I think in most cases it would be fine. I almost
> always "git stash" away discarded changes these days instead of "git
> reset --hard", because it effectively provides this kind of log.

Yeah the added cost is pretty much given. You want safety, you pay extra :)

> > The only downside is we need a new command for the UI (or perhaps I
> > can just add "git add --log" or something like that).
>
> I think the reflog approach has been successful: give these intermediate
> states a name. So in theory I could do something like:
>
>   git checkout -p :@{2.minutes.ago}
>
> though it would probably be useful to be able to walk the states, too,
> just like we have "log --reflog-walk".
>
> The syntax above is off-the-cuff (and based on the ":<stage>" index
> syntax). I guess if we had a separate log for "stuff in the worktree
> that got thrown away by reset" we might need a separate syntax.

I'm leaning towards reflog too. Not because of the syntax but because
of reusing the current code base. I don't have to worry about pruning,
walking, gc-ing... because it's pretty much already there. And the UI
is not so urgent since reflog file is very readable, early adopters
can just open the file and get the hash.

This also lets me handle logging worktree changes in the future
(keeping track of untracked files is impossible with the "index.N"
approach)

I'm trying to quickly make something that writes to
"$GIT_DIR/logs/index" and see how it goes. But yeah I'll probably drop
this patch. The ":@{2.minutes.ago}" just makes me like this direction
more, even though I don't know if I could even make that work.
-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-17 17:29     ` Duy Nguyen
  2018-09-17 18:15       ` Jeff King
@ 2018-09-17 19:26       ` Junio C Hamano
  2018-09-18 19:41         ` Jacob Keller
  1 sibling, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2018-09-17 19:26 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Jonathan Nieder, Eric Sunshine, Jacob Keller

Duy Nguyen <pclouds@gmail.com> writes:

> On Mon, Sep 17, 2018 at 7:09 PM Junio C Hamano <gitster@pobox.com> wrote:
>> ...
>> On the other hand, if I am keeping some change that should never be
>> in a commit in the working tree file, and building the contents in
>> the index using "add -p" to incrementally, it would be the same
>> disaster as you are trying to prevent if I by mistake did a whole
>> path 'add', even if I catch myself doing so before running 'commit'
>> i.e.
>>
>>         edit X
>>         git add -p X
>>         git diff --cached X
>>         git diff X
>>         ... repeat the above number of times ...
>>         git add X ;# OOPS!
>>         git add . ;# OOPS! even worse!
>>
>> Even though this does not involve "git commit -a" or "git commit X",
>> an unrecoverable damage that requires redoing the manual work is
>> already done.
>
> I don't see a good way to get to recover this situation. I could go
> back to the "index log" idea,...

FWIW, I didn't mean to say that we should give users a way to
recover.  Your "commit -a" or "commit $path" protection just
prevents the situation from happening, and I think it is sufficient.

The sole point I wanted to raise by bringing up the above was that
we should have the same degree of protection against "add $path" or
"add -u".

Of course, "index log" is interesting and it may even turn out to be
useful (I was skeptical about "reference log" the same way, but it
turned out to be useful without burdening the system too heavily),
and it may even remove the need for the "do not accidentally lose
information by adding more to the index" protection.  But until that
happens, if we are to have such a protection, we would wnat to give
the same degree of protection to "commit" and "add".

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-17 18:41         ` Duy Nguyen
@ 2018-09-18 17:35           ` Jeff King
  0 siblings, 0 replies; 28+ messages in thread
From: Jeff King @ 2018-09-18 17:35 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, Git Mailing List, Jonathan Nieder, Eric Sunshine,
	Jacob Keller

On Mon, Sep 17, 2018 at 08:41:36PM +0200, Duy Nguyen wrote:

> > I think the reflog approach has been successful: give these intermediate
> > states a name. So in theory I could do something like:
> >
> >   git checkout -p :@{2.minutes.ago}
> >
> > though it would probably be useful to be able to walk the states, too,
> > just like we have "log --reflog-walk".
> >
> > The syntax above is off-the-cuff (and based on the ":<stage>" index
> > syntax). I guess if we had a separate log for "stuff in the worktree
> > that got thrown away by reset" we might need a separate syntax.
> 
> I'm leaning towards reflog too. Not because of the syntax but because
> of reusing the current code base. I don't have to worry about pruning,
> walking, gc-ing... because it's pretty much already there. And the UI
> is not so urgent since reflog file is very readable, early adopters
> can just open the file and get the hash.

Ah, good point on pruning/gc. I had imagined a new "index log" format.
But really if you just turn the result into a tree, then it becomes just
another ref(log). That might be slightly less efficient, but the
flexibility and simplicity are probably worth it. Or maybe with
cache-tree it is not even less efficient.

So that definitely seems like the right direction.

> I'm trying to quickly make something that writes to
> "$GIT_DIR/logs/index" and see how it goes. But yeah I'll probably drop
> this patch. The ":@{2.minutes.ago}" just makes me like this direction
> more, even though I don't know if I could even make that work.

I think "index@{2.minutes.ago}" would almost work, but it would probably
complain about a reflog without a matching ref. That seems like it would
be pretty easy to work around in the reflog code. Or maybe even by
having a stash-like "index log" ref.

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-17 17:09   ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Junio C Hamano
  2018-09-17 17:29     ` Duy Nguyen
@ 2018-09-18 19:33     ` Jacob Keller
  1 sibling, 0 replies; 28+ messages in thread
From: Jacob Keller @ 2018-09-18 19:33 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Git mailing list, Jonathan Nieder, Eric Sunshine

On Mon, Sep 17, 2018 at 10:09 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> It usually is safer (simply because you do not have to think about
> it) to start a behaviour change like this as a strict opt-in to gain
> confidence.

I tend to agree, however.. in this case it could be considered safer
to err on the side of not throwing away the index which could have
crafted changes in it.

> The approach to check if the contents in the index matches that in
> the HEAD per-path (i.e. "The contents we are adding to the index is
> whole working tree contents for that path.  But the index already
> has contents different from HEAD for the path---are we losing
> information by doing this?"), is a very good one.  But for the
> protection to be effective, I think "git commit" and "git add"
> should be covered the same way, ideally with the same code and
> possibly the same configuration knob and/or command line option to
> control the behaviour.

Checking both commit and add makes sense to me.

>
> If the information loss caused by the "add/commit X" or "add
> -u/commit -a" is so serious that this new feature deserves to become
> the default (which I do not yet think it is the case, by the way),
> then we could even forbid "commit X" or "commit -a" when the paths
> involved has difference between the index and the HEAD, without any
> configuration knob or command line override for "commit", and then
> tell the users to use "git add/rm" _with_ the override before coming
> back to "git commit".

I was going to suggest we have some sort of reflog equivalent for the
index, but Duy seems to discuss that in a follow-on mail.

>
> How should this new check intract with paths added with "add -N", by
> the way?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-17 18:15       ` Jeff King
  2018-09-17 18:41         ` Duy Nguyen
@ 2018-09-18 19:36         ` Jacob Keller
  2018-09-18 23:19           ` Jeff King
  1 sibling, 1 reply; 28+ messages in thread
From: Jacob Keller @ 2018-09-18 19:36 UTC (permalink / raw)
  To: Jeff King
  Cc: Duy Nguyen, Junio C Hamano, Git mailing list, Jonathan Nieder,
	Eric Sunshine

On Mon, Sep 17, 2018 at 11:15 AM Jeff King <peff@peff.net> wrote:
>
> On Mon, Sep 17, 2018 at 07:29:26PM +0200, Duy Nguyen wrote:
>
> > > On the other hand, if I am keeping some change that should never be
> > > in a commit in the working tree file, and building the contents in
> > > the index using "add -p" to incrementally, it would be the same
> > > disaster as you are trying to prevent if I by mistake did a whole
> > > path 'add', even if I catch myself doing so before running 'commit'
> > > i.e.
> > >
> > >         edit X
> > >         git add -p X
> > >         git diff --cached X
> > >         git diff X
> > >         ... repeat the above number of times ...
> > >         git add X ;# OOPS!
> > >         git add . ;# OOPS! even worse!
> > >
> > > Even though this does not involve "git commit -a" or "git commit X",
> > > an unrecoverable damage that requires redoing the manual work is
> > > already done.
> >
> > I don't see a good way to get to recover this situation. I could go
> > back to the "index log" idea, where we keep a log of index changes (or
> > just "interesting" changes). That way there's no behavior change at
> > all. The user who accidentally updates/deletes something can always
> > retrieve the old content back (assuming that they realize quickly
> > since we can't keep very long log).
>
> FWIW, I like that approach much better, since:
>
>   1. It does not bother or restrict anybody in their workflow; instead,
>      they pay the complexity price only when they know they have made a
>      mistake.
>
>   2. It covers many more cases (e.g., just doing the wrong thing via
>      "add -p").
>

I also think this is a better approach for the same reasons.

> A naive index log would be pretty cheap in CPU, at least for POSIX-ish
> systems. You could just hard link "index" to "index.N" before renaming
> "index.lock" over "index". But I guess if you have a gigantic index,
> that's less appealing. So maybe storing the equivalent of a "--raw" diff
> between the two index states would make more sense (and after all, you
> don't really need the stat-cache or cache-tree). It would cost more to
> reconstruct the index on the fly, but then the point is that you would
> create these logs a lot more than you access them.
>
> > I've been thinking about allowing to undo worktree changes too (e.g.
> > accidental "git reset --hard") and this log can cover it as well.
>
> I like that, too. It's a little more costly just because it may involve
> object-db writes, but I think in most cases it would be fine. I almost
> always "git stash" away discarded changes these days instead of "git
> reset --hard", because it effectively provides this kind of log.
>

Obviously we do eventually turn the index into a tree, which is used
by the commit. Would it be possible to simply somehow store these
trees, and have commands which blow the tree away simply instead, save
it? I'm not sure how costly that is.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-17 19:26       ` Junio C Hamano
@ 2018-09-18 19:41         ` Jacob Keller
  2018-09-18 21:11           ` Eckhard Maaß
  0 siblings, 1 reply; 28+ messages in thread
From: Jacob Keller @ 2018-09-18 19:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Git mailing list, Jonathan Nieder, Eric Sunshine

On Mon, Sep 17, 2018 at 12:26 PM Junio C Hamano <gitster@pobox.com> wrote:
> FWIW, I didn't mean to say that we should give users a way to
> recover.  Your "commit -a" or "commit $path" protection just
> prevents the situation from happening, and I think it is sufficient.
>
> The sole point I wanted to raise by bringing up the above was that
> we should have the same degree of protection against "add $path" or
> "add -u".
>
> Of course, "index log" is interesting and it may even turn out to be
> useful (I was skeptical about "reference log" the same way, but it
> turned out to be useful without burdening the system too heavily),
> and it may even remove the need for the "do not accidentally lose
> information by adding more to the index" protection.  But until that
> happens, if we are to have such a protection, we would wnat to give
> the same degree of protection to "commit" and "add".

I think having both is good. There are a lot of ways to accidentally
throw away work, and it's pretty frustrating to have it happen. But
the reflog is also somewhat complicated, and I've definitely seen a
lot of developers who've never heard of it, and struggle with the
concept.

I personally think having the nice "it looks like you're about to
throw away all your changes, are you sure" style of protection using
something like --clobber-index is useful as a mode, even if we have an
index log of sorts. Having it be default helps new people, even if it
does get in the way of someone who knows what they're doing. Having it
be configurable, to me, sort of defeats the point, since it means
having to tell people to turn this on.

I personally don't mind having to type an extended option to clobber
when I know it's what I want, but I can see that being painful.

However, if we had a reflog for the index, this becomes less of a
problem since recovery is much easier.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-18 19:41         ` Jacob Keller
@ 2018-09-18 21:11           ` Eckhard Maaß
  0 siblings, 0 replies; 28+ messages in thread
From: Eckhard Maaß @ 2018-09-18 21:11 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Junio C Hamano, Duy Nguyen, Git mailing list, Jonathan Nieder,
	Eric Sunshine

On Tue, Sep 18, 2018 at 12:41:30PM -0700, Jacob Keller wrote:
> I think having both is good. There are a lot of ways to accidentally
> throw away work, and it's pretty frustrating to have it happen. But
> the reflog is also somewhat complicated, and I've definitely seen a
> lot of developers who've never heard of it, and struggle with the
> concept.

It's definitely good to improve on "oh, I screwed up - how can I
recover?" part. 

> I personally think having the nice "it looks like you're about to
> throw away all your changes, are you sure" style of protection using
> something like --clobber-index is useful as a mode, even if we have an
> index log of sorts.

I don't think it's an appealing design. If we have the index reflog
giving us an undo function, then it is (at least for me) irritating to
have additional protection. Furthermore, this protection, to not break
existing workflows, needs to be configurable. This comes with much
baggage on its own. Having then two things which seem to solve the same
problem setting, but somehow different, is in my opinion even more
confusing in the long run.

Greetings,
Eckhard

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-18 19:36         ` Jacob Keller
@ 2018-09-18 23:19           ` Jeff King
  2018-09-19 16:12             ` Duy Nguyen
  0 siblings, 1 reply; 28+ messages in thread
From: Jeff King @ 2018-09-18 23:19 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Duy Nguyen, Junio C Hamano, Git mailing list, Jonathan Nieder,
	Eric Sunshine

On Tue, Sep 18, 2018 at 12:36:06PM -0700, Jacob Keller wrote:

> > I like that, too. It's a little more costly just because it may involve
> > object-db writes, but I think in most cases it would be fine. I almost
> > always "git stash" away discarded changes these days instead of "git
> > reset --hard", because it effectively provides this kind of log.
> 
> Obviously we do eventually turn the index into a tree, which is used
> by the commit. Would it be possible to simply somehow store these
> trees, and have commands which blow the tree away simply instead, save
> it? I'm not sure how costly that is.

For an up-to-date index with the cache-tree extension, this should be
pretty cheap.  Even for a partially-staged index, where we already have
the blob in the object database, it should just incur the tree
computation (and parts you didn't touch would hopefully have an
up-to-date cache-tree).

Where it gets expensive is when you are throwing away working-tree
changes, and you have to re-hash those objects. Conceptually that's not
much worse than just calling `git add`, but there are corner cases
(e.g., imagine you keep a 1GB clone of another project in an ignored
directory, and then `git clean -dx` wants to `git add` all of it).

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-18 23:19           ` Jeff King
@ 2018-09-19 16:12             ` Duy Nguyen
  2018-09-19 16:16               ` Jeff King
  0 siblings, 1 reply; 28+ messages in thread
From: Duy Nguyen @ 2018-09-19 16:12 UTC (permalink / raw)
  To: Jeff King
  Cc: Jacob Keller, Junio C Hamano, Git Mailing List, Jonathan Nieder,
	Eric Sunshine

On Wed, Sep 19, 2018 at 1:19 AM Jeff King <peff@peff.net> wrote:
>
> On Tue, Sep 18, 2018 at 12:36:06PM -0700, Jacob Keller wrote:
>
> > > I like that, too. It's a little more costly just because it may involve
> > > object-db writes, but I think in most cases it would be fine. I almost
> > > always "git stash" away discarded changes these days instead of "git
> > > reset --hard", because it effectively provides this kind of log.
> >
> > Obviously we do eventually turn the index into a tree, which is used
> > by the commit. Would it be possible to simply somehow store these
> > trees, and have commands which blow the tree away simply instead, save
> > it? I'm not sure how costly that is.
>
> For an up-to-date index with the cache-tree extension, this should be
> pretty cheap.  Even for a partially-staged index, where we already have
> the blob in the object database, it should just incur the tree
> computation (and parts you didn't touch would hopefully have an
> up-to-date cache-tree).

Just FYI. I wanted to get at least pruning support in place before
posting another RFC. But right now I'm going without cache-tree.
Whenever a file is updated, the _blob_ hashes are recorded in the
index log (one line per update, same as reflog format) and the
description part of the line is the updated path.

This is simpler to do, and we can still reconstruct a tree/commit in
memory if we need to (but not whole tree snapshots; but I don't think
that would be very useful). But maybe cache-tree approach is better.
We will see.
-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content
  2018-09-19 16:12             ` Duy Nguyen
@ 2018-09-19 16:16               ` Jeff King
  0 siblings, 0 replies; 28+ messages in thread
From: Jeff King @ 2018-09-19 16:16 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Jacob Keller, Junio C Hamano, Git Mailing List, Jonathan Nieder,
	Eric Sunshine

On Wed, Sep 19, 2018 at 06:12:23PM +0200, Duy Nguyen wrote:

> On Wed, Sep 19, 2018 at 1:19 AM Jeff King <peff@peff.net> wrote:
> >
> > On Tue, Sep 18, 2018 at 12:36:06PM -0700, Jacob Keller wrote:
> >
> > > > I like that, too. It's a little more costly just because it may involve
> > > > object-db writes, but I think in most cases it would be fine. I almost
> > > > always "git stash" away discarded changes these days instead of "git
> > > > reset --hard", because it effectively provides this kind of log.
> > >
> > > Obviously we do eventually turn the index into a tree, which is used
> > > by the commit. Would it be possible to simply somehow store these
> > > trees, and have commands which blow the tree away simply instead, save
> > > it? I'm not sure how costly that is.
> >
> > For an up-to-date index with the cache-tree extension, this should be
> > pretty cheap.  Even for a partially-staged index, where we already have
> > the blob in the object database, it should just incur the tree
> > computation (and parts you didn't touch would hopefully have an
> > up-to-date cache-tree).
> 
> Just FYI. I wanted to get at least pruning support in place before
> posting another RFC. But right now I'm going without cache-tree.
> Whenever a file is updated, the _blob_ hashes are recorded in the
> index log (one line per update, same as reflog format) and the
> description part of the line is the updated path.
> 
> This is simpler to do, and we can still reconstruct a tree/commit in
> memory if we need to (but not whole tree snapshots; but I don't think
> that would be very useful). But maybe cache-tree approach is better.
> We will see.

Oh, interesting. Sort of a hybrid approach. :) That makes sense, and I
don't see any reason it wouldn't work.

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2018-09-19 16:16 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-20 15:41 [PATCH/RFC] commit: new option to abort -a something is already staged Nguyễn Thái Ngọc Duy
2018-08-20 15:55 ` Junio C Hamano
2018-08-20 17:48 ` Eric Sunshine
2018-08-20 19:30 ` Jonathan Nieder
2018-08-21 14:43   ` Duy Nguyen
2018-08-23  2:11     ` Jonathan Nieder
2018-08-23  2:15       ` Jonathan Nieder
2018-08-23 14:49         ` Duy Nguyen
2018-08-23 15:28           ` Junio C Hamano
2018-08-24  3:02             ` Jacob Keller
2018-08-24 14:42               ` Duy Nguyen
2018-08-24 23:23                 ` Jacob Keller
2018-08-24  2:59 ` Jacob Keller
2018-09-16  6:31 ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Nguyễn Thái Ngọc Duy
2018-09-16  6:31   ` [PATCH v2 1/1] commit: do not clobber the index Nguyễn Thái Ngọc Duy
2018-09-17 17:09   ` [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content Junio C Hamano
2018-09-17 17:29     ` Duy Nguyen
2018-09-17 18:15       ` Jeff King
2018-09-17 18:41         ` Duy Nguyen
2018-09-18 17:35           ` Jeff King
2018-09-18 19:36         ` Jacob Keller
2018-09-18 23:19           ` Jeff King
2018-09-19 16:12             ` Duy Nguyen
2018-09-19 16:16               ` Jeff King
2018-09-17 19:26       ` Junio C Hamano
2018-09-18 19:41         ` Jacob Keller
2018-09-18 21:11           ` Eckhard Maaß
2018-09-18 19:33     ` Jacob Keller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).