All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Make stash a builtin
@ 2016-01-28 19:38 Matthias Aßhauer
  2016-01-28 20:36 ` [PATCH 1/2] stash--helper: implement "git stash--helper" Matthias Asshauer
  0 siblings, 1 reply; 15+ messages in thread
From: Matthias Aßhauer @ 2016-01-28 19:38 UTC (permalink / raw)
  To: git; +Cc: johannes.schindelin

These are the first two Patches in an upcomming series of Patches to
convert git stash into a builtin command. This is mainly based on the
general performance of scripts running inside an interpreter (bash in
this case) running inside a POSIX emulation layer on Windows. Shell,
perl and python scripts are generaly a lot faster on Unix-like systems
compared to Windows systems. That does not mean that Unix-like systems
won't benefit from more native Git commands, but the effect is bigger
on Git for Windows.

These two patches port over the core non-patch part of git-stash into
native code using a separate helper command. This helper command is
intended as a temporary meassure and as such it's subject to change.
For this reason, I did not implement new regression tests, documentation
or localizations for this command.

I meassured the changes in excecution time for the stash related
regression tests on the same hardware running Windows 8.1 and
Kubuntu 15.10. Each result is the difference between the average of five
meassurements (six on Linux, because I lost count on the first run of
meassurements) each before and after these changes. I meassured the
following changes:

Windows:

t3903
real -5,10%
user -0,94%
sys +0,16% (10ms)

t3904
real -0,30%
user -2,98% (20ms)
sys +5,03%

t3905
real -4,03%
user -8,13%
sys +17,42%

t3906
real -2,57%
user +1,94%
sys +1,59%

Linux:

t3903
real +0,63%
user +10,87% (3ms)
sys +4,29% (4ms)

t3904
real -7,29%
user -30,61%
sys +5,77% (4ms)

t3905
real -7,29%
user -33,33% (2ms)
sys +20% (2ms)

t3906
real -0,88%
user -1,08% (1ms)
sys -2,22%

I added the asolute times where I think the difference is below the
meassurement precission (4ms on Linux) and on the two lowest absolute
differences on windows. A full log of all meassurement runs is available
at https://gist.github.com/rimrul/82adf3b368ed633263d2. Please note that
according to Johannes Schindelin, maintainer of Git for Windows, the
meassuring of sys time on Windows is unreliable. With that in mind,
in summary this is a slight increase in performance on Linux, and a more
noticeable increase on Windows.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 2/2] stash: use "stash--helper"
  2016-01-28 20:36 ` [PATCH 1/2] stash--helper: implement "git stash--helper" Matthias Asshauer
@ 2016-01-28 20:36   ` Matthias Asshauer
  2016-01-28 20:59     ` Stefan Beller
  2016-01-28 23:06   ` [PATCH 1/2] stash--helper: implement "git stash--helper" Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Matthias Asshauer @ 2016-01-28 20:36 UTC (permalink / raw)
  To: git

From: Matthias Aßhauer <mha1993@live.de>

Use the new "git stash--helper" builtin. It should be faster than the old shell code and is a first step to move
more shell code to C.

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
---
 git-stash.sh | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/git-stash.sh b/git-stash.sh
index c7c65e2..973c77b 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -112,15 +112,7 @@ create_stash () {
 	then
 
 		# state of the working tree
-		w_tree=$( (
-			git read-tree --index-output="$TMPindex" -m $i_tree &&
-			GIT_INDEX_FILE="$TMPindex" &&
-			export GIT_INDEX_FILE &&
-			git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
-			git update-index -z --add --remove --stdin <"$TMP-stagenames" &&
-			git write-tree &&
-			rm -f "$TMPindex"
-		) ) ||
+		w_tree=$(git stash--helper --non-patch "$TMPindex" $i_tree) ||
 			die "$(gettext "Cannot save the current worktree state")"
 
 	else

--
https://github.com/git/git/pull/191

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 1/2] stash--helper: implement "git stash--helper"
  2016-01-28 19:38 [PATCH 0/2] Make stash a builtin Matthias Aßhauer
@ 2016-01-28 20:36 ` Matthias Asshauer
  2016-01-28 20:36   ` [PATCH 2/2] stash: use "stash--helper" Matthias Asshauer
  2016-01-28 23:06   ` [PATCH 1/2] stash--helper: implement "git stash--helper" Junio C Hamano
  0 siblings, 2 replies; 15+ messages in thread
From: Matthias Asshauer @ 2016-01-28 20:36 UTC (permalink / raw)
  To: git

From: Matthias Aßhauer <mha1993@live.de>

This patch implements a new "git stash--helper" builtin plumbing
command that will be used to migrate "git-stash.sh" to C.

We start by implementing only the "--non-patch" option that will
handle the core part of the non-patch stashing.

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
---
 Makefile                |  2 ++
 builtin.h               |  1 +
 builtin/stash--helper.c | 13 ++++++++++
 git.c                   |  1 +
 stash.c                 | 65 +++++++++++++++++++++++++++++++++++++++++++++++++
 stash.h                 | 11 +++++++++
 6 files changed, 93 insertions(+)
 create mode 100644 builtin/stash--helper.c
 create mode 100644 stash.c
 create mode 100644 stash.h

diff --git a/Makefile b/Makefile
index fc2f1ab..88c07ea 100644
--- a/Makefile
+++ b/Makefile
@@ -792,6 +792,7 @@ LIB_OBJS += shallow.o
 LIB_OBJS += sideband.o
 LIB_OBJS += sigchain.o
 LIB_OBJS += split-index.o
+LIB_OBJS += stash.o
 LIB_OBJS += strbuf.o
 LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
@@ -913,6 +914,7 @@ BUILTIN_OBJS += builtin/send-pack.o
 BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
+BUILTIN_OBJS += builtin/stash--helper.o
 BUILTIN_OBJS += builtin/stripspace.o
 BUILTIN_OBJS += builtin/submodule--helper.o
 BUILTIN_OBJS += builtin/symbolic-ref.o
diff --git a/builtin.h b/builtin.h
index 6b95006..f1c8b39 100644
--- a/builtin.h
+++ b/builtin.h
@@ -118,6 +118,7 @@ extern int cmd_send_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
+extern int cmd_stash__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_status(int argc, const char **argv, const char *prefix);
 extern int cmd_stripspace(int argc, const char **argv, const char *prefix);
 extern int cmd_submodule__helper(int argc, const char **argv, const char *prefix);
diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
new file mode 100644
index 0000000..542e782
--- /dev/null
+++ b/builtin/stash--helper.c
@@ -0,0 +1,13 @@
+#include "../stash.h"
+#include <string.h>
+
+static const char builtin_stash__helper_usage[] = {
+	"Usage: git stash--helper --non-patch <tmp_indexfile> <i_tree>"
+};
+
+int cmd_stash__helper(int argc, const char **argv, const char *prefix)
+{
+	if (argc == 4 && !strcmp("--non-patch", argv[1]))
+		return stash_non_patch(argv[2], argv[3], prefix);
+	usage(builtin_stash__helper_usage);
+}
diff --git a/git.c b/git.c
index da278c3..9829ee8 100644
--- a/git.c
+++ b/git.c
@@ -470,6 +470,7 @@ static struct cmd_struct commands[] = {
 	{ "show-branch", cmd_show_branch, RUN_SETUP },
 	{ "show-ref", cmd_show_ref, RUN_SETUP },
 	{ "stage", cmd_add, RUN_SETUP | NEED_WORK_TREE },
+	{ "stash--helper", cmd_stash__helper, RUN_SETUP | NEED_WORK_TREE },
 	{ "status", cmd_status, RUN_SETUP | NEED_WORK_TREE },
 	{ "stripspace", cmd_stripspace },
 	{ "submodule--helper", cmd_submodule__helper, RUN_SETUP },
diff --git a/stash.c b/stash.c
new file mode 100644
index 0000000..c3d6e67
--- /dev/null
+++ b/stash.c
@@ -0,0 +1,65 @@
+#include "stash.h"
+#include "strbuf.h"
+
+static int prepare_update_index_argv(struct argv_array *args,
+	struct strbuf *buf)
+{
+	struct strbuf **bufs, **b;
+
+	bufs = strbuf_split(buf, '\0');
+	for (b = bufs; *b; b++)
+		argv_array_pushf(args, "%s", (*b)->buf);
+	argv_array_push(args, "--");
+	strbuf_list_free(bufs);
+
+	return 0;
+}
+
+int stash_non_patch(const char *tmp_indexfile, const char *i_tree,
+	const char *prefix)
+{
+	int result;
+	struct child_process read_tree = CHILD_PROCESS_INIT;
+	struct child_process diff = CHILD_PROCESS_INIT;
+	struct child_process update_index = CHILD_PROCESS_INIT;
+	struct child_process write_tree = CHILD_PROCESS_INIT;
+	struct strbuf buf = STRBUF_INIT;
+
+	argv_array_push(&read_tree.args, "read-tree");
+	argv_array_pushf(&read_tree.args, "--index-output=%s", tmp_indexfile);
+	argv_array_pushl(&read_tree.args, "-m", i_tree, NULL);
+
+	argv_array_pushl(&diff.args, "diff", "--name-only", "-z", "HEAD", "--",
+		NULL);
+
+	argv_array_pushl(&update_index.args, "update-index", "--add",
+		"--remove", NULL);
+
+	argv_array_push(&write_tree.args, "write-tree");
+
+	read_tree.env =
+		diff.env =
+		update_index.env =
+		write_tree.env = prefix;
+
+	read_tree.use_shell =
+		diff.use_shell =
+		update_index.use_shell =
+		write_tree.use_shell = 1;
+
+	read_tree.git_cmd =
+		diff.git_cmd =
+		update_index.git_cmd =
+		write_tree.git_cmd = 1;
+
+	result = run_command(&read_tree) ||
+		setenv("GIT_INDEX_FILE", tmp_indexfile, 1) ||
+		capture_command(&diff, &buf, 0) ||
+		prepare_update_index_argv(&update_index.args, &buf) ||
+		run_command(&update_index) ||
+		run_command(&write_tree) ||
+		remove(tmp_indexfile);
+
+	strbuf_release(&buf);
+	return result;
+}
diff --git a/stash.h b/stash.h
new file mode 100644
index 0000000..0880456
--- /dev/null
+++ b/stash.h
@@ -0,0 +1,11 @@
+#ifndef STASH_H
+#define STASH_H
+
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "run-command.h"
+
+extern int stash_non_patch(const char *tmp_indexfile, const char *i_tree,
+	const char *prefix);
+
+#endif /* STASH_H */

--
https://github.com/git/git/pull/191

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] stash: use "stash--helper"
  2016-01-28 20:36   ` [PATCH 2/2] stash: use "stash--helper" Matthias Asshauer
@ 2016-01-28 20:59     ` Stefan Beller
  2016-01-28 21:25       ` AW: " Matthias Aßhauer
  2016-01-29 11:21       ` Thomas Gummerer
  0 siblings, 2 replies; 15+ messages in thread
From: Stefan Beller @ 2016-01-28 20:59 UTC (permalink / raw)
  To: Matthias Asshauer; +Cc: git

On Thu, Jan 28, 2016 at 12:36 PM, Matthias Asshauer <mha1993@live.de> wrote:
> From: Matthias Aßhauer <mha1993@live.de>
>
> Use the new "git stash--helper" builtin. It should be faster than the old shell code and is a first step to move
> more shell code to C.

You had some good measurements in the coverletter, which is not going to be
recorded in the projects history. This part however would be part of the commit.
So you could move the speed improvements here (as well as the other reasoning)
on why this is a good idea. :)

>
> Signed-off-by: Matthias Aßhauer <mha1993@live.de>
> ---
>  git-stash.sh | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/git-stash.sh b/git-stash.sh
> index c7c65e2..973c77b 100755
> --- a/git-stash.sh
> +++ b/git-stash.sh
> @@ -112,15 +112,7 @@ create_stash () {
>         then
>
>                 # state of the working tree
> -               w_tree=$( (
> -                       git read-tree --index-output="$TMPindex" -m $i_tree &&
> -                       GIT_INDEX_FILE="$TMPindex" &&
> -                       export GIT_INDEX_FILE &&
> -                       git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
> -                       git update-index -z --add --remove --stdin <"$TMP-stagenames" &&
> -                       git write-tree &&
> -                       rm -f "$TMPindex"
> -               ) ) ||
> +               w_tree=$(git stash--helper --non-patch "$TMPindex" $i_tree) ||
>                         die "$(gettext "Cannot save the current worktree state")"
>
>         else
>
> --
> https://github.com/git/git/pull/191

Oh I see you're using the pull-request to email translator, cool!

> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* AW: [PATCH 2/2] stash: use "stash--helper"
  2016-01-28 20:59     ` Stefan Beller
@ 2016-01-28 21:25       ` Matthias Aßhauer
  2016-01-28 21:41         ` Stefan Beller
  2016-01-29 11:21       ` Thomas Gummerer
  1 sibling, 1 reply; 15+ messages in thread
From: Matthias Aßhauer @ 2016-01-28 21:25 UTC (permalink / raw)
  To: 'Stefan Beller'; +Cc: git

> You had some good measurements in the coverletter, which is not going to be recorded in the projects history. This part however would be part of the commit.
> So you could move the speed improvements here (as well as the other reasoning) on why this is a good idea. :)

I considered that, but I thought it would inflate the size of the commit message quite a bit and represents a  pretty temporary information as I'm planning to port more code. Any further progression on this would make the old meassurements kind of obsolete IMHO. I decided to move it to the coverletter, because it is only valid information if you consider both commits. If the general opinion on here is that I should add it to the commit message though, I'll gladly update it.

>> https://github.com/git/git/pull/191
>
> Oh I see you're using the pull-request to email translator, cool! 

Yes, I did. It definitly makes things easier if you are not used to mailing lists, but it was also a bit of a kerfuffle. I tried to start working on coverletter support, but I couldn't get it to accept the amazon SES credentials I provided. I ended up manually submiting the coverletter. It also didn't like my name.

Thank you for your quick feedback. 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] stash: use "stash--helper"
  2016-01-28 21:25       ` AW: " Matthias Aßhauer
@ 2016-01-28 21:41         ` Stefan Beller
  2016-01-28 23:28           ` Roberto Tyley
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Beller @ 2016-01-28 21:41 UTC (permalink / raw)
  To: Matthias Aßhauer; +Cc: git, Roberto Tyley

On Thu, Jan 28, 2016 at 1:25 PM, Matthias Aßhauer <mha1993@live.de> wrote:
>> You had some good measurements in the coverletter, which is not going to be recorded in the projects history. This part however would be part of the commit.
>> So you could move the speed improvements here (as well as the other reasoning) on why this is a good idea. :)
>
> I considered that, but I thought it would inflate the size of the commit message quite a bit and represents a  pretty temporary information as I'm planning to port more code.

No worries about too large commit messages. ;) See
dcd1742e56ebb944c4ff62346da4548e1e3be675 as an example for commit
message per code raio what Jeff usually produces. :)

> Any further progression on this would make the old meassurements kind of obsolete IMHO.

Well it records that this specific step was beneficial, too, on the
platforms you measured on. If it turns out to there is a regression
after you rewrote lots of code, it is still traceable that this commit
was done in good faith.

> I decided to move it to the coverletter, because it is only valid information if you consider both commits. If the general opinion on here is that I should add it to the commit message though, I'll gladly update it.

Heh, true. However you enable the speedup in the second patch. If you
were to apply only the first (add the helper), you'd not see the
difference, so maybe it's worth adding it to the second commit
message.

>
>>> https://github.com/git/git/pull/191
>>
>> Oh I see you're using the pull-request to email translator, cool!
>
> Yes, I did. It definitly makes things easier if you are not used to mailing lists, but it was also a bit of a kerfuffle. I tried to start working on coverletter support, but I couldn't get it to accept the amazon SES credentials I provided. I ended up manually submiting the coverletter. It also didn't like my name.

Not sure if Roberto, the creator of that tool, follows the mailing
list.  I cc'd him.

>
> Thank you for your quick feedback.
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] stash--helper: implement "git stash--helper"
  2016-01-28 20:36 ` [PATCH 1/2] stash--helper: implement "git stash--helper" Matthias Asshauer
  2016-01-28 20:36   ` [PATCH 2/2] stash: use "stash--helper" Matthias Asshauer
@ 2016-01-28 23:06   ` Junio C Hamano
  2016-01-29 19:32     ` AW: " Matthias Aßhauer
  1 sibling, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2016-01-28 23:06 UTC (permalink / raw)
  To: Matthias Asshauer; +Cc: git

Matthias Asshauer <mha1993@live.de> writes:

> From: Matthias Aßhauer <mha1993@live.de>
>
> This patch implements a new "git stash--helper" builtin plumbing
> command that will be used to migrate "git-stash.sh" to C.
>
> We start by implementing only the "--non-patch" option that will
> handle the core part of the non-patch stashing.
>
> Signed-off-by: Matthias Aßhauer <mha1993@live.de>
> ---
>  Makefile                |  2 ++
>  builtin.h               |  1 +
>  builtin/stash--helper.c | 13 ++++++++++
>  git.c                   |  1 +
>  stash.c                 | 65 +++++++++++++++++++++++++++++++++++++++++++++++++
>  stash.h                 | 11 +++++++++

Hmph, why not have everything inside builtin/stash--helper.c?  I do
not quite see a point of having the other two "library-ish" looking
files.

Also I personally feel that it would be easier to review when
these two patches are squashed into one.  I had to go back and forth
while reading the "non-patch" C function to see what logic from the
scripted version it is trying to replace.

> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
> new file mode 100644
> index 0000000..542e782
> --- /dev/null
> +++ b/builtin/stash--helper.c
> @@ -0,0 +1,13 @@
> +#include "../stash.h"
> +#include <string.h>
> +
> +static const char builtin_stash__helper_usage[] = {
> +	"Usage: git stash--helper --non-patch <tmp_indexfile> <i_tree>"
> +};
> +
> +int cmd_stash__helper(int argc, const char **argv, const char *prefix)
> +{
> +	if (argc == 4 && !strcmp("--non-patch", argv[1]))
> +		return stash_non_patch(argv[2], argv[3], prefix);
> +	usage(builtin_stash__helper_usage);
> +}

This is meant to replace this sequence:

	git read-tree --index-output="$TMPindex" -m $i_tree &&
	GIT_INDEX_FILE="$TMPindex" &&
	export GIT_INDEX_FILE &&
	git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
	git update-index -z --add --remove --stdin <"$TMP-stagenames" &&
	git write-tree &&
	rm -f "$TMPindex"

And outside of this section of the script, $TMPindex is never looked
at after this part finishes (which is obvious as the last thing the
section does is to remove it).  As you are rewriting this whole
section in C, you should wonder if you can do it without using a
temporary file in the filesystem at all.

> diff --git a/stash.c b/stash.c
> new file mode 100644
> index 0000000..c3d6e67
> --- /dev/null
> +++ b/stash.c
> @@ -0,0 +1,65 @@
> +#include "stash.h"
> +#include "strbuf.h"
> +
> +static int prepare_update_index_argv(struct argv_array *args,
> +	struct strbuf *buf)
> +{
> +	struct strbuf **bufs, **b;
> +
> +	bufs = strbuf_split(buf, '\0');
> +	for (b = bufs; *b; b++)
> +		argv_array_pushf(args, "%s", (*b)->buf);
> +	argv_array_push(args, "--");
> +	strbuf_list_free(bufs);
> +
> +	return 0;
> +}
> +
> +int stash_non_patch(const char *tmp_indexfile, const char *i_tree,
> +	const char *prefix)
> +{
> +	int result;
> +	struct child_process read_tree = CHILD_PROCESS_INIT;
> +	struct child_process diff = CHILD_PROCESS_INIT;
> +	struct child_process update_index = CHILD_PROCESS_INIT;
> +	struct child_process write_tree = CHILD_PROCESS_INIT;
> +	struct strbuf buf = STRBUF_INIT;
> +
> +	argv_array_push(&read_tree.args, "read-tree");
> +	argv_array_pushf(&read_tree.args, "--index-output=%s", tmp_indexfile);
> +	argv_array_pushl(&read_tree.args, "-m", i_tree, NULL);
> +
> +	argv_array_pushl(&diff.args, "diff", "--name-only", "-z", "HEAD", "--",
> +		NULL);
> +
> +	argv_array_pushl(&update_index.args, "update-index", "--add",
> +		"--remove", NULL);
> +
> +	argv_array_push(&write_tree.args, "write-tree");

Honestly, I had high hopes after seeing the "we are rewriting it in
C" but I am not enthused after seeing this.  I was hoping that the
rewritten version would do this all in-core, by calling these
functions that we already have:

 * read_index() to read the current index into the_index;

 * unpack_trees() to overlay the contents of i_tree to the contents
   of the index;

 * run_diff_index() to make the comparison between the result of the
   above and HEAD to collect the paths that are different (you'd use
   DIFF_FORMAT_CALLBACK mechanism to collect paths---see wt-status.c
   for existing code that already does this for hints);

 * add_to_index() to add or remove paths you found in the previous
   step to the in-core index;

 * write_cache_as_tree() to write out the resulting index of the
   above sequence of calls to a new tree object;

 * sha1_to_hex() to convert that resulting tree object name to hex
   format;

 * puts() to output the result.


Actually, because i_tree is coming from $(git write-tree) that was
done earlier on the current index, the unpack_trees() step should
not even be necessary.

The first three lines of the scripted version:

	git read-tree --index-output="$TMPindex" -m $i_tree &&
	GIT_INDEX_FILE="$TMPindex" &&
	export GIT_INDEX_FILE &&

are creating a new file $TMPindex on the filesystem while preserving
the cached stat info when it can, which is a glorified version of:

	cp -p $GIT_INDEX_FILE $TMPindex

In fact, versions of "git stash" before 3ba2e865 (stash: copy the
index using --index-output instead of cp -p, 2011-03-16) simply
did a "cp -p".

A C rewrite that works all in-core does not even need to write out a
temporary; it can just read the current index and do various things
up to writing the contents of the in-core index as a tree, and the
result would be correct as long as you do not forget *NOT* to write
the in-core index out to $GIT_INDEX_FILE.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] stash: use "stash--helper"
  2016-01-28 21:41         ` Stefan Beller
@ 2016-01-28 23:28           ` Roberto Tyley
  2016-01-29 19:37             ` AW: " Matthias Aßhauer
  0 siblings, 1 reply; 15+ messages in thread
From: Roberto Tyley @ 2016-01-28 23:28 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Matthias Aßhauer, git

On 28 January 2016 at 21:41, Stefan Beller <sbeller@google.com> wrote:
> On Thu, Jan 28, 2016 at 1:25 PM, Matthias Aßhauer <mha1993@live.de> wrote:
>>>> https://github.com/git/git/pull/191
>>>
>>> Oh I see you're using the pull-request to email translator, cool!

Yay!

>> Yes, I did. It definitly makes things easier if you are not used to mailing lists, but it was also a bit of a kerfuffle. I tried to start working on coverletter support, but I couldn't get it to accept the amazon SES credentials I provided. I ended up manually submiting the coverletter. It also didn't like my name.

Apologies for that - https://github.com/rtyley/submitgit/pull/26 has
just been deployed, which should resolve the encoding for non-US ASCII
characters - if you feel like submitting another patch, and want to
put the eszett back into your GitHub account display name, I'd be
interested to know how that goes.

> Not sure if Roberto, the creator of that tool, follows the mailing
> list.  I cc'd him.

I don't closely follow the mailing list, so thanks for the cc!

Roberto

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] stash: use "stash--helper"
  2016-01-28 20:59     ` Stefan Beller
  2016-01-28 21:25       ` AW: " Matthias Aßhauer
@ 2016-01-29 11:21       ` Thomas Gummerer
  2016-01-29 18:34         ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Thomas Gummerer @ 2016-01-29 11:21 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Matthias Asshauer, git

On 01/28, Stefan Beller wrote:
> On Thu, Jan 28, 2016 at 12:36 PM, Matthias Asshauer <mha1993@live.de> wrote:
> > From: Matthias Aßhauer <mha1993@live.de>
> >
> > Use the new "git stash--helper" builtin. It should be faster than the old shell code and is a first step to move
> > more shell code to C.
>
> You had some good measurements in the coverletter, which is not going to be
> recorded in the projects history. This part however would be part of the commit.
> So you could move the speed improvements here (as well as the other reasoning)
> on why this is a good idea. :)

In addition it would be nice to add a performance test in t/perf,
especially since it seems further improvements are planned.  That will
make it easy for everyone to reproduce the performance numbers for
different use-cases.

Matthias, feel free to squash the following (or something similar) in
when you re-roll.

diff --git a/t/perf/p3000-stash.sh b/t/perf/p3000-stash.sh
new file mode 100755
index 0000000..e6e1153
--- /dev/null
+++ b/t/perf/p3000-stash.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+
+test_description="Test performance of git stash"
+
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+file=$(git ls-files | tail -n 30 | head -1)
+
+test_expect_success "prepare repository" "
+	echo x >$file
+"
+
+test_perf "stash/stash pop" "
+	git stash &&
+	git stash pop
+"
+
+test_done

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] stash: use "stash--helper"
  2016-01-29 11:21       ` Thomas Gummerer
@ 2016-01-29 18:34         ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2016-01-29 18:34 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Stefan Beller, Matthias Asshauer, git

Thomas Gummerer <t.gummerer@gmail.com> writes:

> Matthias, feel free to squash the following (or something similar) in
> when you re-roll.
>
> diff --git a/t/perf/p3000-stash.sh b/t/perf/p3000-stash.sh
> new file mode 100755
> index 0000000..e6e1153
> --- /dev/null
> +++ b/t/perf/p3000-stash.sh
> @@ -0,0 +1,20 @@
> +#!/bin/sh
> +
> +test_description="Test performance of git stash"
> +
> +. ./perf-lib.sh
> +
> +test_perf_default_repo
> +
> +file=$(git ls-files | tail -n 30 | head -1)

If you use "tail -n 30" not "tail -30", which is good manners, you
would want to be consistent and say "head -n 1".

> +
> +test_expect_success "prepare repository" "
> +	echo x >$file
> +"
> +
> +test_perf "stash/stash pop" "
> +	git stash &&
> +	git stash pop
> +"
> +
> +test_done

^ permalink raw reply	[flat|nested] 15+ messages in thread

* AW: [PATCH 1/2] stash--helper: implement "git stash--helper"
  2016-01-28 23:06   ` [PATCH 1/2] stash--helper: implement "git stash--helper" Junio C Hamano
@ 2016-01-29 19:32     ` Matthias Aßhauer
  2016-01-29 19:58       ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Matthias Aßhauer @ 2016-01-29 19:32 UTC (permalink / raw)
  To: git
  Cc: 'Junio C Hamano', 'Thomas Gummerer',
	'Stefan Beller'

> Hmph, why not have everything inside builtin/stash--helper.c?  I do not quite see a point of having the other two "library-ish" looking files.
> 
> Also I personally feel that it would be easier to review when these two patches are squashed into one.  I had to go back and forth while reading the "non-patch" C function to see what logic from the scripted version it is trying to replace.

I can certainly do that.

> This is meant to replace this sequence:
>
>	git read-tree --index-output="$TMPindex" -m $i_tree &&
>	GIT_INDEX_FILE="$TMPindex" &&
>	export GIT_INDEX_FILE &&
>	git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
>	git update-index -z --add --remove --stdin <"$TMP-stagenames" &&
>	git write-tree &&
>	rm -f "$TMPindex"
>
> And outside of this section of the script, $TMPindex is never looked at after this part finishes (which is obvious as the last thing the section does is to remove it).  As you are rewriting this whole section in C, you should wonder if you can do it without using a temporary file in the filesystem at all.

I assumed the path to $TMPindex was still available in GIT_INDEX_FILE after this, as it gets exported, but I guess the surrounding $() imply a subshell.

> Honestly, I had high hopes after seeing the "we are rewriting it in C" but I am not enthused after seeing this.  I was hoping that the rewritten version would do this all in-core, by calling these functions that we already have:

These functions might be obvious to you, but I'm new to git's source code, so I worked with the things I found documented and those that were brought to my attention by Johannes Schindelin. The run-command API is documented and seemed to be the "official" method of calling any git commands from within native git  code. On the other hand, the documentation for the in-core index API boils down to "TODO: document this". This lead me to believe I did this the intended way and just calling random functions that sound  similar to the  original command may be frowned upon at best.

> A C rewrite that works all in-core does not even need to write out a temporary; it can just read the current index and do various things up to writing the contents of the in-core index as a tree, and the result would be correct as long as you do not forget *NOT* to write the in-core index out to $GIT_INDEX_FILE.

I'll be working on a v2 that incorporates the feedback from you, Thomas Gummerer  and Stefan Beller then. Further feedback is of course welcome.

-----Ursprüngliche Nachricht-----
Von: Junio C Hamano [mailto:gitster@pobox.com] 
Gesendet: Freitag, 29. Januar 2016 00:06
An: Matthias Asshauer <mha1993@live.de>
Cc: git@vger.kernel.org
Betreff: Re: [PATCH 1/2] stash--helper: implement "git stash--helper"

Matthias Asshauer <mha1993@live.de> writes:

> From: Matthias Aßhauer <mha1993@live.de>
>
> This patch implements a new "git stash--helper" builtin plumbing 
> command that will be used to migrate "git-stash.sh" to C.
>
> We start by implementing only the "--non-patch" option that will 
> handle the core part of the non-patch stashing.
>
> Signed-off-by: Matthias Aßhauer <mha1993@live.de>
> ---
>  Makefile                |  2 ++
>  builtin.h               |  1 +
>  builtin/stash--helper.c | 13 ++++++++++
>  git.c                   |  1 +
>  stash.c                 | 65 +++++++++++++++++++++++++++++++++++++++++++++++++
>  stash.h                 | 11 +++++++++

Hmph, why not have everything inside builtin/stash--helper.c?  I do not quite see a point of having the other two "library-ish" looking files.

Also I personally feel that it would be easier to review when these two patches are squashed into one.  I had to go back and forth while reading the "non-patch" C function to see what logic from the scripted version it is trying to replace.

> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c new 
> file mode 100644 index 0000000..542e782
> --- /dev/null
> +++ b/builtin/stash--helper.c
> @@ -0,0 +1,13 @@
> +#include "../stash.h"
> +#include <string.h>
> +
> +static const char builtin_stash__helper_usage[] = {
> +	"Usage: git stash--helper --non-patch <tmp_indexfile> <i_tree>"
> +};
> +
> +int cmd_stash__helper(int argc, const char **argv, const char 
> +*prefix) {
> +	if (argc == 4 && !strcmp("--non-patch", argv[1]))
> +		return stash_non_patch(argv[2], argv[3], prefix);
> +	usage(builtin_stash__helper_usage);
> +}

This is meant to replace this sequence:

	git read-tree --index-output="$TMPindex" -m $i_tree &&
	GIT_INDEX_FILE="$TMPindex" &&
	export GIT_INDEX_FILE &&
	git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
	git update-index -z --add --remove --stdin <"$TMP-stagenames" &&
	git write-tree &&
	rm -f "$TMPindex"

And outside of this section of the script, $TMPindex is never looked at after this part finishes (which is obvious as the last thing the section does is to remove it).  As you are rewriting this whole section in C, you should wonder if you can do it without using a temporary file in the filesystem at all.

> diff --git a/stash.c b/stash.c
> new file mode 100644
> index 0000000..c3d6e67
> --- /dev/null
> +++ b/stash.c
> @@ -0,0 +1,65 @@
> +#include "stash.h"
> +#include "strbuf.h"
> +
> +static int prepare_update_index_argv(struct argv_array *args,
> +	struct strbuf *buf)
> +{
> +	struct strbuf **bufs, **b;
> +
> +	bufs = strbuf_split(buf, '\0');
> +	for (b = bufs; *b; b++)
> +		argv_array_pushf(args, "%s", (*b)->buf);
> +	argv_array_push(args, "--");
> +	strbuf_list_free(bufs);
> +
> +	return 0;
> +}
> +
> +int stash_non_patch(const char *tmp_indexfile, const char *i_tree,
> +	const char *prefix)
> +{
> +	int result;
> +	struct child_process read_tree = CHILD_PROCESS_INIT;
> +	struct child_process diff = CHILD_PROCESS_INIT;
> +	struct child_process update_index = CHILD_PROCESS_INIT;
> +	struct child_process write_tree = CHILD_PROCESS_INIT;
> +	struct strbuf buf = STRBUF_INIT;
> +
> +	argv_array_push(&read_tree.args, "read-tree");
> +	argv_array_pushf(&read_tree.args, "--index-output=%s", tmp_indexfile);
> +	argv_array_pushl(&read_tree.args, "-m", i_tree, NULL);
> +
> +	argv_array_pushl(&diff.args, "diff", "--name-only", "-z", "HEAD", "--",
> +		NULL);
> +
> +	argv_array_pushl(&update_index.args, "update-index", "--add",
> +		"--remove", NULL);
> +
> +	argv_array_push(&write_tree.args, "write-tree");

Honestly, I had high hopes after seeing the "we are rewriting it in C" but I am not enthused after seeing this.  I was hoping that the rewritten version would do this all in-core, by calling these functions that we already have:

 * read_index() to read the current index into the_index;

 * unpack_trees() to overlay the contents of i_tree to the contents
   of the index;

 * run_diff_index() to make the comparison between the result of the
   above and HEAD to collect the paths that are different (you'd use
   DIFF_FORMAT_CALLBACK mechanism to collect paths---see wt-status.c
   for existing code that already does this for hints);

 * add_to_index() to add or remove paths you found in the previous
   step to the in-core index;

 * write_cache_as_tree() to write out the resulting index of the
   above sequence of calls to a new tree object;

 * sha1_to_hex() to convert that resulting tree object name to hex
   format;

 * puts() to output the result.


Actually, because i_tree is coming from $(git write-tree) that was done earlier on the current index, the unpack_trees() step should not even be necessary.

The first three lines of the scripted version:

	git read-tree --index-output="$TMPindex" -m $i_tree &&
	GIT_INDEX_FILE="$TMPindex" &&
	export GIT_INDEX_FILE &&

are creating a new file $TMPindex on the filesystem while preserving the cached stat info when it can, which is a glorified version of:

	cp -p $GIT_INDEX_FILE $TMPindex

In fact, versions of "git stash" before 3ba2e865 (stash: copy the index using --index-output instead of cp -p, 2011-03-16) simply did a "cp -p".

A C rewrite that works all in-core does not even need to write out a temporary; it can just read the current index and do various things up to writing the contents of the in-core index as a tree, and the result would be correct as long as you do not forget *NOT* to write the in-core index out to $GIT_INDEX_FILE.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* AW: [PATCH 2/2] stash: use "stash--helper"
  2016-01-28 23:28           ` Roberto Tyley
@ 2016-01-29 19:37             ` Matthias Aßhauer
  0 siblings, 0 replies; 15+ messages in thread
From: Matthias Aßhauer @ 2016-01-29 19:37 UTC (permalink / raw)
  To: git; +Cc: 'Roberto Tyley', 'Stefan Beller'

>>> Yes, I did. It definitly makes things easier if you are not used to mailing lists, but it was also a bit of a kerfuffle. I tried to start working on coverletter support, but I couldn't get it to accept the amazon SES credentials I provided. I ended up manually submiting the coverletter. It also didn't like my name.

> Apologies for that - https://github.com/rtyley/submitgit/pull/26 has just been deployed, which should resolve the encoding for non-US ASCII characters - if you feel like submitting another patch, and want to put the eszett back into your GitHub account display name, I'd be interested to know how that goes.

You don't need to apologise. I knew the tool was WIP and had seen the Isuue before Iattempted to submit this. I will try out the patched version when I submit v2 of this.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: AW: [PATCH 1/2] stash--helper: implement "git stash--helper"
  2016-01-29 19:32     ` AW: " Matthias Aßhauer
@ 2016-01-29 19:58       ` Junio C Hamano
  2016-02-01 23:36         ` Michael Blume
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2016-01-29 19:58 UTC (permalink / raw)
  To: Matthias Aßhauer
  Cc: git, 'Thomas Gummerer', 'Stefan Beller'

Matthias Aßhauer <mha1993@live.de> writes:

[administrivia: please wrap your lines to reasonable lengths]

>> Honestly, I had high hopes after seeing the "we are rewriting it
>> in C" but I am not enthused after seeing this.  I was hoping that
>> the rewritten version would do this all in-core, by calling these
>> functions that we already have:
>
> These functions might be obvious to you, but I'm new to git's
> source code, ...

Ahh, I didn't realize I was talking with somebody unfamiliar with
the codebase.  Apologies.

Nevertheless, the list of functions I gave are a good starting
point; they are widely used building blocks in the codebase.

> I'll be working on a v2 that incorporates the feedback from you,
> Thomas Gummerer and Stefan Beller then. Further feedback is of
> course welcome.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: AW: [PATCH 1/2] stash--helper: implement "git stash--helper"
  2016-01-29 19:58       ` Junio C Hamano
@ 2016-02-01 23:36         ` Michael Blume
  2016-02-01 23:40           ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Blume @ 2016-02-01 23:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matthias Aßhauer, Git List, Thomas Gummerer, Stefan Beller

On Fri, Jan 29, 2016 at 11:58 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Matthias Aßhauer <mha1993@live.de> writes:
>
> [administrivia: please wrap your lines to reasonable lengths]
>
>>> Honestly, I had high hopes after seeing the "we are rewriting it
>>> in C" but I am not enthused after seeing this.  I was hoping that
>>> the rewritten version would do this all in-core, by calling these
>>> functions that we already have:
>>
>> These functions might be obvious to you, but I'm new to git's
>> source code, ...
>
> Ahh, I didn't realize I was talking with somebody unfamiliar with
> the codebase.  Apologies.
>
> Nevertheless, the list of functions I gave are a good starting
> point; they are widely used building blocks in the codebase.
>
>> I'll be working on a v2 that incorporates the feedback from you,
>> Thomas Gummerer and Stefan Beller then. Further feedback is of
>> course welcome.
>
> Thanks.
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Maybe this isn't important given that it looks like the patch is going
to be rewritten, but I have

stash.c:43:18: warning: incompatible pointer types assigning to 'const
char *const *' from 'const char *'; take the address with &
[-Wincompatible-pointer-types]
                write_tree.env = prefix;

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: AW: [PATCH 1/2] stash--helper: implement "git stash--helper"
  2016-02-01 23:36         ` Michael Blume
@ 2016-02-01 23:40           ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2016-02-01 23:40 UTC (permalink / raw)
  To: Michael Blume
  Cc: Matthias Aßhauer, Git List, Thomas Gummerer, Stefan Beller

Michael Blume <blume.mike@gmail.com> writes:

> Maybe this isn't important given that it looks like the patch is going
> to be rewritten, but I have
>
> stash.c:43:18: warning: incompatible pointer types assigning to 'const
> char *const *' from 'const char *'; take the address with &
> [-Wincompatible-pointer-types]
>                 write_tree.env = prefix;

The way posted patch tries to use the .env field when using the
run-command API is totally bogus and this compilation error is a
manifestation of that.

But the good news is that this should become irrelevant when the
patch is done by using internal calls ;-).

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-02-01 23:40 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-28 19:38 [PATCH 0/2] Make stash a builtin Matthias Aßhauer
2016-01-28 20:36 ` [PATCH 1/2] stash--helper: implement "git stash--helper" Matthias Asshauer
2016-01-28 20:36   ` [PATCH 2/2] stash: use "stash--helper" Matthias Asshauer
2016-01-28 20:59     ` Stefan Beller
2016-01-28 21:25       ` AW: " Matthias Aßhauer
2016-01-28 21:41         ` Stefan Beller
2016-01-28 23:28           ` Roberto Tyley
2016-01-29 19:37             ` AW: " Matthias Aßhauer
2016-01-29 11:21       ` Thomas Gummerer
2016-01-29 18:34         ` Junio C Hamano
2016-01-28 23:06   ` [PATCH 1/2] stash--helper: implement "git stash--helper" Junio C Hamano
2016-01-29 19:32     ` AW: " Matthias Aßhauer
2016-01-29 19:58       ` Junio C Hamano
2016-02-01 23:36         ` Michael Blume
2016-02-01 23:40           ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.