All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] add: remove dead code
@ 2015-07-31  0:19 Stefan Beller
  2015-07-31 16:41 ` Junio C Hamano
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Stefan Beller @ 2015-07-31  0:19 UTC (permalink / raw)
  To: gitster; +Cc: git, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---

So I was trying to understand how to implement "git add .gitmodules" as 
I intend to rewrite git submodules in C.

 builtin/add.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 4bd98b7..b2a5c57 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -375,7 +375,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	if (add_new_files) {
 		int baselen;
-		struct pathspec empty_pathspec;
 
 		/* Set up the default git porcelain excludes */
 		memset(&dir, 0, sizeof(dir));
@@ -384,7 +383,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			setup_standard_excludes(&dir);
 		}
 
-		memset(&empty_pathspec, 0, sizeof(empty_pathspec));
 		/* This picks up the paths that are not tracked */
 		baselen = fill_directory(&dir, &pathspec);
 		if (pathspec.nr)
-- 
2.5.0.rc1.391.g15b60ce

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] add: remove dead code
  2015-07-31  0:19 [PATCH] add: remove dead code Stefan Beller
@ 2015-07-31 16:41 ` Junio C Hamano
  2015-07-31 23:09 ` [RFC/PATCH 0/2] Submodules: refactoring the `module_list` function Stefan Beller
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2015-07-31 16:41 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jens Lehmann, Heiko Voigt

Stefan Beller <sbeller@google.com> writes:

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>
> So I was trying to understand how to implement "git add .gitmodules" as 
> I intend to rewrite git submodules in C.

A request and a suggestion.

 - Please keep Jens and Heiko in the loop and pick their brains ;-)

 - It may not be the best use of your time to attempt rewriting the
   whole of "git submodule" in C in a single pass (which is the
   impression I am getting from seeing you say "git add
   .gitmodules").

   The largest pain point a rewrite in C would solve is that "git
   submodule init" and "update" have to go sequencially even though

   (1) there is no inherent reason for cloning and fetching of
       different submodules to happen in some order;

   (2) there is no inherent reason for a failure to clone or fetch
       of one submodule to abort the entire session without cloning
       or fetching other submodules; and

   (3) the operation in each submodule takes human-scale time and
       the users would benefit greatly by parallel operations.

   One approach that may be beneficial would be to introduce "git
   submodule--helper" written in C to have selected primitives used
   in "git submodule" script to speed them up.  Perhaps the first
   subcommand would be "

    $ git submodule--helper foreach-parallel --cmd=$cmd $args...

   where it takes the arguments currently fed to module_list as
   $args... and runs $cmd in parallel.  The initial implementation
   of "git submodule update" then would replace the expensive and
   sequencial

       module_list "$@" | {
           ...
           while read mode sha1 stage sm_path
           do
           	...
	   done
       }

   loop with a call to the foreach-parallel subcommand.

   The end-user scripts that currently use "git submodule foreach"
   may or may not depend on the sequencial nature of the current
   implementation, so adding "git submodule foreach-parallel" may be
   a good way to expose this as a new "do it in parallel" feature.

   Once you have a solid infrastructure to implement the helper
   subcommand "foreach-parallel" (I'd expect the interface inside C
   into that function would be to give it a worker function with the
   data for the function to consume, and the above --cmd=$cmd form
   would use a worker function that essentially does run_command();
   the spawn(2)ing and wait(2)ing would be done on the more generic
   API side), you can rewrite the $cmd part in C little by little,
   and eventually you would get a full C implemention while keeping
   everything working and retaining debuggability during the course
   of development.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC/PATCH 0/2] Submodules: refactoring the `module_list` function
  2015-07-31  0:19 [PATCH] add: remove dead code Stefan Beller
  2015-07-31 16:41 ` Junio C Hamano
@ 2015-07-31 23:09 ` Stefan Beller
  2015-07-31 23:09 ` [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper Stefan Beller
  2015-07-31 23:09 ` [RFC/PATCH 2/2] Testing the new code Stefan Beller
  3 siblings, 0 replies; 15+ messages in thread
From: Stefan Beller @ 2015-07-31 23:09 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, Jens.Lehmann, hvoigt, gitster

Hi,

so eventually we want to have the whole Android project inside a git repository,
which includes > 500 submodules for the different sub systems. To make that feasable
we want to improve git-submodule for a huge set of submodules such as parallelizing
`git submodule update`. As a first step I was toying around the `module_list` function
as that is one key component for all the submodule operations. Also that part of the
code is where the operations are split up to the different submodules, so that
would be the natural place where we'd approach introducing threads later on.

This is an early RFC as it breaks the test suite. I do not remember ever interacting
with pathspecs or cache entries before, so I'd appreciate some guiding comments.
(Am I holding it wrong? Also the git code base seems large, it's easy to find new
playgrounds :)

Thanks,
Stefan

Stefan Beller (2):
  submodule: implement `module_list` as a builtin helper
  Testing the new code

 Makefile                    |  1 +
 builtin.h                   |  1 +
 builtin/submodule--helper.c | 99 +++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            | 25 +++++++++++-
 git.c                       |  1 +
 5 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 builtin/submodule--helper.c

-- 
2.5.0.5.gf4cd9ae.dirty

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper
  2015-07-31  0:19 [PATCH] add: remove dead code Stefan Beller
  2015-07-31 16:41 ` Junio C Hamano
  2015-07-31 23:09 ` [RFC/PATCH 0/2] Submodules: refactoring the `module_list` function Stefan Beller
@ 2015-07-31 23:09 ` Stefan Beller
  2015-08-01  1:01   ` Junio C Hamano
  2015-07-31 23:09 ` [RFC/PATCH 2/2] Testing the new code Stefan Beller
  3 siblings, 1 reply; 15+ messages in thread
From: Stefan Beller @ 2015-07-31 23:09 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, Jens.Lehmann, hvoigt, gitster

Most of the submodule operations work on a set of submodules.
Calculating and using this set is usually done via:

       module_list "$@" | {
           while read mode sha1 stage sm_path
           do
                # the actual operation
           done
       }

Currently the function `module_list` is implemented in the
git-submodule.sh as a shell script wrapping a perl script.
The rewrite is in C, such that it is faster and can later be
easily adapted when other functions are rewritten in C.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Makefile                    |  1 +
 builtin.h                   |  1 +
 builtin/submodule--helper.c | 99 +++++++++++++++++++++++++++++++++++++++++++++
 git.c                       |  1 +
 4 files changed, 102 insertions(+)
 create mode 100644 builtin/submodule--helper.c

diff --git a/Makefile b/Makefile
index 8c3c724..6fb7484 100644
--- a/Makefile
+++ b/Makefile
@@ -898,6 +898,7 @@ BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
 BUILTIN_OBJS += builtin/stripspace.o
+BUILTIN_OBJS += builtin/submodule--helper.o
 BUILTIN_OBJS += builtin/symbolic-ref.o
 BUILTIN_OBJS += builtin/tag.o
 BUILTIN_OBJS += builtin/unpack-file.o
diff --git a/builtin.h b/builtin.h
index 9e04f97..7bf9597 100644
--- a/builtin.h
+++ b/builtin.h
@@ -118,6 +118,7 @@ extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
 extern int cmd_status(int argc, const char **argv, const char *prefix);
 extern int cmd_stripspace(int argc, const char **argv, const char *prefix);
+extern int cmd_submodule__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_symbolic_ref(int argc, const char **argv, const char *prefix);
 extern int cmd_tag(int argc, const char **argv, const char *prefix);
 extern int cmd_tar_tree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
new file mode 100644
index 0000000..2e24fdc
--- /dev/null
+++ b/builtin/submodule--helper.c
@@ -0,0 +1,99 @@
+#include "builtin.h"
+#include "cache.h"
+#include "parse-options.h"
+#include "quote.h"
+#include "pathspec.h"
+#include "dir.h"
+
+static const char * const git_submodule_helper_usage[] = {
+	N_("git submodule--helper --module_list [<path>...]"),
+	NULL
+};
+
+int module_list(int argc, const char **argv, const char *prefix)
+{
+	int i;
+	static struct pathspec pathspec;
+	const struct cache_entry **ce_entries = NULL;
+	int alloc = 0, used = 0;
+	char *ps_matched = NULL;
+	char *max_prefix;
+	int max_prefix_len;
+	struct string_list already_printed = STRING_LIST_INIT_NODUP;
+
+	parse_pathspec(&pathspec, 0,
+		       PATHSPEC_PREFER_FULL,
+		       prefix, argv);
+
+	/* Find common prefix for all pathspec's */
+	max_prefix = common_prefix(&pathspec);
+	max_prefix_len = max_prefix ? strlen(max_prefix) : 0;
+
+	if (pathspec.nr)
+		ps_matched = xcalloc(1, pathspec.nr);
+
+	if (read_cache() < 0)
+		die("index file corrupt");
+
+	for (i = 0; i < active_nr; i++) {
+		const struct cache_entry *ce = active_cache[i];
+
+		if (!match_pathspec(&pathspec, ce->name, ce_namelen(ce),
+				    max_prefix_len, ps_matched,
+				    S_ISGITLINK(ce->ce_mode) | S_ISDIR(ce->ce_mode)))
+			continue;
+
+		if (S_ISGITLINK(ce->ce_mode)) {
+			ALLOC_GROW(ce_entries, used + 1, alloc);
+			ce_entries[used++] = ce;
+		}
+	}
+
+	if (ps_matched && report_path_error(ps_matched, &pathspec, prefix)) {
+		printf("#unmatched\n");
+		return 1;
+	}
+
+	for (i = 0; i < used; i++) {
+		const struct cache_entry *ce = ce_entries[i];
+
+		if (string_list_has_string(&already_printed, ce->name))
+			continue;
+
+		if (ce_stage(ce)) {
+			printf("%06o %s U\t", ce->ce_mode, sha1_to_hex(null_sha1));
+		} else {
+			printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
+		}
+		write_name_quoted(ce->name, stdout, '\n');
+		string_list_insert(&already_printed, ce->name);
+	}
+	return 0;
+}
+
+int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
+{
+	enum {
+		MODE_UNSPECIFIED = 0,
+		MODE_MODULE_LIST,
+	} cmdmode = MODE_UNSPECIFIED;
+
+	struct option options[] = {
+		OPT_CMDMODE(0, "module_list", &cmdmode,
+			    N_("Get the submodule list"), MODE_MODULE_LIST),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, prefix, options,
+			     git_submodule_helper_usage, 0);
+
+	switch (cmdmode) {
+	case MODE_MODULE_LIST:
+		return module_list(argc, argv, prefix);
+		break;
+	case MODE_UNSPECIFIED:
+		usage_with_options(git_submodule_helper_usage, options);
+		break;
+	}
+	return 0;
+}
diff --git a/git.c b/git.c
index fe94066..721995e 100644
--- a/git.c
+++ b/git.c
@@ -468,6 +468,7 @@ static struct cmd_struct commands[] = {
 	{ "stage", cmd_add, RUN_SETUP | NEED_WORK_TREE },
 	{ "status", cmd_status, RUN_SETUP | NEED_WORK_TREE },
 	{ "stripspace", cmd_stripspace },
+	{ "submodule--helper", cmd_submodule__helper, RUN_SETUP },
 	{ "symbolic-ref", cmd_symbolic_ref, RUN_SETUP },
 	{ "tag", cmd_tag, RUN_SETUP },
 	{ "unpack-file", cmd_unpack_file, RUN_SETUP },
-- 
2.5.0.5.gf4cd9ae.dirty

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC/PATCH 2/2] Testing the new code
  2015-07-31  0:19 [PATCH] add: remove dead code Stefan Beller
                   ` (2 preceding siblings ...)
  2015-07-31 23:09 ` [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper Stefan Beller
@ 2015-07-31 23:09 ` Stefan Beller
  2015-08-01  1:02   ` Junio C Hamano
  3 siblings, 1 reply; 15+ messages in thread
From: Stefan Beller @ 2015-07-31 23:09 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, Jens.Lehmann, hvoigt, gitster

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    The output of the differential testing is below,
    which points out 2 bugs:
    * resolving relative paths seems is broken, so we
      would need to have the equivalent of
        eval "set $(git rev-parse --sq --prefix "$wt_prefix" -- "$@")"
      (091a6eb0, 2013-06-16,  submodule: drop the top-level requirement)
    * The handling of funny characters is broken.
    
    ----------------------------------------------
    ../init
    160000 9ab41d0d44279924747e3eafa6da77966579aa45 0	init
    ----
    ----
    ----
    fatal: ../init: '../init' is outside repository
    ----------------------------------------------
    ../init
    160000 37b832c3ae4dcc0830b173572ff59be6b089bdcd 0	init
    ----
    ----
    ----
    fatal: ../init: '../init' is outside repository
    ----------------------------------------------
    ../init
    160000 9ab41d0d44279924747e3eafa6da77966579aa45 0	init
    ----
    ----
    ----
    fatal: ../init: '../init' is outside repository
    ----------------------------------------------
    
    160000 37b832c3ae4dcc0830b173572ff59be6b089bdcd 0	example2
    160000 9ab41d0d44279924747e3eafa6da77966579aa45 0	init
    160000 01b25b1d44b1049978e053768790100df8b8d9d9 0	å äö
    ----
    160000 37b832c3ae4dcc0830b173572ff59be6b089bdcd 0	example2
    160000 9ab41d0d44279924747e3eafa6da77966579aa45 0	init
    160000 01b25b1d44b1049978e053768790100df8b8d9d9 0	"\303\245 \303\244\303\266"
    ----
    ----

 git-submodule.sh | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 36797c3..717b74d 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -149,7 +149,7 @@ relative_path ()
 # Get submodule info for registered submodules
 # $@ = path to limit submodule list
 #
-module_list()
+module_list_shell()
 {
 	eval "set $(git rev-parse --sq --prefix "$wt_prefix" -- "$@")"
 	(
@@ -187,6 +187,29 @@ module_list()
 	'
 }
 
+module_list()
+{
+	# call both the old and new code
+	module_list_shell $@ >/u/git_submodule_module_list_shell 2>/u/git_submodule_module_list_shell2
+	git submodule--helper --module_list $@ >/u/git_submodule_module_list 2>/u/git_submodule_module_list2
+	# if there is a diff record it
+	DIFF=$(diff /u/git_submodule_module_list_shell /u/git_submodule_module_list)
+	if [ "$DIFF" != "" ]
+	then
+		echo "----------------------------------------------">>/u/git_submodule_module_diffs
+		echo $@ >>/u/git_submodule_module_diffs
+		cat /u/git_submodule_module_list_shell >>/u/git_submodule_module_diffs
+		echo "----" >>/u/git_submodule_module_diffs
+		cat /u/git_submodule_module_list >>/u/git_submodule_module_diffs
+		echo "----" >>/u/git_submodule_module_diffs
+		cat /u/git_submodule_module_list_shell2 >>/u/git_submodule_module_diffs
+		echo "----" >>/u/git_submodule_module_diffs
+		cat /u/git_submodule_module_list2 >>/u/git_submodule_module_diffs
+	fi
+	# output to the caller
+	cat /u/git_submodule_module_list
+}
+
 die_if_unmatched ()
 {
 	if test "$1" = "#unmatched"
-- 
2.5.0.5.gf4cd9ae.dirty

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper
  2015-07-31 23:09 ` [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper Stefan Beller
@ 2015-08-01  1:01   ` Junio C Hamano
  2015-08-03 21:30     ` Stefan Beller
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2015-08-01  1:01 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jens.Lehmann, hvoigt

Stefan Beller <sbeller@google.com> writes:

> +static const char * const git_submodule_helper_usage[] = {
> +	N_("git submodule--helper --module_list [<path>...]"),

Yuck.  Please do not force --multi_word_opt upon us, which is simply
too ugly to live around here.  --module-list is perhaps OK, but
because submodule--helper would not have an default action, I'd
prefer to make these just "command words", i.e.

    $ git submodule--helper module_list

> +int module_list(int argc, const char **argv, const char *prefix)
> +{
> +	int i;
> +	static struct pathspec pathspec;
> +	const struct cache_entry **ce_entries = NULL;
> +	int alloc = 0, used = 0;
> +	char *ps_matched = NULL;
> +	char *max_prefix;
> +	int max_prefix_len;
> +	struct string_list already_printed = STRING_LIST_INIT_NODUP;
> +
> +	parse_pathspec(&pathspec, 0,
> +		       PATHSPEC_PREFER_FULL,
> +		       prefix, argv);
> +
> +	/* Find common prefix for all pathspec's */
> +	max_prefix = common_prefix(&pathspec);
> +	max_prefix_len = max_prefix ? strlen(max_prefix) : 0;
> +
> +	if (pathspec.nr)
> +		ps_matched = xcalloc(1, pathspec.nr);

Up to this point it interprets its input, and ...

> +	if (read_cache() < 0)
> +		die("index file corrupt");
> +
> +	for (i = 0; i < active_nr; i++) {
> +		const struct cache_entry *ce = active_cache[i];
> +
> +		if (!match_pathspec(&pathspec, ce->name, ce_namelen(ce),
> +				    max_prefix_len, ps_matched,
> +				    S_ISGITLINK(ce->ce_mode) | S_ISDIR(ce->ce_mode)))
> +			continue;
> +
> +		if (S_ISGITLINK(ce->ce_mode)) {
> +			ALLOC_GROW(ce_entries, used + 1, alloc);
> +			ce_entries[used++] = ce;
> +		}
> +	}
> +
> +	if (ps_matched && report_path_error(ps_matched, &pathspec, prefix)) {
> +		printf("#unmatched\n");
> +		return 1;
> +	}

... does the computation, with diagnosis.

And then it does the I/O with formatting.

> +
> +	for (i = 0; i < used; i++) {
> +		const struct cache_entry *ce = ce_entries[i];
...
> +	return 0;
> +}

When you have the implementation of "foreach-parallel" to move the
most expensive part of "submodule update" of a tree with 500
submodules, you would want to receive more or less the same "args"
as this thing takes and pass the ce_entries[] list to the "spawn and
run the user script in them in parallel" engine.

So I think it makes more sense to split this function into two (or
three).  One that reads from (argc, argv) and allocates and fills
ce_entries[] can become a helper that you can reuse later.  

'int module_list()' (shouldn't it be static?), can make a call to
that helper at the begining of it, and the remainder of the function
would do the textual I/O.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 2/2] Testing the new code
  2015-07-31 23:09 ` [RFC/PATCH 2/2] Testing the new code Stefan Beller
@ 2015-08-01  1:02   ` Junio C Hamano
  2015-08-03 16:23     ` Stefan Beller
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2015-08-01  1:02 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jens.Lehmann, hvoigt

Stefan Beller <sbeller@google.com> writes:

> -module_list()
> +module_list_shell()
>  {
>  	eval "set $(git rev-parse --sq --prefix "$wt_prefix" -- "$@")"
>  	(
> @@ -187,6 +187,29 @@ module_list()
>  	'
>  }
>  
> +module_list()
> +{
> +	# call both the old and new code
> +	module_list_shell $@ >/u/git_submodule_module_list_shell 2>/u/git_submodule_module_list_shell2
> +	git submodule--helper --module_list $@ >/u/git_submodule_module_list 2>/u/git_submodule_module_list2

You seem to be discarding the double-quote around $@ in both of
these two places.  Intended?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 2/2] Testing the new code
  2015-08-01  1:02   ` Junio C Hamano
@ 2015-08-03 16:23     ` Stefan Beller
  2015-08-03 19:47       ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Beller @ 2015-08-03 16:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jens Lehmann, Heiko Voigt

On Fri, Jul 31, 2015 at 6:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> -module_list()
>> +module_list_shell()
>>  {
>>       eval "set $(git rev-parse --sq --prefix "$wt_prefix" -- "$@")"
>>       (
>> @@ -187,6 +187,29 @@ module_list()
>>       '
>>  }
>>
>> +module_list()
>> +{
>> +     # call both the old and new code
>> +     module_list_shell $@ >/u/git_submodule_module_list_shell 2>/u/git_submodule_module_list_shell2
>> +     git submodule--helper --module_list $@ >/u/git_submodule_module_list 2>/u/git_submodule_module_list2
>
> You seem to be discarding the double-quote around $@ in both of
> these two places.  Intended?

No, not at all. This was a bit sloppy.
This patch was rather showing off how I intend to test the previous patch.
The idea came from a talk[1] at oscon which presented refactoring code with
this differential strategy (except that they used it on a larger code base and
collected the diffs between the new and old code for a while in production
instead of just relying on the test suite passing).

[1] http://www.oscon.com/open-source-2015/public/schedule/detail/41888

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 2/2] Testing the new code
  2015-08-03 16:23     ` Stefan Beller
@ 2015-08-03 19:47       ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2015-08-03 19:47 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jens Lehmann, Heiko Voigt

Stefan Beller <sbeller@google.com> writes:

> On Fri, Jul 31, 2015 at 6:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Stefan Beller <sbeller@google.com> writes:
>>
>>> -module_list()
>>> +module_list_shell()
>>>  {
>>>       eval "set $(git rev-parse --sq --prefix "$wt_prefix" -- "$@")"
>>>       (
>>> @@ -187,6 +187,29 @@ module_list()
>>>       '
>>>  }
>>>
>>> +module_list()
>>> +{
>>> +     # call both the old and new code
>>> +     module_list_shell $@ >/u/git_submodule_module_list_shell 2>/u/git_submodule_module_list_shell2
>>> +     git submodule--helper --module_list $@ >/u/git_submodule_module_list 2>/u/git_submodule_module_list2
>>
>> You seem to be discarding the double-quote around $@ in both of
>> these two places.  Intended?
>
> No, not at all. This was a bit sloppy.

OK.

> This patch was rather showing off how I intend to test the previous patch.

Yeah, I can see what the code is doing, and you already saw that I
didn't disagree with the approach ;).  During a reimplementation
exercise, it often is a good idea, if the code structure allows you
to, to run both implementations and compare the results---but it can
go only so far.  It obviously is tricky to apply the trick to an
operation that is not idempotent to let two implementations to do it
twice in different ways and make sure they produce the same result.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper
  2015-08-01  1:01   ` Junio C Hamano
@ 2015-08-03 21:30     ` Stefan Beller
  2015-08-03 21:38       ` Junio C Hamano
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Stefan Beller @ 2015-08-03 21:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jens Lehmann, Heiko Voigt

On Fri, Jul 31, 2015 at 6:01 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> +static const char * const git_submodule_helper_usage[] = {
>> +     N_("git submodule--helper --module_list [<path>...]"),
>
> Yuck.  Please do not force --multi_word_opt upon us, which is simply
> too ugly to live around here.  --module-list is perhaps OK,

I agree there. The way you word it here, it sounds as if the mixture
of dashes and underscores are a problem.

> but
> because submodule--helper would not have an default action, I'd
> prefer to make these just "command words", i.e.
>
>     $ git submodule--helper module_list

Why would you use an underscore in here as opposed to a dash?
     $ git submodule--helper module-list

I went with --module-list for now as I see no reason real to make it
a command word for now as it is not user facing but just a helper.
I have a patch from my previous attempt to rewrite "git submodule"
as a whole to accept both command words as well as double dashed
selected modes.

>
>> +int module_list(int argc, const char **argv, const char *prefix)
>> +{
>> +     int i;
>> +     static struct pathspec pathspec;
>> +     const struct cache_entry **ce_entries = NULL;
>> +     int alloc = 0, used = 0;
>> +     char *ps_matched = NULL;
>> +     char *max_prefix;
>> +     int max_prefix_len;
>> +     struct string_list already_printed = STRING_LIST_INIT_NODUP;
>> +
>> +     parse_pathspec(&pathspec, 0,
>> +                    PATHSPEC_PREFER_FULL,
>> +                    prefix, argv);
>> +
>> +     /* Find common prefix for all pathspec's */
>> +     max_prefix = common_prefix(&pathspec);
>> +     max_prefix_len = max_prefix ? strlen(max_prefix) : 0;
>> +
>> +     if (pathspec.nr)
>> +             ps_matched = xcalloc(1, pathspec.nr);
>
> Up to this point it interprets its input, and ...
>
>> +     if (read_cache() < 0)
>> +             die("index file corrupt");
>> +
>> +     for (i = 0; i < active_nr; i++) {
>> +             const struct cache_entry *ce = active_cache[i];
>> +
>> +             if (!match_pathspec(&pathspec, ce->name, ce_namelen(ce),
>> +                                 max_prefix_len, ps_matched,
>> +                                 S_ISGITLINK(ce->ce_mode) | S_ISDIR(ce->ce_mode)))
>> +                     continue;
>> +
>> +             if (S_ISGITLINK(ce->ce_mode)) {
>> +                     ALLOC_GROW(ce_entries, used + 1, alloc);
>> +                     ce_entries[used++] = ce;
>> +             }
>> +     }
>> +
>> +     if (ps_matched && report_path_error(ps_matched, &pathspec, prefix)) {
>> +             printf("#unmatched\n");
>> +             return 1;
>> +     }
>
> ... does the computation, with diagnosis.
>
> And then it does the I/O with formatting.
>
>> +
>> +     for (i = 0; i < used; i++) {
>> +             const struct cache_entry *ce = ce_entries[i];
> ...
>> +     return 0;
>> +}
>
> When you have the implementation of "foreach-parallel" to move the
> most expensive part of "submodule update" of a tree with 500
> submodules, you would want to receive more or less the same "args"
> as this thing takes and pass the ce_entries[] list to the "spawn and
> run the user script in them in parallel" engine.

That's true, I thought about splitting it up later when I actually need it.
[That seems easier to write, but not easier to review :( ]
I did split up the function just now.

>
> So I think it makes more sense to split this function into two (or
> three).  One that reads from (argc, argv) and allocates and fills
> ce_entries[] can become a helper that you can reuse later.
>
> 'int module_list()' (shouldn't it be static?), can make a call to
> that helper at the begining of it, and the remainder of the function
> would do the textual I/O.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper
  2015-08-03 21:30     ` Stefan Beller
@ 2015-08-03 21:38       ` Junio C Hamano
  2015-08-03 21:58       ` [PATCH] " Stefan Beller
  2015-08-03 22:04       ` [RFC/PATCH 1/2] " Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2015-08-03 21:38 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jens Lehmann, Heiko Voigt

Stefan Beller <sbeller@google.com> writes:

>>     $ git submodule--helper module_list
>
> Why would you use an underscore in here as opposed to a dash?

Simply because the diff would be easier to read; the callers used to
call module_list shell function, now they call the subcommand with the
same name of submodule--helper.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] submodule: implement `module_list` as a builtin helper
  2015-08-03 21:30     ` Stefan Beller
  2015-08-03 21:38       ` Junio C Hamano
@ 2015-08-03 21:58       ` Stefan Beller
  2015-08-03 22:04       ` [RFC/PATCH 1/2] " Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Stefan Beller @ 2015-08-03 21:58 UTC (permalink / raw)
  To: gitster; +Cc: git, jens.lehmann, hvoigt, Stefan Beller

Most of the submodule operations work on a set of submodules.
Calculating and using this set is usually done via:

       module_list "$@" | {
           while read mode sha1 stage sm_path
           do
                # the actual operation
           done
       }

Currently the function `module_list` is implemented in the
git-submodule.sh as a shell script wrapping a perl script.
The rewrite is in C, such that it is faster and can later be
easily adapted when other functions are rewritten in C.

git-submodule.sh similar to the builtin commands will navigate
to the top most directory of the repository and keeping the
subdirectories as a variable. As the helper is called from
within the git-submodule.sh script, we are already navigated
to the root level, but the path arguments are stil relative
to the subdirectory we were in when calling git-submodule.sh.
That's why there is a `--prefix` option pointing to an alternative
path where to anchor relative path arguments.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

This doesn't have the `module_list` subcommand word,
but rather uses the `--module-list` mode for now.

However this is not an RFC any more, but I consider it stable.

Thanks,
Stefan

 Makefile                    |   1 +
 builtin.h                   |   1 +
 builtin/submodule--helper.c | 119 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  54 +++-----------------
 git.c                       |   1 +
 5 files changed, 128 insertions(+), 48 deletions(-)
 create mode 100644 builtin/submodule--helper.c

diff --git a/Makefile b/Makefile
index 8c3c724..6fb7484 100644
--- a/Makefile
+++ b/Makefile
@@ -898,6 +898,7 @@ BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
 BUILTIN_OBJS += builtin/stripspace.o
+BUILTIN_OBJS += builtin/submodule--helper.o
 BUILTIN_OBJS += builtin/symbolic-ref.o
 BUILTIN_OBJS += builtin/tag.o
 BUILTIN_OBJS += builtin/unpack-file.o
diff --git a/builtin.h b/builtin.h
index 9e04f97..7bf9597 100644
--- a/builtin.h
+++ b/builtin.h
@@ -118,6 +118,7 @@ extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
 extern int cmd_status(int argc, const char **argv, const char *prefix);
 extern int cmd_stripspace(int argc, const char **argv, const char *prefix);
+extern int cmd_submodule__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_symbolic_ref(int argc, const char **argv, const char *prefix);
 extern int cmd_tag(int argc, const char **argv, const char *prefix);
 extern int cmd_tar_tree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
new file mode 100644
index 0000000..3f6d07d
--- /dev/null
+++ b/builtin/submodule--helper.c
@@ -0,0 +1,119 @@
+#include "builtin.h"
+#include "cache.h"
+#include "parse-options.h"
+#include "quote.h"
+#include "pathspec.h"
+#include "dir.h"
+#include "utf8.h"
+
+static const char * const git_submodule_helper_usage[] = {
+	N_("git submodule--helper [<options>] [<path>...]"),
+	NULL
+};
+
+static char *ps_matched;
+static int max_prefix_len;
+static const struct cache_entry **ce_entries;
+static int ce_alloc, ce_used;
+static const char *alternative_path;
+
+static void module_list_compute(int argc, const char **argv,
+				const char *prefix,
+				struct pathspec *pathspec)
+{
+	int i;
+	char *max_prefix;
+	parse_pathspec(pathspec, 0,
+		       PATHSPEC_PREFER_FULL |
+		       PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP,
+		       prefix, argv);
+
+	/* Find common prefix for all pathspec's */
+	max_prefix = common_prefix(pathspec);
+	max_prefix_len = max_prefix ? strlen(max_prefix) : 0;
+
+	if (pathspec->nr)
+		ps_matched = xcalloc(1, pathspec->nr);
+
+
+	if (read_cache() < 0)
+		die("index file corrupt");
+
+	for (i = 0; i < active_nr; i++) {
+		const struct cache_entry *ce = active_cache[i];
+
+		if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
+				    max_prefix_len, ps_matched,
+				    S_ISGITLINK(ce->ce_mode) | S_ISDIR(ce->ce_mode)))
+			continue;
+
+		if (S_ISGITLINK(ce->ce_mode)) {
+			ALLOC_GROW(ce_entries, ce_used + 1, ce_alloc);
+			ce_entries[ce_used++] = ce;
+		}
+	}
+}
+
+static int module_list(int argc, const char **argv, const char *prefix)
+{
+	int i;
+	struct string_list already_printed = STRING_LIST_INIT_NODUP;
+	struct pathspec pathspec;
+
+	module_list_compute(argc, argv, prefix, &pathspec);
+
+	if (ps_matched && report_path_error(ps_matched, &pathspec, prefix)) {
+		printf("#unmatched\n");
+		return 1;
+	}
+
+	for (i = 0; i < ce_used; i++) {
+		const struct cache_entry *ce = ce_entries[i];
+
+		if (string_list_has_string(&already_printed, ce->name))
+			continue;
+
+		if (ce_stage(ce)) {
+			printf("%06o %s U\t", ce->ce_mode, sha1_to_hex(null_sha1));
+		} else {
+			printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
+		}
+
+		utf8_fprintf(stdout, "%s\n", ce->name);
+
+		string_list_insert(&already_printed, ce->name);
+	}
+	return 0;
+}
+
+int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
+{
+	enum {
+		MODE_UNSPECIFIED = 0,
+		MODE_MODULE_LIST,
+	} cmdmode = MODE_UNSPECIFIED;
+
+	struct option options[] = {
+		OPT_CMDMODE(0, "module-list", &cmdmode,
+			    N_("Get the submodule list"), MODE_MODULE_LIST),
+		OPT_STRING(0, "prefix", &alternative_path,
+			   N_("path"),
+			   N_("alternative anchor for relative paths")),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, prefix, options,
+			     git_submodule_helper_usage, 0);
+
+	switch (cmdmode) {
+	case MODE_MODULE_LIST:
+		return module_list(argc, argv, alternative_path
+					       ? alternative_path
+					       : prefix);
+		break;
+	case MODE_UNSPECIFIED:
+		usage_with_options(git_submodule_helper_usage, options);
+		break;
+	}
+	return 0;
+}
diff --git a/git-submodule.sh b/git-submodule.sh
index 36797c3..e0ab06e 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -145,48 +145,6 @@ relative_path ()
 	echo "$result$target"
 }
 
-#
-# Get submodule info for registered submodules
-# $@ = path to limit submodule list
-#
-module_list()
-{
-	eval "set $(git rev-parse --sq --prefix "$wt_prefix" -- "$@")"
-	(
-		git ls-files -z --error-unmatch --stage -- "$@" ||
-		echo "unmatched pathspec exists"
-	) |
-	@@PERL@@ -e '
-	my %unmerged = ();
-	my ($null_sha1) = ("0" x 40);
-	my @out = ();
-	my $unmatched = 0;
-	$/ = "\0";
-	while (<STDIN>) {
-		if (/^unmatched pathspec/) {
-			$unmatched = 1;
-			next;
-		}
-		chomp;
-		my ($mode, $sha1, $stage, $path) =
-			/^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/;
-		next unless $mode eq "160000";
-		if ($stage ne "0") {
-			if (!$unmerged{$path}++) {
-				push @out, "$mode $null_sha1 U\t$path\n";
-			}
-			next;
-		}
-		push @out, "$_\n";
-	}
-	if ($unmatched) {
-		print "#unmatched\n";
-	} else {
-		print for (@out);
-	}
-	'
-}
-
 die_if_unmatched ()
 {
 	if test "$1" = "#unmatched"
@@ -532,7 +490,7 @@ cmd_foreach()
 	# command in the subshell (and a recursive call to this function)
 	exec 3<&0
 
-	module_list |
+	git submodule--helper --prefix "$wt_prefix" --module-list |
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
@@ -592,7 +550,7 @@ cmd_init()
 		shift
 	done
 
-	module_list "$@" |
+	git submodule--helper --prefix "$wt_prefix" --module-list "$@" |
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
@@ -674,7 +632,7 @@ cmd_deinit()
 		die "$(eval_gettext "Use '.' if you really want to deinitialize all submodules")"
 	fi
 
-	module_list "$@" |
+	git submodule--helper --prefix "$wt_prefix" --module-list "$@" |
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
@@ -790,7 +748,7 @@ cmd_update()
 	fi
 
 	cloned_modules=
-	module_list "$@" | {
+	git submodule--helper --prefix "$wt_prefix" --module-list "$@" | {
 	err=
 	while read mode sha1 stage sm_path
 	do
@@ -1222,7 +1180,7 @@ cmd_status()
 		shift
 	done
 
-	module_list "$@" |
+	git submodule--helper --prefix "$wt_prefix" --module-list "$@" |
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
@@ -1299,7 +1257,7 @@ cmd_sync()
 		esac
 	done
 	cd_to_toplevel
-	module_list "$@" |
+	git submodule--helper --prefix "$wt_prefix" --module-list "$@" |
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
diff --git a/git.c b/git.c
index fe94066..721995e 100644
--- a/git.c
+++ b/git.c
@@ -468,6 +468,7 @@ static struct cmd_struct commands[] = {
 	{ "stage", cmd_add, RUN_SETUP | NEED_WORK_TREE },
 	{ "status", cmd_status, RUN_SETUP | NEED_WORK_TREE },
 	{ "stripspace", cmd_stripspace },
+	{ "submodule--helper", cmd_submodule__helper, RUN_SETUP },
 	{ "symbolic-ref", cmd_symbolic_ref, RUN_SETUP },
 	{ "tag", cmd_tag, RUN_SETUP },
 	{ "unpack-file", cmd_unpack_file, RUN_SETUP },
-- 
2.5.0.5.gffc0ba2.dirty

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper
  2015-08-03 21:30     ` Stefan Beller
  2015-08-03 21:38       ` Junio C Hamano
  2015-08-03 21:58       ` [PATCH] " Stefan Beller
@ 2015-08-03 22:04       ` Junio C Hamano
  2015-08-03 22:13         ` Stefan Beller
  2015-08-03 22:58         ` [PATCH] " Stefan Beller
  2 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2015-08-03 22:04 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jens Lehmann, Heiko Voigt

Stefan Beller <sbeller@google.com> writes:

> On Fri, Jul 31, 2015 at 6:01 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Stefan Beller <sbeller@google.com> writes:
>>
>>> +static const char * const git_submodule_helper_usage[] = {
>>> +     N_("git submodule--helper --module_list [<path>...]"),
>>
>> Yuck.  Please do not force --multi_word_opt upon us, which is simply
>> too ugly to live around here.  --module-list is perhaps OK,
>
> I agree there. The way you word it here, it sounds as if the mixture
> of dashes and underscores are a problem.
>
>> but
>> because submodule--helper would not have an default action, I'd
>> prefer to make these just "command words", i.e.
>>
>>     $ git submodule--helper module_list
>
> Why would you use an underscore in here as opposed to a dash?
>      $ git submodule--helper module-list
>
> I went with --module-list for now as I see no reason real to make it
> a command word for now ...

The biggest reason why we should not add more --command-mode is to
avoid confusion (and copy & paste misdesign by others).  If you use
the command-word interface, it is crystal clear that

 (1) the word 'module_list' must be the first token after the
     subcommand name, no need to parse "subcmd --opt --cmd", and
     mislead the users to think incorrectly that ...

 (2) ... "cmd --optA --cmd1 --optB --cmd2" might be allowed, which
     would lead you to add code to reject, saying "cmd1 and cmd2 are
     incompatible".

So I'd prefer to see it fixed before you start supporting more
commands in submodule--helper.  It will need unnecessary patch noise
to fix it later.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper
  2015-08-03 22:04       ` [RFC/PATCH 1/2] " Junio C Hamano
@ 2015-08-03 22:13         ` Stefan Beller
  2015-08-03 22:58         ` [PATCH] " Stefan Beller
  1 sibling, 0 replies; 15+ messages in thread
From: Stefan Beller @ 2015-08-03 22:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jens Lehmann, Heiko Voigt

On Mon, Aug 3, 2015 at 3:04 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Fri, Jul 31, 2015 at 6:01 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Stefan Beller <sbeller@google.com> writes:
>>>
>>>> +static const char * const git_submodule_helper_usage[] = {
>>>> +     N_("git submodule--helper --module_list [<path>...]"),
>>>
>>> Yuck.  Please do not force --multi_word_opt upon us, which is simply
>>> too ugly to live around here.  --module-list is perhaps OK,
>>
>> I agree there. The way you word it here, it sounds as if the mixture
>> of dashes and underscores are a problem.
>>
>>> but
>>> because submodule--helper would not have an default action, I'd
>>> prefer to make these just "command words", i.e.
>>>
>>>     $ git submodule--helper module_list
>>
>> Why would you use an underscore in here as opposed to a dash?
>>      $ git submodule--helper module-list
>>
>> I went with --module-list for now as I see no reason real to make it
>> a command word for now ...
>
> The biggest reason why we should not add more --command-mode is to
> avoid confusion (and copy & paste misdesign by others).  If you use
> the command-word interface, it is crystal clear that
>
>  (1) the word 'module_list' must be the first token after the
>      subcommand name, no need to parse "subcmd --opt --cmd", and
>      mislead the users to think incorrectly that ...
>
>  (2) ... "cmd --optA --cmd1 --optB --cmd2" might be allowed, which
>      would lead you to add code to reject, saying "cmd1 and cmd2 are
>      incompatible".
>
> So I'd prefer to see it fixed before you start supporting more
> commands in submodule--helper.  It will need unnecessary patch noise
> to fix it later.

So we had this discussion some time ago [1] and my understanding from back
then was to rather have --command-mode instead of subcommand words because
that's what most git commands use nowadays, so we don't want to add more
of the competing style. It's also easier to work with as we have a powerful
option parsing implementation.

It seems your opinion has swayed. I'll change it then.

[1] $gmane/254076 or $gmane/231376/focus=231478

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] submodule: implement `module_list` as a builtin helper
  2015-08-03 22:04       ` [RFC/PATCH 1/2] " Junio C Hamano
  2015-08-03 22:13         ` Stefan Beller
@ 2015-08-03 22:58         ` Stefan Beller
  1 sibling, 0 replies; 15+ messages in thread
From: Stefan Beller @ 2015-08-03 22:58 UTC (permalink / raw)
  To: gitster; +Cc: git, jens.lehmann, hvoigt, Stefan Beller

Most of the submodule operations work on a set of submodules.
Calculating and using this set is usually done via:

       module_list "$@" | {
           while read mode sha1 stage sm_path
           do
                # the actual operation
           done
       }

Currently the function `module_list` is implemented in the
git-submodule.sh as a shell script wrapping a perl script.
The rewrite is in C, such that it is faster and can later be
easily adapted when other functions are rewritten in C.

git-submodule.sh similar to the builtin commands will navigate
to the top most directory of the repository and keeping the
subdirectories as a variable. As the helper is called from
within the git-submodule.sh script, we are already navigated
to the root level, but the path arguments are stil relative
to the subdirectory we were in when calling git-submodule.sh.
That's why there is a `--prefix` option pointing to an alternative
path where to anchor relative path arguments.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Makefile                    |   1 +
 builtin.h                   |   1 +
 builtin/submodule--helper.c | 110 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  54 +++-------------------
 git.c                       |   1 +
 5 files changed, 119 insertions(+), 48 deletions(-)
 create mode 100644 builtin/submodule--helper.c

diff --git a/Makefile b/Makefile
index 8c3c724..6fb7484 100644
--- a/Makefile
+++ b/Makefile
@@ -898,6 +898,7 @@ BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
 BUILTIN_OBJS += builtin/stripspace.o
+BUILTIN_OBJS += builtin/submodule--helper.o
 BUILTIN_OBJS += builtin/symbolic-ref.o
 BUILTIN_OBJS += builtin/tag.o
 BUILTIN_OBJS += builtin/unpack-file.o
diff --git a/builtin.h b/builtin.h
index 9e04f97..7bf9597 100644
--- a/builtin.h
+++ b/builtin.h
@@ -118,6 +118,7 @@ extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
 extern int cmd_status(int argc, const char **argv, const char *prefix);
 extern int cmd_stripspace(int argc, const char **argv, const char *prefix);
+extern int cmd_submodule__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_symbolic_ref(int argc, const char **argv, const char *prefix);
 extern int cmd_tag(int argc, const char **argv, const char *prefix);
 extern int cmd_tar_tree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
new file mode 100644
index 0000000..fbd9568
--- /dev/null
+++ b/builtin/submodule--helper.c
@@ -0,0 +1,110 @@
+#include "builtin.h"
+#include "cache.h"
+#include "parse-options.h"
+#include "quote.h"
+#include "pathspec.h"
+#include "dir.h"
+#include "utf8.h"
+
+static char *ps_matched;
+static const struct cache_entry **ce_entries;
+static int ce_alloc, ce_used;
+
+static void module_list_compute(int argc, const char **argv,
+				const char *prefix,
+				struct pathspec *pathspec)
+{
+	int i;
+	char *max_prefix;
+	int max_prefix_len;
+	parse_pathspec(pathspec, 0,
+		       PATHSPEC_PREFER_FULL |
+		       PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP,
+		       prefix, argv);
+
+	/* Find common prefix for all pathspec's */
+	max_prefix = common_prefix(pathspec);
+	max_prefix_len = max_prefix ? strlen(max_prefix) : 0;
+
+	if (pathspec->nr)
+		ps_matched = xcalloc(1, pathspec->nr);
+
+
+	if (read_cache() < 0)
+		die("index file corrupt");
+
+	for (i = 0; i < active_nr; i++) {
+		const struct cache_entry *ce = active_cache[i];
+
+		if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
+				    max_prefix_len, ps_matched,
+				    S_ISGITLINK(ce->ce_mode) | S_ISDIR(ce->ce_mode)))
+			continue;
+
+		if (S_ISGITLINK(ce->ce_mode)) {
+			ALLOC_GROW(ce_entries, ce_used + 1, ce_alloc);
+			ce_entries[ce_used++] = ce;
+		}
+	}
+}
+
+static int module_list(int argc, const char **argv, const char *prefix)
+{
+	int i;
+	struct string_list already_printed = STRING_LIST_INIT_NODUP;
+	struct pathspec pathspec;
+	const char *alternative_path;
+
+	struct option module_list_options[] = {
+		OPT_STRING(0, "prefix", &alternative_path,
+			   N_("path"),
+			   N_("alternative anchor for relative paths")),
+		OPT_END()
+	};
+
+	static const char * const git_submodule_helper_usage[] = {
+		N_("git submodule--helper module_list [--prefix=<path>] [<path>...]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_list_options,
+			     git_submodule_helper_usage, 0);
+
+	module_list_compute(argc, argv, alternative_path
+					? alternative_path
+					: prefix, &pathspec);
+
+	if (ps_matched && report_path_error(ps_matched, &pathspec, prefix)) {
+		printf("#unmatched\n");
+		return 1;
+	}
+
+	for (i = 0; i < ce_used; i++) {
+		const struct cache_entry *ce = ce_entries[i];
+
+		if (string_list_has_string(&already_printed, ce->name))
+			continue;
+
+		if (ce_stage(ce)) {
+			printf("%06o %s U\t", ce->ce_mode, sha1_to_hex(null_sha1));
+		} else {
+			printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
+		}
+
+		utf8_fprintf(stdout, "%s\n", ce->name);
+
+		string_list_insert(&already_printed, ce->name);
+	}
+	return 0;
+}
+
+int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
+{
+	if (argc < 2)
+		goto usage;
+
+	if (!strcmp(argv[1], "module_list"))
+		return module_list(--argc, ++argv, prefix);
+usage:
+	usage("git submodule--helper module_list\n");
+}
diff --git a/git-submodule.sh b/git-submodule.sh
index 36797c3..af9ecef 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -145,48 +145,6 @@ relative_path ()
 	echo "$result$target"
 }
 
-#
-# Get submodule info for registered submodules
-# $@ = path to limit submodule list
-#
-module_list()
-{
-	eval "set $(git rev-parse --sq --prefix "$wt_prefix" -- "$@")"
-	(
-		git ls-files -z --error-unmatch --stage -- "$@" ||
-		echo "unmatched pathspec exists"
-	) |
-	@@PERL@@ -e '
-	my %unmerged = ();
-	my ($null_sha1) = ("0" x 40);
-	my @out = ();
-	my $unmatched = 0;
-	$/ = "\0";
-	while (<STDIN>) {
-		if (/^unmatched pathspec/) {
-			$unmatched = 1;
-			next;
-		}
-		chomp;
-		my ($mode, $sha1, $stage, $path) =
-			/^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/;
-		next unless $mode eq "160000";
-		if ($stage ne "0") {
-			if (!$unmerged{$path}++) {
-				push @out, "$mode $null_sha1 U\t$path\n";
-			}
-			next;
-		}
-		push @out, "$_\n";
-	}
-	if ($unmatched) {
-		print "#unmatched\n";
-	} else {
-		print for (@out);
-	}
-	'
-}
-
 die_if_unmatched ()
 {
 	if test "$1" = "#unmatched"
@@ -532,7 +490,7 @@ cmd_foreach()
 	# command in the subshell (and a recursive call to this function)
 	exec 3<&0
 
-	module_list |
+	git submodule--helper module_list --prefix "$wt_prefix"|
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
@@ -592,7 +550,7 @@ cmd_init()
 		shift
 	done
 
-	module_list "$@" |
+	git submodule--helper module_list --prefix "$wt_prefix" "$@" |
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
@@ -674,7 +632,7 @@ cmd_deinit()
 		die "$(eval_gettext "Use '.' if you really want to deinitialize all submodules")"
 	fi
 
-	module_list "$@" |
+	git submodule--helper module_list --prefix "$wt_prefix" "$@" |
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
@@ -790,7 +748,7 @@ cmd_update()
 	fi
 
 	cloned_modules=
-	module_list "$@" | {
+	git submodule--helper module_list --prefix "$wt_prefix" "$@" | {
 	err=
 	while read mode sha1 stage sm_path
 	do
@@ -1222,7 +1180,7 @@ cmd_status()
 		shift
 	done
 
-	module_list "$@" |
+	git submodule--helper module_list --prefix "$wt_prefix" "$@" |
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
@@ -1299,7 +1257,7 @@ cmd_sync()
 		esac
 	done
 	cd_to_toplevel
-	module_list "$@" |
+	git submodule--helper module_list --prefix "$wt_prefix" "$@" |
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
diff --git a/git.c b/git.c
index fe94066..721995e 100644
--- a/git.c
+++ b/git.c
@@ -468,6 +468,7 @@ static struct cmd_struct commands[] = {
 	{ "stage", cmd_add, RUN_SETUP | NEED_WORK_TREE },
 	{ "status", cmd_status, RUN_SETUP | NEED_WORK_TREE },
 	{ "stripspace", cmd_stripspace },
+	{ "submodule--helper", cmd_submodule__helper, RUN_SETUP },
 	{ "symbolic-ref", cmd_symbolic_ref, RUN_SETUP },
 	{ "tag", cmd_tag, RUN_SETUP },
 	{ "unpack-file", cmd_unpack_file, RUN_SETUP },
-- 
2.5.0.1.gc648038.dirty

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-08-03 22:58 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-31  0:19 [PATCH] add: remove dead code Stefan Beller
2015-07-31 16:41 ` Junio C Hamano
2015-07-31 23:09 ` [RFC/PATCH 0/2] Submodules: refactoring the `module_list` function Stefan Beller
2015-07-31 23:09 ` [RFC/PATCH 1/2] submodule: implement `module_list` as a builtin helper Stefan Beller
2015-08-01  1:01   ` Junio C Hamano
2015-08-03 21:30     ` Stefan Beller
2015-08-03 21:38       ` Junio C Hamano
2015-08-03 21:58       ` [PATCH] " Stefan Beller
2015-08-03 22:04       ` [RFC/PATCH 1/2] " Junio C Hamano
2015-08-03 22:13         ` Stefan Beller
2015-08-03 22:58         ` [PATCH] " Stefan Beller
2015-07-31 23:09 ` [RFC/PATCH 2/2] Testing the new code Stefan Beller
2015-08-01  1:02   ` Junio C Hamano
2015-08-03 16:23     ` Stefan Beller
2015-08-03 19:47       ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.