All of lore.kernel.org
 help / color / mirror / Atom feed
* Request feature: –no-submodule
@ 2021-06-02 10:31 Ilias Apostolou
  2021-06-02 20:31 ` Taylor Blau
  0 siblings, 1 reply; 12+ messages in thread
From: Ilias Apostolou @ 2021-06-02 10:31 UTC (permalink / raw)
  To: git

Hello Git community.

As you already know, git ls-files command lists all of the tracked 
files, but submodule names are included.

My team would like a –no-submodule switch to exclude those.

Best rewards,
Ilias Apostolou

Github: https://github.com/LiakosC

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-02 10:31 Request feature: –no-submodule Ilias Apostolou
@ 2021-06-02 20:31 ` Taylor Blau
  2021-06-03  0:55   ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Taylor Blau @ 2021-06-02 20:31 UTC (permalink / raw)
  To: Ilias Apostolou; +Cc: git

On Wed, Jun 02, 2021 at 01:31:11PM +0300, Ilias Apostolou wrote:
> Hello Git community.
>
> As you already know, git ls-files command lists all of the tracked files,
> but submodule names are included.
>
> My team would like a –no-submodule switch to exclude those.

In all honesty, though this seems like a niche request for ls-files to
fulfill, ls-files already has quite the collection of options, so I
wouldn't be sad to see it learn how to do this, too.

The change boils down to having builtin/ls-files.c:show_ce() avoid
printing out submoudles in when '--exclude-submoudles' is given).
Something like this:

--- >8 ---

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 45cc3b23dd..09d76fd068 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -35,6 +35,7 @@ static int show_fsmonitor_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
+static int exclude_submodules;
 static int recurse_submodules;
 static int skipping_duplicates;

@@ -230,6 +231,9 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 	if (max_prefix_len > strlen(fullname))
 		die("git ls-files: internal error - cache entry not superset of prefix");

+	if (exclude_submodules && S_ISGITLINK(ce->ce_mode))
+		return;
+
 	if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
 	    is_submodule_active(repo, ce->name)) {
 		show_submodule(repo, dir, ce->name);
@@ -662,6 +666,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 		OPT_SET_INT_F(0, "full-name", &prefix_len,
 			      N_("make the output relative to the project top directory"),
 			      0, PARSE_OPT_NONEG),
+		OPT_BOOL(0, "exclude-submodules", &exclude_submodules,
+			N_("avoid showing submodules altogether")),
 		OPT_BOOL(0, "recurse-submodules", &recurse_submodules,
 			N_("recurse through submodules")),
 		OPT_BOOL(0, "error-unmatch", &error_unmatch,

--- 8< ---

I'll leave it at that to decide if others think this is a good idea or
not.

Thanks,
Taylor

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-02 20:31 ` Taylor Blau
@ 2021-06-03  0:55   ` Junio C Hamano
  2021-06-03  2:33     ` Taylor Blau
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2021-06-03  0:55 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Ilias Apostolou, git

Taylor Blau <me@ttaylorr.com> writes:

> On Wed, Jun 02, 2021 at 01:31:11PM +0300, Ilias Apostolou wrote:
>> Hello Git community.
>>
>> As you already know, git ls-files command lists all of the tracked files,
>> but submodule names are included.
>>
>> My team would like a –no-submodule switch to exclude those.
>
> In all honesty, though this seems like a niche request for ls-files to
> fulfill, ls-files already has quite the collection of options, so I
> wouldn't be sad to see it learn how to do this, too.

I would be somewhat sad for two reasons.

 - If "I am not interested in any submodule" in a project with
   submodules is a common thing people would want, teaching a trick
   only to "ls-files" is an expensive and ineffective approach, and
   adding the option to everything would just be ugly.  "git diff
   --no-submodule"?  "git add --no-submodule ."?

 - Is "not interested in any submodule" so special and fundamental,
   or is it merely because the project the original requestor is
   looking at happens to have an optional submodule? If the project
   had that optional part as a subdirectory instead, would the
   request have been not --no-submodule but something else?  What
   happens when the project that led to the original request
   acquires another submodule that is more interesting, or what if
   the requestor's interest shifts and makes some submodules
   interesting but others not?  Would the --no-submodule option
   become totally useless in such a case?

I wonder if the "attr" magic of the pathspec, that allows you to
choose paths based on the attributes you set on them, is what the
original requestor missed.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-03  0:55   ` Junio C Hamano
@ 2021-06-03  2:33     ` Taylor Blau
  2021-06-03 10:48       ` Ilias Apostolou
  2021-06-03 17:40       ` Junio C Hamano
  0 siblings, 2 replies; 12+ messages in thread
From: Taylor Blau @ 2021-06-03  2:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Ilias Apostolou, git

On Thu, Jun 03, 2021 at 09:55:57AM +0900, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > In all honesty, though this seems like a niche request for ls-files to
> > fulfill, ls-files already has quite the collection of options, so I
> > wouldn't be sad to see it learn how to do this, too.
>
> I would be somewhat sad for two reasons.

Mmm, that's a convincing set of reasons to think that this is a bad
idea. (And I was sort of on the fence about it anyway by posting the
whole thing as a short diff instead of a polished patch).

> I wonder if the "attr" magic of the pathspec, that allows you to
> choose paths based on the attributes you set on them, is what the
> original requestor missed.

Maybe... but relying on the attr magic for this particular case would
force the requester to set that attribute on all submodules in their
project, and constantly keep that in-sync with their .gitmodules. So, it
certainly make it easier to handle a request like "I don't care about
ls-files telling me about any submodule(s) except these ones", but
perhaps at the expense of some extra bookkeeping.

I might be missing something, though.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-03  2:33     ` Taylor Blau
@ 2021-06-03 10:48       ` Ilias Apostolou
  2021-06-03 22:06         ` Junio C Hamano
  2021-06-03 17:40       ` Junio C Hamano
  1 sibling, 1 reply; 12+ messages in thread
From: Ilias Apostolou @ 2021-06-03 10:48 UTC (permalink / raw)
  To: Taylor Blau, Junio C Hamano; +Cc: git

The reason we need to list all of the true files (except submodules) is 
for code refactoring using the sed unility, for example:

git ls-files | grep -Ev '(png$|ico$)' | xargs sed -i 's/\r\n/\n/'

All of the other alternatives we could think of are very ugly.

On 3/6/2021 5:33 π.μ., Taylor Blau wrote:
> On Thu, Jun 03, 2021 at 09:55:57AM +0900, Junio C Hamano wrote:
>> Taylor Blau <me@ttaylorr.com> writes:
>>
>>> In all honesty, though this seems like a niche request for ls-files to
>>> fulfill, ls-files already has quite the collection of options, so I
>>> wouldn't be sad to see it learn how to do this, too.
>> I would be somewhat sad for two reasons.

In my opinion, this should be a feature for "ls-files" only, since it 
would be nice to have a clean stream of true files.

Thank you for your replies,
Ilias


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-03  2:33     ` Taylor Blau
  2021-06-03 10:48       ` Ilias Apostolou
@ 2021-06-03 17:40       ` Junio C Hamano
  2021-06-03 19:22         ` Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2021-06-03 17:40 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Ilias Apostolou, git

Taylor Blau <me@ttaylorr.com> writes:

> Mmm, that's a convincing set of reasons to think that this is a bad
> idea. (And I was sort of on the fence about it anyway by posting the
> whole thing as a short diff instead of a polished patch).
>
>> I wonder if the "attr" magic of the pathspec, that allows you to
>> choose paths based on the attributes you set on them, is what the
>> original requestor missed.
>
> Maybe... but relying on the attr magic for this particular case would
> force the requester to set that attribute on all submodules in their
> project, and constantly keep that in-sync with their .gitmodules.

Well, it contradicts with the above "convincing" adjective, and
shows that you are not convinced that "submodule"-ness is not all
that essential and it merely is an artifact that the paths that the
original requester happens to be uninterested in are all submodules.

But if we agree that focusing too narrowly on "submodule"-ness is a
bad idea and open our mind to elsewhere, we'd realize that once we
learn that we can "mark" any path with attributes and use that in
magic pathspec, we can mark not just submodules but a subdirectory
as "uninteresting", which will not become useless even when it turns
out that "submodule"-ness wasn't really what the request was about.

Besides, you can iterate over the available submodules with "git
submodule foreach" fairly mechanically, and maintaining the
attribute per path shouldn't be all that hard, I would imagine.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-03 17:40       ` Junio C Hamano
@ 2021-06-03 19:22         ` Jeff King
  2021-06-03 21:54           ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2021-06-03 19:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Ilias Apostolou, git

On Fri, Jun 04, 2021 at 02:40:25AM +0900, Junio C Hamano wrote:

> Besides, you can iterate over the available submodules with "git
> submodule foreach" fairly mechanically, and maintaining the
> attribute per path shouldn't be all that hard, I would imagine.

It doesn't seem outrageous to me for Git to automatically populate
"pseudo-attributes" that connect properties of paths to the attribute
system. I.e., could we just act as if every path that is a gitlink has
the "gitlink" attribute set to true, and let people do:

  git ls-files ':(attr:-gitlink)'

That uses the existing generic mechanism, so it supports complex
situations, but it also makes the "easy" case of "just ignore
submodules" easy, with no attribute maintenance.

I didn't look at the code, though, so I'm not sure how awkward it would
be to implement (usually we decide on attributes only from looking at
the attribute files, not the trees/index themselves, but I think most
code asking about a path would be iterating a list of files in the first
place, and could feed the mode information).

An alternative view is allowing a pathspec that asks about the mode:

  git ls-files ':(mode=160000)'

That also lets you ask about other things, like:

  git ls-files ':(mode=100755)'

but it is probably unnecessarily arcane (even I had to look up the
correct mode for a gitlink just now :) ).

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-03 19:22         ` Jeff King
@ 2021-06-03 21:54           ` Junio C Hamano
  2021-06-04  4:03             ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2021-06-03 21:54 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Ilias Apostolou, git

Jeff King <peff@peff.net> writes:

> It doesn't seem outrageous to me for Git to automatically populate
> "pseudo-attributes" that connect properties of paths to the attribute
> system.

Sounds sensible.  The attribute assignment was designed to purely
depend on paths and not contents, so it might be a bit awkwarad to
implement, but it should be doable.

> An alternative view is allowing a pathspec that asks about the mode:
>
>   git ls-files ':(mode=160000)'
>
> That also lets you ask about other things, like:
>
>   git ls-files ':(mode=100755)'
>
> but it is probably unnecessarily arcane (even I had to look up the
> correct mode for a gitlink just now :) ).

The original request, as I understand the clarification posted
upthread, is not "submodules are uninteresting", but is "we want
regular files" (and we postprocess further the output), so such a
"mode" (pseudo-)attribute that is automatically populated would be a
better fit anyway.  With the current system, they can already do:

    git ls-files -s ':(exclude)*.png' ':(exclude)*.ico)' |
    sed -n -e 's/^100[76][54][54] [0-9a-f]* 0       //p' |
    xargs dos2unix

(cf. <9cc98ca3-bdc5-61bf-450a-99bb47673d6c@gmail.com>)

and with such an auto-pseudo-attribute, presumably something along
this line would work, removing the need for the intermediate filter:

    git ls-files \
	':(attr:mode=100755)' ':(attr:mode=100644)' \
	':(exclude)*.png' ':(exclude)*.ico' |
    xargs dos2unix


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-03 10:48       ` Ilias Apostolou
@ 2021-06-03 22:06         ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2021-06-03 22:06 UTC (permalink / raw)
  To: Ilias Apostolou; +Cc: Taylor Blau, git

Ilias Apostolou <ilias.apostolou.zero@gmail.com> writes:

> The reason we need to list all of the true files (except submodules)
> is for code refactoring using the sed unility, for example:
>
> git ls-files | grep -Ev '(png$|ico$)' | xargs sed -i 's/\r\n/\n/'
> ...
> In my opinion, this should be a feature for "ls-files" only, since it
> would be nice to have a clean stream of true files.

Ah, then pathspec is still the right tool to use, I would think.

For example,

    git ls-files -s ':(exclude)*.png' ':(exclude)*.ico' |
    sed -n -e 's/^100[76][54][54] [0-9a-f]* 0       //p' |
    xargs sed -i 's/\r\n/\n/'

that is,

 (1) ls-files -s can be used to learn what kind of path it is.
     Regular files are either 100644 or 100755.  That way, you can
     also exclude symbolic links, which your example use case would
     probably not want to touch.  And you can filter the output to
     have 'a clean stream of true files' fairly easily Depending on
     the details of your needs, it can be tweaked into 'a clean
     stream of executable files' etc., too.

 (2) pathspec magic like ':(exclude)' can be used to lose your "we
     know png and ico are not text files".

Hope this helps.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-03 21:54           ` Junio C Hamano
@ 2021-06-04  4:03             ` Jeff King
  2021-06-04  5:06               ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2021-06-04  4:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Ilias Apostolou, git

On Fri, Jun 04, 2021 at 06:54:10AM +0900, Junio C Hamano wrote:

> > An alternative view is allowing a pathspec that asks about the mode:
> >
> >   git ls-files ':(mode=160000)'
> >
> > That also lets you ask about other things, like:
> >
> >   git ls-files ':(mode=100755)'
> >
> > but it is probably unnecessarily arcane (even I had to look up the
> > correct mode for a gitlink just now :) ).
> 
> The original request, as I understand the clarification posted
> upthread, is not "submodules are uninteresting", but is "we want
> regular files" (and we postprocess further the output), so such a
> "mode" (pseudo-)attribute that is automatically populated would be a
> better fit anyway.  With the current system, they can already do:
> 
>     git ls-files -s ':(exclude)*.png' ':(exclude)*.ico)' |
>     sed -n -e 's/^100[76][54][54] [0-9a-f]* 0       //p' |
>     xargs dos2unix
> 
> (cf. <9cc98ca3-bdc5-61bf-450a-99bb47673d6c@gmail.com>)
> 
> and with such an auto-pseudo-attribute, presumably something along
> this line would work, removing the need for the intermediate filter:
> 
>     git ls-files \
> 	':(attr:mode=100755)' ':(attr:mode=100644)' \
> 	':(exclude)*.png' ':(exclude)*.ico' |
>     xargs dos2unix
> 

Yeah, that makes sense.

By the way, another reason (beyond a simpler pipeline) that the "magic
pathspec that understands modes" is nicer is that it can be applied to a
more dynamic set of paths.

For instance, you could use a pipeline like the one you showed above to
limit the ls-files output, and then you could feed that set of literal
paths to a command like "git add" (perhaps with --literal-pathspecs).
But you would not want to feed it to git-log like:

  git --literal-pathspecs log $(git ls-files -s | sed ...)

because you'd really want to expand the set of paths at each commit, not
once based on the current index (i.e., it would fail to match paths that
were removed or changed modes).

This is kind of a subtle and esoteric point, but it makes me more
convinced that having powerful pathspec selectors is potentially useful.
Or at least solving a problem that's hard to otherwise solve correctly. :)

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-04  4:03             ` Jeff King
@ 2021-06-04  5:06               ` Junio C Hamano
  2021-06-05  5:45                 ` Ilias Apostolou
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2021-06-04  5:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Ilias Apostolou, git

>> The original request, as I understand the clarification posted
>> ...
>> (cf. <9cc98ca3-bdc5-61bf-450a-99bb47673d6c@gmail.com>)
>> 
>> and with such an auto-pseudo-attribute, presumably something along
>> this line would work, removing the need for the intermediate filter:
>> 
>>     git ls-files \
>> 	':(attr:mode=100755)' ':(attr:mode=100644)' \
>> 	':(exclude)*.png' ':(exclude)*.ico' |
>>     xargs dos2unix
>> 
>
> Yeah, that makes sense.
>
> By the way, another reason (beyond a simpler pipeline) that the "magic
> pathspec that understands modes" is nicer is that it can be applied to a
> more dynamic set of paths.

In the longer term, the project the original request wanted to
invent the "--no-submodule" option for may want to lose the specific
"we know that the only paths we do not want to run dos2unix happen
to be png and ico files in the current codebase" from the above
sample command line, and replace it with something like ':(attr:text)'

Alas, the "text" attribute does *not* work that way, though ;-)
Just like any other attributes, what you assigned yourself counts,
and our "is this file a text?" auto-detection logic only kicks in
when there is no attribute that tells if a path is text or not.

It would be expensive at runtime; even if we were to introduce a
"dynamic" pseudo attribute to tell text files and others apart, we
probably shouldn't use the same "attr:*" magic but use something
distinct (e.g. "dynamic-attr:*") in order to make sure that the
users understand the performance implications.  I think the above
"mode=100755" matic (or anything that requires more than the basic
"which entries in the .gitattributes files does this pathname
match?") would fall into the same category.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Request feature: –no-submodule
  2021-06-04  5:06               ` Junio C Hamano
@ 2021-06-05  5:45                 ` Ilias Apostolou
  0 siblings, 0 replies; 12+ messages in thread
From: Ilias Apostolou @ 2021-06-05  5:45 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: Taylor Blau, git

If I understand correctly, we should learn to use git ls-files special 
parameters like 'attr' and 'exclude'. I'm impressed by the fact that 
these are not wide-spread known. We should learn to use these advanced 
features instead of extra piping. Your emails are precious to my team 
and will help us advance.

Please consider this request closed.
Ilias


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-06-05  5:45 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-02 10:31 Request feature: –no-submodule Ilias Apostolou
2021-06-02 20:31 ` Taylor Blau
2021-06-03  0:55   ` Junio C Hamano
2021-06-03  2:33     ` Taylor Blau
2021-06-03 10:48       ` Ilias Apostolou
2021-06-03 22:06         ` Junio C Hamano
2021-06-03 17:40       ` Junio C Hamano
2021-06-03 19:22         ` Jeff King
2021-06-03 21:54           ` Junio C Hamano
2021-06-04  4:03             ` Jeff King
2021-06-04  5:06               ` Junio C Hamano
2021-06-05  5:45                 ` Ilias Apostolou

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.