All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH] submodule operations: tighten pathspec errors
Date: Mon, 6 Jun 2016 12:31:22 -0700	[thread overview]
Message-ID: <CAGZ79kY+K4-hbzL-KGgsyTHvM0BoXBLawvj2GmWJ9tbTuiSGbg@mail.gmail.com> (raw)
In-Reply-To: <xmqqy46owr0n.fsf@gitster.mtv.corp.google.com>

On Wed, Jun 1, 2016 at 2:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Thu, May 26, 2016 at 1:00 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>>> @@ -36,10 +37,9 @@ static int module_list_compute(int argc, const char **argv,
>>>>
>>>>       for (i = 0; i < active_nr; i++) {
>>>>               const struct cache_entry *ce = active_cache[i];
>>>> -
>>>> -             if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
>>>> -                                 0, ps_matched, 1) ||
>>>> -                 !S_ISGITLINK(ce->ce_mode))
>>>> +             if (!S_ISGITLINK(ce->ce_mode) ||
>>>> +                 !match_pathspec(pathspec, ce->name, ce_namelen(ce),
>>>> +                                 0, ps_matched, 1))
>>>>                       continue;
>>>
>>> OK, this is the crucial bit in this patch. pathspec matches are now
>>> done only against gitlinks, so any unmatch is "pattern or path
>>> given, but there was no such submodule".
>>
>> right.
>
> That changes the semantics, and its user visible effect may deserve
> to be in the documentation, no?
>
> It used to be that "git submodule--helper list COPYING" did not
> complain, but with this change, it would.  Which may be a good
> change, but "git submodule--helper list sub*" where most but not all
> of glob expansion done by the shell are submodule directories may be
> a common thing people may do, and it may be irritating to see the
> unmatch complaints.  I dunno.

I fixed all the variable names and was confident with what I had,
but this is an issue indeed as the error message is plain wrong:
    git$ git submodule--helper list sub*
    error: pathspec 'submodule.c' did not match any file(s) known to git.
    error: pathspec 'submodule-config.c' did not match any file(s) known to git.
    error: pathspec 'submodule-config.h' did not match any file(s) known to git.
    error: pathspec 'submodule-config.o' did not match any file(s) known to git.
    error: pathspec 'submodule.h' did not match any file(s) known to git.
    error: pathspec 'submodule_multiple_references' did not match any
file(s) known to git.
    error: pathspec 'submodule.o' did not match any file(s) known to git.
    error: pathspec 'submodulespec.o' did not match any file(s) known to git.

(I realize I have some stray object files laying around)

So I do not think we can do

-               if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
-                                   0, ps_matched, 1) ||
-                   !S_ISGITLINK(ce->ce_mode))
+               if (!S_ISGITLINK(ce->ce_mode) ||
+                   !match_pathspec(pathspec, ce->name, ce_namelen(ce),
+                                   0, ps_matched, 1))

or we need to have custom error reporting, which checks if any file
still matches the pathspec (and ignore non gitlinks)

>
> When we know we are not going to complain (i.e. --unmatch-ok option

I'd rather go with ignore-unmatch as git-rm does use that.

> is given from the command line), I think it is perfectly fine (and
> it is even preferrable) to swap the order of the check.  The mode
> check done with S_ISGITLINK() is much cheaper and it is likely to
> yield true much less often, which in turn would allow us to make
> fewer calls to more expensive match_pathspec().

As said, in that case we would then need a
    pathspec_mark_ps_matched(ce->name, ce_namelen(ce), ps_matched)

>
> But when we want to diagnose typo (i.e. --unmatch-ok was not given),
> we may want to preserve the current order, so that the "sub*"
> example in a few paragraphs ago would not irritate the users.

I see.

     if (!S_ISGITLINK(ce->ce_mode)) {
        pathspec_mark_ps_matched(...);
        continue;
    } else {
        if ( !match_pathspec(pathspec, ce->name, ce_namelen(ce),
                                     0, ps_matched, 1))
        continue;
    }
    ...

doesn't look very appealing, so I guess we'd just keep the current behavior
of
-               if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
-                                   0, ps_matched, 1) ||
-                   !S_ISGITLINK(ce->ce_mode))


>
>>> It is tempting to update report_path_error() return "OK" when its
>>> first parameter is NULL.
>>
>> such that we can do a
>>
>>     if (report_path_error(unmatch_ok ? NULL : ps_matched, pathspec, prefix)))
>>         result = -1;
>
> I actually was thinking about setting ps_matched to NULL so that
> both match_pathspec() and report_path_error() would get NULL.
> match_pathspec() has to do _more_ work when ps_matched[] aka seen[]
> is given.

Yes for one call; No from the birds eye view;
the first lines that use the seen[]:

    if (seen && seen[i] == MATCHED_EXACTLY)
        continue;

so if we have  `ps->nr > 0` then it is beneficial to have a
seen array, I think?

>
>>>> @@ -407,7 +410,7 @@ cmd_foreach()
>>>>       # command in the subshell (and a recursive call to this function)
>>>>       exec 3<&0
>>>>
>>>> -     git submodule--helper list --prefix "$wt_prefix"|
>>>> +     git submodule--helper list ${unmatch:+--unmatch} --prefix "$wt_prefix"|
>>>
>>> For this to work, somebody must ensure that the variable unmatch is
>>> either unset or set to empty unless the user gave --error-unmatch to
>>> us.  There is a block of empty assignments hear the beginning of
>>> this file for that very purpose, i.e. resetting a stray environment
>>> variable that could be in user's environment.

done.

>>>
>>> The patch itself does not look too bad.  I do not have an opinion on
>>> which one should be the default, and I certainly would understand if
>>> you want to keep the default loose (i.e. not complaining) with an
>>> optional error checking, but whichever default you choose, the
>>> option and variable names need to be clarified to avoid confusion.
>>
>> Ok I'll fix the variable names; I think for consistency with e.g.
>> ls-files --error-unmatch
>> we would want to be loose by default and strict on that option.
>
> I do not think the "typo-prevention" safety measure should suddenly
> be turned into opt-in; I'd suggest "--unmatch-ok" instead.

--ignore-unmatch has the same meaning and is taken by git-rm already?

      parent reply	other threads:[~2016-06-06 19:31 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-21  1:21 [PATCH] submodule operations: tighten pathspec errors Stefan Beller
2016-05-26 20:00 ` Junio C Hamano
2016-06-01 20:55   ` Stefan Beller
2016-06-01 21:14     ` Junio C Hamano
2016-06-01 21:20       ` Junio C Hamano
2016-06-06 19:31       ` Stefan Beller [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGZ79kY+K4-hbzL-KGgsyTHvM0BoXBLawvj2GmWJ9tbTuiSGbg@mail.gmail.com \
    --to=sbeller@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.