From: Eric Sunshine <sunshine@sunshineco.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Eric Sunshine via GitGitGadget <gitgitgadget@gmail.com>,
	Git List <git@vger.kernel.org>, Jeff King <peff@peff.net>,
	Elijah Newren <newren@gmail.com>,
	Fabian Stelzer <fs@gigacodes.de>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH 02/18] chainlint.pl: add POSIX shell lexical analyzer
Date: Sat, 3 Sep 2022 02:00:11 -0400
Message-ID: <CAPig+cTfcz3cJ3-ESW-yUNa7QC0HbjZ_giDQA72gBWp5T4Zb6w@mail.gmail.com>
In-Reply-To: <220901.86fshbjmqj.gmgdl@evledraar.gmail.com>

On Thu, Sep 1, 2022 at 8:35 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Thu, Sep 01 2022, Eric Sunshine via GitGitGadget wrote:
> Just generally on this series:
>
> > +     $tag =~ s/['"\\]//g;
>
> I think this would be a *lot* easier to read if all of these little
> regex decls could be split out into some "grammar" class, or other
> helper module/namespace. So e.g.:
>
>         my $SCRIPT_QUOTE_RX = qr/['"\\]/;

Taken out of context (as in the quoted snippet), it may indeed be
difficult to understand what that line is doing; however, in context
with a meaningful function name:

    sub scan_heredoc_tag {
        ...
        my $tag = $self->scan_token();
        $tag =~ s/['"\\]//g;
        push(@{$self->{heretags}}, $indented ? "\t$tag" : "$tag");
        ...
    }

for someone who is familiar with common heredoc tag quoting/escaping
(i.e. <<'EOF', <<"EOF", <<\EOF), I find the inline character class
`['"\\]` much easier to understand than some opaque name such as
$SCRIPT_QUOTE_RX, doubly so because the definition of the named regex
might be far removed from the actual code which uses it, which would
require going and studying that definition before being able to
understand what this code is doing.
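
To make the point concrete, here is a small standalone sketch (not code
from the patch; the helper name strip_heredoc_quoting is made up purely
for illustration) which applies the same substitution to each of the
common heredoc tag forms:

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: remove quote and backslash
# characters from a heredoc tag, using the same substitution as the patch.
sub strip_heredoc_quoting {
    my ($tag) = @_;
    $tag =~ s/['"\\]//g;
    return $tag;
}

# Tags as they might appear after `<<` in a test script.
for my $raw (q{'EOF'}, q{"EOF"}, q{\EOF}, q{EOF}) {
    printf "%-6s => %s\n", $raw, strip_heredoc_quoting($raw);
}
```

Every form reduces to the bare tag `EOF`, which is what is needed to
recognize the line that terminates the heredoc body.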

I grasp you made that name up on-the-fly as an example, but that does
highlight another reason why I'd be hesitant to try to pluck out and
name these regexes. Specifically, naming is hard and I don't trust
that I could come up with succinct meaningful names which would convey
what a regex does as well as the actual regex itself conveys what it
does. In context within the well-named function, `s/['"\\]//g` is
obviously stripping quoting/escaping from the tag name; trying to come
up with a succinct yet accurate name to convey that intention is
difficult. And this is just one example. The script is littered with
little regexes like this, and they are almost all unique, thus making
the task of inventing succinct meaningful names extra difficult. And,
as noted above, I'm not at all convinced that plucking the regex out
of its natural context -- thus making the reader go elsewhere to find
the definition of the regex -- would help improve comprehension.

> Then:
>
> > +     return $cc if $cc =~ /^(?:&&|\|\||>>|;;|<&|>&|<>|>\|)$/;
>
>         my $SCRIPT_WHATEVER_RX = qr/
>                 ^(?:
>                 &&
>                 |
>                 \|\|
>                 [...]
>         /x;
>
> etc., i.e. we could then make use of /x to add inline comments to these.

`/x` does make this slightly easier to grok, and this is an example
of a regex which might be easy to name (i.e. $TWO_CHAR_OPERATOR), but
-- extra mandatory escaping aside -- it's not hard to understand this
one as-is; it's pretty obvious that it's looking for operators `&&`,
`||`, `>>`, `;;`, `<&`, `>&`, `<>`, and `>|`.
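
For anyone who wants to convince themselves, a minimal standalone sketch
(again not taken from the patch; is_two_char_operator is a hypothetical
name used only here) exercises the regex against those operators plus a
couple of two-character strings which are not in the list:

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: true if $cc is exactly one
# of the two-character operators matched by the regex from the patch.
sub is_two_char_operator {
    my ($cc) = @_;
    return $cc =~ /^(?:&&|\|\||>>|;;|<&|>&|<>|>\|)$/ ? 1 : 0;
}

# The first eight candidates are in the alternation; the last two are not.
for my $cc ('&&', '||', '>>', ';;', '<&', '>&', '<>', '>|', '<<', '&|') {
    printf "%s %s\n", $cc,
        is_two_char_operator($cc) ? 'operator' : 'not matched';
}
```

The only noise in the pattern is the backslash needed in front of each
literal `|`; the alternation otherwise reads as a plain list of operators.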
