dash.vger.kernel.org archive mirror
From: Jilles Tjoelker <jilles@stack.nl>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Harald van Dijk <harald@gigawatt.nl>,
	olof@ethup.se, dash@vger.kernel.org
Subject: Re: Parameter expansion, patterns and fnmatch
Date: Fri, 2 Sep 2016 17:12:59 +0200
Message-ID: <20160902151259.GB87540@stack.nl>
In-Reply-To: <20160902140437.GA12639@gondor.apana.org.au>

On Fri, Sep 02, 2016 at 10:04:37PM +0800, Herbert Xu wrote:
> Harald van Dijk <harald@gigawatt.nl> wrote:
> > Yes, this looks like a bug in dash. With the default --disable-fnmatch 
> > code, when dash encounters [ in a pattern, it immediately treats the 
> > following characters as part of the set. If it then encounters the end 
> > of the pattern without having seen a matching ], it attempts to reset 
> > the state and continue as if [ was treated as a literal character right 
> > from the start. The attempt to reset the state doesn't look right, and 
> > has been like this since at least the initial Git commit in 2005.

> pdksh exhibits the same behaviour:

> $ pdksh -c 'foo=[abc]; echo ${foo#[}'
> [abc]
> $

> POSIX says:

> 9.3.3 BRE Special Characters

> A BRE special character has special properties in certain contexts.
> Outside those contexts, or when preceded by a backslash, such a
> character is a BRE that matches the special character itself. The
> BRE special characters and the contexts in which they have their
> special meaning are as follows:

> .[\
> The period, left-bracket, and backslash shall be special except
> when used in a bracket expression (see RE Bracket Expression). An
> expression containing a '[' that is not preceded by a backslash
> and is not part of a bracket expression produces undefined results.

I think this interpretation of POSIX is incorrect. This is about shell
patterns, not basic regular expressions. Shell patterns are specified in
XCU 2.13 Pattern Matching Notation. In XCU 2.13.1, it is written:

] [
] If an open bracket introduces a bracket expression as in XBD Section
] 9.3.5, except that the <exclamation-mark> character ('!') shall
] replace the <circumflex> character ('^') in its role in a non-matching
] list in the regular expression notation, it shall introduce a pattern
] bracket expression. A bracket expression starting with an unquoted
] <circumflex> character produces unspecified results. Otherwise, '['
] shall match the character itself.

Therefore, pdksh is wrong and the output should be abc].
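A quick way to check which behaviour a given shell implements might look
like the sketch below. The variable names are only illustrative. The
backslash-quoted form is included as a baseline, since a quoted bracket
always matches itself regardless of how the shell treats a lone [.

```shell
foo='[abc]'

# Unquoted '[' with no closing ']' in the pattern. Under the reading of
# XCU 2.13.1 given above it should match itself, so this should print
# abc]. A shell with the behaviour discussed here prints [abc] instead.
unquoted=${foo#[}
printf '%s\n' "$unquoted"

# Backslash-quoted '[': always a literal bracket, so the prefix is removed.
quoted=${foo#\[}
printf '%s\n' "$quoted"
```

If the two lines printed differ, the shell is not treating the unmatched
[ as a literal character.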

It is normally better to test against the actively developed mksh
instead of pdksh, but here mksh has the same bug. OpenBSD's ksh also has
some active development but stays closer to the original pdksh.

> > This also affects

> >     case [a in [?) echo ok ;; *) echo bad ;; esac

> > which should print ok.

> Even ksh prints bad here.

I think POSIX may be saying something different here from what it really
wants to say. There is text in 2.13.3 Patterns Used for Filename
Expansion that leaves unspecified whether [? matches only the literal
filename component [? or all two-character filename components starting
with [ (other slash-separated components in the same pattern are
unaffected). However, if ksh93 behaves similarly in a case statement,
that may have been what the standard had intended to say.

From the standpoint of keeping the code as simple as possible, however,
this seems unhelpful. Under the choice where a bracket that does not
introduce a bracket expression disables the other special characters, a
pattern like *[ should match only the literal string *[, so any work
already done on the * is wrong. Implementing this choice therefore
requires an additional scan for brackets that do not introduce a bracket
expression.
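To compare shells on these cases, a small probe along the following lines
may be useful. The matches function is a hypothetical helper written for
this sketch, not part of any shell. No expected output is given for the
unquoted bracket cases, because that is exactly what differs between
implementations.

```shell
# matches STRING PATTERN: hypothetical helper for this probe only.
# The unquoted $2 is deliberately used as a pattern by case.
matches() {
    case $1 in
        $2) echo match ;;
        *)  echo "no match" ;;
    esac
}

# '[' followed by '?': the case discussed above; results vary by shell.
matches '[a' '[?'
matches '[?' '[?'

# '*' followed by '[': the two readings disagree only on the first line.
matches 'x[' '*['
matches '*[' '*['

# Baseline: a backslash-quoted '[' written directly in the pattern is
# literal, so this matches any two-character string starting with '['.
case '[a' in
    \[?) baseline=match ;;
    *)   baseline="no match" ;;
esac
echo "$baseline"
```

If [ is literal and * stays special, x[ matches *[. If the unmatched
bracket disables the other special characters, only the string *[ itself
matches.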

-- 
Jilles Tjoelker
