dash.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Parameter expansion, patterns and fnmatch
@ 2016-08-09  9:28 Olof Johansson
  2016-08-09 21:39 ` Harald van Dijk
  0 siblings, 1 reply; 18+ messages in thread
From: Olof Johansson @ 2016-08-09  9:28 UTC (permalink / raw)
  To: dash

Hello,

I'm seeing some discrepancies between dash built with --enable-fnmatch
and dash built without that got me curious. Maybe you could help shed
some light?

 foo='[abc]'
 echo ${foo#[}
 echo ${foo#\[}

With dash built with --enable-fnmatch:

 abc]
 abc]

With dash built without --enable-fnmatch:

 [abc]
 abc]

I was able to reproduce this behavior on latest git master
(17a5f24e0). The dash manual states:

> The end of the character class is indicated by a ]; if the ] is
> missing then the [ matches a [ rather than introducing a character
> class.

Which to me suggests that the non-fnmatch case is not the expected
behavior. Is this interpretation correct?

Thanks,
-- 
Olof Johansson                            https://github.com/olof

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-08-09  9:28 Parameter expansion, patterns and fnmatch Olof Johansson
@ 2016-08-09 21:39 ` Harald van Dijk
  2016-08-17 14:50   ` Olof Johansson
  2016-09-02 14:04   ` Herbert Xu
  0 siblings, 2 replies; 18+ messages in thread
From: Harald van Dijk @ 2016-08-09 21:39 UTC (permalink / raw)
  To: Olof Johansson, dash

[-- Attachment #1: Type: text/plain, Size: 1534 bytes --]

On 09/08/16 11:28, Olof Johansson wrote:
> Hello,
>
> I'm seeing some discrepancies between dash built with --enable-fnmatch
> and dash built without that got me curious. Maybe you could help shed
> some light?
>
>  foo='[abc]'
>  echo ${foo#[}
>  echo ${foo#\[}
>
> With dash built with --enable-fnmatch:
>
>  abc]
>  abc]
>
> With dash built without --enable-fnmatch:
>
>  [abc]
>  abc]

Nice find.

> I was able to reproduce this behavior on latest git master
> (17a5f24e0). The dash manual states:
>
>> The end of the character class is indicated by a ]; if the ] is
>> missing then the [ matches a [ rather than introducing a character
>> class.
>
> Which to me suggests that the non-fnmatch case is not the expected
> behavior. Is this interpretation correct?

Yes, this looks like a bug in dash. With the default --disable-fnmatch 
code, when dash encounters [ in a pattern, it immediately treats the 
following characters as part of the set. If it then encounters the end 
of the pattern without having seen a matching ], it attempts to reset 
the state and continue as if [ was treated as a literal character right 
from the start. The attempt to reset the state doesn't look right, and 
has been like this since at least the initial Git commit in 2005.

This also affects

     case [a in [?) echo ok ;; *) echo bad ;; esac

which should print ok.

Attached is a patch that attempts to reset the state correctly. It 
passes your test and mine, but I have not yet tested it extensively.

> Thanks,
>

Cheers,
Harald van Dijk

[-- Attachment #2: lit-lbrack.patch --]
[-- Type: text/x-patch, Size: 234 bytes --]

--- a/src/expand.c
+++ b/src/expand.c
@@ -1594,7 +1594,8 @@ pmatch(const char *pattern, const char *string)
 			do {
 				if (!c) {
 					p = startp;
-					c = *p;
+					q--;
+					c = '[';
 					goto dft;
 				}
 				if (c == '[') {

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-08-09 21:39 ` Harald van Dijk
@ 2016-08-17 14:50   ` Olof Johansson
  2016-09-02 14:04   ` Herbert Xu
  1 sibling, 0 replies; 18+ messages in thread
From: Olof Johansson @ 2016-08-17 14:50 UTC (permalink / raw)
  To: Harald van Dijk, dash

On 2016-08-09 23:39 +0200, Harald van Dijk wrote:
> Yes, this looks like a bug in dash. With the default
> --disable-fnmatch code, when dash encounters [ in a pattern, it
> immediately treats the following characters as part of the set. If
> it then encounters the end of the pattern without having seen a
> matching ], it attempts to reset the state and continue as if [ was
> treated as a literal character right from the start. The attempt to
> reset the state doesn't look right, and has been like this since at
> least the initial Git commit in 2005.
> 
> This also affects
> 
>     case [a in [?) echo ok ;; *) echo bad ;; esac
> 
> which should print ok.
> 
> Attached is a patch that attempts to reset the state correctly. It
> passes your test and mine, but I have not yet tested it extensively.

Thanks a lot! For what it's worth, I've been using dash with this
patch this for about a week now, including building a lot of open
source software (via OpenEmbedded). I haven't seen any issues yet.

-- 
Olof Johansson                            https://github.com/olof

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-08-09 21:39 ` Harald van Dijk
  2016-08-17 14:50   ` Olof Johansson
@ 2016-09-02 14:04   ` Herbert Xu
  2016-09-02 14:25     ` Eric Blake
  2016-09-02 15:12     ` Jilles Tjoelker
  1 sibling, 2 replies; 18+ messages in thread
From: Herbert Xu @ 2016-09-02 14:04 UTC (permalink / raw)
  To: Harald van Dijk; +Cc: olof, dash

Harald van Dijk <harald@gigawatt.nl> wrote:
>
> Yes, this looks like a bug in dash. With the default --disable-fnmatch 
> code, when dash encounters [ in a pattern, it immediately treats the 
> following characters as part of the set. If it then encounters the end 
> of the pattern without having seen a matching ], it attempts to reset 
> the state and continue as if [ was treated as a literal character right 
> from the start. The attempt to reset the state doesn't look right, and 
> has been like this since at least the initial Git commit in 2005.

pdksh exhibits the same behaviour:

$ pdksh -c 'foo=[abc]; echo ${foo#[}'
[abc]
$

POSIX says:

9.3.3 BRE Special Characters

A BRE special character has special properties in certain contexts.
Outside those contexts, or when preceded by a backslash, such a
character is a BRE that matches the special character itself. The
BRE special characters and the contexts in which they have their
special meaning are as follows:

.[\
The period, left-bracket, and backslash shall be special except
when used in a bracket expression (see RE Bracket Expression). An
expression containing a '[' that is not preceded by a backslash
and is not part of a bracket expression produces undefined results.

> This also affects
> 
>     case [a in [?) echo ok ;; *) echo bad ;; esac
> 
> which should print ok.

Even ksh prints bad here.

I would however consider a patch that simplifies the code in the
undefined case.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:04   ` Herbert Xu
@ 2016-09-02 14:25     ` Eric Blake
  2016-09-02 14:29       ` Herbert Xu
                         ` (2 more replies)
  2016-09-02 15:12     ` Jilles Tjoelker
  1 sibling, 3 replies; 18+ messages in thread
From: Eric Blake @ 2016-09-02 14:25 UTC (permalink / raw)
  To: Herbert Xu, Harald van Dijk; +Cc: olof, dash


[-- Attachment #1.1: Type: text/plain, Size: 2014 bytes --]

On 09/02/2016 09:04 AM, Herbert Xu wrote:

>> Yes, this looks like a bug in dash. With the default --disable-fnmatch 
>> code, when dash encounters [ in a pattern, it immediately treats the 
>> following characters as part of the set. If it then encounters the end 
>> of the pattern without having seen a matching ], it attempts to reset 
>> the state and continue as if [ was treated as a literal character right 
>> from the start.

> 
> POSIX says:
> 
> 9.3.3 BRE Special Characters
> 
> A BRE special character has special properties in certain contexts.
...
> An
> expression containing a '[' that is not preceded by a backslash
> and is not part of a bracket expression produces undefined results.


Ah, but POSIX also says:

2.13.1 Patterns Matching a Single Character

[
    If an open bracket introduces a bracket expression as in XBD RE
Bracket Expression, except that the <exclamation-mark> character ( '!' )
shall replace the <circumflex> character ( '^' ) in its role in a
non-matching list in the regular expression notation, it shall introduce
a pattern bracket expression. A bracket expression starting with an
unquoted <circumflex> character produces unspecified results. Otherwise,
'[' shall match the character itself.

So while a lone '[' is unspecified in a normal BRE, it is well-defined
in a shell filename pattern matching context.  Since '[' is not a
bracket expression, it MUST be treated as a literal '[', so ${foo#[}
MUST strip the leading [ from the contents of foo, without requiring
that the [ be quoted.

> 
>> This also affects
>>
>>     case [a in [?) echo ok ;; *) echo bad ;; esac
>>
>> which should print ok.
> 
> Even ksh prints bad here.

So ksh is also buggy.

> 
> I would however consider a patch that simplifies the code in the
> undefined case.

Except that it is well-defined by POSIX, not undefined.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:25     ` Eric Blake
@ 2016-09-02 14:29       ` Herbert Xu
  2016-09-02 14:49         ` Eric Blake
  2016-09-02 14:46       ` Herbert Xu
  2016-09-02 14:48       ` Eric Blake
  2 siblings, 1 reply; 18+ messages in thread
From: Herbert Xu @ 2016-09-02 14:29 UTC (permalink / raw)
  To: Eric Blake; +Cc: Harald van Dijk, olof, dash

On Fri, Sep 02, 2016 at 09:25:15AM -0500, Eric Blake wrote:
>
> >> This also affects
> >>
> >>     case [a in [?) echo ok ;; *) echo bad ;; esac
> >>
> >> which should print ok.
> > 
> > Even ksh prints bad here.
> 
> So ksh is also buggy.

Good luck writing a script with an unquoted [ expecting it to be
portable :)
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:25     ` Eric Blake
  2016-09-02 14:29       ` Herbert Xu
@ 2016-09-02 14:46       ` Herbert Xu
  2016-09-02 14:54         ` Eric Blake
  2016-09-02 15:06         ` Chet Ramey
  2016-09-02 14:48       ` Eric Blake
  2 siblings, 2 replies; 18+ messages in thread
From: Herbert Xu @ 2016-09-02 14:46 UTC (permalink / raw)
  To: Eric Blake; +Cc: Harald van Dijk, olof, dash

On Fri, Sep 02, 2016 at 09:25:15AM -0500, Eric Blake wrote:
>
> 2.13.1 Patterns Matching a Single Character
> 
> [
>     If an open bracket introduces a bracket expression as in XBD RE
> Bracket Expression, except that the <exclamation-mark> character ( '!' )
> shall replace the <circumflex> character ( '^' ) in its role in a
> non-matching list in the regular expression notation, it shall introduce
> a pattern bracket expression. A bracket expression starting with an
> unquoted <circumflex> character produces unspecified results. Otherwise,
> '[' shall match the character itself.

BTW, this last sentence is not present in

http://pubs.opengroup.org/onlinepubs/009604499/utilities/xcu_chap02.html#tag_02_13

So I presume it's a newer unreleased revision.

Seriously, you guys are turning POSIX into a joke by introducing
all these new requirements.  At this point I think we should
pretty much give up on POSIX compliance the way it's headed.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:25     ` Eric Blake
  2016-09-02 14:29       ` Herbert Xu
  2016-09-02 14:46       ` Herbert Xu
@ 2016-09-02 14:48       ` Eric Blake
  2 siblings, 0 replies; 18+ messages in thread
From: Eric Blake @ 2016-09-02 14:48 UTC (permalink / raw)
  To: Herbert Xu, Harald van Dijk; +Cc: olof, dash


[-- Attachment #1.1: Type: text/plain, Size: 764 bytes --]

On 09/02/2016 09:25 AM, Eric Blake wrote:
> 
> So while a lone '[' is unspecified in a normal BRE, it is well-defined
> in a shell filename pattern matching context.  Since '[' is not a
> bracket expression, it MUST be treated as a literal '[', so ${foo#[}
> MUST strip the leading [ from the contents of foo, without requiring
> that the [ be quoted.

Rationale: The '[' shell builtin is not undefined behavior.  This was a
specific addition made in POSIX 2008 that was not present in POSIX 2001
(although I can't easily find the bug report that justified it, since
the bugs for POSIX 2008 predate the current austingroupbugs.net database)

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:29       ` Herbert Xu
@ 2016-09-02 14:49         ` Eric Blake
  2016-09-02 14:51           ` Herbert Xu
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Blake @ 2016-09-02 14:49 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Harald van Dijk, olof, dash


[-- Attachment #1.1: Type: text/plain, Size: 657 bytes --]

On 09/02/2016 09:29 AM, Herbert Xu wrote:
> On Fri, Sep 02, 2016 at 09:25:15AM -0500, Eric Blake wrote:
>>
>>>> This also affects
>>>>
>>>>     case [a in [?) echo ok ;; *) echo bad ;; esac
>>>>
>>>> which should print ok.
>>>
>>> Even ksh prints bad here.
>>
>> So ksh is also buggy.
> 
> Good luck writing a script with an unquoted [ expecting it to be
> portable :)

[ '' ] || echo empty

There, I just wrote a portable script with unquoted [ portably
interpreted as itself and not as a bracket filename expansion pattern.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:49         ` Eric Blake
@ 2016-09-02 14:51           ` Herbert Xu
  2016-09-03 12:03             ` Harald van Dijk
  0 siblings, 1 reply; 18+ messages in thread
From: Herbert Xu @ 2016-09-02 14:51 UTC (permalink / raw)
  To: Eric Blake; +Cc: Harald van Dijk, olof, dash

On Fri, Sep 02, 2016 at 09:49:53AM -0500, Eric Blake wrote:
> On 09/02/2016 09:29 AM, Herbert Xu wrote:
> > On Fri, Sep 02, 2016 at 09:25:15AM -0500, Eric Blake wrote:
> >>
> >>>> This also affects
> >>>>
> >>>>     case [a in [?) echo ok ;; *) echo bad ;; esac
> >>>>
> >>>> which should print ok.
> >>>
> >>> Even ksh prints bad here.
> >>
> >> So ksh is also buggy.
> > 
> > Good luck writing a script with an unquoted [ expecting it to be
> > portable :)
> 
> [ '' ] || echo empty
> 
> There, I just wrote a portable script with unquoted [ portably
> interpreted as itself and not as a bracket filename expansion pattern.

dash handles that just fine.  We're not talking about a lone [
in file expansion context here.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:46       ` Herbert Xu
@ 2016-09-02 14:54         ` Eric Blake
  2016-09-02 15:06         ` Chet Ramey
  1 sibling, 0 replies; 18+ messages in thread
From: Eric Blake @ 2016-09-02 14:54 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Harald van Dijk, olof, dash


[-- Attachment #1.1: Type: text/plain, Size: 1699 bytes --]

On 09/02/2016 09:46 AM, Herbert Xu wrote:
> On Fri, Sep 02, 2016 at 09:25:15AM -0500, Eric Blake wrote:
>>
>> 2.13.1 Patterns Matching a Single Character
>>
>> [
>>     If an open bracket introduces a bracket expression as in XBD RE
>> Bracket Expression, except that the <exclamation-mark> character ( '!' )
>> shall replace the <circumflex> character ( '^' ) in its role in a
>> non-matching list in the regular expression notation, it shall introduce
>> a pattern bracket expression. A bracket expression starting with an
>> unquoted <circumflex> character produces unspecified results. Otherwise,
>> '[' shall match the character itself.
> 
> BTW, this last sentence is not present in
> 
> http://pubs.opengroup.org/onlinepubs/009604499/utilities/xcu_chap02.html#tag_02_13
> 

That's the 2004 edition (TC1 of the 2001 spec, aka Issue 6).

> So I presume it's a newer unreleased revision.

Newer but released (TC2 of the 2008 spec, aka Issue 7):
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13_01

The requirement has been there for 8 years now.

> 
> Seriously, you guys are turning POSIX into a joke by introducing
> all these new requirements.  At this point I think we should
> pretty much give up on POSIX compliance the way it's headed.

I hope you're just stating that out of frustration, and not something
that you actually intend to follow through with.  And if there is a
requirement being considered in the Austin Group that you disagree with,
please speak up on the Austin Group - membership is free.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:46       ` Herbert Xu
  2016-09-02 14:54         ` Eric Blake
@ 2016-09-02 15:06         ` Chet Ramey
  1 sibling, 0 replies; 18+ messages in thread
From: Chet Ramey @ 2016-09-02 15:06 UTC (permalink / raw)
  To: Herbert Xu, Eric Blake; +Cc: chet.ramey, Harald van Dijk, olof, dash

On 9/2/16 10:46 AM, Herbert Xu wrote:
> On Fri, Sep 02, 2016 at 09:25:15AM -0500, Eric Blake wrote:
>>
>> 2.13.1 Patterns Matching a Single Character
>>
>> [
>>     If an open bracket introduces a bracket expression as in XBD RE
>> Bracket Expression, except that the <exclamation-mark> character ( '!' )
>> shall replace the <circumflex> character ( '^' ) in its role in a
>> non-matching list in the regular expression notation, it shall introduce
>> a pattern bracket expression. A bracket expression starting with an
>> unquoted <circumflex> character produces unspecified results. Otherwise,
>> '[' shall match the character itself.
> 
> BTW, this last sentence is not present in
> 
> http://pubs.opengroup.org/onlinepubs/009604499/utilities/xcu_chap02.html#tag_02_13

That's pretty old; it's from 2004.

> So I presume it's a newer unreleased revision.

It's in the current revision, which dates from 2008.

> 
> Seriously, you guys are turning POSIX into a joke by introducing
> all these new requirements.  At this point I think we should
> pretty much give up on POSIX compliance the way it's headed.

Let's not go overboard here.  This was introduced to tighten up ambiguous
behavior: what happens when the open bracket *doesn't* introduce a
bracket expression.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:04   ` Herbert Xu
  2016-09-02 14:25     ` Eric Blake
@ 2016-09-02 15:12     ` Jilles Tjoelker
  1 sibling, 0 replies; 18+ messages in thread
From: Jilles Tjoelker @ 2016-09-02 15:12 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Harald van Dijk, olof, dash

On Fri, Sep 02, 2016 at 10:04:37PM +0800, Herbert Xu wrote:
> Harald van Dijk <harald@gigawatt.nl> wrote:
> > Yes, this looks like a bug in dash. With the default --disable-fnmatch 
> > code, when dash encounters [ in a pattern, it immediately treats the 
> > following characters as part of the set. If it then encounters the end 
> > of the pattern without having seen a matching ], it attempts to reset 
> > the state and continue as if [ was treated as a literal character right 
> > from the start. The attempt to reset the state doesn't look right, and 
> > has been like this since at least the initial Git commit in 2005.

> pdksh exhibits the same behaviour:

> $ pdksh -c 'foo=[abc]; echo ${foo#[}'
> [abc]
> $

> POSIX says:

> 9.3.3 BRE Special Characters

> A BRE special character has special properties in certain contexts.
> Outside those contexts, or when preceded by a backslash, such a
> character is a BRE that matches the special character itself. The
> BRE special characters and the contexts in which they have their
> special meaning are as follows:

> .[\
> The period, left-bracket, and backslash shall be special except
> when used in a bracket expression (see RE Bracket Expression). An
> expression containing a '[' that is not preceded by a backslash
> and is not part of a bracket expression produces undefined results.

I think this interpretation of POSIX is incorrect. This is about shell
patterns, not basic regular expressions. Shell patterns are specified in
XCU 2.13 Pattern Matching Notation. In XCU 2.13.1, it is written:

] [
] If an open bracket introduces a bracket expression as in XBD Section
] 9.3.5, except that the <exclamation-mark> character ('!') shall
] replace the <circumflex> character ('^') in its role in a non-matching
] list in the regular expression notation, it shall introduce a pattern
] bracket expression. A bracket expression starting with an unquoted
] <circumflex> character produces unspecified results. Otherwise, '['
] shall match the character itself.

Therefore, pdksh is wrong and the output should be abc].

It is normally better to test against the actively developed mksh
instead of pdksh, but here mksh has the same bug. OpenBSD's ksh also has
some active development but stays closer to the original pdksh.

> > This also affects

> >     case [a in [?) echo ok ;; *) echo bad ;; esac

> > which should print ok.

> Even ksh prints bad here.

I think POSIX may be saying something different here from what it really
wants to say. There is text in 2.13.3 Patterns Used for Filename
Expansion that leaves unspecified whether [? matches only the literal
filename component [? or all two-character filename components starting
with [ (other slash-separated components in the same pattern are
unaffected). However, if ksh93 behaves similarly in a case statement,
that may have been what the standard had intended to say.

Looking at as simple code as possible, this seems, however, unhelpful.
Since a pattern like *[ should match the literal string *[ in the choice
where brackets that do not introduce a bracket expression are supposed
to disable other special characters and any earlier work on the * is
therefore wrong, implementing this choice requires an additional scan
for brackets that do not introduce a bracket expression.

-- 
Jilles Tjoelker

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-02 14:51           ` Herbert Xu
@ 2016-09-03 12:03             ` Harald van Dijk
  2016-09-03 13:05               ` Herbert Xu
  0 siblings, 1 reply; 18+ messages in thread
From: Harald van Dijk @ 2016-09-03 12:03 UTC (permalink / raw)
  To: Herbert Xu, Eric Blake; +Cc: olof, dash

On 02/09/16 16:51, Herbert Xu wrote:
> On Fri, Sep 02, 2016 at 09:49:53AM -0500, Eric Blake wrote:
>> On 09/02/2016 09:29 AM, Herbert Xu wrote:
>>> On Fri, Sep 02, 2016 at 09:25:15AM -0500, Eric Blake wrote:
>>>>
>>>>>> This also affects
>>>>>>
>>>>>>     case [a in [?) echo ok ;; *) echo bad ;; esac
>>>>>>
>>>>>> which should print ok.
>>>>>
>>>>> Even ksh prints bad here.
>>>>
>>>> So ksh is also buggy.
>>>
>>> Good luck writing a script with an unquoted [ expecting it to be
>>> portable :)
>>
>> [ '' ] || echo empty
>>
>> There, I just wrote a portable script with unquoted [ portably
>> interpreted as itself and not as a bracket filename expansion pattern.
>
> dash handles that just fine.  We're not talking about a lone [
> in file expansion context here.

We were originally talking about the [ in ${foo#[}. That is a lone [ in 
pattern matching context, which is what file expansion context is.

We're talking about something that dash has had code to handle for >10 
years, that's been documented by dash as supported for >10 years, and 
now that it turns out there's a flaw in the code where dash does not 
behave as documented and as clearly intended by the code, it's POSIX's 
fault?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-03 12:03             ` Harald van Dijk
@ 2016-09-03 13:05               ` Herbert Xu
  2016-09-03 13:19                 ` Harald van Dijk
  0 siblings, 1 reply; 18+ messages in thread
From: Herbert Xu @ 2016-09-03 13:05 UTC (permalink / raw)
  To: Harald van Dijk; +Cc: Eric Blake, olof, dash

On Sat, Sep 03, 2016 at 02:03:28PM +0200, Harald van Dijk wrote:
> >>>>
> >>>>>>This also affects
> >>>>>>
> >>>>>>    case [a in [?) echo ok ;; *) echo bad ;; esac
>
> We're talking about something that dash has had code to handle for
> >10 years, that's been documented by dash as supported for >10
> years, and now that it turns out there's a flaw in the code where
> dash does not behave as documented and as clearly intended by the
> code, it's POSIX's fault?

dash has never worked this way.  What's worse is that your patch
changes dash's behaviour in a way that is inconsistent with every
shell out there except bash.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-03 13:05               ` Herbert Xu
@ 2016-09-03 13:19                 ` Harald van Dijk
  2016-09-03 13:58                   ` Herbert Xu
  0 siblings, 1 reply; 18+ messages in thread
From: Harald van Dijk @ 2016-09-03 13:19 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Eric Blake, olof, dash

On 03/09/16 15:05, Herbert Xu wrote:
> On Sat, Sep 03, 2016 at 02:03:28PM +0200, Harald van Dijk wrote:
>>>>>>
>>>>>>>> This also affects
>>>>>>>>
>>>>>>>>    case [a in [?) echo ok ;; *) echo bad ;; esac
>>
>> We're talking about something that dash has had code to handle for
>>> 10 years, that's been documented by dash as supported for >10
>> years, and now that it turns out there's a flaw in the code where
>> dash does not behave as documented and as clearly intended by the
>> code, it's POSIX's fault?
>
> dash has never worked this way.

Very misleading to continue quoting the complicated case but leave out 
the simple case again: dash doesn't handle even ${foo#[} correctly. And 
again, by correctly I mean as documented by dash, as required by POSIX, 
and as clearly intended by the author at the time the code was written. 
I do not know if you were the author of that specific piece of code.

But yeah, sure, if the bug has been there for over 10 years, and I'm 
unable to find older versions of dash to check, I would have guessed 
that dash indeed has never worked this way.

> What's worse is that your patch
> changes dash's behaviour in a way that is inconsistent with every
> shell out there except bash.

For the simple case, ksh and posh also strip a leading [ if doing ${foo#[}.

For the complicated case, at the very least, it makes dash consistent 
between builds. If you feel that for the complicated case, bad should be 
printed rather than ok, then that just makes dash buggy in the 
--enable-fnmatch builds, it still means there's something to fix.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-03 13:19                 ` Harald van Dijk
@ 2016-09-03 13:58                   ` Herbert Xu
  2016-09-03 15:16                     ` Harald van Dijk
  0 siblings, 1 reply; 18+ messages in thread
From: Herbert Xu @ 2016-09-03 13:58 UTC (permalink / raw)
  To: Harald van Dijk; +Cc: Eric Blake, olof, dash

On Sat, Sep 03, 2016 at 03:19:57PM +0200, Harald van Dijk wrote:
> 
> But yeah, sure, if the bug has been there for over 10 years, and I'm
> unable to find older versions of dash to check, I would have guessed
> that dash indeed has never worked this way.

OK it looks like this actually wasn't the original behaviour.
It was introduced along with the character class support.  So
with that in mind, I feel a lot happier in changing the behaviour
of the case statement.

I've changed your patch slightly and will commit it if there are
no other issues.

---8<---
Subject: expand - Fix dangling left square brackets in patterns

When there is an unmatched left square bracket in patterns, pmatch
will behave strangely and exhibit undefined behaviour.  This patch
(based on Harld van Dijk's original) fixes this by treating it as
a literal left square bracket.

Reported-by: Olof Johansson <olof@ethup.se>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/src/expand.c b/src/expand.c
index 36bea76..2a50830 100644
--- a/src/expand.c
+++ b/src/expand.c
@@ -1584,14 +1584,14 @@ pmatch(const char *pattern, const char *string)
 				p++;
 			}
 			found = 0;
-			chr = *q++;
+			chr = *q;
 			if (chr == '\0')
 				return 0;
 			c = *p++;
 			do {
 				if (!c) {
 					p = startp;
-					c = *p;
+					c = '[';
 					goto dft;
 				}
 				if (c == '[') {
@@ -1618,6 +1618,7 @@ pmatch(const char *pattern, const char *string)
 			} while ((c = *p++) != ']');
 			if (found == invert)
 				return 0;
+			q++;
 			break;
 		}
 dft:	        default:
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Parameter expansion, patterns and fnmatch
  2016-09-03 13:58                   ` Herbert Xu
@ 2016-09-03 15:16                     ` Harald van Dijk
  0 siblings, 0 replies; 18+ messages in thread
From: Harald van Dijk @ 2016-09-03 15:16 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Eric Blake, olof, dash

On 03/09/16 15:58, Herbert Xu wrote:
> On Sat, Sep 03, 2016 at 03:19:57PM +0200, Harald van Dijk wrote:
>>
>> But yeah, sure, if the bug has been there for over 10 years, and I'm
>> unable to find older versions of dash to check, I would have guessed
>> that dash indeed has never worked this way.
>
> OK it looks like this actually wasn't the original behaviour.
> It was introduced along with the character class support.  So
> with that in mind, I feel a lot happier in changing the behaviour
> of the case statement.
>
> I've changed your patch slightly and will commit it if there are
> no other issues.

None that I can see. Agreed that avoiding increment-decrement-increment 
to just do a single increment instead is an improvement, and I don't 
spot any issues in it.

> ---8<---
> Subject: expand - Fix dangling left square brackets in patterns
>
> When there is an unmatched left square bracket in patterns, pmatch
> will behave strangely and exhibit undefined behaviour.  This patch
> (based on Harld van Dijk's original) fixes this by treating it as
> a literal left square bracket.
>
> Reported-by: Olof Johansson <olof@ethup.se>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/src/expand.c b/src/expand.c
> index 36bea76..2a50830 100644
> --- a/src/expand.c
> +++ b/src/expand.c
> @@ -1584,14 +1584,14 @@ pmatch(const char *pattern, const char *string)
>  				p++;
>  			}
>  			found = 0;
> -			chr = *q++;
> +			chr = *q;
>  			if (chr == '\0')
>  				return 0;
>  			c = *p++;
>  			do {
>  				if (!c) {
>  					p = startp;
> -					c = *p;
> +					c = '[';
>  					goto dft;
>  				}
>  				if (c == '[') {
> @@ -1618,6 +1618,7 @@ pmatch(const char *pattern, const char *string)
>  			} while ((c = *p++) != ']');
>  			if (found == invert)
>  				return 0;
> +			q++;
>  			break;
>  		}
>  dft:	        default:
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-09-03 15:16 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-09  9:28 Parameter expansion, patterns and fnmatch Olof Johansson
2016-08-09 21:39 ` Harald van Dijk
2016-08-17 14:50   ` Olof Johansson
2016-09-02 14:04   ` Herbert Xu
2016-09-02 14:25     ` Eric Blake
2016-09-02 14:29       ` Herbert Xu
2016-09-02 14:49         ` Eric Blake
2016-09-02 14:51           ` Herbert Xu
2016-09-03 12:03             ` Harald van Dijk
2016-09-03 13:05               ` Herbert Xu
2016-09-03 13:19                 ` Harald van Dijk
2016-09-03 13:58                   ` Herbert Xu
2016-09-03 15:16                     ` Harald van Dijk
2016-09-02 14:46       ` Herbert Xu
2016-09-02 14:54         ` Eric Blake
2016-09-02 15:06         ` Chet Ramey
2016-09-02 14:48       ` Eric Blake
2016-09-02 15:12     ` Jilles Tjoelker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).