dash.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* getopts appears to not be shifting $@ when consuming options
@ 2021-01-29 18:25 earnestly
  2021-01-29 20:15 ` Harald van Dijk
  2021-01-29 20:36 ` Jilles Tjoelker
  0 siblings, 2 replies; 8+ messages in thread
From: earnestly @ 2021-01-29 18:25 UTC (permalink / raw)
  To: dash

In this example dash will repeatedly append 'attr=foo' to the list of
parameters in an infinite loop:

    #!/bin/dash -x

    while getopts :a: arg -a foo -a bar; do
        case $arg in
            a) set -- "$@" attr="$OPTARG"
        esac
    done
    shift "$((OPTIND - 1))"

Instead I expected this to result in parameter list containing
'attr=foo' and 'attr=bar'.

This works in all shells I have been able to test with the exception of
busybox sh:

    * sh   (bash)
    * bash (All versions from 1.14 through 5.1.4)
    * mksh (MIRBSD KSH R59 2020/05/16)
    * ksh  (93u+)
    * zsh  (5.8)
    * zsh --emulate sh
    * heirloom-sh (bourne)

The only workaround I've found is to explicitly use `shift 2` in the a)
case and obviate the final shift using OPTIND.  This will unfortunately
break every other shell.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getopts appears to not be shifting $@ when consuming options
  2021-01-29 18:25 getopts appears to not be shifting $@ when consuming options earnestly
@ 2021-01-29 20:15 ` Harald van Dijk
  2021-01-30 15:31   ` Harald van Dijk
  2021-01-29 20:36 ` Jilles Tjoelker
  1 sibling, 1 reply; 8+ messages in thread
From: Harald van Dijk @ 2021-01-29 20:15 UTC (permalink / raw)
  To: earnestly, dash

Hi,

On 29/01/2021 18:25, earnestly wrote:
> In this example dash will repeatedly append 'attr=foo' to the list of
> parameters in an infinite loop:
> 
>      #!/bin/dash -x
> 
>      while getopts :a: arg -a foo -a bar; do
>          case $arg in
>              a) set -- "$@" attr="$OPTARG"
>          esac
>      done
>      shift "$((OPTIND - 1))"
> 
> Instead I expected this to result in parameter list containing
> 'attr=foo' and 'attr=bar'.

You are correct in your expectation, I believe: ending the loop after 
processing both arguments is what your script should do.

The reason that dash is behaving the way it does is that dash resets the 
getopts state when the positional arguments are changed by the set or 
shift commands. The getopts state is also reset when the positional 
arguments are changed because of a function call, but restored after the 
function returns. This is something shared across ash-based shells; you 
will also find it in FreeBSD's /bin/sh, and if I am not misreading the 
code, NetBSD's /bin/sh as well.

Although there are certainly cases where this behaviour is useful, 
especially the part where it saves and restores state for function 
calls, there are also where it is not, such as yours. Additionally, it 
appears to be in conflict with POSIX, which requires the getopts state 
to be preserved as long as it continues to be called with the same 
arguments: only resetting OPTIND to 1 is specified to reset the state to 
allow arguments to be parsed anew.

I would suggest that if this is changed to conform to POSIX, a 
non-standard method should remain available to allow shell functions to 
use getopts internally, including when the getopts loop in the function 
calls other functions that themselves use getopts, to ensure that any 
existing scripts broken by the change can be easily updated. One way to 
achieve that could be to special-case "local OPTIND=1" so that when the 
function returns, it restores not just the value of OPTIND, but uses 
that moment to additionally restore the extra internal state.

Cheers,
Harald van Dijk

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getopts appears to not be shifting $@ when consuming options
  2021-01-29 18:25 getopts appears to not be shifting $@ when consuming options earnestly
  2021-01-29 20:15 ` Harald van Dijk
@ 2021-01-29 20:36 ` Jilles Tjoelker
  2021-01-29 21:19   ` Harald van Dijk
  2021-01-29 22:25   ` earnestly
  1 sibling, 2 replies; 8+ messages in thread
From: Jilles Tjoelker @ 2021-01-29 20:36 UTC (permalink / raw)
  To: earnestly, dash

On Fri, Jan 29, 2021 at 06:25:25PM +0000, earnestly wrote:
> In this example dash will repeatedly append 'attr=foo' to the list of
> parameters in an infinite loop:

>     #!/bin/dash -x

>     while getopts :a: arg -a foo -a bar; do
>         case $arg in
>             a) set -- "$@" attr="$OPTARG"
>         esac
>     done
>     shift "$((OPTIND - 1))"

> Instead I expected this to result in parameter list containing
> 'attr=foo' and 'attr=bar'.

Hmm, that's pretty clever. By passing the parameters to be parsed to the
getopts command instead of via $@, it is possible to modify $@ without
violating POSIX's rule that only allows parsing a single unmodified set
of parameters (unless getopts is reset via OPTIND=1).

Unfortunately, dash and FreeBSD sh reset the getopts state when the
positional parameters are modified via set or shift. They probably do
this to avoid use after free and out of bounds memory access when a
script violates POSIX's rule. However, when the parameters come from the
getopts command, there is no violation of the rule and no memory
unsafety problem.

> This works in all shells I have been able to test with the exception of
> busybox sh:

>     * sh   (bash)
>     * bash (All versions from 1.14 through 5.1.4)
>     * mksh (MIRBSD KSH R59 2020/05/16)
>     * ksh  (93u+)
>     * zsh  (5.8)
>     * zsh --emulate sh
>     * heirloom-sh (bourne)

> The only workaround I've found is to explicitly use `shift 2` in the a)
> case and obviate the final shift using OPTIND.  This will unfortunately
> break every other shell.

Since you need code to "un-eval" a list of parameters to construct the
getopts command above in a general case, a better workaround would be to
use that code to construct the attr="$OPTARG" list.

If your actual code is more like:

set -- -a foo -a bar # for testing
while getopts :a: arg; do
...

then the script violates the rule I mentioned, and has unspecified
results.

-- 
Jilles Tjoelker

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getopts appears to not be shifting $@ when consuming options
  2021-01-29 20:36 ` Jilles Tjoelker
@ 2021-01-29 21:19   ` Harald van Dijk
  2021-01-29 22:25   ` earnestly
  1 sibling, 0 replies; 8+ messages in thread
From: Harald van Dijk @ 2021-01-29 21:19 UTC (permalink / raw)
  To: Jilles Tjoelker, earnestly, dash

On 29/01/2021 20:36, Jilles Tjoelker wrote:
> Unfortunately, dash and FreeBSD sh reset the getopts state when the
> positional parameters are modified via set or shift. They probably do
> this to avoid use after free and out of bounds memory access when a
> script violates POSIX's rule.

In dash, the getopts command guards against this. getoptscmd() checks 
that shellparam.optind is in bounds before calling getopts(), and 
getopts() checks that shellparam.optoff is in bounds, so as far as 
correctness goes, it should be enough to just remove the resetting of 
shellparam.optind and shellparam.optoff in those places that are not 
supposed to reset the state.

Cheers,
Harald van Dijk

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getopts appears to not be shifting $@ when consuming options
  2021-01-29 20:36 ` Jilles Tjoelker
  2021-01-29 21:19   ` Harald van Dijk
@ 2021-01-29 22:25   ` earnestly
  2021-01-30  6:39     ` Vladimir N. Oleynik
  2021-01-30  7:36     ` Vladimir N. Oleynik
  1 sibling, 2 replies; 8+ messages in thread
From: earnestly @ 2021-01-29 22:25 UTC (permalink / raw)
  To: dash

On Fri, Jan 29, 2021 at 09:36:50PM +0100, Jilles Tjoelker wrote:
> If your actual code is more like:
> 
> set -- -a foo -a bar # for testing
> while getopts :a: arg; do
> ...
> 
> then the script violates the rule I mentioned, and has unspecified
> results.

This, of course, would be the intended use of such a technique.

Having read the a little more of the standard I may have come up with a
solution which both appears to work and without much ugliness.  I hope
it also doesn't violate either the spirit or word of the standard on this
matter:

    #!/bin/sh --

    while getopts :a: arg; do
        case $arg in
            a) set -- "$@" attr="$OPTARG"; shift "$((OPTIND - 1))"; OPTIND=1
        esac
    done
    shift "$((OPTIND - 1))"

It would be quite sad for me to lose this functionality.

P.S. To my amusement this works with bourne as well but shifting needs
     to be undertaken with `expr "$OPTIND" - 1` instead.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getopts appears to not be shifting $@ when consuming options
  2021-01-29 22:25   ` earnestly
@ 2021-01-30  6:39     ` Vladimir N. Oleynik
  2021-01-30  7:36     ` Vladimir N. Oleynik
  1 sibling, 0 replies; 8+ messages in thread
From: Vladimir N. Oleynik @ 2021-01-30  6:39 UTC (permalink / raw)
  To: dash

Hello,

>> set -- -a foo -a bar # for testing
>> while getopts :a: arg; do
>> ...

You can use typical array cryptic emulation with dash for this :)

#!/bin/dash

main() {
   local i=0 l= oIFS="$IFS" n=

   while getopts :a:u:n: arg; do
     case $arg in
      a) local l$((i=i+1))="$OPTARG"; l="${l}${l:+ }"'attr="$l'$i\";;
      n) n=$OPTARG;;
     esac
   done
   shift "$((OPTIND - 1))"
   IFS=' '; eval set -- $l '"$@"'; IFS=$oIFS

   echo '$@:' "$@"
   echo "n=$n"
}

main -a "foo 1" -a '$((ls))' -u unused -n x -n last_n_val a1 a2

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getopts appears to not be shifting $@ when consuming options
  2021-01-29 22:25   ` earnestly
  2021-01-30  6:39     ` Vladimir N. Oleynik
@ 2021-01-30  7:36     ` Vladimir N. Oleynik
  1 sibling, 0 replies; 8+ messages in thread
From: Vladimir N. Oleynik @ 2021-01-30  7:36 UTC (permalink / raw)
  To: dash

Hello,

>> set -- -a foo -a bar # for testing
>> while getopts :a: arg; do
>> ...

You can use typical array cryptic emulation with dash for this :)

But i can not send this cryptic script to maillist, his parser crashing:
"for recipient address <dash@vger.kernel.org> the policy analysis reported:
zpostgrey: Read timed out !!!" :))

See ftp://simtreas.ru/pub/my/dash/arg2array.sh


--v
vodz

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getopts appears to not be shifting $@ when consuming options
  2021-01-29 20:15 ` Harald van Dijk
@ 2021-01-30 15:31   ` Harald van Dijk
  0 siblings, 0 replies; 8+ messages in thread
From: Harald van Dijk @ 2021-01-30 15:31 UTC (permalink / raw)
  To: earnestly, dash

On 29/01/2021 20:15, Harald van Dijk wrote:
> I would suggest that if this is changed to conform to POSIX, a 
> non-standard method should remain available to allow shell functions to 
> use getopts internally, including when the getopts loop in the function 
> calls other functions that themselves use getopts, to ensure that any 
> existing scripts broken by the change can be easily updated. One way to 
> achieve that could be to special-case "local OPTIND=1" so that when the 
> function returns, it restores not just the value of OPTIND, but uses 
> that moment to additionally restore the extra internal state.

I have found that "local OPTIND=1" already works that way in bash, 
restoring upon function return not only the value of OPTIND but also the 
extra internal extra state. It is designed to work like that, not just a 
happy accident: it is listed as a new feature of 4.4. Because of that, 
if something like this is implemented, this would probably be the best 
syntax to use.

I have also found that this is easy to implement: currently, 
getoptsreset() just sets the internal state (shellparam.optind and 
shellparam.optoff) based on the text value of OPTIND. Instead, if:
- var.h defines a new flag to indicate that special hidden data is
   present past the end of OPTIND's text value,
- getopts() builds its own buffer, including the extra hidden data, and
   calls setvareq() directly with that special flag rather than relying
   on setvarint(),
- getoptsreset() checks it to either set the internal state as before
   or restore the internal state, depending on whether that special flag
   is set,
everything just works with no special casing needed in the 
implementation of the "local" command. The net result is even a small 
reduction in code size.

I am only posting a description of the changes rather than a patch or a 
link to a patch because I implemented this in my shell (a fork of dash), 
not dash itself, and the changes as I implemented them cannot be applied 
to dash without some modifications.

Cheers,
Harald van Dijk

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-01-30 16:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-29 18:25 getopts appears to not be shifting $@ when consuming options earnestly
2021-01-29 20:15 ` Harald van Dijk
2021-01-30 15:31   ` Harald van Dijk
2021-01-29 20:36 ` Jilles Tjoelker
2021-01-29 21:19   ` Harald van Dijk
2021-01-29 22:25   ` earnestly
2021-01-30  6:39     ` Vladimir N. Oleynik
2021-01-30  7:36     ` Vladimir N. Oleynik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).