dash.vger.kernel.org archive mirror
* getopts doesn't properly update OPTIND when called from function
@ 2015-05-28 18:54 Martijn Dekker
  2015-05-28 22:39 ` Harald van Dijk
  0 siblings, 1 reply; 9+ messages in thread
From: Martijn Dekker @ 2015-05-28 18:54 UTC (permalink / raw)
  To: dash

I'm writing a shell function that extends the functionality of the
'getopts' builtin. For that to work, it is necessary to call the
'getopts' builtin from the shell function.

The POSIX standard specifies that OPTIND and OPTARG are global
variables, even though the positional parameters are local to the
function.[*] This makes it possible to call 'getopts' from a function by
simply passing the global positional parameters along by adding "$@".

My problem is that dash does not properly update the global OPTIND
variable when getopts is called from a function, which defeats my
function on dash. It updates the global OPTIND for the first option but
not for subsequent options, so OPTIND gets stuck on 3. (It does
accurately update the global OPTARG variable, though.)

I made a little test program that demonstrates this; see below the
footnote. It succeeds on bash, ksh93, pdksh, mksh, and yash, but not
(d)ash or zsh[*2].

The output of my test script seems consistent with the hypothesis that
OPTIND is reinitialized to 1 whenever a function is called. It should
only be initialized when the shell is initialized.

I suspect this is an old bug as other versions of ash, including Busybox
ash and NetBSD's /bin/sh, share it.

Thanks,

- Martijn

[*] The POSIX standard specifies:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/getopts.html
"The shell variable specified by the name operand, OPTIND, and OPTARG
shall affect the current shell execution environment", which implies
that they are global variables.
    Confusingly, that same page also says: "The shell variables OPTIND
and OPTARG shall be local to the caller of getopts and shall not be
exported by default."
    But I believe that "caller" here means the program that calls
getopts, not the function; POSIX does not support function-local
variables. This interpretation is supported by the added phrase "... and
shall not be exported by default" and by the evidence that the majority
of popular shells pass my test script. Also, it is in fact global in
dash; after all, it does get updated just once...
    (Of course it should be possible to explicitly make OPTIND and
OPTARG local using the non-standard 'local' keyword.)

[*2] In zsh, OPTIND appears to be local to the function as the
positional parameters are, so in my test script OPTIND is stuck at 1. I
submitted a bug report to zsh-workers and a patch was posted in less
than an hour!

#### begin test script ####

#! /bin/sh

expect() {
    if [ "X$2" = "X$3" ]; then
        printf '%s: OK, got "%s"\n' "$1" "$2"
    else
        printf '%s: BUG: expected "%s", got "%s"\n' "$1" "$2" "$3"
        return 1
    fi
}

callgetopts() {
    getopts 'D:ln:vhL' opt "$@"
}

testfn() {
    expect OPTIND 1 "$OPTIND"

    callgetopts "$@"
    expect opt D "$opt"
    expect OPTARG 'test' "$OPTARG"

    callgetopts "$@"
    expect opt h "$opt"
    expect OPTARG '' "$OPTARG"

    callgetopts "$@"
    expect OPTIND 5 "$OPTIND"
    expect opt n "$opt"
    expect OPTARG 1 "$OPTARG"

    callgetopts "$@"
    expect OPTIND 5 "$OPTIND"

    callgetopts "$@"
    expect OPTIND 5 "$OPTIND"
}

testfn -D test -hn 1 test arguments

#### end test script ####

Output on dash 0.5.6 and current dash git version:

OPTIND: OK, got "1"
opt: OK, got "D"
OPTARG: OK, got "test"
opt: BUG: expected "h", got "D"
OPTARG: BUG: expected "", got "test"
OPTIND: BUG: expected "5", got "3"
opt: BUG: expected "n", got "D"
OPTARG: BUG: expected "1", got "test"
OPTIND: BUG: expected "5", got "3"
OPTIND: BUG: expected "5", got "3"

Expected output (on bash, *ksh*, yash):

OPTIND: OK, got "1"
opt: OK, got "D"
OPTARG: OK, got "test"
opt: OK, got "h"
OPTARG: OK, got ""
OPTIND: OK, got "5"
opt: OK, got "n"
OPTARG: OK, got "1"
OPTIND: OK, got "5"
OPTIND: OK, got "5"


* Re: getopts doesn't properly update OPTIND when called from function
  2015-05-28 18:54 getopts doesn't properly update OPTIND when called from function Martijn Dekker
@ 2015-05-28 22:39 ` Harald van Dijk
  2015-05-29  2:58   ` Herbert Xu
  2015-06-04 19:56   ` Martijn Dekker
  0 siblings, 2 replies; 9+ messages in thread
From: Harald van Dijk @ 2015-05-28 22:39 UTC (permalink / raw)
  To: Martijn Dekker; +Cc: dash

[-- Attachment #1: Type: text/plain, Size: 2166 bytes --]

On 28/05/2015 20:54, Martijn Dekker wrote:
> I'm writing a shell function that extends the functionality of the
> 'getopts' builtin. For that to work, it is necessary to call the
> 'getopts' builtin from the shell function.
>
> The POSIX standard specifies that OPTIND and OPTARG are global
> variables, even though the positional parameters are local to the
> function.[*] This makes it possible to call 'getopts' from a function by
> simply passing the global positional parameters along by adding "$@".
>
> My problem is that dash does not properly update the global OPTIND
> variable when getopts is called from a function, which defeats my
> function on dash. It updates the global OPTIND for the first option but
> not for subsequent options, so OPTIND gets stuck on 3. (It does
> accurately update the global OPTARG variable, though.)

That isn't the problem, not exactly anyway. The problem is that getopts 
is required to keep internal state separately from the OPTIND variable 
(a single integer is insufficient to track the progress when multiple 
options are combined in a single word), and that internal state is 
stored along with the positional parameters. The positional parameters 
are saved just before a function call, and restored when the function 
returns. The internal state of getopts should not be saved the same way. 
It should probably just be global to dash.
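To see why OPTIND alone cannot carry the parser's position, consider a word that combines two options. The following is only a minimal sketch using plain POSIX getopts; the function name parse and the variable seen are made up for this illustration and do not come from dash or from the test script above.

```shell
#!/bin/sh
# parse and seen are hypothetical names used only for this illustration.
parse() {
    OPTIND=1                      # begin a fresh parse of "$@"
    seen=
    while getopts 'abc' opt "$@"; do
        seen="$seen$opt"          # record each option letter as reported
    done
    printf '%s %s\n' "$seen" "$OPTIND"
}

# "-ab" holds two options in one word: getopts must report a, then b,
# before it moves on to "-c", so it has to remember its offset inside "-ab".
result=$(parse -ab -c file)
printf '%s\n' "$result"
```

This should print "abc 3": three options reported in order, with OPTIND pointing at the first operand once parsing has finished. The value of OPTIND between the calls that return a and b is deliberately not shown, since shells may differ there; whatever it is, an argument index by itself cannot say which letter of "-ab" comes next.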

A quick patch to make sure it is global, and isn't reset when it 
shouldn't or doesn't need to be, is attached. You can experiment with 
it, if you like. Your script runs as expected with this patch. However, 
I should warn that I have done far too little investigation to actually 
submit this patch for inclusion at this point. I'll do more extensive 
checking, and testing, later. If someone else can check whether the 
patch is okay, and if not, in what cases it fails, that would be very 
welcome too.

Note that any change could break existing scripts that (incorrectly)
rely on dash's current behaviour of implicitly resetting the state and
don't bother setting OPTIND to explicitly reset it when they want to
parse a new set of arguments.
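As a rough sketch of the explicit reset mentioned above: a function that assigns OPTIND=1 before its loop should parse each new argument list from the beginning, whether the shell keeps the getopts state globally or per function. The names count_flags and count are hypothetical and exist only for this example.

```shell
#!/bin/sh
# count_flags and count are hypothetical names for this illustration.
count_flags() {
    OPTIND=1                      # explicit reset before parsing a new argument list
    count=0
    while getopts 'xy' opt "$@"; do
        count=$((count + 1))      # one more option seen
    done
}

# Two successive parses in the same shell execution environment.
count_flags -x -y;  first=$count
count_flags -xy -x; second=$count
printf '%s %s\n' "$first" "$second"
```

This should print "2 3". In a shell whose getopts state is global, leaving out the OPTIND=1 line would presumably let the second call continue from wherever the first one stopped, which is exactly the dependency on implicit resetting described above.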

Cheers,
Harald van Dijk

[-- Attachment #2: getopts.patch --]
[-- Type: text/plain, Size: 2967 bytes --]

diff --git a/src/eval.c b/src/eval.c
index 071fb1b..59e7506 100644
--- a/src/eval.c
+++ b/src/eval.c
@@ -953,8 +953,6 @@ evalfun(struct funcnode *func, int argc, char **argv, int flags)
 	INTON;
 	shellparam.nparam = argc - 1;
 	shellparam.p = argv + 1;
-	shellparam.optind = 1;
-	shellparam.optoff = -1;
 	pushlocalvars();
 	evaltree(func->n.ndefun.body, flags & EV_TESTED);
 	poplocalvars(0);
diff --git a/src/options.c b/src/options.c
index 6f381e6..5b24eeb 100644
--- a/src/options.c
+++ b/src/options.c
@@ -163,8 +163,8 @@ setarg0:
 	}
 
 	shellparam.p = xargv;
-	shellparam.optind = 1;
-	shellparam.optoff = -1;
+	shoptind = 1;
+	shoptoff = -1;
 	/* assert(shellparam.malloc == 0 && shellparam.nparam == 0); */
 	while (*xargv) {
 		shellparam.nparam++;
@@ -316,8 +316,6 @@ setparam(char **argv)
 	shellparam.malloc = 1;
 	shellparam.nparam = nparam;
 	shellparam.p = newparam;
-	shellparam.optind = 1;
-	shellparam.optoff = -1;
 }
 
 
@@ -362,8 +360,6 @@ shiftcmd(int argc, char **argv)
 	}
 	ap2 = shellparam.p;
 	while ((*ap2++ = *ap1++) != NULL);
-	shellparam.optind = 1;
-	shellparam.optoff = -1;
 	INTON;
 	return 0;
 }
@@ -394,8 +390,8 @@ void
 getoptsreset(value)
 	const char *value;
 {
-	shellparam.optind = number(value) ?: 1;
-	shellparam.optoff = -1;
+	shoptind = number(value) ?: 1;
+	shoptoff = -1;
 }
 
 /*
@@ -412,20 +408,10 @@ getoptscmd(int argc, char **argv)
 
 	if (argc < 3)
 		sh_error("Usage: getopts optstring var [arg]");
-	else if (argc == 3) {
+	else if (argc == 3)
 		optbase = shellparam.p;
-		if ((unsigned)shellparam.optind > shellparam.nparam + 1) {
-			shellparam.optind = 1;
-			shellparam.optoff = -1;
-		}
-	}
-	else {
+	else
 		optbase = &argv[3];
-		if ((unsigned)shellparam.optind > argc - 2) {
-			shellparam.optind = 1;
-			shellparam.optoff = -1;
-		}
-	}
 
 	return getopts(argv[1], argv[2], optbase);
 }
@@ -438,10 +424,10 @@ getopts(char *optstr, char *optvar, char **optfirst)
 	int done = 0;
 	char s[2];
 	char **optnext;
-	int ind = shellparam.optind;
-	int off = shellparam.optoff;
+	int ind = shoptind;
+	int off = shoptoff;
 
-	shellparam.optind = -1;
+	shoptind = -1;
 	optnext = optfirst + ind - 1;
 
 	if (ind <= 1 || off < 0 || strlen(optnext[-1]) < off)
@@ -509,8 +495,8 @@ out:
 	s[1] = '\0';
 	setvar(optvar, s, 0);
 
-	shellparam.optoff = p ? p - *(optnext - 1) : -1;
-	shellparam.optind = ind;
+	shoptoff = p ? p - *(optnext - 1) : -1;
+	shoptind = ind;
 
 	return done;
 }
diff --git a/src/options.h b/src/options.h
index 975fe33..8295eb9 100644
--- a/src/options.h
+++ b/src/options.h
@@ -38,9 +38,9 @@ struct shparam {
 	int nparam;		/* # of positional parameters (without $0) */
 	unsigned char malloc;	/* if parameter list dynamically allocated */
 	char **p;		/* parameter list */
-	int optind;		/* next parameter to be processed by getopts */
-	int optoff;		/* used by getopts */
 };
+int shoptind;		/* next parameter to be processed by getopts */
+int shoptoff;		/* used by getopts */
 
 
 


* Re: getopts doesn't properly update OPTIND when called from function
  2015-05-28 22:39 ` Harald van Dijk
@ 2015-05-29  2:58   ` Herbert Xu
  2015-05-29  5:50     ` Harald van Dijk
  2015-06-04 19:56   ` Martijn Dekker
  1 sibling, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2015-05-29  2:58 UTC (permalink / raw)
  To: Harald van Dijk; +Cc: martijn, dash

Harald van Dijk <harald@gigawatt.nl> wrote:
> That isn't the problem, not exactly anyway. The problem is that getopts 
> is required to keep internal state separately from the OPTIND variable 
> (a single integer is insufficient to track the progress when multiple 
> options are combined in a single word), and that internal state is 
> stored along with the positional parameters. The positional parameters 
> are saved just before a function call, and restored when the function 
> returns. The internal state of getopts should not be saved the same way. 
> It should probably just be global to dash.

I think the current behaviour is fine as far as POSIX is concerned.
It says:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/getopts.html

: APPLICATION USAGE

...

: Note that shell functions share OPTIND with the calling shell
: even though the positional parameters are changed. If the calling
: shell and any of its functions uses getopts to parse arguments,
: the results are unspecified.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: getopts doesn't properly update OPTIND when called from function
  2015-05-29  2:58   ` Herbert Xu
@ 2015-05-29  5:50     ` Harald van Dijk
  2015-06-01  6:29       ` Herbert Xu
  0 siblings, 1 reply; 9+ messages in thread
From: Harald van Dijk @ 2015-05-29  5:50 UTC (permalink / raw)
  To: Herbert Xu; +Cc: martijn, dash

On 29/05/2015 04:58, Herbert Xu wrote:
> Harald van Dijk <harald@gigawatt.nl> wrote:
>> That isn't the problem, not exactly anyway. The problem is that getopts
>> is required to keep internal state separately from the OPTIND variable
>> (a single integer is insufficient to track the progress when multiple
>> options are combined in a single word), and that internal state is
>> stored along with the positional parameters. The positional parameters
>> are saved just before a function call, and restored when the function
>> returns. The internal state of getopts should not be saved the same way.
>> It should probably just be global to dash.
>
> I think the current behaviour is fine as far as POSIX is concerned.
> It says:
>
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/getopts.html
>
> : APPLICATION USAGE
>
> ...
>
> : Note that shell functions share OPTIND with the calling shell
> : even though the positional parameters are changed. If the calling
> : shell and any of its functions uses getopts to parse arguments,
> : the results are unspecified.

The Application usage sections are informative and aren't worded as 
precisely as the other sections. If a script uses getopts at the global 
level, and it calls a shell function that too uses getopts, then it is 
very easy to be covered by

> Any other attempt to invoke getopts multiple times in a single shell
> execution environment with parameters (positional parameters or arg
> operands) that are not the same in all invocations, or with an OPTIND
> value modified to be a value other than 1, produces unspecified results.

But the test script in this thread does invoke getopts with parameters 
that are the same in all invocations, and without modifying OPTIND. I 
don't see anything else in the normative sections that would make the 
result undefined or unspecified either. I do think the script is valid, 
and the results in dash should match those of other shells.

Cheers,
Harald van Dijk


* Re: getopts doesn't properly update OPTIND when called from function
  2015-05-29  5:50     ` Harald van Dijk
@ 2015-06-01  6:29       ` Herbert Xu
  2015-06-01 17:30         ` Harald van Dijk
  0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2015-06-01  6:29 UTC (permalink / raw)
  To: Harald van Dijk; +Cc: martijn, dash

On Fri, May 29, 2015 at 07:50:09AM +0200, Harald van Dijk wrote:
> 
> But the test script in this thread does invoke getopts with
> parameters that are the same in all invocations, and without
> modifying OPTIND. I don't see anything else in the normative
> sections that would make the result undefined or unspecified either.
> I do think the script is valid, and the results in dash should match
> those of other shells.

The bash behaviour makes it impossible to call shell functions
that invoke getopts while in the middle of a getopts loop.

IMHO the dash behaviour makes a lot more sense since a function
always brings with it a new set of parameters.  That plus the
fact that this behaviour has been there since day one makes me
reluctant to change it since the POSIX wording is not all that
clear.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: getopts doesn't properly update OPTIND when called from function
  2015-06-01  6:29       ` Herbert Xu
@ 2015-06-01 17:30         ` Harald van Dijk
  2015-06-01 22:10           ` Jilles Tjoelker
  0 siblings, 1 reply; 9+ messages in thread
From: Harald van Dijk @ 2015-06-01 17:30 UTC (permalink / raw)
  To: Herbert Xu; +Cc: martijn, dash

On 01/06/2015 08:29, Herbert Xu wrote:
> On Fri, May 29, 2015 at 07:50:09AM +0200, Harald van Dijk wrote:
>>
>> But the test script in this thread does invoke getopts with
>> parameters that are the same in all invocations, and without
>> modifying OPTIND. I don't see anything else in the normative
>> sections that would make the result undefined or unspecified either.
>> I do think the script is valid, and the results in dash should match
>> those of other shells.
>
> The bash behaviour makes it impossible to call shell functions
> that invoke getopts while in the middle of a getopts loop.
>
> IMHO the dash behaviour makes a lot more sense since a function
> always brings with it a new set of parameters.  That plus the
> fact that this behaviour has been there since day one makes me
> reluctant to change it since the POSIX wording is not all that
> clear.

True. Given that almost no shell supports that anyway, there can't be
too many scripts that rely on it, but I did warn about the risk of
breaking another type of existing script as well, and I agree that's a
real concern.

One thing that doesn't really make sense, though: if the getopts 
internal state is local to a function, then OPTIND and OPTARG really 
should be too. Because they aren't, nested getopts loops already don't 
really work in a useful way in dash, because the inner loop overwrites 
the OPTIND and OPTARG variables. While OPTARG will typically be checked 
right at the start of the loop, before any inner loop executes, OPTIND 
is typically used at the end of the loop, in combination with the shift 
command. The current behaviour makes the OPTIND value in that case 
unreliable.

So either way, I think something should change. But if you prefer to get 
clarification first about the intended meaning of the POSIX wording, 
that certainly seems reasonable to me.

> Cheers,


* Re: getopts doesn't properly update OPTIND when called from function
  2015-06-01 17:30         ` Harald van Dijk
@ 2015-06-01 22:10           ` Jilles Tjoelker
  2015-06-02  0:21             ` Harald van Dijk
  0 siblings, 1 reply; 9+ messages in thread
From: Jilles Tjoelker @ 2015-06-01 22:10 UTC (permalink / raw)
  To: Harald van Dijk; +Cc: Herbert Xu, martijn, dash

On Mon, Jun 01, 2015 at 07:30:46PM +0200, Harald van Dijk wrote:
> On 01/06/2015 08:29, Herbert Xu wrote:
> > On Fri, May 29, 2015 at 07:50:09AM +0200, Harald van Dijk wrote:

> >> But the test script in this thread does invoke getopts with
> >> parameters that are the same in all invocations, and without
> >> modifying OPTIND. I don't see anything else in the normative
> >> sections that would make the result undefined or unspecified either.
> >> I do think the script is valid, and the results in dash should match
> >> those of other shells.

> > The bash behaviour makes it impossible to call shell functions
> > that invoke getopts while in the middle of a getopts loop.

> > IMHO the dash behaviour makes a lot more sense since a function
> > always brings with it a new set of parameters.  That plus the
> > fact that this behaviour has been there since day one makes me
> > reluctant to change it since the POSIX wording is not all that
> > clear.

> True. Given that almost no shell supports that anyway, there can't be 
> too many scripts that rely on it, but I did warn about the risk of 
> breaking another type of existing scripts as well, I agree that's a real 
> concern.

FreeBSD sh inherits similar code from ash and likewise has per-function
getopts state. Various shell scripts in the FreeBSD base system use
getopts in functions without setting OPTIND=1.

> One thing that doesn't really make sense, though: if the getopts 
> internal state is local to a function, then OPTIND and OPTARG really 
> should be too. Because they aren't, nested getopts loops already don't 
> really work in a useful way in dash, because the inner loop overwrites 
> the OPTIND and OPTARG variables. While OPTARG will typically be checked 
> right at the start of the loop, before any inner loop executes, OPTIND 
> is typically used at the end of the loop, in combination with the shift 
> command. The current behaviour makes the OPTIND value in that case 
> unreliable.

First, note that the OPTARG and OPTIND shell variables are not an input
to getopts, except for an assignment OPTIND=1 (restoring an OPTIND local
at function return does not reset getopts), and that getopts writes
OPTIND no matter whether getopts's internal optind changed in this
invocation.

With that, the value of OPTIND generally used in scripts is not
unreliable. OPTIND is generally only checked after getopts returned
false (end of options), in the sequence
  while getopts ...; do
    ...
  done
  shift "$((OPTIND - 1))"
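
Spelled out as a complete function, the sequence above might look like the following. This is only a sketch: the function show_operands, the option letters v and n, and the variables verbose and name are invented for the example.

```shell
#!/bin/sh
# show_operands, verbose and name are hypothetical names for this illustration.
show_operands() {
    OPTIND=1                      # start a fresh parse of this function's arguments
    verbose=0
    name=
    while getopts 'vn:' opt "$@"; do
        case $opt in
            v) verbose=1 ;;
            n) name=$OPTARG ;;    # OPTARG is read right away, inside the loop
            *) return 2 ;;        # unknown option or missing argument
        esac
    done
    # OPTIND is consulted only here, after getopts has returned false.
    shift "$((OPTIND - 1))"
    printf 'verbose=%s name=%s operands=%s\n' "$verbose" "$name" "$*"
}

out=$(show_operands -v -n demo one two)
printf '%s\n' "$out"
```

This should print "verbose=1 name=demo operands=one two". OPTARG is used inside the loop body and OPTIND only after the final getopts call, which is the pattern the argument here relies on.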

> So either way, I think something should change. But if you prefer to get 
> clarification first about the intended meaning of the POSIX wording, 
> that certainly seems reasonable to me.

I think the POSIX wording is clear enough, but it may not be desirable
to change getopts to work that way.

-- 
Jilles Tjoelker


* Re: getopts doesn't properly update OPTIND when called from function
  2015-06-01 22:10           ` Jilles Tjoelker
@ 2015-06-02  0:21             ` Harald van Dijk
  0 siblings, 0 replies; 9+ messages in thread
From: Harald van Dijk @ 2015-06-02  0:21 UTC (permalink / raw)
  To: Jilles Tjoelker; +Cc: Herbert Xu, martijn, dash

On 02/06/2015 00:10, Jilles Tjoelker wrote:
> On Mon, Jun 01, 2015 at 07:30:46PM +0200, Harald van Dijk wrote:
>> On 01/06/2015 08:29, Herbert Xu wrote:
>>> On Fri, May 29, 2015 at 07:50:09AM +0200, Harald van Dijk wrote:
>
>>>> But the test script in this thread does invoke getopts with
>>>> parameters that are the same in all invocations, and without
>>>> modifying OPTIND. I don't see anything else in the normative
>>>> sections that would make the result undefined or unspecified either.
>>>> I do think the script is valid, and the results in dash should match
>>>> those of other shells.
>
>>> The bash behaviour makes it impossible to call shell functions
>>> that invoke getopts while in the middle of a getopts loop.
>
>>> IMHO the dash behaviour makes a lot more sense since a function
>>> always brings with it a new set of parameters.  That plus the
>>> fact that this behaviour has been there since day one makes me
>>> reluctant to change it since the POSIX wording is not all that
>>> clear.
>
>> True. Given that almost no shell supports that anyway, there can't be
>> too many scripts that rely on it, but I did warn about the risk of
>> breaking another type of existing scripts as well, I agree that's a real
>> concern.
>
> FreeBSD sh inherits similar code from ash and likewise has per-function
> getopts state. Various shell scripts in the FreeBSD base system use
> getopts in functions without setting OPTIND=1.

Yikes. That's an unfortunate effect of writing scripts that only get run 
on a single shell: things like that don't even show up as a possible 
problem. It's similar to how many bashisms sneak into supposedly 
portable shell scripts.

>> One thing that doesn't really make sense, though: if the getopts
>> internal state is local to a function, then OPTIND and OPTARG really
>> should be too. Because they aren't, nested getopts loops already don't
>> really work in a useful way in dash, because the inner loop overwrites
>> the OPTIND and OPTARG variables. While OPTARG will typically be checked
>> right at the start of the loop, before any inner loop executes, OPTIND
>> is typically used at the end of the loop, in combination with the shift
>> command. The current behaviour makes the OPTIND value in that case
>> unreliable.
>
> First, note that the OPTARG and OPTIND shell variables are not an input
> to getopts, except for an assignment OPTIND=1 (restoring an OPTIND local
> at function return does not reset getopts), and that getopts writes
> OPTIND no matter whether getopts's internal optind changed in this
> invocation.
>
> With that, the value of OPTIND generally used in scripts is not
> unreliable. OPTIND is generally only checked after getopts returned
> false (end of options), in the sequence
>    while getopts ...; do
>      ...
>    done
>    shift "$((OPTIND - 1))"

Ah, you're right, I missed that there will usually be another execution 
of getopts before OPTIND is used. Thanks for clearing that up. In that 
case, I agree, the situations in which the values of OPTIND and OPTARG 
are unreliable are only situations in which scripts usually don't bother 
checking their values.

>> So either way, I think something should change. But if you prefer to get
>> clarification first about the intended meaning of the POSIX wording,
>> that certainly seems reasonable to me.
>
> I think the POSIX wording is clear enough, but it may not be desirable
> to change getopts to work that way.

It was Herbert Xu who felt the POSIX wording was unclear, and he is the 
dash maintainer, so his opinion on whether the wording is clear is the 
one that matters.

If it is clear or clarified what POSIX requires, and that POSIX allows 
the current implementation, then I see no need either to change the dash 
behaviour. It could still be useful to make OPTIND and OPTARG local, but 
you've convinced at least me that it's only a minor problem.

If it is clear or clarified what POSIX requires, and that POSIX 
disallows the current implementation, and the current implementation is 
deemed too desirable to drop, then it might make sense to support both 
alternatives, with an option at configure time to switch between them. 
As far as I know, dash does still aim to conform to POSIX, so even if a 
conscious decision is made to deviate from POSIX by default, I think an 
option to conform to it would be nice for those who care about it. I 
would be happy to create a patch, if this approach would be more agreeable.

Cheers,
Harald van Dijk


* Re: getopts doesn't properly update OPTIND when called from function
  2015-05-28 22:39 ` Harald van Dijk
  2015-05-29  2:58   ` Herbert Xu
@ 2015-06-04 19:56   ` Martijn Dekker
  1 sibling, 0 replies; 9+ messages in thread
From: Martijn Dekker @ 2015-06-04 19:56 UTC (permalink / raw)
  To: dash

Harald van Dijk wrote on 29-05-15 at 00:39:
> A quick patch to make sure it is global, and isn't reset when it
> shouldn't or doesn't need to be, is attached. You can experiment with
> it, if you like.

I've been using dash with this patch since you posted it, and it works
like a charm (including my function that extends getopts'
functionality). No issues encountered. Thanks.

Further discussion in this thread shows that the patch may conflict with
existing usage of 'getopts' for parsing the options within a function (a
usage that would make the script quite shell-specific, by the way,
because it would rely on Almquist-specific behaviour).

The issue, as I understand it, is that 'getopts' keeps not just the
OPTIND variable but also an additional invisible internal variable to
maintain its state. This is necessary to keep track of combined short
options.[*]

There appear to be two possible use cases for calling 'getopts' within a
function:

1. The option parsing loop is in the function, parsing the function's
options. This requires a function-local internal state of 'getopts',
otherwise calling a function using getopts from a main getopts loop
couldn't possibly work, because there is no way to directly save or
restore the unnamed internal state variable of getopts.

2. The option parsing loop is in the main shell environment, but instead
of calling getopts directly, the option parsing loop calls a function,
passing on the main positional parameters, and that function then calls
'getopts' and does additional things (in my case, re-parse GNU-style
--long options in terms of a special short option '--' with argument;
but of course it could be anything). This requires a global internal
'getopts' state.

Use case 1 requires a local internal 'getopts' state and use case 2
requires a global one, so they are mutually incompatible.
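
Which of the two models a given shell implements can be probed with a couple of getopts calls made through a function. The snippet below is a small sketch, not part of the test script from the start of this thread; probe, first, second and state are hypothetical names.

```shell
#!/bin/sh
# probe is a hypothetical helper: it makes a single getopts call on its arguments.
probe() {
    getopts 'ab' opt "$@"
}

OPTIND=1
probe -a -b; first=$opt
probe -a -b; second=$opt

case $first$second in
    ab) state='global' ;;         # second call continued where the first stopped
    aa) state='per-function' ;;   # second call started over at the first option
    *)  state='unknown' ;;
esac
printf 'getopts state appears to be %s\n' "$state"
```

Going by the results reported earlier in this thread, shells such as bash would be expected to report "global" and unpatched dash "per-function", but the output depends on the shell and version used to run it.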

But I'm thinking that perhaps there is a way for the shell to
distinguish between these two use cases so that they can be reconciled.

The standard says that OPTIND is a global variable in any case, so use
case 1 above could only work if, before starting the function's option
parsing loop, OPTIND is either explicitly declared a function-local
variable using the non-standard 'local' keyword or is reinitialized
using an assignment.

On the other hand, use case 2 could only work if OPTIND is completely
left alone by the function, allowing a 'getopts' with a global state to
do its thing without interference.

So I would suggest the following might reconcile both use cases: By
default, make the 'getopts' internal state global. However, whenever
OPTIND is either assigned a value within a function or declared local
within a function, automatically make the 'getopts' internal state local
to the function.

Comments?

- M.

[*] Just as a datapoint, I found that yash has a different strategy for
this: it stores both values in OPTIND, separated by a colon -- e.g.
an $OPTIND of 3:2 means getopts is at the second option in the third
argument.
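
For illustration, a value in that format can be taken apart with ordinary parameter expansion. This sketch only splits the sample string from the footnote; it makes no claim about how yash itself reads or writes OPTIND, and the variable names are hypothetical.

```shell
#!/bin/sh
# optind_value, arg_index and opt_index are hypothetical names for this example.
optind_value='3:2'                 # sample value in the format described above
arg_index=${optind_value%%:*}      # text before the colon: the argument number
opt_index=${optind_value#*:}       # text after the colon: the option within it
printf 'argument %s, option %s\n' "$arg_index" "$opt_index"
```

This should print "argument 3, option 2".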

