* 'return' from subshell in function doesn't
From: Dirk Fieldhouse @ 2020-03-08 12:35 UTC
To: DASH mailing list

POSIX
<https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#return>
says, and has since at least 2004:

"The return utility shall cause the shell to stop executing the current
function or dot script. If the shell is not currently executing a
function or dot script, the results are unspecified."

Clear enough, one would think, but consider this example:

  foo() {
    echo "$1" | while read -r xx _; do
      if [ "$xx" = fum ]; then
        echo EQ
        return 0
      fi
    done
    echo NE
    return 1
  }

According to the spec we expect:

  $ foo fum || echo WTF
  EQ
  $

What actually happens, with DASH-0.5.8-2.1ubuntu2 and -0.5.9.1 built
from source:

  $ foo fum || echo WTF
  EQ
  NE
  WTF
  $ foo baz || echo OK
  NE
  OK
  $

Same with bash-4.3-14ubuntu1.4, busybox-static-1:1.22.0-15ubuntu1.4.

A simpler test case shows that the issue is 'return' not breaking out of
a subshell:

  bar() {
    (
      if [ "$1" = fum ]; then
        echo EQ
        return 0
      fi
    )
    echo NE
    return 1
  }

  barbar() {
    if [ "$1" = fum ]; then
      echo EQ
      return 0
    fi
    echo NE
    return 1
  }

  $ bar fum || echo WTF
  EQ
  NE
  WTF
  $ bar baz || echo OK
  NE
  OK
  $ barbar fum || echo WTF
  EQ
  $

As POSIX refers to subshells explicitly elsewhere (eg 'exit') it's
difficult to believe that "subshell" was accidentally omitted from the
list of contexts that 'return' should return from, but implementation
behaviours consistently contradict the spec as written. Can they be made
conformant without breaking existing scripts?

A work-around is to make any subshell explicit and 'exit' from it:

  foo_wa() {
    echo "$1" | (
      while read -r xx _; do
        if [ "$xx" = fum ]; then
          echo EQ
          exit 0
        fi
      done; exit 1 ) && return
    ret=$?
    echo NE
    return $ret
  }

  $ foo_wa fum || echo WTF
  EQ
  $

--
London SW6 UK
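
Another way to sidestep the problem in the foo() example is to avoid the pipeline altogether, so that the loop is not run in a subshell in the first place. The following is only a sketch: foo_nosub is a hypothetical name, and it assumes a POSIX-style shell that runs a redirected 'while' loop in the current shell environment (which dash and bash do, but some historical shells did not).

```shell
# Hypothetical variant of foo(): the loop reads from a here-document
# instead of a pipeline, so 'return' runs in the function's own shell.
foo_nosub() {
    while read -r xx _; do
        if [ "$xx" = fum ]; then
            echo EQ
            return 0    # leaves foo_nosub, not a subshell
        fi
    done <<EOF
$1
EOF
    echo NE
    return 1
}

foo_nosub fum || echo WTF    # prints EQ only
foo_nosub baz || echo OK     # prints NE, then OK
```

The trade-off is that the here-document expands "$1" once up front, which is fine for a single argument but does not replace a pipeline whose left-hand side is a long-running command.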
* Re: 'return' from subshell in function doesn't
From: Harald van Dijk @ 2020-03-08 13:44 UTC
To: Dirk Fieldhouse, DASH mailing list

On 08/03/2020 12:35, Dirk Fieldhouse wrote:
> POSIX
> <https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#return>
>
> says, and has since at least 2004:
> "The return utility shall cause the shell to stop executing the current
> function or dot script. If the shell is not currently executing a
> function or dot script, the results are unspecified."
>[...]
> As POSIX refers to subshells explicitly elsewhere (eg 'exit') it's
> difficult to believe that "subshell" was accidentally omitted from the
> list of contexts that 'return' should return from, but implementation
> behaviours consistently contradict the spec as written. Can they be made
> conformant without breaking existing scripts?

In the subshell, the shell should not be considered to still be
executing a function or dot script. As such, the results should be
unspecified, and any behaviour should be valid. The standard may be
underspecified here, but any other interpretation is not reasonable.

Subshells work by starting a new process. The parent process waits for
the subshell to finish and acts on its exit status. The child process
has very little ways to influence its parent process other than that,
and the parent process might not even still be running by the time the
child process gets to the return statement.

Cheers,
Harald van Dijk
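
The point about the exit status being the main channel from a subshell back to its parent can be illustrated with a small sketch. The function name subshell_demo and the variables x and status are made up for this illustration; 'exit' is used rather than 'return' because its behaviour in a subshell is specified.

```shell
# Hypothetical demo: a subshell can change its own copy of a variable,
# but only its exit status is visible to the enclosing shell.
subshell_demo() {
    x=outer
    status=0
    ( x=inner; exit 7 ) || status=$?   # capture the subshell's exit status
    echo "status=$status x=$x"
}

subshell_demo    # prints: status=7 x=outer
```

The assignment x=inner is lost when the subshell ends, while the value passed to 'exit' survives as the status that the parent can test.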
* Re: 'return' from subshell in function doesn't
From: Dirk Fieldhouse @ 2020-03-08 14:40 UTC
To: DASH mailing list; +Cc: Harald van Dijk

On 08/03/20 13:44, Harald van Dijk wrote:
> On 08/03/2020 12:35, Dirk Fieldhouse wrote:
>> POSIX
>> <https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#return>
>> says, and has since at least 2004:
>> "The return utility shall cause the shell to stop executing the current
>> function or dot script. If the shell is not currently executing a
>> function or dot script, the results are unspecified."
>> [...]
>> As POSIX refers to subshells explicitly elsewhere (eg 'exit') it's
>> difficult to believe that "subshell" was accidentally omitted from the
>> list of contexts that 'return' should return from, but implementation
>> behaviours consistently contradict the spec as written. Can they be made
>> conformant without breaking existing scripts?
>
> In the subshell, the shell should not be considered to still be
> executing a function or dot script. As such, the results should be
> unspecified, and any behaviour should be valid. The standard may be
> underspecified here, but any other interpretation is not reasonable.

Your argument here is essentially saying that the spec left out an
exception concerning subshells. If you read the spec without having
knowledge of existing shell internals, it's entirely reasonable (and IMO
desirable) to consider that a shell function is a lexical group, like a
script file, which is being executed as long as any command within the
function's defining compound command is running. Otherwise the
definition of a shell function would have to be limited to certain types
of compound command, ie excluding command substitution, commands grouped
with parentheses, asynchronous lists, and (under implementation-specific
circumstances) pipelines.

The behaviour that I expected is supported by
<https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_05>:

"A function is a user-defined name that is used as a simple command to
call a compound command with new positional parameters. ... The
compound-command shall be executed whenever the function name is
specified as the name of a simple command (see Command Search and
Execution). ... If the special built-in return ... is executed in the
compound-command, the function completes and execution shall resume with
the next command after the function call."

> Subshells work by starting a new process. The parent process waits for
> the subshell to finish and acts on its exit status. The child process
> has very little ways to influence its parent process other than that,
> and the parent process might not even still be running by the time the
> child process gets to the return statement.

What the conforming implementation has to do shouldn't be of concern to
the shell programmer, especially since a subshell may, but need not, be
created implicitly in a pipeline; in particular any subshell processes
are transparent to the shell programmer ($! "shall expand to the same
value as that of the current shell"). What POSIX says presumably means
that the implementation should wait for any subprocesses, threads or
whatever spawned in the course of executing a function to complete
(subject to &) before continuing to execute the next command. If the
calling script process or some spawned thread of control gets killed
before the return can be executed, that's just an exception, the sort of
thing that traps exist for.

However, as your interpretation seems to have been widely made by shell
implementations, is it necessary to abandon the behaviour currently
specified in favour of a more pragmatic specification?

/df

--
London SW6 UK
* Re: 'return' from subshell in function doesn't
From: Harald van Dijk @ 2020-03-08 15:19 UTC
To: Dirk Fieldhouse, DASH mailing list

On 08/03/2020 14:40, Dirk Fieldhouse wrote:
> On 08/03/20 13:44, Harald van Dijk wrote:
>> Subshells work by starting a new process. The parent process waits for
>> the subshell to finish and acts on its exit status. The child process
>> has very little ways to influence its parent process other than that,
>> and the parent process might not even still be running by the time the
>> child process gets to the return statement.
>
> What the conforming implementation has to do shouldn't be of concern to
> the shell programmer, especially since a subshell may, but need not, be
> created implicitly in a pipeline; in particular any subshell processes
> are transparent to the shell programmer ($! "shall expand to the same
> value as that of the current shell").

I think you meant $$ there, but this is the difference between theory
and practice. In theory, the standard is perfect, and shell internals
are irrelevant, we can just look at what the standard says. In practice,
unfortunately, the standard is not perfect and there are numerous cases
where the standard is either ambiguous or contradicts implementations,
and where this is deemed a defect in the standard rather than in the
implementations. It need not even be because what the standard specifies
is unreasonable, it can just be because what the standard specifies is
unintended.

> What POSIX says presumably means
> that the implementation should wait for any subprocesses, threads or
> whatever spawned in the course of executing a function to complete
> (subject to &) before continuing to execute the next command. If the
> calling script process or some spawned thread of control gets killed
> before the return can be executed, that's just an exception, the sort of
> thing that traps exist for.

Sure, for the parent process, but for the child process it leaves
questions unanswered such as what the expected output would be of:

  f() {
    (kill -9 $$; return; echo hello)
  }
  f
  echo bye

This cannot print 'bye', but should it print 'hello'? The 'return'
statement cannot return from 'f' if the main process is killed, so would
the subshell just continue execution with the command after 'return'?

I would argue that even if you disagree that the behaviour should be
unspecified in your original example, it should still be unspecified in
mine.

> However, as your interpretation seems to have been widely made by shell
> implementations, is it necessary to abandon the behaviour currently
> specified in favour of a more pragmatic specification?

I suspect so. There is a case I forgot about though:

  f() (
    return 0
    echo bug
  )
  f

This should not print 'bug', and does not in any shell I can think to
test. By your interpretation of the standard, this is currently
specified. By mine, it would be unspecified, but I would agree that it
should be fine for the whole function to be defined using a () compound
command, and to contain a return statement directly inside it.

The same problem applies to the 'break' and 'continue' statements too:

  for var in x y z
  do
    echo $var
    (break)
  done

This prints x, y, and z in all shells, the 'break' statement in the
subshell does not cause the loop to terminate. Some shells additionally
print a warning or error message such as "break: not in a loop". Here
again, presumably the intent of the standard is not that the 'break'
statement should cause the loop to terminate. It is not something that
shells do, and it is not something that is reasonable for shells to
implement.

This is looking like a giant can of worms I'm not sure I'm ready to see
opened. :)

Cheers,
Harald van Dijk
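
Since a 'break' inside a subshell cannot end the enclosing loop, a script that needs that effect can pass the request out through the subshell's exit status, in the same spirit as the 'exit' work-around earlier in the thread. This is only a sketch; loop_demo is a hypothetical name and the choice of stopping at 'y' is arbitrary.

```shell
# Hypothetical demo: the subshell reports "stop" via a non-zero exit
# status, and the 'break' itself runs in the enclosing shell.
loop_demo() {
    for var in x y z; do
        echo "$var"
        ( if [ "$var" = y ]; then exit 1; fi ) || break
    done
}

loop_demo    # prints x and y, then stops
```

The same shape works for 'continue', and it relies only on specified behaviour because neither 'break' nor 'continue' is executed inside the subshell.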
* Re: 'return' from subshell in function doesn't
From: Stephane Chazelas @ 2020-03-09 7:44 UTC
To: Harald van Dijk; +Cc: Dirk Fieldhouse, DASH mailing list

2020-03-08 15:19:01 +0000, Harald van Dijk:
[...]
> The same problem applies to the 'break' and 'continue' statements too:
>
>   for var in x y z
>   do
>     echo $var
>     (break)
>   done
>
> This prints x, y, and z in all shells, the 'break' statement in the subshell
> does not cause the loop to terminate. Some shells additionally print a
> warning or error message such as "break: not in a loop". Here again,
> presumably the intent of the standard is not that the 'break' statement
> should cause the loop to terminate. It is not something that shells do, and
> it is not something that is reasonable for shells to implement.
>
> This is looking like a giant can of worms I'm not sure I'm ready to see
> opened. :)
[...]

See https://www.austingroupbugs.net/view.php?id=842 and its
resolution (https://www.austingroupbugs.net/view.php?id=842#c2257)
about that.

--
Stephane
* Re: 'return' from subshell in function doesn't
From: Dirk Fieldhouse @ 2020-03-09 12:43 UTC
To: DASH mailing list; +Cc: Harald van Dijk

On 08/03/20 15:19, Harald van Dijk wrote:
> On 08/03/2020 14:40, Dirk Fieldhouse wrote:
>> On 08/03/20 13:44, Harald van Dijk wrote:
>>> Subshells work by starting a new process. ...
>>
>> What the conforming implementation has to do shouldn't be of concern to
>> the shell programmer, especially since a subshell may, but need not, be
>> created implicitly in a pipeline; in particular any subshell processes
>> are transparent to the shell programmer ($! "shall expand to the same
>> value as that of the current shell").
>
> I think you meant $$ there, but this is the difference between theory
> and practice. In theory, the standard is perfect, and shell internals
> are irrelevant, we can just look at what the standard says. In practice,
> unfortunately, the standard is not perfect and there are numerous cases
> where the standard is either ambiguous or contradicts implementations,
> and where this is deemed a defect in the standard rather than in the
> implementations. It need not even be because what the standard specifies
> is unreasonable, it can just be because what the standard specifies
> is unintended.

Yes obvs $$, thanks.

If a supplier has to warrant conformance to the standard they have a
problem if what the standard says is universally ignored. Something has
to give. I think it's fair to say that historically there has been
convergence from both directions. But this particular issue seems to
have been a blind spot, perhaps because it seems so obvious to the
implementers who also work on the spec (we've all been there).

>> What POSIX says presumably means
>> that the implementation should wait for any subprocesses, threads or
>> whatever spawned in the course of executing a function to complete
>> (subject to &) before continuing to execute the next command. If the
>> calling script process or some spawned thread of control gets killed
>> before the return can be executed, that's just an exception, the sort of
>> thing that traps exist for.
>
> Sure, for the parent process, but for the child process it leaves
> questions unanswered such as what the expected output would be of:
>
>   f() {
>     (kill -9 $$; return; echo hello)
>   }
>   f
>   echo bye

If you kill the shell (you're not supposed to know that kill only kills
some main process) you shouldn't expect any subsequent command to have
run. A better name for this f() is cut_off_the_branch_I_am_sitting_on().

> I would argue that even if you disagree that the behaviour should be
> unspecified in your original example, it should still be unspecified in
> mine.
>
>> However, as your interpretation seems to have been widely made by shell
>> implementations, is it necessary to abandon the behaviour currently
>> specified in favour of a more pragmatic specification?
>
> I suspect so. There is a case I forgot about though:
>
>   f() (
>     return 0
>     echo bug
>   )
>   f
>
> This should not print 'bug', and does not in any shell I can think to
> test. By your interpretation of the standard, this is currently
> specified. By mine, it would be unspecified, but I would agree that it
> should be fine for the whole function to be defined using a () compound
> command, and to contain a return statement directly inside it.

+1. Apparently the consensus has been that 'return' in a subshell means
'exit'. But should someone write a test suite with a test case similar
to my original bar() against POSIX.1-2017 these implementations will all
fail to pass.

> The same problem applies to the 'break' and 'continue' statements too:
>
>   for var in x y z
>   do
>     echo $var
>     (break)
>   done

As Stephane (instigator of the relevant defect report) pointed out, this
has been addressed in the 2017 text, so that your 'break' example is
unspecified behaviour (unenclosed break or continue). Of course similar
wording could have been used to restrict the specified behaviour of
'return' as well -- but wasn't.

> This is looking like a giant can of worms I'm not sure I'm ready to see
> opened. :)

Otherwise it would have been sorted out before and I wouldn't have
raised it!

This <https://www.austingroupbugs.net/view.php?id=1042> POSIX DR touches
on the same issue but doesn't come to grips with 'return'.

This <https://www.austingroupbugs.net/view.php?id=1247> DR identifies at
least one other case where an implicit subshell is used.

I suppose further discussion should be at austin-group-l?

regards

/df

--
London SW6 UK
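
A test case of the kind mentioned above could be written as a small probe that reports which of the two behaviours a given shell exhibits. This is a rough sketch only: probe_return and report_return_behaviour are hypothetical names, and because the thread disagrees on whether the behaviour is specified, the sketch does not assume either outcome.

```shell
# Hypothetical probe: does 'return' inside a subshell leave the function,
# or only the subshell?
probe_return() {
    ( return 0 )
    echo continued    # reached only if 'return' did not leave the function
}

report_return_behaviour() {
    case $(probe_return 2>/dev/null) in
        continued) echo "return left only the subshell" ;;
        *)         echo "return left the function" ;;
    esac
}

report_return_behaviour
```

Run under several shells, the probe gives a quick way to compare implementations against either reading of the standard.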