dash.vger.kernel.org archive mirror
* possible bug in job control
@ 2013-07-29 21:44 Luigi Tarenga
  2013-07-29 22:47 ` Harald van Dijk
  0 siblings, 1 reply; 6+ messages in thread
From: Luigi Tarenga @ 2013-07-29 21:44 UTC (permalink / raw)
  To: dash

hi list,
while writing a script to run ssh commands in parallel on many hosts I found
a strange behavior in dash. I can reproduce it with a very simple script but
couldn't find anything in the dash or POSIX documentation that explains it.

tested on centos 6.4 (dash 0.5.5.1) and with dash compiled from source (0.5.7).
the following script reports an error:

#!/bin/dash

sleep 3 &
sleep 3 &
sleep 3 &
sleep 3 &

#/bin/true
jobs -l

wait %1
wait %2
wait %3
wait %4

[vortex@lizard ~]$ ./dash-0.5.7/src/dash test.sh
[4] + 4569 Running
[3] - 4568 Running
[2]   4567 Running
[1]   4566 Running
prova: 14: wait: No such job: %4
[vortex@lizard ~]$ echo $?
2

if you uncomment /bin/true it works, and if you add one more job it works again:

#!/bin/dash

sleep 3 &
sleep 3 &
sleep 3 &
sleep 3 &
sleep 3 &

#/bin/true
jobs -l

wait %1
wait %2
wait %3
wait %4
wait %5

[vortex@lizard ~]$ ./dash-0.5.7/src/dash prova
[5] + 4590 Running
[4] - 4589 Running
[3]   4588 Running
[2]   4587 Running
[1]   4586 Running


the script fails with 4 and 8 jobs, and works with 1 2 3 5 6 7 9.
jobs -l has no effect other than printing the job list (this is ok).
executing one external binary (like /bin/true) between the sleeps and the
waits makes it work. putting an internal command in the background, like : &
(so it spawns another subprocess), makes it work. using %% in the last
wait makes it work.
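
A possible workaround sketch for scripts that have to run on an unpatched dash: record each PID from `$!` and wait on the PIDs instead of `%N` job numbers. This assumes the problem is limited to `%N` lookup, which fits the observations above but is not confirmed here. `run_parallel` is a hypothetical helper name, and `sleep 1` stands in for the real command.

```sh
#!/bin/sh
# Workaround sketch: wait on PIDs captured from $! rather than %N job numbers.
# run_parallel is a hypothetical helper, not part of dash.
run_parallel() {
    count=$1
    pids=""
    i=0
    # Start the requested number of background jobs and remember their PIDs.
    while [ "$i" -lt "$count" ]; do
        sleep 1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    # Wait for each PID in turn and count non-zero exit statuses.
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "jobs=$count failed=$failed"
}

run_parallel 4
```

With four jobs this prints `jobs=4 failed=0`. Waiting by PID also keeps each job's exit status available, which is usually what a parallel ssh script needs.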

what do you think? is this a bug?

thanks in advance
Luigi


* Re: possible bug in job control
  2013-07-29 21:44 possible bug in job control Luigi Tarenga
@ 2013-07-29 22:47 ` Harald van Dijk
       [not found]   ` <CAKkO-EiY79gEH+SbvK6kF=1v0h7Q5=ypHGcs+m-yHdMFq-L-7A@mail.gmail.com>
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Harald van Dijk @ 2013-07-29 22:47 UTC (permalink / raw)
  To: Luigi Tarenga; +Cc: dash

[-- Attachment #1: Type: text/plain, Size: 1305 bytes --]

On 29/07/13 23:44, Luigi Tarenga wrote:
> hi list,
> while writing a script to execute parallel ssh command on many host I found
> a strange behavior of dash. I can replicate it with a very simple script but
> didn't find any documentation about dash or POSIX that can explain it.
> 
> tested on centos 6.4 (dash 0.5.5.1) and with dash compiled from source (0.5.7)
> the following script reports error:
> 
> #!/bin/dash
> 
> sleep 3 &
> sleep 3 &
> sleep 3 &
> sleep 3 &
> 
> #/bin/true
> jobs -l
> 
> wait %1
> wait %2
> wait %3
> wait %4
> 
> [vortex@lizard ~]$ ./dash-0.5.7/src/dash test.sh
> [4] + 4569 Running
> [3] - 4568 Running
> [2]   4567 Running
> [1]   4566 Running
> prova: 14: wait: No such job: %4
> [vortex@lizard ~]$ echo $?
> 2

Yes, this looks like a bug to me. The number of allocated jobs is always
kept as a multiple of four, and the first check in deciding whether a
job number is valid is "if it's greater than or equal to the number of
allocated jobs, it's invalid". That doesn't look right. That would
only be right if job numbers were zero-based, but they start at one. If
it's exactly equal to the number of allocated jobs, it can still be
valid. It works when adding /bin/true, because four more jobs end up
allocated internally.
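
The arithmetic can be checked without dash itself. The following is only a model of the check as described above, not dash source code: it assumes the job table grows in steps of four, and `allocated_slots`, `old_check`, `new_check` and `report` are hypothetical names.

```sh
#!/bin/sh
# Model of the bounds check described above; not taken from dash.

# Round a job count up to the next multiple of four (assumed growth step).
allocated_slots() {
    echo $(( ($1 + 3) / 4 * 4 ))
}

# Pre-patch check: job number must be strictly less than the table size.
old_check() {
    [ "$1" -lt "$2" ]
}

# Patched check: job number may equal the table size, since jobs are 1-based.
new_check() {
    [ "$1" -le "$2" ]
}

# For 1..9 jobs, test whether the highest job number passes each check.
report() {
    for n in 1 2 3 4 5 6 7 8 9; do
        slots=$(allocated_slots "$n")
        if old_check "$n" "$slots"; then old=ok; else old=fail; fi
        if new_check "$n" "$slots"; then new=ok; else new=fail; fi
        echo "jobs=$n slots=$slots old=$old new=$new"
    done
}

report
```

Under these assumptions only the lines for 4 and 8 jobs show `old=fail`, which matches the job counts reported as failing, and every line shows `new=ok`.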

The attached patch should fix it.

Cheers,
Harald

[-- Attachment #2: dash-getjob-off-by-one.patch --]
[-- Type: text/x-patch, Size: 477 bytes --]

commit ddeba5485c3309ffc7010f8924d604a781908e1d
Author: Harald van Dijk <harald@gigawatt.nl>
Date:   Tue Jul 30 00:36:53 2013 +0200

    getjob: Fix off-by-one error for multiple of four job numbers

diff --git a/src/jobs.c b/src/jobs.c
index bf40204..c2c2332 100644
--- a/src/jobs.c
+++ b/src/jobs.c
@@ -699,7 +699,7 @@ check:
 
 	if (is_number(p)) {
 		num = atoi(p);
-		if (num < njobs) {
+		if (num <= njobs) {
 			jp = jobtab + num - 1;
 			if (jp->used)
 				goto gotit;


* Re: possible bug in job control
       [not found]   ` <CAKkO-EiY79gEH+SbvK6kF=1v0h7Q5=ypHGcs+m-yHdMFq-L-7A@mail.gmail.com>
@ 2013-07-30 16:17     ` Harald van Dijk
  2013-07-31  7:34       ` Herbert Xu
  0 siblings, 1 reply; 6+ messages in thread
From: Harald van Dijk @ 2013-07-30 16:17 UTC (permalink / raw)
  To: Luigi Tarenga, dash

On 30/07/13 14:26, Luigi Tarenga wrote:
> hi Harald,
> I quickly tested your patch.
> It fixes my script. I hope this will be committed to git.
> Should I write some feedback on the ML?

Hi Luigi,

Standard practise on this list seems to be to send all replies both to
the people involved and to the list. I'm sure that confirmation that a
patch works will be appreciated. :)

It may take some time for a bugfix to be committed: the last commit in
git is from over a year ago, and since then, there have been some other
bug reports with patches sent to this list that have not yet been
applied, reviewed, or had any other feedback. But the past suggests that
that is just the way it works for dash: the mails do get read
eventually, they do not simply get discarded.

> thank you very much for your support :)

You're quite welcome!

Cheers,
Harald

> Luigi


* Re: possible bug in job control
  2013-07-29 22:47 ` Harald van Dijk
       [not found]   ` <CAKkO-EiY79gEH+SbvK6kF=1v0h7Q5=ypHGcs+m-yHdMFq-L-7A@mail.gmail.com>
@ 2013-07-30 16:42   ` Luigi Tarenga
  2014-09-26  9:28   ` Herbert Xu
  2 siblings, 0 replies; 6+ messages in thread
From: Luigi Tarenga @ 2013-07-30 16:42 UTC (permalink / raw)
  To: Harald van Dijk; +Cc: dash

Hi Harald,
thanks for the patch. I tested it against the source tarball (0.5.7) and
it works for me.
It works with the example scripts and with my parallel ssh script
(tested with job counts that are multiples of 4).

many thanks for the support
Luigi






* Re: possible bug in job control
  2013-07-30 16:17     ` Harald van Dijk
@ 2013-07-31  7:34       ` Herbert Xu
  0 siblings, 0 replies; 6+ messages in thread
From: Herbert Xu @ 2013-07-31  7:34 UTC (permalink / raw)
  To: Harald van Dijk; +Cc: luigi.tarenga, dash

Harald van Dijk <harald@gigawatt.nl> wrote:
> 
> It may take some time for a bugfix to be committed: the last commit in
> git is from over a year ago, and since then, there have been some other
> bug reports with patches sent to this list that have not yet been
> applied, reviewed, or had any other feedback. But the past suggests that
> that is just the way it works for dash, the mails do get read
> eventually, they do not simply get discarded.

Sorry about the lack of commits.  Rest assured your patches
are in my queue :)

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: possible bug in job control
  2013-07-29 22:47 ` Harald van Dijk
       [not found]   ` <CAKkO-EiY79gEH+SbvK6kF=1v0h7Q5=ypHGcs+m-yHdMFq-L-7A@mail.gmail.com>
  2013-07-30 16:42   ` Luigi Tarenga
@ 2014-09-26  9:28   ` Herbert Xu
  2 siblings, 0 replies; 6+ messages in thread
From: Herbert Xu @ 2014-09-26  9:28 UTC (permalink / raw)
  To: Harald van Dijk; +Cc: Luigi Tarenga, dash

On Mon, Jul 29, 2013 at 10:47:40PM +0000, Harald van Dijk wrote:
> On 29/07/13 23:44, Luigi Tarenga wrote:
> > hi list,
> > while writing a script to execute parallel ssh command on many host I found
> > a strange behavior of dash. I can replicate it with a very simple script but
> > didn't find any documentation about dash or POSIX that can explain it.
> > 
> > tested on centos 6.4 (dash 0.5.5.1) and with dash compiled from source (0.5.7)
> > the following script reports error:
> > 
> > #!/bin/dash
> > 
> > sleep 3 &
> > sleep 3 &
> > sleep 3 &
> > sleep 3 &
> > 
> > #/bin/true
> > jobs -l
> > 
> > wait %1
> > wait %2
> > wait %3
> > wait %4
> > 
> > [vortex@lizard ~]$ ./dash-0.5.7/src/dash test.sh
> > [4] + 4569 Running
> > [3] - 4568 Running
> > [2]   4567 Running
> > [1]   4566 Running
> > prova: 14: wait: No such job: %4
> > [vortex@lizard ~]$ echo $?
> > 2
> 
> Yes, this looks like a bug to me. The number of allocated jobs is always
> kept as a multiple of four, and the first check in deciding whether a
> job number is valid is "if it's greater than or equal to the number of
> allocated jobs, it's invalid". That doesn't look right. That would
> only be right if job numbers were zero-based, but they start at one. If
> it's exactly equal to the number of allocated jobs, it can still be
> valid. It works when adding /bin/true, because four more jobs end up
> allocated internally.
> 
> The attached patch should fix it.
> 
> Cheers,
> Harald

> commit ddeba5485c3309ffc7010f8924d604a781908e1d
> Author: Harald van Dijk <harald@gigawatt.nl>
> Date:   Tue Jul 30 00:36:53 2013 +0200
> 
>     getjob: Fix off-by-one error for multiple of four job numbers

Patch applied.  Thanks!
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

