linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFD] Combined fork-exec syscall.
@ 2003-04-28  0:57 Mark Grosberg
  2003-04-28  0:59 ` Larry McVoy
                   ` (6 more replies)
  0 siblings, 7 replies; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  0:57 UTC (permalink / raw)
  To: linux-kernel


Hello all,

Is there any interest in a single system call that will perform both a
fork() and exec()? Could this save some extra work of doing a
copy_mm(), copy_signals(), etc?

I would think on large, multi-user systems that are spawning processes all
day, this might improve performance if the shells on such a system were
patched.

Perhaps a system call like:

   pid_t spawn(const char *p_path,
               const char *argv[],
               const char *envp[],
               const int   filp[]);

The filp array would allow file descriptors to be redirected. It could be
terminated by a -1 and reference the file descriptors of the current
process (this could also potentially save some dup() syscalls).

If any of these parameters (exclusing p_path) are NULL, then the
appropriate values are taken from the current process.

I originally was thinking of a name of fexec() for such a syscall, but
since there are already "f" variant syscalls (fchmod, fstat, ...) that an
fexec() would make more sense about executing an already open file, so the
name spawn() came to mind.

I know almost all of my fork()-exec() code does almost the same thing. I
guess vfork() was a potential solution, but this somehow seems cleaner
(and still may be more efficient than having to issue two syscalls)...
the downside is, of course, another syscall.

L8r,
Mark G.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  0:57 [RFD] Combined fork-exec syscall Mark Grosberg
@ 2003-04-28  0:59 ` Larry McVoy
  2003-04-28  1:16   ` Mark Grosberg
  2003-04-28  1:17 ` Davide Libenzi
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 48+ messages in thread
From: Larry McVoy @ 2003-04-28  0:59 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: linux-kernel

If you do this, _please_ make it compat with NT.

On Sun, Apr 27, 2003 at 08:57:12PM -0400, Mark Grosberg wrote:
> 
> Hello all,
> 
> Is there any interest in a single system call that will perform both a
> fork() and exec()? Could this save some extra work of doing a
> copy_mm(), copy_signals(), etc?
> 
> I would think on large, multi-user systems that are spawning processes all
> day, this might improve performance if the shells on such a system were
> patched.
> 
> Perhaps a system call like:
> 
>    pid_t spawn(const char *p_path,
>                const char *argv[],
>                const char *envp[],
>                const int   filp[]);
> 
> The filp array would allow file descriptors to be redirected. It could be
> terminated by a -1 and reference the file descriptors of the current
> process (this could also potentially save some dup() syscalls).
> 
> If any of these parameters (exclusing p_path) are NULL, then the
> appropriate values are taken from the current process.
> 
> I originally was thinking of a name of fexec() for such a syscall, but
> since there are already "f" variant syscalls (fchmod, fstat, ...) that an
> fexec() would make more sense about executing an already open file, so the
> name spawn() came to mind.
> 
> I know almost all of my fork()-exec() code does almost the same thing. I
> guess vfork() was a potential solution, but this somehow seems cleaner
> (and still may be more efficient than having to issue two syscalls)...
> the downside is, of course, another syscall.
> 
> L8r,
> Mark G.
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
---
Larry McVoy              lm at bitmover.com          http://www.bitmover.com/lm

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  0:59 ` Larry McVoy
@ 2003-04-28  1:16   ` Mark Grosberg
  2003-04-28  1:36     ` Måns Rullgård
  0 siblings, 1 reply; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  1:16 UTC (permalink / raw)
  To: Larry McVoy; +Cc: linux-kernel



On Sun, 27 Apr 2003, Larry McVoy wrote:

> If you do this, _please_ make it compat with NT.

Actually, I thought about this. My first thought is this could benefit
WINE running on Linux. Then (not like I'm a Wine expert by any means) I
figured it might be an issue as far as having to do some preliminary
wineserver setup work (if anybody on this list knows better than me, speak
up!)

But yeah, basically, something similar to NT's CreateProcess(). For the
cases where the one-step process creation is sufficient.

L8r,
Mark G.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  0:57 [RFD] Combined fork-exec syscall Mark Grosberg
  2003-04-28  0:59 ` Larry McVoy
@ 2003-04-28  1:17 ` Davide Libenzi
  2003-04-28  1:28   ` Mark Grosberg
  2003-04-28  1:41   ` Ulrich Drepper
  2003-04-28  1:35 ` dean gaudet
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 48+ messages in thread
From: Davide Libenzi @ 2003-04-28  1:17 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: Linux Kernel Mailing List

On Sun, 27 Apr 2003, Mark Grosberg wrote:

> Is there any interest in a single system call that will perform both a
> fork() and exec()? Could this save some extra work of doing a
> copy_mm(), copy_signals(), etc?
>
> I would think on large, multi-user systems that are spawning processes all
> day, this might improve performance if the shells on such a system were
> patched.
>
> Perhaps a system call like:
>
>    pid_t spawn(const char *p_path,
>                const char *argv[],
>                const char *envp[],
>                const int   filp[]);
>
> The filp array would allow file descriptors to be redirected. It could be
> terminated by a -1 and reference the file descriptors of the current
> process (this could also potentially save some dup() syscalls).
>
> If any of these parameters (exclusing p_path) are NULL, then the
> appropriate values are taken from the current process.
>
> I originally was thinking of a name of fexec() for such a syscall, but
> since there are already "f" variant syscalls (fchmod, fstat, ...) that an
> fexec() would make more sense about executing an already open file, so the
> name spawn() came to mind.
>
> I know almost all of my fork()-exec() code does almost the same thing. I
> guess vfork() was a potential solution, but this somehow seems cleaner
> (and still may be more efficient than having to issue two syscalls)...
> the downside is, of course, another syscall.

This is very much library stuff. I don't think that saving a couple of
system calls will give you an edge, expecially when we're talking of
spawning another process. Even if the process itself does nothing but
return. Ulrich might be eventually interested ...




- Davide


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-29  1:05 ` Rafael Costa dos Santos
@ 2003-04-28  1:19   ` Mark Grosberg
  2003-04-29  1:29     ` Rafael Costa dos Santos
  0 siblings, 1 reply; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  1:19 UTC (permalink / raw)
  To: rafael; +Cc: linux-kernel



On Mon, 28 Apr 2003, Rafael Costa dos Santos wrote:

> Do you have some work done on this issue ?

Nope. Was just thinking. To be honest, I stopped doing kernel development
in the 2.0 days (but I keep up with LKML). This would be for 2.5 but
could probably be backported to 2.4 without too much trouble.

But tomorrow I may very well find myself out of a job (big surprise), and
if so, I'll setup a Linux box (since I've mostly been using OpenBSD
these days) and work on this (for i386 to begin with) if  there is enough
interest.

L8r,
Mark G.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:17 ` Davide Libenzi
@ 2003-04-28  1:28   ` Mark Grosberg
  2003-04-29  2:01     ` Rafael Costa dos Santos
  2003-04-28  1:41   ` Ulrich Drepper
  1 sibling, 1 reply; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  1:28 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Linux Kernel Mailing List



On Sun, 27 Apr 2003, Davide Libenzi wrote:

> This is very much library stuff. I don't think that saving a couple of
> system calls will give you an edge, expecially when we're talking of

I guess it depends on what is considered saving. To be honest, my area of
interest is embedded systems. uClinux might benefit from such a syscall
(not that I have much experience with Linux on mmu-less hardware).

The idea would be to avoid the syscalls needed. I looked at the typical
fork-exec code I write and it does something similar to:

   if (fork() == 0)
   {
     for(i = 3; i < NFILES; i++)
      close(i);

     sigaction(...);
     sigaction(...);

     exec();
   }

The system call I was proposing would have a few benefits:

  (1) Because of the file mapping array that can be provided, the
      closing of the file descriptors isn't necessary (close-on-exec
      isn't always convenient to use). Neither is the dup()ing of
      file descriptors for things like pipelines.

  (2) There is always the difficulty of finding out in the parent
      if the exec() fails. Sure you can stat the path, sure you can send
      a signal or do this countless other ways. But the single syscall
      would make this a no-brainer.

  (3) We would eliminate the page faults for the new stack as the child
      runs to setup the environment. I guess this could save a a few free
      pages for a millisecond. Yeah, minor... but if you have a large
      system with 1000's of users it may matter.

Of course, a new system call is intrusive in the kernel. One option is to
prototype this as a library function first, see how much use it gets and
then decide later on to move it to the kernel.

L8r,
Mark G.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  0:57 [RFD] Combined fork-exec syscall Mark Grosberg
  2003-04-28  0:59 ` Larry McVoy
  2003-04-28  1:17 ` Davide Libenzi
@ 2003-04-28  1:35 ` dean gaudet
  2003-04-28  1:43   ` Mark Grosberg
  2003-04-28  2:38   ` Davide Libenzi
  2003-04-28  2:09 ` Richard B. Johnson
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 48+ messages in thread
From: dean gaudet @ 2003-04-28  1:35 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: linux-kernel

On Sun, 27 Apr 2003, Mark Grosberg wrote:

> I would think on large, multi-user systems that are spawning processes all
> day, this might improve performance if the shells on such a system were
> patched.

more relevant is a large multithreaded (or async model with many
connections per thread/process) webserver spawning cgi.  otherwise you pay
the costs of duplicating the mm and even if you use F_CLOEXEC (which has a
cost-per-connection) you have to pay for scanning the open fds.

if you look at such webservers they tend to have a separate process just
for the purpose of spawning cgi/etc. and use some IPC to pass the data to
the cgi spawner.

-dean

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:16   ` Mark Grosberg
@ 2003-04-28  1:36     ` Måns Rullgård
  2003-04-28  1:45       ` Mark Grosberg
                         ` (2 more replies)
  0 siblings, 3 replies; 48+ messages in thread
From: Måns Rullgård @ 2003-04-28  1:36 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: Larry McVoy, linux-kernel

Mark Grosberg <mark@nolab.conman.org> writes:

> > If you do this, _please_ make it compat with NT.
> 
> Actually, I thought about this. My first thought is this could benefit
> WINE running on Linux. Then (not like I'm a Wine expert by any means) I
> figured it might be an issue as far as having to do some preliminary
> wineserver setup work (if anybody on this list knows better than me, speak
> up!)
> 
> But yeah, basically, something similar to NT's CreateProcess(). For the
> cases where the one-step process creation is sufficient.

Is that the call that takes dozens of parameters?  Copying :-) that
is, IMHO, straight against the UNIX philosophy.

-- 
Måns Rullgård
mru@users.sf.net

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:17 ` Davide Libenzi
  2003-04-28  1:28   ` Mark Grosberg
@ 2003-04-28  1:41   ` Ulrich Drepper
  2003-04-28  1:49     ` Mark Grosberg
  1 sibling, 1 reply; 48+ messages in thread
From: Ulrich Drepper @ 2003-04-28  1:41 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Mark Grosberg, Linux Kernel Mailing List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Davide Libenzi wrote:

> This is very much library stuff. I don't think that saving a couple of
> system calls will give you an edge, expecially when we're talking of
> spawning another process. Even if the process itself does nothing but
> return. Ulrich might be eventually interested ...

POSIX has a spawn interface, see <spawn.h> on modern systems.  A syscall
should be compatible with this interface.

- -- 
- --------------.                        ,-.            444 Castro Street
Ulrich Drepper \    ,-----------------'   \ Mountain View, CA 94041 USA
Red Hat         `--' drepper at redhat.com `---------------------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+rIbE2ijCOnn/RHQRAstmAKClxTVl6JUUsKycwat1o3UGqPF64wCgsH5j
imxS5VWcVU0li0nNK2Aa99o=
=Z49l
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:35 ` dean gaudet
@ 2003-04-28  1:43   ` Mark Grosberg
  2003-04-28  3:44     ` Mark Mielke
  2003-04-28  2:38   ` Davide Libenzi
  1 sibling, 1 reply; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  1:43 UTC (permalink / raw)
  To: dean gaudet; +Cc: linux-kernel



On Sun, 27 Apr 2003, dean gaudet wrote:

> On Sun, 27 Apr 2003, Mark Grosberg wrote:
>
> > I would think on large, multi-user systems that are spawning processes all
> > day, this might improve performance if the shells on such a system were
> > patched.
>
> more relevant is a large multithreaded (or async model with many
> connections per thread/process) webserver spawning cgi.  otherwise you pay

Heh. I just happen to have written a multi-threaded webserver (called
Seminole), but it does CGI "in process." Actually, it runs on VxWorks too
where there is no concept of a process. :-)

But you're right. This could be a boon for any non-in-process (non
mod_perl or PHP) webservers.

The idea would be that the file mapping array would be easier to scan
(kind of like how poll() is a lot easier than select()).

> if you look at such webservers they tend to have a separate process just
> for the purpose of spawning cgi/etc. and use some IPC to pass the data to
> the cgi spawner.

Yup. I suppose for Apache this could be an alternate interface of the APR
spawn process function.

L8r,
Mark G.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:36     ` Måns Rullgård
@ 2003-04-28  1:45       ` Mark Grosberg
  2003-04-28  1:49       ` dean gaudet
  2003-05-01 13:14       ` Jakob Oestergaard
  2 siblings, 0 replies; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  1:45 UTC (permalink / raw)
  To: Måns Rullgård; +Cc: Larry McVoy, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=X-UNKNOWN, Size: 467 bytes --]



On 28 Apr 2003, [iso-8859-1] Måns Rullgård wrote:

> > But yeah, basically, something similar to NT's CreateProcess(). For the
> > cases where the one-step process creation is sufficient.
>
> Is that the call that takes dozens of parameters?  Copying :-) that
> is, IMHO, straight against the UNIX philosophy.

Well, it does take quite a few parameters. I wasn't thinking that it be
nearly that messy. See my first message for my original proposal.

L8r,
Mark G.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:36     ` Måns Rullgård
  2003-04-28  1:45       ` Mark Grosberg
@ 2003-04-28  1:49       ` dean gaudet
  2003-04-28  1:59         ` Mark Grosberg
  2003-05-01 13:14       ` Jakob Oestergaard
  2 siblings, 1 reply; 48+ messages in thread
From: dean gaudet @ 2003-04-28  1:49 UTC (permalink / raw)
  To: Måns Rullgård; +Cc: Mark Grosberg, Larry McVoy, linux-kernel

On Sun, 28 Apr 2003, Måns Rullgård wrote:

> Mark Grosberg <mark@nolab.conman.org> writes:
>
> > > If you do this, _please_ make it compat with NT.
> >
> > Actually, I thought about this. My first thought is this could benefit
> > WINE running on Linux. Then (not like I'm a Wine expert by any means) I
> > figured it might be an issue as far as having to do some preliminary
> > wineserver setup work (if anybody on this list knows better than me, speak
> > up!)
> >
> > But yeah, basically, something similar to NT's CreateProcess(). For the
> > cases where the one-step process creation is sufficient.
>
> Is that the call that takes dozens of parameters?  Copying :-) that
> is, IMHO, straight against the UNIX philosophy.

unfortunately you want those dozen parameters, they all have a purpose...
which is what makes such a call suspect in the first place.

vfork() solves the mm copying problem, which eliminates half the reason
for a combined fork-exec syscall.

the only time fork-exec is inefficient, given the existence of vfork, is
when you need to fork a process which has a lot of fd.  and by "a lot" i
mean thousands.

in that case even F_CLOEXEC isn't a good answer -- because it's a pain in
the ass to set because it requires an extra system call for the most
important case -- sockets.  otherwise you have to iterate over the entire
fd array to close things... which isn't so hot for multiprocessor setups.

but even this has a potential work-around using procfs -- use clone() to
get the vfork semantics without also copying the fd array.  then open
/proc/$ppid/fd/N for any file descriptors you want opened in the forked
process.

given both vfork and procfs i'm not sure there's any other performance
benefit a combined fork+exec syscall offers...  and if procfs isn't fast
enough for this then that's a better place to focus effort :)

-dean

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:41   ` Ulrich Drepper
@ 2003-04-28  1:49     ` Mark Grosberg
  2003-04-28  2:19       ` Ulrich Drepper
  2003-04-28  6:59       ` Kai Henningsen
  0 siblings, 2 replies; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  1:49 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: Linux Kernel Mailing List



On Sun, 27 Apr 2003, Ulrich Drepper wrote:

> POSIX has a spawn interface, see <spawn.h> on modern systems.  A syscall
> should be compatible with this interface.

Hmmm. Okay, it isn't listed in my POSIX reference (which is really dated).

I don't have any docs on this... I did grep around some header files on a
Linux box and it looks to be a fairly complex interface. I'm not opposed
to supporting the interface, but I would like the syscall to be fairly
light-weight.

Would my original proposal cover the POSIX spec with some userland glue?

L8r,
Mark G.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:49       ` dean gaudet
@ 2003-04-28  1:59         ` Mark Grosberg
  2003-04-28  2:27           ` Miles Bader
  2003-04-28 19:07           ` dean gaudet
  0 siblings, 2 replies; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  1:59 UTC (permalink / raw)
  To: dean gaudet; +Cc: Måns Rullgård, Larry McVoy, linux-kernel



On Sun, 27 Apr 2003, dean gaudet wrote:

> the only time fork-exec is inefficient, given the existence of vfork, is
> when you need to fork a process which has a lot of fd.  and by "a lot" i
> mean thousands.

Depends at what level of optimization you are talking about. I consider a
syscall an expensive operation. The transition from user to kernel mode,
the setup and retrieval of parameters all cost (and some architectures are
worse at it than i386).

> but even this has a potential work-around using procfs -- use clone() to
> get the vfork semantics without also copying the fd array.  then open
> /proc/$ppid/fd/N for any file descriptors you want opened in the forked
> process.

That is still quite a few syscalls (and some path walking for each file
descriptor)... I was proposing to get around the syscall overhead which
on large multi-user systems (or webservers running lots of CGI) could be
significant.

Honestly, I'm not sure if it is necessary either. But I can think of a few
advantages (and people on the list have thought of more)... How do
MMU-less archs spawn processes? Do they always use vfork()?

L8r,
Mark G.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  0:57 [RFD] Combined fork-exec syscall Mark Grosberg
                   ` (2 preceding siblings ...)
  2003-04-28  1:35 ` dean gaudet
@ 2003-04-28  2:09 ` Richard B. Johnson
  2003-04-28  2:12   ` Mark Grosberg
  2003-04-28  2:32   ` Werner Almesberger
  2003-04-28  7:40 ` Mirar
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 48+ messages in thread
From: Richard B. Johnson @ 2003-04-28  2:09 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: linux-kernel

On Sun, 27 Apr 2003, Mark Grosberg wrote:

>
> Hello all,
>
> Is there any interest in a single system call that will perform both a
> fork() and exec()? Could this save some extra work of doing a
> copy_mm(), copy_signals(), etc?
>
> I would think on large, multi-user systems that are spawning processes all
> day, this might improve performance if the shells on such a system were
> patched.
>
> Perhaps a system call like:
>
>    pid_t spawn(const char *p_path,
>                const char *argv[],
>                const char *envp[],
>                const int   filp[]);
>
> The filp array would allow file descriptors to be redirected. It could be
> terminated by a -1 and reference the file descriptors of the current
> process (this could also potentially save some dup() syscalls).
>
> If any of these parameters (exclusing p_path) are NULL, then the
> appropriate values are taken from the current process.
>
> I originally was thinking of a name of fexec() for such a syscall, but
> since there are already "f" variant syscalls (fchmod, fstat, ...) that an
> fexec() would make more sense about executing an already open file, so the
> name spawn() came to mind.
>
> I know almost all of my fork()-exec() code does almost the same thing. I
> guess vfork() was a potential solution, but this somehow seems cleaner
> (and still may be more efficient than having to issue two syscalls)...
> the downside is, of course, another syscall.
>
> L8r,
> Mark G.

You don't save anything but one system call time which is inconsequential
compared to the time necessary to exec (load a file, etc). Also, it is
worthless for anything except the most basic 'system()' or popen()
enulation. In fact, it wouldn't even work for popen() because one
needs to set up a pipe in the child before the exec.

All it does is add kernel bloat and duplicate existing kernel code
(both). Learn Unix instead of trying to make it VMS with spawn().


Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  2:09 ` Richard B. Johnson
@ 2003-04-28  2:12   ` Mark Grosberg
  2003-04-28  2:42     ` Werner Almesberger
  2003-04-28 13:00     ` Richard B. Johnson
  2003-04-28  2:32   ` Werner Almesberger
  1 sibling, 2 replies; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  2:12 UTC (permalink / raw)
  To: Richard B. Johnson; +Cc: linux-kernel



On Sun, 27 Apr 2003, Richard B. Johnson wrote:

> You don't save anything but one system call time which is inconsequential
> compared to the time necessary to exec (load a file, etc). Also, it is
> worthless for anything except the most basic 'system()' or popen()

Actually, my original proposal will work for popen and all sorts of piping
because of the file descriptor map. For example:

   int   in[2], out[2];
   char *null_argv[] = { NULL };
   int   fmap[4];
   pid_t p;

   pipe(in);
   pipe(out);

   fmap[0] = in[0];                     /* STDIN  */
   fmap[1] = out[1];                    /* STDOUT */
   fmap[2] = open("/dev/null", O_RDWR); /* STDERR */
   fmap[3] = -1;                        /* end    */

   p = nexec("/bin/cat",
             null_argv,
             NULL,
             filmap);


In this case you save the extra closes the child would have to do and you
save the dup's.

> All it does is add kernel bloat and duplicate existing kernel code
> (both). Learn Unix instead of trying to make it VMS with spawn().

Ahem, I happen to know Unix very well, thank you very much. Please read my
proposed API before flaming it out and assuming I know nothing of UNIX,
kernel development, or operating systems in general!

Do you honestly think that just because I picked a name spawn() that
happens to be in VMS (and MS-DOS C compilers) that I am inexperienced to
Unix. Nope. I just happen to be a BSD user in general and don't frequent
LKML.... and now I remember WHY!

And there _ARE_ issues this does solve as were already pointed out because
of the linear scan that must be made on the file descriptor array for the
close-on-exec flag (which this API could happily say it ignores since it
builds a _WHOLE_NEW file descriptor array).

L8r,
Mark G.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:49     ` Mark Grosberg
@ 2003-04-28  2:19       ` Ulrich Drepper
  2003-04-28  6:59       ` Kai Henningsen
  1 sibling, 0 replies; 48+ messages in thread
From: Ulrich Drepper @ 2003-04-28  2:19 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: Linux Kernel Mailing List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mark Grosberg wrote:

> Would my original proposal cover the POSIX spec with some userland glue?

No.  The additional work has to be done in the child before exec.  The
whole point is to not leave the kernel.

- -- 
- --------------.                        ,-.            444 Castro Street
Ulrich Drepper \    ,-----------------'   \ Mountain View, CA 94041 USA
Red Hat         `--' drepper at redhat.com `---------------------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+rI+a2ijCOnn/RHQRAkQhAJ9pG+yBJJOyWlHtU7emXBy5gN/hLACfYXmg
UcGYeu2DEWXX6utwaZVBdAA=
=1Lee
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:59         ` Mark Grosberg
@ 2003-04-28  2:27           ` Miles Bader
  2003-04-28 19:07           ` dean gaudet
  1 sibling, 0 replies; 48+ messages in thread
From: Miles Bader @ 2003-04-28  2:27 UTC (permalink / raw)
  To: Mark Grosberg
  Cc: dean gaudet, =?iso-8859-1?q? Måns Rullgård?=,
	Larry McVoy, linux-kernel

Mark Grosberg <mark@nolab.conman.org> writes:
> How do MMU-less archs spawn processes? Do they always use vfork()?

Yes

-Miles
-- 
Is it true that nothing can be known?  If so how do we know this?  -Woody Allen

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  2:09 ` Richard B. Johnson
  2003-04-28  2:12   ` Mark Grosberg
@ 2003-04-28  2:32   ` Werner Almesberger
  1 sibling, 0 replies; 48+ messages in thread
From: Werner Almesberger @ 2003-04-28  2:32 UTC (permalink / raw)
  To: Richard B. Johnson; +Cc: Mark Grosberg, linux-kernel

Richard B. Johnson wrote:
> Learn Unix instead of trying to make it VMS with spawn().

... or take the hint: the full name is LIB$SPAWN.
(Not that SYS$CREPRC is exactly a beauty either ...)

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:35 ` dean gaudet
  2003-04-28  1:43   ` Mark Grosberg
@ 2003-04-28  2:38   ` Davide Libenzi
  1 sibling, 0 replies; 48+ messages in thread
From: Davide Libenzi @ 2003-04-28  2:38 UTC (permalink / raw)
  To: dean gaudet; +Cc: Linux Kernel Mailing List

On Sun, 27 Apr 2003, dean gaudet wrote:

> On Sun, 27 Apr 2003, Mark Grosberg wrote:
>
> > I would think on large, multi-user systems that are spawning processes all
> > day, this might improve performance if the shells on such a system were
> > patched.
>
> more relevant is a large multithreaded (or async model with many
> connections per thread/process) webserver spawning cgi.  otherwise you pay
> the costs of duplicating the mm and even if you use F_CLOEXEC (which has a
> cost-per-connection) you have to pay for scanning the open fds.

This might be the only edge of such new syscall IMO. Processes with
hugh file tables. Not the MM stuff, that is fine with vfork().




- Davide


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  2:12   ` Mark Grosberg
@ 2003-04-28  2:42     ` Werner Almesberger
  2003-04-28  6:35       ` Mark Grosberg
  2003-04-29  2:47       ` Rafael Santos
  2003-04-28 13:00     ` Richard B. Johnson
  1 sibling, 2 replies; 48+ messages in thread
From: Werner Almesberger @ 2003-04-28  2:42 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: Richard B. Johnson, linux-kernel

Mark Grosberg wrote:
>    fmap[0] = in[0];                     /* STDIN  */
>    fmap[1] = out[1];                    /* STDOUT */
>    fmap[2] = open("/dev/null", O_RDWR); /* STDERR */
>    fmap[3] = -1;                        /* end    */
> 
>    p = nexec("/bin/cat",
>              null_argv,
>              NULL,
>              filmap);

How about

    fdrplc(3,fmap);
    exec("/bin/cat",...);

?

0) System call names must be short and cryptic :-)
1) Requiring the kernel to iterate over the array element by element
   in order to find out how big it is may be inefficient. Better to
   pass the length.
2) System call overhead is marginal, particularly in this case.
3) There may be other uses than exec(2), where a way for closeing
   all fds and getting a new set may be useful.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-29  2:47       ` Rafael Santos
@ 2003-04-28  3:20         ` Werner Almesberger
  0 siblings, 0 replies; 48+ messages in thread
From: Werner Almesberger @ 2003-04-28  3:20 UTC (permalink / raw)
  To: Rafael Santos; +Cc: linux-kernel

Rafael Santos wrote:
> This is not the point.

It avoids kernel.c:copy_files and the ensuing close(2) orgy.
You can already avoid MM duplication with CLONE_VM. CLONE_VFORK
gives you synchronization. What else is the point ?

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:43   ` Mark Grosberg
@ 2003-04-28  3:44     ` Mark Mielke
  2003-04-28  5:16       ` Jamie Lokier
  0 siblings, 1 reply; 48+ messages in thread
From: Mark Mielke @ 2003-04-28  3:44 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: dean gaudet, linux-kernel

On Sun, Apr 27, 2003 at 09:43:38PM -0400, Mark Grosberg wrote:
> The idea would be that the file mapping array would be easier to scan
> (kind of like how poll() is a lot easier than select()).

This brings up the whole poll() vs select() vs /dev/poll vs ... discussion.

poll() is not necessarily faster than select(). If FD_CLOEXEC is not fast
enough, then perhaps efforts should be put into improving FD_CLOEXEC in the
kernel, rather than implementing a new system call that nobody will use
because it isn't defined by POSIX. If the argument is that vfork(), exec()
must scan the file descriptors to determine which ones have FD_CLOEXEC set,
then perhaps the answer is to index the FD_CLOEXEC bits of file descriptors?

> > if you look at such webservers they tend to have a separate process just
> > for the purpose of spawning cgi/etc. and use some IPC to pass the data to
> > the cgi spawner.
> Yup. I suppose for Apache this could be an alternate interface of the APR
> spawn process function.

The Apache 2.0 documentation refers to mod_cgid as a method of
avoiding the scenario that involves fork() copying all threads, not
fork() scanning all file descriptors. Other than complexity of having
a separate 'spawning daemon', I'm not sure that providing a spawn()
system call would make things any faster. A CGI is going to take a
fair amount of time to complete regardless of where it is spawned
from. If you need speed, write your own mod_feature.c, or use an
alternative such as mod_perl.

mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  3:44     ` Mark Mielke
@ 2003-04-28  5:16       ` Jamie Lokier
  0 siblings, 0 replies; 48+ messages in thread
From: Jamie Lokier @ 2003-04-28  5:16 UTC (permalink / raw)
  To: Mark Mielke; +Cc: Mark Grosberg, dean gaudet, linux-kernel

Mark Mielke wrote:
> If the argument is that vfork(), exec()
> must scan the file descriptors to determine which ones have FD_CLOEXEC set,
> then perhaps the answer is to index the FD_CLOEXEC bits of file descriptors?

More precisely, to index the file descriptors whih do _not_ have the
FD_CLOEXEC bit set - those are the file descriptors to copy into the
new process.

A more relevant optimisation is to make searching for new file
descriptors O(1) in accept(), yet that was discussed years ago and it
was decided it wasn't worth doing.

-- Jamie

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  2:42     ` Werner Almesberger
@ 2003-04-28  6:35       ` Mark Grosberg
  2003-04-29  2:47       ` Rafael Santos
  1 sibling, 0 replies; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28  6:35 UTC (permalink / raw)
  To: Werner Almesberger; +Cc: linux-kernel



On Sun, 27 Apr 2003, Werner Almesberger wrote:

> How about
>
>     fdrplc(3,fmap);
>     exec("/bin/cat",...);

Not a bad idea. Although my initial motives were to try and reduce the
number of syscalls for forking processes, I can see this as kind of a
useful call as well.

> 0) System call names must be short and cryptic :-)

Heh. How about something like fdmap_set()?

> 1) Requiring the kernel to iterate over the array element by element
>    in order to find out how big it is may be inefficient. Better to
>    pass the length.

Good point. Just a quick verify_area() check and then process away.

> 2) System call overhead is marginal, particularly in this case.

Depends. I know that on the one multi-user Linux machine I do use on a
day-to-day basis syscall overhead is painful:

  Calibrating delay loop.. ok - 16.59 BogoMIPS

and considering this is a multi-user machine with quite a few users always
tapping away doing quick commands (edit this file, run this program, copy
this file, ...).

> 3) There may be other uses than exec(2), where a way for closeing
>    all fds and getting a new set may be useful.

Agreed.

L8r,
Mark G.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:49     ` Mark Grosberg
  2003-04-28  2:19       ` Ulrich Drepper
@ 2003-04-28  6:59       ` Kai Henningsen
  1 sibling, 0 replies; 48+ messages in thread
From: Kai Henningsen @ 2003-04-28  6:59 UTC (permalink / raw)
  To: linux-kernel

mark@nolab.conman.org (Mark Grosberg)  wrote on 27.04.03 in <Pine.BSO.4.44.0304272145580.23296-100000@kwalitee.nolab.conman.org>:

> On Sun, 27 Apr 2003, Ulrich Drepper wrote:
>
> > POSIX has a spawn interface, see <spawn.h> on modern systems.  A syscall
> > should be compatible with this interface.
>
> Hmmm. Okay, it isn't listed in my POSIX reference (which is really dated).
>
> I don't have any docs on this...

http://www.opengroup.org/onlinepubs/007904975/functions/posix_spawn.html

or start from here:

http://unix-systems.org/version3/online.html

MfG Kai

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  0:57 [RFD] Combined fork-exec syscall Mark Grosberg
                   ` (3 preceding siblings ...)
  2003-04-28  2:09 ` Richard B. Johnson
@ 2003-04-28  7:40 ` Mirar
  2003-04-28 12:45 ` Matthias Andree
  2003-04-29  1:05 ` Rafael Costa dos Santos
  6 siblings, 0 replies; 48+ messages in thread
From: Mirar @ 2003-04-28  7:40 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: linux-kernel

> Is there any interest in a single system call that will perform both a
> fork() and exec()? Could this save some extra work of doing a
> copy_mm(), copy_signals(), etc?

I reall have missed it. 

But if you implement one, there are a few extra parameters that can be
nice to have - cwd, uid, gid. I can probably figure out other things
you usually do between fork and exec (chroot, setsid, maybe?).

/Mirar

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  0:57 [RFD] Combined fork-exec syscall Mark Grosberg
                   ` (4 preceding siblings ...)
  2003-04-28  7:40 ` Mirar
@ 2003-04-28 12:45 ` Matthias Andree
  2003-04-29  1:05 ` Rafael Costa dos Santos
  6 siblings, 0 replies; 48+ messages in thread
From: Matthias Andree @ 2003-04-28 12:45 UTC (permalink / raw)
  To: linux-kernel

On Sun, 27 Apr 2003, Mark Grosberg wrote:

> Is there any interest in a single system call that will perform both a
> fork() and exec()? Could this save some extra work of doing a
> copy_mm(), copy_signals(), etc?

How about doing vfork() right (fixing the "what if execve(2) fails"
race) instead?

> I know almost all of my fork()-exec() code does almost the same thing. I
> guess vfork() was a potential solution, but this somehow seems cleaner
> (and still may be more efficient than having to issue two syscalls)...
> the downside is, of course, another syscall.

Which is a major showstopper, because it'd only be useful to
non-portable, Unix-specific applications (thus it wouldn't be put to
much use). OTOH, copy-on-write pages will eliminate much of the overhead
already.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  2:12   ` Mark Grosberg
  2003-04-28  2:42     ` Werner Almesberger
@ 2003-04-28 13:00     ` Richard B. Johnson
  2003-04-28 13:22       ` Andreas Schwab
                         ` (2 more replies)
  1 sibling, 3 replies; 48+ messages in thread
From: Richard B. Johnson @ 2003-04-28 13:00 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: linux-kernel

On Sun, 27 Apr 2003, Mark Grosberg wrote:

>
>
> On Sun, 27 Apr 2003, Richard B. Johnson wrote:
>
> > You don't save anything but one system call time which is inconsequential
> > compared to the time necessary to exec (load a file, etc). Also, it is
> > worthless for anything except the most basic 'system()' or popen()
>
> Actually, my original proposal will work for popen and all sorts of piping
> because of the file descriptor map. For example:
>
>    int   in[2], out[2];
>    char *null_argv[] = { NULL };
>    int   fmap[4];
>    pid_t p;
>
>    pipe(in);
>    pipe(out);
>
>    fmap[0] = in[0];                     /* STDIN  */
>    fmap[1] = out[1];                    /* STDOUT */
>    fmap[2] = open("/dev/null", O_RDWR); /* STDERR */
>    fmap[3] = -1;                        /* end    */
>
>    p = nexec("/bin/cat",
>              null_argv,
>              NULL,
>              filmap);
>
>
> In this case you save the extra closes the child would have to do and you
> save the dup's.
>
> > All it does is add kernel bloat and duplicate existing kernel code
> > (both). Learn Unix instead of trying to make it VMS with spawn().
>
> Ahem, I happen to know Unix very well, thank you very much. Please read my
> proposed API before flaming it out and assuming I know nothing of UNIX,
> kernel development, or operating systems in general!
>
> Do you honestly think that just because I picked a name spawn() that
> happens to be in VMS (and MS-DOS C compilers) that I am inexperienced to
> Unix. Nope. I just happen to be a BSD user in general and don't frequent
> LKML.... and now I remember WHY!
>
> And there _ARE_ issues this does solve as were already pointed out because
> of the linear scan that must be made on the file descriptor array for the
> close-on-exec flag (which this API could happily say it ignores since it
> builds a _WHOLE_NEW file descriptor array).
>
> L8r,
> Mark G.


The Unix API provides execve(), fexecve(), execv(), execle(),
execl(), execvp(), and execlp() for what you call 'exec'. So
there is no 'fork and exec' as you state.

The kernel provides one system call, execve(). All of the
other functional changes are done with 'C' wrappers in the
'C' runtime library. To make a generic fork-exec, would require
that this code, or its functionality, be moved into the kernel.

To save some processing time, most knowledgeable software
engineers would use vfork(). This leaves the major time,
the time necessary to load the new application into the
new address space and begin its execution. This time could
be tens of milliseconds or even hundreds if the application
is on a CD, floppy, a disk that hasn't been accessed yet,
or the network. In the usuall situation where processing
must be performed between the fork() and the execve(), you
can't use vfork().

You can measure the time for a system call by executing
getpid() or something similar. It is in the noise compared
to the time necessary to execute a program. Further, we
get to the situation where one can't even verify a supposed
speed increase because the system call overhead is in the
noise. Great, one can claim any improvement they want and
it can't be verified. What will be verified, though, is
the increase in size of the kernel.

The following is a "simple popem()', about as minimal as
you can get and have it work.


 *   invocation as `/bin/sh -c COMMAND`. 0 reads 1 writes.
 */
FILE *popen(const char *command, const char *type)
{
    size_t i;
    int fd2close;
    struct sigaction sa;
    char *args[NR_ARGS];
    FILE *file;
    if((command == NULL) || (type == NULL))
    {
        errno = EINVAL;
        return NULL;
    }
    if(!((*type == (char)'r') || (*type == (char)'w')))
    {
        errno = EINVAL;
        return NULL;
    }
    if((file = (FILE *) malloc(sizeof(FILE))) == NULL)
    {
        return file;
    }
    bzero(file, sizeof(FILE));
    if(pipe(file->pfd))
    {
        free(file);
        return NULL;
    }
    fd2close = 0xff;
    if(*type == (char)'r')
    {
        file->fd = file->pfd[0];
        fd2close = file->pfd[1];
    }
    else
    {
        file->fd = file->pfd[1];
        fd2close = file->pfd[0];
    }
    i = 0;
    args[i++] = "/bin/sh";
    args[i++] = "-c";
    args[i++] = strtok((char *)command, " ");
    for(; i< NR_ARGS; i++)
        if((args[i] = strtok(NULL, " ")) == NULL)
            break;
    for(i++; i < NR_ARGS; i++)
        args[i] = NULL;
    sigaction(SIGCHLD, NULL, &sa);     /* Save old */
    signal(SIGCHLD, SIG_IGN);
    switch((file->pid=fork()))
    {
    case 0:
        if(*type == (char)'r')
        {
            dup2(file->pfd[1], STDOUT_FILENO);
            (void)close(file->pfd[0]);
        }
        else
        {
            dup2(file->pfd[0], STDIN_FILENO);
            (void)close(file->pfd[1]);
        }
        signal(SIGINT, SIG_IGN);
        signal(SIGQUIT, SIG_IGN);
        execve(args[0], args, __environ);
        exit(EXIT_FAILURE);
        break;
    case -1:
        (void)close(file->pfd[0]);
        (void)close(file->pfd[1]);
        free(file);
        return NULL;
    default:
        break;
    }
    file->magic = POPEN;
    sigaction(SIGCHLD, &sa, NULL);     /* Restore old */
    (void)close(fd2close);
    return file;
}

Clearly, some additional, non-generic, processing has to
occur after the fork() and before execve(). For instance,
in the parent it is mandatory that the file descriptor that
is not being accessed by the parent be closed just as it
is mandatory that the file descriptor that is not being
accessed by the child be closed. Otherwise, a read from
the file descriptor by the parent, will not error-out
and return control to the parent when the child closes its
end of the pipe. All these 'trivial little details' are
necessary to have individual function calls work as a
system. That's why Unix breaks these functions into little
pieces (primitives) so the writer has control over the
overall behavior of the complete system. Integration of
these components into a monolythic conglomeration has
always failed to provide increased functionality or
performance, instead it simply reduces the number of
lines of code necessary to be written and maintained.

Reducing the number of lines of code may be a good thing.
However, the proper place for that is in the 'C' library,
not the kernel.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 13:00     ` Richard B. Johnson
@ 2003-04-28 13:22       ` Andreas Schwab
  2003-04-28 13:57         ` Richard B. Johnson
  2003-04-28 16:36       ` Mark Grosberg
  2003-04-29 18:50       ` Timothy Miller
  2 siblings, 1 reply; 48+ messages in thread
From: Andreas Schwab @ 2003-04-28 13:22 UTC (permalink / raw)
  To: root; +Cc: Mark Grosberg, linux-kernel

"Richard B. Johnson" <root@chaos.analogic.com> writes:

|> The following is a "simple popem()', about as minimal as
|> you can get and have it work.

Except it doesn't.

|>     i = 0;
|>     args[i++] = "/bin/sh";
|>     args[i++] = "-c";
|>     args[i++] = strtok((char *)command, " ");
|>     for(; i< NR_ARGS; i++)
|>         if((args[i] = strtok(NULL, " ")) == NULL)
|>             break;

The command line must be a single argument for -c.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 13:57         ` Richard B. Johnson
@ 2003-04-28 13:57           ` Andreas Schwab
  2003-04-28 14:16             ` Richard B. Johnson
  0 siblings, 1 reply; 48+ messages in thread
From: Andreas Schwab @ 2003-04-28 13:57 UTC (permalink / raw)
  To: root; +Cc: Mark Grosberg, Linux kernel

"Richard B. Johnson" <root@chaos.analogic.com> writes:

|> On Mon, 28 Apr 2003, Andreas Schwab wrote:
|> 
|> > "Richard B. Johnson" <root@chaos.analogic.com> writes:
|> >
|> > |> The following is a "simple popem()', about as minimal as
|> > |> you can get and have it work.
|> >
|> > Except it doesn't.
|> >
|> > |>     i = 0;
|> > |>     args[i++] = "/bin/sh";
|> > |>     args[i++] = "-c";
|> > |>     args[i++] = strtok((char *)command, " ");
|> > |>     for(; i< NR_ARGS; i++)
|> > |>         if((args[i] = strtok(NULL, " ")) == NULL)
|> > |>             break;
|> 
|> Yes it does.

$ sh -c echo a b c

$ sh -c 'echo a b c'
a b c

Not what I call working.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 13:22       ` Andreas Schwab
@ 2003-04-28 13:57         ` Richard B. Johnson
  2003-04-28 13:57           ` Andreas Schwab
  0 siblings, 1 reply; 48+ messages in thread
From: Richard B. Johnson @ 2003-04-28 13:57 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Mark Grosberg, Linux kernel

On Mon, 28 Apr 2003, Andreas Schwab wrote:

> "Richard B. Johnson" <root@chaos.analogic.com> writes:
>
> |> The following is a "simple popem()', about as minimal as
> |> you can get and have it work.
>
> Except it doesn't.
>
> |>     i = 0;
> |>     args[i++] = "/bin/sh";
> |>     args[i++] = "-c";
> |>     args[i++] = strtok((char *)command, " ");
> |>     for(; i< NR_ARGS; i++)
> |>         if((args[i] = strtok(NULL, " ")) == NULL)
> |>             break;

Yes it does.

>
> The command line must be a single argument for -c.
>

That is "implementation dependent", and not a rule. 'sh' may
take additional parameters after '-c'. In the case of Linux
which uses `bash` for `sh` this can be helpful since when
bash is envoked with ths '-c' argument, $0 becomes the next
parameter, instead if the file-name, and $1 becomes the second, etc.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 13:57           ` Andreas Schwab
@ 2003-04-28 14:16             ` Richard B. Johnson
  2003-04-28 14:38               ` Valdis.Kletnieks
  2003-04-28 14:42               ` Andreas Schwab
  0 siblings, 2 replies; 48+ messages in thread
From: Richard B. Johnson @ 2003-04-28 14:16 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Mark Grosberg, Linux kernel

On Mon, 28 Apr 2003, Andreas Schwab wrote:

> "Richard B. Johnson" <root@chaos.analogic.com> writes:
>
> |> On Mon, 28 Apr 2003, Andreas Schwab wrote:
> |>
> |> > "Richard B. Johnson" <root@chaos.analogic.com> writes:
> |> >
> |> > |> The following is a "simple popem()', about as minimal as
> |> > |> you can get and have it work.
> |> >
> |> > Except it doesn't.
> |> >
> |> > |>     i = 0;
> |> > |>     args[i++] = "/bin/sh";
> |> > |>     args[i++] = "-c";
> |> > |>     args[i++] = strtok((char *)command, " ");
> |> > |>     for(; i< NR_ARGS; i++)
> |> > |>         if((args[i] = strtok(NULL, " ")) == NULL)
> |> > |>             break;
> |>
> |> Yes it does.
>
> $ sh -c echo a b c
>
> $ sh -c 'echo a b c'
> a b c
>
> Not what I call working.
>
> Andreas.

Read the bash documentation `man bash`. The first argument becomes
$0 (the process name), the second becomes $1, etc. Please  don't
just keep assuming that I don't know what I'm talking about.

$ sh -c 'ignore echo a b c'
Works fine.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 14:16             ` Richard B. Johnson
@ 2003-04-28 14:38               ` Valdis.Kletnieks
  2003-04-28 14:56                 ` Richard B. Johnson
  2003-04-28 14:42               ` Andreas Schwab
  1 sibling, 1 reply; 48+ messages in thread
From: Valdis.Kletnieks @ 2003-04-28 14:38 UTC (permalink / raw)
  To: root; +Cc: Andreas Schwab, Mark Grosberg, Linux kernel

[-- Attachment #1: Type: text/plain, Size: 939 bytes --]

On Mon, 28 Apr 2003 10:16:21 EDT, "Richard B. Johnson" said:

> Read the bash documentation `man bash`. The first argument becomes
> $0 (the process name), the second becomes $1, etc. Please  don't
> just keep assuming that I don't know what I'm talking about.
> 
> $ sh -c 'ignore echo a b c'
> Works fine.

[~]2 /bin/bash -c ignore echo a b c
echo: line 1: ignore: command not found
[~]2 /bin/bash -c 'ignore echo a b c'
/bin/bash: line 1: ignore: command not found

Obviously, tokenization makes a difference here. ;)

So let's try forcing $0 to /bin/bash rather than 'ignore'...

[~]2 sh -c '/bin/bash echo a b c'
echo: /bin/echo: cannot execute binary file

Correct, but unexpected results..

[~]2 sh -c /bin/echo echo a b c

[~]2 sh -c '/bin/echo a b c'
a b c

Again, tokenization matters - try working out what the value of argc is
for the exec of /bin/bash for each of these cases...

Dick, do you have an 'ignore' in your $PATH?


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 14:16             ` Richard B. Johnson
  2003-04-28 14:38               ` Valdis.Kletnieks
@ 2003-04-28 14:42               ` Andreas Schwab
  1 sibling, 0 replies; 48+ messages in thread
From: Andreas Schwab @ 2003-04-28 14:42 UTC (permalink / raw)
  To: root; +Cc: Mark Grosberg, Linux kernel

"Richard B. Johnson" <root@chaos.analogic.com> writes:

|> On Mon, 28 Apr 2003, Andreas Schwab wrote:
|> 
|> > "Richard B. Johnson" <root@chaos.analogic.com> writes:
|> >
|> > |> On Mon, 28 Apr 2003, Andreas Schwab wrote:
|> > |>
|> > |> > "Richard B. Johnson" <root@chaos.analogic.com> writes:
|> > |> >
|> > |> > |> The following is a "simple popem()', about as minimal as
|> > |> > |> you can get and have it work.
|> > |> >
|> > |> > Except it doesn't.
|> > |> >
|> > |> > |>     i = 0;
|> > |> > |>     args[i++] = "/bin/sh";
|> > |> > |>     args[i++] = "-c";
|> > |> > |>     args[i++] = strtok((char *)command, " ");
|> > |> > |>     for(; i< NR_ARGS; i++)
|> > |> > |>         if((args[i] = strtok(NULL, " ")) == NULL)
|> > |> > |>             break;
|> > |>
|> > |> Yes it does.
|> >
|> > $ sh -c echo a b c
|> >
|> > $ sh -c 'echo a b c'
|> > a b c
|> >
|> > Not what I call working.
|> >
|> > Andreas.
|> 
|> Read the bash documentation `man bash`.

Read the popen documentation 'man popen'.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 14:38               ` Valdis.Kletnieks
@ 2003-04-28 14:56                 ` Richard B. Johnson
  0 siblings, 0 replies; 48+ messages in thread
From: Richard B. Johnson @ 2003-04-28 14:56 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Andreas Schwab, Mark Grosberg, Linux kernel

On Mon, 28 Apr 2003 Valdis.Kletnieks@vt.edu wrote:

> On Mon, 28 Apr 2003 10:16:21 EDT, "Richard B. Johnson" said:
>
> > Read the bash documentation `man bash`. The first argument becomes
> > $0 (the process name), the second becomes $1, etc. Please  don't
> > just keep assuming that I don't know what I'm talking about.
> >
> > $ sh -c 'ignore echo a b c'
> > Works fine.
>
> [~]2 /bin/bash -c ignore echo a b c
> echo: line 1: ignore: command not found
> [~]2 /bin/bash -c 'ignore echo a b c'
> /bin/bash: line 1: ignore: command not found
>
> Obviously, tokenization makes a difference here. ;)
>
> So let's try forcing $0 to /bin/bash rather than 'ignore'...
>
> [~]2 sh -c '/bin/bash echo a b c'
> echo: /bin/echo: cannot execute binary file
>
> Correct, but unexpected results..
>
> [~]2 sh -c /bin/echo echo a b c
>
> [~]2 sh -c '/bin/echo a b c'
> a b c
>
> Again, tokenization matters - try working out what the value of argc is
> for the exec of /bin/bash for each of these cases...
>
> Dick, do you have an 'ignore' in your $PATH?
>

No:
BASH=/bin/bash
BASH_VERSION=1.14.5(1)
COLUMNS=80
DISPLAY=:0
EDITOR=/bin/vi
EUID=0
GNUHELP=/usr/local/lib/gnuplot/gnuplot.gih
HISTFILE=
HISTFILESIZE=0
HISTSIZE=500
HOME=/root
HOSTTYPE=i386
IFS=

JAVA_HOME=/usr/java
LANG=en_US.88591
LD_LIBRARY_PATH=/lib:/usr/lib/:/opt/intel/compiler50/ia32/lib:/usr/X11R6/lib:/opt/Office50/lib:/usr/java/lib/i686
LESS=-MM
LIB=/usr/X11R6/lib:/usr/X11/lib
LINES=25
LOGNAME=root
LS_COLORS=no=00:fi=40;32:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;33:*.cmd=01;32:*.o=40;32:*.c=01;26:*.S=01;26:*.h=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:
MAIL=/var/spool/mail/root
MAILCHECK=60
MANPATH=/usr/man:/usr/X11/man:/usr/openwin/man:/opt/schily/man
MINICOM=-c on
NLSPATH=/usr/man:/usr/X11/man:/usr/openwin/man:/opt/schily/man
OPENWINHOME=/usr/openwin
OPTERR=1
OPTIND=1
OSTYPE=Linux
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/opt/schily/bin:/usr/bin/X11:/sbin/:/usr/TeX/bin:/usr/openwin/bin:/opt/intel/compiler50/ia32/bin:/usr/games:.:/usr/local/Office50/bin:/usr/java/bin:/home/users/root/tools
PPID=1
PRINTER=mcd
PS1=#
PS2=>
PS4=+
PS_SYSTEM_MAP=/System.map
PWD=/root
SHELL=/bin/bash
SHLVL=1
TERM=vt100-am
TERMCAP=vt100|vt100-am|dec vt100 (w/advanced video):am:mi:ms:xn:xo:co#80:it#8:li#25:vt#3:@8=\EOM:DO=\E[%dB:K1=\EOq:K2=\EOr:K3=\EOs:K4=\EOp:K5=\EOn:LE=\E[%dD:RI=\E[%dC:UP=\E[%dA:ac=\140\140aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~:ae=^O:as=^N:bl=^G:cb=\E[1K:cd=\E[J:ce=\E[K:cl=\E[H\E[J:cm=\E[%i%d;%dH:cr=^M:cs=\E[%i%d;%dr:ct=\E[3g:do=^J:eA=\E(B\E)0:ho=\E[H:k0=\EOy:k1=\EOP:k2=\EOQ:k3=\EOR:k4=\EOS:k5=\EOt:k6=\EOu:k7=\EOv:k8=\EOl:k9=\EOw:k;=\EOx:kb=^H:kd=\EOB:ke=\E[?1l\E>:kl=\EOD:kr=\EOC:ks=\E[?1h\E=:ku=\EOA:le=^H:mb=\E[5m:md=\E[1m:me=\E[m\017:mr=\E[7m:nd=\E[C:is=\E<\E)0:r2=\E>\E[?3l\E[?4l\E[?5l\E[?7h\E[?8h:rc=\E8:sc=\E7:se=\E[m:sf=^J:so=\E[1;7m:sr=\EM:st=\EH:ta=^I:ue=\E[m:up=\E[A:us=\E[4m:
TZ=US/Eastern
UID=0
VISUAL=/bin/vi
notify=1


Same behavior with:

BASH=/bin/bash
BASH_VERSION=1.14.5(1)
COLUMNS=80
DISPLAY=:0
EDITOR=/bin/vi
EUID=0
GNUHELP=/usr/local/lib/gnuplot/gnuplot.gih
HISTFILE=
HISTFILESIZE=0
HISTSIZE=500
HOME=/root
HOSTTYPE=i386
IFS=

JAVA_HOME=/usr/java
LANG=en_US.88591
LD_LIBRARY_PATH=/lib:/usr/lib/:/opt/intel/compiler50/ia32/lib:/usr/X11R6/lib:/opt/Office50/lib:/usr/java/lib/i686
LESS=-MM
LIB=/usr/X11R6/lib:/usr/X11/lib
LINES=25
LOGNAME=root
LS_COLORS=
MAIL=/var/spool/mail/root
MAILCHECK=60
MANPATH=
MINICOM=-c on
NLSPATH=
OLDPWD=/root/assembly
OPENWINHOME=/usr/openwin
OPTERR=1
OPTIND=1
OSTYPE=Linux
PATH=
PPID=1
PRINTER=mcd
PS1=#
PS2=>
PS4=+
PS_SYSTEM_MAP=/System.map
PWD=/root
SHELL=/bin/bash
SHLVL=1
TERM=
TERMCAP=
TZ=
UID=0
VISUAL=
_=echo 1 2 3 4
notify=1


Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 13:00     ` Richard B. Johnson
  2003-04-28 13:22       ` Andreas Schwab
@ 2003-04-28 16:36       ` Mark Grosberg
  2003-04-28 17:19         ` Davide Libenzi
                           ` (2 more replies)
  2003-04-29 18:50       ` Timothy Miller
  2 siblings, 3 replies; 48+ messages in thread
From: Mark Grosberg @ 2003-04-28 16:36 UTC (permalink / raw)
  To: Richard B. Johnson; +Cc: linux-kernel



On Mon, 28 Apr 2003, Richard B. Johnson wrote:

> The Unix API provides execve(), fexecve(), execv(), execle(),
> execl(), execvp(), and execlp() for what you call 'exec'. So
> there is no 'fork and exec' as you state.

I'm well aware of this.

> The kernel provides one system call, execve(). All of the
> other functional changes are done with 'C' wrappers in the

As I am of this.

> 'C' runtime library. To make a generic fork-exec, would require
> that this code, or its functionality, be moved into the kernel.

And why pray tell could the kernel not supply a nexecve() and then C
wrappers be used to get the various versions?

> To save some processing time, most knowledgeable software
> engineers would use vfork(). This leaves the major time,
> the time necessary to load the new application into the
> new address space and begin its execution. This time could

I am also aware of vfork(). I've been using UNIX back when the mashey
shell was around.

The point of my system call is:

  (1) Save the extra overhead of vfork() and exec(). A single system
      call would still be faster.

  (2) Avoid the resulting file descriptor manipulations for setting
      up pipelines (dup's and closes).

  (3) Avoid having to do any execution of the child. vfork() shares the
      address space but there is still overhead in doing the setup of
      vfork().

And maybe on your 797.90 BogoMips super fast machine the extra syscall
doesn't matter. But on my current server hardware (16.59 BogoMIPS) it is a
savings. So either:

  (1) Buy me a new server so the syscall overhead isn't such a big deal
  (2) Go bother somebody else

There _are_ people you know who do run Linux on embedded devices where CPU
clocks are slow and saving several system calls could be a big deal.

I am not *just* talking about two syscalls here, I am also taking about
dup's and closes.

> be tens of milliseconds or even hundreds if the application
> is on a CD, floppy, a disk that hasn't been accessed yet,

I suppose you have never heard of demand paging in a
binary? Maybe you should give me a kernel lesson on that.

> You can measure the time for a system call by executing
> getpid() or something similar. It is in the noise compared
> to the time necessary to execute a program. Further, we

On what hardware? On what CPU architecture? And if you have 100 users
slamming away at their shells all day, that noise adds up.

> it can't be verified. What will be verified, though, is
> the increase in size of the kernel.

Then what about the sendfile() API! It's totally just a speed hack and a
simple mmap()/write() would probably be just as fast and is POSIX
compliant.

sendfile() just contributes to kernel bloat. It bloats every vfsops
structure and all the implementations are taking up valuable non-pageable
kernel space. So let's get rid of sendfile!

>     case 0:
>         if(*type == (char)'r')
>         {
>             dup2(file->pfd[1], STDOUT_FILENO);
>             (void)close(file->pfd[0]);

Two syscalls saved.

>         signal(SIGINT, SIG_IGN);
>         signal(SIGQUIT, SIG_IGN);

Two more saved (signal dispositions wouldn't be copied over, like in
exec).

>         execve(args[0], args, __environ);
>         exit(EXIT_FAILURE);

One more saved here.

Although there are ways around this, this code does *NOT* inform the
parent of the failure to load a process which could be different from the
real process returning EXIT_FAILURE.

> Clearly, some additional, non-generic, processing has to
> occur after the fork() and before execve(). For instance,
> in the parent it is mandatory that the file descriptor that
> is not being accessed by the parent be closed just as it

Ahem. So lets look at my original proposal which replaced the entire set
of fd's with a new set. So thats 5 system calls saved over your
implementation. Five transitions between user mode and kernel mode.

> system. That's why Unix breaks these functions into little
> pieces (primitives) so the writer has control over the

I'm not saying get rid of the primitives. I am saying that a fork-followed
by exec() where the file descriptor map is the only thing changed is such
a common operation that it should be built into the kernel as a single
syscall to save the overhead of calling the primitives.

> Reducing the number of lines of code may be a good thing.
> However, the proper place for that is in the 'C' library,
> not the kernel.

I am not talking about reducing the number of lines? Can you read my
original post please. We are talking about the overhead of syscalls!

> Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).

Linux version 2.0.39 on an i486sx machine (16.59 BogoMIPS) and usually 5-6
active users at any instant (plus a dozen services).



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 16:36       ` Mark Grosberg
@ 2003-04-28 17:19         ` Davide Libenzi
  2003-04-28 18:28         ` Craig Ruff
  2003-05-06  2:48         ` Miles Bader
  2 siblings, 0 replies; 48+ messages in thread
From: Davide Libenzi @ 2003-04-28 17:19 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: Linux Kernel Mailing List

On Mon, 28 Apr 2003, Mark Grosberg wrote:

> The point of my system call is:
>
>   (1) Save the extra overhead of vfork() and exec(). A single system
>       call would still be faster.
>
>   (2) Avoid the resulting file descriptor manipulations for setting
>       up pipelines (dup's and closes).
>
>   (3) Avoid having to do any execution of the child. vfork() shares the
>       address space but there is still overhead in doing the setup of
>       vfork().

You still have to do a fork inside the kernel. For sure you don't want to
call do_execve() in the parent task context ;) So there's no save there.
Saving a few syscalls gives you almost nothing compared to the cost of
spawning a new process and to the machine resource consumption of the task
itself. The only place where this might have a measurable impact is when
you have the parent process with 100K fds open and still you have to have
a pretty short lived child process to have a measurable impact. Supposed
that this will be considered as worth, for sure you don't want to
introduce another syscall not respecting the POSIX one. That, if I did
read it correctly, sucks from that point of view. It suck because it is my
understanding that you have to explicitly drop close actions inside the
file actions list. So, suppose you have N fds and you want to close N-2
fds and dup 2 of them, you need N-2 close actions and 2 dup actions. So
you'll end up having an O(N) in user space to build the action list, and
at least an O(N) in kernel space to walk through it again. Actually, since
it is my understanding that the posix_spawn() interface assume an
all-fds-open by default, you have a few options to setup the child file
list :

1) You have an std clone() to copy all files to the new task struct
	( at *least* O(N) ) and you walk through the POSIX file action
	list to apply close/dup ( O(N) )

2) You have a special clone that starts with a brand new file table ( cost == 0 )
	and you walk though the old files array ( O(N) ) by seeking if the
	current file has actions passed by the caller. This is very bad
	since the action list is not indexed, so going in this direction
	w/out building some kind of index might be as O(N^2). Suppose you
	build an index that it'll give you an O(1), you still have to
	spend *at least* O(N) to build the index itself.

3) Other ways are possible but the minimum cost is at least O(2*N).


What would give the POSIX interface a boost would be to have a
default-all-closed option. In this case, in our example, we will have the
new process created with a blank file table ( cost == 0 ) plus 2 dups done
on the parent file table.




- Davide


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 16:36       ` Mark Grosberg
  2003-04-28 17:19         ` Davide Libenzi
@ 2003-04-28 18:28         ` Craig Ruff
  2003-05-06  2:48         ` Miles Bader
  2 siblings, 0 replies; 48+ messages in thread
From: Craig Ruff @ 2003-04-28 18:28 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: Richard B. Johnson, linux-kernel

> On Mon, 28 Apr 2003, Richard B. Johnson wrote:
> 
> > The Unix API provides execve(), fexecve(), execv(), execle(),
> > execl(), execvp(), and execlp() for what you call 'exec'. So
> > there is no 'fork and exec' as you state.

By the way, the latest ISO/IEC 9945-1:2002 POSIX standard defines the
posix_spawn* functions which provide this fork/exec style of operation.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:59         ` Mark Grosberg
  2003-04-28  2:27           ` Miles Bader
@ 2003-04-28 19:07           ` dean gaudet
  1 sibling, 0 replies; 48+ messages in thread
From: dean gaudet @ 2003-04-28 19:07 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: Måns Rullgård, Larry McVoy, linux-kernel

On Sun, 27 Apr 2003, Mark Grosberg wrote:

> On Sun, 27 Apr 2003, dean gaudet wrote:
>
> > the only time fork-exec is inefficient, given the existence of vfork, is
> > when you need to fork a process which has a lot of fd.  and by "a lot" i
> > mean thousands.
>
> Depends at what level of optimization you are talking about. I consider a
> syscall an expensive operation.

"expensive syscalls" are a mistake of non-linux unixes :)

> The transition from user to kernel mode,
> the setup and retrieval of parameters all cost (and some architectures are
> worse at it than i386).
>
> > but even this has a potential work-around using procfs -- use clone() to
> > get the vfork semantics without also copying the fd array.  then open
> > /proc/$ppid/fd/N for any file descriptors you want opened in the forked
> > process.
>
> That is still quite a few syscalls (and some path walking for each file
> descriptor)... I was proposing to get around the syscall overhead which
> on large multi-user systems (or webservers running lots of CGI) could be
> significant.

it's no more syscalls than are already required to set up stdin, out, and
error... the open() calls replace dup2() calls.

if the path walking is a problem then create a openparent(int parent_fd)
syscall... which would have to do all the same permissions checking that
using an open("/proc/ppid/...") would.

note that for this to be generically useful for CGI you also need to be
able to setuid(), and chdir().  this is why NT CreateProcess has a zillion
arguments -- and why it's really suspect...

-dean

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  0:57 [RFD] Combined fork-exec syscall Mark Grosberg
                   ` (5 preceding siblings ...)
  2003-04-28 12:45 ` Matthias Andree
@ 2003-04-29  1:05 ` Rafael Costa dos Santos
  2003-04-28  1:19   ` Mark Grosberg
  6 siblings, 1 reply; 48+ messages in thread
From: Rafael Costa dos Santos @ 2003-04-29  1:05 UTC (permalink / raw)
  To: linux-kernel, Mark Grosberg

Do you have some work done on this issue ?


4/27/03 9:57:12 PM, Mark Grosberg <mark@nolab.conman.org> wrote:

>
>Hello all,
>
>Is there any interest in a single system call that will perform both a
>fork() and exec()? Could this save some extra work of doing a
>copy_mm(), copy_signals(), etc?
>
>I would think on large, multi-user systems that are spawning processes all
>day, this might improve performance if the shells on such a system were
>patched.
>
>Perhaps a system call like:
>
>   pid_t spawn(const char *p_path,
>               const char *argv[],
>               const char *envp[],
>               const int   filp[]);
>
>The filp array would allow file descriptors to be redirected. It could be
>terminated by a -1 and reference the file descriptors of the current
>process (this could also potentially save some dup() syscalls).
>
>If any of these parameters (exclusing p_path) are NULL, then the
>appropriate values are taken from the current process.
>
>I originally was thinking of a name of fexec() for such a syscall, but
>since there are already "f" variant syscalls (fchmod, fstat, ...) that an
>fexec() would make more sense about executing an already open file, so the
>name spawn() came to mind.
>
>I know almost all of my fork()-exec() code does almost the same thing. I
>guess vfork() was a potential solution, but this somehow seems cleaner
>(and still may be more efficient than having to issue two syscalls)...
>the downside is, of course, another syscall.
>
>L8r,
>Mark G.
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/
>
>



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:19   ` Mark Grosberg
@ 2003-04-29  1:29     ` Rafael Costa dos Santos
  0 siblings, 0 replies; 48+ messages in thread
From: Rafael Costa dos Santos @ 2003-04-29  1:29 UTC (permalink / raw)
  To: linux-kernel

If any help needed. Tell me. I would be glad to help.


4/27/03 10:19:52 PM, Mark Grosberg <mark@nolab.conman.org> wrote:

>
>
>On Mon, 28 Apr 2003, Rafael Costa dos Santos wrote:
>
>> Do you have some work done on this issue ?
>
>Nope. Was just thinking. To be honest, I stopped doing kernel development
>in the 2.0 days (but I keep up with LKML). This would be for 2.5 but
>could probably be backported to 2.4 without too much trouble.
>
>But tomorrow I may very well find myself out of a job (big surprise), and
>if so, I'll setup a Linux box (since I've mostly been using OpenBSD
>these days) and work on this (for i386 to begin with) if  there is enough
>interest.
>
>L8r,
>Mark G.
>




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:28   ` Mark Grosberg
@ 2003-04-29  2:01     ` Rafael Costa dos Santos
  0 siblings, 0 replies; 48+ messages in thread
From: Rafael Costa dos Santos @ 2003-04-29  2:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List

POSIX spawn() 

http://mail.gnu.org/archive/html/bug-hurd/2001-07/msg00107.html



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  2:42     ` Werner Almesberger
  2003-04-28  6:35       ` Mark Grosberg
@ 2003-04-29  2:47       ` Rafael Santos
  2003-04-28  3:20         ` Werner Almesberger
  1 sibling, 1 reply; 48+ messages in thread
From: Rafael Santos @ 2003-04-29  2:47 UTC (permalink / raw)
  To: linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="us-ascii", Size: 1563 bytes --]

This is not the point.


4/27/03 11:42:15 PM, Werner Almesberger <wa@almesberger.net> wrote:

>Mark Grosberg wrote:
>>    fmap[0] = in[0];                     /* STDIN  */
>>    fmap[1] = out[1];                    /* STDOUT */
>>    fmap[2] = open("/dev/null", O_RDWR); /* STDERR */
>>    fmap[3] = -1;                        /* end    */
>> 
>>    p = nexec("/bin/cat",
>>              null_argv,
>>              NULL,
>>              filmap);
>
>How about
>
>    fdrplc(3,fmap);
>    exec("/bin/cat",...);
>
>?
>
>0) System call names must be short and cryptic :-)
>1) Requiring the kernel to iterate over the array element by element
>   in order to find out how big it is may be inefficient. Better to
>   pass the length.
>2) System call overhead is marginal, particularly in this case.
>3) There may be other uses than exec(2), where a way for closeing
>   all fds and getting a new set may be useful.
>
>- Werner
>
>-- 
>  _________________________________________________________________________
> / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
>/_http://www.almesberger.net/____________________________________________/
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/
>
>
Rafael Costa dos Santos
ThinkFreak Comércio e Soluções em Hardware e Software
Rio de Janeiro / RJ / Brazil
rafael@thinkfreak.com.br
+ 55 21 9432-9266



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 13:00     ` Richard B. Johnson
  2003-04-28 13:22       ` Andreas Schwab
  2003-04-28 16:36       ` Mark Grosberg
@ 2003-04-29 18:50       ` Timothy Miller
  2 siblings, 0 replies; 48+ messages in thread
From: Timothy Miller @ 2003-04-29 18:50 UTC (permalink / raw)
  To: root; +Cc: Mark Grosberg, linux-kernel



Richard B. Johnson wrote:

>  
>
>To save some processing time, most knowledgeable software
>engineers would use vfork(). This leaves the major time,
>the time necessary to load the new application into the
>new address space and begin its execution. This time could
>be tens of milliseconds or even hundreds if the application
>is on a CD, floppy, a disk that hasn't been accessed yet,
>or the network. In the usuall situation where processing
>must be performed between the fork() and the execve(), you
>can't use vfork().
>
>You can measure the time for a system call by executing
>getpid() or something similar. It is in the noise compared
>to the time necessary to execute a program. Further, we
>get to the situation where one can't even verify a supposed
>speed increase because the system call overhead is in the
>noise. Great, one can claim any improvement they want and
>it can't be verified. What will be verified, though, is
>the increase in size of the kernel.
>
>
>  
>

So, you can't save any time _for_that_particular_process_ by speeding up 
the fork.  Granted.  But that wasted CPU time could be better spent 
working on some unrelated process that is not waiting on I/O.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28  1:36     ` Måns Rullgård
  2003-04-28  1:45       ` Mark Grosberg
  2003-04-28  1:49       ` dean gaudet
@ 2003-05-01 13:14       ` Jakob Oestergaard
  2 siblings, 0 replies; 48+ messages in thread
From: Jakob Oestergaard @ 2003-05-01 13:14 UTC (permalink / raw)
  To: Måns Rullgård; +Cc: Mark Grosberg, Larry McVoy, linux-kernel

On Mon, Apr 28, 2003 at 03:36:17AM +0200, Måns Rullgård wrote:
> Mark Grosberg <mark@nolab.conman.org> writes:
...
> > But yeah, basically, something similar to NT's CreateProcess(). For the
> > cases where the one-step process creation is sufficient.
> 
> Is that the call that takes dozens of parameters?  Copying :-) that
> is, IMHO, straight against the UNIX philosophy.

I agree with Måns completely.

CreateProcess() is *horrible*.  It takes 10 arguments, several of them
being pointers to structures.  Ugh!

Besides, the CreateProcessAsUser() call (which takes 13 arguments IIRC)
demonstrates why such all-in-one-and-a-kitchen-sink calls are
fundamentally flawed.

In the few cases where they do not demand unnecessary arguments, they
simply lack the functionality that is actually needed.

I would argue that any time spent on replicating such monsters in Linux
would be far better spent optimizing the basic calls
(exec/fork/dup/close/fcntl/...) instead.


That was my 0.02 Euro on that one.

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
  2003-04-28 16:36       ` Mark Grosberg
  2003-04-28 17:19         ` Davide Libenzi
  2003-04-28 18:28         ` Craig Ruff
@ 2003-05-06  2:48         ` Miles Bader
  2 siblings, 0 replies; 48+ messages in thread
From: Miles Bader @ 2003-05-06  2:48 UTC (permalink / raw)
  To: Mark Grosberg; +Cc: Richard B. Johnson, linux-kernel

Mark Grosberg <mark@nolab.conman.org> writes:
> And maybe on your 797.90 BogoMips super fast machine the extra syscall
> doesn't matter. But on my current server hardware (16.59 BogoMIPS) it is a
> savings.

Are you sure it's really all that bad?

The machines I use are even slower (~6 bogomips), but system calls still
seem pretty fast; I've measured them as having about a total
65-instruction overhead on my arch -- which is a lot slower than a
function call to be sure, but presumably the actual work done by the
system call ends up being more.

-Miles
-- 
`The suburb is an obsolete and contradictory form of human settlement'

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFD] Combined fork-exec syscall.
@ 2003-04-28  3:03 Davide Libenzi
  0 siblings, 0 replies; 48+ messages in thread
From: Davide Libenzi @ 2003-04-28  3:03 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: Linux Kernel Mailing List


On Sun, 27 Apr 2003, Ulrich Drepper wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Davide Libenzi wrote:
>
> > This is very much library stuff. I don't think that saving a couple of
> > system calls will give you an edge, expecially when we're talking of
> > spawning another process. Even if the process itself does nothing but
> > return. Ulrich might be eventually interested ...
>
> POSIX has a spawn interface, see <spawn.h> on modern systems.
                                                ^^^^^^^^^^^^^^
( You want to make me pay for the last question about swapcontext in our
old glibc environment, don't you ? ;)

If I read the specification correctly, the posix_spwan() interface will
not solve scalability problems due to huge file tables. If I read it
correctly, and if you have M files currently opened and you want to
keep/dup only three files, you have to drop (M-3) close actions plus 3 dup
actions. To solve such problem you'd need a default-all-closed option plus
3 dup actions. That inside the kernel will translate in a brand new file
table plus 3 links.




- Davide


^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2003-05-06  2:36 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-28  0:57 [RFD] Combined fork-exec syscall Mark Grosberg
2003-04-28  0:59 ` Larry McVoy
2003-04-28  1:16   ` Mark Grosberg
2003-04-28  1:36     ` Måns Rullgård
2003-04-28  1:45       ` Mark Grosberg
2003-04-28  1:49       ` dean gaudet
2003-04-28  1:59         ` Mark Grosberg
2003-04-28  2:27           ` Miles Bader
2003-04-28 19:07           ` dean gaudet
2003-05-01 13:14       ` Jakob Oestergaard
2003-04-28  1:17 ` Davide Libenzi
2003-04-28  1:28   ` Mark Grosberg
2003-04-29  2:01     ` Rafael Costa dos Santos
2003-04-28  1:41   ` Ulrich Drepper
2003-04-28  1:49     ` Mark Grosberg
2003-04-28  2:19       ` Ulrich Drepper
2003-04-28  6:59       ` Kai Henningsen
2003-04-28  1:35 ` dean gaudet
2003-04-28  1:43   ` Mark Grosberg
2003-04-28  3:44     ` Mark Mielke
2003-04-28  5:16       ` Jamie Lokier
2003-04-28  2:38   ` Davide Libenzi
2003-04-28  2:09 ` Richard B. Johnson
2003-04-28  2:12   ` Mark Grosberg
2003-04-28  2:42     ` Werner Almesberger
2003-04-28  6:35       ` Mark Grosberg
2003-04-29  2:47       ` Rafael Santos
2003-04-28  3:20         ` Werner Almesberger
2003-04-28 13:00     ` Richard B. Johnson
2003-04-28 13:22       ` Andreas Schwab
2003-04-28 13:57         ` Richard B. Johnson
2003-04-28 13:57           ` Andreas Schwab
2003-04-28 14:16             ` Richard B. Johnson
2003-04-28 14:38               ` Valdis.Kletnieks
2003-04-28 14:56                 ` Richard B. Johnson
2003-04-28 14:42               ` Andreas Schwab
2003-04-28 16:36       ` Mark Grosberg
2003-04-28 17:19         ` Davide Libenzi
2003-04-28 18:28         ` Craig Ruff
2003-05-06  2:48         ` Miles Bader
2003-04-29 18:50       ` Timothy Miller
2003-04-28  2:32   ` Werner Almesberger
2003-04-28  7:40 ` Mirar
2003-04-28 12:45 ` Matthias Andree
2003-04-29  1:05 ` Rafael Costa dos Santos
2003-04-28  1:19   ` Mark Grosberg
2003-04-29  1:29     ` Rafael Costa dos Santos
2003-04-28  3:03 Davide Libenzi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).