From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S261184AbTD1QYQ (ORCPT ); Mon, 28 Apr 2003 12:24:16 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S261188AbTD1QYQ (ORCPT ); Mon, 28 Apr 2003 12:24:16 -0400
Received: from user72.209.42.38.dsli.com ([209.42.38.72]:64139 "EHLO nolab.conman.org")
	by vger.kernel.org with ESMTP id S261184AbTD1QYN (ORCPT );
	Mon, 28 Apr 2003 12:24:13 -0400
Date: Mon, 28 Apr 2003 12:36:30 -0400 (EDT)
From: Mark Grosberg
To: "Richard B. Johnson"
Cc: linux-kernel@vger.kernel.org
Subject: Re: [RFD] Combined fork-exec syscall.
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 28 Apr 2003, Richard B. Johnson wrote:

> The Unix API provides execve(), fexecve(), execv(), execle(),
> execl(), execvp(), and execlp() for what you call 'exec'. So
> there is no 'fork and exec' as you state.

I'm well aware of this.

> The kernel provides one system call, execve(). All of the
> other functional changes are done with 'C' wrappers in the

As I am of this.

> 'C' runtime library. To make a generic fork-exec, would require
> that this code, or its functionality, be moved into the kernel.

And why, pray tell, could the kernel not supply a nexecve() and then
C wrappers be used to get the various versions?

> To save some processing time, most knowledgeable software
> engineers would use vfork(). This leaves the major time,
> the time necessary to load the new application into the
> new address space and begin its execution. This time could

I am also aware of vfork(). I've been using UNIX since back when the
Mashey shell was around. The point of my system call is:

(1) Save the extra overhead of vfork() and exec(). A single system
    call would still be faster.
(2) Avoid the resulting file descriptor manipulations for setting up
    pipelines (dup's and closes).
(3) Avoid having to do any execution of the child. vfork() shares the
    address space but there is still overhead in doing the setup of
    vfork().

And maybe on your 797.90 BogoMips super fast machine the extra syscall
doesn't matter. But on my current server hardware (16.59 BogoMIPS) it
is a savings. So either:

(1) Buy me a new server so the syscall overhead isn't such a big deal
(2) Go bother somebody else

There _are_ people, you know, who do run Linux on embedded devices
where CPU clocks are slow and saving several system calls could be a
big deal. I am not *just* talking about two syscalls here, I am also
talking about dup's and closes.

> be tens of milliseconds or even hundreds if the application
> is on a CD, floppy, a disk that hasn't been accessed yet,

I suppose you have never heard of demand paging in a binary? Maybe you
should give me a kernel lesson on that.

> You can measure the time for a system call by executing
> getpid() or something similar. It is in the noise compared
> to the time necessary to execute a program. Further, we

On what hardware? On what CPU architecture? And if you have 100 users
slamming away at their shells all day, that noise adds up.

> it can't be verified. What will be verified, though, is
> the increase in size of the kernel.

Then what about the sendfile() API? It's totally just a speed hack, and
a simple mmap()/write() would probably be just as fast and is POSIX
compliant. sendfile() just contributes to kernel bloat. It bloats every
vfsops structure and all the implementations are taking up valuable
non-pageable kernel space. So let's get rid of sendfile!

> case 0:
> if(*type == (char)'r')
> {
> dup2(file->pfd[1], STDOUT_FILENO);
> (void)close(file->pfd[0]);

Two syscalls saved.

> signal(SIGINT, SIG_IGN);
> signal(SIGQUIT, SIG_IGN);

Two more saved (signal dispositions wouldn't be copied over, like in
exec).

> execve(args[0], args, __environ);
> exit(EXIT_FAILURE);

One more saved here. Although there are ways around this, this code
does *NOT* inform the parent that the program failed to load, which is
a different condition from the real process returning EXIT_FAILURE.

> Clearly, some additional, non-generic, processing has to
> occur after the fork() and before execve(). For instance,
> in the parent it is mandatory that the file descriptor that
> is not being accessed by the parent be closed just as it

Ahem. So let's look at my original proposal, which replaced the entire
set of fd's with a new set. So that's 5 system calls saved over your
implementation. Five transitions between user mode and kernel mode.

> system. That's why Unix breaks these functions into little
> pieces (primitives) so the writer has control over the

I'm not saying get rid of the primitives. I am saying that a fork()
followed by exec() where the file descriptor map is the only thing
changed is such a common operation that it should be built into the
kernel as a single syscall to save the overhead of calling the
primitives.

> Reducing the number of lines of code may be a good thing.
> However, the proper place for that is in the 'C' library,
> not the kernel.

I am not talking about reducing the number of lines. Can you read my
original post, please? We are talking about the overhead of syscalls!

> Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).

Linux version 2.0.39 on an i486sx machine (16.59 BogoMIPS) and usually
5-6 active users at any instant (plus a dozen services).
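
To make the "descriptor map plus exec" shape concrete, here is a rough
sketch using the existing POSIX posix_spawn() interface. This is not the
proposed nexecve(), whose signature is not given in this message, and
posix_spawn() is a C library call that may itself be built on
fork()/vfork() plus execve(), so it shows only the interface shape and
not the syscall savings. The helper name spawn_reader() is made up for
illustration, and the sketch assumes descriptors 0-2 are already open in
the parent.

```c
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>   /* for the caller's waitpid() */
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper (not from the thread): start argv[0] with its
 * stdout connected to a pipe. Returns the read end of the pipe, or -1
 * on failure. The child's descriptor changes are described up front as
 * a list of actions instead of being coded between fork() and execve().
 */
static int spawn_reader(char *const argv[], pid_t *pid_out)
{
    int pfd[2];
    int err;
    posix_spawn_file_actions_t fa;

    assert(argv != NULL && argv[0] != NULL && pid_out != NULL);

    if (pipe(pfd) != 0)
        return -1;

    if (posix_spawn_file_actions_init(&fa) != 0) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }

    /* Same effect as the dup2()/close() calls in the quoted child code. */
    err = posix_spawn_file_actions_adddup2(&fa, pfd[1], STDOUT_FILENO);
    if (err == 0)
        err = posix_spawn_file_actions_addclose(&fa, pfd[0]);
    if (err == 0)
        err = posix_spawn_file_actions_addclose(&fa, pfd[1]);

    /* argv[0] is used as the path; no PATH search is done here. */
    if (err == 0)
        err = posix_spawn(pid_out, argv[0], &fa, NULL, argv, environ);

    posix_spawn_file_actions_destroy(&fa);
    close(pfd[1]);              /* the parent keeps only the read end */

    if (err != 0) {
        close(pfd[0]);
        return -1;
    }
    return pfd[0];
}
```

A caller would pass an argv such as { "/bin/sh", "-c", "echo hello", NULL },
read from the returned descriptor, then close() it and waitpid() on the
pid. Whether a failed exec is reported through posix_spawn()'s return
value or only through the child's exit status depends on the C library.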