From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752054AbbATWk5 (ORCPT ); Tue, 20 Jan 2015 17:40:57 -0500
Received: from mail-lb0-f172.google.com ([209.85.217.172]:54630 "EHLO
        mail-lb0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751502AbbATWkz (ORCPT ); Tue, 20 Jan 2015 17:40:55 -0500
MIME-Version: 1.0
In-Reply-To: <1421747878-30744-1-git-send-email-famz@redhat.com>
References: <1421747878-30744-1-git-send-email-famz@redhat.com>
From: Andy Lutomirski
Date: Tue, 20 Jan 2015 14:40:32 -0800
Message-ID:
Subject: Re: [PATCH RFC 0/6] epoll: Introduce new syscall "epoll_mod_wait"
To: Fam Zheng
Cc: "linux-kernel@vger.kernel.org", Thomas Gleixner, Ingo Molnar,
        "H. Peter Anvin", X86 ML, Alexander Viro, Andrew Morton, Kees Cook,
        David Herrmann, Alexei Starovoitov, Miklos Szeredi, David Drysdale,
        Oleg Nesterov, "David S. Miller", Vivek Goyal, Mike Frysinger,
        "Theodore Ts'o", Heiko Carstens, Rasmus Villemoes, Rashika Kheria,
        Hugh Dickins, Mathieu Desnoyers, Peter Zijlstra, Linux FS Devel,
        Linux API, Josh Triplett, "Michael Kerrisk (man-pages)", Paolo Bonzini
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 20, 2015 at 1:57 AM, Fam Zheng wrote:
> This adds a new system call, epoll_mod_wait. It's described as below:
>
> NAME
>        epoll_mod_wait - modify and wait for I/O events on an epoll file
>                         descriptor
>
> SYNOPSIS
>
>        int epoll_mod_wait(int epfd, int flags,
>                           int ncmds, struct epoll_mod_cmd *cmds,
>                           struct epoll_wait_spec *spec);
>
> DESCRIPTION
>
>        The epoll_mod_wait() system call can be seen as an enhanced combination
>        of several epoll_ctl(2) calls, which are followed by an epoll_pwait(2)
>        call.
>        It is superior in two cases:
>
>        1) When epoll_ctl(2) are followed by epoll_wait(2), using epoll_mod_wait
>           will save context switches between user mode and kernel mode;
>
>        2) When you need higher precision than microsecond for wait timeout.
>
>        The epoll_ctl(2) operations are embedded into this call with ncmds
>        and cmds. The latter is an array of command structs:
>
>            struct epoll_mod_cmd {
>
>                   /* Reserved flags for future extension, must be 0 for now. */
>                   int flags;
>
>                   /* The same as epoll_ctl() op parameter. */
>                   int op;
>
>                   /* The same as epoll_ctl() fd parameter. */
>                   int fd;
>
>                   /* The same as the "events" field in struct epoll_event. */
>                   uint32_t events;
>
>                   /* The same as the "data" field in struct epoll_event. */
>                   uint64_t data;
>
>                   /* Output field, will be set to the return code once this
>                    * command is executed by kernel */
>                   int error;
>            };

I would add an extra u32 at the end so that the structure size will be
a multiple of 8 bytes on all platforms.

>
>        There is no guarantee that all the commands are executed in order. Only
>        if all the commands are successfully executed (all the error fields are
>        set to 0), events are polled.

If this doesn't happen, what error is returned?

>            struct epoll_wait_spec {
>
>                   /* The same as "maxevents" in epoll_pwait() */
>                   int maxevents;
>
>                   /* The same as "events" in epoll_pwait() */
>                   struct epoll_event *events;
>
>                   /* Which clock to use for timeout */
>                   int clockid;
>
>                   /* Maximum time to wait if there is no event */
>                   struct timespec timeout;
>
>                   /* The same as "sigmask" in epoll_pwait() */
>                   sigset_t *sigmask;
>
>                   /* The same as "sigsetsize" in epoll_pwait() */
>                   size_t sigsetsize;
>            } EPOLL_PACKED;

I think the convention is to align the structure's fields manually
rather than declaring it to be packed.

>
> RETURN VALUE
>
>        When any error occurs, epoll_mod_wait() returns -1 and errno is set
>        appropriately.
>        All the "error" fields in cmds are unchanged before they
>        are executed, and if any cmds are executed, the "error" fields are set
>        to a return code accordingly. See also epoll_ctl for more details of the
>        return code.

Does this mean that callers should initialize the error fields to an
impossible value first so they can tell which commands were executed?

>
>        When successful, epoll_mod_wait() returns the number of file
>        descriptors ready for the requested I/O, or zero if no file descriptor
>        became ready during the requested timeout milliseconds.
>
>        If spec is NULL, it returns 0 if all the commands are successful, and -1
>        if an error occurred.
>
> ERRORS
>
>        These errors apply on either the return value of epoll_mod_wait or error
>        status for each command, respectively.

Please clarify which errors are returned overall and which are per-command.

Thanks,
Andy
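
To make two of the review points concrete, here is a rough sketch, not
taken from the patch series. It assumes a hypothetical variant of the
command struct (epoll_mod_cmd_padded) that uses fixed-width fields plus
the extra 32-bit pad, and a hypothetical sentinel (CMD_NOT_EXECUTED) that
a caller could store in every "error" field before the call, on the
assumption that the kernel only ever writes 0 or an errno-style code
there. The helper names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the proposed command struct with fixed-width
 * fields and one trailing pad word. Not the ABI defined by the patch. */
struct epoll_mod_cmd_padded {
	int32_t  flags;   /* reserved, must be 0 */
	int32_t  op;      /* same as the epoll_ctl() op parameter */
	int32_t  fd;      /* same as the epoll_ctl() fd parameter */
	uint32_t events;  /* same as epoll_event.events */
	uint64_t data;    /* same as epoll_event.data */
	int32_t  error;   /* output: per-command return code */
	uint32_t pad;     /* explicit padding instead of a packed attribute */
};

/* Without "pad", an ABI that aligns uint64_t to 4 bytes would give a
 * 28-byte struct while others give 32. With it, the fields sit at the
 * same offsets everywhere and the size is a multiple of 8. */
_Static_assert(offsetof(struct epoll_mod_cmd_padded, data) == 16,
	       "data must start on an 8-byte boundary");
_Static_assert(sizeof(struct epoll_mod_cmd_padded) == 32,
	       "size must not depend on the ABI");

/* Assumed sentinel: neither 0 nor a plausible errno value. */
#define CMD_NOT_EXECUTED INT32_MAX

/* Mark every command as "not yet executed" before making the call. */
static inline void mark_unexecuted(struct epoll_mod_cmd_padded *cmds, int ncmds)
{
	for (int i = 0; i < ncmds; i++)
		cmds[i].error = CMD_NOT_EXECUTED;
}

/* After the call, count the commands whose error field was overwritten. */
static inline int count_executed(const struct epoll_mod_cmd_padded *cmds,
				 int ncmds)
{
	int n = 0;

	for (int i = 0; i < ncmds; i++)
		if (cmds[i].error != CMD_NOT_EXECUTED)
			n++;
	return n;
}
```

A caller would run mark_unexecuted() on the array, issue the call, and
then use count_executed() (or inspect individual entries) to see which
commands ran. Whether user space should need this at all is exactly the
open question raised in the review.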