From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751796AbaF1GnI (ORCPT ); Sat, 28 Jun 2014 02:43:08 -0400
Received: from mail-wi0-f176.google.com ([209.85.212.176]:63433 "EHLO
        mail-wi0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751205AbaF1GnF (ORCPT ); Sat, 28 Jun 2014 02:43:05 -0400
MIME-Version: 1.0
In-Reply-To:
References: <1403913966-4927-1-git-send-email-ast@plumgrid.com>
        <1403913966-4927-4-git-send-email-ast@plumgrid.com>
Date: Fri, 27 Jun 2014 23:43:03 -0700
Message-ID:
Subject: Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps
From: Alexei Starovoitov
To: Andy Lutomirski
Cc: "David S. Miller", Ingo Molnar, Linus Torvalds, Steven Rostedt,
        Daniel Borkmann, Chema Gonzalez, Eric Dumazet, Peter Zijlstra,
        Arnaldo Carvalho de Melo, Jiri Olsa, Thomas Gleixner,
        "H. Peter Anvin", Andrew Morton, Kees Cook, Linux API,
        Network Development, "linux-kernel@vger.kernel.org"
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 27, 2014 at 11:25 PM, Andy Lutomirski wrote:
> On Fri, Jun 27, 2014 at 10:55 PM, Alexei Starovoitov wrote:
>> On Fri, Jun 27, 2014 at 5:16 PM, Andy Lutomirski wrote:
>>> On Fri, Jun 27, 2014 at 5:05 PM, Alexei Starovoitov wrote:
>>>> BPF syscall is a demux for different BPF related commands.
>>>>
>>>> 'maps' is a generic storage of different types for sharing data between kernel
>>>> and userspace.
>>>>
>>>> The maps can be created/deleted from user space via BPF syscall:
>>>> - create a map with given id, type and attributes
>>>>   map_id = bpf_map_create(int map_id, map_type, struct nlattr *attr, int len)
>>>>   returns positive map id or negative error
>>>>
>>>> - delete map with given map id
>>>>   err = bpf_map_delete(int map_id)
>>>>   returns zero or negative error
>>>
>>> What's the scope of "id"?
>>> How is it secured?
>>
>> the map and program id space is global and it's cap_sys_admin only.
>> There is no pressing need to do it with per-user limits.
>> So the whole thing is root only for now.
>>
>
> Hmm. This may be unpleasant if you ever want to support non-root or
> namespaced operation.

I think it will be easy to extend it per namespace when we lift the
root-only restriction. It will be seamless, without user API changes.

> How hard would it be to give these things fds?

You mean programs/maps auto-terminate when the creator process exits?
I thought about it and it's appealing at first glance, but it doesn't fit
the model of existing tracepoint events, which are global.
The programs attached to events need to live without a 'daemon' hanging
around. Therefore I picked a 'kernel module'-like method.
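
The lifetime question at the end of the thread can be made concrete with a
toy user-space model. This is only an illustrative sketch, not kernel code and
not the proposed syscall: the class names IdRegistry and FdTable, their
methods, and the pid and map names are all hypothetical. It assumes the
id-based scheme keeps a map until an explicit delete, and that an fd-based
scheme would free a map once the last process holding it exits.

```python
class IdRegistry:
    """Global id space: a map lives until it is deleted explicitly (hypothetical model)."""

    def __init__(self):
        self._maps = {}  # map id -> map contents

    def create(self, map_id):
        # Mirrors "returns positive map id or negative error".
        if map_id <= 0 or map_id in self._maps:
            return -1
        self._maps[map_id] = {}
        return map_id

    def delete(self, map_id):
        # Mirrors "returns zero or negative error".
        return 0 if self._maps.pop(map_id, None) is not None else -1

    def exists(self, map_id):
        return map_id in self._maps


class FdTable:
    """fd-style handles: a map is freed when its last holder exits (hypothetical model)."""

    def __init__(self):
        self._holders = {}  # map name -> set of pids holding a handle

    def open(self, pid, name):
        self._holders.setdefault(name, set()).add(pid)

    def process_exit(self, pid):
        # Drop every handle the process held; free maps left with no holder.
        for name in list(self._holders):
            self._holders[name].discard(pid)
            if not self._holders[name]:
                del self._holders[name]

    def exists(self, name):
        return name in self._holders


ids = IdRegistry()
ids.create(1)               # nothing ties map 1 to the creating process

fds = FdTable()
fds.open(pid=100, name="m")
fds.process_exit(100)       # the only holder goes away

print(ids.exists(1), fds.exists("m"))  # prints: True False
```

In this model the id-registered map outlives its creator, which is the
property wanted for programs attached to global tracepoint events, while the
handle-based map disappears as soon as its last holder exits.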