On Wed, Oct 26, 2016 at 10:03:09PM +0200, Mickaël Salaün wrote: > On 26/10/2016 21:01, Jann Horn wrote: > > On Wed, Oct 26, 2016 at 08:56:39AM +0200, Mickaël Salaün wrote: > >> This new arraymap looks like a set and brings new properties: > >> * strong typing of entries: the eBPF functions get the array type of > >> elements instead of CONST_PTR_TO_MAP (e.g. > >> CONST_PTR_TO_LANDLOCK_HANDLE_FS); > >> * force sequential filling (i.e. replace or append-only update), which > >> allow quick browsing of all entries. > >> > >> This strong typing is useful to statically check if the content of a map > >> can be passed to an eBPF function. For example, Landlock use it to store > >> and manage kernel objects (e.g. struct file) instead of dealing with > >> userland raw data. This improve efficiency and ensure that an eBPF > >> program can only call functions with the right high-level arguments. > >> > >> The enum bpf_map_handle_type list low-level types (e.g. > >> BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) which are identified when > >> updating a map entry (handle). This handle types are used to infer a > >> high-level arraymap type which are listed in enum bpf_map_array_type > >> (e.g. BPF_MAP_ARRAY_TYPE_LANDLOCK_FS). > >> > >> For now, this new arraymap is only used by Landlock LSM (cf. next > >> commits) but it could be useful for other needs. > >> > >> Changes since v3: > >> * make handle arraymap safe (RCU) and remove buggy synchronize_rcu() > >> * factor out the arraymay walk > >> > >> Changes since v2: > >> * add a RLIMIT_NOFILE-based limit to the maximum number of arraymap > >> handle entries (suggested by Andy Lutomirski) > >> * remove useless checks > >> > >> Changes since v1: > >> * arraymap of handles replace custom checker groups > >> * simpler userland API > > [...] > >> + case BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD: > >> + handle_file = fget(handle->fd); > >> + if (IS_ERR(handle_file)) > >> + return ERR_CAST(handle_file); > >> + /* check if the FD is tied to a user mount point */ > >> + if (unlikely(handle_file->f_path.mnt->mnt_flags & MNT_INTERNAL)) { > >> + fput(handle_file); > >> + return ERR_PTR(-EINVAL); > >> + } > >> + path_get(&handle_file->f_path); > >> + ret = kmalloc(sizeof(*ret), GFP_KERNEL); > >> + ret->path = handle_file->f_path; > >> + fput(handle_file); > > > > You can use fdget() and fdput() here because the reference to > > handle_file is dropped before the end of the syscall. > > The reference to handle_file is dropped but not the reference to the > (inner) path thanks to path_get(). That's irrelevant. As long as you promise to fdput() any references acquired using fdget() before any of the following can happen, using fdget() is okay: - the syscall exits - the fd table is shared with a process that might write to it - an fd is closed by the kernel In other words, you must be able to prove that nobody can remove the struct file * from your fd table before you fdput(). Taking a long-term reference to an object pointed to by a struct file that was looked up with fdget() is fine.