From: Hao Luo
Date: Tue, 15 Mar 2022 10:27:39 -0700
Subject: Re: [PATCH bpf-next v1 1/9] bpf: Add mkdir, rmdir, unlink syscalls for prog_bpf_syscall
To: Al Viro
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shakeel Butt, Joe Burton, Tejun Heo, joshdon@google.com, sdf@google.com, bpf@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20220225234339.2386398-1-haoluo@google.com> <20220225234339.2386398-2-haoluo@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 14, 2022 at 4:12 PM Al Viro wrote:
>
> On Mon, Mar 14, 2022 at 10:07:31AM -0700, Hao Luo wrote:
> > Hello Al,
> >
> > > In which contexts can those be called?
> > >
> >
> > In a sleepable context. The plan is to introduce a certain tracepoints
> > as sleepable, a program that attaches to sleepable tracepoints is
> > allowed to call these functions. In particular, the first sleepable
> > tracepoint introduced in this patchset is one at the end of
> > cgroup_mkdir(). Do you have any advices?
>
> Yes - don't do it, unless you really want a lot of user-triggerable
> deadlocks.
>
> Pathname resolution is not locking-agnostic. In particular, you can't
> do it if you are under any ->i_rwsem, whether it's shared or exclusive.
> That includes cgroup_mkdir() callchains.
> And if the pathname passed
> to these functions will have you walk through the parent directory,
> you would get screwed (e.g. if the next component happens to be
> inexistent, triggering a lookup, which takes ->i_rwsem shared).

I'm thinking of two options; let's see if either can work out:

Option 1: We can put restrictions on the pathname passed into this
helper. We can explicitly require the dirfd parameter to be in bpffs
(which we can verify). In addition, we check that pathname contains no
"." or ".." components, so the resolved path ends up inside bpffs and
therefore won't take the ->i_rwsem that is in the callchain of
cgroup_mkdir().

Option 2: We can avoid pathname resolution entirely. As above, we can
adjust the semantics of this helper to be: create a directory
immediately under the dirfd passed in. In particular, as above, we can
require the dirfd to be in bpffs and the pathname to consist only of
letters and digits. With these restrictions, we call vfs_mkdir() to
create the directory.

Being able to mkdir from bpf has useful use cases, so let's try to make
it happen even with many limitations.

Thanks!
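
P.S. To make the two name checks concrete, here is a minimal userspace
sketch of the string validation only. It is not code from the patchset:
the function names and the max_len parameter are hypothetical, and it
does not address the bpffs check on dirfd, symlinks, or the locking
concern raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Locale-independent test for ASCII letters and digits. */
static bool is_ascii_alnum(char c)
{
	return (c >= 'a' && c <= 'z') ||
	       (c >= 'A' && c <= 'Z') ||
	       (c >= '0' && c <= '9');
}

/*
 * Hypothetical check for Option 2: accept only a non-empty single
 * component of at most max_len bytes, made of letters and digits.
 * '/' and '.' are rejected because they are not alphanumeric.
 */
static bool name_is_alnum_component(const char *name, size_t max_len)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; name[i] != '\0'; i++) {
		if (i >= max_len)
			return false;	/* name is longer than max_len */
		if (!is_ascii_alnum(name[i]))
			return false;
	}
	return i > 0;			/* reject the empty string */
}

/*
 * Hypothetical check for Option 1: return true if any '/'-separated
 * component of path is exactly "." or "..".
 */
static bool path_has_dot_component(const char *path)
{
	const char *p = path;

	assert(path != NULL);
	while (*p != '\0') {
		const char *start;
		size_t n;

		while (*p == '/')	/* skip separators */
			p++;
		start = p;
		while (*p != '\0' && *p != '/')
			p++;
		n = (size_t)(p - start);
		if ((n == 1 && start[0] == '.') ||
		    (n == 2 && start[0] == '.' && start[1] == '.'))
			return true;
	}
	return false;
}
```

Under Option 2 a helper would call something like
name_is_alnum_component(name, 255) before vfs_mkdir(); under Option 1
it would reject any path for which path_has_dot_component() returns
true. The limit of 255 is only an example value.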