From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65487C54FCE for ; Mon, 23 Mar 2020 18:03:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 38B082072E for ; Mon, 23 Mar 2020 18:03:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="SI8rcU1t" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727289AbgCWSDp (ORCPT ); Mon, 23 Mar 2020 14:03:45 -0400 Received: from mail-qt1-f193.google.com ([209.85.160.193]:44304 "EHLO mail-qt1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727011AbgCWSDp (ORCPT ); Mon, 23 Mar 2020 14:03:45 -0400 Received: by mail-qt1-f193.google.com with SMTP id x16so3368015qts.11; Mon, 23 Mar 2020 11:03:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=fHu91a/Lr8MUsZOsiLVkd/C4HW7N74BB6IqSYZpx+Aw=; b=SI8rcU1tgyYpjNp2i9swb7M3rMAZWoxtSwefg0dcmHWQe0eCaJuQQxI7G9JKFM4TWU hCdSnlDhRWXtyEZKAQfjsRqTkmZTvHvlOjN7++m+9lVWgYsIk++omJmFXjKF1tXafld1 Uhmpz5mnhRiCOzqjLqxkH0MNsgZ2svplNPiQKIgD+zk71/DZyTc2KFzQp9tSUHrhjp4T Rwl/1p1ptdRx4n0ZMpme+UzUq202fDwAdRObFkSMJkJWBowe6jxpAanF+Lo3Oam8QSzI sMgYwQgug66UfrQ9n+nOy5I5jfYcXc6U7MidCZBL7Z5kF4qXkz9k8SrMkdCnAYUXYnRP MnXQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=fHu91a/Lr8MUsZOsiLVkd/C4HW7N74BB6IqSYZpx+Aw=; b=nCGS0VSUZASIPYziIyWVbL45F/Rzqsnye4ZW9LfuLfcR/MelhiJFkJdE8rubtOVN3C GQDDALlYM57pgFx1CBw1XIKYU8BjF1i9vfWCNaKvg4YYoFTDRkd42lG/nSBD5UWT25BL ih7UX/7eYRRKlVYZFewte5/tO5zTFfAwdUvme8pge/EmaQaB6qt5u9kJvp9f0QRKbjjm LzfC9iiiBnXT9mgNHV4wnf2Y3LyJCS2koL1G2qfnq30IPlDqC2YM1Ao10i9N+C+921aq rukxTjoOiI2KeQ6lPFYF4R1U5m6fR7c7OuDSSuy0RG7oLEDwwgoCYs+R/u0ksnbIQORB tzAg== X-Gm-Message-State: ANhLgQ2m2OX7Tg/PM4xfWZac7oYQqK45GUbBSrzI9vOtNoq32MRRdYp2 LFRuSfIwc42pwfP7XtRBd3eRQRcy+sH/EopEOAk= X-Google-Smtp-Source: ADFU+vtopggPFvOeNPrm7x8MhbmifH1bk+B6z5gdKcLmrow6Orzae/CM2RKt8qrPTdbaAMmWF9MqV5zIT7frDOiCcqc= X-Received: by 2002:ac8:7cb0:: with SMTP id z16mr6236000qtv.59.1584986621700; Mon, 23 Mar 2020 11:03:41 -0700 (PDT) MIME-Version: 1.0 References: <20200320203615.1519013-1-andriin@fb.com> <20200320203615.1519013-4-andriin@fb.com> <20200320234628.GA11775@rdna-mbp> In-Reply-To: From: Andrii Nakryiko Date: Mon, 23 Mar 2020 11:03:30 -0700 Message-ID: Subject: Re: [Potential Spoof] [PATCH bpf-next 3/6] bpf: implement bpf_link-based cgroup BPF program attachment To: Andrey Ignatov Cc: Andrii Nakryiko , bpf , Networking , Alexei Starovoitov , Daniel Borkmann , Kernel Team Content-Type: text/plain; charset="UTF-8" Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Fri, Mar 20, 2020 at 5:19 PM Andrii Nakryiko wrote: > > On Fri, Mar 20, 2020 at 4:46 PM Andrey Ignatov wrote: > > > > Andrii Nakryiko [Fri, 2020-03-20 13:37 -0700]: > > > Implement new sub-command to attach cgroup BPF programs and return FD-based > > > bpf_link back on success. bpf_link, once attached to cgroup, cannot be > > > replaced, except by owner having its FD. cgroup bpf_link has semantics of > > > BPF_F_ALLOW_MULTI BPF program attachments and can co-exist with > > > > Hi Andrii, > > > > Is there any reason to limit it to only BPF_F_ALLOW_MULTI? > > > > No technical reason, just felt like the good default behavior. It's > possible to support all the same flags as with legacy BPF program > attachment, but see below... So I went ahead and just added support for all the modes, it's very minor change in kernel itself, but needs a lot more selftesting logic, given all the new modes. Once I finish with that, I'll post v2 that also fixes build with !CONFIG_CGROUP_BPF. We can continue discussing orthogonal inheritance policies independently, if there is any intereset. > > > The thing is BPF_F_ALLOW_MULTI not only allows to attach multiple > > programs to specified cgroup but also controls what programs can later > > be attached to a sub-cgroup, and in BPF_F_ALLOW_MULTI case both > > sub-cgroup programs and specified cgroup programs will be executed (in > > this order). > > > > There many use-cases though when it's desired to either completely > > disallow attaching programs to a sub-cgroup or override parent cgroup's > > program behavior in sub-cgroup. If bpf_link covers only > > BPF_F_ALLOW_MULTI, those scenarios won't be able to leverage it. > > > > This double-purpose of BPF_F_ALLOW_MULTI is a pain ... For example if > > Yeah, exactly. I don't know historical reasons for why it was done the > way it was done (i.e., BPF_F_ALLOW_MULTI and BPF_F_ALLOW_OVERRIDE > flags), so maybe someone can give some insights there. But I wonder if > inheritance policy should be orthogonal to a single vs multiple > bpf_progs limit for a given cgroup. They could be specified on > per-cgroup level, not per-BPF program (and then making sure flags > match). That way it would be possible to express more nuanced > policies, like allowing multiple programs for a root cgroup, but > disallowing attach new BPF programs for any child cgroup, etc. > > Would be good to get some more perspectives on this... > > > one wants to attach multiple programs to a cgroup but disallow attaching > > programs to a sub-cgroup it's currently impossible (well, w/o additional > > cgroup level just for this). > > > > > non-bpf_link-based BPF cgroup attachments. > > > > > > To prevent bpf_cgroup_link from keeping cgroup alive past the point when no > > > BPF program can be executed, implement auto-detachment of link. When > > > cgroup_bpf_release() is called, all attached bpf_links are forced to release > > > cgroup refcounts, but they leave bpf_link otherwise active and allocated, as > > > well as still owning underlying bpf_prog. This is because user-space might > > > still have FDs open and active, so bpf_link as a user-referenced object can't > > > be freed yet. Once last active FD is closed, bpf_link will be freed and > > > underlying bpf_prog refcount will be dropped. But cgroup refcount won't be > > > touched, because cgroup is released already. > > > > > > The inherent race between bpf_cgroup_link release (from closing last FD) and > > > cgroup_bpf_release() is resolved by both operations taking cgroup_mutex. So > > > the only additional check required is when bpf_cgroup_link attempts to detach > > > itself from cgroup. At that time we need to check whether there is still > > > cgroup associated with that link. And if not, exit with success, because > > > bpf_cgroup_link was already successfully detached. > > > > > > Acked-by: Roman Gushchin > > > Signed-off-by: Andrii Nakryiko > > > --- > > > include/linux/bpf-cgroup.h | 27 ++- > > > include/linux/bpf.h | 10 +- > > > include/uapi/linux/bpf.h | 9 +- > > > kernel/bpf/cgroup.c | 313 +++++++++++++++++++++++++-------- > > > kernel/bpf/syscall.c | 62 +++++-- > > > kernel/cgroup/cgroup.c | 14 +- > > > tools/include/uapi/linux/bpf.h | 9 +- > > > 7 files changed, 345 insertions(+), 99 deletions(-) > > > > > [...]