From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79DA2C43334 for ; Thu, 7 Jul 2022 23:33:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236931AbiGGXdk (ORCPT ); Thu, 7 Jul 2022 19:33:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40270 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236695AbiGGXdi (ORCPT ); Thu, 7 Jul 2022 19:33:38 -0400 Received: from mail-qt1-x833.google.com (mail-qt1-x833.google.com [IPv6:2607:f8b0:4864:20::833]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5D58C1EECE for ; Thu, 7 Jul 2022 16:33:35 -0700 (PDT) Received: by mail-qt1-x833.google.com with SMTP id i11so25101810qtr.4 for ; Thu, 07 Jul 2022 16:33:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=C3mGVzkNGFQOGZ/y27ZbQkMVyVRvFTFo9AVDkxPU3Qs=; b=jI3G3nw3SkAqpzPaUgQF22T9Ic6RqP7p4lMxcErcOD7LNwKHd9PhRD3jRjB5LpdRas Y5NSMurtCWidZFToFy1cwL/XBxIfNw8pRD1q5Tkl+DjQU82h4wug5DMjjkpFX5N/DorO SwsedXFgGBd+s1tGJbH90/3YrEFEevIGRkGo0KerFHlOynd+zFhmCXD6wOxegx+qKcI7 whx6NUJOU7zlSU/AwEkpEWHVAccpRcvhOXltX62YPY1hchP+LZnGgdRpe/U837IrWH3D dz5dOl6K+w5srsimqMB2BRtRhqEml7mfZ8+BDapksn3GwHM8o2gHQoLWP2Z3Uu5lfvw3 v75A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=C3mGVzkNGFQOGZ/y27ZbQkMVyVRvFTFo9AVDkxPU3Qs=; b=PPEXSg7W2zwRj5CMpNsMu026OXeLvSb1zJEpA8IEeFCTot/GJXrXDoWmvzGbYCyz2/ kjJRxoRhxwy5Y9nNcG1GVxfkAkiBfsBaJyJZ9H0ce2wTCam6w/qO5Z0++C1F+fjAEPuQ aO6abMWo6ueJOsFAcNjIa8vCFQIdqijafQtrkJFxhXcqssEmo6iBjMpcUJBx7V5mWdhn 0CcBHmMS2GBffaQOD6owgvozIQ/ii2WrjP6PLQO8iOdRcTjsqQDfForhfLjHaPuJQYxX 4NiGPnAqlR9Da8lgfMKHqRsNOXkIF682hf+HZAOu+O0y4SQjzqSbyCtz1hEYsgjMP8CM co8Q== X-Gm-Message-State: AJIora9ugeyGqdxAHVTU3CmWXA6M2qLmock0bpNIgDOMtq44MJCScAq5 i4lGeh20d5nHsR48jk5a4IDB+FCkYr0f7KoxVlC1zA== X-Google-Smtp-Source: AGRyM1vEAd1x2UNB33pdXaPWBs45WXrfdgTefCVvtafX/mSzrQX5KL9nA7apiMh9koyl9ZSeRjRPZ2w8IZQ0Rvdb+DA= X-Received: by 2002:a0c:b30e:0:b0:470:a567:edf6 with SMTP id s14-20020a0cb30e000000b00470a567edf6mr326564qve.44.1657236814309; Thu, 07 Jul 2022 16:33:34 -0700 (PDT) MIME-Version: 1.0 References: <20220610194435.2268290-1-yosryahmed@google.com> <20220610194435.2268290-5-yosryahmed@google.com> In-Reply-To: From: Hao Luo Date: Thu, 7 Jul 2022 16:33:23 -0700 Message-ID: Subject: Re: [PATCH bpf-next v2 4/8] bpf: Introduce cgroup iter To: Yonghong Song Cc: Yosry Ahmed , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Song Liu , John Fastabend , KP Singh , Tejun Heo , Zefan Li , Johannes Weiner , Shuah Khan , Michal Hocko , Roman Gushchin , David Rientjes , Stanislav Fomichev , Greg Thelen , Shakeel Butt , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, cgroups@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jun 27, 2022 at 9:09 PM Yonghong Song wrote: > > > > On 6/10/22 12:44 PM, Yosry Ahmed wrote: > > From: Hao Luo > > > > Cgroup_iter is a type of bpf_iter. It walks over cgroups in two modes: > > > > - walking a cgroup's descendants. > > - walking a cgroup's ancestors. > > The implementation has another choice, BPF_ITER_CGROUP_PARENT_UP. > We should add it here as well. > Sorry about the late reply. I meant to write two modes: walking up and walking down. And two sub-modes when walking down: pre-order and post-order. But it seems this is confusing. I'm going to use three modes in the next version: UP, PRE and POST. Besides, since this patch modifies the bpf_iter_link_info, and that breaks the btf_dump selftest, I need to include the fix of the selftest in this patch. > > > > When attaching cgroup_iter, one can set a cgroup to the iter_link > > created from attaching. This cgroup is passed as a file descriptor and > > serves as the starting point of the walk. If no cgroup is specified, > > the starting point will be the root cgroup. > > > > For walking descendants, one can specify the order: either pre-order or > > post-order. For walking ancestors, the walk starts at the specified > > cgroup and ends at the root. > > > > One can also terminate the walk early by returning 1 from the iter > > program. > > > > Note that because walking cgroup hierarchy holds cgroup_mutex, the iter > > program is called with cgroup_mutex held. > > Overall looks good to me with a few nits below. > > Acked-by: Yonghong Song > > > > > Signed-off-by: Hao Luo > > Signed-off-by: Yosry Ahmed > > --- [...] > > + > > +/* cgroup_iter provides two modes of traversal to the cgroup hierarchy. > > + * > > + * 1. Walk the descendants of a cgroup. > > + * 2. Walk the ancestors of a cgroup. > > three modes here? > Same here. Will use 'three modes' in the next version. > > + * [...] > > + > > +static int bpf_iter_attach_cgroup(struct bpf_prog *prog, > > + union bpf_iter_link_info *linfo, > > + struct bpf_iter_aux_info *aux) > > +{ > > + int fd = linfo->cgroup.cgroup_fd; > > + struct cgroup *cgrp; > > + > > + if (fd) > > + cgrp = cgroup_get_from_fd(fd); > > + else /* walk the entire hierarchy by default. */ > > + cgrp = cgroup_get_from_path("/"); > > + > > + if (IS_ERR(cgrp)) > > + return PTR_ERR(cgrp); > > + > > + aux->cgroup.start = cgrp; > > + aux->cgroup.order = linfo->cgroup.traversal_order; > > The legality of traversal_order should be checked. > Sounds good. Will do. > > + return 0; > > +} > > + [...] > > +static void bpf_iter_cgroup_show_fdinfo(const struct bpf_iter_aux_info *aux, > > + struct seq_file *seq) > > +{ > > + char *buf; > > + > > + buf = kzalloc(PATH_MAX, GFP_KERNEL); > > + if (!buf) { > > + seq_puts(seq, "cgroup_path:\n"); > > This is a really unlikely case. maybe "cgroup_path:"? > Sure, no problem. This is a path that is unlikely to hit. > > + goto show_order; [...]