From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DCFA6C433F5 for ; Tue, 10 May 2022 20:44:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235241AbiEJUoo (ORCPT ); Tue, 10 May 2022 16:44:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237005AbiEJUo3 (ORCPT ); Tue, 10 May 2022 16:44:29 -0400 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E83EA2A375A for ; Tue, 10 May 2022 13:44:24 -0700 (PDT) Received: by mail-wm1-x333.google.com with SMTP id i20-20020a05600c355400b0039456976dcaso1670644wmq.1 for ; Tue, 10 May 2022 13:44:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=H+z42sapPI1YBphYIrh6dW0GFMAVRokw99cL8ZqGxg8=; b=CTpiNl02mk2asLHcnBsuNAVd8tR5UxljFjwGWN7EPCdyw8jw1qTJN8Kj958PoCw+f9 C32LJd3iJyAPIc7JnhsDf3F+CLv0mIznaGhwzjGKAYcrkMmcqX0unecnaIRY4eg2RdS/ 0wpRZMze08BUkQN9Kfpmmxj4PSmOnYIqgNxsMT/R3rkcd0HxVerVL26sQmKyCcoM0a/m s7JJXvfYmJef+dh/gZmajmQP15BB/zKpZ1gNodzL0OvqccJTjiHJ10N8+lrMEhpadzdu fFFf+dHjfdsLLT+W0V8q1dm0F4FO7Oj3L3kgVi1f9Ro2JBTKBHwIl+3Ypo+uOj1B/r4q F3zQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=H+z42sapPI1YBphYIrh6dW0GFMAVRokw99cL8ZqGxg8=; b=05ZpI6KVPE8xPBGfhA8HUqMSAXvYpHql5vZQLn1SBEABK8WM/8TD0h6J/gZIrQ1/3c htMZ2iONqwVg4DUAXD5UjzZLNBjQR0iS5iqkUaQBcrTs5k/uwZ8lUjfBZlLKG4M2d7jd W4TQ/Ii6McMyQIaWKP/0zjS3AmFSJIoBnIxIxw2XRHPhZIbg2Tx4pr4cVkpiV8cdLXSS aVsFpN5JWKnA7bwjRsFfBRsnLEZ7kKZRLAuQf0NdU14EnigUUx1IgIlZ0PfM6YzbjD3E +DE7uh/AfSr61Hv+GG5t5kuJiXiNPtOKOBmB9OoU+Mkez8PC24RJtdOJmvHtZwQqBprr Nuyw== X-Gm-Message-State: AOAM5307k/BWVednErWfl8JkeOmudSHjOeeoP+1SNPrEbhJoTIuJAUHU G9KLcP3+DR9EBbZF4nX2PgzE8G9AMnBB2lGaojrGmQ== X-Google-Smtp-Source: ABdhPJwliUlItsrpLn1+I5hrd9rEc+YvxNDiJSj3yNhjvh/mpLFiO4d78N2zmuNH2esxan4r8P2KPApz87peJAW0mPs= X-Received: by 2002:a05:600c:4ecc:b0:394:790d:5f69 with SMTP id g12-20020a05600c4ecc00b00394790d5f69mr1691105wmq.196.1652215462895; Tue, 10 May 2022 13:44:22 -0700 (PDT) MIME-Version: 1.0 References: <20220510001807.4132027-1-yosryahmed@google.com> <20220510001807.4132027-2-yosryahmed@google.com> In-Reply-To: From: Yosry Ahmed Date: Tue, 10 May 2022 13:43:46 -0700 Message-ID: Subject: Re: [RFC PATCH bpf-next 1/9] bpf: introduce CGROUP_SUBSYS_RSTAT program type To: Tejun Heo Cc: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Hao Luo , Zefan Li , Johannes Weiner , Shuah Khan , Roman Gushchin , Michal Hocko , Stanislav Fomichev , David Rientjes , Greg Thelen , Shakeel Butt , Linux Kernel Mailing List , Networking , bpf , cgroups@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 10, 2022 at 12:59 PM Tejun Heo wrote: > > Hello, > > On Tue, May 10, 2022 at 12:34:42PM -0700, Yosry Ahmed wrote: > > The rationale behind associating this work with cgroup_subsys is that > > usually the stats are associated with a resource (e.g. memory, cpu, > > etc). For example, if the memory controller is only enabled for a > > subtree in a big hierarchy, it would be more efficient to only run BPF > > rstat programs for those cgroups, not the entire hierarchy. It > > provides a way to control what part of the hierarchy you want to > > collect stats for. This is also semantically similar to the > > css_rstat_flush() callback. > > Hmm... one major point of rstat is not having to worry about these things > because we iterate what's been active rather than what exists. Now, this > isn't entirely true because we share the same updated list for all sources. > This is a trade-off which makes sense because 1. the number of cgroups to > iterate each cycle is generally really low anyway 2. different controllers > often get enabled together. If the balance tilts towards "we're walking too > many due to the sharing of updated list across different sources", the > solution would be splitting the updated list so that we make the walk finer > grained. > > Note that the above doesn't really affect the conceptual model. It's purely > an optimization decision. Tying these things to a cgroup_subsys does affect > the conceptual model and, in this case, the userland API for a performance > consideration which can be solved otherwise. > > So, let's please keep this simple and in the (unlikely) case that the > overhead becomes an issue, solve it from rstat operation side. > > Thanks. I assume if we do this optimization, and have separate updated lists for controllers, we will still have a "core" updated list that is not tied to any controller. Is this correct? If yes, then we can make the interface controller-agnostic (a global list of BPF flushers). If we do the optimization later, we tie BPF stats to the "core" updated list. We can even extend the userland interface then to allow for controller-specific BPF stats if found useful. If not, and there will only be controller-specific updated lists then, then we might need to maintain a "core" updated list just for the sake of BPF programs, which I don't think would be favorable. What do you think? Either-way, I will try to document our discussion outcome in the commit message (and maybe the code), so that if-and-when this optimization is made, we can come back to it. > > -- > tejun