References: <97d6e8ab-7a02-f317-81ed-6f45d26ad3c6@iogearbox.net>
In-Reply-To: <97d6e8ab-7a02-f317-81ed-6f45d26ad3c6@iogearbox.net>
From: Alexei Starovoitov
Date: Wed, 22 Sep 2021 19:03:04 -0700
Subject: Re: bpf_jit_limit close shave
To: Daniel Borkmann
Cc: Lorenz Bauer, Frank Hofmann, bpf, Alexei Starovoitov, Andrii Nakryiko, kernel-team
X-Mailing-List: bpf@vger.kernel.org

On Wed, Sep 22, 2021 at 2:51 PM Daniel Borkmann wrote:
>
> On 9/22/21 1:07 PM, Lorenz Bauer wrote:
> > On Wed, 22 Sept 2021 at 09:20, Frank Hofmann wrote:
> >>
> >>> That jit limit is not there on older kernels and doesn't apply to root.
> >>> How would you notice such a kernel bug in such conditions?
> >>
> >> I'm talking about bpf_jit_current - it's an "overall gauge" for
> >> allocation, priv and unpriv. I understood Lorenz' note as "change it
> >> so it only tracks unpriv BPF mem usage - since we'll never act on
> >> privileged usage anyway"
> >
> > Yes, that was my suggestion indeed. What Frank is saying: it looks
> > like our leak of JIT memory is due to a privileged process. By
> > exempting privileged processes it would be even harder to notice /
> > debug. That's true, and brings me back to my question: what is
> > different about JIT memory that we can't do a better limit?

There is nothing special about JIT and kernel module memory.

> The knob with the limit was basically added back then as a band-aid to avoid
> unprivileged BPF JIT (cBPF or eBPF) eating up all the module memory to the
> point where we cannot even load kernel modules anymore. Given that memory
> resource is global, we added the bpf_jit_limit / bpf_jit_current accounting
> as a fix/heuristic via ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict
> unpriv allocations"). If we wouldn't account for root, how would such detection
> proposal work otherwise to block unprivileged? I don't think it's feasible to
> only account the latter given privileged progs might have occupied most of the
> budget already.

Right. In the end it boils down to module_alloc() not using GFP_ACCOUNT.
It's indeed very similar to the rlimit issue we had in the past.
I don't have a preference whether normal kernel mods should be memcg-ed,
but JITed memory certainly can be. bpf progs memory is already covered.
I think we can do that and then remove this limit. Just like rlimit.
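
For readers following the thread, here is a minimal user-space sketch of the accounting heuristic discussed above: one global gauge that counts both privileged and unprivileged JIT allocations, with the limit enforced only against unprivileged callers. This is an illustrative model, not the kernel implementation. The struct and function names (jit_budget, jit_charge, jit_uncharge) are hypothetical, and the choice of pages as the unit and -EPERM as the refusal code are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the bpf_jit_limit / bpf_jit_current
 * heuristic. All names here are illustrative, not kernel symbols.
 */
struct jit_budget {
	long limit_pages;   /* models the bpf_jit_limit knob (assumed unit: pages) */
	long current_pages; /* models the bpf_jit_current "overall gauge" */
};

/*
 * Charge an allocation against the shared budget. Every caller is counted,
 * but only unprivileged callers are refused once the gauge exceeds the limit.
 */
static int jit_charge(struct jit_budget *b, long pages, bool privileged)
{
	b->current_pages += pages;
	if (b->current_pages > b->limit_pages && !privileged) {
		b->current_pages -= pages; /* roll back the refused charge */
		return -EPERM;
	}
	return 0;
}

/* Release a previously charged allocation. */
static void jit_uncharge(struct jit_budget *b, long pages)
{
	assert(pages <= b->current_pages);
	b->current_pages -= pages;
}
```

In this model a privileged caller can push the gauge past the limit, after which every unprivileged charge fails until memory is released. That is the situation described in the quoted text, where privileged progs may already occupy most of the budget, and it is also why a leak from a privileged process still shows up in the gauge.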