From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEF7CC433EF for ; Mon, 27 Sep 2021 21:14:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 876D460F6C for ; Mon, 27 Sep 2021 21:14:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237099AbhI0VP7 (ORCPT ); Mon, 27 Sep 2021 17:15:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40262 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234848AbhI0VP7 (ORCPT ); Mon, 27 Sep 2021 17:15:59 -0400 Received: from mail-yb1-xb2d.google.com (mail-yb1-xb2d.google.com [IPv6:2607:f8b0:4864:20::b2d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1FF54C061575 for ; Mon, 27 Sep 2021 14:14:21 -0700 (PDT) Received: by mail-yb1-xb2d.google.com with SMTP id z5so27049540ybj.2 for ; Mon, 27 Sep 2021 14:14:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=NlB9Jq+7YM+VVcwWr0Qd1rqHlLAB0HoJrWYjxOkeu7E=; b=dB/wsPtHKLhQCbJpOm/c0NOTylzndHq3yH9BIHvECtUn0e+Wj26Z6L2UgWm09iBaEV 1qxH8UVVDxmfgsD60A+D1At3SrotXpzF/aDnXwWIQN6Jb/jWNCR4gGz+oPj0wGvN/Ija gfMJpNm/8k52qvQ9fQtRuhg2VlXF576faaU+95bfXI1zzlOQ89D0G+2acQLukwXyyOXx zWu9nHmMKdryzgbC8Hp4gc/2n+Sj/poPF9HZy/eDJ1PdNyLDcJWweRLrBAiB2jJIBSvz WmOPfN79TkqlrnzCe3Ak1WoZFhXYD4IF4gvHfTAIKrxlHHz5yAO94Qo7skC30SdwJ9wf qXaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=NlB9Jq+7YM+VVcwWr0Qd1rqHlLAB0HoJrWYjxOkeu7E=; b=pY360JpQWFrtDARpb8HvmsM7h2PyFvmO4qPFVX4dlxL69vbLvNH4F2MmqpNCyd4EUr aI6p7xNcGSa/CBcW57qlSzfHuJrwRoChVq/9ge/G0cbFU3iJa3wvRBOKjucPWgepPtJX aYVmI4YPHYuRL5YjATXeY+8vw9OcoVMuGqWy50zVt09/n/l8IIV3m27ADmIn1DZR1uFS jVUOELpHkEaJOXxXA5v08MD4elWAXYmr71D5y/ZHy8nQikieLddzQH5bInEEjRmMU3Bg aU8HyW2aTyNLNt+NxOVJzAd0EF7vCbJjWjlb8p/e2t0JkYGPJ8qXAJdYXoquhmY9Ms6S GNTw== X-Gm-Message-State: AOAM530tS9ICmvp5McC5ltedb6puKHfg9Z/YobjFgJnJon3Gu+noxA+4 0U/j2+2Jg+3EjGscOu7+9UhxAHYaW1kSqi8e1zs= X-Google-Smtp-Source: ABdhPJxyjYH9row1DUbQn6Sp3drtjZ7HVCmizFxK7vWVf75Jokjr81N87CO3SsWyx+vqqAxms7UXRtG+5vDxjKr+7fg= X-Received: by 2002:a25:7c42:: with SMTP id x63mr2602003ybc.225.1632777260215; Mon, 27 Sep 2021 14:14:20 -0700 (PDT) MIME-Version: 1.0 References: <20210923012849.qfgammwxxcd47fgn@kafai-mbp.dhcp.thefacebook.com> <20210923194233.og5pamu6g7xfnsmp@kafai-mbp> <20210923203046.a3fsogdl37mw56kp@ast-mbp> <7957a053-8b98-1e09-26c8-882df6920e6e@fb.com> <118c7f22-f710-581f-b87e-ee07aced429a@fb.com> <20210927164110.gg33uypguty45huk@ast-mbp.dhcp.thefacebook.com> In-Reply-To: <20210927164110.gg33uypguty45huk@ast-mbp.dhcp.thefacebook.com> From: Andrii Nakryiko Date: Mon, 27 Sep 2021 14:14:09 -0700 Message-ID: Subject: Re: [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation To: Alexei Starovoitov Cc: Joanne Koong , Martin KaFai Lau , bpf , Kernel Team Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org On Mon, Sep 27, 2021 at 9:41 AM Alexei Starovoitov wrote: > > On Fri, Sep 24, 2021 at 04:12:11PM -0700, Andrii Nakryiko wrote: > > > > That's not what I proposed. So let's say somewhere in the kernel we > > have this variable: > > > > static int bpf_bloom_exists = 1; > > > > Now, for bpf_map_lookup_elem() helper we pass data as key pointer. If > > all its hashed bits are set in Bloom filter (it "exists"), we return > > &bpf_bloom_exists. So it's not a NULL pointer. > > imo that's too much of a hack. too bad, because this feels pretty natural in BPF code: int my_key = 1234; if (bpf_map_lookup_elem(&my_bloom_filter, &my_key)) { /* handle "maybe exists" */ } else { /* handle "definitely doesn't exist" */ } With map->value_size == 0 verifier should be able to ensure that whatever non-NULL pointer is returned isn't readable/writable. We might need to fix some BPF verifier internals if that's not supported yet, I haven't checked thoroughly. > > > Now, despite returning a valid pointer, it would be good to prevent > > reading/writing from/to it in a valid BPF program. I'm hoping it is as > > easy as just seetting map->value_size = 0 during map creation. But > > worst case, we can let BPF program just overwrite that 1 with whatever > > they want. It doesn't matter because the contract is that > > bpf_map_lookup_elem() returns non-NULL for "exists" and NULL > > otherwise. > > > > Now, for MAP_LOOKUP_ELEM command on user-space side. I'd say the > > contract should be 0 return code on "exists" (and nothing is written > > to value pointer, perhaps even disallow to specify the non-NULL value > > pointer altogether); -ENOENT, otherwise. Again, worst-case we can > > specify that "value_size" is 1 or 4 bytes and write 1 to it. > > > > Does it make sense? > > > > > > BTW, for STACK/QUEUE maps, I'm not clear why we had to add > > map_peek_elem/map_pop_elem/map_push_elem operations, to be honest. > > They could have been pretty reasonably mapped to map_lookup_elem (peek > > is non-destructive lookup), map_lookup_elem + some flag (pop is > > lookup, but destructive, so flag allows "destruction"), and > > map_update_elem (for push). map_delete_elem would be pop for when > > value can be ignored. > > > > That's to say that I don't consider STACK/QUEUE maps as good examples, > > it's rather a counter-example of maps that barely anyone is using, yet > > it just adds clutter to BPF internals. > > Repurposing lookup/update ops for stack/queue and forcing bpf progs > to pass dummy key would have looked quite ugly. The idea was to pass non-NULL *key* pointer. For bpf_map_update_elem() you'd pass non-NULL key and NULL value. But it's fine, we have peek/pop/push now, might as well use them. From user-space side it's still mapped to LOOKUP/UPDATE/DELETE commands, right? BTW, seems like we have LOOKUP_AND_DELETE command as well which is implemented for STACK/QUEUE maps, but not BLOOM_FILTER. I guess for consistency we'd have to implement that one for user-space as well? > peek/pop/push have one pointer. That pointer points to value. > For bitset we have single pointer as well. > So it makes perfect sense to reuse push/pop/peek and keep bitset > as a value from the verifier side. Ok. > bpf_helper_defs.h could have static inline functions: > bool bpf_bitset_clear(map, key); > bool bpf_bitset_set(map, key); > bool bpf_bitset_test(map, key); > > But they will map to FUNC_map_pop_elem/push/peek as func ids > and will be seen as values from the verifier pov. > > The bpf progs might think of them as keys, but they will be value_size. > The bitset creation could use key_size as an input, but internally > set it as value_size. > Not sure whether such internal vs uapi remaping will not lead > to confusion and bugs though. I think it will and I wouldn't do it. > I agree that key as an input to bitset_clear/set/test make more sense, > but the verifier needs value_size to plug into peek/pop infra. Which is why I initially proposed to not use peek/pop infra and instead do lookup/update/delete, but if a NULL value pointer passed into bpf_map_update_elem() disqualifies this, then it's fine. But I didn't ask to repurpose peek/pop for working with key pointers. > > I don't think it's worth to extend ops with yet another 3 callbacks > and have clean ARG_PTR_TO_UNINIT_MAP_KEY there. > That is probably too much clutter. > > I think > bool bpf_bitset_clear(map, value); > bool bpf_bitset_set(map, value); > bool bpf_bitset_test(map, value); > doesn't look too bad either. > At least this way the bit set map_create op will use value_size > and keep it as value_size. The bpf prog won't be confusing. > That's my preference, so far.