From: Hao Luo
Date: Thu, 20 Oct 2022 11:01:41 -0700
Subject: Re: [PATCH bpf] bpf: Support for setting numa node in bpf memory allocator
To: Hou Tao
Cc: bpf@vger.kernel.org, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Yonghong Song, Daniel Borkmann, KP Singh, Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com
References: <20221020142247.1682009-1-houtao@huaweicloud.com>
In-Reply-To: <20221020142247.1682009-1-houtao@huaweicloud.com>
X-Mailing-List: bpf@vger.kernel.org

On Thu, Oct 20, 2022 at 6:57 AM Hou Tao wrote:
>
> From: Hou Tao
>
> Since commit fba1a1c6c912 ("bpf: Convert hash map to bpf_mem_alloc."),
> numa node setting for non-preallocated hash table is ignored. The reason
> is that bpf memory allocator only supports NUMA_NO_NODE, but it seems it
> is trivial to support numa node setting for bpf memory allocator.
>
> So adding support for setting numa node in bpf memory allocator and
> updating hash map accordingly.
>
> Fixes: fba1a1c6c912 ("bpf: Convert hash map to bpf_mem_alloc.")
> Signed-off-by: Hou Tao
> ---

Looks good to me with a few nits.

>
<...>
> diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
> index fc116cf47d24..44c531ba9534 100644
> --- a/kernel/bpf/memalloc.c
> +++ b/kernel/bpf/memalloc.c
<...>
> +static inline bool is_valid_numa_node(int numa_node, bool percpu)
> +{
> +	return numa_node == NUMA_NO_NODE ||
> +	       (!percpu && (unsigned int)numa_node < nr_node_ids);

Maybe also check node_online()? There is a similar helper function in
kernel/bpf/syscall.c.

It may also help debugging if we could log the reason for the failure
here, for example, a PERCPU map created with numa_node specified.

> +}
> +
> +/* The initial prefill is running in the context of map creation process, so
> + * if the preferred numa node is NUMA_NO_NODE, needs to use numa node of the
> + * specific cpu instead.
> + */
> +static inline int get_prefill_numa_node(int numa_node, int cpu)
> +{
> +	int prefill_numa_node;
> +
> +	if (numa_node == NUMA_NO_NODE)
> +		prefill_numa_node = cpu_to_node(cpu);
> +	else
> +		prefill_numa_node = numa_node;
> +	return prefill_numa_node;
>  }

nit: an alternative implementation is

	return numa_node == NUMA_NO_NODE ? cpu_to_node(cpu) : numa_node;

>
>  /* When size != 0 bpf_mem_cache for each cpu.
> @@ -359,13 +383,17 @@ static void prefill_mem_cache(struct bpf_mem_cache *c, int cpu)
>   * kmalloc/kfree. Max allocation size is 4096 in this case.
>   * This is bpf_dynptr and bpf_kptr use case.
>   */

Since we added a parameter to this function, I think it is worth
documenting in this comment how the 'numa_node' argument behaves under
the different values of 'percpu'.

> -int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
> +int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, int numa_node,
> +		       bool percpu)
>  {
<...>
> --
> 2.29.2
>
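
To make the two helper suggestions above concrete, here is a rough
userspace sketch, not kernel code and not a replacement for the patch.
It assumes the node_online() check is simply appended to the range
check. MOCK_NR_NODE_IDS, mock_node_online() and mock_cpu_to_node() are
made-up stand-ins for nr_node_ids, node_online() and cpu_to_node(), and
the node layout (4 nodes, node 2 offline, 2 cpus per node) is invented
purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel definitions (assumptions, see above). */
#define NUMA_NO_NODE		(-1)
#define MOCK_NR_NODE_IDS	4u	/* hypothetical number of node ids */

/* Hypothetical topology: node 2 is possible but currently offline. */
static const bool mock_online_map[MOCK_NR_NODE_IDS] = {
	true, true, false, true
};

/* Stand-in for node_online(); callers must pass an in-range node id. */
static bool mock_node_online(int node)
{
	assert(node >= 0 && (unsigned int)node < MOCK_NR_NODE_IDS);
	return mock_online_map[node];
}

/* Stand-in for cpu_to_node(); assumes 2 cpus per node. */
static int mock_cpu_to_node(int cpu)
{
	return cpu / 2;
}

/* Range check from the patch plus the suggested online check. */
static bool is_valid_numa_node(int numa_node, bool percpu)
{
	if (numa_node == NUMA_NO_NODE)
		return true;
	/* A per-cpu allocator cannot honor a single preferred node. */
	if (percpu)
		return false;
	/* The unsigned cast rejects negative ids; && short-circuits
	 * before the online lookup for out-of-range values.
	 */
	return (unsigned int)numa_node < MOCK_NR_NODE_IDS &&
	       mock_node_online(numa_node);
}

/* Ternary form of get_prefill_numa_node() from the nit above. */
static int get_prefill_numa_node(int numa_node, int cpu)
{
	return numa_node == NUMA_NO_NODE ? mock_cpu_to_node(cpu) : numa_node;
}
```

With this mock layout, is_valid_numa_node(2, false) is rejected only
because of the online check; the range check alone would accept it.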