From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E11DC433F5 for ; Thu, 14 Apr 2022 02:07:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236086AbiDNCJi (ORCPT ); Wed, 13 Apr 2022 22:09:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229821AbiDNCJi (ORCPT ); Wed, 13 Apr 2022 22:09:38 -0400 Received: from out30-42.freemail.mail.aliyun.com (out30-42.freemail.mail.aliyun.com [115.124.30.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D0BC840A2E for ; Wed, 13 Apr 2022 19:07:14 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R201e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04357;MF=wuzongyong@linux.alibaba.com;NM=1;PH=DS;RN=2;SR=0;TI=SMTPD_---0VA05mnn_1649902032; Received: from localhost(mailfrom:wuzongyong@linux.alibaba.com fp:SMTPD_---0VA05mnn_1649902032) by smtp.aliyun-inc.com(127.0.0.1); Thu, 14 Apr 2022 10:07:12 +0800 Date: Thu, 14 Apr 2022 10:07:09 +0800 From: Wu Zongyong To: Andrii Nakryiko Cc: bpf Subject: Re: [Question] bpf map value increase unexpectedly with tracepoint qdisc/qdisc_dequeue Message-ID: <20220414020709.GA27635@L-PF27918B-1352.localdomain> Reply-To: Wu Zongyong References: <20220413140629.GA22650@L-PF27918B-1352.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org On Wed, Apr 13, 2022 at 12:30:51PM -0700, Andrii Nakryiko wrote: > On Wed, Apr 13, 2022 at 12:08 PM Wu Zongyong > wrote: > > > > Hi, > > > > I tried to count when tracepoint qdisc/qdisc_dequeue hit each time, then read > > the count value from userspace by bpf_map_lookup_elem(). > > With bpftrace, I can see this tracepoint is hit about 700 times, but the count > > I get from the bpf map is below 20. It's weird. Then I tried to add a bpf_printk() > > to the program, and the count I get is normal which is about 700. > > > > The bpf program codes like that: > > > > struct qdisc_dequeue_ctx { > > __u64 __pad; > > __u64 qdisc; > > __u64 txq; > > int packets; > > __u64 skbaddr; > > int ifindex; > > __u32 handle; > > __u32 parent; > > unsigned long txq_state; > > }; > > > > struct { > > __uint(type, BPF_MAP_TYPE_HASH); > > __type(key, int); > > __type(value, __u32); > > __uint(max_entries, 1); > > __uint(pinning, LIBBPF_PIN_BY_NAME); > > } count_map SEC(".maps"); > > > > SEC("tracepoint/qdisc/qdisc_dequeue") > > int trace_dequeue(struct qdisc_dequeue_ctx *ctx) > > { > > int key = 0; > > __u32 *value; > > __u32 init = 0; > > > > value = bpf_map_lookup_elem(&count_map, &key); > > if (value) { > > *value += 1; > > } else { > > bpf_printk("value reset"); > > bpf_map_update_elem(&count_map, &key, &init, 0); > > } > > return 0; > > } > > > > Any suggestion is appreciated! > > > > First, why do you use HASH map for single-key map? ARRAY would make > more sense for keys that are small integers. But I assume your real > world use case needs bigger and more random keys, right? > Yes, this just is a simple test. > > Second, you have two race conditions. One, you overwrite the value in > the map with this bpf_map_update_elem(..., 0). Use BPF_NOEXISTS for > initialization to avoid overwriting something that another CPU already > set. Another one is your *value += 1 is non-atomic, so you are loosing > updates as well. Use __sync_fetch_and_add(value, 1) for atomic > increment. __sync_fetch_and_add do solve my problem. Thanks! I have tried to use spinlock to prevent race conditions, but it seems that spinlock cannot be used in tracepoint. > > Something like this: > > value = bpf_map_lookup_elem(&count_map, &key); > if (!value) { > /* BPF_NOEXIST won't allow to override the value that's already set */ > bpf_map_update_elem(&count_map, &key, &init, BPF_NOEXISTS); > value = bpf_map_lookup_elem(&count_map, &key); > } > if (!value) > return 0; > > __sync_fetch_and_add(value, 1); > > > > Thanks, > > Wu Zongyong