From: Namhyung Kim
Date: Thu, 24 Jun 2021 15:06:46 -0700
Subject: Re: [PATCH 3/3] perf stat: Enable BPF counter with --for-each-cgroup
To: Song Liu
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Andi Kleen, Ian Rogers, Stephane Eranian
References: <20210622071221.128271-1-namhyung@kernel.org> <20210622071221.128271-4-namhyung@kernel.org>

On Thu, Jun 24, 2021 at 2:41 PM Song Liu wrote:
>
>
>
> > On Jun 24, 2021, at 2:01 PM, Namhyung Kim wrote:
> >
> > On Thu, Jun 24, 2021 at 9:20 AM Song Liu wrote:
> >>>>> +
> >>>>> +// single set of global perf events to measure
> >>>>> +struct {
> >>>>> +	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
> >>>>> +	__uint(key_size, sizeof(__u32));
> >>>>> +	__uint(value_size, sizeof(int));
> >>>>> +	__uint(max_entries, 1);
> >>>>> +} events SEC(".maps");
> >>>>> +
> >>>>> +// from logical cpu number to event index
> >>>>> +// useful when user wants to count subset of cpus
> >>>>> +struct {
> >>>>> +	__uint(type, BPF_MAP_TYPE_HASH);
> >>>>> +	__uint(key_size, sizeof(__u32));
> >>>>> +	__uint(value_size, sizeof(__u32));
> >>>>> +	__uint(max_entries, 1);
> >>>>> +} cpu_idx SEC(".maps");
> >>>>
> >>>> How about we make cpu_idx a percpu array and use 0,1 for
> >>>> disable/enable profiling on this cpu?
> >>>
> >>> No, it's to calculate an index to the cgrp_readings map which
> >>> has the event x cpu x cgroup number of elements.
> >>>
> >>> It controls enabling events with a global (bss) variable.
> >>
> >> If we make cgrp_idx a per cpu array, we probably don't need the
> >> cpu_idx map?
> >
> > Right.

Maybe not.  Sometimes we want to profile a subset of cpus only.
In that case, cpu != idx then I think we still need this.

> >
> >>
> >>>
> >>>>
> >>>>> +
> >>>>> +// from cgroup id to event index
> >>>>> +struct {
> >>>>> +	__uint(type, BPF_MAP_TYPE_HASH);
> >>>>> +	__uint(key_size, sizeof(__u64));
> >>>>> +	__uint(value_size, sizeof(__u32));
> >>>>> +	__uint(max_entries, 1);
> >>>>> +} cgrp_idx SEC(".maps");
> >>>>> +
> >>>>> +// per-cpu event snapshots to calculate delta
> >>>>> +struct {
> >>>>> +	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
> >>>>> +	__uint(key_size, sizeof(__u32));
> >>>>> +	__uint(value_size, sizeof(struct bpf_perf_event_value));
> >>>>> +} prev_readings SEC(".maps");
> >>>>> +
> >>>>> +// aggregated event values for each cgroup
> >>>>> +// will be read from the user-space
> >>>>> +struct {
> >>>>> +	__uint(type, BPF_MAP_TYPE_ARRAY);
> >>>>> +	__uint(key_size, sizeof(__u32));
> >>>>> +	__uint(value_size, sizeof(struct bpf_perf_event_value));
> >>>>> +} cgrp_readings SEC(".maps");
> >>>>
> >>>> Maybe also make this a percpu array? This should make the BPF program
> >>>> faster.
> >>>
> >>> Maybe.  But I don't know how to access the elements
> >>> in a per-cpu map from userspace.
> >>
> >> Please refer to bperf__read() reading accum_readings. Basically, we read
> >> one index of all CPUs with one bpf_map_lookup_elem().
> >
> > Thanks!  So when I use a per-cpu array with 3 elements, I can access
> > to cpu/elem entries in a row like below, right?
> >
> >   0/0, 0/1, 0/2, 1/0, 1/1, 1/2, 2/0, 2/1, 2/2, 3/0, ...
>
> I am not sure I am following here.
>
> Say the system have 10 cpus, and the array has 3 elements. We can do:
>
> __u32 values[10];  /* assuming both key and value are __u32 */
> __u32 elem;
> int cpu;
>
> for (elem = 0; elem < 3; elem++) {
>	bpf_map_lookup_elem(map_fd, &elem, values);
>	for (cpu = 0; cpu < 10; cpu++)
>		values[cpu]  /* this is the value for cpu/elem */
> }

Thanks for the explanation, I didn't think that way.
I thought it worked like below:

  __u32 elem, value;

  for (elem = 0; elem < 3 * 10; elem++) {
      bpf_map_lookup_elem(map_fd, &elem, &value);
  }

So in this case, the actual value size is like below, right?

  value-size = map-value-size * number-of-cpu

Thanks,
Namhyung
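
For reference, a small sketch of the sizing question above. It assumes
(from the kernel's per-CPU map handling as I understand it) that a
user-space lookup on a per-CPU map fills one slot per possible CPU and
that each slot is rounded up to 8 bytes. The helper names
percpu_lookup_buf_size() and sum_percpu_u64() are made up for
illustration; they are not part of libbpf or perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of 8 (assumed per-CPU slot alignment). */
static size_t round_up8(size_t x)
{
	return (x + 7) & ~(size_t)7;
}

/*
 * Hypothetical helper: size of the user-space buffer needed for one
 * lookup on a per-CPU map, i.e. one 8-byte-aligned slot per CPU.
 */
static size_t percpu_lookup_buf_size(size_t map_value_size, unsigned int nr_cpus)
{
	return round_up8(map_value_size) * nr_cpus;
}

/*
 * Hypothetical helper: sum the per-CPU slots that a single lookup
 * filled in.  buf is assumed to hold nr_cpus slots of 8 bytes each.
 */
static uint64_t sum_percpu_u64(const uint64_t *buf, unsigned int nr_cpus)
{
	uint64_t total = 0;
	unsigned int cpu;

	assert(buf != NULL);
	for (cpu = 0; cpu < nr_cpus; cpu++)
		total += buf[cpu];
	return total;
}
```

Under these assumptions a __u32 value on 10 CPUs needs an 80-byte
buffer rather than 40, and struct bpf_perf_event_value (three __u64
fields, 24 bytes) needs 240 bytes. In real code the CPU count would
come from something like libbpf_num_possible_cpus() rather than a
constant.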