bpf.vger.kernel.org archive mirror
From: Song Liu <songliubraving@fb.com>
To: Peter Zijlstra <peterz@infradead.org>,
	"acme@kernel.org" <acme@kernel.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "open list:BPF (Safe dynamic programs and tools)" <bpf@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	Kernel Team <Kernel-team@fb.com>,
	"Kan Liang" <kan.liang@linux.intel.com>
Subject: Re: [RFC] bpf: lbr: enable reading LBR from tracing bpf programs
Date: Fri, 20 Aug 2021 07:33:13 +0000
Message-ID: <D63A163C-4270-4783-81F4-18992EB5E706@fb.com>
In-Reply-To: <YR6ih+pKSm5TVVBc@hirez.programming.kicks-ass.net>

Hi Peter, 

> On Aug 19, 2021, at 11:27 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Aug 19, 2021 at 06:22:07PM +0000, Song Liu wrote:
>>> And if we're going to be adding new pmu::methods then I figure one that
>>> does the whole sample state might be more useful.
>> What do you mean by "whole sample state"? To integrate with existing
>> perf_sample_data, like perf_output_sample()?
> Yeah, the PMI can/does set more than data->br_stack, but I'm now
> thinking that without an actual PMI, much of that will not be possible.
> br_stack is special here.
> Oh well, carry on I suppose.

Here is another design choice that I would like to know your opinion on. 

Say we don't use BPF here and use perf_kprobe instead. On a kretprobe,
we can use branch_stack to figure out which branches were taken before the
return. For example: what happened when sys_perf_event_open returned -EINVAL?

To achieve this, we will need two events from the kernel's perspective:
one INTEL_FIXED_VLBR_EVENT attached to the hardware pmu, and one kretprobe
event. Let's call them VLBR-event and software-event respectively. When
the software-event triggers, it needs to read the branch_stack snapshot from
the VLBR-event's pmu, which is kind of weird. We will need some connection
between these two events.

Also, to keep more useful data in the LBR registers, we want a minimal number
of branches between the software-event triggering and
perf_pmu_disable(hardware_pmu). I guess the best way is to add a pointer,
branch_stack_pmu, to perf_event, and use it directly when the
software-event triggers. Note that the pmu will not go away. So even if the
VLBR-event gets freed by accident, we will not crash the kernel (we may read
garbage data though). Does this make sense to you?
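
To make the branch_stack_pmu idea concrete, here is a minimal userspace
sketch. It is only a model and assumes a lot: toy_pmu, toy_event,
branch_entry and the snapshot_branch_stack callback are hypothetical names
made up for illustration, not existing kernel structures or pmu methods, and
the "disabled" flag merely stands in for perf_pmu_disable(). The point is
that the software-event holds a direct pointer to the pmu and freezes it as
the very first step, so few branches happen before the LBR is read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BRANCHES 4

/* Toy stand-in for one LBR entry (hypothetical, not the kernel's type). */
struct branch_entry {
	unsigned long from;
	unsigned long to;
};

/* Toy stand-in for the hardware pmu that owns the LBR registers. */
struct toy_pmu {
	int disabled;                          /* models perf_pmu_disable() state */
	int nr;                                /* number of valid entries in lbr[] */
	struct branch_entry lbr[MAX_BRANCHES]; /* models the LBR registers */
	/* Hypothetical method: copy the current branch stack into out[]. */
	int (*snapshot_branch_stack)(struct toy_pmu *pmu,
				     struct branch_entry *out, int max);
};

/* Toy stand-in for the software-event (kretprobe) side. */
struct toy_event {
	struct toy_pmu *branch_stack_pmu;      /* direct pointer to the pmu */
};

static int toy_snapshot(struct toy_pmu *pmu, struct branch_entry *out, int max)
{
	int n;

	assert(out != NULL);
	pmu->disabled = 1;                     /* freeze first, then read */
	n = pmu->nr < max ? pmu->nr : max;
	memcpy(out, pmu->lbr, (size_t)n * sizeof(*out));
	pmu->disabled = 0;                     /* resume recording */
	return n;
}

/* Called when the software-event triggers; returns entries copied. */
static int software_event_trigger(struct toy_event *ev,
				  struct branch_entry *out, int max)
{
	if (!ev->branch_stack_pmu)
		return 0;                      /* no pmu attached, nothing to read */
	return ev->branch_stack_pmu->snapshot_branch_stack(
			ev->branch_stack_pmu, out, max);
}
```

In this model the event never looks at the VLBR-event itself, only at the
pmu, which mirrors the argument that the pmu outlives the VLBR-event.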

The BPF case will be similar to this. We will replace the software-event with
a BPF program, and still need the VLBR-event on each CPU.

Another question is how we create the VLBR-event. One way is to create
it in user space, and somehow pass the information to the software-event
or the BPF program. Another approach is to create it in the kernel with
perf_event_create_kernel_counter(). Which of the two do you like better?

