From: Alexei Starovoitov <ast@fb.com>
To: Song Liu <songliubraving@fb.com>, Peter Zijlstra <peterz@infradead.org>
Cc: "open list:BPF (Safe dynamic programs and tools)"
<bpf@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Ingo Molnar <mingo@redhat.com>, kajoljain <kjain@linux.ibm.com>,
Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v3 bpf-next 2/3] bpf: introduce helper bpf_get_branch_snapshot
Date: Tue, 31 Aug 2021 14:37:19 -0700 [thread overview]
Message-ID: <651cce25-e354-1486-c6b6-7cca4a37e3b7@fb.com> (raw)
In-Reply-To: <F7738B87-1CD3-40F8-9278-DC69E9AB0395@fb.com>
On 8/31/21 2:24 PM, Song Liu wrote:
>
>
>> On Aug 31, 2021, at 9:41 AM, Song Liu <songliubraving@fb.com> wrote:
>>
>>
>>
>>> On Aug 31, 2021, at 8:32 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>> On Mon, Aug 30, 2021 at 02:41:05PM -0700, Song Liu wrote:
>>>
>>>> @@ -564,6 +565,18 @@ static void notrace inc_misses_counter(struct bpf_prog *prog)
>>>> u64 notrace __bpf_prog_enter(struct bpf_prog *prog)
>>>> __acquires(RCU)
>>>> {
>>> preempt_disable_notrace();
>>>
>>>> +#ifdef CONFIG_PERF_EVENTS
>>>> + /* Calling migrate_disable costs two entries in the LBR. To save
>>>> + * some entries, we call perf_snapshot_branch_stack before
>>>> + * migrate_disable to save some entries. This is OK because we
>>>> + * care about the branch trace before entering the BPF program.
>>>> + * If migrate happens exactly here, there isn't much we can do to
>>>> + * preserve the data.
>>>> + */
>>>> + if (prog->call_get_branch)
>>>> + static_call(perf_snapshot_branch_stack)(
>>>> + this_cpu_ptr(&bpf_perf_branch_snapshot));
>>>
>>> Here the comment is accurate, but if you recall the calling context
>>> requirements of perf_snapshot_branch_stack from the last patch, you'll
>>> see it requires you have at the very least preemption disabled, which
>>> you just violated.
>>
>>>
>>> I think you'll find that (on x86 at least) the suggested
>>> preempt_disable_notrace() incurs no additional branches.
>>>
>>> Still, there is the next point to consider...
>>>
>>>> +#endif
>>>> rcu_read_lock();
>>>> migrate_disable();
>>>
>>> preempt_enable_notrace();
>>
>> Do we want preempt_enable_notrace() after migrate_disable()? It feels a
>> little weird to me.
>>
>>>
>>>> if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) {
>>>
>>>> @@ -1863,9 +1892,23 @@ void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
>>>> preempt_enable();
>>>> }
>>>>
>>>> +DEFINE_PER_CPU(struct perf_branch_snapshot, bpf_perf_branch_snapshot);
>>>> +
>>>> static __always_inline
>>>> void __bpf_trace_run(struct bpf_prog *prog, u64 *args)
>>>> {
>>>> +#ifdef CONFIG_PERF_EVENTS
>>>> + /* Calling migrate_disable costs two entries in the LBR. To save
>>>> + * some entries, we call perf_snapshot_branch_stack before
>>>> + * migrate_disable to save some entries. This is OK because we
>>>> + * care about the branch trace before entering the BPF program.
>>>> + * If migrate happens exactly here, there isn't much we can do to
>>>> + * preserve the data.
>>>> + */
>>>> + if (prog->call_get_branch)
>>>> + static_call(perf_snapshot_branch_stack)(
>>>> + this_cpu_ptr(&bpf_perf_branch_snapshot));
>>>> +#endif
>>>> cant_sleep();
>>>
>>> In the face of ^^^^^^ the comment makes no sense. Still, what are the
>>> nesting rules for __bpf_trace_run() and __bpf_prog_enter() ? I'm
>>> thinking the trace one can nest inside an occurence of prog, at which
>>> point you have pieces.
>>
>> I think broken LBR records is something we cannot really avoid in case
>> of nesting. OTOH, these should be rare cases and will not hurt the results
>> in most the use cases.
>>
>> I should probably tighten the rules in verifier to only apply it for
>> __bpf_prog_enter (where we have the primary use case). We can enable it
>> for other program types when there are other use cases.
>
> Update about some offline discussion with Alexei and Andrii. We are planning
> to move static_call(perf_snapshot_branch_stack) to inside the helper
> bpf_get_branch_snapshot. This change has a few benefit:
>
> 1. No need for extra check (prog->call_get_branch) before every program (even
> when the program doesn't use the helper).
>
> 2. No need to duplicate code of different BPF program hook.
> 3. BPF program always run with migrate_disable(), so it is not necessary to
> run add extra preempt_disable_notrace.
>
> It does flushes a few more LBR entries. But the result seems ok:
>
> ID: 0 from intel_pmu_lbr_disable_all+58 to intel_pmu_lbr_disable_all+93
> ID: 1 from intel_pmu_lbr_disable_all+54 to intel_pmu_lbr_disable_all+58
> ID: 2 from intel_pmu_snapshot_branch_stack+88 to intel_pmu_lbr_disable_all+0
> ID: 3 from bpf_get_branch_snapshot+28 to intel_pmu_snapshot_branch_stack+0
> ID: 4 from <bpf_tramepoline> to bpf_get_branch_snapshot+0
> ID: 5 from <bpf_tramepoline> to <bpf_tramepoline>
> ID: 6 from __bpf_prog_enter+34 to <bpf_tramepoline>
> ID: 7 from migrate_disable+60 to __bpf_prog_enter+9
> ID: 8 from __bpf_prog_enter+4 to migrate_disable+0
If we make migrate_disable 'static inline' it will save these 2 entries.
It's probably worth doing regardless, since it will be immediate
performance benefit for all bpf programs.
> ID: 9 from __bpf_prog_enter+4 to __bpf_prog_enter+0
> ID: 10 from bpf_fexit_loop_test1+22 to __bpf_prog_enter+0
> ID: 11 from bpf_fexit_loop_test1+20 to bpf_fexit_loop_test1+13
> ID: 12 from bpf_fexit_loop_test1+20 to bpf_fexit_loop_test1+13
> ID: 13 from bpf_fexit_loop_test1+20 to bpf_fexit_loop_test1+13
> ID: 14 from bpf_fexit_loop_test1+20 to bpf_fexit_loop_test1+13
> ID: 15 from bpf_fexit_loop_test1+20 to bpf_fexit_loop_test1+13
>
> We can save more by inlining intel_pmu_lbr_disable_all(). But it is probably
> not necessary at the moment.
>
> Thanks,
> Song
>
>
>
next prev parent reply other threads:[~2021-08-31 21:37 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-30 21:41 [PATCH v3 bpf-next 0/3] bpf: introduce bpf_get_branch_snapshot Song Liu
2021-08-30 21:41 ` [PATCH v3 bpf-next 1/3] perf: enable branch record for software events Song Liu
2021-08-30 22:05 ` Andrii Nakryiko
2021-08-31 15:24 ` Peter Zijlstra
2021-08-31 16:12 ` Song Liu
2021-08-30 21:41 ` [PATCH v3 bpf-next 2/3] bpf: introduce helper bpf_get_branch_snapshot Song Liu
2021-08-30 22:14 ` Andrii Nakryiko
2021-08-31 11:16 ` kernel test robot
2021-08-31 15:32 ` Peter Zijlstra
2021-08-31 16:41 ` Song Liu
2021-08-31 21:24 ` Song Liu
2021-08-31 21:37 ` Alexei Starovoitov [this message]
2021-08-30 21:41 ` [PATCH v3 bpf-next 3/3] selftests/bpf: add test for bpf_get_branch_snapshot Song Liu
2021-08-30 22:28 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=651cce25-e354-1486-c6b6-7cca4a37e3b7@fb.com \
--to=ast@fb.com \
--cc=Kernel-team@fb.com \
--cc=acme@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=kjain@linux.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=songliubraving@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).