From: "Wangnan (F)" <wangnan0@huawei.com>
To: Peter Zijlstra <peterz@infradead.org>, xiakaixu <xiakaixu@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>, <davem@davemloft.net>,
	<acme@kernel.org>, <mingo@redhat.com>,
	<masami.hiramatsu.pt@hitachi.com>, <jolsa@kernel.org>,
	<daniel@iogearbox.net>, <linux-kernel@vger.kernel.org>,
	<pi3orama@163.com>, <hekuang@huawei.com>,
	<netdev@vger.kernel.org>
Subject: Re: [PATCH V5 1/1] bpf: control events stored in PERF_EVENT_ARRAY maps trace data output when perf sampling
Date: Wed, 21 Oct 2015 19:49:34 +0800	[thread overview]
Message-ID: <56277BCE.6030400@huawei.com> (raw)
In-Reply-To: <20151021113316.GM17308@twins.programming.kicks-ass.net>



On 2015/10/21 19:33, Peter Zijlstra wrote:
> On Wed, Oct 21, 2015 at 06:31:04PM +0800, xiakaixu wrote:
>
>> The RFC patch set contains the necessary commit log [1].
> That's of course the wrong place, this should be in the patch's
> Changelog. It doesn't become less relevant.
>
>> In some scenarios we don't want to output trace data while perf is
>> sampling, in order to reduce overhead. For example, perf can be run as
>> a daemon that dumps trace data only when necessary, such as when system
>> performance goes down. As in the example given in the cover letter, we
>> only receive the samples taken within the sys_write() syscall.
>>
>> The helper bpf_perf_event_control() in this patch set can control the
>> data output process so we get the samples we are most interested in.
>> Calling cpu_function_call() is probably too much to do from a bpf
>> program, so I chose the current 'soft_disable'-like design.
> So, IIRC, we already require eBPF perf events to be CPU-local, which
> obviates the entire need for IPIs.

But soft-disable/enable doesn't require an IPI either, because it is
only a memory store operation.
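
To show why no IPI is needed, here is a rough sketch in plain kernel C
(illustrative names only, not the code in Kaixu's patch): the
sample-output path only reads a flag, so the helper called from the BPF
program just has to store to that flag, which works from any CPU and is
NMI safe.

        #include <linux/atomic.h>
        #include <linux/types.h>

        /* Illustrative only: a per-event "soft enable" switch. */
        struct soft_ctl {
                atomic_t enabled;
        };

        /* Checked in the overflow/output path; a plain read, NMI safe. */
        static bool soft_output_allowed(struct soft_ctl *ctl)
        {
                return atomic_read(&ctl->enabled) != 0;
        }

        /* What the BPF helper effectively does: a memory store, no IPI,
         * no matter which CPU the target event lives on. */
        static void soft_set(struct soft_ctl *ctl, bool on)
        {
                atomic_set(&ctl->enabled, on ? 1 : 0);
        }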

> So calling pmu->stop() seems entirely possible (its even NMI safe).

But we need to turn sampling off across CPUs. Please have a look
at my other email.

> This, however, does not explain if you need nesting, your patch seemed
> to have a counter, which suggest you do.
To avoid racing.

Suppose our task is to sample cycle events only while a particular
function is running, and executions of that function overlap across cores:

Time:   ...................A
Core 0: sys_write----\
                       \
                        \
Core 1:             sys_write%return
Core 2: ................sys_write

Then, without a counter, at time A it is quite possible that the
BPF programs on core 1 and core 2 conflict with each other: the
return program on core 1 turns events off while core 2 is still
inside sys_write. The final result is that some of those events end
up turned on and others turned off. Using an atomic counter avoids
this problem.
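
To make the role of the counter concrete, a rough sketch in plain kernel
C (names and structure are illustrative only, not the helper from this
patch): the entry and return programs only increment and decrement a
counter, and output is allowed whenever the counter is non-zero, so no
core's return can switch sampling off while another core is still inside
the function.

        #include <linux/atomic.h>
        #include <linux/types.h>

        /* cores currently inside sys_write */
        static atomic_t nr_inside = ATOMIC_INIT(0);

        /* entry probe: one more core is now inside the window of interest */
        static void sys_write_entry_prog(void)
        {
                atomic_inc(&nr_inside);
        }

        /* return probe: one core left; output stays enabled while any remain */
        static void sys_write_return_prog(void)
        {
                atomic_dec(&nr_inside);
        }

        /* checked on the sample-output path: emit samples only while the
         * counter is non-zero, so the return on core 1 at time A cannot
         * mute the samples core 2 still wants */
        static bool sample_output_allowed(void)
        {
                return atomic_read(&nr_inside) > 0;
        }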

Thank you.

>
> In any case, you could add perf_event_{stop,start}_local() to mirror the
> existing perf_event_read_local(), no? That would stop the entire thing
> and reduce even more overhead than simply skipping the overflow handler.
>



