kernelnewbies.kernelnewbies.org archive mirror
From: Theodore Dubois <tbodt@google.com>
To: "Valdis Klētnieks" <valdis.kletnieks@vt.edu>
Cc: kernelnewbies@kernelnewbies.org, a.p.zijlstra@chello.nl,
	linux-kernel@vger.kernel.org
Subject: Re: perf_event wakeup_events = 0
Date: Sat, 7 Sep 2019 16:27:39 -0700
Message-ID: <CDBCD40D-F873-4EE9-96A9-F8AF105921E9@google.com>
In-Reply-To: <972489.1567896331@turing-police>

On Sep 7, 2019, at 3:45 PM, Valdis Klētnieks <valdis.kletnieks@vt.edu> wrote:

> So an entry is made in the buffer. It's not clear that this immediately triggers
> a signal…

I think the documentation says it does when wakeup_events is 1. The code for
perf backs this up:
https://github.com/torvalds/linux/blob/a9815a4fa2fd297cab9fa7a12161b16657290293/tools/perf/util/evsel.c#L1051-L1054
The puzzle is what happens when wakeup_events is 0. The documentation saying
"more recent kernels treat 0 the same as 1" suggests it should behave the same,
but then why would perf set it to 1 after zero-initializing it?

> So you need to look at what size mmap buffer is being allocated.  It's *probably*
> on the order of megabytes, so that you can buffer a fairly large number of entries
> and not take several user/kernel transitions on every single entry…

It’s 512 KiB. Each sample is 40 bytes: the sample_type is IP | TID | TIME |
PERIOD, each of those fields is 8 bytes, and the record header adds another 8.
40 bytes per sample * 4000 samples per second * 1.637 seconds is 261920 bytes,
which is almost exactly half the buffer (262144 bytes).
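
To make that arithmetic easy to re-check, here is a minimal Python sketch. The PERF_SAMPLE_* bit values are the ones in include/uapi/linux/perf_event.h. The helper name sample_record_size is hypothetical, it only handles the four bits used here, and it assumes every record starts with an 8-byte struct perf_event_header.

```python
# Bit values from include/uapi/linux/perf_event.h (enum perf_event_sample_format).
PERF_SAMPLE_IP = 1 << 0
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_TIME = 1 << 2
PERF_SAMPLE_PERIOD = 1 << 8

# Assumption: each record is preceded by struct perf_event_header
# (u32 type, u16 misc, u16 size), which is 8 bytes.
HEADER_BYTES = 8

# Each of these fields takes 8 bytes in a sample record
# (TID is stored as two u32 values, pid and tid).
FIELD_BYTES = {
    PERF_SAMPLE_IP: 8,
    PERF_SAMPLE_TID: 8,
    PERF_SAMPLE_TIME: 8,
    PERF_SAMPLE_PERIOD: 8,
}


def sample_record_size(sample_type):
    """Hypothetical helper: bytes per sample record for the bits handled above."""
    known_mask = 0
    for bit in FIELD_BYTES:
        known_mask |= bit
    if sample_type & ~known_mask:
        raise ValueError("sample_type contains bits this sketch does not handle")
    body = sum(size for bit, size in FIELD_BYTES.items() if sample_type & bit)
    return HEADER_BYTES + body


sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME | PERF_SAMPLE_PERIOD
record_bytes = sample_record_size(sample_type)

# Use integer milliseconds so the product is exact.
sample_freq_hz = 4000
elapsed_ms = 1637
bytes_written = record_bytes * sample_freq_hz * elapsed_ms // 1000

half_buffer = 512 * 1024 // 2

print(record_bytes, bytes_written, half_buffer)
```

This treats the buffer as holding nothing but sample records, which is a simplification: other record types also land in the same ring buffer.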

So does wakeup_events = 0 mean that a wakeup happens when the buffer is half
full? I don't see anything in the man page about this.
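
One way to test the half-full guess against the strace timing is to compute how long the buffer would take to reach a given fill level. This is only a sketch: it assumes the sampling rate is exactly 4000 Hz from the start, that only 40-byte sample records are written, and the function name seconds_until_fill is made up for illustration.

```python
BUFFER_BYTES = 512 * 1024   # data area size reported above
SAMPLE_BYTES = 40           # bytes per sample record
SAMPLE_FREQ_HZ = 4000       # sample_freq with freq = 1


def seconds_until_fill(fraction):
    """Hypothetical helper: time for the buffer to reach `fraction` full."""
    bytes_per_second = SAMPLE_BYTES * SAMPLE_FREQ_HZ
    return BUFFER_BYTES * fraction / bytes_per_second


# Expected delay if every sample woke the poller (wakeup_events = 1).
per_sample_delay = 1 / SAMPLE_FREQ_HZ

# Expected delay if the wakeup only fires at the half-full mark.
half_full_delay = seconds_until_fill(0.5)

observed_delay = 1.637  # seconds, from the poll() timing in the strace output

print(per_sample_delay, half_full_delay, observed_delay)
```

The half-full prediction comes out within a couple of milliseconds of the observed poll delay, so the observation is consistent with the guess. It does not by itself show what the kernel actually does.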

If you'd like to try yourself, this is the strace command I've been using:
strace -ttTv -eperf_event_open,mmap,poll -operf.strace perf record stress --cpu 1 --timeout 1

~Theodore

> 
> On Sat, 07 Sep 2019 09:14:49 -0700, Theodore Dubois said:
> 
> Reading what it actually says rather than what I thought it said.. :)
> 
>       Events come in two flavors: counting and sampled.  A counting event  is
>       one  that  is  used  for  counting  the aggregate number of events that
>       occur.  In general, counting event results are gathered with a  read(2)
>       call.   A  sampling  event periodically writes measurements to a buffer
>       that can then be accessed via mmap(2).
> 
> For some reason, I was thinking counting events.  -ENOCAFFEINE. :)
> 
>> sample_freq is 4000 (and freq is 1). Here’s the man page on this field:
>> 
>>       sample_period, sample_freq
>>              A "sampling" event is one that generates an  overflow  notifica‐
>>              tion  every N events, where N is given by sample_period.  A sam‐
>>              pling event has sample_period > 0.
> 
> There's this part:
>>              pling event has sample_period > 0.   When  an  overflow  occurs,
>>              requested  data is recorded in the mmap buffer.  The sample_type
>>              field controls what data is recorded on each overflow.
> 
> So an entry is made in the buffer. It's not clear that this immediately triggers
> a signal...
> 
>   MMAP layout
>       When using perf_event_open() in sampled mode, asynchronous events (like
>       counter overflow or PROT_EXEC mmap tracking) are logged  into  a  ring-
>       buffer.  This ring-buffer is created and accessed through mmap(2).
> 
>       The mmap size should be 1+2^n pages, where the first page is a metadata
>       page (struct perf_event_mmap_page) that contains various bits of
>       information such as where the ring-buffer head is.
> 
> So you need to look at what size mmap buffer is being allocated.  It's *probably*
> on the order of megabytes, so that you can buffer a fairly large number of entries
> and not take several user/kernel transitions on every single entry...
> 
>> If I’m reading this right, this is a sampling event which overflows 4000 times a second.
> 
> And 4,000 entries are made in the buffer per second..
> 
>> But perf then does a poll call which wakes up on this FD with POLLIN after
>> 1.637 seconds, instead of 0.00025 seconds
> 
> At which point perf goes and looks at several thousand entries in the ring buffer...
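
The "1+2^n pages" rule from the quoted MMAP layout text can be sketched as below. The helper name perf_mmap_length is hypothetical, the 4096-byte page size is an assumption, and so is reading "512 KiB" as the size of the data area rather than the full mapping.

```python
def perf_mmap_length(data_bytes, page_size=4096):
    """Hypothetical helper: mmap length for one metadata page plus 2^n data pages."""
    data_pages, remainder = divmod(data_bytes, page_size)
    is_power_of_two = data_pages >= 1 and (data_pages & (data_pages - 1)) == 0
    if remainder or not is_power_of_two:
        raise ValueError("data area must be a power-of-two number of pages")
    # One extra page in front holds struct perf_event_mmap_page.
    return (1 + data_pages) * page_size


# 512 KiB of data is 128 pages (2^7), so the mapping is 129 pages long.
length = perf_mmap_length(512 * 1024)
print(length)
```

Under those assumptions, this is the length one would expect to see in the mmap call that follows perf_event_open in the strace output.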


_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Thread overview: 6+ messages
2019-09-06 23:28 perf_event wakeup_events = 0 Theodore Dubois
2019-09-07 13:40 ` Valdis Klētnieks
2019-09-07 16:14   ` Theodore Dubois
2019-09-07 22:00     ` Valdis Klētnieks
2019-09-07 22:45     ` Valdis Klētnieks
2019-09-07 23:27       ` Theodore Dubois [this message]
