From: William Cohen <wcohen@redhat.com>
To: Elazar Leibovich <elazar.leibovich@ravellosystems.com>
Cc: linux-perf-users@vger.kernel.org, Stephane Eranian <eranian@google.com>
Subject: Re: Why the need to do a perf_event_open syscall for each cpu on the system?
Date: Mon, 16 Mar 2015 10:47:22 -0400
Message-ID: <5506ECFA.40305@redhat.com>
In-Reply-To: <CAL2Y34DEW+HvF6tsFH5qg-RxkAmstEcK=qYhkcRV6D8DCFG3kg@mail.gmail.com>

On 03/15/2015 01:15 AM, Elazar Leibovich wrote:
> Hi,
> 
> Not an expert, but my understanding is that it's just a technical
> difficulty. Performance metrics are saved in per-cpu buffers.
> Having pid==-1 and cpu==-1 means that something would have to aggregate
> the buffers of multiple CPUs into a single buffer. That code must exist,
> either in userspace or in the kernel.
> 
> The kernel developers preferred that this code live in userspace.

Hi Elazar,

I suspected the reasoning was something along those lines.  I was hoping that someone could point to archived email threads with earlier discussions showing the complications that would arise from having system-wide perf event setup and reading handled in the kernel.  Looking through the earlier versions of perf, I see that the combination of pid==-1 and cpu==-1 was not allowed in the very early proposed patches (http://thread.gmane.org/gmane.linux.kernel.cross-arch/2578).  However, there is not much explanation of the design tradeoffs in there.

Making user-space set up performance events for each cpu certainly simplifies the kernel code for system-wide monitoring.  The cgroup support is essentially system-wide monitoring with additional filtering on the cgroup, and things get more complicated when the cgroups are not pinned to a particular processor: monitoring then takes O(cgroups*cpus) opens and reads.  If the number of cgroups scales up at the same rate as the number of cpus, this becomes O(cpus^2).  I am wondering whether handling the system-wide case (pid==-1 and cpu==-1) in the kernel would make cgroup and system-wide monitoring more efficient, or whether the complications in the kernel are just too much.
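
To make the cost concrete, here is a rough sketch of what user-space has to do today for one system-wide count: one perf_event_open per cpu with pid==-1, then one read per descriptor, then a sum.  It is only an illustration under several assumptions.  The helper names (open_counter_on_cpu, read_count, sum_counts, opens_required) are made up for this example, the syscall numbers are filled in only for x86_64 and aarch64, the cpu list is taken from the process affinity mask rather than the set of online cpus, and the event is a software context-switch counter chosen because it needs no PMU.  Opening events with pid==-1 typically needs root or a permissive perf_event_paranoid setting.

```python
import ctypes
import os
import platform
import struct
import time

# perf_event_open(2) syscall numbers; assumption: only these two are listed.
SYSCALL_NR = {"x86_64": 298, "aarch64": 241}

PERF_TYPE_SOFTWARE = 1              # from linux/perf_event.h
PERF_COUNT_SW_CONTEXT_SWITCHES = 3  # from linux/perf_event.h


class PerfEventAttr(ctypes.Structure):
    """First 64 bytes (PERF_ATTR_SIZE_VER0) of struct perf_event_attr."""
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("size", ctypes.c_uint32),
        ("config", ctypes.c_uint64),
        ("sample_period", ctypes.c_uint64),
        ("sample_type", ctypes.c_uint64),
        ("read_format", ctypes.c_uint64),
        ("flags", ctypes.c_uint64),        # bitfield; 0 = enabled, no filters
        ("wakeup_events", ctypes.c_uint32),
        ("bp_type", ctypes.c_uint32),
        ("config1", ctypes.c_uint64),
    ]


def open_counter_on_cpu(cpu):
    """Open one counting event for all tasks (pid == -1) on a single cpu."""
    nr = SYSCALL_NR.get(platform.machine())
    if nr is None:
        raise OSError("perf_event_open syscall number not listed for this arch")
    attr = PerfEventAttr(type=PERF_TYPE_SOFTWARE,
                         size=ctypes.sizeof(PerfEventAttr),
                         config=PERF_COUNT_SW_CONTEXT_SWITCHES)
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    fd = libc.syscall(ctypes.c_long(nr), ctypes.byref(attr),
                      ctypes.c_long(-1),    # pid: every task
                      ctypes.c_long(cpu),   # cpu: must name a real cpu here
                      ctypes.c_long(-1),    # group_fd: not part of a group
                      ctypes.c_ulong(0))    # flags
    if fd < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return fd


def read_count(fd):
    """With read_format == 0, a read returns a single 64-bit count."""
    return struct.unpack("Q", os.read(fd, 8))[0]


def sum_counts(per_cpu_counts):
    """The aggregation step that is left to user-space."""
    return sum(per_cpu_counts.values())


def opens_required(n_cpus, n_cgroups=1):
    """Opens (and reads) needed: one per cpu, per monitored cgroup."""
    return n_cpus * n_cgroups


if __name__ == "__main__":
    cpus = sorted(os.sched_getaffinity(0))  # assumption: stands in for online cpus
    print("opens needed for 1 event:", opens_required(len(cpus)))
    try:
        fds = {cpu: open_counter_on_cpu(cpu) for cpu in cpus}
    except OSError as exc:
        print("perf_event_open failed:", exc)
    else:
        time.sleep(1.0)
        counts = {cpu: read_count(fd) for cpu, fd in fds.items()}
        for fd in fds.values():
            os.close(fd)
        print("context switches, all cpus:", sum_counts(counts))
```

With 8 cpus and 8 unpinned cgroups, opens_required(8, n_cgroups=8) gives 64 descriptors for a single event, which is the O(cgroups*cpus) growth described above.  The real cgroup mode also needs a cgroup directory descriptor and the PERF_FLAG_PID_CGROUP flag, which this sketch leaves out.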

-Will
>
> On Fri, Mar 13, 2015 at 8:49 PM, William Cohen <wcohen@redhat.com> wrote:
>> Hi All,
>>
>> I have a design question about the linux kernel perf support. A number of /proc statistics aggregate data across all the cpus in the system.  Why does perf require the user-space application to enumerate all the processors and do a perf_event_open syscall for each of the processors?  Why not have a perf_event_open with pid==-1 and cpu==-1 mean a system-wide event and aggregate it in the kernel when the value is read?  The line below from design.txt specifically says it is invalid.
>>
>> (Note: the combination of 'pid == -1' and 'cpu == -1' is not valid.)
>>
>> -Will
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-perf-users" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thread overview: 7+ messages
2015-03-13 18:49 Why the need to do a perf_event_open syscall for each cpu on the system? William Cohen
2015-03-13 21:14 ` Vince Weaver
2015-03-15  5:15 ` Elazar Leibovich
2015-03-16 14:47   ` William Cohen [this message]
2015-03-17  0:51     ` Stephane Eranian
2015-03-17 14:40     ` Andi Kleen
2015-03-17 15:30       ` William Cohen