linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Jin, Yao" <yao.jin@linux.intel.com>
To: Jiri Olsa <jolsa@redhat.com>
Cc: acme@kernel.org, jolsa@kernel.org, peterz@infradead.org,
	mingo@redhat.com, alexander.shishkin@linux.intel.com,
	Linux-kernel@vger.kernel.org, ak@linux.intel.com,
	kan.liang@intel.com, yao.jin@intel.com
Subject: Re: [PATCH] perf evsel: Get group fd from CPU0 for system wide event
Date: Fri, 15 May 2020 14:04:57 +0800	[thread overview]
Message-ID: <68e53765-6f45-9483-7543-0a2f961cdc62@linux.intel.com> (raw)
In-Reply-To: <3e813227-4954-0d4b-bc7a-ca272b18454a@linux.intel.com>

Hi Jiri,

On 5/9/2020 3:37 PM, Jin, Yao wrote:
> Hi Jiri,
> 
> On 5/5/2020 8:03 AM, Jiri Olsa wrote:
>> On Sat, May 02, 2020 at 10:33:59AM +0800, Jin, Yao wrote:
>>
>> SNIP
>>
>>>>> @@ -1461,6 +1461,9 @@ static int get_group_fd(struct evsel *evsel, int cpu, int thread)
>>>>>        BUG_ON(!leader->core.fd);
>>>>>        fd = FD(leader, cpu, thread);
>>>>> +    if (fd == -1 && leader->core.system_wide)
>>>>
>>>> fd does not need to be -1 in here.. in my setup cstate_pkg/c2-residency/
>>>> has cpumask 0, so other cpus never get open and are 0, and the whole thing
>>>> ends up with:
>>>>
>>>>     sys_perf_event_open: pid -1  cpu 1  group_fd 0  flags 0
>>>>     sys_perf_event_open failed, error -9
>>>>
>>>> I actualy thought we put -1 to fd array but couldn't find it.. perhaps we should od that
>>>>
>>>>
>>>
>>> I have tested on two platforms. On KBL desktop fd is 0 for this case, but on
>>> oncascadelakex server, fd is -1, so the BUG_ON(fd == -1) is triggered.
>>>
>>>>> +        fd = FD(leader, 0, thread);
>>>>> +
>>>>
>>>> so how do we group following events?
>>>>
>>>>     cstate_pkg/c2-residency/ - cpumask 0
>>>>     msr/tsc/                 - all cpus
>>>>
>>>
>>> Not sure if it's enough to only use cpumask 0 because
>>> cstate_pkg/c2-residency/ should be per-socket.
>>>
>>>> cpu 0 is fine.. the rest I have no idea ;-)
>>>>
>>>
>>> Perhaps we directly remove the BUG_ON(fd == -1) assertion?
>>
>> I think we need to make clear how to deal with grouping over
>> events that comes for different cpus
>>
>>     so how do we group following events?
>>
>>        cstate_pkg/c2-residency/ - cpumask 0
>>        msr/tsc/                 - all cpus
>>
>>
>> what's the reason/expected output of groups with above events?
>> seems to make sense only if we limit msr/tsc/ to cpumask 0 as well
>>
>> jirka
>>
> 
> On 2-socket machine (e.g cascadelakex), "cstate_pkg/c2-residency/" is per-socket event and the 
> cpumask is 0 and 24.
> 
> root@lkp-csl-2sp5 /sys/devices/cstate_pkg# cat cpumask
> 0,24
> 
> We can't limit it to cpumask 0. It should be programmed on CPU0 and CPU24 (the first CPU on each 
> socket).
> 
> The "msr/tsc" are per-cpu event, it should be programmed on all cpus. So I don't think we can limit 
> msr/tsc to cpumask 0.
> 
> The issue is how we deal with get_group_fd().
> 
> static int get_group_fd(struct evsel *evsel, int cpu, int thread)
> {
>          struct evsel *leader = evsel->leader;
>          int fd;
> 
>          if (evsel__is_group_leader(evsel))
>                  return -1;
> 
>          /*
>           * Leader must be already processed/open,
>           * if not it's a bug.
>           */
>          BUG_ON(!leader->core.fd);
> 
>          fd = FD(leader, cpu, thread);
>          BUG_ON(fd == -1);
> 
>          return fd;
> }
> 
> When evsel is "msr/tsc/",
> 
> FD(leader, 0, 0) is 3 (3 is the fd of "cstate_pkg/c2-residency/" on CPU0)
> FD(leader, 1, 0) is -1
> BUG_ON asserted.
> 
> If we just return group_fd(-1) for "msr/tsc", it looks like it's not a problem, is it?
> 
> Thanks
> Jin Yao

I think I get the root cause. That should be a serious bug in get_group_fd, access violation!

For a group mixed with system-wide event and per-core event and the group leader is system-wide 
event, access violation will happen.

perf_evsel__alloc_fd allocates one FD member for system-wide event (only FD(evsel, 0, 0) is valid).

But for per core event, perf_evsel__alloc_fd allocates N FD members (N = ncpus). For example, for 
ncpus is 8, FD(evsel, 0, 0) to FD(evsel, 7, 0) are valid.

get_group_fd(struct evsel *evsel, int cpu, int thread)
{
     struct evsel *leader = evsel->leader;

     fd = FD(leader, cpu, thread);    /* access violation may happen here */
}

If leader is system-wide event, only the FD(leader, 0, 0) is valid.

When get_group_fd accesses FD(leader, 1, 0), access violation happens.

My fix is:

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 28683b0eb738..db05b8a1e1a8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1440,6 +1440,9 @@ static int get_group_fd(struct evsel *evsel, int cpu, int thread)
         if (evsel__is_group_leader(evsel))
                 return -1;

+       if (leader->core.system_wide && !evsel->core.system_wide)
+               return -2;
+
         /*
          * Leader must be already processed/open,
          * if not it's a bug.
@@ -1665,6 +1668,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
                                 pid = perf_thread_map__pid(threads, thread);

                         group_fd = get_group_fd(evsel, cpu, thread);
+                       if (group_fd == -2) {
+                               errno = EINVAL;
+                               err = -EINVAL;
+                               goto out_close;
+                       }
  retry_open:
                         test_attr__ready();

It enables the perf_evlist__reset_weak_group. And in the second_pass (in __run_perf_stat), the 
events will be opened successfully.

I have tested OK for this fix on cascadelakex.

Thanks
Jin Yao

  reply	other threads:[~2020-05-15  6:05 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-30  1:34 [PATCH] perf evsel: Get group fd from CPU0 for system wide event Jin Yao
2020-05-01 10:23 ` Jiri Olsa
2020-05-02  2:33   ` Jin, Yao
2020-05-05  0:03     ` Jiri Olsa
2020-05-09  7:37       ` Jin, Yao
2020-05-15  6:04         ` Jin, Yao [this message]
2020-05-15  8:33           ` Jiri Olsa
2020-05-18  3:28             ` Jin, Yao
2020-05-20  5:36               ` Jin, Yao
2020-05-20  7:50                 ` Jiri Olsa
2020-05-21  4:38                   ` Jin, Yao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=68e53765-6f45-9483-7543-0a2f961cdc62@linux.intel.com \
    --to=yao.jin@linux.intel.com \
    --cc=Linux-kernel@vger.kernel.org \
    --cc=acme@kernel.org \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=jolsa@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=kan.liang@intel.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=yao.jin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).