From: Jiri Olsa <jolsa@redhat.com> To: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org>, Arnaldo Carvalho de Melo <acme@kernel.org>, lkml <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@kernel.org>, Namhyung Kim <namhyung@kernel.org>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Peter Zijlstra <a.p.zijlstra@chello.nl>, Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>, Nageswara R Sastry <nasastry@in.ibm.com>, Ravi Bangoria <ravi.bangoria@linux.ibm.com> Subject: Re: [PATCHv2 5/8] perf tools: Get precise_ip from the pmu config Date: Thu, 14 Mar 2019 15:01:24 +0100 Message-ID: <20190314140124.GE4406@krava> (raw) In-Reply-To: <20190307165123.GE7535@tassilo.jf.intel.com> On Thu, Mar 07, 2019 at 08:51:23AM -0800, Andi Kleen wrote: > On Thu, Mar 07, 2019 at 04:35:00PM +0100, Jiri Olsa wrote: > > On Tue, Mar 05, 2019 at 08:40:17AM -0800, Andi Kleen wrote: > > > On Tue, Mar 05, 2019 at 05:28:54PM +0100, Jiri Olsa wrote: > > > > On Tue, Mar 05, 2019 at 08:13:19AM -0800, Andi Kleen wrote: > > > > > On Tue, Mar 05, 2019 at 04:25:33PM +0100, Jiri Olsa wrote: > > > > > > Getting precise_ip field from the perf_pmu::max_precise > > > > > > config read from sysfs. If it's not available falling > > > > > > back to current detection function. > > > > > > > > > > max_precise depends on the event. This won't work for all > > > > > events. For example only instructions and cycles support > > > > > ppp > > > > > > > > I'm getting precise_ip=3 on mem-* events as well, that's why I > > > > was fixing this.. now it's not working for any event > > > > > > I don't think it means anything for mem-* > > > > > > There's some support for it on Goldmont plus for other events, > > > but it doesn't support mem-*. On big core it's only > > > for instructions and cycles, all implemented with the same > > > event. All other PEBS events only have two levels > > > switching between the two IPs. > > > > ok, so how about this, it's the change I posted merged with the patch > > Still seems like a hack. Even though I don't know of a case that > would break now. But it would be better to have the precise probing > in the open retry loop, instead of trying to reinvent it here. how about something like below? jirka --- tools/perf/util/evlist.c | 29 ---------------- tools/perf/util/evlist.h | 2 -- tools/perf/util/evsel.c | 73 +++++++++++++++++++++++++++++++++------- 3 files changed, 60 insertions(+), 44 deletions(-) diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index ed20f4379956..3a243c249c29 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -230,35 +230,6 @@ void perf_evlist__set_leader(struct perf_evlist *evlist) } } -void perf_event_attr__set_max_precise_ip(struct perf_event_attr *pattr) -{ - struct perf_event_attr attr = { - .type = PERF_TYPE_HARDWARE, - .config = PERF_COUNT_HW_CPU_CYCLES, - .exclude_kernel = 1, - .precise_ip = 3, - }; - - event_attr_init(&attr); - - /* - * Unnamed union member, not supported as struct member named - * initializer in older compilers such as gcc 4.4.7 - */ - attr.sample_period = 1; - - while (attr.precise_ip != 0) { - int fd = sys_perf_event_open(&attr, 0, -1, -1, 0); - if (fd != -1) { - close(fd); - break; - } - --attr.precise_ip; - } - - pattr->precise_ip = attr.precise_ip; -} - int __perf_evlist__add_default(struct perf_evlist *evlist, bool precise) { struct perf_evsel *evsel = perf_evsel__new_cycles(precise); diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h index 744906dd4887..cb15e3366eba 100644 --- a/tools/perf/util/evlist.h +++ b/tools/perf/util/evlist.h @@ -303,8 +303,6 @@ void perf_evlist__to_front(struct perf_evlist *evlist, void perf_evlist__set_tracking_event(struct perf_evlist *evlist, struct perf_evsel *tracking_evsel); -void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr); - struct perf_evsel * perf_evlist__find_evsel_by_str(struct perf_evlist *evlist, const char *str); diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 3bbf73e979c0..7cd70f99201a 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -295,7 +295,6 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise) if (!precise) goto new_event; - perf_event_attr__set_max_precise_ip(&attr); /* * Now let the usual logic to set up the perf_event_attr defaults * to kick in when we return and before perf_evsel__open() is called. @@ -305,6 +304,8 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise) if (evsel == NULL) goto out; + evsel->precise_max = true; + /* use asprintf() because free(evsel) assumes name is allocated */ if (asprintf(&evsel->name, "cycles%s%s%.*s", (attr.precise_ip || attr.exclude_kernel) ? ":" : "", @@ -1083,7 +1084,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts, } if (evsel->precise_max) - perf_event_attr__set_max_precise_ip(attr); + attr->precise_ip = 3; if (opts->all_user) { attr->exclude_kernel = 1; @@ -1749,6 +1750,60 @@ static bool ignore_missing_thread(struct perf_evsel *evsel, return true; } +static void display_attr(struct perf_event_attr *attr) +{ + if (verbose >= 2) { + fprintf(stderr, "%.60s\n", graph_dotted_line); + fprintf(stderr, "perf_event_attr:\n"); + perf_event_attr__fprintf(stderr, attr, __open_attr__fprintf, NULL); + fprintf(stderr, "%.60s\n", graph_dotted_line); + } +} + +static int perf_event_open(struct perf_evsel *evsel, + pid_t pid, int cpu, int group_fd, + unsigned long flags) +{ + int precise_ip = evsel->attr.precise_ip; + int fd; + + while (1) { + pr_debug2("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx", + pid, cpu, group_fd, flags); + + fd = sys_perf_event_open(&evsel->attr, pid, cpu, group_fd, flags); + if (fd >= 0) + break; + + /* + * Do quick precise_ip fallback if requested: + * - there is some precise_ip + * - maximum precise is requested + * - we failed with ENOTSUP error, which is + * associated with wrong precise_ip config + */ + if (!precise_ip || !evsel->precise_max || (errno != ENOTSUP)) + break; + + /* + * We tried all the precise_ip values, and it's + * still failing, so leave it to standard fallabck. + */ + if (!evsel->attr.precise_ip) { + evsel->attr.precise_ip = precise_ip; + break; + } + + pr_debug2("\nsys_perf_event_open failed, error %d\n", -ENOTSUP); + + evsel->attr.precise_ip--; + pr_debug2("decreasing precise_ip by one (%d)\n", evsel->attr.precise_ip); + display_attr(&evsel->attr); + } + + return fd; +} + int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus, struct thread_map *threads) { @@ -1824,12 +1879,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus, if (perf_missing_features.sample_id_all) evsel->attr.sample_id_all = 0; - if (verbose >= 2) { - fprintf(stderr, "%.60s\n", graph_dotted_line); - fprintf(stderr, "perf_event_attr:\n"); - perf_event_attr__fprintf(stderr, &evsel->attr, __open_attr__fprintf, NULL); - fprintf(stderr, "%.60s\n", graph_dotted_line); - } + display_attr(&evsel->attr); for (cpu = 0; cpu < cpus->nr; cpu++) { @@ -1841,13 +1891,10 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus, group_fd = get_group_fd(evsel, cpu, thread); retry_open: - pr_debug2("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx", - pid, cpus->map[cpu], group_fd, flags); - test_attr__ready(); - fd = sys_perf_event_open(&evsel->attr, pid, cpus->map[cpu], - group_fd, flags); + fd = perf_event_open(evsel, pid, cpus->map[cpu], + group_fd, flags); FD(evsel, cpu, thread) = fd; -- 2.17.2
next prev parent reply index Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa 2019-03-05 15:25 ` [PATCH 1/8] perf c2c: Fix c2c report for empty numa node Jiri Olsa 2019-03-09 20:05 ` [tip:perf/urgent] " tip-bot for Jiri Olsa 2019-03-05 15:25 ` [PATCH 2/8] perf tools: Add error path into hist_entry__init Jiri Olsa 2019-03-09 20:06 ` [tip:perf/urgent] perf hist: " tip-bot for Jiri Olsa 2019-03-05 15:25 ` [PATCH 3/8] perf hist: Fix memory leak of srcline Jiri Olsa 2019-03-09 20:07 ` [tip:perf/urgent] " tip-bot for Jiri Olsa 2019-03-05 15:25 ` [PATCH 4/8] perf tools: Read and store caps/max_precise in perf_pmu Jiri Olsa 2019-03-09 20:07 ` [tip:perf/urgent] " tip-bot for Jiri Olsa 2019-03-05 15:25 ` [PATCH 5/8] perf tools: Get precise_ip from the pmu config Jiri Olsa 2019-03-05 16:13 ` Andi Kleen 2019-03-05 16:28 ` Jiri Olsa 2019-03-05 16:40 ` Andi Kleen 2019-03-07 15:35 ` [PATCHv2 " Jiri Olsa 2019-03-07 16:51 ` Andi Kleen 2019-03-07 22:32 ` Jiri Olsa 2019-03-14 14:01 ` Jiri Olsa [this message] 2019-03-14 15:49 ` Andi Kleen 2019-03-15 12:15 ` [PATCH] perf tools: Move precise_ip detection into perf_evsel__open Jiri Olsa 2019-03-15 14:05 ` Andi Kleen 2019-03-15 14:35 ` Arnaldo Carvalho de Melo 2019-03-15 14:52 ` Jiri Olsa 2019-03-23 15:04 ` Jiri Olsa 2019-03-25 14:53 ` Arnaldo Carvalho de Melo 2019-03-05 15:25 ` [PATCH 6/8] perf tools: Probe for precise_ip with simple attr Jiri Olsa 2019-03-09 20:08 ` [tip:perf/urgent] perf evsel: " tip-bot for Jiri Olsa 2019-03-05 15:25 ` [PATCH 7/8] perf tools: Fix double free in perf_data__close Jiri Olsa 2019-03-09 20:09 ` [tip:perf/urgent] perf session: " tip-bot for Jiri Olsa 2019-03-05 15:25 ` [PATCH 8/8] perf tools: Force perf_data__open|close zero data->file.path Jiri Olsa 2019-03-09 20:09 ` [tip:perf/urgent] perf data: " tip-bot for Jiri Olsa
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20190314140124.GE4406@krava \ --to=jolsa@redhat.com \ --cc=a.p.zijlstra@chello.nl \ --cc=acme@kernel.org \ --cc=ak@linux.intel.com \ --cc=alexander.shishkin@linux.intel.com \ --cc=jolsa@kernel.org \ --cc=jonas.rabenstein@studium.uni-erlangen.de \ --cc=linux-kernel@vger.kernel.org \ --cc=mingo@kernel.org \ --cc=namhyung@kernel.org \ --cc=nasastry@in.ibm.com \ --cc=ravi.bangoria@linux.ibm.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
LKML Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \ linux-kernel@vger.kernel.org public-inbox-index lkml Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel AGPL code for this site: git clone https://public-inbox.org/public-inbox.git