bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ian Rogers <irogers@google.com>
To: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	 Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	 Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,  Namhyung Kim <namhyung@kernel.org>,
	Nick Terrell <terrelln@fb.com>,
	 Kan Liang <kan.liang@linux.intel.com>,
	Song Liu <song@kernel.org>,  Sandipan Das <sandipan.das@amd.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	 James Clark <james.clark@arm.com>,
	Liam Howlett <liam.howlett@oracle.com>,
	 Miguel Ojeda <ojeda@kernel.org>, Leo Yan <leo.yan@linaro.org>,
	 German Gomez <german.gomez@arm.com>,
	Ravi Bangoria <ravi.bangoria@amd.com>,
	 Artem Savkov <asavkov@redhat.com>,
	Athira Rajeev <atrajeev@linux.vnet.ibm.com>,
	 Andi Kleen <ak@linux.intel.com>,
	linux-kernel@vger.kernel.org,  linux-perf-users@vger.kernel.org,
	bpf@vger.kernel.org
Subject: Re: [PATCH v2 13/13] perf machine thread: Remove exited threads by default
Date: Mon, 23 Oct 2023 11:49:00 -0700	[thread overview]
Message-ID: <CAP-5=fX303b8-hQXWaRibtYFyBS2sQywDSKR6KhKMUMX_0gzrw@mail.gmail.com> (raw)
In-Reply-To: <c04c196a-3f29-4b89-97a7-5e71def3d3ce@intel.com>

On Mon, Oct 23, 2023 at 7:24 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 12/10/23 09:23, Ian Rogers wrote:
> > struct thread values hold onto references to mmaps, dsos, etc. When a
> > thread exits it is necessary to clean all of this memory up by
> > removing the thread from the machine's threads. Some tools require
> > this doesn't happen, such as perf report if offcpu events exist or if
> > a task list is being generated, so add a symbol_conf value to make the
> > behavior optional. When an exited thread is left in the machine's
> > threads, mark it as exited.
> >
> > This change relates to commit 40826c45eb0b ("perf thread: Remove
> > notion of dead threads"). Dead threads were removed as they had a
> > reference count of 0 and were difficult to reason about with the
> > reference count checker. Here a thread is removed from threads when it
> > exits, unless via symbol_conf the exited thread isn't remove and is
> > marked as exited. Reference counting behaves as it normally does.
>
> Can we exclude AUX area tracing?
>
> Essentially, the EXIT event happens when the task is still running
> in kernel mode, so the thread has not in fact fully exited.
>
> Example:
>
> # perf record -a --kcore -e intel_pt// uname
>
> Before:
>
> # perf script --itrace=b --show-task-events -C6 | grep -C10 EXIT
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb63124ee __perf_event_header__init_id+0x5e ([kernel.kallsyms]) => ffffffffb63124f7 __perf_event_header__init_id+0x67 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6312501 __perf_event_header__init_id+0x71 ([kernel.kallsyms]) => ffffffffb6312512 __perf_event_header__init_id+0x82 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6312531 __perf_event_header__init_id+0xa1 ([kernel.kallsyms]) => ffffffffb6316b3a perf_event_task_output+0x26a ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6316b40 perf_event_task_output+0x270 ([kernel.kallsyms]) => ffffffffb6316959 perf_event_task_output+0x89 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6316966 perf_event_task_output+0x96 ([kernel.kallsyms]) => ffffffffb6322040 perf_output_begin+0x0 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6322080 perf_output_begin+0x40 ([kernel.kallsyms]) => ffffffffb6194dc0 __rcu_read_lock+0x0 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6194de1 __rcu_read_lock+0x21 ([kernel.kallsyms]) => ffffffffb6322085 perf_output_begin+0x45 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb63220ce perf_output_begin+0x8e ([kernel.kallsyms]) => ffffffffb611d280 preempt_count_add+0x0 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb611d2bf preempt_count_add+0x3f ([kernel.kallsyms]) => ffffffffb63220d3 perf_output_begin+0x93 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb63220e8 perf_output_begin+0xa8 ([kernel.kallsyms]) => ffffffffb63220ff perf_output_begin+0xbf ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638: PERF_RECORD_EXIT(14740:14740):(14739:14739)
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6322119 perf_output_begin+0xd9 ([kernel.kallsyms]) => ffffffffb6322128 perf_output_begin+0xe8 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6322146 perf_output_begin+0x106 ([kernel.kallsyms]) => ffffffffb63220ea perf_output_begin+0xaa ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb63220f9 perf_output_begin+0xb9 ([kernel.kallsyms]) => ffffffffb63221ab perf_output_begin+0x16b ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb63221ae perf_output_begin+0x16e ([kernel.kallsyms]) => ffffffffb63221b6 perf_output_begin+0x176 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6322202 perf_output_begin+0x1c2 ([kernel.kallsyms]) => ffffffffb6322167 perf_output_begin+0x127 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb632218c perf_output_begin+0x14c ([kernel.kallsyms]) => ffffffffb631696b perf_event_task_output+0x9b ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6316990 perf_event_task_output+0xc0 ([kernel.kallsyms]) => ffffffffb61034a0 __task_pid_nr_ns+0x0 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb61034b7 __task_pid_nr_ns+0x17 ([kernel.kallsyms]) => ffffffffb6194dc0 __rcu_read_lock+0x0 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6194de1 __rcu_read_lock+0x21 ([kernel.kallsyms]) => ffffffffb61034bc __task_pid_nr_ns+0x1c ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6103503 __task_pid_nr_ns+0x63 ([kernel.kallsyms]) => ffffffffb610353b __task_pid_nr_ns+0x9b ([kernel.kallsyms])
>
> After:
>
> $ perf script --itrace=b --show-task-events -C6 | grep -C10 EXIT
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb63124ee __perf_event_header__init_id+0x5e ([kernel.kallsyms]) => ffffffffb63124f7 __perf_event_header__init_id+0x67 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6312501 __perf_event_header__init_id+0x71 ([kernel.kallsyms]) => ffffffffb6312512 __perf_event_header__init_id+0x82 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6312531 __perf_event_header__init_id+0xa1 ([kernel.kallsyms]) => ffffffffb6316b3a perf_event_task_output+0x26a ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6316b40 perf_event_task_output+0x270 ([kernel.kallsyms]) => ffffffffb6316959 perf_event_task_output+0x89 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6316966 perf_event_task_output+0x96 ([kernel.kallsyms]) => ffffffffb6322040 perf_output_begin+0x0 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6322080 perf_output_begin+0x40 ([kernel.kallsyms]) => ffffffffb6194dc0 __rcu_read_lock+0x0 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb6194de1 __rcu_read_lock+0x21 ([kernel.kallsyms]) => ffffffffb6322085 perf_output_begin+0x45 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb63220ce perf_output_begin+0x8e ([kernel.kallsyms]) => ffffffffb611d280 preempt_count_add+0x0 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb611d2bf preempt_count_add+0x3f ([kernel.kallsyms]) => ffffffffb63220d3 perf_output_begin+0x93 ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638:          1   branches:  ffffffffb63220e8 perf_output_begin+0xa8 ([kernel.kallsyms]) => ffffffffb63220ff perf_output_begin+0xbf ([kernel.kallsyms])
>            uname   14740 [006] 26795.092638: PERF_RECORD_EXIT(14740:14740):(14739:14739)
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb6322119 perf_output_begin+0xd9 ([kernel.kallsyms]) => ffffffffb6322128 perf_output_begin+0xe8 ([kernel.kallsyms])
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb6322146 perf_output_begin+0x106 ([kernel.kallsyms]) => ffffffffb63220ea perf_output_begin+0xaa ([kernel.kallsyms])
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb63220f9 perf_output_begin+0xb9 ([kernel.kallsyms]) => ffffffffb63221ab perf_output_begin+0x16b ([kernel.kallsyms])
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb63221ae perf_output_begin+0x16e ([kernel.kallsyms]) => ffffffffb63221b6 perf_output_begin+0x176 ([kernel.kallsyms])
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb6322202 perf_output_begin+0x1c2 ([kernel.kallsyms]) => ffffffffb6322167 perf_output_begin+0x127 ([kernel.kallsyms])
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb632218c perf_output_begin+0x14c ([kernel.kallsyms]) => ffffffffb631696b perf_event_task_output+0x9b ([kernel.kallsyms])
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb6316990 perf_event_task_output+0xc0 ([kernel.kallsyms]) => ffffffffb61034a0 __task_pid_nr_ns+0x0 ([kernel.kallsyms])
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb61034b7 __task_pid_nr_ns+0x17 ([kernel.kallsyms]) => ffffffffb6194dc0 __rcu_read_lock+0x0 ([kernel.kallsyms])
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb6194de1 __rcu_read_lock+0x21 ([kernel.kallsyms]) => ffffffffb61034bc __task_pid_nr_ns+0x1c ([kernel.kallsyms])
>           :14740   14740 [006] 26795.092638:          1   branches:  ffffffffb6103503 __task_pid_nr_ns+0x63 ([kernel.kallsyms]) => ffffffffb610353b __task_pid_nr_ns+0x9b ([kernel.kallsyms])
>
> This will also affect samples made after PERF_RECORD_EXIT but before
> the task finishes exiting.

Makes sense. Would an appropriate fix be in perf_session__open to set:
symbol_conf.keep_exited_threads = true;

when:
perf_header__has_feat(&session->header, HEADER_AUXTRACE)

It is kind of hacky to be changing global state this way, but
symbol_conf is like that in general.

Thanks,
Ian

> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/builtin-report.c   |  7 +++++++
> >  tools/perf/util/machine.c     | 10 +++++++---
> >  tools/perf/util/symbol_conf.h |  3 ++-
> >  tools/perf/util/thread.h      | 14 ++++++++++++++
> >  4 files changed, 30 insertions(+), 4 deletions(-)
> >
> > diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> > index dcedfe00f04d..749246817aed 100644
> > --- a/tools/perf/builtin-report.c
> > +++ b/tools/perf/builtin-report.c
> > @@ -1411,6 +1411,13 @@ int cmd_report(int argc, const char **argv)
> >       if (ret < 0)
> >               goto exit;
> >
> > +     /*
> > +      * tasks_mode require access to exited threads to list those that are in
> > +      * the data file. Off-cpu events are synthesized after other events and
> > +      * reference exited threads.
> > +      */
> > +     symbol_conf.keep_exited_threads = true;
> > +
> >       annotation_options__init(&report.annotation_opts);
> >
> >       ret = perf_config(report__config, &report);
> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > index 6ca7500e2cf4..5cda47eb337d 100644
> > --- a/tools/perf/util/machine.c
> > +++ b/tools/perf/util/machine.c
> > @@ -2157,9 +2157,13 @@ int machine__process_exit_event(struct machine *machine, union perf_event *event
> >       if (dump_trace)
> >               perf_event__fprintf_task(event, stdout);
> >
> > -     if (thread != NULL)
> > -             thread__put(thread);
> > -
> > +     if (thread != NULL) {
> > +             if (symbol_conf.keep_exited_threads)
> > +                     thread__set_exited(thread, /*exited=*/true);
> > +             else
> > +                     machine__remove_thread(machine, thread);
> > +     }
> > +     thread__put(thread);
> >       return 0;
> >  }
> >
> > diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
> > index 2b2fb9e224b0..6040286e07a6 100644
> > --- a/tools/perf/util/symbol_conf.h
> > +++ b/tools/perf/util/symbol_conf.h
> > @@ -43,7 +43,8 @@ struct symbol_conf {
> >                       disable_add2line_warn,
> >                       buildid_mmap2,
> >                       guest_code,
> > -                     lazy_load_kernel_maps;
> > +                     lazy_load_kernel_maps,
> > +                     keep_exited_threads;
> >       const char      *vmlinux_name,
> >                       *kallsyms_name,
> >                       *source_prefix,
> > diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
> > index e79225a0ea46..0df775b5c110 100644
> > --- a/tools/perf/util/thread.h
> > +++ b/tools/perf/util/thread.h
> > @@ -36,13 +36,22 @@ struct thread_rb_node {
> >  };
> >
> >  DECLARE_RC_STRUCT(thread) {
> > +     /** @maps: mmaps associated with this thread. */
> >       struct maps             *maps;
> >       pid_t                   pid_; /* Not all tools update this */
> > +     /** @tid: thread ID number unique to a machine. */
> >       pid_t                   tid;
> > +     /** @ppid: parent process of the process this thread belongs to. */
> >       pid_t                   ppid;
> >       int                     cpu;
> >       int                     guest_cpu; /* For QEMU thread */
> >       refcount_t              refcnt;
> > +     /**
> > +      * @exited: Has the thread had an exit event. Such threads are usually
> > +      * removed from the machine's threads but some events/tools require
> > +      * access to dead threads.
> > +      */
> > +     bool                    exited;
> >       bool                    comm_set;
> >       int                     comm_len;
> >       struct list_head        namespaces_list;
> > @@ -189,6 +198,11 @@ static inline refcount_t *thread__refcnt(struct thread *thread)
> >       return &RC_CHK_ACCESS(thread)->refcnt;
> >  }
> >
> > +static inline void thread__set_exited(struct thread *thread, bool exited)
> > +{
> > +     RC_CHK_ACCESS(thread)->exited = exited;
> > +}
> > +
> >  static inline bool thread__comm_set(const struct thread *thread)
> >  {
> >       return RC_CHK_ACCESS(thread)->comm_set;
>

  reply	other threads:[~2023-10-23 18:49 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-12  6:23 [PATCH v2 00/13] Improvements to memory use Ian Rogers
2023-10-12  6:23 ` [PATCH v2 01/13] perf machine: Avoid out of bounds LBR memory read Ian Rogers
2023-10-12  6:23 ` [PATCH v2 02/13] libperf rc_check: Make implicit enabling work for GCC Ian Rogers
2023-10-12  6:23 ` [PATCH v2 03/13] perf hist: Add missing puts to hist__account_cycles Ian Rogers
2023-10-18 23:16   ` Namhyung Kim
2023-10-24 16:12     ` Ian Rogers
2023-10-12  6:23 ` [PATCH v2 04/13] perf threads: Remove unused dead thread list Ian Rogers
2023-10-12  6:23 ` [PATCH v2 05/13] perf offcpu: Add missed btf_free Ian Rogers
2023-10-12  6:23 ` [PATCH v2 06/13] perf callchain: Make display use of branch_type_stat const Ian Rogers
2023-10-12  6:23 ` [PATCH v2 07/13] perf callchain: Make brtype_stat in callchain_list optional Ian Rogers
2023-10-12  6:23 ` [PATCH v2 08/13] perf callchain: Minor layout changes to callchain_list Ian Rogers
2023-10-12  6:23 ` [PATCH v2 09/13] perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit Ian Rogers
2023-10-18 23:20   ` Namhyung Kim
2023-10-24 20:52     ` Ian Rogers
2023-10-12  6:23 ` [PATCH v2 10/13] perf record: Lazy load kernel symbols Ian Rogers
2023-10-19 11:02   ` Adrian Hunter
2023-10-24 18:42     ` Ian Rogers
2023-10-12  6:23 ` [PATCH v2 11/13] libperf: Lazily allocate mmap event copy Ian Rogers
2023-10-12  6:23 ` [PATCH v2 12/13] perf mmap: Lazily initialize zstd streams Ian Rogers
2023-10-18 23:21   ` Namhyung Kim
2023-10-24 19:21     ` Ian Rogers
2023-10-12  6:23 ` [PATCH v2 13/13] perf machine thread: Remove exited threads by default Ian Rogers
2023-10-18 23:30   ` Namhyung Kim
2023-10-19  0:39     ` Ian Rogers
2023-10-20 21:08       ` Namhyung Kim
2023-10-23 14:24   ` Adrian Hunter
2023-10-23 18:49     ` Ian Rogers [this message]
2023-10-24  5:19       ` Adrian Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAP-5=fX303b8-hQXWaRibtYFyBS2sQywDSKR6KhKMUMX_0gzrw@mail.gmail.com' \
    --to=irogers@google.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=anshuman.khandual@arm.com \
    --cc=asavkov@redhat.com \
    --cc=atrajeev@linux.vnet.ibm.com \
    --cc=bpf@vger.kernel.org \
    --cc=german.gomez@arm.com \
    --cc=james.clark@arm.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=leo.yan@linaro.org \
    --cc=liam.howlett@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=ojeda@kernel.org \
    --cc=peterz@infradead.org \
    --cc=ravi.bangoria@amd.com \
    --cc=sandipan.das@amd.com \
    --cc=song@kernel.org \
    --cc=terrelln@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).