From: Alexey Budankov <alexey.budankov@linux.intel.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Andi Kleen <ak@linux.intel.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	"Jin, Yao" <yao.jin@linux.intel.com>
Subject: [PATCH v1 0/3] collect LBR callstack together with thread stack data
Date: Fri, 9 Aug 2019 18:16:02 +0300
Message-ID: <ec5fe6b1-a116-fb60-42c6-dc8a9dedfc15@linux.intel.com>


This patch set enables collection of LBR call stack data together with the
raw thread stack data captured by the --call-graph dwarf,SIZE option:

  $ perf record -g --call-graph dwarf,1024 -j stack,u -- stack_test

The collected LBR call stack can be used to augment the DWARF call stack
unwound from the raw thread stack data, providing more complete call stack
information when the collected SIZE bytes do not cover the whole thread
stack.

Such cases are typical for workloads that allocate large arrays of data on
their thread stacks, or where SIZE cannot be made large enough because of
the nature of the workload or the system configuration. This is where
hardware-captured LBR call stacks can provide the missing stack frames. A
description of a possible algorithm for consolidating DWARF and LBR call
stacks follows.

With this patch set the perf report UI currently ignores the collected LBR
call stack data and still shows DWARF-based call stack information.

===========================================================================

Overview:

   Legend:

   THS - thread stack
   CTX - thread register context
   SWS - software stack
   SSF - skipped stack frames
   PSS - Perf sample stack

   ip,sp,bp - HW registers values
   d        - allocated stack regions
   kip      - ip address in the kernel space
   K        - captured thread stack size

        THS

       -----
       |   |<-stack bottom
        ... 
       |---|
       |ip4|
       |---|         PSS = SWS(THS(K))
       |   |
   --> |   |
   |   |d3 |                  user/
   |   |---|         user PSS kernel PSS
   |   |ip3|         ------   ------
   |   |---|         |SSF |   |SSF |
   |   |   |          ....     ....
   |   |   |         ------   ------
   |   |d2 |         | -1 |   | -1 |
       |---|   user  ------   ------
   K   |ip2|   CTX   |ip3 |   |ip3 |
       |---|         |----|   |----|
   |   |d1 |   ...   |ip2 | , |ip2 |
   |   |---|  |---|  |----|   |----|
   |   |ip1|  |bp0|  |ip1 |   |ip1 |
   |   |---|  |---|  |----|   |----|
   |   |   |  |ip0|->|ip0 |   |ip0 |<-user stack top
   |   |   |  |---|  ------   ------
   |   |   |<-|sp0|<-stack    |kip0|<-kernel stack bottom
   --> -----  -----   top     |----|
                              |kip1|
                              |----|
                              |kip2|
                              |----|
                               ....
                              |    |<-kernel stack top
                              ------
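
The effect of a limited capture size K can be sketched roughly as follows.
This is a simplified model, not how perf unwinds: it assumes each frame is
described only by the number of stack bytes between one return address and
the next, and that a return address is recoverable only if it lies within
the first K captured bytes above sp0. The function name and the frame sizes
are hypothetical.

```python
from typing import List, Tuple


def unwindable_frames(frames: List[Tuple[str, int]], capture_size: int) -> List[str]:
    """Return the labels of return addresses recoverable from K captured bytes.

    frames: (label, bytes of stack consumed up to and including this return
            address), ordered from the stack top (innermost) outward.
    capture_size: K, the number of stack bytes copied into the sample.
    """
    # ip0 comes from the register context (CTX), not from the stack copy.
    recovered = ["ip0"]
    offset = 0
    for label, size in frames:
        offset += size
        if offset > capture_size:
            # This return address lies beyond the captured region.
            break
        recovered.append(label)
    return recovered
```

With hypothetical frames [("ip1", 256), ("ip2", 256), ("ip3", 256),
("ip4", 4096)] and K = 1024, the sketch returns ip0..ip3 and drops ip4,
which matches the situation drawn above where d3 pushes ip4 out of the
captured region.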

Algorithm details:

   Legend:

   HWS - hardware stack
   K-SWS - kernel software stack

			 BRANCH
			 TABLE

		 HWS      ip   ip
			  from to
		 ------  -----------
		 |ip7`|  |ip7`|    |
		 |----|  |----|----|
		 |ip6`|  |ip6`|    |
	user PSS |----|  |----|----|
		 |ip5`|  |ip5`|    |
	------   |----|  |----|----|
	| -1 |   |ip4`|  |ip4`|    |
	------   |----|  |----|----|
	|ip3 |~~~|ip3`|  |ip3`|    |
	|----|   |----|  |----|----|
	|ip2 |~~~|ip2`|  |ip2`|    |
	|----| 	 |----|  |----|----|
	|ip1 |~~~|ip1`|  |ip1`|ip0`|
	|----| 	 |----|  -----------
	|ip0 |~~~|ip0`|<---------'
	------   ------

	1. if (sym(ipj) == sym(ipj`)), j=0-3 ===> user PSS
	2. ipj`                      , j=4-7 ===> user PSS
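
Rules 1 and 2 above could be sketched as below. This is an illustration
only, not code from the patch set: the function name, the sym callback and
the list representation are assumptions, both stacks are taken to be ordered
innermost frame first, and the fallback to the plain DWARF stack on a symbol
mismatch is an assumption as well, since the description above does not say
what happens in that case.

```python
from typing import Callable, List


def augment_user_stack(sws: List[int], hws: List[int],
                       sym: Callable[[int], str]) -> List[int]:
    """Extend the DWARF-unwound stack (sws) with deeper frames from LBR (hws).

    Both lists hold instruction pointers, innermost frame first.
    sym maps an instruction pointer to the name of its enclosing symbol.
    """
    n = len(sws)
    if len(hws) <= n:
        # LBR has nothing beyond what DWARF unwinding already found.
        return list(sws)
    # Rule 1: the overlapping frames must resolve to the same symbols.
    for sw_ip, hw_ip in zip(sws, hws):
        if sym(sw_ip) != sym(hw_ip):
            # Assumption: on mismatch keep the DWARF stack unchanged.
            return list(sws)
    # Rule 2: append the frames that only the hardware stack captured.
    return list(sws) + list(hws[n:])
```

Comparing symbols rather than raw addresses reflects that ipj and ipj` need
not be numerically equal even when they belong to the same function.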

Augmented PSS = A_SWS(SWS(THS(K)), HWS):

	         user/
       user PSS  kernel PSS

	------   ------
	|ip7`|   |ip7`|<-user PSS bottom
	|----|   |----|
	|ip6`|   |ip6`|
	|----|   |----|
    HWS	|ip5`|   |ip5`|
	|----|   |----|
	|ip4`|   |ip4`|
	------   ------
	|ip3 |   |ip3 |
	|----|   |----|
    SWS |ip2 |   |ip2 |
	|----|   |----|
	|ip1 |   |ip1 |
	|----|   |----|
	|ip0 |   |ip0 |<-user PSS top
	------   ------
		 |kip0|<-kernel PSS bottom
		 |----|
		 |kip1|
	   K-SWS |----|
		 |kip2|
		 |----|
		 |kip3|<-kernel PSS top
		 ------

                  APSS
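
The layout of the augmented sample stack drawn above can be expressed as a
simple concatenation. The sketch below is hypothetical (names and the
tuple representation are not from the patch set) and assumes all inputs are
ordered innermost frame first, so the kernel frames come first, then the
DWARF-unwound user frames, then the frames contributed only by LBR.

```python
from typing import List, Tuple


def assemble_apss(k_sws: List[int], sws: List[int],
                  hws_tail: List[int]) -> List[Tuple[str, int]]:
    """Build the augmented stack as (region, ip) pairs, innermost first.

    k_sws:    kernel software stack (kip3 .. kip0 in the diagram)
    sws:      user frames unwound from the captured thread stack (ip0 .. ip3)
    hws_tail: user frames supplied only by the hardware stack (ip4` .. ip7`)
    """
    apss = [("K-SWS", ip) for ip in k_sws]
    apss += [("SWS", ip) for ip in sws]
    apss += [("HWS", ip) for ip in hws_tail]
    return apss
```

Keeping the region label next to each address makes it possible to tell
afterwards which frames came from which source.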

===========================================================================

---
Alexey Budankov (3):
  perf record: enable LBR callstack capture jointly with thread stack
  perf report: dump LBR callstack data by -D jointly with thread stack
  perf report: prefer dwarf callstacks to LBR ones when captured both

 tools/perf/builtin-report.c            |  2 ++
 tools/perf/util/parse-branch-options.c |  1 +
 tools/perf/util/session.c              | 31 ++++++++++++++++----------
 3 files changed, 22 insertions(+), 12 deletions(-)

-- 
2.20.1


Thread overview: 7+ messages
2019-08-09 15:16 Alexey Budankov [this message]
2019-08-09 15:23 ` [PATCH v1 1/3] perf record: enable LBR callstack capture jointly with thread stack Alexey Budankov
2019-08-23  2:28   ` [tip: perf/core] perf record: Enable " tip-bot2 for Alexey Budankov
2019-08-09 15:26 ` [PATCH v1 2/3] perf report: dump LBR callstack data by -D " Alexey Budankov
2019-08-23  2:28   ` [tip: perf/core] perf report: Dump " tip-bot2 for Alexey Budankov
2019-08-09 15:31 ` [PATCH v1 3/3] perf report: prefer dwarf callstacks to LBR ones when captured both Alexey Budankov
2019-08-23  2:28   ` [tip: perf/core] perf report: Prefer DWARF " tip-bot2 for Alexey Budankov
