From: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
To: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Wang Nan <wangnan0@huawei.com>, Jiri Olsa <jolsa@kernel.org>,
	Andi Kleen <ak@linux.intel.com>,
	treeze.taeung@gmail.com,
	Mathieu Poirier <mathieu.poirier@linaro.org>,
	He Kuang <hekuang@huawei.com>,
	sukadev@linux.vnet.ibm.com, ananth@in.ibm.com,
	"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Hemant Kumar <hemant@linux.vnet.ibm.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Subject: Re: [PATCH v3 2/2] perf/sdt: Directly record SDT events with 'perf record'
Date: Wed, 1 Mar 2017 11:15:36 +0530
Message-ID: <58B66000.7080504@linux.vnet.ibm.com>
In-Reply-To: <CAE40pdeKQg-qv37k4QaCAs1ieuaqAXtm9RK1UiLSG7xh6EP0dw@mail.gmail.com>

Thanks for the review, Brendan.

On Wednesday 01 March 2017 10:34 AM, Brendan Gregg wrote:
> On Tue, Feb 28, 2017 at 2:31 PM, Brendan Gregg
> <brendan.d.gregg@gmail.com> wrote:
>> G'Day Ravi,
>>
> [...]
>> Now retrying perf:
>>
>> # ./perf record -e sdt_node:http__server__request -a
>> ^C[ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.446 MB perf.data (3 samples) ]
>> # ./perf script
>>             node  7646 [002]   361.012364: sdt_node:http__server__request: (dc2e69)
>>             node  7646 [002]   361.204718: sdt_node:http__server__request: (dc2e69)
>>             node  7646 [002]   361.363043: sdt_node:http__server__request: (dc2e69)
>>
>> Now perf works.
>>
>> If I restart the node process, it goes back to the broken state.
>>
> Oh sorry, I forgot that these Node.js probes are behind an
> is-enabled semaphore.

Yes. Perf does not support "is-enabled" markers yet.

> $ readelf -n `which node`
> [...]
>   stapsdt              0x00000089    NT_STAPSDT (SystemTap probe descriptors)
>     Provider: node
>     Name: http__server__request
>     Location: 0x0000000000dc2e69, Base: 0x000000000112e064, Semaphore: 0x0000000001470954
>     Arguments: 8@%r14 8@%rax 8@-4344(%rbp) -4@-4348(%rbp) 8@-4304(%rbp) 8@-4312(%rbp) -4@-4352(%rbp)
> # dd if=/proc/31695/mem bs=1 count=1 skip=$(( 0x0000000001470954 )) 2>/dev/null | xxd
> 00000000: 00                                       .
> # printf "\x1" | dd of=/proc/31695/mem bs=1 count=1 seek=$(( 0x0000000001470954 )) 2>/dev/null
> # dd if=/proc/31695/mem bs=1 count=1 skip=$(( 0x0000000001470954 )) 2>/dev/null | xxd
> 00000000: 01                                       .
> # ./perf record -e sdt_node:http__server__request -a
> Matching event(s) from uprobe_events:
>    sdt_node:http__server__request  0x9c2e69@/usr/local/bin/node
> Use 'perf probe -d <event>' to delete event(s).
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.280 MB perf.data (3 samples) ]
> # ./perf script
>             node 31695 [003] 24947.168761: sdt_node:http__server__request: (dc2e69)
>             node 31695 [003] 24947.476143: sdt_node:http__server__request: (dc2e69)
>             node 31695 [003] 24947.679090: sdt_node:http__server__request: (dc2e69)
>
> So setting that to 1 made the probe work from perf. I guess this is
> not a problem with this patch set, but rather a feature request for
> the next one: is-enabled SDT support.

Yes. Actually, I've been thinking about how this can be accomplished. Perf is a
userspace tool and, unlike systemtap, it cannot easily change the semaphore value.
This is what I was thinking:

'perf record' increments the semaphore through /proc/<pid>/mem at the start of the
session and decrements it at the end of the session (same as bcc). This does not
require any support from kernel infrastructure. But there are challenges with this
approach:

1. What if the user starts the workload after starting 'perf record'? How will perf
   be able to increment the semaphore value?

2. System-wide record. We have to loop over all pids and check whether any process
   is using an SDT probe with a semaphore that is being recorded.

3. Dynamic library loading. How do we handle SDT probes in a library that is not
   loaded at the time of 'perf record'?
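
To make the idea concrete, here is a minimal Python sketch of the
increment-at-start / decrement-at-end scheme. It is not perf or bcc code. The names
adjust_semaphore() and sdt_enabled() are hypothetical, and it assumes the semaphore
is a 16-bit counter (as sys/sdt.h declares it), that the address from the ELF note
is already the runtime virtual address (true for a non-PIE binary, otherwise the
load bias has to be added), and that the caller is allowed to write
/proc/<pid>/mem. The read-modify-write is not atomic, so it can race with another
tracer.

```python
import os
import struct
from contextlib import contextmanager

SEM_FMT = "=H"  # assumption: 16-bit unsigned counter in native byte order


def adjust_semaphore(mem_path, addr, delta):
    """Add delta to the 16-bit counter at addr and return the new value.

    Not atomic: another tracer could modify the counter between the
    read and the write.
    """
    fd = os.open(mem_path, os.O_RDWR)
    try:
        (value,) = struct.unpack(SEM_FMT, os.pread(fd, 2, addr))
        value = max(0, min(0xFFFF, value + delta))  # clamp, never wrap
        os.pwrite(fd, struct.pack(SEM_FMT, value), addr)
        return value
    finally:
        os.close(fd)


@contextmanager
def sdt_enabled(pid, addr, mem_path=None):
    """Keep the is-enabled semaphore raised for the duration of a session.

    mem_path is only a hook for testing; by default /proc/<pid>/mem is used.
    """
    path = mem_path or "/proc/%d/mem" % pid
    adjust_semaphore(path, addr, +1)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions such as KeyboardInterrupt.
        adjust_semaphore(path, addr, -1)
```

A session would then look like "with sdt_enabled(pid, addr): ...record...". This
only covers a process that already exists when recording starts, so challenges 1
to 3 above remain open.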

Please let me know your thoughts.

> Were probe arguments supposed to work? I don't notice them in the perf
> script output.

Not yet. Alexis [1], and later I [2], sent patches for that. Please have a look
at them.

[1] https://lkml.org/lkml/2016/12/13/784
[2] https://lkml.org/lkml/2017/2/2/145


So, why is perf able to record the events after they have been traced with bcc?

Ideally, bcc should increment the semaphore value at the start of the session and
decrement it at the end of the session. So after the bcc process exits, the
semaphore value should be back to zero. But that is not what actually happens.

I saw this when I was experimenting with bcc and is-enabled markers.
See this example:

  $ readelf -n /usr/bin/node | grep -A2 Provider
      Provider: node
      Name: http__server__request
      Location: 0x0000000000e5f484, Base: 0x00000000011a0bc4, Semaphore: 0x0000000001558cf2

  $ sudo cat /proc/1426/maps
    00400000-01306000 r-xp 00000000 08:02 1083365    /usr/bin/node
    01506000-01551000 r--p 00f06000 08:02 1083365    /usr/bin/node
    01551000-01559000 rw-p 00f51000 08:02 1083365    /usr/bin/node
    ...

  [TERMINAL-1]$ gdb 1426
    (gdb) x/1 0x1558cf2
    0x1558cf2:    0

  [TERMINAL-2]$ sudo ./trace.py -p 1426 'u:node:http__server__request'
    PID    TID    COMM         FUNC
    /* Do not exit yet. */

  [TERMINAL-1]
    (gdb) x/1 0x1558cf2
    0x1558cf2:    2

  [TERMINAL-2]
     ^C         /* Exit bcc trace.py */

  [TERMINAL-1]
    (gdb) x/1 0x1558cf2
    0x1558cf2:    2

Here the value stays at 2 after bcc exits; it should be 0. I suspect this is a bug
in bcc. Please let me know if I'm misunderstanding it.
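
As a cross-check of the addresses in the example above, the semaphore address
reported by readelf can be matched against the /proc/<pid>/maps output to confirm
that it lands in the writable mapping of the binary. The following Python sketch
does that; find_mapping() is a hypothetical helper, and it assumes a non-PIE
binary, so the ELF address is used directly as the runtime address.

```python
def find_mapping(maps_text, addr):
    """Return the /proc/<pid>/maps entry that contains addr, or None."""
    for line in maps_text.splitlines():
        # Fields: range, perms, offset, dev, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 5 or "-" not in fields[0]:
            continue  # skip anything that is not a mapping line
        start_s, end_s = fields[0].split("-", 1)
        try:
            start, end = int(start_s, 16), int(end_s, 16)
            offset = int(fields[2], 16)
        except ValueError:
            continue
        if start <= addr < end:
            return {
                "perms": fields[1],
                # Where this byte lives in the backing file.
                "file_offset": offset + (addr - start),
                "path": fields[5].strip() if len(fields) > 5 else "",
            }
    return None


# The three mappings shown above for pid 1426.
MAPS = """\
00400000-01306000 r-xp 00000000 08:02 1083365    /usr/bin/node
01506000-01551000 r--p 00f06000 08:02 1083365    /usr/bin/node
01551000-01559000 rw-p 00f51000 08:02 1083365    /usr/bin/node
"""

m = find_mapping(MAPS, 0x1558cf2)
print(m["perms"], hex(m["file_offset"]), m["path"])
# prints: rw-p 0xf58cf2 /usr/bin/node
```

So the semaphore at 0x1558cf2 sits in the rw-p data mapping, which is private to
the process. That is consistent with the counter going back to 0 when node is
restarted.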

>
> PS, if it's helpful, here are the commands to build node with these SDT probes:
>
> $ sudo apt-get install systemtap-sdt-dev       # adds "dtrace", used by node build
> $ wget https://nodejs.org/dist/v4.4.1/node-v4.4.1.tar.gz
> $ tar xvf node-v4.4.1.tar.gz
> $ cd node-v4.4.1
> $ ./configure --with-dtrace
> $ make -j 8

Thanks for this. :)
-Ravi
