* perf smapling
@ 2015-03-29  5:54 sahil aggarwal
  2015-03-29  6:22 ` Elazar Leibovich
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: sahil aggarwal @ 2015-03-29  5:54 UTC (permalink / raw)
  To: linux-perf-users

Hi all

I am a beginner with the perf API. Can someone point me to an example of
reading the ring buffer using mmap?

Thank you
Regards

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: perf smapling
  2015-03-29  5:54 perf smapling sahil aggarwal
@ 2015-03-29  6:22 ` Elazar Leibovich
  2015-03-30 17:38 ` Andi Kleen
       [not found] ` <CAL2Y34Bi+VLSvGucN8J=4RXKh785Lg9BYAa4LUU_yCMR7COTgQ@mail.gmail.com>
  2 siblings, 0 replies; 17+ messages in thread
From: Elazar Leibovich @ 2015-03-29  6:22 UTC (permalink / raw)
  To: sahil aggarwal; +Cc: linux-perf-users

[-- Attachment #1: Type: text/plain, Size: 656 bytes --]

Hi,

While it is not ready for prime time yet, I am attaching a tar.gz of a
program that does just that.

I hope to make it an official project soon, but in the meantime I
think it can help you understand how to read the ring buffer.

On Sun, Mar 29, 2015 at 8:54 AM, sahil aggarwal <sahil.agg15@gmail.com> wrote:
> Hi all
>
> I am a beginner with the perf API. Can someone point me to an example of
> reading the ring buffer using mmap?
>
> Thank you
> Regards

[-- Attachment #2: perf2.tar.gz --]
[-- Type: application/x-gzip, Size: 23412 bytes --]


* Re: perf smapling
  2015-03-29  5:54 perf smapling sahil aggarwal
  2015-03-29  6:22 ` Elazar Leibovich
@ 2015-03-30 17:38 ` Andi Kleen
       [not found] ` <CAL2Y34Bi+VLSvGucN8J=4RXKh785Lg9BYAa4LUU_yCMR7COTgQ@mail.gmail.com>
  2 siblings, 0 replies; 17+ messages in thread
From: Andi Kleen @ 2015-03-30 17:38 UTC (permalink / raw)
  To: sahil aggarwal; +Cc: linux-perf-users

sahil aggarwal <sahil.agg15@gmail.com> writes:

> Hi all
>
> I am a beginner with the perf API. Can someone point me to an example of
> reading the ring buffer using mmap?

https://github.com/andikleen/pmu-tools/tree/master/addr

-Andi


* Fwd: perf smapling
       [not found]     ` <CAL2Y34A=Fk03_LSTM_SHwVaE4NaKSD=zZt0vWtzn3Rm9dQtDLQ@mail.gmail.com>
@ 2015-03-31 14:15       ` Elazar Leibovich
  2015-03-31 15:10         ` sahil aggarwal
  0 siblings, 1 reply; 17+ messages in thread
From: Elazar Leibovich @ 2015-03-31 14:15 UTC (permalink / raw)
  To: sahil aggarwal, linux-perf-users

I wanted to ensure the user always sees a contiguous array of data from
the ring buffer.

The last piece of data, say "abcde" could wrap around in the ring
buffer and appear like:

[de...                 ...abc]

I wanted the user to see a contiguous array of the form [abcde].

So when the input wraps around, I simply copy its leading part into
the extra buffer that sits just before the ring:

[wrap_buffer][de..                 ...abc]
would become
[               abc][de...               ...abc]

And then I'll give the user a pointer to the leftmost "a", and they'll see
"abcde" without knowing they're handling a ring buffer.

Let me know if I was clear enough.
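
The copy described above can be sketched in C roughly as follows. This is
only an illustration, not the code from the attached tar.gz: the names
ring_view, wrap_size and contiguous_record are made up here, and the sketch
assumes the spare area sits directly before the ring data in memory and is
at least as large as the longest record. How the real program lays out its
(2^n)+2 pages is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: [ spare area (wrap_size bytes) ][ ring data (size bytes) ] */
struct ring_view {
    unsigned char *base;   /* start of the ring data area */
    size_t size;           /* ring data size in bytes, a power of two */
    size_t wrap_size;      /* spare bytes available directly before base */
};

/*
 * Return a pointer to len contiguous bytes of the record that starts at
 * ring offset off. If the record wraps, its leading part (the bytes at the
 * end of the ring, "abc" in the diagram) is copied into the spare area just
 * before base, so that it is immediately followed by the trailing part
 * ("de") at the start of the ring. Returns NULL if that is not possible.
 */
static unsigned char *contiguous_record(struct ring_view *r, size_t off, size_t len)
{
    assert(r->size != 0 && (r->size & (r->size - 1)) == 0);
    off &= r->size - 1;
    if (len > r->size)
        return NULL;
    if (off + len <= r->size)
        return r->base + off;            /* no wrap, nothing to copy */

    size_t head_len = r->size - off;     /* bytes before the end of the ring */
    if (head_len > r->wrap_size)
        return NULL;
    memcpy(r->base - head_len, r->base + off, head_len);
    return r->base - head_len;           /* points at the leftmost "a" */
}
```

The caller never sees the wrap: it gets either a pointer into the ring
itself or a pointer into the spare area that runs on into the ring.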

On Tue, Mar 31, 2015 at 2:18 PM, sahil aggarwal <sahil.agg15@gmail.com> wrote:
>
> Hi Elazar
>
> Can you help me understand why you have used
> mmap_pages->wrap_base? And instead of allocating
> (2^n)+1 pages you allocate (2^n)+2 pages; why is that?
> wrap_base points to (2^n)+2 pages and base points to
> (2^n)+1 pages, so what is the use of wrap_base? I tried reading
> the perf source too, and there it seems they use only (2^n)+1 pages.
>
>
> Thanks
> Regards


* Re: perf smapling
  2015-03-31 14:15       ` Fwd: " Elazar Leibovich
@ 2015-03-31 15:10         ` sahil aggarwal
  2015-03-31 15:22           ` sahil aggarwal
  0 siblings, 1 reply; 17+ messages in thread
From: sahil aggarwal @ 2015-03-31 15:10 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

Yeah that was clear enough.
Thanks a lot. Your code is of great help.

Regards
Sahil


* Re: perf smapling
  2015-03-31 15:10         ` sahil aggarwal
@ 2015-03-31 15:22           ` sahil aggarwal
  2015-03-31 18:07             ` Elazar Leibovich
  0 siblings, 1 reply; 17+ messages in thread
From: sahil aggarwal @ 2015-03-31 15:22 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

Actually, most of the sampling I need is around PERF_TYPE_TRACEPOINT.
So if I enable the tracepoint "syscalls/sys_enter_open/", what will the "type"
field in perf_event_header be? And will the record struct be the same as the
one given in "syscalls/sys_enter_open/format"?

Thanks


* Re: perf smapling
  2015-03-31 15:22           ` sahil aggarwal
@ 2015-03-31 18:07             ` Elazar Leibovich
  2015-04-01  9:22               ` sahil aggarwal
  0 siblings, 1 reply; 17+ messages in thread
From: Elazar Leibovich @ 2015-03-31 18:07 UTC (permalink / raw)
  To: sahil aggarwal; +Cc: linux-perf-users

Look at the man page: you should set the type to PERF_TYPE_TRACEPOINT
and set the config to the event id.

On my system, the sys_enter_open event id is 455:

$ sudo cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/id
455

Add PERF_SAMPLE_RAW to the sample_type.
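
As a rough C sketch of those settings (not taken from the attached tar.gz),
the attr could be filled in like this. The helper names read_tracepoint_id
and init_tracepoint_attr are made up for illustration, and the field values
simply mirror the ones used elsewhere in this message.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Read a tracepoint id from a file such as
 * /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/id.
 * Returns -1 if the file cannot be opened or parsed.
 */
static long read_tracepoint_id(const char *path)
{
    FILE *f = fopen(path, "r");
    long id = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &id) != 1)
        id = -1;
    fclose(f);
    return id;
}

/* Fill attr for sampling every hit of the tracepoint with the given id. */
static void init_tracepoint_attr(struct perf_event_attr *attr, uint64_t id)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_TRACEPOINT;
    attr->config = id;                    /* e.g. 455 on my system */
    attr->sample_period = 1;              /* sample every event */
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_RAW;
    attr->wakeup_events = 1;              /* wake the reader after each sample */
}
```

The filled attr is then handed to perf_event_open, which has no glibc
wrapper and is normally invoked through syscall(__NR_perf_event_open, ...).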

BTW, you can compile the tar.gz I sent and pipe JSON in the attr format
to it; it will print back the perf data in JSON format. That is an easier
way to experiment with the perf_event_open API than writing a C program.

For example:

$ make
$ sudo ./perf2 <<EOF
{
  "attr": {
    "sample_type": [
      "PERF_SAMPLE_IP",
      "PERF_SAMPLE_RAW"
    ],
    "wakeup_events": 1,
    "config": 455,
    "sample_period": 1,
    "type": "PERF_TYPE_TRACEPOINT"
  }
}
EOF
{"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-32,101,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
{"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-64,112,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
{"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-112,70,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
{"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-32,70,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
{"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,0,-99,26,-19,-1,127,0,0,0,0,0,0,0,0,0,0,-74,1,0,0,0,0,0,0,0,0,0,0]}}
...

What is the raw data? It depends on the event. For sys_enter/sys_exit it is
struct syscall_trace_enter/syscall_trace_exit.

http://osxr.org/linux/source/kernel/trace/trace.h#0095
struct trace_entry {
    unsigned short      type;
    unsigned char       flags;
    unsigned char       preempt_count;
    int                 pid;
};
struct syscall_trace_enter {
    struct trace_entry  ent;
    int                 nr;
    unsigned long       args[];
};

How did I know that? I followed the kernel logic here:

http://osxr.org/linux/source/kernel/trace/trace_syscalls.c#0636
static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long ret)
{
...
rec = (struct syscall_trace_exit *)perf_trace_buf_prepare(size, ...);
...
}

Note that, indeed, after short+char+char+int (8 bytes) we have 2, the
open syscall number, in every event's raw data.
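
To make that concrete, here is a small C sketch that decodes the fixed-size
prefix of the raw bytes printed above. It is an illustration only: the names
decoded_entry and decode_syscall_enter are made up, and it assumes
little-endian data laid out as in the structs above. With the first sample
(-57 is the byte 0xC7), the type field decodes to 455, which matches the
event id read from the id file, so this field also tells you which
tracepoint a raw record came from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded copy of trace_entry plus the syscall number that follows it. */
struct decoded_entry {
    uint16_t type;           /* trace_entry.type: the tracepoint id */
    uint8_t  flags;
    uint8_t  preempt_count;
    int32_t  pid;
    int32_t  nr;             /* syscall number */
};

/* Read a 32-bit little-endian value byte by byte. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Decode short+char+char+int (trace_entry) and the following int (nr).
 * Returns 0 on success, -1 if fewer than 12 bytes are available.
 */
static int decode_syscall_enter(const unsigned char *raw, size_t len,
                                struct decoded_entry *out)
{
    assert(out != NULL);
    if (len < 12)
        return -1;
    out->type = (uint16_t)(raw[0] | raw[1] << 8);
    out->flags = raw[2];
    out->preempt_count = raw[3];
    out->pid = (int32_t)le32(raw + 4);
    out->nr = (int32_t)le32(raw + 8);
    return 0;
}
```

For the first sample above this gives type 455, pid 2823 and nr 2.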


* Re: perf smapling
  2015-03-31 18:07             ` Elazar Leibovich
@ 2015-04-01  9:22               ` sahil aggarwal
  2015-04-01  9:58                 ` Elazar Leibovich
       [not found]                 ` <CAL2Y34Amk49Upd8+eCmEK9WqG6SGgEJfj3tUXdrRKEa_6Hxr8Q@mail.gmail.com>
  0 siblings, 2 replies; 17+ messages in thread
From: sahil aggarwal @ 2015-04-01  9:22 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

Hi Elazar

I was finally able to make a small prototype that enables tracepoints. :)

One more thing: is it possible to enable multiple tracepoints from
one thread and, while parsing the output, find out which tracepoint the
raw data belongs to?

Or would I have to create a separate thread for each tracepoint?

Man page says:
  Set config to one of the  following:
          .........

So I am assuming I will have to create a separate thread for each event.


Thanks a lot.


* Re: perf smapling
  2015-04-01  9:22               ` sahil aggarwal
@ 2015-04-01  9:58                 ` Elazar Leibovich
       [not found]                 ` <CAL2Y34Amk49Upd8+eCmEK9WqG6SGgEJfj3tUXdrRKEa_6Hxr8Q@mail.gmail.com>
  1 sibling, 0 replies; 17+ messages in thread
From: Elazar Leibovich @ 2015-04-01  9:58 UTC (permalink / raw)
  To: sahil aggarwal; +Cc: linux-perf-users

Yes, as far as I can tell this is correct: you create one perf_event for
every tracepoint event.

I personally created a separate thread for each tracepoint, since I
found this approach simpler, but it is certainly possible to use a
single thread and select() over all the perf_event_open file descriptors.
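
A minimal C sketch of the single-thread variant, under some assumptions: it
uses poll(2) rather than select(2), the helper name wait_for_event and the
MAX_EVENT_FDS limit are made up for illustration, and the descriptors are
expected to be the ones returned by perf_event_open (which become readable
when a wakeup is due, e.g. with wakeup_events set to 1).

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_EVENT_FDS 64   /* arbitrary limit for this sketch */

/*
 * Wait until one of the nfds descriptors is readable.
 * Returns the index of the first readable descriptor, or -1 on timeout
 * or error. The index tells the caller which event's ring buffer to read.
 */
static int wait_for_event(const int *fds, size_t nfds, int timeout_ms)
{
    struct pollfd pfds[MAX_EVENT_FDS];

    assert(nfds <= MAX_EVENT_FDS);
    for (size_t i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t)nfds, timeout_ms) <= 0)
        return -1;
    for (size_t i = 0; i < nfds; i++)
        if (pfds[i].revents & POLLIN)
            return (int)i;
    return -1;
}
```

Since each descriptor has its own mmap'ed ring buffer, the returned index
is enough to know which tracepoint the pending records belong to.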


* Re: perf smapling
       [not found]                 ` <CAL2Y34Amk49Upd8+eCmEK9WqG6SGgEJfj3tUXdrRKEa_6Hxr8Q@mail.gmail.com>
@ 2015-04-01 10:04                   ` sahil aggarwal
  2015-04-01 11:49                     ` sahil aggarwal
  0 siblings, 1 reply; 17+ messages in thread
From: sahil aggarwal @ 2015-04-01 10:04 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

No, I haven't given it a shot yet, but the code was very helpful.
And the raw data layout was the same as the struct defined for the event
in its format file (syscalls/sys_enter_open/format).


On 1 April 2015 at 15:28, Elazar Leibovich
<elazar.leibovich@ravellosystems.com> wrote:
> Yes, this is correct, as far as I can tell, when you create a perf_event for
> every tracepoint event.
>
> I personally created a separate thread for each trace point, since I found
> this approach simpler, but it is certainly possible to use a single thread +
> select from all perf_event_open file descriptor.
>
> BTW, did you manage to experiment with perf using the tool I attached?

* Re: perf smapling
  2015-04-01 10:04                   ` sahil aggarwal
@ 2015-04-01 11:49                     ` sahil aggarwal
  2015-04-01 11:54                       ` sahil aggarwal
  2015-04-01 18:50                       ` Elazar Leibovich
  0 siblings, 2 replies; 17+ messages in thread
From: sahil aggarwal @ 2015-04-01 11:49 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

If I enable multiple tracepoints, say:

type        = PERF_TYPE_TRACEPOINT
config      = 87 | 402  (sched/sched_switch && syscalls/sys_enter_open)
sample_type = PERF_SAMPLE_TIME |
              PERF_SAMPLE_RAW  |
              PERF_SAMPLE_TID  |
              PERF_SAMPLE_STREAM_ID;

It gives me some ID when I print sample_id (I thought it would print the
config value). But how can I connect this ID with my type of event
(sched_switch or sys_enter_open)?

On 1 April 2015 at 15:34, sahil aggarwal <sahil.agg15@gmail.com> wrote:
> No, I didn't give it a shot yet, but the code was very helpful.
> Also, the raw data had the same form as the struct defined for the event in
> the format file (syscalls/sys_enter_open/format).
>
>
> On 1 April 2015 at 15:28, Elazar Leibovich
> <elazar.leibovich@ravellosystems.com> wrote:
>> Yes, this is correct, as far as I can tell, when you create a perf_event for
>> every tracepoint event.
>>
>> I personally created a separate thread for each trace point, since I found
>> this approach simpler, but it is certainly possible to use a single thread +
>> select over all the perf_event_open file descriptors.
>>
>> BTW, did you manage to experiment with perf using the tool I attached?
>>
>> On Wed, Apr 1, 2015 at 12:22 PM, sahil aggarwal <sahil.agg15@gmail.com>
>> wrote:
>>>
>>> Hi Elazar
>>>
>>> Finally I am able to make a small prototype that enables tracepoints. :)
>>>
>>> One more thing: is it possible to enable multiple tracepoints through
>>> one thread and, while parsing the output, find out which tracepoint the
>>> raw data belongs to?
>>>
>>> Or would I have to create a separate thread for each tracepoint?
>>>
>>> Man page says:
>>>   Set config to one of the  following:
>>>           .........
>>>
>>> So I am assuming I will have to create a separate thread for each event.
>>>
>>>
>>> Thanks a lot.
>>>
>>> On 31 March 2015 at 23:37, Elazar Leibovich
>>> <elazar.leibovich@ravellosystems.com> wrote:
>>> > Look at the man page, you should set the type to PERF_TYPE_TRACEPOINT
>>> > and set the config to the event id.
>>> >
>>> > On my system, sys_enter_open event id is 455
>>> >
>>> > $ sudo cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/id
>>> > 455
>>> >
>>> > Add PERF_SAMPLE_RAW to the sample_type.
>>> >
>>> > BTW
>>> > You can compile the tar.gz I sent and echo JSON in the attr format to
>>> > it, it'll print back perf data in json format. Easier to experiment
>>> > with perf_event_open API than writing a C program.
>>> >
>>> > For example
>>> >
>>> > $ make
>>> > $ sudo ./perf2 <<EOF
>>> > {
>>> >   "attr": {
>>> >     "sample_type": [
>>> >       "PERF_SAMPLE_IP",
>>> >       "PERF_SAMPLE_RAW"
>>> >     ],
>>> >     "wakeup_events": 1,
>>> >     "config": 455,
>>> >     "sample_period": 1,
>>> >     "type": "PERF_TYPE_TRACEPOINT"
>>> >   }
>>> > }
>>> > EOF
>>> >
>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-32,101,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>> >
>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-64,112,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>> >
>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-112,70,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>> >
>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-32,70,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>> >
>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,0,-99,26,-19,-1,127,0,0,0,0,0,0,0,0,0,0,-74,1,0,0,0,0,0,0,0,0,0,0]}}
>>> > ...
>>> >
>>> > What is the raw data? Depends on the event. For sys_enter/exit it is
>>> > struct syscall_trace_enter/exit.
>>> >
>>> > http://osxr.org/linux/source/kernel/trace/trace.h#0095
>>> > struct trace_entry {
>>> >      unsigned short      type;
>>> >      unsigned char       flags;
>>> >      unsigned char       preempt_count;
>>> >      int         pid;
>>> > };
>>> > struct syscall_trace_enter {
>>> >     struct trace_entry  ent;
>>> >     int         nr;
>>> >     unsigned long       args[];
>>> > };
>>> >
>>> > How did I know that? I followed the kernel logic here:
>>> >
>>> > http://osxr.org/linux/source/kernel/trace/trace_syscalls.c#0636
>>> > static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long ret)
>>> > {
>>> > ...
>>> > rec = (struct syscall_trace_exit *)perf_trace_buf_prepare(size, ...);
>>> > ...
>>> > }
>>> >
>>> > Note that indeed, after short+char+char+int, we have 2 (the open syscall
>>> > number) in every event's raw data.
>>> >
>>> > On Tue, Mar 31, 2015 at 6:22 PM, sahil aggarwal <sahil.agg15@gmail.com>
>>> > wrote:
>>> >> Actually I need most of the sampling around PERF_TYPE_TRACEPOINT,
>>> >> so if I enable the tracepoint "syscalls/sys_enter_open/", what will the
>>> >> "type" field in perf_event_header be? And will the record struct be the
>>> >> same as given in "syscalls/sys_enter_open/format"?
>>> >>
>>> >> Thanks
>>> >>
>>> >> On 31 March 2015 at 20:40, sahil aggarwal <sahil.agg15@gmail.com>
>>> >> wrote:
>>> >>> Yeah that was clear enough.
>>> >>> Thanks a lot. Your code is of great help.
>>> >>>
>>> >>> Regards
>>> >>> Sahil
>>> >>>
>>> >>> On 31 March 2015 at 19:45, Elazar Leibovich
>>> >>> <elazar.leibovich@ravellosystems.com> wrote:
>>> >>>> I wanted to ensure the user always sees a contiguous array of data from
>>> >>>> the ring buffer.
>>> >>>>
>>> >>>> The last piece of data, say "abcde", could wrap around in the ring
>>> >>>> buffer and appear like:
>>> >>>>
>>> >>>> [de...                 ...abc]
>>> >>>>
>>> >>>> I wanted the user to see a contiguous array of the form [abcde].
>>> >>>>
>>> >>>> So when the input wraps around, I simply copy the part at the end of
>>> >>>> the ring into the wrap buffer that precedes it:
>>> >>>>
>>> >>>> [wrap_buffer][de..                 ...abc]
>>> >>>> would become
>>> >>>> [               abc][de...               ...abc]
>>> >>>>
>>> >>>> Then I give the user a pointer to the leftmost "a", and he'll see
>>> >>>> "abcde" without knowing he's handling a ring buffer.
>>> >>>>
>>> >>>> Let me know if I was clear enough.
>>> >>>>
>>> >>>> On Tue, Mar 31, 2015 at 2:18 PM, sahil aggarwal
>>> >>>> <sahil.agg15@gmail.com> wrote:
>>> >>>>>
>>> >>>>> Hi Elazar
>>> >>>>>
>>> >>>>> Can you help me understand why you have used
>>> >>>>> mmap_pages->wrap_base? And why do you allocate
>>> >>>>> (2^n)+2 pages instead of (2^n)+1?
>>> >>>>> wrap_base points to (2^n)+2 pages and base points to
>>> >>>>> (2^n)+1 pages, so what is the use of wrap_base? I tried reading the
>>> >>>>> perf source too; there it seems they use only (2^n)+1 pages.
>>> >>>>>
>>> >>>>>
>>> >>>>> Thanks
>>> >>>>> Regards
>>
>>
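
The raw-data layout quoted above (struct trace_entry followed by nr and
args[]) can be checked against the first JSON sample with a small decoder.
This is a sketch under stated assumptions: the helper names read_le and
decode_sys_enter and the struct decoded_sys_enter are hypothetical, the
bytes are taken to be little-endian, and unsigned long is taken to be
8 bytes, so args[] starts at offset 16 after 4 bytes of padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Leading fields of a sys_enter raw record (hypothetical helper type). */
struct decoded_sys_enter {
    uint16_t type;           /* trace_entry.type: the tracepoint event id */
    uint8_t  flags;          /* trace_entry.flags */
    uint8_t  preempt_count;  /* trace_entry.preempt_count */
    int32_t  pid;            /* trace_entry.pid */
    int32_t  nr;             /* syscall number */
    uint64_t arg0;           /* args[0], assuming 8-byte unsigned long */
};

/* Read n bytes (n <= 8) as a little-endian unsigned integer. */
static uint64_t read_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;

    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Returns 0 on success, -1 if the buffer is shorter than 24 bytes. */
int decode_sys_enter(const uint8_t *raw, size_t len,
                     struct decoded_sys_enter *out)
{
    if (len < 24)
        return -1;
    out->type          = (uint16_t)read_le(raw + 0, 2);
    out->flags         = raw[2];
    out->preempt_count = raw[3];
    out->pid           = (int32_t)read_le(raw + 4, 4);
    out->nr            = (int32_t)read_le(raw + 8, 4);
    /* bytes 12..15: padding that aligns args[] to 8 bytes */
    out->arg0          = read_le(raw + 16, 8);
    return 0;
}
```

Fed the first 24 bytes of the first sample (-57,1,0,0,7,11,0,0,2,0,0,0,...),
this yields type 455, which matches the event id read from the id file,
pid 2823, and nr 2, the open syscall number mentioned above.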

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: perf smapling
  2015-04-01 11:49                     ` sahil aggarwal
@ 2015-04-01 11:54                       ` sahil aggarwal
  2015-04-01 18:50                       ` Elazar Leibovich
  1 sibling, 0 replies; 17+ messages in thread
From: sahil aggarwal @ 2015-04-01 11:54 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

** It gives me some ID when I print stream_id

On 1 April 2015 at 17:19, sahil aggarwal <sahil.agg15@gmail.com> wrote:
> If I enable multiple tracepoints, say:
>
> type        = PERF_TYPE_TRACEPOINT
> config      = 87 | 402  (sched/sched_switch && syscalls/sys_enter_open)
> sample_type = PERF_SAMPLE_TIME |
>               PERF_SAMPLE_RAW  |
>               PERF_SAMPLE_TID  |
>               PERF_SAMPLE_STREAM_ID;
>
> It gives me some ID when I print sample_id (I thought it would print the
> config value). But how can I connect this ID with my type of event
> (sched_switch or sys_enter_open)?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: perf smapling
  2015-04-01 11:49                     ` sahil aggarwal
  2015-04-01 11:54                       ` sahil aggarwal
@ 2015-04-01 18:50                       ` Elazar Leibovich
  2015-04-02 13:30                         ` sahil aggarwal
  1 sibling, 1 reply; 17+ messages in thread
From: Elazar Leibovich @ 2015-04-01 18:50 UTC (permalink / raw)
  To: sahil aggarwal; +Cc: linux-perf-users

You can't combine event IDs in config; ORing them simply yields the ID of
a different event. You need to open a separate stream per tracepoint event.
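
A rough sketch of "one stream per tracepoint" follows. Note that 87 | 402
evaluates to 471, a single unrelated ID, which is why the combined config
does not select both events. Everything named tp_* here is hypothetical
bookkeeping invented for the example. The sketch assumes the events are
opened without inherit, in which case the stream_id carried in each sample
should equal the value returned by the PERF_EVENT_IOC_ID ioctl (available
since Linux 3.12); treat that equality as an assumption to verify on your
kernel.

```c
#define _GNU_SOURCE             /* for syscall() */
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* One entry per opened tracepoint (hypothetical bookkeeping type). */
struct tp_stream {
    const char *name;    /* e.g. "sched/sched_switch" */
    uint64_t    config;  /* tracepoint id read from the event's id file */
    int         fd;      /* perf_event_open() descriptor, -1 if not open */
    uint64_t    id;      /* kernel-assigned ID from PERF_EVENT_IOC_ID */
};

/* Fill in an attr that describes exactly one tracepoint. */
void tp_attr_init(struct perf_event_attr *attr, uint64_t config)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_TRACEPOINT;
    attr->config = config;
    attr->sample_period = 1;
    attr->sample_type = PERF_SAMPLE_TIME | PERF_SAMPLE_RAW |
                        PERF_SAMPLE_TID | PERF_SAMPLE_STREAM_ID;
}

/* Open one stream for one tracepoint; returns 0 on success, -1 on error. */
int tp_open(struct tp_stream *s, pid_t pid, int cpu)
{
    struct perf_event_attr attr;

    tp_attr_init(&attr, s->config);
    s->fd = (int)syscall(__NR_perf_event_open, &attr, pid, cpu, -1, 0UL);
    if (s->fd < 0)
        return -1;
    /* Remember the kernel's ID so samples can be attributed later. */
    if (ioctl(s->fd, PERF_EVENT_IOC_ID, &s->id) < 0) {
        close(s->fd);
        s->fd = -1;
        return -1;
    }
    return 0;
}

/* Map an ID seen in a sample back to the tracepoint it came from. */
const char *tp_name_for_id(const struct tp_stream *tab, size_t n, uint64_t id)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].fd >= 0 && tab[i].id == id)
            return tab[i].name;
    return NULL;
}
```

A caller would fill one tp_stream per tracepoint (for example configs 87
and 402), call tp_open() on each, and then pass the stream_id of every
sample to tp_name_for_id(). Opening the events usually needs root or a
permissive perf_event_paranoid setting.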

On Wed, Apr 1, 2015 at 2:49 PM, sahil aggarwal <sahil.agg15@gmail.com> wrote:
> If I enable multiple tracepoints, say:
>
> type        = PERF_TYPE_TRACEPOINT
> config      = 87 | 402  (sched/sched_switch && syscalls/sys_enter_open)
> sample_type = PERF_SAMPLE_TIME |
>               PERF_SAMPLE_RAW  |
>               PERF_SAMPLE_TID  |
>               PERF_SAMPLE_STREAM_ID;
>
> It gives me some ID when I print sample_id (I thought it would print the
> config value). But how can I connect this ID with my type of event
> (sched_switch or sys_enter_open)?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: perf smapling
  2015-04-01 18:50                       ` Elazar Leibovich
@ 2015-04-02 13:30                         ` sahil aggarwal
  2015-04-03  5:34                           ` sahil aggarwal
  0 siblings, 1 reply; 17+ messages in thread
From: sahil aggarwal @ 2015-04-02 13:30 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

Yeah, you are right. Also, there seems to be a problem when I declare
'struct perf_event_attr'
at run time. Is it a known issue?
It gives me -EINVAL (Invalid argument).

Run Time:
perf_event_open(0x22a20b8, 0, 0xffffffff, 0xffffffff, 0) = -1 EINVAL
(Invalid argument)

Compile Time:
perf_event_open(0x7fffa242ee50, 0xffffffff, 0x1, 0xffffffff, 0) = 6
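
The thread does not establish what caused the EINVAL here, and the two
traces above also pass different pid and cpu arguments (0 and -1 versus
-1 and 1), so they are not a like-for-like comparison. One thing that is
cheap to rule out with a heap-allocated attr is uninitialized memory,
since malloc() does not zero the struct. The sketch below shows a
zero-initialized allocation; the function name alloc_tracepoint_attr is
hypothetical and the field choices mirror the JSON example quoted
elsewhere in the thread.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate a perf_event_attr on the heap with every field zeroed, then
 * set only what is needed. Returns NULL on allocation failure; the
 * caller releases the result with free().
 */
struct perf_event_attr *alloc_tracepoint_attr(uint64_t config)
{
    struct perf_event_attr *attr = calloc(1, sizeof(*attr));

    if (attr == NULL)
        return NULL;
    attr->size = sizeof(*attr);      /* tells the kernel which ABI size we use */
    attr->type = PERF_TYPE_TRACEPOINT;
    attr->config = config;
    attr->sample_period = 1;
    attr->sample_type = PERF_SAMPLE_RAW;
    return attr;
}
```

Using calloc() (or memset() after malloc()) keeps reserved and unused
fields at zero, which the kernel expects.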

On 2 April 2015 at 00:20, Elazar Leibovich
<elazar.leibovich@ravellosystems.com> wrote:
> You can't combine event IDs in config; ORing them simply yields the ID of
> a different event. You need to open a separate stream per tracepoint event.
>
> On Wed, Apr 1, 2015 at 2:49 PM, sahil aggarwal <sahil.agg15@gmail.com> wrote:
>> If i enable multiple tracepoints, say:
>>
>> type    = PERF_TYPE_TRACEPOINT
>> config  = 87 | 402  (sched/sched_switch && syscalls/sys_enter_open)
>> sample_type = PERF_SAMPLE_TIME       |
>>                          PERF_SAMPLE_RAW       |
>>                          PERF_SAMPLE_TID          |
>>                          PERF_SAMPLE_STREAM_ID ;
>>
>> It gives me some ID when i print sample_id(i thought it will print
>> config value).  But how i can connect this ID with my type of event
>> (sched_switch or sys_enter_open). .?
>>
>> On 1 April 2015 at 15:34, sahil aggarwal <sahil.agg15@gmail.com> wrote:
>>> No i didn't give it a shot yet but code was very helpful.
>>> And, the raw data form was same as struct defined for event in format
>>> file(syscalls/sys_enter_open/format).
>>>
>>>
>>> On 1 April 2015 at 15:28, Elazar Leibovich
>>> <elazar.leibovich@ravellosystems.com> wrote:
>>>> Yes, this is correct, as far as I can tell, when you create a perf_event for
>>>> every tracepoint event.
>>>>
>>>> I personally created a separate thread for each trace point, since I found
>>>> this approach simpler, but it is certainly possible to use a single thread +
>>>> select from all perf_event_open file descriptor.
>>>>
>>>> BTW, did you manage to experiment with perf using the tool I attached?
>>>>
>>>> On Wed, Apr 1, 2015 at 12:22 PM, sahil aggarwal <sahil.agg15@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi Elazar
>>>>>
>>>>> Finally i am able to make small prototype to enable tracepoints. :)
>>>>>
>>>>> One more thing, is it possible to enable multiple tracepoints through
>>>>> 1 thread and
>>>>> while parsing output find out to which tracepoint that raw data belongs.?
>>>>>
>>>>> Or i would have to create separate thread for each tracepoint. ?
>>>>>
>>>>> Man page says:
>>>>>   Set config to one of the  following:
>>>>>           .........
>>>>>
>>>>> So i am assuming i will have to create separate thread for each event.
>>>>>
>>>>>
>>>>> Thanks a lot.
>>>>>
>>>>> On 31 March 2015 at 23:37, Elazar Leibovich
>>>>> <elazar.leibovich@ravellosystems.com> wrote:
>>>>> > Look at the man page, you should set the type to PERF_TYPE_TRACEPOINT
>>>>> > and set the config to the event id.
>>>>> >
>>>>> > On my system, sys_enter_open event id is 455
>>>>> >
>>>>> > $ sudo cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/id
>>>>> > 455
>>>>> >
>>>>> > Add PERF_SAMPLE_RAW to the sample_type.
>>>>> >
>>>>> > BTW
>>>>> > You can compile the tar.gz I sent and echo JSON in the attr format to
>>>>> > it, it'll print back perf data in json format. Easier to experiment
>>>>> > with perf_event_open API than writing a C program.
>>>>> >
>>>>> > For example
>>>>> >
>>>>> > $ make
>>>>> > $ sudo ./perf2 <<EOF
>>>>> > {
>>>>> >   "attr": {
>>>>> >     "sample_type": [
>>>>> >       "PERF_SAMPLE_IP",
>>>>> >       "PERF_SAMPLE_RAW"
>>>>> >     ],
>>>>> >     "wakeup_events": 1,
>>>>> >     "config": 455,
>>>>> >     "sample_period": 1,
>>>>> >     "type": "PERF_TYPE_TRACEPOINT"
>>>>> >   }
>>>>> > }
>>>>> > EOF
>>>>> >
>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-32,101,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>>>> >
>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-64,112,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>>>> >
>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-112,70,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>>>> >
>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-32,70,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>>>> >
>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,0,-99,26,-19,-1,127,0,0,0,0,0,0,0,0,0,0,-74,1,0,0,0,0,0,0,0,0,0,0]}}
>>>>> > ...
>>>>> >
>>>>> > What is the raw data? Depends on the event. For sys_enter/exit it is
>>>>> > struct syscall_trace_enter/exit.
>>>>> >
>>>>> > http://osxr.org/linux/source/kernel/trace/trace.h#0095
>>>>> > struct trace_entry {
>>>>> >      unsigned short      type;
>>>>> >      unsigned char       flags;
>>>>> >      unsigned char       preempt_count;
>>>>> >      int         pid;
>>>>> > };
>>>>> > struct syscall_trace_enter {
>>>>> >     struct trace_entry  ent;
>>>>> >     int         nr;
>>>>> >     unsigned long       args[];
>>>>> > };
>>>>> >
>>>>> > How did I know that? I followed the kernel logic here:
>>>>> >
>>>>> > http://osxr.org/linux/source/kernel/trace/trace_syscalls.c#0636
>>>>> > static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long
>>>>> > ret)
>>>>> > {
>>>>> > ...
>>>>> > rec = (struct syscall_trace_exit *)perf_trace_buf_prepare(size, ...);
>>>>> > ...
>>>>> > }
>>>>> >
>>>>> > Note that indeed after short+char+char+int we have 2, the open syscall
>>>>> > number in all event's raw data.
>>>>> >
>>>>> > On Tue, Mar 31, 2015 at 6:22 PM, sahil aggarwal <sahil.agg15@gmail.com>
>>>>> > wrote:
>>>>> >> Actually i need most of the sampling around PERF_TYPE_TRACEPOINT,
>>>>> >> so if i enable tracepoint "syscalls/sys_enter_open/" what will be the
>>>>> >> "type"
>>>>> >> field in perf_event_header.? And, the the record struct will be same as
>>>>> >> given
>>>>> >> in "syscalls/sys_enter_open/format" .?
>>>>> >>
>>>>> >> Thanks
>>>>> >>
>>>>> >> On 31 March 2015 at 20:40, sahil aggarwal <sahil.agg15@gmail.com>
>>>>> >> wrote:
>>>>> >>> Yeah that was clear enough.
>>>>> >>> Thanks a lot. Your code is of great help.
>>>>> >>>
>>>>> >>> Regards
>>>>> >>> Sahil
>>>>> >>>
>>>>> >>> On 31 March 2015 at 19:45, Elazar Leibovich
>>>>> >>> <elazar.leibovich@ravellosystems.com> wrote:
>>>>> >>>> I wanted to ensure the user always see contiguous array of data from
>>>>> >>>> the ring buffer.
>>>>> >>>>
>>>>> >>>> The last piece of data, say "abcde" could wrap around in the ring
>>>>> >>>> buffer and appear like:
>>>>> >>>>
>>>>> >>>> [de...                 ...abc]
>>>>> >>>>
>>>>> >>>> I wanted the user to see a contigious array of the form [abcde].
>>>>> >>>>
>>>>> >>>> So in the case I'm having input that wrap around, I'll simply copy it
>>>>> >>>> to the first buffer
>>>>> >>>>
>>>>> >>>> [wrap_buffer][de..                 ...abc]
>>>>> >>>> would become
>>>>> >>>> [               abc][de...               ...abc]
>>>>> >>>>
>>>>> >>>> And then I'll the user pointer to the leftmost "a", and he'll see
>>>>> >>>> "abcde" without knowing he's handling a ring buffer.
>>>>> >>>>
>>>>> >>>> Let me know if I was clear enough.
>>>>> >>>>
>>>>> >>>> On Tue, Mar 31, 2015 at 2:18 PM, sahil aggarwal
>>>>> >>>> <sahil.agg15@gmail.com> wrote:
>>>>> >>>>>
>>>>> >>>>> Hi Elazar
>>>>> >>>>>
>>>>> >>>>> Can you help me understand why you have used
>>>>> >>>>> mmap_pages->wrap_base.? And, instead of allocating
>>>>> >>>>> (2^n)+1 pages you allocate (2^n)+2 pages, why so.?
>>>>> >>>>> wrap_base points to (2^n)+2 pages and base points to
>>>>> >>>>> (2^n)+1 pages, what is use of wrap_base.? I tried reading
>>>>> >>>>> perf source too, there it seems they use (2^n)+1 pages only.
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> Thanks
>>>>> >>>>> Regards
>>>>
>>>>
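
The wrap-around copy described in the quoted explanation can be sketched roughly as follows. This is not the code from perf2.tar.gz; it is a minimal illustration that assumes a scratch region of `wrap_len` bytes sits immediately before the ring data in one allocation. The names `ring_view` and `contiguous_record` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: [scratch: wrap_len bytes][ring data: ring_len bytes],
 * both inside one allocation that starts at `area`. */
struct ring_view {
    unsigned char *area;   /* start of the scratch region */
    size_t wrap_len;       /* size of the scratch region */
    size_t ring_len;       /* size of the ring data region */
};

/* Return a pointer to `len` contiguous bytes of the record that starts at
 * ring offset `off`. If the record wraps past the end of the ring, its head
 * is copied to the end of the scratch region so that it sits directly in
 * front of the tail, which already lies at the start of the ring. */
static unsigned char *contiguous_record(struct ring_view *v, size_t off, size_t len)
{
    unsigned char *ring = v->area + v->wrap_len;

    assert(off < v->ring_len && len <= v->ring_len);
    if (off + len <= v->ring_len)
        return ring + off;              /* no wrap: already contiguous */

    size_t head = v->ring_len - off;    /* bytes before the wrap point */
    assert(head <= v->wrap_len);        /* scratch must be able to hold the head */
    memcpy(ring - head, ring + off, head);
    return ring - head;
}
```

With a 4-byte scratch region and the 8-byte ring "de...abc", asking for 5 bytes at offset 5 copies "abc" in front of "de" and returns a pointer to "abcde".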


* Re: perf smapling
  2015-04-02 13:30                         ` sahil aggarwal
@ 2015-04-03  5:34                           ` sahil aggarwal
  2015-04-08  6:08                             ` sahil aggarwal
  0 siblings, 1 reply; 17+ messages in thread
From: sahil aggarwal @ 2015-04-03  5:34 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

Sorry for that dumb question. The problem was somewhere else.

On 2 April 2015 at 19:00, sahil aggarwal <sahil.agg15@gmail.com> wrote:
> Yeah, you are right. Also, there seems to be a problem when I declare
> 'struct perf_event_attr'
> at run time. Is it a known issue?
> It gives me -EINVAL (Invalid argument).
>
> Run Time:
> perf_event_open(0x22a20b8, 0, 0xffffffff, 0xffffffff, 0) = -1 EINVAL
> (Invalid argument)
>
> Compile Time:
> perf_event_open(0x7fffa242ee50, 0xffffffff, 0x1, 0xffffffff, 0) = 6
>
> On 2 April 2015 at 00:20, Elazar Leibovich
> <elazar.leibovich@ravellosystems.com> wrote:
>> You can't && event ids in config, it'll simply give a different event.
>> You need to open a stream per tracing event.
>>
>> On Wed, Apr 1, 2015 at 2:49 PM, sahil aggarwal <sahil.agg15@gmail.com> wrote:
>>> If I enable multiple tracepoints, say:
>>>
>>> type    = PERF_TYPE_TRACEPOINT
>>> config  = 87 | 402  (sched/sched_switch && syscalls/sys_enter_open)
>>> sample_type = PERF_SAMPLE_TIME       |
>>>               PERF_SAMPLE_RAW        |
>>>               PERF_SAMPLE_TID        |
>>>               PERF_SAMPLE_STREAM_ID ;
>>>
>>> It gives me some ID when I print sample_id (I thought it would print the
>>> config value). But how can I connect this ID with my type of event
>>> (sched_switch or sys_enter_open)?
>>>
>>> On 1 April 2015 at 15:34, sahil aggarwal <sahil.agg15@gmail.com> wrote:
>>>> No, I haven't given it a shot yet, but the code was very helpful.
>>>> And the raw data has the same form as the struct defined for the event in
>>>> the format file (syscalls/sys_enter_open/format).
>>>>
>>>>
>>>> On 1 April 2015 at 15:28, Elazar Leibovich
>>>> <elazar.leibovich@ravellosystems.com> wrote:
>>>>> Yes, this is correct, as far as I can tell, when you create a perf_event for
>>>>> every tracepoint event.
>>>>>
>>>>> I personally created a separate thread for each tracepoint, since I found
>>>>> this approach simpler, but it is certainly possible to use a single thread and
>>>>> select() across all the perf_event_open file descriptors.
>>>>>
>>>>> BTW, did you manage to experiment with perf using the tool I attached?
>>>>>
>>>>> On Wed, Apr 1, 2015 at 12:22 PM, sahil aggarwal <sahil.agg15@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Elazar
>>>>>>
>>>>>> Finally I was able to make a small prototype that enables tracepoints. :)
>>>>>>
>>>>>> One more thing: is it possible to enable multiple tracepoints from
>>>>>> one thread and,
>>>>>> while parsing the output, find out which tracepoint the raw data belongs to?
>>>>>>
>>>>>> Or would I have to create a separate thread for each tracepoint?
>>>>>>
>>>>>> The man page says:
>>>>>>   Set config to one of the  following:
>>>>>>           .........
>>>>>>
>>>>>> So I am assuming I will have to create a separate thread for each event.
>>>>>>
>>>>>>
>>>>>> Thanks a lot.
>>>>>>
>>>>>> On 31 March 2015 at 23:37, Elazar Leibovich
>>>>>> <elazar.leibovich@ravellosystems.com> wrote:
>>>>>> > Look at the man page, you should set the type to PERF_TYPE_TRACEPOINT
>>>>>> > and set the config to the event id.
>>>>>> >
>>>>>> > On my system, sys_enter_open event id is 455
>>>>>> >
>>>>>> > $ sudo cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/id
>>>>>> > 455
>>>>>> >
>>>>>> > Add PERF_SAMPLE_RAW to the sample_type.
>>>>>> >
>>>>>> > BTW
>>>>>> > You can compile the tar.gz I sent and echo JSON in the attr format to
>>>>>> > it, it'll print back perf data in json format. Easier to experiment
>>>>>> > with perf_event_open API than writing a C program.
>>>>>> >
>>>>>> > For example
>>>>>> >
>>>>>> > $ make
>>>>>> > $ sudo ./perf2 <<EOF
>>>>>> > {
>>>>>> >   "attr": {
>>>>>> >     "sample_type": [
>>>>>> >       "PERF_SAMPLE_IP",
>>>>>> >       "PERF_SAMPLE_RAW"
>>>>>> >     ],
>>>>>> >     "wakeup_events": 1,
>>>>>> >     "config": 455,
>>>>>> >     "sample_period": 1,
>>>>>> >     "type": "PERF_TYPE_TRACEPOINT"
>>>>>> >   }
>>>>>> > }
>>>>>> > EOF
>>>>>> >
>>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-32,101,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>>>>> >
>>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-64,112,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>>>>> >
>>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-112,70,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>>>>> >
>>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,-32,70,75,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}}
>>>>>> >
>>>>>> > {"type":"PERF_RECORD_SAMPLE","misc":"PERF_RECORD_MISC_USER","sample":{"ip":"7f4f3625263d","data":[-57,1,0,0,7,11,0,0,2,0,0,0,0,0,0,0,0,-99,26,-19,-1,127,0,0,0,0,0,0,0,0,0,0,-74,1,0,0,0,0,0,0,0,0,0,0]}}
>>>>>> > ...
>>>>>> >
>>>>>> > What is the raw data? Depends on the event. For sys_enter/exit it is
>>>>>> > struct syscall_trace_enter/exit.
>>>>>> >
>>>>>> > http://osxr.org/linux/source/kernel/trace/trace.h#0095
>>>>>> > struct trace_entry {
>>>>>> >      unsigned short      type;
>>>>>> >      unsigned char       flags;
>>>>>> >      unsigned char       preempt_count;
>>>>>> >      int         pid;
>>>>>> > };
>>>>>> > struct syscall_trace_enter {
>>>>>> >     struct trace_entry  ent;
>>>>>> >     int         nr;
>>>>>> >     unsigned long       args[];
>>>>>> > };
>>>>>> >
>>>>>> > How did I know that? I followed the kernel logic here:
>>>>>> >
>>>>>> > http://osxr.org/linux/source/kernel/trace/trace_syscalls.c#0636
>>>>>> > static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long
>>>>>> > ret)
>>>>>> > {
>>>>>> > ...
>>>>>> > rec = (struct syscall_trace_exit *)perf_trace_buf_prepare(size, ...);
>>>>>> > ...
>>>>>> > }
>>>>>> >
>>>>>> > Note that, after short+char+char+int, we indeed have 2, the open syscall
>>>>>> > number, in every event's raw data.
>>>>>> >
>>>>>> > On Tue, Mar 31, 2015 at 6:22 PM, sahil aggarwal <sahil.agg15@gmail.com>
>>>>>> > wrote:
>>>>>> >> Actually, I need most of the sampling to be around PERF_TYPE_TRACEPOINT,
>>>>>> >> so if I enable the tracepoint "syscalls/sys_enter_open/", what will be the
>>>>>> >> "type"
>>>>>> >> field in perf_event_header? And will the record struct be the same as
>>>>>> >> the one given
>>>>>> >> in "syscalls/sys_enter_open/format"?
>>>>>> >>
>>>>>> >> Thanks
>>>>>> >>
>>>>>> >> On 31 March 2015 at 20:40, sahil aggarwal <sahil.agg15@gmail.com>
>>>>>> >> wrote:
>>>>>> >>> Yeah that was clear enough.
>>>>>> >>> Thanks a lot. Your code is of great help.
>>>>>> >>>
>>>>>> >>> Regards
>>>>>> >>> Sahil
>>>>>> >>>
>>>>>> >>> On 31 March 2015 at 19:45, Elazar Leibovich
>>>>>> >>> <elazar.leibovich@ravellosystems.com> wrote:
>>>>>> >>>> I wanted to ensure the user always sees a contiguous array of data from
>>>>>> >>>> the ring buffer.
>>>>>> >>>>
>>>>>> >>>> The last piece of data, say "abcde", could wrap around in the ring
>>>>>> >>>> buffer and appear like:
>>>>>> >>>>
>>>>>> >>>> [de...                 ...abc]
>>>>>> >>>>
>>>>>> >>>> I wanted the user to see a contiguous array of the form [abcde].
>>>>>> >>>>
>>>>>> >>>> So when the input wraps around, I simply copy its head into the
>>>>>> >>>> extra buffer that sits just before the ring buffer:
>>>>>> >>>>
>>>>>> >>>> [wrap_buffer][de..                 ...abc]
>>>>>> >>>> would become
>>>>>> >>>> [               abc][de...               ...abc]
>>>>>> >>>>
>>>>>> >>>> Then I give the user a pointer to the leftmost "a", and they see
>>>>>> >>>> "abcde" without knowing they are handling a ring buffer.
>>>>>> >>>>
>>>>>> >>>> Let me know if I was clear enough.
>>>>>> >>>>
>>>>>> >>>> On Tue, Mar 31, 2015 at 2:18 PM, sahil aggarwal
>>>>>> >>>> <sahil.agg15@gmail.com> wrote:
>>>>>> >>>>>
>>>>>> >>>>> Hi Elazar
>>>>>> >>>>>
>>>>>> >>>>> Can you help me understand why you have used
>>>>>> >>>>> mmap_pages->wrap_base? And why do you allocate (2^n)+2 pages
>>>>>> >>>>> instead of (2^n)+1?
>>>>>> >>>>> wrap_base points to (2^n)+2 pages and base points to
>>>>>> >>>>> (2^n)+1 pages; what is the use of wrap_base? I tried reading the
>>>>>> >>>>> perf source too, and it seems to use only (2^n)+1 pages.
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> Thanks
>>>>>> >>>>> Regards
>>>>>
>>>>>
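
The raw-data layout in the quoted explanation can be checked against the first sample printed there. The sketch below copies the fixed-size prefix (the `trace_entry` fields followed by the syscall number) out of a PERF_SAMPLE_RAW payload. The struct and function names are hypothetical, and it assumes the reader has the same byte order as the machine that produced the record.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Common header of every tracepoint record, mirroring the quoted
 * struct trace_entry with fixed-width types. */
struct trace_entry_hdr {
    uint16_t type;           /* tracepoint event id */
    uint8_t  flags;
    uint8_t  preempt_count;
    int32_t  pid;
};

/* Fixed-size prefix of the quoted struct syscall_trace_enter. */
struct sys_enter_hdr {
    struct trace_entry_hdr ent;
    int32_t nr;              /* syscall number */
};

/* Copy the prefix out of the raw payload.
 * Returns 0 on success, -1 if the payload is too short. */
static int decode_sys_enter(const void *raw, size_t len, struct sys_enter_hdr *out)
{
    assert(raw != NULL && out != NULL);
    if (len < sizeof *out)
        return -1;
    memcpy(out, raw, sizeof *out);   /* memcpy avoids unaligned access */
    return 0;
}
```

On a little-endian machine the leading bytes of the first quoted sample, -57,1,0,0,7,11,0,0,2,0,0,0, decode to type 455 (the event id used in "config"), pid 2823, and nr 2.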


* Re: perf smapling
  2015-04-03  5:34                           ` sahil aggarwal
@ 2015-04-08  6:08                             ` sahil aggarwal
  2015-04-08  6:41                               ` sahil aggarwal
  0 siblings, 1 reply; 17+ messages in thread
From: sahil aggarwal @ 2015-04-08  6:08 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

Hey Elazar,

In the example you sent me, if I set:
attr->inherit=1

then the call

 mmap(mmap_pages->wrap_base + page_size,
      mmap_len - page_size, PROT_READ | PROT_WRITE,
      MAP_FIXED | MAP_SHARED, fd, 0);

fails with "Invalid argument".

This happens only if inherit is set. Do I need to set any other flag
too?

Thanks
Regards
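
One thing that can be ruled out independently of inherit is the mapping length: the perf_event_open man page requires the mmap size to be 1 + 2^n pages (one metadata page plus a power-of-two number of data pages). The sketch below is only a sanity check for the length passed as `mmap_len - page_size` and may well not be the cause here. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Length of a perf mapping with one metadata page and 2^order data pages. */
static size_t perf_mmap_len(unsigned order, size_t page_size)
{
    assert(page_size > 0 && order < 8 * sizeof(size_t) - 1);
    return (1 + ((size_t)1 << order)) * page_size;
}

/* True if `len` is a whole number of pages of the form 1 + 2^n,
 * with at least one data page. */
static bool perf_mmap_len_ok(size_t len, size_t page_size)
{
    if (page_size == 0 || len % page_size != 0)
        return false;
    size_t pages = len / page_size;
    if (pages < 2)
        return false;                    /* need at least one data page */
    size_t data = pages - 1;
    return (data & (data - 1)) == 0;     /* power-of-two test */
}
```

If the whole allocation is (2^n)+2 pages, as discussed earlier in the thread, then `mmap_len - page_size` is (2^n)+1 pages and passes this check.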

On 3 April 2015 at 11:04, sahil aggarwal <sahil.agg15@gmail.com> wrote:
> Sorry for that dumb question. The problem was somewhere else.


* Re: perf smapling
  2015-04-08  6:08                             ` sahil aggarwal
@ 2015-04-08  6:41                               ` sahil aggarwal
  0 siblings, 0 replies; 17+ messages in thread
From: sahil aggarwal @ 2015-04-08  6:41 UTC (permalink / raw)
  To: Elazar Leibovich; +Cc: linux-perf-users

Digging a bit deeper, I found that the mmap with attr->inherit=1 fails only if

pid > 0 && cpu == -1

What could I be missing?
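
As far as I know, this matches a check in the kernel's perf mmap handler: an event with inherit set that is not bound to a CPU (cpu == -1) cannot be mmap()ed, since all children would write into one buffer. A common workaround is to open the event once per CPU for the target pid and map each fd separately. The sketch below assumes CPU ids run from 0 to the online count minus one, which is not true on every system. The helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Mirrors the condition under which mmap() on the perf fd is rejected:
 * an inherited event that is not bound to a specific CPU. */
static bool mmap_rejected_for(const struct perf_event_attr *attr, int cpu)
{
    return attr->inherit && cpu == -1;
}

/* Open the same event for `pid` once per CPU so that each fd can be mapped.
 * Returns the number of fds stored in `fds`, or -1 if any open fails. */
static int open_per_cpu(struct perf_event_attr *attr, pid_t pid,
                        int *fds, int max_fds)
{
    assert(fds != NULL && max_fds > 0);
    long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
    int n = 0;

    for (int cpu = 0; cpu < ncpus && n < max_fds; cpu++) {
        long fd = syscall(SYS_perf_event_open, attr, pid, cpu, -1, 0UL);
        if (fd < 0) {
            while (n > 0)
                close(fds[--n]);         /* undo the opens done so far */
            return -1;
        }
        fds[n++] = (int)fd;
    }
    return n;
}
```

Each returned fd then gets its own ring buffer, so the reader has to poll or select over all of them.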

On 8 April 2015 at 11:38, sahil aggarwal <sahil.agg15@gmail.com> wrote:
> Hey Elazar,
>
> In the example you sent me, if I set:
> attr->inherit=1
>
> then the call
>
>  mmap(mmap_pages->wrap_base + page_size,
>       mmap_len - page_size, PROT_READ | PROT_WRITE,
>       MAP_FIXED | MAP_SHARED, fd, 0);
>
> fails with "Invalid argument".
>
> This happens only if inherit is set. Do I need to set any other flag
> too?
>
> Thanks
> Regards


end of thread (last message 2015-04-08  6:41 UTC)

Thread overview: 17+ messages
2015-03-29  5:54 perf smapling sahil aggarwal
2015-03-29  6:22 ` Elazar Leibovich
2015-03-30 17:38 ` Andi Kleen
     [not found] ` <CAL2Y34Bi+VLSvGucN8J=4RXKh785Lg9BYAa4LUU_yCMR7COTgQ@mail.gmail.com>
     [not found]   ` <20150331111813.GA1152@ubuntu>
     [not found]     ` <CAL2Y34A=Fk03_LSTM_SHwVaE4NaKSD=zZt0vWtzn3Rm9dQtDLQ@mail.gmail.com>
2015-03-31 14:15       ` Fwd: " Elazar Leibovich
2015-03-31 15:10         ` sahil aggarwal
2015-03-31 15:22           ` sahil aggarwal
2015-03-31 18:07             ` Elazar Leibovich
2015-04-01  9:22               ` sahil aggarwal
2015-04-01  9:58                 ` Elazar Leibovich
     [not found]                 ` <CAL2Y34Amk49Upd8+eCmEK9WqG6SGgEJfj3tUXdrRKEa_6Hxr8Q@mail.gmail.com>
2015-04-01 10:04                   ` sahil aggarwal
2015-04-01 11:49                     ` sahil aggarwal
2015-04-01 11:54                       ` sahil aggarwal
2015-04-01 18:50                       ` Elazar Leibovich
2015-04-02 13:30                         ` sahil aggarwal
2015-04-03  5:34                           ` sahil aggarwal
2015-04-08  6:08                             ` sahil aggarwal
2015-04-08  6:41                               ` sahil aggarwal
