From: Claudio <claudio.fontana@gliwa.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: ftrace global trace_pipe_raw
Date: Wed, 19 Dec 2018 12:32:41 +0100
Message-ID: <732bd9ce-c0dd-79ea-de17-462ba662a10a@gliwa.com>
In-Reply-To: <20180724102316.41cdb8a1@gandalf.local.home>

Hi Steven,

going back to this old topic to clarify what I was trying to achieve:

On 07/24/2018 04:23 PM, Steven Rostedt wrote:
> On Tue, 24 Jul 2018 11:58:18 +0200
> Claudio <claudio.fontana@gliwa.com> wrote:
> 
>> Hello Steven,
>>
>> I am doing correlation of linux sched events, following all tasks between cpus,
>> and one thing that would be really convenient would be to have a global
>> trace_pipe_raw, in addition to the per-cpu ones, with already sorted events.

I think that I asked for the wrong thing, since I did not understand how the implementation worked,
which led to your response. Thank you for the clarification.

>>
>> I would imagine the core functionality is already available, since trace_pipe
>> in the tracing directory already shows all events regardless of CPU, and so
>> it would be a matter of doing the same for trace_pipe_raw.
> 
> The difference between trace_pipe and trace_pipe_raw is that trace_pipe
> is post processed, and reads the per CPU buffers and interleaves them
> one event at a time. The trace_pipe_raw just sends you the raw
> unprocessed data directly from the buffers, which are grouped per CPU.

I think that what I am looking for, to improve the performance of our system,
is a post-processed stream of binary entry data, already merged from all CPUs
and sorted by timestamp, in the same way that it is done for textual output
in __find_next_entry:

        for_each_tracing_cpu(cpu) {

                if (ring_buffer_empty_cpu(buffer, cpu))
                        continue;

                ent = peek_next_entry(iter, cpu, &ts, &lost_events);

                /*
                 * Pick the entry with the smallest timestamp:
                 */
                if (ent && (!next || ts < next_ts)) {
                        next = ent;
                        next_cpu = cpu;
                        next_ts = ts;
                        next_lost = lost_events;
                        next_size = iter->ent_size;
                }
        }

We first tried to use the textual output directly, but this led to
unacceptable overhead in parsing the text.

Please correct me if I am misunderstanding, but it seems to me that it
would be possible to do the same kind of post-processing, including generating
a sorted stream of entries, while skipping the text output formatting
and outputting the binary data of each entry directly. This would be far
more efficient for user-space correlators to consume.
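
To illustrate the idea only, the merge step could be sketched in user space
roughly as follows. This is not kernel code and not an existing interface:
struct event, struct cpu_stream, next_event() and merge_streams() are all
hypothetical names, and the sketch assumes each per-CPU stream is already
ordered by timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical binary record; a stand-in, not the kernel's entry layout. */
struct event {
	uint64_t ts;      /* timestamp of the event */
	int      cpu;     /* CPU that produced the event */
	int      payload; /* placeholder for the raw entry data */
};

/* Hypothetical per-CPU stream; events are assumed sorted by timestamp. */
struct cpu_stream {
	const struct event *events;
	size_t count;
	size_t pos;       /* index of the next unconsumed event */
};

/*
 * Consume and return the pending event with the smallest timestamp
 * across all streams, or NULL once every stream is empty.
 */
static const struct event *next_event(struct cpu_stream *streams, size_t ncpus)
{
	struct cpu_stream *best = NULL;

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		struct cpu_stream *s = &streams[cpu];

		if (s->pos >= s->count)
			continue; /* nothing pending on this CPU */

		if (!best || s->events[s->pos].ts < best->events[best->pos].ts)
			best = s;
	}

	return best ? &best->events[best->pos++] : NULL;
}

/* Copy up to out_cap merged events into out; return how many were written. */
static size_t merge_streams(struct cpu_stream *streams, size_t ncpus,
			    struct event *out, size_t out_cap)
{
	const struct event *ev;
	size_t n = 0;

	assert(streams != NULL && out != NULL);

	while (n < out_cap && (ev = next_event(streams, ncpus)) != NULL)
		out[n++] = *ev;

	return n;
}
```

The linear scan over CPUs mirrors the loop quoted above; with many CPUs a
min-heap keyed on the timestamp would be the usual alternative. The point is
that the consumer receives fixed binary records in global timestamp order
and never has to parse text.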

But maybe this is not a general enough requirement to be acceptable for
implementing directly into the kernel?

Our requirement is to use the OS tracing events, including
scheduling events, so that software can react to them immediately
(as opposed to doing after-the-fact analysis).

Thank you for your comments on this, and I wish you happy holidays.

Ciao,

Claudio
