All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frank Rowand <frowand.list@gmail.com>
To: Masami Hiramatsu <mhiramat@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Rob Herring <robh+dt@kernel.org>, Tim Bird <Tim.Bird@sony.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Namhyung Kim <namhyung@kernel.org>, Jiri Olsa <jolsa@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Tom Zanussi <tom.zanussi@linux.intel.com>,
	linux-kernel@vger.kernel.org, devicetree@vger.kernel.org
Subject: Re: [RFC PATCH v2 00/15] tracing: of: Boot time tracing using devicetree
Date: Mon, 15 Jul 2019 07:21:27 -0700	[thread overview]
Message-ID: <488a65e6-1d80-0acb-5092-80c18b7ff447@gmail.com> (raw)
In-Reply-To: <156316746861.23477.5815110570539190650.stgit@devnote2>

Hi Masami,

After receiving this email, I replied to one email on the v1 thread,
so there will be a little bit of overlap in the ordering of the two
threads.  Feel free to reply to my comments in the v1 thread in this
thread instead.

More comments below.

On 7/14/19 10:11 PM, Masami Hiramatsu wrote:> Hello,
> 
> Here is the 2nd version of RFC series to add boot-time tracing using
> devicetree. Previous thread is here.
> 
> https://lkml.kernel.org/r/156113387975.28344.16009584175308192243.stgit@devnote2
> 
> In this version, I moved the ftrace node under /chosen/linux,ftrace
> and remove compatible property, because it must be in fixed place.
> Also this version has following features;
> 
>  - Introduce "instance" node, which can have events nodes for setting
>    events filters and actions for the trace instance.
>  - Introduce "cpumask" property
>  - Introduce "ftrace-filters" and "ftrace-notraces"
>  - Introduce "fgraph-filters", "fgraph-notraces" and "fgraph-max-depth"
> 
> At this moment, this feature is only available on the architecutre which
> supports devicetree. For x86, we can use it on qemu with --dtb option,
> or apply below patch on grub to add devicetree support on grub-x86.
> 
> https://github.com/mhiramat/grub/commit/644c35bfd2d18c772cc353b74215344f8264923a
> 
> Note that the devicetree for x86 must contain the nodes only under
> /chosen/, or it may cause a problem if it conflicts with ACPI.
> (Maybe we need to disable the FDT nodes except for nodes under /chosen
>  on boot if ACPI exists.)
> 
> This series can be applied on Steve's tracing tree (ftrace/core) or
> available on below
> 
> https://github.com/mhiramat/linux.git ftrace-devicetree-v2
> 
> Usage
> ======
> With this series, we can setup new kprobe and synthetic events, more
> complicated event filters and trigger actions including histogram
> via devicetree.
> 
> For example, following kernel parameters
> 
> trace_options=sym-addr trace_event=initcall:* tp_printk trace_buf_size=1M ftrace=function ftrace_filter="vfs*"
> 
> it can be written in devicetree like below.
> 
> /{
> chosen {
> 	...
> 	ftrace {
> 		options = "sym-addr";
> 		events = "initcall:*";
> 		tp-printk;
> 		buffer-size-kb = <0x400>;	// 1024KB == 1MB
> 		ftrace-filters = "vfs*";
> 	};
> 
> Moreover, now we can expand it to add filters for events, kprobe events,
> and synthetic events with histogram like below.
> 
> 	ftrace {
> 		...
> 		event0 {
> 			event = "task:task_newtask";
> 			filter = "pid < 128";	// adding filters
> 			enable;
> 		};
> 		event1 {
> 			event = "kprobes:vfs_read";
> 			probes = "vfs_read $arg1 $arg2"; // add kprobes
> 			filter = "common_pid < 200";
> 			enable;
> 		};
> 		event2 {
> 			event = "initcall_latency";	// add synth event
> 			fields = "unsigned long func", "u64 lat";
> 			// with histogram
> 			actions = "hist:keys=func.sym,lat:vals=lat:sort=lat";
> 		};
> 		// and synthetic event callers
> 		event3 {
> 			event = "initcall:initcall_start";
> 			actions = "hist:keys=func:ts0=common_timestamp.usecs";
> 		};
> 		event4 {
> 			event = "initcall:initcall_finish";
> 			actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:onmatch(initcall.initcall_start).initcall_latency(func,$lat)";
> 		};
> 	};
> 
> Also, this version supports "instance" node, which allows us to
> run several tracers for different purpose at once. For example,
> one tracer is for tracing functions in module alpha, and others
> tracing module beta, you can write followings.
> 
> 	ftrace {
> 		instance0 {
> 			instance = "foo";
> 			tracer = "function";
> 			ftrace-filters = "*:mod:alpha";
> 		};
> 		instance1 {
> 			instance = "bar";
> 			tracer = "function";
> 			ftrace-filters = "*:mod:beta";
> 			
> 		};
> 	};
> 
> The instance node also accepts event nodes so that each instance
> can customize its event tracing.
> 
> Discussion
> =====
> On the previous thread, we discussed that the this devicetree usage
> itself was acceptable or not. Fortunately, I had a chance to discuss
> it in a F2F meeting with Frank and Tim last week.

Thanks for writing up some of what we discussed.

Let me add a problem statement and use case.  I'll probably get it at least
a little bit wrong, so please update as needed.

(1) You feel the ftrace kernel command line syntax is not sufficiently user
    friendly.

(2) The kernel command line is too small to contain the full set of desired
    ftrace commands and options.

(3) There is a desire to change the boot time ftrace commands and options
    without re-compiling or re-linking the Linux kernel.


> 
> I think the advantages of using devicetree are,
> 
> - reuse devicetree's structured syntax for complicated tracefs settings
> - reuse OF-APIs in linux kernel to accept and parse it
> - reuse dtc complier to compile it and validate syntax. (with yaml schema,
>   we can enhance it)
> - reuse current bootloader (and qemu) to load it

Devicetree is not a universal data structure and communication channel.

Devicetree is a description of the hardware and also conveys bootloader
specific information that is need by the kernel to boot.


> And we talked about some other ideas to avoid using devicetree.
> 
> - expand kernel command line (ascii command strings)
> - expand kernel command line with base64 encoded comressed ascii command 
>    strings

Base64 being one of possibly many ways to convert arbitrary binary data to
ascii safe data _if_ you want to transfer the ftrace options and commands
in a binary format.


> - load (compressed) ascii command strings to somewhere on memory and pass
>    the address via kernel cmdline

Similar to the way initrd is handled, if I understand correctly.  (I am not
up to date on how initrd location is passed to the kernel for a non-devicetree
kernel.)

Compressed or not compressed would be an ftrace design choice.


> - load (compressed) ascii command strings to somewhere on memory and pass
>    the address via /chosen node (as same as initrd)

Compressed or not compressed would be an ftrace design choice.


> - load binary C data and point it from kernel cmdline
> - load binary C data and point it from /chosen node (as same as initrd)
> - load binary C data as a section of kernel image

For the three options above:

Binary data if ftrace prefers structured data.
A list of strings if ftrace wants to use the existing kernel command line
syntax.

For the third of the above three options, the linker would provide the start
and end address of the ftrace options and commands section.


> The first 2 ideas expand the kernel's cmdline to pass some "magic" command
> to setup ftrace. In both case, the problems are the maximal size of cmdline
> and the issues related to the complexity of commands.

Not a "magic" command.  Either continue using the existing ftrace syntax or
add something like: ftrace_cmd="whatever format ftrace desires".

Why can the maximum size of the cmdline not be increased?


> My example showed that the ftrace settings becomes long even if making one
> histogram, which can be longer than 256 bytes. The long and complex data
> can easily lead mis-typing, but cmdline has no syntax validator, it just
> ignores the mis-typed commands.

Hand typing a kernel command line is already not a fun exercise, even
before adding ftrace commands.  If you are hand typing kernel command
lines then I suggest you improve your tools (eg bootloader or whatever
is not allowing you to edit and store command lines).


> (Of course even with the devicetree, it must be smaller than 2 pages)
> 
> Next 2 ideas are similar, but load the commands on some other memory area
> and pass only address via cmdline. This solves the size limitation issue,
> but still no syntax validation. Of course we can make a new structured
> syntax validator similar to (or just forked from) dt-validate.
> The problem (or disadvantage) of these (and following) ideas, is to change
> the kernel and boot loaders to load another binary blobs on memory.
> 
> Maybe if we introduce a generic structured kernel boot arguments, which is
> a kind of /chosen node of devicetree. (But if there is already such hook,
> why we make another one...?)

I got lost in the next sentence, so for my benefit:
GSKBA == generic structured kernel boot arguments

> Also, this "GSKBA" may introduce a parser and access APIs which will be
> very similar to OF-APIs. This also seems redundant to me.
 

> So the last 3 ideas will avoid introducing new parser and APIs, we just
> compile the data as C data and point it from cmdline or somewhere else.

Or if in a kernel data section then the linker can provide the begin and
end address of the blob.  This is already implemented for some other data
structures.

> With these ideas, we still need to expand boot loaders to support
> loading new binary blobs. (And the last one requires to add elf header
> parser/modifier to boot loader too)

Why would the boot loader need to access the elf header?  The linker
can provide the location of the new kernel data section via kernel
variables.

> 
>>From the above reasons, I think using devicetree's /chosen node is 
> the least intrusive way to introduce this boot-time tracing feature.

This is still mis-use of the devicetree data structure.  This data
does not belong in the devicetree.


> 
> Any suggestions, thoughts?
> 
> Thank you,
> 
> ---
> 
> Masami Hiramatsu (15):
>       tracing: Apply soft-disabled and filter to tracepoints printk
>       tracing: kprobes: Output kprobe event to printk buffer
>       tracing: Expose EXPORT_SYMBOL_GPL symbol
>       tracing: kprobes: Register to dynevent earlier stage
>       tracing: Accept different type for synthetic event fields
>       tracing: Add NULL trace-array check in print_synth_event()
>       dt-bindings: tracing: Add ftrace binding document
>       tracing: of: Add setup tracing by devicetree support
>       tracing: of: Add trace event settings
>       tracing: of: Add kprobe event support
>       tracing: of: Add synthetic event support
>       tracing: of: Add instance node support
>       tracing: of: Add cpumask property support
>       tracing: of: Add function tracer filter properties
>       tracing: of: Add function-graph tracer option properties
> 
> 
>  .../devicetree/bindings/chosen/linux,ftrace.yaml   |  306 ++++++++++++
>  include/linux/trace_events.h                       |    1 
>  kernel/trace/Kconfig                               |   10 
>  kernel/trace/Makefile                              |    1 
>  kernel/trace/ftrace.c                              |   85 ++-
>  kernel/trace/trace.c                               |   90 ++--
>  kernel/trace/trace_events.c                        |    3 
>  kernel/trace/trace_events_hist.c                   |   14 -
>  kernel/trace/trace_events_trigger.c                |    2 
>  kernel/trace/trace_kprobe.c                        |   81 ++-
>  kernel/trace/trace_of.c                            |  507 ++++++++++++++++++++
>  11 files changed, 1004 insertions(+), 96 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/chosen/linux,ftrace.yaml
>  create mode 100644 kernel/trace/trace_of.c
> 
> --
> Masami Hiramatsu (Linaro) <mhiramat@kernel.org>
> 


  parent reply	other threads:[~2019-07-15 14:21 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-15  5:11 [RFC PATCH v2 00/15] tracing: of: Boot time tracing using devicetree Masami Hiramatsu
2019-07-15  5:11 ` [RFC PATCH v2 01/15] tracing: Apply soft-disabled and filter to tracepoints printk Masami Hiramatsu
2019-07-15  5:11 ` [RFC PATCH v2 02/15] tracing: kprobes: Output kprobe event to printk buffer Masami Hiramatsu
2019-07-15  5:11 ` [RFC PATCH v2 03/15] tracing: Expose EXPORT_SYMBOL_GPL symbol Masami Hiramatsu
2019-07-15  5:11 ` [RFC PATCH v2 04/15] tracing: kprobes: Register to dynevent earlier stage Masami Hiramatsu
2019-07-15  5:12 ` [RFC PATCH v2 05/15] tracing: Accept different type for synthetic event fields Masami Hiramatsu
2019-07-15  5:12 ` [RFC PATCH v2 06/15] tracing: Add NULL trace-array check in print_synth_event() Masami Hiramatsu
2019-07-15  5:12 ` [RFC PATCH v2 07/15] dt-bindings: tracing: Add ftrace binding document Masami Hiramatsu
2019-07-15  5:12 ` [RFC PATCH v2 08/15] tracing: of: Add setup tracing by devicetree support Masami Hiramatsu
2019-07-15  5:12 ` [RFC PATCH v2 09/15] tracing: of: Add trace event settings Masami Hiramatsu
2019-07-15  5:12 ` [RFC PATCH v2 10/15] tracing: of: Add kprobe event support Masami Hiramatsu
2019-07-15  5:13 ` [RFC PATCH v2 11/15] tracing: of: Add synthetic " Masami Hiramatsu
2019-07-15  5:13 ` [RFC PATCH v2 12/15] tracing: of: Add instance node support Masami Hiramatsu
2019-07-15  5:13 ` [RFC PATCH v2 13/15] tracing: of: Add cpumask property support Masami Hiramatsu
2019-07-15  5:13 ` [RFC PATCH v2 14/15] tracing: of: Add function tracer filter properties Masami Hiramatsu
2019-07-15  5:13 ` [RFC PATCH v2 15/15] tracing: of: Add function-graph tracer option properties Masami Hiramatsu
2019-07-15 14:21 ` Frank Rowand [this message]
2019-07-16 15:02   ` [RFC PATCH v2 00/15] tracing: of: Boot time tracing using devicetree Masami Hiramatsu
2019-07-22 14:38     ` Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=488a65e6-1d80-0acb-5092-80c18b7ff447@gmail.com \
    --to=frowand.list@gmail.com \
    --cc=Tim.Bird@sony.com \
    --cc=acme@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=robh+dt@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tom.zanussi@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.