BPF Archive on lore.kernel.org
From: Kris Van Hees <kris.van.hees@oracle.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kris Van Hees <kris.van.hees@oracle.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	dtrace-devel@oss.oracle.com, linux-kernel@vger.kernel.org,
	mhiramat@kernel.org, acme@kernel.org, ast@kernel.org,
	daniel@iogearbox.net, peterz@infradead.org
Subject: Re: [RFC PATCH 00/11] bpf, trace, dtrace: DTrace BPF program type implementation and sample use
Date: Thu, 23 May 2019 01:46:10 -0400
Message-ID: <20190523054610.GR2422@oracle.com>
In-Reply-To: <20190522205329.uu26oq2saj56og5m@ast-mbp.dhcp.thefacebook.com>

On Wed, May 22, 2019 at 01:53:31PM -0700, Alexei Starovoitov wrote:
> On Wed, May 22, 2019 at 01:23:27AM -0400, Kris Van Hees wrote:
> > 
> > Userspace aside, there are various features that are not currently available
> > such as retrieving the ppid of the current task, and various other data items
> > that relate to the current task that triggered a probe.  There are ways to
> > work around it (using the bpf_probe_read() helper, which actually performs a
> > probe_kernel_read()) but that is rather clunky
> Sounds like you're admitting that the access to all kernel data structures
> is actually available, but you don't want to change user space to use it?

I of course agree that access to all kernel structures can be done using the
bpf_probe_read() helper.  But I hope you agree that the availability of that
helper doesn't mean that there is no room for more elegant ways to access
information.  There are already helpers (e.g. bpf_get_current_pid_tgid) that
could be replaced by BPF code that uses bpf_probe_read to accomplish the same

> > triggered the execution.  Often, a single DTrace clause is associated with
> > multiple probes, of different types.  Probes in the kernel (kprobe, perf event,
> > tracepoint, ...) are associated with their own BPF program type, so it is not
> > possible to load the DTrace clause (translated into BPF code) once and
> > associate it with probes of different types.  Instead, I'd have to load it
> > as a BPF_PROG_TYPE_KPROBE program to associate it with a kprobe, and I'd have
> > to load it as a BPF_PROG_TYPE_TRACEPOINT program to associate it with a
> > tracepoint, and so on.  This also means that I suddenly have to add code to
> > the userspace component to know about the different program types with more
> > detail, like what helpers are available to specific program types.
> That also sounds like there is a solution, but you don't want to change user space?

I think there is a difference between a solution and a good solution.  Adding
a lot of knowledge in the userspace component about how things are implemented
at the kernel level makes for a more fragile infrastructure and involves
breaking down well established boundaries in DTrace that are part of the design
specifically to ensure that userspace doesn't need to depend on such intimate

> > Another advantage of being able to operate on a more abstract probe concept
> > that is not tied to a specific probe type is that the userspace component does
> > not need to know about the implementation details of the specific probes.
> If that is indeed the case then dtrace is broken _by design_
> and nothing on the kernel side can fix it.
> bpf prog attached to NMI is running in NMI.
> That is very different execution context vs kprobe.
> kprobe execution context is also different from syscall.
> The user writing the script has to be aware in what context
> that script will be executing.

The design behind DTrace definitely recognizes that different types of probes
operate in different ways and have different data associated with them.  That
is why probes (in legacy DTrace) are managed by providers, one for each type
of probe.  The providers handle the specifics of a probe type, and provide a
generic probe API to the processing component of DTrace:

    SDT probes -----> SDT provider -------+
    FBT probes -----> FBT provider -------+--> DTrace engine
    syscall probes -> systrace provider --+

This means that the DTrace processing component can be implemented based on a
generic probe concept, and the providers will take care of the specifics.  In
that sense, it is similar to so many other parts of the kernel where a generic
API is exposed so that higher level components don't need to know implementation

In DTrace, people write scripts based on UAPI-style interfaces and they don't
have to concern themselves with e.g. knowing how to get the value of the 3rd
argument that was passed by the firing probe.  All they need to know is that
the probe will have a 3rd argument, and that the 3rd argument to *any* probe
can be accessed as 'arg2' (or args[2] for typed arguments, if the provider is
capable of providing that).  Different probes have different ways of passing
arguments, and only the provider code for each probe type needs to know how
to retrieve the argument values.
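
To make the argument-access point concrete, here is a minimal userspace
sketch in C of the provider idea described above.  It is not code from DTrace
or from this patch series: the struct layouts, the function names, and the
way each probe type stores its arguments are all hypothetical and chosen only
for illustration.  The one thing it is meant to show is that the engine asks
for 'arg2' through a single generic call, and only the provider knows where
that value lives.

```c
#include <stdint.h>

/*
 * Hypothetical raw probe state.  Each probe type keeps its arguments in a
 * different shape; these layouts are assumptions made for this example.
 */
struct sdt_state { uint64_t argv[4]; };         /* arguments in an array */
struct fbt_state { uint64_t di, si, dx, cx; };  /* arguments in register-like fields */

/* Generic probe API that every provider implements (illustrative names). */
struct provider {
    const char *name;
    uint64_t (*get_arg)(const void *state, int n);
};

/* SDT-style provider: argument n is simply the n-th array slot. */
static uint64_t sdt_get_arg(const void *state, int n)
{
    const struct sdt_state *s = state;

    return (n >= 0 && n < 4) ? s->argv[n] : 0;
}

/* FBT-style provider: argument n has to be mapped to a named field. */
static uint64_t fbt_get_arg(const void *state, int n)
{
    const struct fbt_state *s = state;

    switch (n) {
    case 0: return s->di;
    case 1: return s->si;
    case 2: return s->dx;
    case 3: return s->cx;
    default: return 0;
    }
}

static const struct provider sdt_provider = { "sdt", sdt_get_arg };
static const struct provider fbt_provider = { "fbt", fbt_get_arg };

/*
 * Engine side: 'arg2' is always the 3rd argument, for any probe.  The engine
 * never looks inside the probe state itself.
 */
static uint64_t engine_arg2(const struct provider *p, const void *state)
{
    return p->get_arg(state, 2);
}
```

With this shape, a clause that reads 'arg2' can be handed either
(&sdt_provider, &some_sdt_state) or (&fbt_provider, &some_fbt_state) without
any change to the clause or to engine_arg2().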

Does this help bring clarity to the reasons why an abstract (generic) probe
concept is part of DTrace's design?

