From: Josef Bacik <jbacik@fb.com>
To: Steven Rostedt <rostedt@goodmis.org>,
	"ksummit-discuss@lists.linux-foundation.org"
	<ksummit-discuss@lists.linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Josef Bacik <josef@toxicpanda.com>
Subject: Re: [Ksummit-discuss] [MAINTAINER TOPIC] tracepoints without user space interfaces
Date: Wed, 20 Sep 2017 14:54:07 +0000
Message-ID: <0C1E6F2D-2E7D-4477-9F35-8C59F62BB409@fb.com>
In-Reply-To: <20170920095031.1972fba5@gandalf.local.home>

Cc’ing my personal address so I can reply with a sane email client.

On 9/20/17, 9:50 AM, "Steven Rostedt" <rostedt@goodmis.org> wrote:

The topic came up again at (of all places) the Schedule Workloads
Microconf at Linux Plumbers in LA last week. The issue is the addition
of tracepoints in locations where maintainers don't want them, only
because they don't want them to become an ABI for user space tools,
which would then have to be supported indefinitely and could prevent
future development of the kernel. This includes the scheduler as well
as VFS (mandated by Al Viro).

The current solution by Facebook (as told to us by Josef Bacik) is to
just hand-write kprobes with BPF programs at the locations they need.
When they get a new kernel, they just rewrite the programs, because the
kprobes and BPF programs break (or can break) at each new release.

At first it was suggested to add a hook at locations where it would be
easier to get at variables, since the compiler can optimize them out,
which makes it difficult even with BPF and kprobes to get the
information one would like to have. It was asked whether we could add
tracepoint hooks in these locations that are not exported to user
space, where they would run the risk of becoming an ABI. It was pointed
out that this mechanism already exists in the kernel.

A tracepoint is the hook in the kernel. The TRACE_EVENT() macro is
built on top of a tracepoint to export it to user space. But the
tracepoint itself can be manually added anywhere and there will be no
creation of trace event files in the tracefs directory, nor would perf
be able to access it. But the advantage of having this hook is that a
kernel module could access it without a problem.
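
As a rough illustration of that split, here is a toy userspace model
in C. It is not kernel code: tp_register(), tp_unregister(),
context_switch() and struct switch_stats are hypothetical names made up
for this sketch. In the kernel, the bare hook is what DECLARE_TRACE()
and DEFINE_TRACE() provide, and a module attaches to it with
register_trace_<name>(). The model only shows the shape of the idea: a
hook site that calls whatever probes are attached, with nothing
exported to user space.

```c
#include <stddef.h>

/* Toy model of a bare tracepoint: a hook site plus attached probes.
 * Every name here is hypothetical; this is not the kernel API. */
#define MAX_PROBES 4

typedef void (*probe_fn)(void *data, int prev_pid, int next_pid);

struct tracepoint {
	probe_fn funcs[MAX_PROBES];
	void *data[MAX_PROBES];
	size_t nr;
};

/* The hook itself. No event file or perf interface is created for it. */
struct tracepoint tp_sched_switch;

/* Attach a probe; returns 0 on success, -1 if the table is full. */
int tp_register(struct tracepoint *tp, probe_fn fn, void *data)
{
	if (tp->nr == MAX_PROBES)
		return -1;
	tp->funcs[tp->nr] = fn;
	tp->data[tp->nr] = data;
	tp->nr++;
	return 0;
}

/* Detach a probe; returns 0 on success, -1 if it was not attached. */
int tp_unregister(struct tracepoint *tp, probe_fn fn, void *data)
{
	for (size_t i = 0; i < tp->nr; i++) {
		if (tp->funcs[i] == fn && tp->data[i] == data) {
			/* Fill the gap with the last entry. */
			tp->nr--;
			tp->funcs[i] = tp->funcs[tp->nr];
			tp->data[i] = tp->data[tp->nr];
			return 0;
		}
	}
	return -1;
}

/* Hook site: does nothing unless a probe is attached. */
static void trace_sched_switch(int prev_pid, int next_pid)
{
	for (size_t i = 0; i < tp_sched_switch.nr; i++)
		tp_sched_switch.funcs[i](tp_sched_switch.data[i],
					 prev_pid, next_pid);
}

/* Stand-in for the instrumented core code. */
void context_switch(int prev_pid, int next_pid)
{
	trace_sched_switch(prev_pid, next_pid);
	/* ... the actual switch would happen here ... */
}

/* "Module" side: private analysis state and the probe feeding it. */
struct switch_stats {
	unsigned long count;
	int last_next;
};

void count_switch(void *data, int prev_pid, int next_pid)
{
	struct switch_stats *st = data;

	(void)prev_pid;
	st->count++;
	st->last_next = next_pid;
}
```

A custom tool in this model would call tp_register(&tp_sched_switch,
count_switch, &stats) on load and tp_unregister() on unload. The
instrumented code never learns who is listening, which is why the
arguments passed at the hook site can change without breaking any
user space interface.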

By adding tracepoints in the scheduler and VFS, without the TRACE_EVENT
macros that export them to user space, it would be much easier for
companies like Facebook, Red Hat and SuSE to add a module that can tap
into these hooks and build their custom analysis tools on top.

Requiring an external and custom module to access the tracepoints on
live systems (that is, an unmodified vanilla kernel or distro kernel)
will help these companies implement advanced analytical tools to monitor
their production kernels, and because it requires a module, and it has
been stated several times in the past that there is no KABI with module
interfaces, the maintainers of these hooks should have no fear that
they will become a stable interface.

Now, I will also point out that if one of the tracepoint hooks proves
to be useful for a generic tool, then this could be an incentive to have
the maintainer change the tracepoint hook into a full-blown
TRACE_EVENT() and upgrade it to an ABI, after having time to see how it
is useful. This is a better method than having tens of trace events
where one random one proves to be useful for tools and surprises the
maintainer that the code it affects can no longer be changed.

Thoughts?

-- Steve



Thread overview: 7+ messages
2017-09-20 13:50 [Ksummit-discuss] [MAINTAINER TOPIC] tracepoints without user space interfaces Steven Rostedt
2017-09-20 14:54 ` Josef Bacik [this message]
2017-09-20 15:04   ` Josef Bacik
2017-09-20 15:13     ` Steven Rostedt
2017-09-21  9:45       ` Sergey Senozhatsky
2017-09-29 23:50     ` Alexei Starovoitov
2017-10-04  0:55       ` Steven Rostedt
