linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/6] kernel: introduce uaccess logging
From: Peter Collingbourne @ 2021-12-08  4:48 UTC
  To: Catalin Marinas, Will Deacon, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Thomas Gleixner, Andy Lutomirski, Kees Cook, Andrew Morton,
	Masahiro Yamada, Sami Tolvanen, YiFei Zhu, Mark Rutland,
	Frederic Weisbecker, Viresh Kumar, Andrey Konovalov,
	Peter Collingbourne, Gabriel Krisman Bertazi, Chris Hyser,
	Daniel Vetter, Chris Wilson, Arnd Bergmann, Dmitry Vyukov,
	Christian Brauner, Eric W. Biederman, Alexey Gladkov,
	Ran Xiaokai, David Hildenbrand, Xiaofeng Cao, Cyrill Gorcunov,
	Thomas Cedeno, Marco Elver, Alexander Potapenko
  Cc: linux-kernel, linux-arm-kernel, Evgenii Stepanov

This patch series introduces a kernel feature known as uaccess
logging, which allows userspace programs to be made aware of the
address and size of uaccesses performed by the kernel during
the servicing of a syscall. More details on the motivation
for and interface to this feature are available in the file
Documentation/admin-guide/uaccess-logging.rst added by the final
patch in the series.
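
As a rough illustration of the kind of data involved (this is not code
from the series), the sketch below models a bounded log of
(address, size) records and one plausible consumer: a check that each
logged access lies inside a buffer the program believes it handed to
the kernel. The names struct ulog_entry, struct ulog, ulog_record() and
ulog_count_out_of_bounds() are hypothetical and exist only for this
example; the real interface is the one described in
Documentation/admin-guide/uaccess-logging.rst.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one entry per kernel access to user memory. */
struct ulog_entry {
	uint64_t addr;	/* start address of the access */
	uint64_t size;	/* number of bytes accessed */
};

/* Hypothetical bounded log; the capacity is fixed by the caller. */
struct ulog {
	struct ulog_entry *entries;
	size_t capacity;
	size_t count;
};

/* Append one record. Returns 0 on success, -1 if the log is full. */
static int ulog_record(struct ulog *log, uint64_t addr, uint64_t size)
{
	assert(log != NULL);
	if (log->count == log->capacity)
		return -1;	/* bounded: extra entries are dropped */
	log->entries[log->count].addr = addr;
	log->entries[log->count].size = size;
	log->count++;
	return 0;
}

/*
 * Count the records that are not fully contained in [base, base + len).
 * The comparisons are arranged so that addr + size is never computed,
 * which avoids wrapping past the end of the address space.
 */
static size_t ulog_count_out_of_bounds(const struct ulog *log,
				       uint64_t base, uint64_t len)
{
	size_t i, bad = 0;

	assert(log != NULL);
	for (i = 0; i < log->count; i++) {
		const struct ulog_entry *e = &log->entries[i];

		if (e->addr < base || e->size > len ||
		    e->addr - base > len - e->size)
			bad++;
	}
	return bad;
}
```

A tool would allocate the entry array once, let records accumulate
during the call it is interested in, and then run a check such as
ulog_count_out_of_bounds() over the result.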

Because we don't have a common kernel entry/exit code path that is used
on all architectures, uaccess logging is only implemented for arm64
and architectures that use CONFIG_GENERIC_ENTRY, i.e. x86 and s390.

The proposed interface is the result of numerous iterations and
prototyping and is based on a proposal by Dmitry Vyukov. The interface
preserves the correspondence between uaccess log identity and syscall
identity while tolerating incoming asynchronous signals in the interval
between setting up the logging and the actual syscall. We considered
a number of alternative designs but rejected them for various reasons:

- The design from v1 of this patch [1] proposed notifying the kernel
  of the address and size of the uaccess buffer via a prctl that
  would also automatically mask and unmask asynchronous signals as
  needed, but this would require multiple syscalls per "real" syscall,
  harming performance.

- We considered extending the syscall calling convention to
  designate currently-unused registers to be used to pass the
  location of the uaccess buffer, but this was rejected for being
  architecture-specific.

- One idea that we considered involved using the stack pointer address
  as a unique identifier for the syscall, but this would need to be
  arch-specific, as we do not currently appear to have an arch-generic
  way of retrieving the stack pointer; the userspace side would also
  need some arch-specific code for this to work. It's also possible
  that a longjmp() past the signal handler would make the stack
  pointer address not unique enough for this purpose.

We also evaluated implementing this on top of the existing tracepoint
facility, but concluded that it is not suitable for this purpose:

- Tracepoints have a per-task granularity at best, whereas we really want
  to trace per-syscall. This is so that we can exclude syscalls that
  should not be traced, such as syscalls that make up part of the
  sanitizer implementation (to avoid infinite recursion when e.g. printing
  an error report).

- Tracing would need to be synchronous in order to produce useful
  stack traces. For example, this could be achieved using the new
  SIGTRAP on perf events mechanism. However, this would require logging
  each access to the stack (in the form of a sigcontext), which is more
  likely to overflow the stack: a sigcontext is much larger than a
  uaccess buffer entry, and the number of entries is unbounded, in
  contrast to the bounded buffer size passed to prctl(). An approach
  based on signal handlers is also likely to fall foul of the
  asynchronous signal issues mentioned previously, and sigreturn would
  need to be handled specially (because it copies a sigcontext from
  userspace); otherwise we could never return from the signal handler.
  Furthermore, arguments to the trace events are not available to
  SIGTRAP. (This on its own wouldn't be insurmountable though -- we
  could add the arguments as fields to siginfo.)

- The ftrace API described in
  https://www.kernel.org/doc/Documentation/trace/ftrace.txt (e.g.
  trace_pipe_raw) gives access to the internal ring buffer, but I
  don't think it's usable because it's per-CPU and not per-task.

- Tracepoints can be used by eBPF programs, but eBPF programs may
  only be loaded as root, among other potential headaches.
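
To make the per-syscall point in the first item above concrete, here is
a small model (not code from the series) in which a logging request
applies to exactly one call and is consumed on entry, so a tool's own
internal calls are not logged unless it asks again. The one-shot
behaviour, struct task_model, arm_logging() and model_syscall() are
assumptions made for this sketch, not the interface added by the
patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task state; a kernel would keep this per thread. */
struct task_model {
	bool log_armed;		/* set by the tool before the call to trace */
	size_t logged_calls;	/* number of modeled calls that were logged */
};

/* Tool side: request logging for the next modeled syscall only. */
static void arm_logging(struct task_model *t)
{
	assert(t != NULL);
	t->log_armed = true;
}

/*
 * Modeled kernel side: consume the request on entry so that it covers
 * exactly one call. Returns true if this call was logged.
 */
static bool model_syscall(struct task_model *t)
{
	bool logging;

	assert(t != NULL);
	logging = t->log_armed;
	t->log_armed = false;	/* one-shot: later calls are not logged */
	if (logging)
		t->logged_calls++;
	return logging;
}
```

With per-task granularity the flag would stay set across calls, and a
sanitizer printing an error report from inside its own handler would
be traced as well; consuming the request per call avoids that.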

[1] https://lore.kernel.org/all/20210922061809.736124-1-pcc@google.com/

Peter Collingbourne (6):
  include: split out uaccess instrumentation into a separate header
  uaccess-buffer: add core code
  fs: use copy_from_user_nolog() to copy mount() data
  uaccess-buffer: add CONFIG_GENERIC_ENTRY support
  arm64: add support for uaccess logging
  Documentation: document uaccess logging

 Documentation/admin-guide/index.rst           |   1 +
 Documentation/admin-guide/uaccess-logging.rst | 151 ++++++++++++++++++
 arch/Kconfig                                  |   6 +
 arch/arm64/Kconfig                            |   1 +
 arch/arm64/include/asm/thread_info.h          |   7 +-
 arch/arm64/kernel/ptrace.c                    |   7 +
 arch/arm64/kernel/signal.c                    |   5 +
 arch/arm64/kernel/syscall.c                   |   1 +
 fs/exec.c                                     |   3 +
 fs/namespace.c                                |   8 +-
 include/linux/entry-common.h                  |   2 +
 include/linux/instrumented-uaccess.h          |  53 ++++++
 include/linux/instrumented.h                  |  34 ----
 include/linux/sched.h                         |   5 +
 include/linux/thread_info.h                   |   4 +
 include/linux/uaccess-buffer-info.h           |  46 ++++++
 include/linux/uaccess-buffer.h                | 112 +++++++++++++
 include/linux/uaccess.h                       |   2 +-
 include/uapi/linux/prctl.h                    |   3 +
 include/uapi/linux/uaccess-buffer.h           |  27 ++++
 kernel/Makefile                               |   1 +
 kernel/bpf/helpers.c                          |   7 +-
 kernel/entry/common.c                         |  10 ++
 kernel/fork.c                                 |   4 +
 kernel/signal.c                               |   4 +-
 kernel/sys.c                                  |   6 +
 kernel/uaccess-buffer.c                       | 129 +++++++++++++++
 lib/iov_iter.c                                |   2 +-
 lib/usercopy.c                                |   2 +-
 29 files changed, 602 insertions(+), 41 deletions(-)
 create mode 100644 Documentation/admin-guide/uaccess-logging.rst
 create mode 100644 include/linux/instrumented-uaccess.h
 create mode 100644 include/linux/uaccess-buffer-info.h
 create mode 100644 include/linux/uaccess-buffer.h
 create mode 100644 include/uapi/linux/uaccess-buffer.h
 create mode 100644 kernel/uaccess-buffer.c

-- 
2.34.1.173.g76aa8bc2d0-goog


Thread overview: 22+ messages
2021-12-08  4:48 [PATCH v3 0/6] kernel: introduce uaccess logging Peter Collingbourne
2021-12-08  4:48 ` [PATCH v3 1/6] include: split out uaccess instrumentation into a separate header Peter Collingbourne
2021-12-08  9:27   ` Dmitry Vyukov
2021-12-08  4:48 ` [PATCH v3 2/6] uaccess-buffer: add core code Peter Collingbourne
2021-12-08 10:21   ` Dmitry Vyukov
2021-12-09 22:13     ` Peter Collingbourne
2021-12-08 16:46   ` Marco Elver
2021-12-09 22:14     ` Peter Collingbourne
2021-12-08  4:48 ` [PATCH v3 3/6] fs: use copy_from_user_nolog() to copy mount() data Peter Collingbourne
2021-12-08  9:34   ` Dmitry Vyukov
2021-12-09 21:42     ` Peter Collingbourne
2021-12-10  2:59       ` Dmitry Vyukov
2021-12-10  4:02         ` Peter Collingbourne
2021-12-10  4:23           ` Dmitry Vyukov
2021-12-08  4:48 ` [PATCH v3 4/6] uaccess-buffer: add CONFIG_GENERIC_ENTRY support Peter Collingbourne
2021-12-08  9:43   ` Dmitry Vyukov
2021-12-08  4:48 ` [PATCH v3 5/6] arm64: add support for uaccess logging Peter Collingbourne
2021-12-08  9:48   ` Dmitry Vyukov
2021-12-09 21:44     ` Peter Collingbourne
2021-12-08  4:48 ` [PATCH v3 6/6] Documentation: document uaccess logging Peter Collingbourne
2021-12-08 15:33 ` [PATCH v3 0/6] kernel: introduce uaccess logging Marco Elver
2021-12-09 22:12   ` Peter Collingbourne
