From: Peter Collingbourne <pcc@google.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
    Will Deacon <will@kernel.org>,
    Ingo Molnar <mingo@redhat.com>,
    Peter Zijlstra <peterz@infradead.org>,
    Juri Lelli <juri.lelli@redhat.com>,
    Vincent Guittot <vincent.guittot@linaro.org>,
    Dietmar Eggemann <dietmar.eggemann@arm.com>,
    Steven Rostedt <rostedt@goodmis.org>,
    Ben Segall <bsegall@google.com>,
    Mel Gorman <mgorman@suse.de>,
    Daniel Bristot de Oliveira <bristot@redhat.com>,
    Thomas Gleixner <tglx@linutronix.de>,
    Andy Lutomirski <luto@kernel.org>,
    Kees Cook <keescook@chromium.org>,
    Andrew Morton <akpm@linux-foundation.org>,
    Masahiro Yamada <masahiroy@kernel.org>,
    Sami Tolvanen <samitolvanen@google.com>,
    YiFei Zhu <yifeifz2@illinois.edu>,
    Colin Ian King <colin.king@canonical.com>,
    Mark Rutland <mark.rutland@arm.com>,
    Frederic Weisbecker <frederic@kernel.org>,
    Viresh Kumar <viresh.kumar@linaro.org>,
    Andrey Konovalov <andreyknvl@gmail.com>,
    Peter Collingbourne <pcc@google.com>,
    Gabriel Krisman Bertazi <krisman@collabora.com>,
    Chris Hyser <chris.hyser@oracle.com>,
    Daniel Vetter <daniel.vetter@ffwll.ch>,
    Chris Wilson <chris@chris-wilson.co.uk>,
    Arnd Bergmann <arnd@arndb.de>,
    Dmitry Vyukov <dvyukov@google.com>,
    Christian Brauner <christian.brauner@ubuntu.com>,
    "Eric W. Biederman" <ebiederm@xmission.com>,
    Alexey Gladkov <legion@kernel.org>,
    Ran Xiaokai <ran.xiaokai@zte.com.cn>,
    David Hildenbrand <david@redhat.com>,
    Xiaofeng Cao <caoxiaofeng@yulong.com>,
    Cyrill Gorcunov <gorcunov@gmail.com>,
    Thomas Cedeno <thomascedeno@google.com>,
    Marco Elver <elver@google.com>,
    Alexander Potapenko <glider@google.com>
Cc: linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org,
    Evgenii Stepanov <eugenis@google.com>
Subject: [PATCH v2 0/5] kernel: introduce uaccess logging
Date: Mon, 22 Nov 2021 21:16:53 -0800
Message-ID: <20211123051658.3195589-1-pcc@google.com>

This patch series introduces a kernel feature known as uaccess
logging, which allows userspace programs to be made aware of the
address and size of uaccesses performed by the kernel during
the servicing of a syscall. More details on the motivation for and
interface to this feature are available in the file
Documentation/admin-guide/uaccess-logging.rst added by the final patch
in the series.

Because we don't have a common kernel entry/exit code path that is used
on all architectures, uaccess logging is only implemented for arm64
and architectures that use CONFIG_GENERIC_ENTRY, i.e. x86 and s390.

The proposed interface is the result of numerous iterations and
prototyping and is based on a proposal by Dmitry Vyukov. The interface
preserves the correspondence between uaccess log identity and syscall
identity while tolerating incoming asynchronous signals in the interval
between setting up the logging and the actual syscall. We considered
a number of alternative designs but rejected them for various reasons:

- The design from v1 of this patch [1] proposed notifying the kernel
  of the address and size of the uaccess buffer via a prctl that would
  also automatically mask and unmask asynchronous signals as needed,
  but this would require multiple syscalls per "real" syscall,
  harming performance.

- We considered extending the syscall calling convention to
  designate currently-unused registers to be used to pass the
  location of the uaccess buffer, but this was rejected for being
  architecture-specific.

- One idea that we considered involved using the stack pointer address
  as a unique identifier for the syscall, but this currently would
  need to be arch-specific as we currently do not appear to have an
  arch-generic way of retrieving the stack pointer; the userspace
  side would also need some arch-specific code for this to work. It's
  also possible that a longjmp() past the signal handler would make
  the stack pointer address not unique enough for this purpose.

We also evaluated implementing this on top of the existing tracepoint
facility, but concluded that it is not suitable for this purpose:

- Tracepoints have a per-task granularity at best, whereas we really
  want to trace per-syscall. This is so that we can exclude syscalls
  that should not be traced, such as syscalls that make up part of the
  sanitizer implementation (to avoid infinite recursion when e.g.
  printing an error report).

- Tracing would need to be synchronous in order to produce useful
  stack traces. For example this could be achieved using the new
  SIGTRAP on perf events mechanism. However, this would require
  logging each access to the stack (in the form of a sigcontext)
  and this is more likely to overflow the stack due to being much
  larger than a uaccess buffer entry as well as being unbounded,
  in contrast to the bounded buffer size passed to prctl(). An
  approach based on signal handlers is also likely to fall foul of
  the asynchronous signal issues mentioned previously, together with
  needing sigreturn to be handled specially (because it copies a
  sigcontext from userspace) otherwise we could never return from the
  signal handler. Furthermore, arguments to the trace events are not
  available to SIGTRAP. (This on its own wouldn't be insurmountable
  though -- we could add the arguments as fields to siginfo.)

- The API in https://www.kernel.org/doc/Documentation/trace/ftrace.txt
  -- e.g. trace_pipe_raw gives access to the internal ring buffer, but
  I don't think it's usable because it's per-CPU and not per-task.

- Tracepoints can be used by eBPF programs, but eBPF programs may
  only be loaded as root, among other potential headaches.

[1] https://lore.kernel.org/all/20210922061809.736124-1-pcc@google.com/

Peter Collingbourne (5):
  fs: use raw_copy_from_user() to copy mount() data
  uaccess-buffer: add core code
  uaccess-buffer: add CONFIG_GENERIC_ENTRY support
  arm64: add support for uaccess logging
  Documentation: document uaccess logging

 Documentation/admin-guide/index.rst           |   1 +
 Documentation/admin-guide/uaccess-logging.rst | 149 ++++++++++++++++++
 arch/Kconfig                                  |   6 +
 arch/arm64/Kconfig                            |   1 +
 arch/arm64/kernel/signal.c                    |   5 +
 arch/arm64/kernel/syscall.c                   |   3 +
 fs/exec.c                                     |   2 +
 fs/namespace.c                                |   7 +-
 include/linux/instrumented.h                  |   5 +-
 include/linux/sched.h                         |   4 +
 include/linux/uaccess-buffer-log-hooks.h      |  59 +++++++
 include/linux/uaccess-buffer.h                |  79 ++++++++++
 include/uapi/linux/prctl.h                    |   3 +
 include/uapi/linux/uaccess-buffer.h           |  25 +++
 kernel/Makefile                               |   1 +
 kernel/bpf/helpers.c                          |   6 +-
 kernel/entry/common.c                         |   7 +
 kernel/fork.c                                 |   3 +
 kernel/signal.c                               |   4 +-
 kernel/sys.c                                  |   6 +
 kernel/uaccess-buffer.c                       | 125 +++++++++++++++
 21 files changed, 497 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/admin-guide/uaccess-logging.rst
 create mode 100644 include/linux/uaccess-buffer-log-hooks.h
 create mode 100644 include/linux/uaccess-buffer.h
 create mode 100644 include/uapi/linux/uaccess-buffer.h
 create mode 100644 kernel/uaccess-buffer.c

--
2.34.0.rc2.393.gf8c9666880-goog