From: Andrei Vagin <avagin@gmail.com>
To: Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	keno@juliacomputing.com, dave.martin@arm.com
Cc: Oleg Nesterov <oleg@redhat.com>,
	linux-arm-kernel@lists.infradead.org,
	LKML <linux-kernel@vger.kernel.org>,
	Andrei Vagin <avagin@google.com>,
	Howard Zhang <howard.zhang@arm.com>,
	Anthony Steinhauser <asteinhauser@google.com>
Subject: Re: [PATCH 0/3] arm64/ptrace: allow to get all registers on syscall traps
Date: Wed, 27 Jan 2021 00:10:30 -0800
Message-ID: <CANaxB-zwjDu5PSFJebeJe5zH94HC7mThOwyPYSjE4tkQ0zwvBA@mail.gmail.com>
In-Reply-To: <20210119220637.494476-1-avagin@gmail.com>

On Tue, Jan 19, 2021 at 2:08 PM Andrei Vagin <avagin@gmail.com> wrote:
>
> Right now, ip/r12 for AArch32 and x7 for AArch64 is used to indicate
> whether or not the stop has been signalled from syscall entry or syscall
> exit. This means that:
>
> - Any writes by the tracer to this register during the stop are
>   ignored/discarded.
>
> - The actual value of the register is not available during the stop,
>   so the tracer cannot save it and restore it later.
>
> This series introduces NT_ARM_PRSTATUS to get all registers and makes it
> possible to change ip/r12 and x7 registers when tracee is stopped in
> syscall traps.
>
> For applications like the user-mode Linux or gVisor, it is critical to
> have access to the full set of registers at any moment. For example,
> they need to change values of all registers to emulate rt_sigreturn and
> they need to have the full set of registers to build a signal frame.

I have found the thread [1] where Keno, Will, and Dave discussed the same
problem. If I understand correctly, the problem was not fixed because there
were no users who needed it.

gVisor is a general-purpose sandbox to run untrusted workloads. It has a
platform interface that is responsible for syscall interception, context
switching, and managing process address spaces. Right now, we have kvm and
ptrace platforms. The ptrace platform runs guest code in the context of stub
processes and intercepts syscalls with the help of PTRACE_SYSEMU. All system
calls, including rt_sigreturn and execve, are handled by the gVisor kernel.
Signal handling happens inside the gVisor kernel too. Each stub process can
have more than one thread, but we don't bind guest threads to stub threads, and
we can run more than one guest thread in the context of one stub thread. Taking
all these facts into account, we need to have access to all registers at any
moment when a stub thread is stopped.

We were able to introduce the workaround [3] for this issue. Each time a stub
process is stopped on a system call, we queue a fake signal and resume the
process so that it stops again on the signal. It works, but it requires an
extra interaction with the stub process, which is expensive. My benchmarks show
that this workaround slows down syscalls in gVisor by more than 50%. BTW: the
cost of such extra stops is one of the major reasons why PTRACE_SYSEMU was
introduced instead of emulating it via two calls of PTRACE_SYSCALL.
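
To make the extra round trip concrete, here is a rough C sketch of the
fake-signal idea. It is not the gVisor code from [3]; it only illustrates the
sequence of stops. Assumptions: PTRACE_SYSCALL is used as a portable stand-in
for PTRACE_SYSEMU (so the syscall still executes), SIGUSR1 plays the role of
the fake signal, and the names FAKE_SIG, wait_for_stop(), trace_one_syscall()
and fake_signal_workaround_demo() are made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <elf.h>          /* NT_PRSTATUS */
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>      /* struct iovec */
#include <sys/wait.h>
#include <unistd.h>

#define FAKE_SIG SIGUSR1  /* hypothetical choice of the "fake" signal */

/* Wait for the tracee and check that it stopped with the expected signal. */
static int wait_for_stop(pid_t pid, int expected_sig)
{
	int status;

	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return (WIFSTOPPED(status) && WSTOPSIG(status) == expected_sig) ? 0 : -1;
}

/* Tracer side: returns 0 if every expected stop was observed, -1 otherwise. */
static int trace_one_syscall(pid_t pid)
{
	unsigned long regs[128];  /* generic buffer, large enough for NT_PRSTATUS */
	struct iovec iov = { .iov_base = regs, .iov_len = sizeof(regs) };
	int status;

	/* The tracee stops itself with SIGSTOP right after PTRACE_TRACEME. */
	if (wait_for_stop(pid, SIGSTOP) < 0)
		return -1;
	if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
		   (void *)(long)PTRACE_O_TRACESYSGOOD) < 0)
		return -1;

	/* 1. Run to the next syscall stop (reported as SIGTRAP | 0x80). */
	if (ptrace(PTRACE_SYSCALL, pid, NULL, NULL) < 0 ||
	    wait_for_stop(pid, SIGTRAP | 0x80) < 0)
		return -1;

	/* 2. Queue the fake signal and resume; the tracee stops again on it. */
	if (kill(pid, FAKE_SIG) < 0 ||
	    ptrace(PTRACE_CONT, pid, NULL, NULL) < 0 ||
	    wait_for_stop(pid, FAKE_SIG) < 0)
		return -1;

	/* 3. In the signal-delivery stop, read the general-purpose registers. */
	if (ptrace(PTRACE_GETREGSET, pid, (void *)(long)NT_PRSTATUS, &iov) < 0)
		return -1;
	assert(iov.iov_len > 0 && iov.iov_len <= sizeof(regs));

	/* 4. Resume with signal 0, so the fake signal is never delivered. */
	if (ptrace(PTRACE_CONT, pid, NULL, NULL) < 0 ||
	    waitpid(pid, &status, 0) != pid)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

int fake_signal_workaround_demo(void)
{
	pid_t pid = fork();
	int ret;

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Tracee: play the role of a stub process. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(1);
		kill(getpid(), SIGSTOP);  /* let the tracer take control */
		syscall(SYS_getpid);      /* the syscall to be intercepted */
		_exit(0);
	}

	ret = trace_one_syscall(pid);
	if (ret < 0) {
		/* Best-effort cleanup if the sequence went wrong. */
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
	}
	return ret;
}
```

Calling fake_signal_workaround_demo() should return 0 when ptrace is permitted
in the environment. Step 2 is the extra resume/wait pair per intercepted
syscall that the workaround adds.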


[1] https://lore.kernel.org/lkml/CABV8kRz0mKSc=u1LeonQSLroKJLOKWOWktCoGji2nvEBc=e7=w@mail.gmail.com/#r
[2] https://github.com/google/gvisor/issues/5238
[3] https://github.com/google/gvisor/commit/a44efaab6d4b815880749a39647fb3ed9634a489

>
> Andrei Vagin (3):
>   arm64/ptrace: don't clobber task registers on syscall entry/exit traps
>   arm64/ptrace: introduce NT_ARM_PRSTATUS to get a full set of registers
>   selftest/arm64/ptrace: add a test for NT_ARM_PRSTATUS
>
>  arch/arm64/include/asm/ptrace.h               |   5 +
>  arch/arm64/kernel/ptrace.c                    | 130 +++++++++++-----
>  include/uapi/linux/elf.h                      |   1 +
>  tools/testing/selftests/arm64/Makefile        |   2 +-
>  tools/testing/selftests/arm64/ptrace/Makefile |   6 +
>  .../arm64/ptrace/ptrace_syscall_regs_test.c   | 142 ++++++++++++++++++
>  6 files changed, 246 insertions(+), 40 deletions(-)
>  create mode 100644 tools/testing/selftests/arm64/ptrace/Makefile
>  create mode 100644 tools/testing/selftests/arm64/ptrace/ptrace_syscall_regs_test.c
>
> --
> 2.29.2
>
