From: Yonghong Song <yhs@fb.com>
To: "Daniel T. Lee" <danieltimlee@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: <bpf@vger.kernel.org>, <netdev@vger.kernel.org>
Subject: Re: [PATCH] samples: bpf: fix tracex2 due to empty sys_write count argument
Date: Fri, 10 Dec 2021 10:28:12 -0800 [thread overview]
Message-ID: <e23dbf45-d75b-8411-bc62-d0e26cccab8a@fb.com> (raw)
In-Reply-To: <20211210111918.4904-1-danieltimlee@gmail.com>
On 12/10/21 3:19 AM, Daniel T. Lee wrote:
> Currently from syscall entry, argument can't be fetched correctly as a
> result of register cleanup.
>
> commit 6b8cf5cc9965 ("x86/entry/64/compat: Clear registers for compat syscalls, to reduce speculation attack surface")
>
> For example in upper commit, registers are cleaned prior to syscall.
> To be more specific, sys_write syscall has count size as a third argument.
> But this can't be fetched from __x64_sys_enter/__s390x_sys_enter due to
> register cleanup. (e.g. [x86] xorl %r8d, %r8d / [s390x] xgr %r7, %r7)
is this the real reason? Did you build 32-bit user space application?
Note that the above commit is for compat syscalls.
>
> This commit fix this problem by modifying the trace event to ksys_write
> instead of sys_write syscall entry.
>
> # Wrong example of 'write()' syscall argument fetching
> # ./tracex2
> ...
> pid 50909 cmd dd uid 0
> syscall write() stats
> byte_size : count distribution
> 1 -> 1 : 4968837 |************************************* |
>
> # Successful example of 'write()' syscall argument fetching
> # (dd's write bytes at a time defaults to 512)
> # ./tracex2
> ...
> pid 3095 cmd dd uid 0
> syscall write() stats
> byte_size : count distribution
> ...
> 256 -> 511 : 0 | |
> 512 -> 1023 : 4968844 |************************************* |
>
> Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
> ---
> samples/bpf/tracex2_kern.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/samples/bpf/tracex2_kern.c b/samples/bpf/tracex2_kern.c
> index 5bc696bac27d..96dff3bea227 100644
> --- a/samples/bpf/tracex2_kern.c
> +++ b/samples/bpf/tracex2_kern.c
> @@ -78,7 +78,7 @@ struct {
> __uint(max_entries, 1024);
> } my_hist_map SEC(".maps");
>
> -SEC("kprobe/" SYSCALL(sys_write))
> +SEC("kprobe/ksys_write")
> int bpf_prog3(struct pt_regs *ctx)
> {
> long write_size = PT_REGS_PARM3(ctx);
I think the real reason of the failure is due to SYSCALL_WRAPPER.
Please take a look at test_probe_write_user_kern.c.
The issue with ksys_write() is that it can easily be inlined. For
example, the source code,
ssize_t ksys_write(unsigned int fd, const char __user *buf, size_t count)
{
...
}
SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
size_t, count)
{
return ksys_write(fd, buf, count);
}
In my 5.12 kernel, I have
ffffffff81053b00 <__x64_sys_write>:
ffffffff81053b00: 0f 1f 44 00 00 nopl (%rax,%rax)
ffffffff81053b05: 41 57 pushq %r15
ffffffff81053b07: 41 56 pushq %r14
ffffffff81053b09: 41 55 pushq %r13
ffffffff81053b0b: 41 54 pushq %r12
ffffffff81053b0d: 53 pushq %rbx
ffffffff81053b0e: 48 83 ec 10 subq $16, %rsp
ffffffff81053b12: 65 48 8b 04 25 28 00 00 00 movq %gs:40, %rax
ffffffff81053b1b: 48 89 44 24 08 movq %rax, 8(%rsp)
ffffffff81053b20: 8b 47 70 movl 112(%rdi), %eax
ffffffff81053b23: 4c 8b 7f 60 movq 96(%rdi), %r15
ffffffff81053b27: 4c 8b 67 68 movq 104(%rdi), %r12
ffffffff81053b2b: 89 c7 movl %eax, %edi
ffffffff81053b2d: e8 6e a3 00 00 callq 0xffffffff8105dea0
<__fdget_pos>
...
The ksys_write() is inlined.
prev parent reply other threads:[~2021-12-10 18:28 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-10 11:19 [PATCH] samples: bpf: fix tracex2 due to empty sys_write count argument Daniel T. Lee
2021-12-10 18:00 ` John Fastabend
2021-12-10 18:28 ` Yonghong Song [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e23dbf45-d75b-8411-bc62-d0e26cccab8a@fb.com \
--to=yhs@fb.com \
--cc=andrii.nakryiko@gmail.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=danieltimlee@gmail.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.