From: Xu Kuohai <xukuohai@huawei.com>
To: Daniel Borkmann <daniel@iogearbox.net>,
	Xu Kuohai <xukuohai@huaweicloud.com>, <bpf@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, Zi Shen Lim <zlim.lnx@gmail.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, <mark.rutland@arm.com>,
	<revest@chromium.org>
Subject: Re: [RESEND PATCH bpf-next 1/2] bpf, arm64: Jit BPF_CALL to direct call when possible
Date: Tue, 27 Sep 2022 22:01:07 +0800
Message-ID: <30bbaf09-1c2f-72fb-98cd-afe75849261c@huawei.com>
In-Reply-To: <21073277-5bbd-5555-88f2-76b07ad9b74f@iogearbox.net>

On 9/27/2022 4:29 AM, Daniel Borkmann wrote:
> [ +Mark/Florent ]
> 
> On 9/19/22 11:21 AM, Xu Kuohai wrote:
>> From: Xu Kuohai <xukuohai@huawei.com>
>>
>> Currently BPF_CALL is always jited to indirect call, but when target is
>> in the range of direct call, BPF_CALL can be jited to direct call.
>>
>> For example, the following BPF_CALL
>>
>>      call __htab_map_lookup_elem
>>
>> is always jited to an indirect call:
>>
>>      mov     x10, #0xffffffffffff18f4
>>      movk    x10, #0x821, lsl #16
>>      movk    x10, #0x8000, lsl #32
>>      blr     x10
>>
>> When the target is in the range of direct call, it can be jited to:
>>
>>      bl      0xfffffffffd33bc98
>>
>> This patch does such jit when possible.
>>
>> 1. First pass, get the maximum jited image size. Since the jited image
>>     memory is not allocated yet, the distance between jited BPF_CALL
>>     instruction and call target is unknown, so jit all BPF_CALL to indirect
>>     call to get the maximum image size.
>>
>> 2. Allocate image memory with the size calculated in step 1.
>>
>> 3. Second pass, determine the jited address and size for every bpf instruction.
>>     Since image memory is now allocated and there is only one jit method for
>>     bpf instructions other than BPF_CALL, so the jited address for the first
>>     BPF_CALL is determined, so the distance to call target is determined, so
>>     the first BPF_CALL is determined to be jited to direct or indirect call,
>>     so the jited image size after the first BPF_CALL is determined. By analogy,
>>     the jited addresses and sizes for all subsequent BPF instructions are
>>     determined.
>>
>> 4. Last pass, generate the final image. The jump offset of jump instruction
>>     whose target is within the jited image is determined in this pass, since
>>     the target instruction address may be changed in step 3.
> 
> Wouldn't this require similar convergence process like in x86-64 JIT? You state
> the jump instructions are placed in step 4 because step 3 could have changed their
> offsets, but then after step 4, couldn't also again the offsets have changed for
> the target addresses from 3 again in some corner cases (given emit_a64_mov_i() is
> used also in jump encoding)?
> 

IIUC, x86 needs a convergence process because the length of an x86 jmp
instruction depends on the size of its immediate. Adjusting the immediate can
change the instruction length, which shifts all subsequent instructions and in
turn changes the distances between instructions. On arm64, however, every
instruction is fixed at 4 bytes regardless of its immediate, so adjusting the
immediate of an arm64 jump instruction changes neither its length nor its
position.
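
To illustrate the fixed-size point, here is a minimal sketch (not taken from
the patch) of how an arm64 bl could be range-checked and encoded. The helper
names a64_bl_in_range() and a64_encode_bl() are hypothetical and only serve
this example. It relies on the architectural bl encoding: a 26-bit signed
immediate scaled by 4, giving a reach of +/-128 MB. Whatever the offset is,
the result is always one 32-bit word.

```c
#include <stdbool.h>
#include <stdint.h>

#define A64_BL_OPCODE	0x94000000u	/* bl: bits [31:26] = 100101 */
#define A64_IMM26_MASK	0x03ffffffu	/* bits [25:0] hold offset / 4 */

/* Hypothetical helper: can a bl placed at pc reach target? */
static bool a64_bl_in_range(uint64_t pc, uint64_t target)
{
	int64_t off = (int64_t)(target - pc);

	/* Offset must be 4-byte aligned and fit in a signed 28-bit range. */
	return (off & 3) == 0 &&
	       off >= -(INT64_C(1) << 27) &&
	       off < (INT64_C(1) << 27);
}

/*
 * Hypothetical helper: encode bl from pc to target. The caller is
 * assumed to have checked a64_bl_in_range() first. Only the immediate
 * field depends on the offset; the instruction size never does.
 */
static uint32_t a64_encode_bl(uint64_t pc, uint64_t target)
{
	uint64_t off = target - pc;

	return A64_BL_OPCODE | ((uint32_t)(off >> 2) & A64_IMM26_MASK);
}
```

A forward call and a backward call differ only in the low 26 bits of the
encoded word, so re-emitting a branch with a new offset cannot move any
later instruction.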

For BPF_CALL, the arguments passed to emit_call() and emit_a64_mov_i() (if
called) are the same in steps 3 and 4, so the jited result does not change.
This is also true for other non-BPF_JMP instructions.

So no convergence is required on arm64.
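
The argument can be sketched as a toy layout pass. This is not the code in
the patch: struct sketch_insn, bl_reachable() and layout_pass() are
hypothetical names, every non-call BPF instruction is assumed to jit to a
single arm64 instruction, and an indirect call is assumed to take 4
instructions as in the mov/movk/movk/blr example quoted above. The point it
tries to show is that once the image base is known, each call's choice
depends only on the offsets of the instructions before it, so repeating the
pass yields the same layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INSN_SIZE		4	/* every arm64 instruction is 4 bytes */
#define INDIRECT_CALL_INSNS	4	/* mov + movk + movk + blr (assumed) */
#define DIRECT_CALL_INSNS	1	/* bl */

/* Hypothetical, simplified view of one BPF instruction. */
struct sketch_insn {
	bool is_call;
	uint64_t target;	/* call target address, used only for calls */
};

static bool bl_reachable(uint64_t pc, uint64_t target)
{
	int64_t off = (int64_t)(target - pc);

	return off >= -(INT64_C(1) << 27) && off < (INT64_C(1) << 27);
}

/*
 * Fill offsets[0..n] with the byte offset of each jited instruction
 * (offsets[n] is the image size) and return that size. With
 * have_image == false this models the first pass: the image address is
 * unknown, so every call is sized as an indirect call.
 */
static size_t layout_pass(const struct sketch_insn *prog, size_t n,
			  uint64_t image_base, bool have_image,
			  size_t *offsets)
{
	size_t off = 0;

	for (size_t i = 0; i < n; i++) {
		size_t insns = 1;

		offsets[i] = off;
		if (prog[i].is_call) {
			bool direct = have_image &&
				bl_reachable(image_base + off, prog[i].target);

			insns = direct ? DIRECT_CALL_INSNS
				       : INDIRECT_CALL_INSNS;
		}
		off += insns * INSN_SIZE;
	}
	offsets[n] = off;
	return off;
}
```

Running layout_pass() once without an image gives the upper bound used for
allocation; running it again with the image base gives the final offsets,
and a further run with the same base reproduces them, since nothing it
reads has changed.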

>> Tested with test_bpf.ko and some arm64 working selftests, nothing failed.

[...]

