bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiong Wang <jiong.wang@netronome.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiong Wang <jiong.wang@netronome.com>,
	alexei.starovoitov@gmail.com, bpf@vger.kernel.org,
	netdev@vger.kernel.org, oss-drivers@netronome.com
Subject: Re: [PATCH v6 bpf-next 02/17] bpf: verifier: mark verified-insn with sub-register zext flag
Date: Mon, 06 May 2019 23:14:48 +0100	[thread overview]
Message-ID: <87lfzjut5z.fsf@netronome.com> (raw)
In-Reply-To: <76304717-347f-990a-2a5a-0999ebbc3b70@iogearbox.net>


Daniel Borkmann writes:

> On 05/03/2019 12:42 PM, Jiong Wang wrote:
>> eBPF ISA specification requires high 32-bit cleared when low 32-bit
>> sub-register is written. This applies to destination register of ALU32 etc.
>> JIT back-ends must guarantee this semantic when doing code-gen.
>> 
>> x86-64 and arm64 ISA has the same semantic, so the corresponding JIT
>> back-end doesn't need to do extra work. However, 32-bit arches (arm, nfp
>> etc.) and some other 64-bit arches (powerpc, sparc etc), need explicit zero
>> extension sequence to meet such semantic.
>> 
>> This is important, because for code the following:
>> 
>>   u64_value = (u64) u32_value
>>   ... other uses of u64_value
>> 
>> compiler could exploit the semantic described above and save those zero
>> extensions for extending u32_value to u64_value. Hardware, runtime, or BPF
>> JIT back-ends, are responsible for guaranteeing this. Some benchmarks show
>> ~40% sub-register writes out of total insns, meaning ~40% extra code-gen (
>> could go up to more for some arches which requires two shifts for zero
>> extension) because JIT back-end needs to do extra code-gen for all such
>> instructions.
>> 
>> However this is not always necessary in case u32_value is never cast into
>> a u64, which is quite normal in real life program. So, it would be really
>> good if we could identify those places where such type cast happened, and
>> only do zero extensions for them, not for the others. This could save a lot
>> of BPF code-gen.
>> 
>> Algo:
>>  - Split read flags into READ32 and READ64.
>> 
>>  - Record indices of instructions that do sub-register def (write). And
>>    these indices need to stay with reg state so path pruning and bpf
>>    to bpf function call could be handled properly.
>> 
>>    These indices are kept up to date while doing insn walk.
>> 
>>  - A full register read on an active sub-register def marks the def insn as
>>    needing zero extension on dst register.
>> 
>>  - A new sub-register write overrides the old one.
>> 
>>    A new full register write makes the register free of zero extension on
>>    dst register.
>> 
>>  - When propagating read64 during path pruning, also marks def insns whose
>>    defs are hanging active sub-register.
>> 
>> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
>
> [...]
>> +/* This function is supposed to be used by the following 32-bit optimization
>> + * code only. It returns TRUE if the source or destination register operates
>> + * on 64-bit, otherwise return FALSE.
>> + */
>> +static bool is_reg64(struct bpf_verifier_env *env, struct bpf_insn *insn,
>> +		     u32 regno, struct bpf_reg_state *reg, enum reg_arg_type t)
>> +{
>> +	u8 code, class, op;
>> +
>> +	code = insn->code;
>> +	class = BPF_CLASS(code);
>> +	op = BPF_OP(code);
>> +	if (class == BPF_JMP) {
>> +		/* BPF_EXIT for "main" will reach here. Return TRUE
>> +		 * conservatively.
>> +		 */
>> +		if (op == BPF_EXIT)
>> +			return true;
>> +		if (op == BPF_CALL) {
>> +			/* BPF to BPF call will reach here because of marking
>> +			 * caller saved clobber with DST_OP_NO_MARK for which we
>> +			 * don't care the register def because they are anyway
>> +			 * marked as NOT_INIT already.
>> +			 */
>> +			if (insn->src_reg == BPF_PSEUDO_CALL)
>> +				return false;
>> +			/* Helper call will reach here because of arg type
>> +			 * check.
>> +			 */
>> +			if (t == SRC_OP)
>> +				return helper_call_arg64(env, insn->imm, regno);
>> +
>> +			return false;
>> +		}
>> +	}
>> +
>> +	if (class == BPF_ALU64 || class == BPF_JMP ||
>> +	    /* BPF_END always use BPF_ALU class. */
>> +	    (class == BPF_ALU && op == BPF_END && insn->imm == 64))
>> +		return true;
>
> For the BPF_JMP + JA case we don't look at registers, but I presume here
> we 'pretend' to use 64 bit regs to be more conservative as verifier would
> otherwise need to do more complex analysis at the jump target wrt zero
> extension, correct?

is_reg64 is only called by instruction which defines or uses register
value, BPF_JMP + JA doesn't have register define or use, so it won't define
sub-register nor has potential 64-bit read that will cause the associated
32-bit sub-register marked as needing zero extension.

So, BPF_JMP + JA actually doesn't matter to 32-bit opt and won't trigger
is_reg64.

Regards,
Jiong

>
>> +
>> +	if (class == BPF_ALU || class == BPF_JMP32)
>> +		return false;
>> +
>> +	if (class == BPF_LDX) {
>> +		if (t != SRC_OP)
>> +			return BPF_SIZE(code) == BPF_DW;
>> +		/* LDX source must be ptr. */
>> +		return true;
>> +	}
>> +
>> +	if (class == BPF_STX) {
>> +		if (reg->type != SCALAR_VALUE)
>> +			return true;
>> +		return BPF_SIZE(code) == BPF_DW;
>> +	}
>> +
>> +	if (class == BPF_LD) {
>> +		u8 mode = BPF_MODE(code);
>> +
>> +		/* LD_IMM64 */
>> +		if (mode == BPF_IMM)
>> +			return true;
>> +
>> +		/* Both LD_IND and LD_ABS return 32-bit data. */
>> +		if (t != SRC_OP)
>> +			return  false;
>> +
>> +		/* Implicit ctx ptr. */
>> +		if (regno == BPF_REG_6)
>> +			return true;
>> +
>> +		/* Explicit source could be any width. */
>> +		return true;
>> +	}
>> +
>> +	if (class == BPF_ST)
>> +		/* The only source register for BPF_ST is a ptr. */
>> +		return true;
>> +
>> +	/* Conservatively return true at default. */
>> +	return true;
>> +}


  parent reply	other threads:[~2019-05-06 22:14 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-03 10:42 [PATCH v6 bpf-next 00/17] bpf: eliminate zero extensions for sub-register writes Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 01/17] bpf: verifier: offer more accurate helper function arg and return type Jiong Wang
2019-05-06 13:57   ` Daniel Borkmann
2019-05-06 22:25     ` Jiong Wang
2019-05-08 11:12       ` Jiong Wang
2019-05-06 15:50   ` Alexei Starovoitov
2019-05-08 14:45     ` Jiong Wang
2019-05-08 17:51       ` Alexei Starovoitov
2019-05-09 12:32         ` Jiong Wang
2019-05-09 17:31           ` Jiong Wang
2019-05-10  1:53           ` Alexei Starovoitov
2019-05-10  8:30             ` Jiong Wang
2019-05-10 20:10               ` Alexei Starovoitov
2019-05-10 21:59                 ` Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 02/17] bpf: verifier: mark verified-insn with sub-register zext flag Jiong Wang
2019-05-06 13:49   ` Daniel Borkmann
2019-05-06 14:49     ` Daniel Borkmann
2019-05-06 22:14     ` Jiong Wang [this message]
2019-05-03 10:42 ` [PATCH v6 bpf-next 03/17] bpf: verifier: mark patched-insn " Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 04/17] bpf: introduce new alu insn BPF_ZEXT for explicit zero extension Jiong Wang
2019-05-06 15:57   ` Alexei Starovoitov
2019-05-06 23:19     ` Jiong Wang
2019-05-07  4:29       ` Jiong Wang
2019-05-07  4:40         ` Alexei Starovoitov
2019-05-03 10:42 ` [PATCH v6 bpf-next 05/17] bpf: verifier: insert BPF_ZEXT according to zext analysis result Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 06/17] bpf: introduce new bpf prog load flags "BPF_F_TEST_RND_HI32" Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 07/17] bpf: verifier: randomize high 32-bit when BPF_F_TEST_RND_HI32 is set Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 08/17] libbpf: add "prog_flags" to bpf_program/bpf_prog_load_attr/bpf_load_program_attr Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 09/17] selftests: bpf: adjust several test_verifier helpers for insn insertion Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 10/17] selftests: bpf: enable hi32 randomization for all tests Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 11/17] arm: bpf: eliminate zero extension code-gen Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 12/17] powerpc: " Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 13/17] s390: " Jiong Wang
2019-05-03 13:41   ` Heiko Carstens
2019-05-03 13:50     ` Eric Dumazet
2019-05-03 14:09     ` Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 14/17] sparc: " Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 15/17] x32: " Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 16/17] riscv: " Jiong Wang
2019-05-03 10:42 ` [PATCH v6 bpf-next 17/17] nfp: " Jiong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87lfzjut5z.fsf@netronome.com \
    --to=jiong.wang@netronome.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=netdev@vger.kernel.org \
    --cc=oss-drivers@netronome.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).