bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* bpf jit PPC64 (BE) test_verifier PTR_TO_STACK store/load failure
@ 2019-03-13 10:54 Yauheni Kaliuta
  2019-03-13 13:51 ` Naveen N. Rao
  2019-03-13 22:14 ` Segher Boessenkool
  0 siblings, 2 replies; 4+ messages in thread
From: Yauheni Kaliuta @ 2019-03-13 10:54 UTC (permalink / raw)
  To: Naveen N. Rao, Sandipan Das, Daniel Borkmann, Michael Ellerman
  Cc: bpf, linuxppc-dev, netdev, Jiri Olsa

Hi!

I found a failure:

```
# ./test_verifier 722
#722/u PTR_TO_STACK store/load FAIL retval -1 != -87117812 
0: (bf) r1 = r10
1: (07) r1 += -10
2: (7a) *(u64 *)(r1 +2) = -87117812
3: (79) r0 = *(u64 *)(r1 +2)
4: (95) exit
processed 5 insns (limit 131072), stack depth 8
#722/p PTR_TO_STACK store/load FAIL retval -1 != -87117812 
0: (bf) r1 = r10
1: (07) r1 += -10
2: (7a) *(u64 *)(r1 +2) = -87117812
3: (79) r0 = *(u64 *)(r1 +2)
4: (95) exit
processed 5 insns (limit 131072), stack depth 8
Summary: 0 PASSED, 0 SKIPPED, 2 FAILED
```

The reason is in the JIT. The code is jitted into:

[...]
d00000000580e7f8:       f9 23 00 00     std     r9,0(r3)
d00000000580e7fc:       e9 03 00 02     lwa     r8,0(r3)
[...]

so, it stores DW to the location r3, but loads W, i.e. in BE it is:

saves
r3: FF FF FF FF FA CE B0 0C
loads
r3: FF FF FF FF

(in LE it works semicorretly, saves 0C B0 CE FA FF FF FF FF, loads 0C B0 CE FA)

This is because of the handling of the +2 offset. For stores it is:


#define PPC_STD(r, base, i)	EMIT(PPC_INST_STD | ___PPC_RS(r) |	      \
				     ___PPC_RA(base) | ((i) & 0xfffc))

and for loads
#define PPC_LD(r, base, i)	EMIT(PPC_INST_LD | ___PPC_RT(r) |	      \
				     ___PPC_RA(base) | IMM_L(i))
#define IMM_L(i)		((uintptr_t)(i) & 0xffff)

So, in the load case the offset +2 (immediate value) is not
masked and turns the instruction to lwa instead of ld.


Would it be correct to & 0xfffc the immediate value as well?

BTW, the full run on big endian:

Summary: 1190 PASSED, 125 SKIPPED, 4 FAILED

-- 
WBR,
Yauheni Kaliuta

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bpf jit PPC64 (BE) test_verifier PTR_TO_STACK store/load failure
  2019-03-13 10:54 bpf jit PPC64 (BE) test_verifier PTR_TO_STACK store/load failure Yauheni Kaliuta
@ 2019-03-13 13:51 ` Naveen N. Rao
  2019-03-13 22:14 ` Segher Boessenkool
  1 sibling, 0 replies; 4+ messages in thread
From: Naveen N. Rao @ 2019-03-13 13:51 UTC (permalink / raw)
  To: Daniel Borkmann, Michael Ellerman, Sandipan Das, Yauheni Kaliuta
  Cc: bpf, Jiri Olsa, linuxppc-dev, netdev

Hi,

Yauheni Kaliuta wrote:
> Hi!
> 
> I found a failure:
> 
> ```
> # ./test_verifier 722
> #722/u PTR_TO_STACK store/load FAIL retval -1 != -87117812 
> 0: (bf) r1 = r10
> 1: (07) r1 += -10
> 2: (7a) *(u64 *)(r1 +2) = -87117812
> 3: (79) r0 = *(u64 *)(r1 +2)
> 4: (95) exit
> processed 5 insns (limit 131072), stack depth 8
> #722/p PTR_TO_STACK store/load FAIL retval -1 != -87117812 
> 0: (bf) r1 = r10
> 1: (07) r1 += -10
> 2: (7a) *(u64 *)(r1 +2) = -87117812
> 3: (79) r0 = *(u64 *)(r1 +2)
> 4: (95) exit
> processed 5 insns (limit 131072), stack depth 8
> Summary: 0 PASSED, 0 SKIPPED, 2 FAILED
> ```
> 
> The reason is in the JIT. The code is jitted into:
> 
> [...]
> d00000000580e7f8:       f9 23 00 00     std     r9,0(r3)
> d00000000580e7fc:       e9 03 00 02     lwa     r8,0(r3)
> [...]
> 
> so, it stores DW to the location r3, but loads W, i.e. in BE it is:
> 
> saves
> r3: FF FF FF FF FA CE B0 0C
> loads
> r3: FF FF FF FF
> 
> (in LE it works semicorretly, saves 0C B0 CE FA FF FF FF FF, loads 0C B0 CE FA)
> 
> This is because of the handling of the +2 offset. For stores it is:
> 
> 
> #define PPC_STD(r, base, i)	EMIT(PPC_INST_STD | ___PPC_RS(r) |	      
> \
> 				     ___PPC_RA(base) | ((i) & 0xfffc))
> 
> and for loads
> #define PPC_LD(r, base, i)	EMIT(PPC_INST_LD | ___PPC_RT(r) |	      \
> 				     ___PPC_RA(base) | IMM_L(i))
> #define IMM_L(i)		((uintptr_t)(i) & 0xffff)
> 
> So, in the load case the offset +2 (immediate value) is not
> masked and turns the instruction to lwa instead of ld.

Indeed -- good catch and analysis!

> 
> 
> Would it be correct to & 0xfffc the immediate value as well?

Yes, I think that would be the right fix.

> 
> BTW, the full run on big endian:
> 
> Summary: 1190 PASSED, 125 SKIPPED, 4 FAILED

Thanks for pointing that out, I'll look into these failures.


- Naveen



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bpf jit PPC64 (BE) test_verifier PTR_TO_STACK store/load failure
  2019-03-13 10:54 bpf jit PPC64 (BE) test_verifier PTR_TO_STACK store/load failure Yauheni Kaliuta
  2019-03-13 13:51 ` Naveen N. Rao
@ 2019-03-13 22:14 ` Segher Boessenkool
  2019-03-15 13:16   ` Naveen N. Rao
  1 sibling, 1 reply; 4+ messages in thread
From: Segher Boessenkool @ 2019-03-13 22:14 UTC (permalink / raw)
  To: Yauheni Kaliuta
  Cc: Naveen N. Rao, Sandipan Das, Daniel Borkmann, Michael Ellerman,
	netdev, bpf, linuxppc-dev, Jiri Olsa

Hi!

On Wed, Mar 13, 2019 at 12:54:16PM +0200, Yauheni Kaliuta wrote:
> This is because of the handling of the +2 offset.

The low two bits of instructions with primary opcodes 58 and 62 are part
of the opcode, not the offset.  These instructions can not have offsets
with the low two bits non-zero.

> For stores it is:
> #define PPC_STD(r, base, i)	EMIT(PPC_INST_STD | ___PPC_RS(r) |	      \
> 				     ___PPC_RA(base) | ((i) & 0xfffc))
> 
> and for loads
> #define PPC_LD(r, base, i)	EMIT(PPC_INST_LD | ___PPC_RT(r) |	      \
> 				     ___PPC_RA(base) | IMM_L(i))
> #define IMM_L(i)		((uintptr_t)(i) & 0xffff)
> 
> So, in the load case the offset +2 (immediate value) is not
> masked and turns the instruction to lwa instead of ld.
> 
> Would it be correct to & 0xfffc the immediate value as well?

That is only part of it.  The other thing is you have to make sure those
low bits are zero *already* (and then you do not need the mask anymore).
For example, if the low two bits are not zero load the offset into a
register instead (and then do ldx or lwax).


Segher

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bpf jit PPC64 (BE) test_verifier PTR_TO_STACK store/load failure
  2019-03-13 22:14 ` Segher Boessenkool
@ 2019-03-15 13:16   ` Naveen N. Rao
  0 siblings, 0 replies; 4+ messages in thread
From: Naveen N. Rao @ 2019-03-15 13:16 UTC (permalink / raw)
  To: Segher Boessenkool, Yauheni Kaliuta
  Cc: bpf, Daniel Borkmann, Jiri Olsa, linuxppc-dev, Michael Ellerman,
	netdev, Sandipan Das

Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Mar 13, 2019 at 12:54:16PM +0200, Yauheni Kaliuta wrote:
>> This is because of the handling of the +2 offset.
> 
> The low two bits of instructions with primary opcodes 58 and 62 are part
> of the opcode, not the offset.  These instructions can not have offsets
> with the low two bits non-zero.
> 
>> For stores it is:
>> #define PPC_STD(r, base, i)	EMIT(PPC_INST_STD | ___PPC_RS(r) |	      \
>> 				     ___PPC_RA(base) | ((i) & 0xfffc))
>> 
>> and for loads
>> #define PPC_LD(r, base, i)	EMIT(PPC_INST_LD | ___PPC_RT(r) |	      \
>> 				     ___PPC_RA(base) | IMM_L(i))
>> #define IMM_L(i)		((uintptr_t)(i) & 0xffff)
>> 
>> So, in the load case the offset +2 (immediate value) is not
>> masked and turns the instruction to lwa instead of ld.
>> 
>> Would it be correct to & 0xfffc the immediate value as well?
> 
> That is only part of it.  The other thing is you have to make sure those
> low bits are zero *already* (and then you do not need the mask anymore).
> For example, if the low two bits are not zero load the offset into a
> register instead (and then do ldx or lwax).

Thanks for pointing that out, Segher. That is a detail that is easily 
missed.

- Naveen



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-03-15 13:16 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-13 10:54 bpf jit PPC64 (BE) test_verifier PTR_TO_STACK store/load failure Yauheni Kaliuta
2019-03-13 13:51 ` Naveen N. Rao
2019-03-13 22:14 ` Segher Boessenkool
2019-03-15 13:16   ` Naveen N. Rao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).