All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
@ 2017-09-18 20:47 Jiong Wang
  2017-09-18 20:47 ` [PATCH RFC 1/4] Improve instruction encoding descriptions Jiong Wang
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Jiong Wang @ 2017-09-18 20:47 UTC (permalink / raw)
  To: xdp-newbies; +Cc: llvm-dev, iovisor-dev, oss-drivers, Jiong Wang

Hi,

  Currently, LLVM eBPF backend always generate code in 64-bit mode, this may
cause troubles when JITing to 32-bit targets.

  For example, it is quite common for XDP eBPF program to access some packet
fields through base + offset that the default eBPF will generate BPF_ALU64 for
the address formation, later when JITing to 32-bit hardware, BPF_ALU64 needs
to be expanded into 32 bit ALU sequences even though the address space is
32-bit that the high bits is not significant.

  While a complete 32-bit mode implemention may need an new ABI (something like
-target-abi=ilp32), this patch set first add some initial code so we could
construct 32-bit eBPF tests through hand-written assembly.

  A new 32-bit register set is introduced, its name is with "w" prefix and LLVM
assembler will encode statements like "w1 += w2" into the following 8-bit code
field:

    BPF_ADD | BPF_X | BPF_ALU

BPF_ALU will be used instead of BPF_ALU64.

  NOTE, currently you can only use "w" register with ALU statements, not with
others like branches etc as they don't have different encoding for 32-bit
target.

  Comments?

*** BLURB HERE ***

Jiong Wang (4):
  Improve instruction encoding descriptions
  Improve class inheritance in instruction patterns
  New 32-bit register set
  Initial 32-bit ALU (BPF_ALU) encoding support in assembler

 lib/Target/BPF/BPFInstrFormats.td               |  84 +++-
 lib/Target/BPF/BPFInstrInfo.td                  | 506 +++++++++++-------------
 lib/Target/BPF/BPFRegisterInfo.td               |  74 +++-
 lib/Target/BPF/Disassembler/BPFDisassembler.cpp |  15 +
 test/MC/BPF/insn-unit-32.s                      |  53 +++
 5 files changed, 427 insertions(+), 305 deletions(-)
 create mode 100644 test/MC/BPF/insn-unit-32.s

-- 
2.7.4

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH RFC 1/4] Improve instruction encoding descriptions
  2017-09-18 20:47 [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Jiong Wang
@ 2017-09-18 20:47 ` Jiong Wang
  2017-09-18 20:47 ` [PATCH RFC 2/4] Improve class inheritance in instruction patterns Jiong Wang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Jiong Wang @ 2017-09-18 20:47 UTC (permalink / raw)
  To: xdp-newbies; +Cc: llvm-dev, iovisor-dev, oss-drivers, Jiong Wang

Currently, eBPF backend is using some constant directly in instruction patterns,
This patch replace them with mnemonics and removed some unnecessary temparary
variables.

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 lib/Target/BPF/BPFInstrFormats.td |  84 ++++++++++-
 lib/Target/BPF/BPFInstrInfo.td    | 290 ++++++++++++++------------------------
 2 files changed, 189 insertions(+), 185 deletions(-)

diff --git a/lib/Target/BPF/BPFInstrFormats.td b/lib/Target/BPF/BPFInstrFormats.td
index 53f3ad6..1e3bc3b 100644
--- a/lib/Target/BPF/BPFInstrFormats.td
+++ b/lib/Target/BPF/BPFInstrFormats.td
@@ -7,6 +7,86 @@
 //
 //===----------------------------------------------------------------------===//
 
+class BPFOpClass<bits<3> val> {
+  bits<3> Value = val;
+}
+
+def BPF_LD    : BPFOpClass<0x0>;
+def BPF_LDX   : BPFOpClass<0x1>;
+def BPF_ST    : BPFOpClass<0x2>;
+def BPF_STX   : BPFOpClass<0x3>;
+def BPF_ALU   : BPFOpClass<0x4>;
+def BPF_JMP   : BPFOpClass<0x5>;
+def BPF_ALU64 : BPFOpClass<0x7>;
+
+class BPFSrcType<bits<1> val> {
+  bits<1> Value = val;
+}
+
+def BPF_K : BPFSrcType<0x0>;
+def BPF_X : BPFSrcType<0x1>;
+
+class BPFArithOp<bits<4> val> {
+  bits<4> Value = val;
+}
+
+def BPF_ADD  : BPFArithOp<0x0>;
+def BPF_SUB  : BPFArithOp<0x1>;
+def BPF_MUL  : BPFArithOp<0x2>;
+def BPF_DIV  : BPFArithOp<0x3>;
+def BPF_OR   : BPFArithOp<0x4>;
+def BPF_AND  : BPFArithOp<0x5>;
+def BPF_LSH  : BPFArithOp<0x6>;
+def BPF_RSH  : BPFArithOp<0x7>;
+def BPF_XOR  : BPFArithOp<0xa>;
+def BPF_MOV  : BPFArithOp<0xb>;
+def BPF_ARSH : BPFArithOp<0xc>;
+def BPF_END  : BPFArithOp<0xd>;
+
+class BPFEndDir<bits<1> val> {
+  bits<1> Value = val;
+}
+
+def BPF_TO_LE : BPFSrcType<0x0>;
+def BPF_TO_BE : BPFSrcType<0x1>;
+
+class BPFJumpOp<bits<4> val> {
+  bits<4> Value = val;
+}
+
+def BPF_JA   : BPFJumpOp<0x0>;
+def BPF_JEQ  : BPFJumpOp<0x1>;
+def BPF_JGT  : BPFJumpOp<0x2>;
+def BPF_JGE  : BPFJumpOp<0x3>;
+def BPF_JNE  : BPFJumpOp<0x5>;
+def BPF_JSGT : BPFJumpOp<0x6>;
+def BPF_JSGE : BPFJumpOp<0x7>;
+def BPF_CALL : BPFJumpOp<0x8>;
+def BPF_EXIT : BPFJumpOp<0x9>;
+def BPF_JLT  : BPFJumpOp<0xa>;
+def BPF_JLE  : BPFJumpOp<0xb>;
+def BPF_JSLT : BPFJumpOp<0xc>;
+def BPF_JSLE : BPFJumpOp<0xd>;
+
+class BPFWidthModifer<bits<2> val> {
+  bits<2> Value = val;
+}
+
+def BPF_W  : BPFWidthModifer<0x0>;
+def BPF_H  : BPFWidthModifer<0x1>;
+def BPF_B  : BPFWidthModifer<0x2>;
+def BPF_DW : BPFWidthModifer<0x3>;
+
+class BPFModeModifer<bits<3> val> {
+  bits<3> Value = val;
+}
+
+def BPF_IMM  : BPFModeModifer<0x0>;
+def BPF_ABS  : BPFModeModifer<0x1>;
+def BPF_IND  : BPFModeModifer<0x2>;
+def BPF_MEM  : BPFModeModifer<0x3>;
+def BPF_XADD : BPFModeModifer<0x6>;
+
 class InstBPF<dag outs, dag ins, string asmstr, list<dag> pattern>
   : Instruction {
   field bits<64> Inst;
@@ -16,8 +96,8 @@ class InstBPF<dag outs, dag ins, string asmstr, list<dag> pattern>
   let Namespace = "BPF";
   let DecoderNamespace = "BPF";
 
-  bits<3> BPFClass;
-  let Inst{58-56} = BPFClass;
+  BPFOpClass BPFClass;
+  let Inst{58-56} = BPFClass.Value;
 
   dag OutOperandList = outs;
   dag InOperandList = ins;
diff --git a/lib/Target/BPF/BPFInstrInfo.td b/lib/Target/BPF/BPFInstrInfo.td
index bef6ce0..5d05105 100644
--- a/lib/Target/BPF/BPFInstrInfo.td
+++ b/lib/Target/BPF/BPFInstrInfo.td
@@ -89,162 +89,138 @@ def BPF_CC_LEU : PatLeaf<(i64 imm),
                          [{return (N->getZExtValue() == ISD::SETULE);}]>;
 
 // jump instructions
-class JMP_RR<bits<4> Opc, string OpcodeStr, PatLeaf Cond>
+class JMP_RR<BPFJumpOp Opc, string OpcodeStr, PatLeaf Cond>
     : InstBPF<(outs), (ins GPR:$dst, GPR:$src, brtarget:$BrDst),
               "if $dst "#OpcodeStr#" $src goto $BrDst",
               [(BPFbrcc i64:$dst, i64:$src, Cond, bb:$BrDst)]> {
-  bits<4> op;
-  bits<1> BPFSrc;
   bits<4> dst;
   bits<4> src;
   bits<16> BrDst;
 
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
+  let Inst{63-60} = Opc.Value;
+  let Inst{59} = BPF_X.Value;
   let Inst{55-52} = src;
   let Inst{51-48} = dst;
   let Inst{47-32} = BrDst;
 
-  let op = Opc;
-  let BPFSrc = 1;
-  let BPFClass = 5; // BPF_JMP
+  let BPFClass = BPF_JMP;
 }
 
-class JMP_RI<bits<4> Opc, string OpcodeStr, PatLeaf Cond>
+class JMP_RI<BPFJumpOp Opc, string OpcodeStr, PatLeaf Cond>
     : InstBPF<(outs), (ins GPR:$dst, i64imm:$imm, brtarget:$BrDst),
               "if $dst "#OpcodeStr#" $imm goto $BrDst",
               [(BPFbrcc i64:$dst, i64immSExt32:$imm, Cond, bb:$BrDst)]> {
-  bits<4> op;
-  bits<1> BPFSrc;
   bits<4> dst;
   bits<16> BrDst;
   bits<32> imm;
 
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
+  let Inst{63-60} = Opc.Value;
+  let Inst{59} = BPF_K.Value;
   let Inst{51-48} = dst;
   let Inst{47-32} = BrDst;
   let Inst{31-0} = imm;
 
-  let op = Opc;
-  let BPFSrc = 0;
-  let BPFClass = 5; // BPF_JMP
+  let BPFClass = BPF_JMP;
 }
 
-multiclass J<bits<4> Opc, string OpcodeStr, PatLeaf Cond> {
+multiclass J<BPFJumpOp Opc, string OpcodeStr, PatLeaf Cond> {
   def _rr : JMP_RR<Opc, OpcodeStr, Cond>;
   def _ri : JMP_RI<Opc, OpcodeStr, Cond>;
 }
 
 let isBranch = 1, isTerminator = 1, hasDelaySlot=0 in {
 // cmp+goto instructions
-defm JEQ  : J<0x1, "==",  BPF_CC_EQ>;
-defm JUGT : J<0x2, ">", BPF_CC_GTU>;
-defm JUGE : J<0x3, ">=", BPF_CC_GEU>;
-defm JNE  : J<0x5, "!=",  BPF_CC_NE>;
-defm JSGT : J<0x6, "s>", BPF_CC_GT>;
-defm JSGE : J<0x7, "s>=", BPF_CC_GE>;
-defm JULT : J<0xa, "<", BPF_CC_LTU>;
-defm JULE : J<0xb, "<=", BPF_CC_LEU>;
-defm JSLT : J<0xc, "s<", BPF_CC_LT>;
-defm JSLE : J<0xd, "s<=", BPF_CC_LE>;
+defm JEQ  : J<BPF_JEQ, "==",  BPF_CC_EQ>;
+defm JUGT : J<BPF_JGT, ">", BPF_CC_GTU>;
+defm JUGE : J<BPF_JGE, ">=", BPF_CC_GEU>;
+defm JNE  : J<BPF_JNE, "!=",  BPF_CC_NE>;
+defm JSGT : J<BPF_JSGT, "s>", BPF_CC_GT>;
+defm JSGE : J<BPF_JSGE, "s>=", BPF_CC_GE>;
+defm JULT : J<BPF_JLT, "<", BPF_CC_LTU>;
+defm JULE : J<BPF_JLE, "<=", BPF_CC_LEU>;
+defm JSLT : J<BPF_JSLT, "s<", BPF_CC_LT>;
+defm JSLE : J<BPF_JSLE, "s<=", BPF_CC_LE>;
 }
 
 // ALU instructions
-class ALU_RI<bits<4> Opc, string OpcodeStr, SDNode OpNode>
+class ALU_RI<BPFArithOp Opc, string OpcodeStr, SDNode OpNode>
     : InstBPF<(outs GPR:$dst), (ins GPR:$src2, i64imm:$imm),
               "$dst "#OpcodeStr#" $imm",
               [(set GPR:$dst, (OpNode GPR:$src2, i64immSExt32:$imm))]> {
-  bits<4> op;
-  bits<1> BPFSrc;
   bits<4> dst;
   bits<32> imm;
 
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
+  let Inst{63-60} = Opc.Value;
+  let Inst{59} = BPF_K.Value;
   let Inst{51-48} = dst;
   let Inst{31-0} = imm;
 
-  let op = Opc;
-  let BPFSrc = 0;
-  let BPFClass = 7; // BPF_ALU64
+  let BPFClass = BPF_ALU64;
 }
 
-class ALU_RR<bits<4> Opc, string OpcodeStr, SDNode OpNode>
+class ALU_RR<BPFArithOp Opc, string OpcodeStr, SDNode OpNode>
     : InstBPF<(outs GPR:$dst), (ins GPR:$src2, GPR:$src),
               "$dst "#OpcodeStr#" $src",
               [(set GPR:$dst, (OpNode i64:$src2, i64:$src))]> {
-  bits<4> op;
-  bits<1> BPFSrc;
   bits<4> dst;
   bits<4> src;
 
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
+  let Inst{63-60} = Opc.Value;
+  let Inst{59} = BPF_X.Value;
   let Inst{55-52} = src;
   let Inst{51-48} = dst;
 
-  let op = Opc;
-  let BPFSrc = 1;
-  let BPFClass = 7; // BPF_ALU64
+  let BPFClass = BPF_ALU64;
 }
 
-multiclass ALU<bits<4> Opc, string OpcodeStr, SDNode OpNode> {
+multiclass ALU<BPFArithOp Opc, string OpcodeStr, SDNode OpNode> {
   def _rr : ALU_RR<Opc, OpcodeStr, OpNode>;
   def _ri : ALU_RI<Opc, OpcodeStr, OpNode>;
 }
 
 let Constraints = "$dst = $src2" in {
 let isAsCheapAsAMove = 1 in {
-  defm ADD : ALU<0x0, "+=", add>;
-  defm SUB : ALU<0x1, "-=", sub>;
-  defm OR  : ALU<0x4, "|=", or>;
-  defm AND : ALU<0x5, "&=", and>;
-  defm SLL : ALU<0x6, "<<=", shl>;
-  defm SRL : ALU<0x7, ">>=", srl>;
-  defm XOR : ALU<0xa, "^=", xor>;
-  defm SRA : ALU<0xc, "s>>=", sra>;
+  defm ADD : ALU<BPF_ADD, "+=", add>;
+  defm SUB : ALU<BPF_SUB, "-=", sub>;
+  defm OR  : ALU<BPF_OR, "|=", or>;
+  defm AND : ALU<BPF_AND, "&=", and>;
+  defm SLL : ALU<BPF_LSH, "<<=", shl>;
+  defm SRL : ALU<BPF_RSH, ">>=", srl>;
+  defm XOR : ALU<BPF_XOR, "^=", xor>;
+  defm SRA : ALU<BPF_ARSH, "s>>=", sra>;
 }
-  defm MUL : ALU<0x2, "*=", mul>;
-  defm DIV : ALU<0x3, "/=", udiv>;
+  defm MUL : ALU<BPF_MUL, "*=", mul>;
+  defm DIV : ALU<BPF_DIV, "/=", udiv>;
 }
 
 class MOV_RR<string OpcodeStr>
     : InstBPF<(outs GPR:$dst), (ins GPR:$src),
               "$dst "#OpcodeStr#" $src",
               []> {
-  bits<4> op;
-  bits<1> BPFSrc;
   bits<4> dst;
   bits<4> src;
 
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
+  let Inst{63-60} = BPF_MOV.Value;
+  let Inst{59} = BPF_X.Value;
   let Inst{55-52} = src;
   let Inst{51-48} = dst;
 
-  let op = 0xb;     // BPF_MOV
-  let BPFSrc = 1;   // BPF_X
-  let BPFClass = 7; // BPF_ALU64
+  let BPFClass = BPF_ALU64;
 }
 
 class MOV_RI<string OpcodeStr>
     : InstBPF<(outs GPR:$dst), (ins i64imm:$imm),
               "$dst "#OpcodeStr#" $imm",
               [(set GPR:$dst, (i64 i64immSExt32:$imm))]> {
-  bits<4> op;
-  bits<1> BPFSrc;
   bits<4> dst;
   bits<32> imm;
 
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
+  let Inst{63-60} = BPF_MOV.Value;
+  let Inst{59} = BPF_K.Value;
   let Inst{51-48} = dst;
   let Inst{31-0} = imm;
 
-  let op = 0xb;     // BPF_MOV
-  let BPFSrc = 0;   // BPF_K
-  let BPFClass = 7; // BPF_ALU64
+  let BPFClass = BPF_ALU64;
 }
 
 class LD_IMM64<bits<4> Pseudo, string OpcodeStr>
@@ -252,21 +228,17 @@ class LD_IMM64<bits<4> Pseudo, string OpcodeStr>
               "$dst "#OpcodeStr#" ${imm} ll",
               [(set GPR:$dst, (i64 imm:$imm))]> {
 
-  bits<3> mode;
-  bits<2> size;
   bits<4> dst;
   bits<64> imm;
 
-  let Inst{63-61} = mode;
-  let Inst{60-59} = size;
+  let Inst{63-61} = BPF_IMM.Value;
+  let Inst{60-59} = BPF_DW.Value;
   let Inst{51-48} = dst;
   let Inst{55-52} = Pseudo;
   let Inst{47-32} = 0;
   let Inst{31-0} = imm{31-0};
 
-  let mode = 0;     // BPF_IMM
-  let size = 3;     // BPF_DW
-  let BPFClass = 0; // BPF_LD
+  let BPFClass = BPF_LD;
 }
 
 let isReMaterializable = 1, isAsCheapAsAMove = 1 in {
@@ -287,7 +259,7 @@ def FI_ri
   let Inst{55-52} = 2;
   let Inst{47-32} = 0;
   let Inst{31-0} = 0;
-  let BPFClass = 0;
+  let BPFClass = BPF_LD;
 }
 
 
@@ -296,115 +268,95 @@ def LD_pseudo
               "ld_pseudo\t$dst, $pseudo, $imm",
               [(set GPR:$dst, (int_bpf_pseudo imm:$pseudo, imm:$imm))]> {
 
-  bits<3> mode;
-  bits<2> size;
   bits<4> dst;
   bits<64> imm;
   bits<4> pseudo;
 
-  let Inst{63-61} = mode;
-  let Inst{60-59} = size;
+  let Inst{63-61} = BPF_IMM.Value;
+  let Inst{60-59} = BPF_DW.Value;
   let Inst{51-48} = dst;
   let Inst{55-52} = pseudo;
   let Inst{47-32} = 0;
   let Inst{31-0} = imm{31-0};
 
-  let mode = 0;     // BPF_IMM
-  let size = 3;     // BPF_DW
-  let BPFClass = 0; // BPF_LD
+  let BPFClass = BPF_LD;
 }
 
 // STORE instructions
-class STORE<bits<2> SizeOp, string OpcodeStr, list<dag> Pattern>
+class STORE<BPFWidthModifer SizeOp, string OpcodeStr, list<dag> Pattern>
     : InstBPF<(outs), (ins GPR:$src, MEMri:$addr),
               "*("#OpcodeStr#" *)($addr) = $src", Pattern> {
-  bits<3> mode;
-  bits<2> size;
   bits<4> src;
   bits<20> addr;
 
-  let Inst{63-61} = mode;
-  let Inst{60-59} = size;
+  let Inst{63-61} = BPF_MEM.Value;
+  let Inst{60-59} = SizeOp.Value;
   let Inst{51-48} = addr{19-16}; // base reg
   let Inst{55-52} = src;
   let Inst{47-32} = addr{15-0}; // offset
 
-  let mode = 3;     // BPF_MEM
-  let size = SizeOp;
-  let BPFClass = 3; // BPF_STX
+  let BPFClass = BPF_STX;
 }
 
-class STOREi64<bits<2> Opc, string OpcodeStr, PatFrag OpNode>
+class STOREi64<BPFWidthModifer Opc, string OpcodeStr, PatFrag OpNode>
     : STORE<Opc, OpcodeStr, [(OpNode i64:$src, ADDRri:$addr)]>;
 
-def STW : STOREi64<0x0, "u32", truncstorei32>;
-def STH : STOREi64<0x1, "u16", truncstorei16>;
-def STB : STOREi64<0x2, "u8", truncstorei8>;
-def STD : STOREi64<0x3, "u64", store>;
+def STW : STOREi64<BPF_W, "u32", truncstorei32>;
+def STH : STOREi64<BPF_H, "u16", truncstorei16>;
+def STB : STOREi64<BPF_B, "u8", truncstorei8>;
+def STD : STOREi64<BPF_DW, "u64", store>;
 
 // LOAD instructions
-class LOAD<bits<2> SizeOp, string OpcodeStr, list<dag> Pattern>
+class LOAD<BPFWidthModifer SizeOp, string OpcodeStr, list<dag> Pattern>
     : InstBPF<(outs GPR:$dst), (ins MEMri:$addr),
               "$dst = *("#OpcodeStr#" *)($addr)", Pattern> {
-  bits<3> mode;
-  bits<2> size;
   bits<4> dst;
   bits<20> addr;
 
-  let Inst{63-61} = mode;
-  let Inst{60-59} = size;
+  let Inst{63-61} = BPF_MEM.Value;
+  let Inst{60-59} = SizeOp.Value;
   let Inst{51-48} = dst;
   let Inst{55-52} = addr{19-16};
   let Inst{47-32} = addr{15-0};
 
-  let mode = 3;     // BPF_MEM
-  let size = SizeOp;
-  let BPFClass = 1; // BPF_LDX
+  let BPFClass = BPF_LDX;
 }
 
-class LOADi64<bits<2> SizeOp, string OpcodeStr, PatFrag OpNode>
+class LOADi64<BPFWidthModifer SizeOp, string OpcodeStr, PatFrag OpNode>
     : LOAD<SizeOp, OpcodeStr, [(set i64:$dst, (OpNode ADDRri:$addr))]>;
 
-def LDW : LOADi64<0x0, "u32", zextloadi32>;
-def LDH : LOADi64<0x1, "u16", zextloadi16>;
-def LDB : LOADi64<0x2, "u8", zextloadi8>;
-def LDD : LOADi64<0x3, "u64", load>;
+def LDW : LOADi64<BPF_W, "u32", zextloadi32>;
+def LDH : LOADi64<BPF_H, "u16", zextloadi16>;
+def LDB : LOADi64<BPF_B, "u8", zextloadi8>;
+def LDD : LOADi64<BPF_DW, "u64", load>;
 
-class BRANCH<bits<4> Opc, string OpcodeStr, list<dag> Pattern>
+class BRANCH<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
     : InstBPF<(outs), (ins brtarget:$BrDst),
               !strconcat(OpcodeStr, " $BrDst"), Pattern> {
-  bits<4> op;
   bits<16> BrDst;
-  bits<1> BPFSrc;
 
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
+  let Inst{63-60} = Opc.Value;
+  let Inst{59} = BPF_K.Value;
   let Inst{47-32} = BrDst;
 
-  let op = Opc;
-  let BPFSrc = 0;
-  let BPFClass = 5; // BPF_JMP
+  let BPFClass = BPF_JMP;
 }
 
 class CALL<string OpcodeStr>
     : InstBPF<(outs), (ins calltarget:$BrDst),
               !strconcat(OpcodeStr, " $BrDst"), []> {
-  bits<4> op;
   bits<32> BrDst;
-  bits<1> BPFSrc;
 
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
+  let Inst{63-60} = BPF_CALL.Value;
+  let Inst{59} = BPF_K.Value;
   let Inst{31-0} = BrDst;
 
-  let op = 8;       // BPF_CALL
-  let BPFSrc = 0;
-  let BPFClass = 5; // BPF_JMP
+  let BPFClass = BPF_JMP;
 }
 
 // Jump always
 let isBranch = 1, isTerminator = 1, hasDelaySlot=0, isBarrier = 1 in {
-  def JMP : BRANCH<0x0, "goto", [(br bb:$BrDst)]>;
+  def JMP : BRANCH<BPF_JA, "goto", [(br bb:$BrDst)]>;
 }
 
 // Jump and link
@@ -418,21 +370,12 @@ class NOP_I<string OpcodeStr>
     : InstBPF<(outs), (ins i32imm:$imm),
               !strconcat(OpcodeStr, "\t$imm"), []> {
   // mov r0, r0 == nop
-  bits<4> op;
-  bits<1> BPFSrc;
-  bits<4> dst;
-  bits<4> src;
-
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
-  let Inst{55-52} = src;
-  let Inst{51-48} = dst;
+  let Inst{63-60} = BPF_MOV.Value;
+  let Inst{59} = BPF_X.Value;
+  let Inst{55-52} = 0;
+  let Inst{51-48} = 0;
 
-  let op = 0xb;     // BPF_MOV
-  let BPFSrc = 1;   // BPF_X
-  let BPFClass = 7; // BPF_ALU64
-  let src = 0;      // R0
-  let dst = 0;      // R0
+  let BPFClass = BPF_ALU64;
 }
 
 let hasSideEffects = 0 in
@@ -441,14 +384,11 @@ let hasSideEffects = 0 in
 class RET<string OpcodeStr>
     : InstBPF<(outs), (ins),
               !strconcat(OpcodeStr, ""), [(BPFretflag)]> {
-  bits<4> op;
-
-  let Inst{63-60} = op;
+  let Inst{63-60} = BPF_EXIT.Value;
   let Inst{59} = 0;
   let Inst{31-0} = 0;
 
-  let op = 9;       // BPF_EXIT
-  let BPFClass = 5; // BPF_JMP
+  let BPFClass = BPF_JMP;
 }
 
 let isReturn = 1, isTerminator = 1, hasDelaySlot=0, isBarrier = 1,
@@ -497,29 +437,25 @@ def : Pat<(extloadi16 ADDRri:$src), (i64 (LDH ADDRri:$src))>;
 def : Pat<(extloadi32 ADDRri:$src), (i64 (LDW ADDRri:$src))>;
 
 // Atomics
-class XADD<bits<2> SizeOp, string OpcodeStr, PatFrag OpNode>
+class XADD<BPFWidthModifer SizeOp, string OpcodeStr, PatFrag OpNode>
     : InstBPF<(outs GPR:$dst), (ins MEMri:$addr, GPR:$val),
               "lock *("#OpcodeStr#" *)($addr) += $val",
               [(set GPR:$dst, (OpNode ADDRri:$addr, GPR:$val))]> {
-  bits<3> mode;
-  bits<2> size;
   bits<4> dst;
   bits<20> addr;
 
-  let Inst{63-61} = mode;
-  let Inst{60-59} = size;
+  let Inst{63-61} = BPF_XADD.Value;
+  let Inst{60-59} = SizeOp.Value;
   let Inst{51-48} = addr{19-16}; // base reg
   let Inst{55-52} = dst;
   let Inst{47-32} = addr{15-0}; // offset
 
-  let mode = 6;     // BPF_XADD
-  let size = SizeOp;
-  let BPFClass = 3; // BPF_STX
+  let BPFClass = BPF_STX;
 }
 
 let Constraints = "$dst = $val" in {
-def XADD32 : XADD<0, "u32", atomic_load_add_32>;
-def XADD64 : XADD<3, "u64", atomic_load_add_64>;
+def XADD32 : XADD<BPF_W, "u32", atomic_load_add_32>;
+def XADD64 : XADD<BPF_DW, "u64", atomic_load_add_64>;
 // undefined def XADD16 : XADD<1, "xadd16", atomic_load_add_16>;
 // undefined def XADD8  : XADD<2, "xadd8", atomic_load_add_8>;
 }
@@ -529,19 +465,15 @@ class BSWAP<bits<32> SizeOp, string OpcodeStr, list<dag> Pattern>
     : InstBPF<(outs GPR:$dst), (ins GPR:$src),
               !strconcat(OpcodeStr, "\t$dst"),
               Pattern> {
-  bits<4> op;
-  bits<1> BPFSrc;
   bits<4> dst;
   bits<32> imm;
 
-  let Inst{63-60} = op;
-  let Inst{59} = BPFSrc;
+  let Inst{63-60} = BPF_END.Value;
+  let Inst{59} = BPF_TO_BE.Value; // (TODO: use BPF_TO_LE for big-endian target)
   let Inst{51-48} = dst;
   let Inst{31-0} = imm;
 
-  let op = 0xd;     // BPF_END
-  let BPFSrc = 1;   // BPF_TO_BE (TODO: use BPF_TO_LE for big-endian target)
-  let BPFClass = 4; // BPF_ALU
+  let BPFClass = BPF_ALU;
   let imm = SizeOp;
 }
 
@@ -553,45 +485,37 @@ def BSWAP64 : BSWAP<64, "bswap64", [(set GPR:$dst, (bswap GPR:$src))]>;
 
 let Defs = [R0, R1, R2, R3, R4, R5], Uses = [R6], hasSideEffects = 1,
     hasExtraDefRegAllocReq = 1, hasExtraSrcRegAllocReq = 1, mayLoad = 1 in {
-class LOAD_ABS<bits<2> SizeOp, string OpcodeStr, Intrinsic OpNode>
+class LOAD_ABS<BPFWidthModifer SizeOp, string OpcodeStr, Intrinsic OpNode>
     : InstBPF<(outs), (ins GPR:$skb, i64imm:$imm),
               "r0 = *("#OpcodeStr#" *)skb[$imm]",
               [(set R0, (OpNode GPR:$skb, i64immSExt32:$imm))]> {
-  bits<3> mode;
-  bits<2> size;
   bits<32> imm;
 
-  let Inst{63-61} = mode;
-  let Inst{60-59} = size;
+  let Inst{63-61} = BPF_ABS.Value;
+  let Inst{60-59} = SizeOp.Value;
   let Inst{31-0} = imm;
 
-  let mode = 1;     // BPF_ABS
-  let size = SizeOp;
-  let BPFClass = 0; // BPF_LD
+  let BPFClass = BPF_LD;
 }
 
-class LOAD_IND<bits<2> SizeOp, string OpcodeStr, Intrinsic OpNode>
+class LOAD_IND<BPFWidthModifer SizeOp, string OpcodeStr, Intrinsic OpNode>
     : InstBPF<(outs), (ins GPR:$skb, GPR:$val),
               "r0 = *("#OpcodeStr#" *)skb[$val]",
               [(set R0, (OpNode GPR:$skb, GPR:$val))]> {
-  bits<3> mode;
-  bits<2> size;
   bits<4> val;
 
-  let Inst{63-61} = mode;
-  let Inst{60-59} = size;
+  let Inst{63-61} = BPF_IND.Value;
+  let Inst{60-59} = SizeOp.Value;
   let Inst{55-52} = val;
 
-  let mode = 2;     // BPF_IND
-  let size = SizeOp;
-  let BPFClass = 0; // BPF_LD
+  let BPFClass = BPF_LD;
 }
 }
 
-def LD_ABS_B : LOAD_ABS<2, "u8", int_bpf_load_byte>;
-def LD_ABS_H : LOAD_ABS<1, "u16", int_bpf_load_half>;
-def LD_ABS_W : LOAD_ABS<0, "u32", int_bpf_load_word>;
+def LD_ABS_B : LOAD_ABS<BPF_B, "u8", int_bpf_load_byte>;
+def LD_ABS_H : LOAD_ABS<BPF_H, "u16", int_bpf_load_half>;
+def LD_ABS_W : LOAD_ABS<BPF_W, "u32", int_bpf_load_word>;
 
-def LD_IND_B : LOAD_IND<2, "u8", int_bpf_load_byte>;
-def LD_IND_H : LOAD_IND<1, "u16", int_bpf_load_half>;
-def LD_IND_W : LOAD_IND<0, "u32", int_bpf_load_word>;
+def LD_IND_B : LOAD_IND<BPF_B, "u8", int_bpf_load_byte>;
+def LD_IND_H : LOAD_IND<BPF_H, "u16", int_bpf_load_half>;
+def LD_IND_W : LOAD_IND<BPF_W, "u32", int_bpf_load_word>;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH RFC 2/4] Improve class inheritance in instruction patterns
  2017-09-18 20:47 [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Jiong Wang
  2017-09-18 20:47 ` [PATCH RFC 1/4] Improve instruction encoding descriptions Jiong Wang
@ 2017-09-18 20:47 ` Jiong Wang
  2017-09-18 20:47 ` [PATCH RFC 3/4] New 32-bit register set Jiong Wang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Jiong Wang @ 2017-09-18 20:47 UTC (permalink / raw)
  To: xdp-newbies; +Cc: llvm-dev, iovisor-dev, oss-drivers, Jiong Wang

Arithmetic and jump instructions, load and store instructions are sharing
the same 8-bit code field encoding,

A better instruction pattern implemention could be the following inheritance
relationships, and each layer only encoding those fields which start to
diverse from that layer. This avoids some redundant code.

  InstBPF -> TYPE_ALU_JMP -> ALU/JMP
  InstBPF -> TYPE_LD_ST -> Load/Store

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 lib/Target/BPF/BPFInstrInfo.td | 238 ++++++++++++++++++++++-------------------
 1 file changed, 127 insertions(+), 111 deletions(-)

diff --git a/lib/Target/BPF/BPFInstrInfo.td b/lib/Target/BPF/BPFInstrInfo.td
index 5d05105..0319cfe 100644
--- a/lib/Target/BPF/BPFInstrInfo.td
+++ b/lib/Target/BPF/BPFInstrInfo.td
@@ -88,38 +88,67 @@ def BPF_CC_LTU : PatLeaf<(i64 imm),
 def BPF_CC_LEU : PatLeaf<(i64 imm),
                          [{return (N->getZExtValue() == ISD::SETULE);}]>;
 
+// For arithmetic and jump instructions the 8-bit 'code'
+// field is divided into three parts:
+//
+//  +----------------+--------+--------------------+
+//  |   4 bits       |  1 bit |   3 bits           |
+//  | operation code | source | instruction class  |
+//  +----------------+--------+--------------------+
+//  (MSB)                                      (LSB)
+class TYPE_ALU_JMP<bits<4> op, bits<1> srctype,
+                   dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstBPF<outs, ins, asmstr, pattern> {
+
+  let Inst{63-60} = op;
+  let Inst{59} = srctype;
+}
+
+//For load and store instructions the 8-bit 'code' field is divided as:
+//
+//  +--------+--------+-------------------+
+//  | 3 bits | 2 bits |   3 bits          |
+//  |  mode  |  size  | instruction class |
+//  +--------+--------+-------------------+
+//  (MSB)                             (LSB)
+class TYPE_LD_ST<bits<3> mode, bits<2> size,
+                 dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstBPF<outs, ins, asmstr, pattern> {
+
+  let Inst{63-61} = mode;
+  let Inst{60-59} = size;
+}
+
 // jump instructions
 class JMP_RR<BPFJumpOp Opc, string OpcodeStr, PatLeaf Cond>
-    : InstBPF<(outs), (ins GPR:$dst, GPR:$src, brtarget:$BrDst),
-              "if $dst "#OpcodeStr#" $src goto $BrDst",
-              [(BPFbrcc i64:$dst, i64:$src, Cond, bb:$BrDst)]> {
+    : TYPE_ALU_JMP<Opc.Value, BPF_X.Value,
+                   (outs),
+                   (ins GPR:$dst, GPR:$src, brtarget:$BrDst),
+                   "if $dst "#OpcodeStr#" $src goto $BrDst",
+                   [(BPFbrcc i64:$dst, i64:$src, Cond, bb:$BrDst)]> {
   bits<4> dst;
   bits<4> src;
   bits<16> BrDst;
 
-  let Inst{63-60} = Opc.Value;
-  let Inst{59} = BPF_X.Value;
   let Inst{55-52} = src;
   let Inst{51-48} = dst;
   let Inst{47-32} = BrDst;
-
   let BPFClass = BPF_JMP;
 }
 
 class JMP_RI<BPFJumpOp Opc, string OpcodeStr, PatLeaf Cond>
-    : InstBPF<(outs), (ins GPR:$dst, i64imm:$imm, brtarget:$BrDst),
-              "if $dst "#OpcodeStr#" $imm goto $BrDst",
-              [(BPFbrcc i64:$dst, i64immSExt32:$imm, Cond, bb:$BrDst)]> {
+    : TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
+                   (outs),
+                   (ins GPR:$dst, i64imm:$imm, brtarget:$BrDst),
+                   "if $dst "#OpcodeStr#" $imm goto $BrDst",
+                   [(BPFbrcc i64:$dst, i64immSExt32:$imm, Cond, bb:$BrDst)]> {
   bits<4> dst;
   bits<16> BrDst;
   bits<32> imm;
 
-  let Inst{63-60} = Opc.Value;
-  let Inst{59} = BPF_K.Value;
   let Inst{51-48} = dst;
   let Inst{47-32} = BrDst;
   let Inst{31-0} = imm;
-
   let BPFClass = BPF_JMP;
 }
 
@@ -144,32 +173,30 @@ defm JSLE : J<BPF_JSLE, "s<=", BPF_CC_LE>;
 
 // ALU instructions
 class ALU_RI<BPFArithOp Opc, string OpcodeStr, SDNode OpNode>
-    : InstBPF<(outs GPR:$dst), (ins GPR:$src2, i64imm:$imm),
-              "$dst "#OpcodeStr#" $imm",
-              [(set GPR:$dst, (OpNode GPR:$src2, i64immSExt32:$imm))]> {
+    : TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
+                   (outs GPR:$dst),
+                   (ins GPR:$src2, i64imm:$imm),
+                   "$dst "#OpcodeStr#" $imm",
+                   [(set GPR:$dst, (OpNode GPR:$src2, i64immSExt32:$imm))]> {
   bits<4> dst;
   bits<32> imm;
 
-  let Inst{63-60} = Opc.Value;
-  let Inst{59} = BPF_K.Value;
   let Inst{51-48} = dst;
   let Inst{31-0} = imm;
-
   let BPFClass = BPF_ALU64;
 }
 
 class ALU_RR<BPFArithOp Opc, string OpcodeStr, SDNode OpNode>
-    : InstBPF<(outs GPR:$dst), (ins GPR:$src2, GPR:$src),
-              "$dst "#OpcodeStr#" $src",
-              [(set GPR:$dst, (OpNode i64:$src2, i64:$src))]> {
+    : TYPE_ALU_JMP<Opc.Value, BPF_X.Value,
+                   (outs GPR:$dst),
+                   (ins GPR:$src2, GPR:$src),
+                   "$dst "#OpcodeStr#" $src",
+                   [(set GPR:$dst, (OpNode i64:$src2, i64:$src))]> {
   bits<4> dst;
   bits<4> src;
 
-  let Inst{63-60} = Opc.Value;
-  let Inst{59} = BPF_X.Value;
   let Inst{55-52} = src;
   let Inst{51-48} = dst;
-
   let BPFClass = BPF_ALU64;
 }
 
@@ -194,50 +221,47 @@ let isAsCheapAsAMove = 1 in {
 }
 
 class MOV_RR<string OpcodeStr>
-    : InstBPF<(outs GPR:$dst), (ins GPR:$src),
-              "$dst "#OpcodeStr#" $src",
-              []> {
+    : TYPE_ALU_JMP<BPF_MOV.Value, BPF_X.Value,
+                   (outs GPR:$dst),
+                   (ins GPR:$src),
+                   "$dst "#OpcodeStr#" $src",
+                   []> {
   bits<4> dst;
   bits<4> src;
 
-  let Inst{63-60} = BPF_MOV.Value;
-  let Inst{59} = BPF_X.Value;
   let Inst{55-52} = src;
   let Inst{51-48} = dst;
-
   let BPFClass = BPF_ALU64;
 }
 
 class MOV_RI<string OpcodeStr>
-    : InstBPF<(outs GPR:$dst), (ins i64imm:$imm),
-              "$dst "#OpcodeStr#" $imm",
-              [(set GPR:$dst, (i64 i64immSExt32:$imm))]> {
+    : TYPE_ALU_JMP<BPF_MOV.Value, BPF_K.Value,
+                   (outs GPR:$dst),
+                   (ins i64imm:$imm),
+                   "$dst "#OpcodeStr#" $imm",
+                   [(set GPR:$dst, (i64 i64immSExt32:$imm))]> {
   bits<4> dst;
   bits<32> imm;
 
-  let Inst{63-60} = BPF_MOV.Value;
-  let Inst{59} = BPF_K.Value;
   let Inst{51-48} = dst;
   let Inst{31-0} = imm;
-
   let BPFClass = BPF_ALU64;
 }
 
 class LD_IMM64<bits<4> Pseudo, string OpcodeStr>
-    : InstBPF<(outs GPR:$dst), (ins u64imm:$imm),
-              "$dst "#OpcodeStr#" ${imm} ll",
-              [(set GPR:$dst, (i64 imm:$imm))]> {
+    : TYPE_LD_ST<BPF_IMM.Value, BPF_DW.Value,
+                 (outs GPR:$dst),
+                 (ins u64imm:$imm),
+                 "$dst "#OpcodeStr#" ${imm} ll",
+                 [(set GPR:$dst, (i64 imm:$imm))]> {
 
   bits<4> dst;
   bits<64> imm;
 
-  let Inst{63-61} = BPF_IMM.Value;
-  let Inst{60-59} = BPF_DW.Value;
   let Inst{51-48} = dst;
   let Inst{55-52} = Pseudo;
   let Inst{47-32} = 0;
   let Inst{31-0} = imm{31-0};
-
   let BPFClass = BPF_LD;
 }
 
@@ -248,13 +272,13 @@ def MOV_ri : MOV_RI<"=">;
 }
 
 def FI_ri
-    : InstBPF<(outs GPR:$dst), (ins MEMri:$addr),
-               "lea\t$dst, $addr",
-               [(set i64:$dst, FIri:$addr)]> {
+    : TYPE_LD_ST<BPF_IMM.Value, BPF_DW.Value,
+                 (outs GPR:$dst),
+                 (ins MEMri:$addr),
+                 "lea\t$dst, $addr",
+                 [(set i64:$dst, FIri:$addr)]> {
   // This is a tentative instruction, and will be replaced
   // with MOV_rr and ADD_ri in PEI phase
-  let Inst{63-61} = 0;
-  let Inst{60-59} = 3;
   let Inst{51-48} = 0;
   let Inst{55-52} = 2;
   let Inst{47-32} = 0;
@@ -262,39 +286,37 @@ def FI_ri
   let BPFClass = BPF_LD;
 }
 
-
 def LD_pseudo
-    : InstBPF<(outs GPR:$dst), (ins i64imm:$pseudo, u64imm:$imm),
-              "ld_pseudo\t$dst, $pseudo, $imm",
-              [(set GPR:$dst, (int_bpf_pseudo imm:$pseudo, imm:$imm))]> {
+    : TYPE_LD_ST<BPF_IMM.Value, BPF_DW.Value,
+                 (outs GPR:$dst),
+                 (ins i64imm:$pseudo, u64imm:$imm),
+                 "ld_pseudo\t$dst, $pseudo, $imm",
+                 [(set GPR:$dst, (int_bpf_pseudo imm:$pseudo, imm:$imm))]> {
 
   bits<4> dst;
   bits<64> imm;
   bits<4> pseudo;
 
-  let Inst{63-61} = BPF_IMM.Value;
-  let Inst{60-59} = BPF_DW.Value;
   let Inst{51-48} = dst;
   let Inst{55-52} = pseudo;
   let Inst{47-32} = 0;
   let Inst{31-0} = imm{31-0};
-
   let BPFClass = BPF_LD;
 }
 
 // STORE instructions
 class STORE<BPFWidthModifer SizeOp, string OpcodeStr, list<dag> Pattern>
-    : InstBPF<(outs), (ins GPR:$src, MEMri:$addr),
-              "*("#OpcodeStr#" *)($addr) = $src", Pattern> {
+    : TYPE_LD_ST<BPF_MEM.Value, SizeOp.Value,
+                 (outs),
+                 (ins GPR:$src, MEMri:$addr),
+                 "*("#OpcodeStr#" *)($addr) = $src",
+                 Pattern> {
   bits<4> src;
   bits<20> addr;
 
-  let Inst{63-61} = BPF_MEM.Value;
-  let Inst{60-59} = SizeOp.Value;
   let Inst{51-48} = addr{19-16}; // base reg
   let Inst{55-52} = src;
   let Inst{47-32} = addr{15-0}; // offset
-
   let BPFClass = BPF_STX;
 }
 
@@ -308,17 +330,17 @@ def STD : STOREi64<BPF_DW, "u64", store>;
 
 // LOAD instructions
 class LOAD<BPFWidthModifer SizeOp, string OpcodeStr, list<dag> Pattern>
-    : InstBPF<(outs GPR:$dst), (ins MEMri:$addr),
-              "$dst = *("#OpcodeStr#" *)($addr)", Pattern> {
+    : TYPE_LD_ST<BPF_MEM.Value, SizeOp.Value,
+                 (outs GPR:$dst),
+                 (ins MEMri:$addr),
+                 "$dst = *("#OpcodeStr#" *)($addr)",
+                 Pattern> {
   bits<4> dst;
   bits<20> addr;
 
-  let Inst{63-61} = BPF_MEM.Value;
-  let Inst{60-59} = SizeOp.Value;
   let Inst{51-48} = dst;
   let Inst{55-52} = addr{19-16};
   let Inst{47-32} = addr{15-0};
-
   let BPFClass = BPF_LDX;
 }
 
@@ -331,26 +353,26 @@ def LDB : LOADi64<BPF_B, "u8", zextloadi8>;
 def LDD : LOADi64<BPF_DW, "u64", load>;
 
 class BRANCH<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
-    : InstBPF<(outs), (ins brtarget:$BrDst),
-              !strconcat(OpcodeStr, " $BrDst"), Pattern> {
+    : TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
+                   (outs),
+                   (ins brtarget:$BrDst),
+                   !strconcat(OpcodeStr, " $BrDst"),
+                   Pattern> {
   bits<16> BrDst;
 
-  let Inst{63-60} = Opc.Value;
-  let Inst{59} = BPF_K.Value;
   let Inst{47-32} = BrDst;
-
   let BPFClass = BPF_JMP;
 }
 
 class CALL<string OpcodeStr>
-    : InstBPF<(outs), (ins calltarget:$BrDst),
-              !strconcat(OpcodeStr, " $BrDst"), []> {
+    : TYPE_ALU_JMP<BPF_CALL.Value, BPF_K.Value,
+                   (outs),
+                   (ins calltarget:$BrDst),
+                   !strconcat(OpcodeStr, " $BrDst"),
+                   []> {
   bits<32> BrDst;
 
-  let Inst{63-60} = BPF_CALL.Value;
-  let Inst{59} = BPF_K.Value;
   let Inst{31-0} = BrDst;
-
   let BPFClass = BPF_JMP;
 }
 
@@ -367,14 +389,14 @@ let isCall=1, hasDelaySlot=0, Uses = [R11],
 }
 
 class NOP_I<string OpcodeStr>
-    : InstBPF<(outs), (ins i32imm:$imm),
-              !strconcat(OpcodeStr, "\t$imm"), []> {
+    : TYPE_ALU_JMP<BPF_MOV.Value, BPF_X.Value,
+                   (outs),
+                   (ins i32imm:$imm),
+                   !strconcat(OpcodeStr, "\t$imm"),
+                   []> {
   // mov r0, r0 == nop
-  let Inst{63-60} = BPF_MOV.Value;
-  let Inst{59} = BPF_X.Value;
   let Inst{55-52} = 0;
   let Inst{51-48} = 0;
-
   let BPFClass = BPF_ALU64;
 }
 
@@ -382,12 +404,12 @@ let hasSideEffects = 0 in
   def NOP : NOP_I<"nop">;
 
 class RET<string OpcodeStr>
-    : InstBPF<(outs), (ins),
-              !strconcat(OpcodeStr, ""), [(BPFretflag)]> {
-  let Inst{63-60} = BPF_EXIT.Value;
-  let Inst{59} = 0;
+    : TYPE_ALU_JMP<BPF_EXIT.Value, BPF_K.Value,
+                   (outs),
+                   (ins),
+                   !strconcat(OpcodeStr, ""),
+                   [(BPFretflag)]> {
   let Inst{31-0} = 0;
-
   let BPFClass = BPF_JMP;
 }
 
@@ -438,18 +460,17 @@ def : Pat<(extloadi32 ADDRri:$src), (i64 (LDW ADDRri:$src))>;
 
 // Atomics
 class XADD<BPFWidthModifer SizeOp, string OpcodeStr, PatFrag OpNode>
-    : InstBPF<(outs GPR:$dst), (ins MEMri:$addr, GPR:$val),
-              "lock *("#OpcodeStr#" *)($addr) += $val",
-              [(set GPR:$dst, (OpNode ADDRri:$addr, GPR:$val))]> {
+    : TYPE_LD_ST<BPF_XADD.Value, SizeOp.Value,
+                 (outs GPR:$dst),
+                 (ins MEMri:$addr, GPR:$val),
+                 "lock *("#OpcodeStr#" *)($addr) += $val",
+                 [(set GPR:$dst, (OpNode ADDRri:$addr, GPR:$val))]> {
   bits<4> dst;
   bits<20> addr;
 
-  let Inst{63-61} = BPF_XADD.Value;
-  let Inst{60-59} = SizeOp.Value;
   let Inst{51-48} = addr{19-16}; // base reg
   let Inst{55-52} = dst;
   let Inst{47-32} = addr{15-0}; // offset
-
   let BPFClass = BPF_STX;
 }
 
@@ -462,19 +483,16 @@ def XADD64 : XADD<BPF_DW, "u64", atomic_load_add_64>;
 
 // bswap16, bswap32, bswap64
 class BSWAP<bits<32> SizeOp, string OpcodeStr, list<dag> Pattern>
-    : InstBPF<(outs GPR:$dst), (ins GPR:$src),
-              !strconcat(OpcodeStr, "\t$dst"),
-              Pattern> {
+    : TYPE_ALU_JMP<BPF_END.Value, BPF_TO_BE.Value,
+                   (outs GPR:$dst),
+                   (ins GPR:$src),
+                   !strconcat(OpcodeStr, "\t$dst"),
+                   Pattern> {
   bits<4> dst;
-  bits<32> imm;
 
-  let Inst{63-60} = BPF_END.Value;
-  let Inst{59} = BPF_TO_BE.Value; // (TODO: use BPF_TO_LE for big-endian target)
   let Inst{51-48} = dst;
-  let Inst{31-0} = imm;
-
+  let Inst{31-0} = SizeOp;
   let BPFClass = BPF_ALU;
-  let imm = SizeOp;
 }
 
 let Constraints = "$dst = $src" in {
@@ -486,28 +504,26 @@ def BSWAP64 : BSWAP<64, "bswap64", [(set GPR:$dst, (bswap GPR:$src))]>;
 let Defs = [R0, R1, R2, R3, R4, R5], Uses = [R6], hasSideEffects = 1,
     hasExtraDefRegAllocReq = 1, hasExtraSrcRegAllocReq = 1, mayLoad = 1 in {
 class LOAD_ABS<BPFWidthModifer SizeOp, string OpcodeStr, Intrinsic OpNode>
-    : InstBPF<(outs), (ins GPR:$skb, i64imm:$imm),
-              "r0 = *("#OpcodeStr#" *)skb[$imm]",
-              [(set R0, (OpNode GPR:$skb, i64immSExt32:$imm))]> {
+    : TYPE_LD_ST<BPF_ABS.Value, SizeOp.Value,
+                 (outs),
+                 (ins GPR:$skb, i64imm:$imm),
+                 "r0 = *("#OpcodeStr#" *)skb[$imm]",
+                 [(set R0, (OpNode GPR:$skb, i64immSExt32:$imm))]> {
   bits<32> imm;
 
-  let Inst{63-61} = BPF_ABS.Value;
-  let Inst{60-59} = SizeOp.Value;
   let Inst{31-0} = imm;
-
   let BPFClass = BPF_LD;
 }
 
 class LOAD_IND<BPFWidthModifer SizeOp, string OpcodeStr, Intrinsic OpNode>
-    : InstBPF<(outs), (ins GPR:$skb, GPR:$val),
-              "r0 = *("#OpcodeStr#" *)skb[$val]",
-              [(set R0, (OpNode GPR:$skb, GPR:$val))]> {
+    : TYPE_LD_ST<BPF_IND.Value, SizeOp.Value,
+                 (outs),
+                 (ins GPR:$skb, GPR:$val),
+                 "r0 = *("#OpcodeStr#" *)skb[$val]",
+                 [(set R0, (OpNode GPR:$skb, GPR:$val))]> {
   bits<4> val;
 
-  let Inst{63-61} = BPF_IND.Value;
-  let Inst{60-59} = SizeOp.Value;
   let Inst{55-52} = val;
-
   let BPFClass = BPF_LD;
 }
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH RFC 3/4] New 32-bit register set
  2017-09-18 20:47 [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Jiong Wang
  2017-09-18 20:47 ` [PATCH RFC 1/4] Improve instruction encoding descriptions Jiong Wang
  2017-09-18 20:47 ` [PATCH RFC 2/4] Improve class inheritance in instruction patterns Jiong Wang
@ 2017-09-18 20:47 ` Jiong Wang
  2017-09-19  6:44   ` [iovisor-dev] " Y Song
  2017-09-18 20:47 ` [PATCH RFC 4/4] Initial 32-bit ALU encoding support in assembler Jiong Wang
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Jiong Wang @ 2017-09-18 20:47 UTC (permalink / raw)
  To: xdp-newbies; +Cc: llvm-dev, iovisor-dev, oss-drivers, Jiong Wang

This patch introduce the new "w*" 32-bit register set and alias them to the low
32-bit of the corresponding 64-bit register.

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 lib/Target/BPF/BPFRegisterInfo.td | 74 ++++++++++++++++++++++++++++-----------
 1 file changed, 53 insertions(+), 21 deletions(-)

diff --git a/lib/Target/BPF/BPFRegisterInfo.td b/lib/Target/BPF/BPFRegisterInfo.td
index c8e24f8..a5722fe 100644
--- a/lib/Target/BPF/BPFRegisterInfo.td
+++ b/lib/Target/BPF/BPFRegisterInfo.td
@@ -11,31 +11,63 @@
 //  Declarations that describe the BPF register file
 //===----------------------------------------------------------------------===//
 
+let Namespace = "BPF" in {
+  def sub_32 : SubRegIndex<32>;
+}
+
+class Wi<bits<16> Enc, string n> : Register<n> {
+  let HWEncoding = Enc;
+  let Namespace = "BPF";
+}
+
 // Registers are identified with 4-bit ID numbers.
 // Ri - 64-bit integer registers
-class Ri<bits<16> Enc, string n> : Register<n> {
-  let Namespace = "BPF";
+class Ri<bits<16> Enc, string n, list<Register> subregs>
+  : RegisterWithSubRegs<n, subregs> {
   let HWEncoding = Enc;
+  let Namespace = "BPF";
+  let SubRegIndices = [sub_32];
 }
 
-// Integer registers
-def R0 : Ri< 0, "r0">, DwarfRegNum<[0]>;
-def R1 : Ri< 1, "r1">, DwarfRegNum<[1]>;
-def R2 : Ri< 2, "r2">, DwarfRegNum<[2]>;
-def R3 : Ri< 3, "r3">, DwarfRegNum<[3]>;
-def R4 : Ri< 4, "r4">, DwarfRegNum<[4]>;
-def R5 : Ri< 5, "r5">, DwarfRegNum<[5]>;
-def R6 : Ri< 6, "r6">, DwarfRegNum<[6]>;
-def R7 : Ri< 7, "r7">, DwarfRegNum<[7]>;
-def R8 : Ri< 8, "r8">, DwarfRegNum<[8]>;
-def R9 : Ri< 9, "r9">, DwarfRegNum<[9]>;
-def R10 : Ri<10, "r10">, DwarfRegNum<[10]>;
-def R11 : Ri<11, "r11">, DwarfRegNum<[11]>;
+// 32-bit Integer registers which alias to low part of 64-bit one.
+def W0  : Wi<0,  "w0">,  DwarfRegNum<[0]>;
+def W1  : Wi<1,  "w1">,  DwarfRegNum<[1]>;
+def W2  : Wi<2,  "w2">,  DwarfRegNum<[2]>;
+def W3  : Wi<3,  "w3">,  DwarfRegNum<[3]>;
+def W4  : Wi<4,  "w4">,  DwarfRegNum<[4]>;
+def W5  : Wi<5,  "w5">,  DwarfRegNum<[5]>;
+def W6  : Wi<6,  "w6">,  DwarfRegNum<[6]>;
+def W7  : Wi<7,  "w7">,  DwarfRegNum<[7]>;
+def W8  : Wi<8,  "w8">,  DwarfRegNum<[8]>;
+def W9  : Wi<9,  "w9">,  DwarfRegNum<[9]>;
+def W10 : Wi<10, "w10">, DwarfRegNum<[10]>;
+def W11 : Wi<11, "w11">, DwarfRegNum<[11]>;
+
+// 64-bit Integer registers
+def R0  : Ri<0,  "r0",  [W0]>,  DwarfRegNum<[0]>;
+def R1  : Ri<1,  "r1",  [W1]>,  DwarfRegNum<[1]>;
+def R2  : Ri<2,  "r2",  [W2]>,  DwarfRegNum<[2]>;
+def R3  : Ri<3,  "r3",  [W3]>,  DwarfRegNum<[3]>;
+def R4  : Ri<4,  "r4",  [W4]>,  DwarfRegNum<[4]>;
+def R5  : Ri<5,  "r5",  [W5]>,  DwarfRegNum<[5]>;
+def R6  : Ri<6,  "r6",  [W6]>,  DwarfRegNum<[6]>;
+def R7  : Ri<7,  "r7",  [W7]>,  DwarfRegNum<[7]>;
+def R8  : Ri<8,  "r8",  [W8]>,  DwarfRegNum<[8]>;
+def R9  : Ri<9,  "r9",  [W9]>,  DwarfRegNum<[9]>;
+def R10 : Ri<10, "r10", [W10]>, DwarfRegNum<[10]>;
+def R11 : Ri<11, "r11", [W11]>, DwarfRegNum<[11]>;
 
 // Register classes.
-def GPR : RegisterClass<"BPF", [i64], 64, (add R1, R2, R3, R4, R5,
-                                           R6, R7, R8, R9, // callee saved
-                                           R0,  // return value
-                                           R11, // stack ptr
-                                           R10  // frame ptr
-                                          )>;
+def GPR32 : RegisterClass<"BPF", [i32], 32, (add
+  (sequence "W%u", 1, 9), // Callee saved
+  W0, // Return value
+  W11, // Stack Ptr
+  W10  // Frame Ptr
+)>;
+
+def GPR : RegisterClass<"BPF", [i64], 64, (add
+  (sequence "R%u", 1, 9), // Callee saved
+  R0, // Return value
+  R11, // Stack Ptr
+  R10  // Frame Ptr
+)>;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH RFC 4/4] Initial 32-bit ALU encoding support in assembler
  2017-09-18 20:47 [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Jiong Wang
                   ` (2 preceding siblings ...)
  2017-09-18 20:47 ` [PATCH RFC 3/4] New 32-bit register set Jiong Wang
@ 2017-09-18 20:47 ` Jiong Wang
  2017-09-18 21:29 ` [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Daniel Borkmann
  2017-09-19  4:49 ` Fulvio Risso
  5 siblings, 0 replies; 20+ messages in thread
From: Jiong Wang @ 2017-09-18 20:47 UTC (permalink / raw)
  To: xdp-newbies; +Cc: llvm-dev, iovisor-dev, oss-drivers, Jiong Wang

This patch adds instruction patterns for operations in BPF_ALU. After this,
assembler could recognize some 32-bit ALU statement. For example, those listed
int the unit test file.

Separate MOV patterns are unnecessary as MOV is ALU operation that could reuse
ALU encoding infrastructure, this patch removed those redundant patterns.

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 lib/Target/BPF/BPFInstrInfo.td                  | 94 +++++++++++++------------
 lib/Target/BPF/Disassembler/BPFDisassembler.cpp | 15 ++++
 test/MC/BPF/insn-unit-32.s                      | 53 ++++++++++++++
 3 files changed, 116 insertions(+), 46 deletions(-)
 create mode 100644 test/MC/BPF/insn-unit-32.s

diff --git a/lib/Target/BPF/BPFInstrInfo.td b/lib/Target/BPF/BPFInstrInfo.td
index 0319cfe..e1f233e 100644
--- a/lib/Target/BPF/BPFInstrInfo.td
+++ b/lib/Target/BPF/BPFInstrInfo.td
@@ -172,37 +172,49 @@ defm JSLE : J<BPF_JSLE, "s<=", BPF_CC_LE>;
 }
 
 // ALU instructions
-class ALU_RI<BPFArithOp Opc, string OpcodeStr, SDNode OpNode>
-    : TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
-                   (outs GPR:$dst),
-                   (ins GPR:$src2, i64imm:$imm),
-                   "$dst "#OpcodeStr#" $imm",
-                   [(set GPR:$dst, (OpNode GPR:$src2, i64immSExt32:$imm))]> {
+class ALU_RI<BPFOpClass Class, BPFArithOp Opc,
+             dag outs, dag ins, string asmstr, list<dag> pattern>
+    : TYPE_ALU_JMP<Opc.Value, BPF_K.Value, outs, ins, asmstr, pattern> {
   bits<4> dst;
   bits<32> imm;
 
   let Inst{51-48} = dst;
   let Inst{31-0} = imm;
-  let BPFClass = BPF_ALU64;
+  let BPFClass = Class;
 }
 
-class ALU_RR<BPFArithOp Opc, string OpcodeStr, SDNode OpNode>
-    : TYPE_ALU_JMP<Opc.Value, BPF_X.Value,
-                   (outs GPR:$dst),
-                   (ins GPR:$src2, GPR:$src),
-                   "$dst "#OpcodeStr#" $src",
-                   [(set GPR:$dst, (OpNode i64:$src2, i64:$src))]> {
+class ALU_RR<BPFOpClass Class, BPFArithOp Opc,
+             dag outs, dag ins, string asmstr, list<dag> pattern>
+    : TYPE_ALU_JMP<Opc.Value, BPF_X.Value, outs, ins, asmstr, pattern> {
   bits<4> dst;
   bits<4> src;
 
   let Inst{55-52} = src;
   let Inst{51-48} = dst;
-  let BPFClass = BPF_ALU64;
+  let BPFClass = Class;
 }
 
 multiclass ALU<BPFArithOp Opc, string OpcodeStr, SDNode OpNode> {
-  def _rr : ALU_RR<Opc, OpcodeStr, OpNode>;
-  def _ri : ALU_RI<Opc, OpcodeStr, OpNode>;
+  def _rr : ALU_RR<BPF_ALU64, Opc,
+                   (outs GPR:$dst),
+                   (ins GPR:$src2, GPR:$src),
+                   "$dst "#OpcodeStr#" $src",
+                   [(set GPR:$dst, (OpNode i64:$src2, i64:$src))]>;
+  def _ri : ALU_RI<BPF_ALU64, Opc,
+                   (outs GPR:$dst),
+                   (ins GPR:$src2, i64imm:$imm),
+                   "$dst "#OpcodeStr#" $imm",
+                   [(set GPR:$dst, (OpNode GPR:$src2, i64immSExt32:$imm))]>;
+  def _rr_32 : ALU_RR<BPF_ALU, Opc,
+                   (outs GPR32:$dst),
+                   (ins GPR32:$src2, GPR32:$src),
+                   "$dst "#OpcodeStr#" $src",
+                   [(set GPR32:$dst, (OpNode i32:$src2, i32:$src))]>;
+  def _ri_32 : ALU_RI<BPF_ALU, Opc,
+                   (outs GPR32:$dst),
+                   (ins GPR32:$src2, i32imm:$imm),
+                   "$dst "#OpcodeStr#" $imm",
+                   [(set GPR32:$dst, (OpNode GPR32:$src2, i32:$imm))]>;
 }
 
 let Constraints = "$dst = $src2" in {
@@ -220,34 +232,6 @@ let isAsCheapAsAMove = 1 in {
   defm DIV : ALU<BPF_DIV, "/=", udiv>;
 }
 
-class MOV_RR<string OpcodeStr>
-    : TYPE_ALU_JMP<BPF_MOV.Value, BPF_X.Value,
-                   (outs GPR:$dst),
-                   (ins GPR:$src),
-                   "$dst "#OpcodeStr#" $src",
-                   []> {
-  bits<4> dst;
-  bits<4> src;
-
-  let Inst{55-52} = src;
-  let Inst{51-48} = dst;
-  let BPFClass = BPF_ALU64;
-}
-
-class MOV_RI<string OpcodeStr>
-    : TYPE_ALU_JMP<BPF_MOV.Value, BPF_K.Value,
-                   (outs GPR:$dst),
-                   (ins i64imm:$imm),
-                   "$dst "#OpcodeStr#" $imm",
-                   [(set GPR:$dst, (i64 i64immSExt32:$imm))]> {
-  bits<4> dst;
-  bits<32> imm;
-
-  let Inst{51-48} = dst;
-  let Inst{31-0} = imm;
-  let BPFClass = BPF_ALU64;
-}
-
 class LD_IMM64<bits<4> Pseudo, string OpcodeStr>
     : TYPE_LD_ST<BPF_IMM.Value, BPF_DW.Value,
                  (outs GPR:$dst),
@@ -267,8 +251,26 @@ class LD_IMM64<bits<4> Pseudo, string OpcodeStr>
 
 let isReMaterializable = 1, isAsCheapAsAMove = 1 in {
 def LD_imm64 : LD_IMM64<0, "=">;
-def MOV_rr : MOV_RR<"=">;
-def MOV_ri : MOV_RI<"=">;
+def MOV_rr : ALU_RR<BPF_ALU64, BPF_MOV,
+                    (outs GPR:$dst),
+                    (ins GPR:$src),
+                    "$dst = $src",
+                    []>;
+def MOV_ri : ALU_RI<BPF_ALU64, BPF_MOV,
+                    (outs GPR:$dst),
+                    (ins i64imm:$imm),
+                    "$dst = $imm",
+                    [(set GPR:$dst, (i64 i64immSExt32:$imm))]>;
+def MOV_rr_32 : ALU_RR<BPF_ALU, BPF_MOV,
+                    (outs GPR32:$dst),
+                    (ins GPR32:$src),
+                    "$dst = $src",
+                    []>;
+def MOV_ri_32 : ALU_RI<BPF_ALU, BPF_MOV,
+                    (outs GPR32:$dst),
+                    (ins i32imm:$imm),
+                    "$dst = $imm",
+                    [(set GPR32:$dst, (i32 i32:$imm))]>;
 }
 
 def FI_ri
diff --git a/lib/Target/BPF/Disassembler/BPFDisassembler.cpp b/lib/Target/BPF/Disassembler/BPFDisassembler.cpp
index a1d732c..f5b621f 100644
--- a/lib/Target/BPF/Disassembler/BPFDisassembler.cpp
+++ b/lib/Target/BPF/Disassembler/BPFDisassembler.cpp
@@ -79,6 +79,21 @@ static DecodeStatus DecodeGPRRegisterClass(MCInst &Inst, unsigned RegNo,
   return MCDisassembler::Success;
 }
 
+static const unsigned GPR32DecoderTable[] = {
+    BPF::W0,  BPF::W1,  BPF::W2,  BPF::W3,  BPF::W4,  BPF::W5,
+    BPF::W6,  BPF::W7,  BPF::W8,  BPF::W9,  BPF::W10, BPF::W11};
+
+static DecodeStatus DecodeGPR32RegisterClass(MCInst &Inst, unsigned RegNo,
+                                             uint64_t /*Address*/,
+                                             const void * /*Decoder*/) {
+  if (RegNo > 11)
+    return MCDisassembler::Fail;
+
+  unsigned Reg = GPR32DecoderTable[RegNo];
+  Inst.addOperand(MCOperand::createReg(Reg));
+  return MCDisassembler::Success;
+}
+
 static DecodeStatus decodeMemoryOpValue(MCInst &Inst, unsigned Insn,
                                         uint64_t Address, const void *Decoder) {
   unsigned Register = (Insn >> 16) & 0xf;
diff --git a/test/MC/BPF/insn-unit-32.s b/test/MC/BPF/insn-unit-32.s
new file mode 100644
index 0000000..77787a9
--- /dev/null
+++ b/test/MC/BPF/insn-unit-32.s
@@ -0,0 +1,53 @@
+# RUN: llvm-mc -triple bpfel -filetype=obj -o %t %s
+# RUN: llvm-objdump -d -r %t | FileCheck %s
+
+// ======== BPF_ALU Class ========
+  w0 += w1    // BPF_ADD  | BPF_X
+  w1 -= w2    // BPF_SUB  | BPF_X
+  w2 *= w3    // BPF_MUL  | BPF_X
+  w3 /= w4    // BPF_DIV  | BPF_X
+// CHECK: 0c 10 00 00 00 00 00 00 	w0 += w1
+// CHECK: 1c 21 00 00 00 00 00 00 	w1 -= w2
+// CHECK: 2c 32 00 00 00 00 00 00 	w2 *= w3
+// CHECK: 3c 43 00 00 00 00 00 00 	w3 /= w4
+
+  w4 |= w5    // BPF_OR   | BPF_X
+  w5 &= w6    // BPF_AND  | BPF_X
+  w6 <<= w7   // BPF_LSH  | BPF_X
+  w7 >>= w8   // BPF_RSH  | BPF_X
+  w8 ^= w9    // BPF_XOR  | BPF_X
+  w9 = w10    // BPF_MOV  | BPF_X
+  w10 s>>= w0 // BPF_ARSH | BPF_X
+// CHECK: 4c 54 00 00 00 00 00 00 	w4 |= w5
+// CHECK: 5c 65 00 00 00 00 00 00 	w5 &= w6
+// CHECK: 6c 76 00 00 00 00 00 00 	w6 <<= w7
+// CHECK: 7c 87 00 00 00 00 00 00 	w7 >>= w8
+// CHECK: ac 98 00 00 00 00 00 00 	w8 ^= w9
+// CHECK: bc a9 00 00 00 00 00 00 	w9 = w10
+// CHECK: cc 0a 00 00 00 00 00 00 	w10 s>>= w0
+
+  w0 += 1           // BPF_ADD  | BPF_K
+  w1 -= 0x1         // BPF_SUB  | BPF_K
+  w2 *= -4          // BPF_MUL  | BPF_K
+  w3 /= 5           // BPF_DIV  | BPF_K
+// CHECK: 04 00 00 00 01 00 00 00 	w0 += 1
+// CHECK: 14 01 00 00 01 00 00 00 	w1 -= 1
+// CHECK: 24 02 00 00 fc ff ff ff 	w2 *= -4
+// CHECK: 34 03 00 00 05 00 00 00 	w3 /= 5
+
+  w4 |= 0xff        // BPF_OR   | BPF_K
+  w5 &= 0xFF        // BPF_AND  | BPF_K
+  w6 <<= 63         // BPF_LSH  | BPF_K
+  w7 >>= 32         // BPF_RSH  | BPF_K
+  w8 ^= 0           // BPF_XOR  | BPF_K
+  w9 = 1            // BPF_MOV  | BPF_K
+  w9 = 0xffffffff   // BPF_MOV  | BPF_K
+  w10 s>>= 64       // BPF_ARSH | BPF_K
+// CHECK: 44 04 00 00 ff 00 00 00 	w4 |= 255
+// CHECK: 54 05 00 00 ff 00 00 00 	w5 &= 255
+// CHECK: 64 06 00 00 3f 00 00 00 	w6 <<= 63
+// CHECK: 74 07 00 00 20 00 00 00 	w7 >>= 32
+// CHECK: a4 08 00 00 00 00 00 00 	w8 ^= 0
+// CHECK: b4 09 00 00 01 00 00 00 	w9 = 1
+// CHECK: b4 09 00 00 ff ff ff ff 	w9 = -1
+// CHECK: c4 0a 00 00 40 00 00 00 	w10 s>>= 64
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-18 20:47 [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Jiong Wang
                   ` (3 preceding siblings ...)
  2017-09-18 20:47 ` [PATCH RFC 4/4] Initial 32-bit ALU encoding support in assembler Jiong Wang
@ 2017-09-18 21:29 ` Daniel Borkmann
  2017-09-19 23:20   ` Jiong Wang
  2017-09-19  4:49 ` Fulvio Risso
  5 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2017-09-18 21:29 UTC (permalink / raw)
  To: Jiong Wang; +Cc: xdp-newbies, llvm-dev, iovisor-dev, oss-drivers, ys114321

On 09/18/2017 10:47 PM, Jiong Wang wrote:
> Hi,
>
>    Currently, LLVM eBPF backend always generate code in 64-bit mode, this may
> cause troubles when JITing to 32-bit targets.
>
>    For example, it is quite common for XDP eBPF program to access some packet
> fields through base + offset that the default eBPF will generate BPF_ALU64 for
> the address formation, later when JITing to 32-bit hardware, BPF_ALU64 needs
> to be expanded into 32 bit ALU sequences even though the address space is
> 32-bit that the high bits is not significant.
>
>    While a complete 32-bit mode implemention may need an new ABI (something like
> -target-abi=ilp32), this patch set first add some initial code so we could
> construct 32-bit eBPF tests through hand-written assembly.
>
>    A new 32-bit register set is introduced, its name is with "w" prefix and LLVM
> assembler will encode statements like "w1 += w2" into the following 8-bit code
> field:
>
>      BPF_ADD | BPF_X | BPF_ALU
>
> BPF_ALU will be used instead of BPF_ALU64.
>
>    NOTE, currently you can only use "w" register with ALU statements, not with
> others like branches etc as they don't have different encoding for 32-bit
> target.

Great to see work in this direction! Can we also enable to use / emit
all the 32bit BPF_ALU instructions whenever possible for the currently
available bpf targets while at it (which only use BPF_ALU64 right now)?

Thanks,
Daniel

> *** BLURB HERE ***
>
> Jiong Wang (4):
>    Improve instruction encoding descriptions
>    Improve class inheritance in instruction patterns
>    New 32-bit register set
>    Initial 32-bit ALU (BPF_ALU) encoding support in assembler
>
>   lib/Target/BPF/BPFInstrFormats.td               |  84 +++-
>   lib/Target/BPF/BPFInstrInfo.td                  | 506 +++++++++++-------------
>   lib/Target/BPF/BPFRegisterInfo.td               |  74 +++-
>   lib/Target/BPF/Disassembler/BPFDisassembler.cpp |  15 +
>   test/MC/BPF/insn-unit-32.s                      |  53 +++
>   5 files changed, 427 insertions(+), 305 deletions(-)
>   create mode 100644 test/MC/BPF/insn-unit-32.s
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-18 20:47 [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Jiong Wang
                   ` (4 preceding siblings ...)
  2017-09-18 21:29 ` [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Daniel Borkmann
@ 2017-09-19  4:49 ` Fulvio Risso
  2017-09-19 22:56   ` Jiong Wang
  5 siblings, 1 reply; 20+ messages in thread
From: Fulvio Risso @ 2017-09-19  4:49 UTC (permalink / raw)
  To: Jiong Wang, xdp-newbies; +Cc: llvm-dev, oss-drivers, iovisor-dev

Dear Jiong,

that's a great work.
I havent' gone through the whole patches, but it seems to me that the 
documentation is not that much.
 From my past experiences, putting your hands into a compiler without at 
least some high-level documentation that presents how it works, it would 
be a nightmare.
Even something like what you wrote in this email is valuable; of course, 
also how to turn this feature on.

Thanks,

	fulvio


On 18/09/2017 13:47, Jiong Wang via iovisor-dev wrote:
> Hi,
> 
>    Currently, LLVM eBPF backend always generate code in 64-bit mode, this may
> cause troubles when JITing to 32-bit targets.
> 
>    For example, it is quite common for XDP eBPF program to access some packet
> fields through base + offset that the default eBPF will generate BPF_ALU64 for
> the address formation, later when JITing to 32-bit hardware, BPF_ALU64 needs
> to be expanded into 32 bit ALU sequences even though the address space is
> 32-bit that the high bits is not significant.
> 
>    While a complete 32-bit mode implemention may need an new ABI (something like
> -target-abi=ilp32), this patch set first add some initial code so we could
> construct 32-bit eBPF tests through hand-written assembly.
> 
>    A new 32-bit register set is introduced, its name is with "w" prefix and LLVM
> assembler will encode statements like "w1 += w2" into the following 8-bit code
> field:
> 
>      BPF_ADD | BPF_X | BPF_ALU
> 
> BPF_ALU will be used instead of BPF_ALU64.
> 
>    NOTE, currently you can only use "w" register with ALU statements, not with
> others like branches etc as they don't have different encoding for 32-bit
> target.
> 
>    Comments?
> 
> *** BLURB HERE ***
> 
> Jiong Wang (4):
>    Improve instruction encoding descriptions
>    Improve class inheritance in instruction patterns
>    New 32-bit register set
>    Initial 32-bit ALU (BPF_ALU) encoding support in assembler
> 
>   lib/Target/BPF/BPFInstrFormats.td               |  84 +++-
>   lib/Target/BPF/BPFInstrInfo.td                  | 506 +++++++++++-------------
>   lib/Target/BPF/BPFRegisterInfo.td               |  74 +++-
>   lib/Target/BPF/Disassembler/BPFDisassembler.cpp |  15 +
>   test/MC/BPF/insn-unit-32.s                      |  53 +++
>   5 files changed, 427 insertions(+), 305 deletions(-)
>   create mode 100644 test/MC/BPF/insn-unit-32.s
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 3/4] New 32-bit register set
  2017-09-18 20:47 ` [PATCH RFC 3/4] New 32-bit register set Jiong Wang
@ 2017-09-19  6:44   ` Y Song
  2017-09-19 23:10     ` Jiong Wang
  0 siblings, 1 reply; 20+ messages in thread
From: Y Song @ 2017-09-19  6:44 UTC (permalink / raw)
  To: Jiong Wang
  Cc: xdp-newbies, LLVM Developers Mailing List, oss-drivers,
	Tom Herbert via iovisor-dev

Hi, Jiong,

Thanks for the patch! It is a great start to support 32bit register in BPF.
In the past, I have studied a little bit to see whether 32bit register
support may reduce
the number of unnecessary shifts on x86_64 and improve the
performance. Looking through
a few bpf programs and it looks like the opportunity is not great, but
still nice to have if we
have this capability. As you mentioned, this definitely helped 32bit
architecture!

I am not an expert in LLVM tablegen. I briefly looked through the code
change and it looks like
correct. Hopefully some llvm-dev tablegen experts can comment on the
implementation.

Below I only have a couple of minor comments.

On Mon, Sep 18, 2017 at 1:47 PM, Jiong Wang via iovisor-dev
<iovisor-dev@lists.iovisor.org> wrote:
> This patch introduce the new "w*" 32-bit register set and alias them to the low
> 32-bit of the corresponding 64-bit register.
>
> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
> ---
>  lib/Target/BPF/BPFRegisterInfo.td | 74 ++++++++++++++++++++++++++++-----------
>  1 file changed, 53 insertions(+), 21 deletions(-)
>
> diff --git a/lib/Target/BPF/BPFRegisterInfo.td b/lib/Target/BPF/BPFRegisterInfo.td
> index c8e24f8..a5722fe 100644
> --- a/lib/Target/BPF/BPFRegisterInfo.td
> +++ b/lib/Target/BPF/BPFRegisterInfo.td
> @@ -11,31 +11,63 @@
>  //  Declarations that describe the BPF register file
>  //===----------------------------------------------------------------------===//
>
> +let Namespace = "BPF" in {
> +  def sub_32 : SubRegIndex<32>;
> +}
> +
> +class Wi<bits<16> Enc, string n> : Register<n> {
> +  let HWEncoding = Enc;
> +  let Namespace = "BPF";
> +}
> +
>  // Registers are identified with 4-bit ID numbers.
>  // Ri - 64-bit integer registers
> -class Ri<bits<16> Enc, string n> : Register<n> {
> -  let Namespace = "BPF";
> +class Ri<bits<16> Enc, string n, list<Register> subregs>
> +  : RegisterWithSubRegs<n, subregs> {
>    let HWEncoding = Enc;
> +  let Namespace = "BPF";
> +  let SubRegIndices = [sub_32];
>  }
>
> -// Integer registers
> -def R0 : Ri< 0, "r0">, DwarfRegNum<[0]>;
> -def R1 : Ri< 1, "r1">, DwarfRegNum<[1]>;
> -def R2 : Ri< 2, "r2">, DwarfRegNum<[2]>;
> -def R3 : Ri< 3, "r3">, DwarfRegNum<[3]>;
> -def R4 : Ri< 4, "r4">, DwarfRegNum<[4]>;
> -def R5 : Ri< 5, "r5">, DwarfRegNum<[5]>;
> -def R6 : Ri< 6, "r6">, DwarfRegNum<[6]>;
> -def R7 : Ri< 7, "r7">, DwarfRegNum<[7]>;
> -def R8 : Ri< 8, "r8">, DwarfRegNum<[8]>;
> -def R9 : Ri< 9, "r9">, DwarfRegNum<[9]>;
> -def R10 : Ri<10, "r10">, DwarfRegNum<[10]>;
> -def R11 : Ri<11, "r11">, DwarfRegNum<[11]>;
> +// 32-bit Integer registers which alias to low part of 64-bit one.
> +def W0  : Wi<0,  "w0">,  DwarfRegNum<[0]>;
> +def W1  : Wi<1,  "w1">,  DwarfRegNum<[1]>;
> +def W2  : Wi<2,  "w2">,  DwarfRegNum<[2]>;
> +def W3  : Wi<3,  "w3">,  DwarfRegNum<[3]>;
> +def W4  : Wi<4,  "w4">,  DwarfRegNum<[4]>;
> +def W5  : Wi<5,  "w5">,  DwarfRegNum<[5]>;
> +def W6  : Wi<6,  "w6">,  DwarfRegNum<[6]>;
> +def W7  : Wi<7,  "w7">,  DwarfRegNum<[7]>;
> +def W8  : Wi<8,  "w8">,  DwarfRegNum<[8]>;
> +def W9  : Wi<9,  "w9">,  DwarfRegNum<[9]>;
> +def W10 : Wi<10, "w10">, DwarfRegNum<[10]>;
> +def W11 : Wi<11, "w11">, DwarfRegNum<[11]>;
> +
> +// 64-bit Integer registers
> +def R0  : Ri<0,  "r0",  [W0]>,  DwarfRegNum<[0]>;
> +def R1  : Ri<1,  "r1",  [W1]>,  DwarfRegNum<[1]>;
> +def R2  : Ri<2,  "r2",  [W2]>,  DwarfRegNum<[2]>;
> +def R3  : Ri<3,  "r3",  [W3]>,  DwarfRegNum<[3]>;
> +def R4  : Ri<4,  "r4",  [W4]>,  DwarfRegNum<[4]>;
> +def R5  : Ri<5,  "r5",  [W5]>,  DwarfRegNum<[5]>;
> +def R6  : Ri<6,  "r6",  [W6]>,  DwarfRegNum<[6]>;
> +def R7  : Ri<7,  "r7",  [W7]>,  DwarfRegNum<[7]>;
> +def R8  : Ri<8,  "r8",  [W8]>,  DwarfRegNum<[8]>;
> +def R9  : Ri<9,  "r9",  [W9]>,  DwarfRegNum<[9]>;
> +def R10 : Ri<10, "r10", [W10]>, DwarfRegNum<[10]>;
> +def R11 : Ri<11, "r11", [W11]>, DwarfRegNum<[11]>;

Since you are touching this part, you could use "sequence" loop
to generate W* and R* definitions.

>
>  // Register classes.
> -def GPR : RegisterClass<"BPF", [i64], 64, (add R1, R2, R3, R4, R5,
> -                                           R6, R7, R8, R9, // callee saved
> -                                           R0,  // return value
> -                                           R11, // stack ptr
> -                                           R10  // frame ptr
> -                                          )>;
> +def GPR32 : RegisterClass<"BPF", [i32], 32, (add
> +  (sequence "W%u", 1, 9), // Callee saved

You can remove "Called saved" comments here.
The callee saved registers are actually R6-R9.

> +  W0, // Return value
> +  W11, // Stack Ptr
> +  W10  // Frame Ptr
> +)>;
> +
> +def GPR : RegisterClass<"BPF", [i64], 64, (add
> +  (sequence "R%u", 1, 9), // Callee saved

ditto.

> +  R0, // Return value
> +  R11, // Stack Ptr
> +  R10  // Frame Ptr
> +)>;
> --
> 2.7.4
>
> _______________________________________________
> iovisor-dev mailing list
> iovisor-dev@lists.iovisor.org
> https://lists.iovisor.org/mailman/listinfo/iovisor-dev

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-19  4:49 ` Fulvio Risso
@ 2017-09-19 22:56   ` Jiong Wang
  0 siblings, 0 replies; 20+ messages in thread
From: Jiong Wang @ 2017-09-19 22:56 UTC (permalink / raw)
  To: Fulvio Risso, xdp-newbies; +Cc: llvm-dev, oss-drivers, iovisor-dev

On 19/09/2017 05:49, Fulvio Risso wrote:
> Dear Jiong,
>
> that's a great work.
> I havent' gone through the whole patches, but it seems to me that the 
> documentation is not that much.
> From my past experiences, putting your hands into a compiler without 
> at least some high-level documentation that presents how it works, it 
> would be a nightmare.

Hi Fulvio,

   Thanks for the feedback.

   I agree that we need some high-level documentation on this. It could let
people evaluating the implementation approach taken and we could make sure
things on on correct direction.

   I was dividing the 32-bit BPF support into the following three parts and
I am thinking the support of the first part could let the user be able to
construct testcases contaning 32-bit ALU which is good for experimenting
and it won't affect code-gen from .c/.ll. Aslo this part is not hard for
review and relatively independent, so I have seperated them out into this
patch set for review.

   Detailed implementation details or high-level design will be included in
the cover letter of patch set for the second part which is the actually
part for 32-bit code-gen enablement.

brief intro on 32-bit BPF support phases
===
     * Support in assembly level.

       This includes cleanup of existing BPF instruction patterns to make them
       32-bit support friendly and also the registration of 32-bit register set.

     * Full code-gen Support in LLVM.

       This includes making 32-bit integer type as legal type and associate it
       with 32-bit register class, also we need to cleanup operation legalization
       stragegies.

       We need to discuss whether we want to make 32-bit BPF an new target or
       just an new environment in LLVM's concept, so the user could simply enable
       it through something like --triple=bpf-unknown-elf-ilp32. We may also need
       new ELF attributes and new relocation types for map address relocation for
       32-bit eBPF.

     * Hook Clang driver with LLVM.

       Support for this is straightforward, especially if we go with new environment
       approach.

Regards,

Jiong

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 3/4] New 32-bit register set
  2017-09-19  6:44   ` [iovisor-dev] " Y Song
@ 2017-09-19 23:10     ` Jiong Wang
  2017-09-22  4:55       ` Y Song
  0 siblings, 1 reply; 20+ messages in thread
From: Jiong Wang @ 2017-09-19 23:10 UTC (permalink / raw)
  To: Y Song
  Cc: xdp-newbies, LLVM Developers Mailing List, oss-drivers,
	Tom Herbert via iovisor-dev

[-- Attachment #1: Type: text/plain, Size: 982 bytes --]

On 19/09/2017 07:44, Y Song wrote:
> Hi, Jiong,
>
> Thanks for the patch! It is a great start to support 32bit register in BPF.
> In the past, I have studied a little bit to see whether 32bit register
> support may reduce
> the number of unnecessary shifts on x86_64 and improve the
> performance. Looking through
> a few bpf programs and it looks like the opportunity is not great, but
> still nice to have if we
> have this capability. As you mentioned, this definitely helped 32bit
> architecture!
>
> I am not an expert in LLVM tablegen. I briefly looked through the code
> change and it looks like
> correct. Hopefully some llvm-dev tablegen experts can comment on the
> implementation.
>
> Below I only have a couple of minor comments.

Yong Hong,

   Thanks for the review.

   I have addressed your comments and attached the updated patch.

   Do you want me to put this patch set on to llvm review website? I 
guess it is the
   formal review channel?

Regards,
Jiong


[-- Attachment #2: patch3-v2.patch --]
[-- Type: text/plain, Size: 2446 bytes --]

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>

diff --git a/lib/Target/BPF/BPFRegisterInfo.td b/lib/Target/BPF/BPFRegisterInfo.td
index c8e24f8..da1d6b5 100644
--- a/lib/Target/BPF/BPFRegisterInfo.td
+++ b/lib/Target/BPF/BPFRegisterInfo.td
@@ -11,31 +11,42 @@
 //  Declarations that describe the BPF register file
 //===----------------------------------------------------------------------===//
 
+let Namespace = "BPF" in {
+  def sub_32 : SubRegIndex<32>;
+}
+
+class Wi<bits<16> Enc, string n> : Register<n> {
+  let HWEncoding = Enc;
+  let Namespace = "BPF";
+}
+
 // Registers are identified with 4-bit ID numbers.
 // Ri - 64-bit integer registers
-class Ri<bits<16> Enc, string n> : Register<n> {
-  let Namespace = "BPF";
+class Ri<bits<16> Enc, string n, list<Register> subregs>
+  : RegisterWithSubRegs<n, subregs> {
   let HWEncoding = Enc;
+  let Namespace = "BPF";
+  let SubRegIndices = [sub_32];
 }
 
-// Integer registers
-def R0 : Ri< 0, "r0">, DwarfRegNum<[0]>;
-def R1 : Ri< 1, "r1">, DwarfRegNum<[1]>;
-def R2 : Ri< 2, "r2">, DwarfRegNum<[2]>;
-def R3 : Ri< 3, "r3">, DwarfRegNum<[3]>;
-def R4 : Ri< 4, "r4">, DwarfRegNum<[4]>;
-def R5 : Ri< 5, "r5">, DwarfRegNum<[5]>;
-def R6 : Ri< 6, "r6">, DwarfRegNum<[6]>;
-def R7 : Ri< 7, "r7">, DwarfRegNum<[7]>;
-def R8 : Ri< 8, "r8">, DwarfRegNum<[8]>;
-def R9 : Ri< 9, "r9">, DwarfRegNum<[9]>;
-def R10 : Ri<10, "r10">, DwarfRegNum<[10]>;
-def R11 : Ri<11, "r11">, DwarfRegNum<[11]>;
+foreach I = 0-11 in {
+  // 32-bit Integer (alias to low part of 64-bit register).
+  def W#I  : Wi<I,  "w"#I>,  DwarfRegNum<[I]>;
+  // 64-bit Integer registers
+  def R#I  : Ri<I,  "r"#I,  [!cast<Wi>("W"#I)]>,  DwarfRegNum<[I]>;
+}
 
 // Register classes.
-def GPR : RegisterClass<"BPF", [i64], 64, (add R1, R2, R3, R4, R5,
-                                           R6, R7, R8, R9, // callee saved
-                                           R0,  // return value
-                                           R11, // stack ptr
-                                           R10  // frame ptr
-                                          )>;
+def GPR32 : RegisterClass<"BPF", [i32], 32, (add
+  (sequence "W%u", 1, 9),
+  W0, // Return value
+  W11, // Stack Ptr
+  W10  // Frame Ptr
+)>;
+
+def GPR : RegisterClass<"BPF", [i64], 64, (add
+  (sequence "R%u", 1, 9),
+  R0, // Return value
+  R11, // Stack Ptr
+  R10  // Frame Ptr
+)>;

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-18 21:29 ` [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Daniel Borkmann
@ 2017-09-19 23:20   ` Jiong Wang
  2017-09-19 23:24     ` Daniel Borkmann
  2017-09-21 18:56     ` [iovisor-dev] " Alexei Starovoitov
  0 siblings, 2 replies; 20+ messages in thread
From: Jiong Wang @ 2017-09-19 23:20 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: xdp-newbies, llvm-dev, iovisor-dev, oss-drivers, ys114321

On 18/09/2017 22:29, Daniel Borkmann wrote:
> On 09/18/2017 10:47 PM, Jiong Wang wrote:
>> Hi,
>>
>>    Currently, LLVM eBPF backend always generate code in 64-bit mode, 
>> this may
>> cause troubles when JITing to 32-bit targets.
>>
>>    For example, it is quite common for XDP eBPF program to access 
>> some packet
>> fields through base + offset that the default eBPF will generate 
>> BPF_ALU64 for
>> the address formation, later when JITing to 32-bit hardware, 
>> BPF_ALU64 needs
>> to be expanded into 32 bit ALU sequences even though the address 
>> space is
>> 32-bit that the high bits is not significant.
>>
>>    While a complete 32-bit mode implemention may need an new ABI 
>> (something like
>> -target-abi=ilp32), this patch set first add some initial code so we 
>> could
>> construct 32-bit eBPF tests through hand-written assembly.
>>
>>    A new 32-bit register set is introduced, its name is with "w" 
>> prefix and LLVM
>> assembler will encode statements like "w1 += w2" into the following 
>> 8-bit code
>> field:
>>
>>      BPF_ADD | BPF_X | BPF_ALU
>>
>> BPF_ALU will be used instead of BPF_ALU64.
>>
>>    NOTE, currently you can only use "w" register with ALU statements, 
>> not with
>> others like branches etc as they don't have different encoding for 
>> 32-bit
>> target.
>
> Great to see work in this direction! Can we also enable to use / emit
> all the 32bit BPF_ALU instructions whenever possible for the currently
> available bpf targets while at it (which only use BPF_ALU64 right now)?

Hi Daniel,

    Thanks for the feedback.

    I think we could also enable the use of all the 32bit BPF_ALU under 
currently
available bpf targets.  As we now have 32bit register set support, we 
could make
i32 type as legal type to prevent it be promoted into i64, then hook it 
up with i32
ALU patterns, will look into this.

Regards,
Jiong

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-19 23:20   ` Jiong Wang
@ 2017-09-19 23:24     ` Daniel Borkmann
  2017-09-21 18:56     ` [iovisor-dev] " Alexei Starovoitov
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel Borkmann @ 2017-09-19 23:24 UTC (permalink / raw)
  To: Jiong Wang; +Cc: xdp-newbies, llvm-dev, iovisor-dev, oss-drivers, ys114321

On 09/20/2017 01:20 AM, Jiong Wang wrote:
> On 18/09/2017 22:29, Daniel Borkmann wrote:
[...]
>> Great to see work in this direction! Can we also enable to use / emit
>> all the 32bit BPF_ALU instructions whenever possible for the currently
>> available bpf targets while at it (which only use BPF_ALU64 right now)?
>
> Hi Daniel,
>
>     Thanks for the feedback.
>
>     I think we could also enable the use of all the 32bit BPF_ALU under currently
> available bpf targets.  As we now have 32bit register set support, we could make
> i32 type as legal type to prevent it be promoted into i64, then hook it up with i32
> ALU patterns, will look into this.

Great, thanks!

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-19 23:20   ` Jiong Wang
  2017-09-19 23:24     ` Daniel Borkmann
@ 2017-09-21 18:56     ` Alexei Starovoitov
  2017-09-21 19:36       ` Daniel Borkmann
  2017-09-22 16:24       ` [oss-drivers] " Jakub Kicinski
  1 sibling, 2 replies; 20+ messages in thread
From: Alexei Starovoitov @ 2017-09-21 18:56 UTC (permalink / raw)
  To: Jiong Wang
  Cc: Daniel Borkmann, xdp-newbies, llvm-dev, iovisor-dev, oss-drivers, yhs

On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:
> On 18/09/2017 22:29, Daniel Borkmann wrote:
> > On 09/18/2017 10:47 PM, Jiong Wang wrote:
> > > Hi,
> > > 
> > >    Currently, LLVM eBPF backend always generate code in 64-bit mode,
> > > this may
> > > cause troubles when JITing to 32-bit targets.
> > > 
> > >    For example, it is quite common for XDP eBPF program to access
> > > some packet
> > > fields through base + offset that the default eBPF will generate
> > > BPF_ALU64 for
> > > the address formation, later when JITing to 32-bit hardware,
> > > BPF_ALU64 needs
> > > to be expanded into 32 bit ALU sequences even though the address
> > > space is
> > > 32-bit that the high bits is not significant.
> > > 
> > >    While a complete 32-bit mode implemention may need an new ABI
> > > (something like
> > > -target-abi=ilp32), this patch set first add some initial code so we
> > > could
> > > construct 32-bit eBPF tests through hand-written assembly.
> > > 
> > >    A new 32-bit register set is introduced, its name is with "w"
> > > prefix and LLVM
> > > assembler will encode statements like "w1 += w2" into the following
> > > 8-bit code
> > > field:
> > > 
> > >      BPF_ADD | BPF_X | BPF_ALU
> > > 
> > > BPF_ALU will be used instead of BPF_ALU64.
> > > 
> > >    NOTE, currently you can only use "w" register with ALU
> > > statements, not with
> > > others like branches etc as they don't have different encoding for
> > > 32-bit
> > > target.
> > 
> > Great to see work in this direction! Can we also enable to use / emit
> > all the 32bit BPF_ALU instructions whenever possible for the currently
> > available bpf targets while at it (which only use BPF_ALU64 right now)?
> 
> Hi Daniel,
> 
>    Thanks for the feedback.
> 
>    I think we could also enable the use of all the 32bit BPF_ALU under
> currently
> available bpf targets.  As we now have 32bit register set support, we could
> make
> i32 type as legal type to prevent it be promoted into i64, then hook it up
> with i32
> ALU patterns, will look into this.

I don't think we need to gate 32bit alu generation with a flag.
Though interpreter and JITs support 32-bit since day one, the verifier
never seen such programs before, so some valid programs may get
rejected. After some time passes and we're sure that all progs
still work fine when they're optimized with 32-bit alu, we can flip
the switch in llvm and make it default.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-21 18:56     ` [iovisor-dev] " Alexei Starovoitov
@ 2017-09-21 19:36       ` Daniel Borkmann
  2017-09-22 16:24       ` [oss-drivers] " Jakub Kicinski
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel Borkmann @ 2017-09-21 19:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Jiong Wang
  Cc: xdp-newbies, llvm-dev, iovisor-dev, oss-drivers, yhs

On 09/21/2017 08:56 PM, Alexei Starovoitov wrote:
> On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:
>> On 18/09/2017 22:29, Daniel Borkmann wrote:
>>> On 09/18/2017 10:47 PM, Jiong Wang wrote:
>>>> Hi,
>>>>
>>>>     Currently, LLVM eBPF backend always generate code in 64-bit mode,
>>>> this may
>>>> cause troubles when JITing to 32-bit targets.
>>>>
>>>>     For example, it is quite common for XDP eBPF program to access
>>>> some packet
>>>> fields through base + offset that the default eBPF will generate
>>>> BPF_ALU64 for
>>>> the address formation, later when JITing to 32-bit hardware,
>>>> BPF_ALU64 needs
>>>> to be expanded into 32 bit ALU sequences even though the address
>>>> space is
>>>> 32-bit that the high bits is not significant.
>>>>
>>>>     While a complete 32-bit mode implemention may need an new ABI
>>>> (something like
>>>> -target-abi=ilp32), this patch set first add some initial code so we
>>>> could
>>>> construct 32-bit eBPF tests through hand-written assembly.
>>>>
>>>>     A new 32-bit register set is introduced, its name is with "w"
>>>> prefix and LLVM
>>>> assembler will encode statements like "w1 += w2" into the following
>>>> 8-bit code
>>>> field:
>>>>
>>>>       BPF_ADD | BPF_X | BPF_ALU
>>>>
>>>> BPF_ALU will be used instead of BPF_ALU64.
>>>>
>>>>     NOTE, currently you can only use "w" register with ALU
>>>> statements, not with
>>>> others like branches etc as they don't have different encoding for
>>>> 32-bit
>>>> target.
>>>
>>> Great to see work in this direction! Can we also enable to use / emit
>>> all the 32bit BPF_ALU instructions whenever possible for the currently
>>> available bpf targets while at it (which only use BPF_ALU64 right now)?
>>
>> Hi Daniel,
>>
>>     Thanks for the feedback.
>>
>>     I think we could also enable the use of all the 32bit BPF_ALU under
>> currently
>> available bpf targets.  As we now have 32bit register set support, we could
>> make
>> i32 type as legal type to prevent it be promoted into i64, then hook it up
>> with i32
>> ALU patterns, will look into this.
>
> I don't think we need to gate 32bit alu generation with a flag.
> Though interpreter and JITs support 32-bit since day one, the verifier
> never seen such programs before, so some valid programs may get
> rejected. After some time passes and we're sure that all progs
> still work fine when they're optimized with 32-bit alu, we can flip
> the switch in llvm and make it default.

Sounds good to me! Could this be done for bpf target via -mattr=+alu32
feature flag or similar in llc?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 3/4] New 32-bit register set
  2017-09-19 23:10     ` Jiong Wang
@ 2017-09-22  4:55       ` Y Song
  0 siblings, 0 replies; 20+ messages in thread
From: Y Song @ 2017-09-22  4:55 UTC (permalink / raw)
  To: Jiong Wang
  Cc: xdp-newbies, LLVM Developers Mailing List, oss-drivers,
	Tom Herbert via iovisor-dev

Hi, Jiong,

The new patch looks good. I did some basic testing on
net-next:samples/bpf and net-next:tools/testing/selftests/bpf and it
works fine. All existing llvm unit tests are not impacted as well as
expected.

I have applied the patch to the trunk. Besides your other work to
support 32bit abi, it would be
interesting to see how new 32bit register can be used in 64bit
architecture which may help improve performance and/or reduce
instruction count.

Thanks,
Yonghong

On Tue, Sep 19, 2017 at 4:10 PM, Jiong Wang <jiong.wang@netronome.com> wrote:
> On 19/09/2017 07:44, Y Song wrote:
>>
>> Hi, Jiong,
>>
>> Thanks for the patch! It is a great start to support 32bit register in
>> BPF.
>> In the past, I have studied a little bit to see whether 32bit register
>> support may reduce
>> the number of unnecessary shifts on x86_64 and improve the
>> performance. Looking through
>> a few bpf programs and it looks like the opportunity is not great, but
>> still nice to have if we
>> have this capability. As you mentioned, this definitely helped 32bit
>> architecture!
>>
>> I am not an expert in LLVM tablegen. I briefly looked through the code
>> change and it looks like
>> correct. Hopefully some llvm-dev tablegen experts can comment on the
>> implementation.
>>
>> Below I only have a couple of minor comments.
>
>
> Yong Hong,
>
>   Thanks for the review.
>
>   I have addressed your comments and attached the updated patch.
>
>   Do you want me to put this patch set on to llvm review website? I guess it
> is the
>   formal review channel?
>
> Regards,
> Jiong
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [oss-drivers] Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-21 18:56     ` [iovisor-dev] " Alexei Starovoitov
  2017-09-21 19:36       ` Daniel Borkmann
@ 2017-09-22 16:24       ` Jakub Kicinski
  2017-09-23  5:03         ` Yonghong Song
  1 sibling, 1 reply; 20+ messages in thread
From: Jakub Kicinski @ 2017-09-22 16:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiong Wang, Daniel Borkmann, xdp-newbies, llvm-dev, iovisor-dev,
	oss-drivers, yhs

On Thu, 21 Sep 2017 11:56:55 -0700, Alexei Starovoitov wrote:
> On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:
> > On 18/09/2017 22:29, Daniel Borkmann wrote:  
> > > On 09/18/2017 10:47 PM, Jiong Wang wrote:  
> > > > Hi,
> > > > 
> > > >    Currently, LLVM eBPF backend always generate code in 64-bit mode,
> > > > this may
> > > > cause troubles when JITing to 32-bit targets.
> > > > 
> > > >    For example, it is quite common for XDP eBPF program to access
> > > > some packet
> > > > fields through base + offset that the default eBPF will generate
> > > > BPF_ALU64 for
> > > > the address formation, later when JITing to 32-bit hardware,
> > > > BPF_ALU64 needs
> > > > to be expanded into 32 bit ALU sequences even though the address
> > > > space is
> > > > 32-bit that the high bits is not significant.
> > > > 
> > > >    While a complete 32-bit mode implemention may need an new ABI
> > > > (something like
> > > > -target-abi=ilp32), this patch set first add some initial code so we
> > > > could
> > > > construct 32-bit eBPF tests through hand-written assembly.
> > > > 
> > > >    A new 32-bit register set is introduced, its name is with "w"
> > > > prefix and LLVM
> > > > assembler will encode statements like "w1 += w2" into the following
> > > > 8-bit code
> > > > field:
> > > > 
> > > >      BPF_ADD | BPF_X | BPF_ALU
> > > > 
> > > > BPF_ALU will be used instead of BPF_ALU64.
> > > > 
> > > >    NOTE, currently you can only use "w" register with ALU
> > > > statements, not with
> > > > others like branches etc as they don't have different encoding for
> > > > 32-bit
> > > > target.  
> > > 
> > > Great to see work in this direction! Can we also enable to use / emit
> > > all the 32bit BPF_ALU instructions whenever possible for the currently
> > > available bpf targets while at it (which only use BPF_ALU64 right now)?  
> > 
> > Hi Daniel,
> > 
> >    Thanks for the feedback.
> > 
> >    I think we could also enable the use of all the 32bit BPF_ALU under
> > currently
> > available bpf targets.  As we now have 32bit register set support, we could
> > make
> > i32 type as legal type to prevent it be promoted into i64, then hook it up
> > with i32
> > ALU patterns, will look into this.  
> 
> I don't think we need to gate 32bit alu generation with a flag.
> Though interpreter and JITs support 32-bit since day one, the verifier
> never seen such programs before, so some valid programs may get
> rejected. After some time passes and we're sure that all progs
> still work fine when they're optimized with 32-bit alu, we can flip
> the switch in llvm and make it default.

Thinking about next steps - do we expect the 32b operations to clear the
upper halves of the registers?  The interpreter does it, and so does
x86.  I don't think we can load 32bit-only programs on 64bit hosts, so
we would need some form of data flow analysis in the kernel to prune
the zeroing for 32bit offload targets.  Is that correct?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [oss-drivers] Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-22 16:24       ` [oss-drivers] " Jakub Kicinski
@ 2017-09-23  5:03         ` Yonghong Song
  2017-09-23  8:41           ` Jakub Kicinski
  0 siblings, 1 reply; 20+ messages in thread
From: Yonghong Song @ 2017-09-23  5:03 UTC (permalink / raw)
  To: Jakub Kicinski, Alexei Starovoitov
  Cc: Jiong Wang, Daniel Borkmann, xdp-newbies, llvm-dev, iovisor-dev,
	oss-drivers



On 9/22/17 9:24 AM, Jakub Kicinski wrote:
> On Thu, 21 Sep 2017 11:56:55 -0700, Alexei Starovoitov wrote:
>> On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:
>>> On 18/09/2017 22:29, Daniel Borkmann wrote:
>>>> On 09/18/2017 10:47 PM, Jiong Wang wrote:
>>>>> Hi,
>>>>>
>>>>>     Currently, LLVM eBPF backend always generate code in 64-bit mode,
>>>>> this may
>>>>> cause troubles when JITing to 32-bit targets.
>>>>>
>>>>>     For example, it is quite common for XDP eBPF program to access
>>>>> some packet
>>>>> fields through base + offset that the default eBPF will generate
>>>>> BPF_ALU64 for
>>>>> the address formation, later when JITing to 32-bit hardware,
>>>>> BPF_ALU64 needs
>>>>> to be expanded into 32 bit ALU sequences even though the address
>>>>> space is
>>>>> 32-bit that the high bits is not significant.
>>>>>
>>>>>     While a complete 32-bit mode implemention may need an new ABI
>>>>> (something like
>>>>> -target-abi=ilp32), this patch set first add some initial code so we
>>>>> could
>>>>> construct 32-bit eBPF tests through hand-written assembly.
>>>>>
>>>>>     A new 32-bit register set is introduced, its name is with "w"
>>>>> prefix and LLVM
>>>>> assembler will encode statements like "w1 += w2" into the following
>>>>> 8-bit code
>>>>> field:
>>>>>
>>>>>       BPF_ADD | BPF_X | BPF_ALU
>>>>>
>>>>> BPF_ALU will be used instead of BPF_ALU64.
>>>>>
>>>>>     NOTE, currently you can only use "w" register with ALU
>>>>> statements, not with
>>>>> others like branches etc as they don't have different encoding for
>>>>> 32-bit
>>>>> target.
>>>>
>>>> Great to see work in this direction! Can we also enable to use / emit
>>>> all the 32bit BPF_ALU instructions whenever possible for the currently
>>>> available bpf targets while at it (which only use BPF_ALU64 right now)?
>>>
>>> Hi Daniel,
>>>
>>>     Thanks for the feedback.
>>>
>>>     I think we could also enable the use of all the 32bit BPF_ALU under
>>> currently
>>> available bpf targets.  As we now have 32bit register set support, we could
>>> make
>>> i32 type as legal type to prevent it be promoted into i64, then hook it up
>>> with i32
>>> ALU patterns, will look into this.
>>
>> I don't think we need to gate 32bit alu generation with a flag.
>> Though interpreter and JITs support 32-bit since day one, the verifier
>> never seen such programs before, so some valid programs may get
>> rejected. After some time passes and we're sure that all progs
>> still work fine when they're optimized with 32-bit alu, we can flip
>> the switch in llvm and make it default.
> 
> Thinking about next steps - do we expect the 32b operations to clear the
> upper halves of the registers?  The interpreter does it, and so does
> x86.  I don't think we can load 32bit-only programs on 64bit hosts, so
> we would need some form of data flow analysis in the kernel to prune
> the zeroing for 32bit offload targets.  Is that correct?

Could you contrive an example to show the problem? If I understand 
correctly, you most worried that some natural sign extension is gone
with "clearing the upper 32-bit register" and such clearing may make
some operation, esp. memory operation not correct in 64-bit machine?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-23  5:03         ` Yonghong Song
@ 2017-09-23  8:41           ` Jakub Kicinski
  2017-09-23 14:42             ` Y Song
  2017-09-24  5:46             ` Alexei Starovoitov
  0 siblings, 2 replies; 20+ messages in thread
From: Jakub Kicinski @ 2017-09-23  8:41 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Alexei Starovoitov, Jiong Wang, Daniel Borkmann, xdp-newbies,
	llvm-dev, iovisor-dev, oss-drivers

On Fri, 22 Sep 2017 22:03:47 -0700, Yonghong Song wrote:
> On 9/22/17 9:24 AM, Jakub Kicinski wrote:
> > On Thu, 21 Sep 2017 11:56:55 -0700, Alexei Starovoitov wrote:  
> >> On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:  
> >>> On 18/09/2017 22:29, Daniel Borkmann wrote:  
> >>>> On 09/18/2017 10:47 PM, Jiong Wang wrote:  
> >>>>> Hi,
> >>>>>
> >>>>>     Currently, LLVM eBPF backend always generate code in 64-bit mode,
> >>>>> this may
> >>>>> cause troubles when JITing to 32-bit targets.
> >>>>>
> >>>>>     For example, it is quite common for XDP eBPF program to access
> >>>>> some packet
> >>>>> fields through base + offset that the default eBPF will generate
> >>>>> BPF_ALU64 for
> >>>>> the address formation, later when JITing to 32-bit hardware,
> >>>>> BPF_ALU64 needs
> >>>>> to be expanded into 32 bit ALU sequences even though the address
> >>>>> space is
> >>>>> 32-bit that the high bits is not significant.
> >>>>>
> >>>>>     While a complete 32-bit mode implemention may need an new ABI
> >>>>> (something like
> >>>>> -target-abi=ilp32), this patch set first add some initial code so we
> >>>>> could
> >>>>> construct 32-bit eBPF tests through hand-written assembly.
> >>>>>
> >>>>>     A new 32-bit register set is introduced, its name is with "w"
> >>>>> prefix and LLVM
> >>>>> assembler will encode statements like "w1 += w2" into the following
> >>>>> 8-bit code
> >>>>> field:
> >>>>>
> >>>>>       BPF_ADD | BPF_X | BPF_ALU
> >>>>>
> >>>>> BPF_ALU will be used instead of BPF_ALU64.
> >>>>>
> >>>>>     NOTE, currently you can only use "w" register with ALU
> >>>>> statements, not with
> >>>>> others like branches etc as they don't have different encoding for
> >>>>> 32-bit
> >>>>> target.  
> >>>>
> >>>> Great to see work in this direction! Can we also enable to use / emit
> >>>> all the 32bit BPF_ALU instructions whenever possible for the currently
> >>>> available bpf targets while at it (which only use BPF_ALU64 right now)?  
> >>>
> >>> Hi Daniel,
> >>>
> >>>     Thanks for the feedback.
> >>>
> >>>     I think we could also enable the use of all the 32bit BPF_ALU under
> >>> currently
> >>> available bpf targets.  As we now have 32bit register set support, we could
> >>> make
> >>> i32 type as legal type to prevent it be promoted into i64, then hook it up
> >>> with i32
> >>> ALU patterns, will look into this.  
> >>
> >> I don't think we need to gate 32bit alu generation with a flag.
> >> Though interpreter and JITs support 32-bit since day one, the verifier
> >> never seen such programs before, so some valid programs may get
> >> rejected. After some time passes and we're sure that all progs
> >> still work fine when they're optimized with 32-bit alu, we can flip
> >> the switch in llvm and make it default.  
> > 
> > Thinking about next steps - do we expect the 32b operations to clear the
> > upper halves of the registers?  The interpreter does it, and so does
> > x86.  I don't think we can load 32bit-only programs on 64bit hosts, so
> > we would need some form of data flow analysis in the kernel to prune
> > the zeroing for 32bit offload targets.  Is that correct?  
> 
> Could you contrive an example to show the problem? If I understand 
> correctly, you most worried that some natural sign extension is gone
> with "clearing the upper 32-bit register" and such clearing may make
> some operation, esp. memory operation not correct in 64-bit machine?

Hm.  Perhaps it's a blunder on my side, but let's take:

  r1 = ~0ULL
  w1 = 0
  # use r1

on x86 and the interpreter, the w1 = 0 will clear upper 32bits, so r1
ends up as 0.  32b arches may translate this to something like:

  # r1 = ~0ULL
  r1.lo = ~0
  r1.hi = ~0
  # w1 = 0
  r1.lo = 0
  # r1.hi not touched

which will obviously result in r1 == 0xffffffff00000000.  LLVM should
not assume r1.hi is cleared, but I'm not sure this is a strong enough
argument.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-23  8:41           ` Jakub Kicinski
@ 2017-09-23 14:42             ` Y Song
  2017-09-24  5:46             ` Alexei Starovoitov
  1 sibling, 0 replies; 20+ messages in thread
From: Y Song @ 2017-09-23 14:42 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Yonghong Song, xdp-newbies, Tom Herbert via iovisor-dev,
	LLVM Developers Mailing List, oss-drivers

On Sat, Sep 23, 2017 at 1:41 AM, Jakub Kicinski via iovisor-dev
<iovisor-dev@lists.iovisor.org> wrote:
> On Fri, 22 Sep 2017 22:03:47 -0700, Yonghong Song wrote:
>> On 9/22/17 9:24 AM, Jakub Kicinski wrote:
>> > On Thu, 21 Sep 2017 11:56:55 -0700, Alexei Starovoitov wrote:
>> >> On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:
>> >>> On 18/09/2017 22:29, Daniel Borkmann wrote:
>> >>>> On 09/18/2017 10:47 PM, Jiong Wang wrote:
>> >>>>> Hi,
>> >>>>>
>> >>>>>     Currently, LLVM eBPF backend always generate code in 64-bit mode,
>> >>>>> this may
>> >>>>> cause troubles when JITing to 32-bit targets.
>> >>>>>
>> >>>>>     For example, it is quite common for XDP eBPF program to access
>> >>>>> some packet
>> >>>>> fields through base + offset that the default eBPF will generate
>> >>>>> BPF_ALU64 for
>> >>>>> the address formation, later when JITing to 32-bit hardware,
>> >>>>> BPF_ALU64 needs
>> >>>>> to be expanded into 32 bit ALU sequences even though the address
>> >>>>> space is
>> >>>>> 32-bit that the high bits is not significant.
>> >>>>>
>> >>>>>     While a complete 32-bit mode implemention may need an new ABI
>> >>>>> (something like
>> >>>>> -target-abi=ilp32), this patch set first add some initial code so we
>> >>>>> could
>> >>>>> construct 32-bit eBPF tests through hand-written assembly.
>> >>>>>
>> >>>>>     A new 32-bit register set is introduced, its name is with "w"
>> >>>>> prefix and LLVM
>> >>>>> assembler will encode statements like "w1 += w2" into the following
>> >>>>> 8-bit code
>> >>>>> field:
>> >>>>>
>> >>>>>       BPF_ADD | BPF_X | BPF_ALU
>> >>>>>
>> >>>>> BPF_ALU will be used instead of BPF_ALU64.
>> >>>>>
>> >>>>>     NOTE, currently you can only use "w" register with ALU
>> >>>>> statements, not with
>> >>>>> others like branches etc as they don't have different encoding for
>> >>>>> 32-bit
>> >>>>> target.
>> >>>>
>> >>>> Great to see work in this direction! Can we also enable to use / emit
>> >>>> all the 32bit BPF_ALU instructions whenever possible for the currently
>> >>>> available bpf targets while at it (which only use BPF_ALU64 right now)?
>> >>>
>> >>> Hi Daniel,
>> >>>
>> >>>     Thanks for the feedback.
>> >>>
>> >>>     I think we could also enable the use of all the 32bit BPF_ALU under
>> >>> currently
>> >>> available bpf targets.  As we now have 32bit register set support, we could
>> >>> make
>> >>> i32 type as legal type to prevent it be promoted into i64, then hook it up
>> >>> with i32
>> >>> ALU patterns, will look into this.
>> >>
>> >> I don't think we need to gate 32bit alu generation with a flag.
>> >> Though interpreter and JITs support 32-bit since day one, the verifier
>> >> never seen such programs before, so some valid programs may get
>> >> rejected. After some time passes and we're sure that all progs
>> >> still work fine when they're optimized with 32-bit alu, we can flip
>> >> the switch in llvm and make it default.
>> >
>> > Thinking about next steps - do we expect the 32b operations to clear the
>> > upper halves of the registers?  The interpreter does it, and so does
>> > x86.  I don't think we can load 32bit-only programs on 64bit hosts, so
>> > we would need some form of data flow analysis in the kernel to prune
>> > the zeroing for 32bit offload targets.  Is that correct?
>>
>> Could you contrive an example to show the problem? If I understand
>> correctly, you most worried that some natural sign extension is gone
>> with "clearing the upper 32-bit register" and such clearing may make
>> some operation, esp. memory operation not correct in 64-bit machine?
>
> Hm.  Perhaps it's a blunder on my side, but let's take:
>
>   r1 = ~0ULL
>   w1 = 0
>   # use r1
>
> on x86 and the interpreter, the w1 = 0 will clear upper 32bits, so r1
> ends up as 0.  32b arches may translate this to something like:
>
>   # r1 = ~0ULL
>   r1.lo = ~0
>   r1.hi = ~0
>   # w1 = 0
>   r1.lo = 0
>   # r1.hi not touched
>
> which will obviously result in r1 == 0xffffffff00000000.  LLVM should
> not assume r1.hi is cleared, but I'm not sure this is a strong enough
> argument.

Not sure what LLVM will do in this case  for later "r1" access
unless going through the real implementation. My hunch is LLVM
should do a conversion from 32bit to 64bit, "r1 <<= 32" and
"r1 >>= 32" after "w1 = 0" before using r1. Let us wait and check
once implementation in place.

> _______________________________________________
> iovisor-dev mailing list
> iovisor-dev@lists.iovisor.org
> https://lists.iovisor.org/mailman/listinfo/iovisor-dev

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support
  2017-09-23  8:41           ` Jakub Kicinski
  2017-09-23 14:42             ` Y Song
@ 2017-09-24  5:46             ` Alexei Starovoitov
  1 sibling, 0 replies; 20+ messages in thread
From: Alexei Starovoitov @ 2017-09-24  5:46 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Yonghong Song, Jiong Wang, Daniel Borkmann, xdp-newbies,
	llvm-dev, iovisor-dev, oss-drivers

On Sat, Sep 23, 2017 at 10:41:25AM +0200, Jakub Kicinski wrote:
> > > Thinking about next steps - do we expect the 32b operations to clear the
> > > upper halves of the registers?  The interpreter does it, and so does
> > > x86.  I don't think we can load 32bit-only programs on 64bit hosts, so
> > > we would need some form of data flow analysis in the kernel to prune
> > > the zeroing for 32bit offload targets.  Is that correct?  
> > 
> > Could you contrive an example to show the problem? If I understand 
> > correctly, you most worried that some natural sign extension is gone
> > with "clearing the upper 32-bit register" and such clearing may make
> > some operation, esp. memory operation not correct in 64-bit machine?
> 
> Hm.  Perhaps it's a blunder on my side, but let's take:
> 
>   r1 = ~0ULL
>   w1 = 0
>   # use r1
> 
> on x86 and the interpreter, the w1 = 0 will clear upper 32bits, so r1
> ends up as 0.  32b arches may translate this to something like:
> 
>   # r1 = ~0ULL
>   r1.lo = ~0
>   r1.hi = ~0
>   # w1 = 0
>   r1.lo = 0
>   # r1.hi not touched
> 
> which will obviously result in r1 == 0xffffffff00000000.  LLVM should
> not assume r1.hi is cleared, but I'm not sure this is a strong enough
> argument.

llvm will assume that r1.hi is cleared. 32-bit subregisters were
defined on the day one. See Documentation/networking/filter.txt
"All eBPF registers are 64-bit with 32-bit lower
 subregisters that zero-extend into 64-bit if they are being written to."
If some JIT is not clearing upper bits, it's a bug or it's being too smart :)
We can add analysis pass to the verifier to help JITs in such case.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-09-24  5:46 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-18 20:47 [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Jiong Wang
2017-09-18 20:47 ` [PATCH RFC 1/4] Improve instruction encoding descriptions Jiong Wang
2017-09-18 20:47 ` [PATCH RFC 2/4] Improve class inheritance in instruction patterns Jiong Wang
2017-09-18 20:47 ` [PATCH RFC 3/4] New 32-bit register set Jiong Wang
2017-09-19  6:44   ` [iovisor-dev] " Y Song
2017-09-19 23:10     ` Jiong Wang
2017-09-22  4:55       ` Y Song
2017-09-18 20:47 ` [PATCH RFC 4/4] Initial 32-bit ALU encoding support in assembler Jiong Wang
2017-09-18 21:29 ` [PATCH RFC 0/4] Initial 32-bit eBPF encoding support Daniel Borkmann
2017-09-19 23:20   ` Jiong Wang
2017-09-19 23:24     ` Daniel Borkmann
2017-09-21 18:56     ` [iovisor-dev] " Alexei Starovoitov
2017-09-21 19:36       ` Daniel Borkmann
2017-09-22 16:24       ` [oss-drivers] " Jakub Kicinski
2017-09-23  5:03         ` Yonghong Song
2017-09-23  8:41           ` Jakub Kicinski
2017-09-23 14:42             ` Y Song
2017-09-24  5:46             ` Alexei Starovoitov
2017-09-19  4:49 ` Fulvio Risso
2017-09-19 22:56   ` Jiong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.