* [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
@ 2014-02-06  1:10 Alexei Starovoitov
From: Alexei Starovoitov @ 2014-02-06  1:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David S. Miller, Steven Rostedt, Peter Zijlstra, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
	Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
	Christoph Hellwig, linux-kernel, netdev

Hi All,

This patch set addresses the main sticking points of the previous discussion:
http://thread.gmane.org/gmane.linux.kernel/1605783

Main differences:
. all components are now in one place
  tools/bpf/llvm - standalone LLVM backend for extended BPF instruction set

. regs.si, regs.di accessors are replaced with arg1, arg2

. compiler enforces presence of a 'license' string in the source C code;
  kernel enforces GPL compatibility of the BPF program

Why bother with it?
Current 32-bit BPF is safe, but limited.
Kernel modules are 'all-goes', but not safe.
Extended 64-bit BPF provides safe and restricted kernel modules.

Just like the first two, extended BPF can be used for all sorts of things:
initially for tracing/debugging/[ks]tap-like use without vmlinux around,
then for networking, security, etc.

To make existing kernel modules safe, an x86 disassembler and code analyzer
are needed. We've tried to follow that path. The disassembler was
straightforward, but the x86 analyzer was becoming unbearably complex due to
the variety of addressing modes, so we started to hack GCC to reduce the output
x86 insns and faced the headache of redoing the disasm/analyzer for arm and
other archs.
Plus there is the old 32-bit bpf insn set already.
On one side, extended BPF is a 64-bit extension to the current BPF.
On the other side, it's a common subset of x86-64/aarch64/... ISAs:
a generic 64-bit insn set that can be JITed to native HW one to one.
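
For a feel of the insn set, here is a minimal sketch (not part of the patches)
that builds a trivial program with the BPF_INSN_* helper macros which patch 1/7
defines in include/linux/bpf.h:
-----
/* sketch: "r0 = 0; ret", built from the helper macros */
struct bpf_insn prog[] = {
	BPF_INSN_ALU_IMM(BPF_MOV, R0, 0), /* r0 = 0 */
	BPF_INSN_RET(),                   /* return r0 */
};
-----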

Tested on x86-64 and i386.
BPF core was tested on arm-v7.

V2 vs V1 details:
0001-Extended-BPF-core-framework:
  no changes to the instruction set
  new bpf image format that includes the license string and enforces it
  during load (layout sketched below)
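
  Roughly, per the comment in include/linux/bpf_jit.h, the image layout is:
  -----
  4 bytes  "bpf\0" magic
  4 bytes  size of string table in bytes
  strtab   zero-separated ascii strings
  {
    4 bytes  size of next section in bytes
    4 bytes  index into strtab of the section name
    N bytes  section data
  } repeated; "license" and "bpftables" are two such sections
  -----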

0002-Extended-BPF-JIT-for-x86-64: no changes

0003-Extended-BPF-64-bit-BPF-design-document: no changes

0004-Revert-x86-ptrace-Remove-unused-regs_get_argument:
  restoring Masami's get_Nth_argument accessor to simplify kprobe filters

0005-use-BPF-in-tracing-filters: minor changes to switch from si/di to argN

0006-LLVM-BPF-backend: standalone BPF backend for LLVM
  requires: apt-get install llvm-3.2-dev clang
  compiles in 7 seconds, links with the rest of llvm infra
  compatible with llvm 3.2, 3.3 and just released 3.4
  Written in LLVM coding style and under the LLVM license, so it can be
  upstreamed into the LLVM tree

0007-tracing-filter-examples-in-BPF:
  tools/bpf/filter_check: userspace pre-checker of BPF filters
  runs the same bpf_check() code as the kernel does

  tools/bpf/examples/netif_rcv.c:
-----
#define DESC(NAME) __attribute__((section(NAME), used))
void my_filter(struct bpf_context *ctx)
{
        char devname[4] = "lo";
        struct net_device *dev;
        struct sk_buff *skb = 0;

        /*
         * for tracepoints arg1 is the 1st arg of TP_ARGS() macro
         * defined in include/trace/events/.h
         * for kprobe events arg1 is the 1st arg of probed function
         */
        skb = (struct sk_buff *)ctx->arg1;

        dev = bpf_load_pointer(&skb->dev);
        if (bpf_memcmp(dev->name, devname, 2) == 0) {
                char fmt[] = "skb %p dev %p \n";
                bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)dev, 0);
        }
}
/* filter code license: */
char license[] DESC("license") = "GPL";
-----

$cd tools/bpf/examples
$make
  compile it using clang+llvm_bpf
$make check
  check safety
$make try
  attach this filter to net:netif_receive_skb and kprobe __netif_receive_skb
  and try ping

dropmon.c is a demo of a faster version of net_dropmonitor:
-----
/* attaches to /sys/kernel/debug/tracing/events/skb/kfree_skb */
void dropmon(struct bpf_context *ctx)
{
        void *loc;
        uint64_t *drop_cnt;

        /*
         * skb:kfree_skb is defined as:
         * TRACE_EVENT(kfree_skb,
         *         TP_PROTO(struct sk_buff *skb, void *location),
         * so ctx->arg2 is 'location'
         */
        loc = (void *)ctx->arg2;

        drop_cnt = bpf_table_lookup(ctx, 0, &loc);
        if (drop_cnt) {
                __sync_fetch_and_add(drop_cnt, 1);
        } else {
                uint64_t init = 0;
                bpf_table_update(ctx, 0, &loc, &init);
        }
}
struct bpf_table t[] DESC("bpftables") = {
        {BPF_TABLE_HASH, sizeof(void *), sizeof(uint64_t), 4096, 0}
};
/* filter code license: */
char l[] DESC("license") = "GPL v2";
-----
It's not fully functional yet: minimal work remains to implement
bpf_table_lookup()/bpf_table_update() in the kernel
and userspace access to the filter's table.

This example demonstrates that some interesting events don't always have to be
fed into userspace, but can be pre-processed in the kernel.
tools/perf/scripts/python/net_dropmonitor.py would need to read the bpf table
from the kernel (via debugfs or netlink) and print it in a nice format.
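
A purely hypothetical sketch of such a reader (the kernel-side export isn't
implemented yet; the debugfs path and record layout below are invented for
illustration only):
-----
/* HYPOTHETICAL: assumes the table is exported as fixed-size binary records */
#include <stdio.h>
#include <stdint.h>

struct drop_rec {               /* mirrors bpf_table t[0] in dropmon.c */
	void *loc;              /* key: kfree_skb call site */
	uint64_t drop_cnt;      /* elem: number of drops */
};

int main(void)
{
	struct drop_rec r;
	FILE *f = fopen("/sys/kernel/debug/bpf/dropmon", "rb"); /* made-up path */

	if (!f)
		return 1;
	while (fread(&r, sizeof(r), 1, f) == 1)
		printf("location %p: %llu drops\n", r.loc,
		       (unsigned long long)r.drop_cnt);
	fclose(f);
	return 0;
}
-----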

Same as in V1, BPF filters are called before tracepoints store the TP_STRUCT
fields, since the performance advantage is significant.

TODO:

- complete 'dropmonitor': finish bpf hashtable and userspace access to it

- add multi-probe support, so that one C program can specify multiple
  functions for different probe points (similar to [ks]tap)

- add 'lsmod' like facility to list all loaded BPF filters

- add -m32 flag to llvm, so that C pointers are 32-bit,
  but emitted BPF is still 64-bit.
  Useful for kernel struct walking in BPF programs on 32-bit archs

- finish testing on arm

- teach llvm to store line numbers in the BPF image, so that bpf_check()
  can print nice errors when a program is not safe

- allow read-only "strings" in C code
  today the analyzer can only verify the safety of: char s[] = "string"; bpf_print(s);
  but bpf_print("string"); cannot be proven safe yet

- write JIT from BPF to aarch64

- refactor openvswitch + BPF proposal

If the direction is OK, I would like to commit this part to a branch of the tip
tree or the staging tree and continue working there.
Future deltas will be easier to review.

Thanks

Alexei Starovoitov (7):
  Extended BPF core framework
  Extended BPF JIT for x86-64
  Extended BPF (64-bit BPF) design document
  Revert "x86/ptrace: Remove unused regs_get_argument_nth API"
  use BPF in tracing filters
  LLVM BPF backend
  tracing filter examples in BPF

 Documentation/bpf_jit.txt                          |  204 ++++
 arch/x86/Kconfig                                   |    1 +
 arch/x86/include/asm/ptrace.h                      |    3 +
 arch/x86/kernel/ptrace.c                           |   24 +
 arch/x86/net/Makefile                              |    1 +
 arch/x86/net/bpf64_jit_comp.c                      |  625 ++++++++++++
 arch/x86/net/bpf_jit_comp.c                        |   23 +-
 arch/x86/net/bpf_jit_comp.h                        |   35 +
 include/linux/bpf.h                                |  149 +++
 include/linux/bpf_jit.h                            |  134 +++
 include/linux/ftrace_event.h                       |    5 +
 include/trace/bpf_trace.h                          |   41 +
 include/trace/ftrace.h                             |   17 +
 kernel/Makefile                                    |    1 +
 kernel/bpf_jit/Makefile                            |    3 +
 kernel/bpf_jit/bpf_check.c                         | 1054 ++++++++++++++++++++
 kernel/bpf_jit/bpf_run.c                           |  511 ++++++++++
 kernel/trace/Kconfig                               |    1 +
 kernel/trace/Makefile                              |    1 +
 kernel/trace/bpf_trace_callbacks.c                 |  193 ++++
 kernel/trace/trace.c                               |    7 +
 kernel/trace/trace.h                               |   11 +-
 kernel/trace/trace_events.c                        |    9 +-
 kernel/trace/trace_events_filter.c                 |   61 +-
 kernel/trace/trace_kprobe.c                        |   15 +-
 lib/Kconfig.debug                                  |   15 +
 tools/bpf/examples/Makefile                        |   71 ++
 tools/bpf/examples/README.txt                      |   59 ++
 tools/bpf/examples/dropmon.c                       |   40 +
 tools/bpf/examples/netif_rcv.c                     |   34 +
 tools/bpf/filter_check/Makefile                    |   32 +
 tools/bpf/filter_check/README.txt                  |    3 +
 tools/bpf/filter_check/trace_filter_check.c        |  115 +++
 tools/bpf/llvm/LICENSE.TXT                         |   70 ++
 tools/bpf/llvm/Makefile.rules                      |  641 ++++++++++++
 tools/bpf/llvm/README.txt                          |   23 +
 tools/bpf/llvm/bld/.gitignore                      |    2 +
 tools/bpf/llvm/bld/Makefile                        |   27 +
 tools/bpf/llvm/bld/Makefile.common                 |   14 +
 tools/bpf/llvm/bld/Makefile.config                 |  124 +++
 .../llvm/bld/include/llvm/Config/AsmParsers.def    |    8 +
 .../llvm/bld/include/llvm/Config/AsmPrinters.def   |    9 +
 .../llvm/bld/include/llvm/Config/Disassemblers.def |    8 +
 tools/bpf/llvm/bld/include/llvm/Config/Targets.def |    9 +
 .../bpf/llvm/bld/include/llvm/Support/DataTypes.h  |   96 ++
 tools/bpf/llvm/bld/lib/Makefile                    |   11 +
 .../llvm/bld/lib/Target/BPF/InstPrinter/Makefile   |   10 +
 .../llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile  |   11 +
 tools/bpf/llvm/bld/lib/Target/BPF/Makefile         |   17 +
 .../llvm/bld/lib/Target/BPF/TargetInfo/Makefile    |   10 +
 tools/bpf/llvm/bld/lib/Target/Makefile             |   11 +
 tools/bpf/llvm/bld/tools/Makefile                  |   12 +
 tools/bpf/llvm/bld/tools/llc/Makefile              |   15 +
 tools/bpf/llvm/lib/Target/BPF/BPF.h                |   30 +
 tools/bpf/llvm/lib/Target/BPF/BPF.td               |   29 +
 tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp    |  100 ++
 tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp      |   62 ++
 tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td    |   24 +
 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp |   36 +
 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h   |   35 +
 tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp  |  182 ++++
 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp  |  676 +++++++++++++
 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h    |  105 ++
 tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td   |   29 +
 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp     |  162 +++
 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h       |   53 +
 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td      |  455 +++++++++
 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp   |   77 ++
 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h     |   40 +
 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp  |  122 +++
 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h    |   65 ++
 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td   |   39 +
 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp     |   23 +
 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h       |   33 +
 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp |   72 ++
 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h   |   69 ++
 .../lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp  |   79 ++
 .../lib/Target/BPF/InstPrinter/BPFInstPrinter.h    |   34 +
 .../lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp  |   85 ++
 .../llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h |   33 +
 .../Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp |  119 +++
 .../lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h     |   34 +
 .../Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp   |  120 +++
 .../lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h |   67 ++
 .../Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp    |  115 +++
 .../lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h  |   56 ++
 .../lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp    |   13 +
 tools/bpf/llvm/tools/llc/llc.cpp                   |  381 +++++++
 88 files changed, 8255 insertions(+), 25 deletions(-)
 create mode 100644 Documentation/bpf_jit.txt
 create mode 100644 arch/x86/net/bpf64_jit_comp.c
 create mode 100644 arch/x86/net/bpf_jit_comp.h
 create mode 100644 include/linux/bpf.h
 create mode 100644 include/linux/bpf_jit.h
 create mode 100644 include/trace/bpf_trace.h
 create mode 100644 kernel/bpf_jit/Makefile
 create mode 100644 kernel/bpf_jit/bpf_check.c
 create mode 100644 kernel/bpf_jit/bpf_run.c
 create mode 100644 kernel/trace/bpf_trace_callbacks.c
 create mode 100644 tools/bpf/examples/Makefile
 create mode 100644 tools/bpf/examples/README.txt
 create mode 100644 tools/bpf/examples/dropmon.c
 create mode 100644 tools/bpf/examples/netif_rcv.c
 create mode 100644 tools/bpf/filter_check/Makefile
 create mode 100644 tools/bpf/filter_check/README.txt
 create mode 100644 tools/bpf/filter_check/trace_filter_check.c
 create mode 100644 tools/bpf/llvm/LICENSE.TXT
 create mode 100644 tools/bpf/llvm/Makefile.rules
 create mode 100644 tools/bpf/llvm/README.txt
 create mode 100644 tools/bpf/llvm/bld/.gitignore
 create mode 100644 tools/bpf/llvm/bld/Makefile
 create mode 100644 tools/bpf/llvm/bld/Makefile.common
 create mode 100644 tools/bpf/llvm/bld/Makefile.config
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/AsmParsers.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/AsmPrinters.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/Disassemblers.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/Targets.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Support/DataTypes.h
 create mode 100644 tools/bpf/llvm/bld/lib/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/InstPrinter/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/TargetInfo/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/Makefile
 create mode 100644 tools/bpf/llvm/bld/tools/Makefile
 create mode 100644 tools/bpf/llvm/bld/tools/llc/Makefile
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPF.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPF.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp
 create mode 100644 tools/bpf/llvm/tools/llc/llc.cpp

-- 
1.7.9.5



* [RFC PATCH v2 tip 1/7] Extended BPF core framework
From: Alexei Starovoitov @ 2014-02-06  1:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David S. Miller, Steven Rostedt, Peter Zijlstra, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
	Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
	Christoph Hellwig, linux-kernel, netdev

Extended BPF (or 64-bit BPF) is an instruction set for creating
safe, dynamically loadable filters that can call a fixed set
of kernel functions and take a generic bpf_context as input.
A BPF filter is the glue between kernel functions and bpf_context.
Different kernel subsystems can define their own set of available functions
and alter the BPF machinery for their specific use case.

include/linux/bpf.h - instruction set definition
kernel/bpf_jit/bpf_check.c - code safety checker/static analyzer
kernel/bpf_jit/bpf_run.c - emulator for archs without BPF64_JIT

The extended BPF instruction set is designed for efficient mapping to native
instructions on 64-bit CPUs.
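
As a rough sketch of how a subsystem might wire itself up (the my_* callbacks
and the image/image_len/ctx variables below are hypothetical; the API is
bpf_load_image()/bpf_callbacks from include/linux/bpf_jit.h in this patch):
-----
/* sketch only; my_* callbacks are hypothetical subsystem code */
static void my_execute_func(char *strtab, int id, u64 *regs) { /* ... */ }
static const struct bpf_func_proto *my_get_func_proto(char *strtab, int id)
{ /* ... */ return NULL; }
static const struct bpf_context_access *my_get_context_access(int off)
{ /* ... */ return NULL; }

static struct bpf_callbacks my_cb = {
	.execute_func       = my_execute_func,
	.get_func_proto     = my_get_func_proto,
	.get_context_access = my_get_context_access,
};

	/* load + verify an image, then run it against a subsystem ctx */
	struct bpf_program *prog;
	int err = bpf_load_image(image, image_len, &my_cb, &prog);
	if (!err)
		bpf_run(prog, ctx);
-----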

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 include/linux/bpf.h        |  149 +++++++
 include/linux/bpf_jit.h    |  134 ++++++
 kernel/Makefile            |    1 +
 kernel/bpf_jit/Makefile    |    3 +
 kernel/bpf_jit/bpf_check.c | 1054 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf_jit/bpf_run.c   |  511 +++++++++++++++++++++
 lib/Kconfig.debug          |   15 +
 7 files changed, 1867 insertions(+)
 create mode 100644 include/linux/bpf.h
 create mode 100644 include/linux/bpf_jit.h
 create mode 100644 kernel/bpf_jit/Makefile
 create mode 100644 kernel/bpf_jit/bpf_check.c
 create mode 100644 kernel/bpf_jit/bpf_run.c

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
new file mode 100644
index 0000000..a4e18e9
--- /dev/null
+++ b/include/linux/bpf.h
@@ -0,0 +1,149 @@
+/* 64-bit BPF is Copyright (c) 2011-2014, PLUMgrid, http://plumgrid.com */
+
+#ifndef __LINUX_BPF_H__
+#define __LINUX_BPF_H__
+
+#include <linux/types.h>
+
+struct bpf_insn {
+	__u8	code;    /* opcode */
+	__u8    a_reg:4; /* dest register */
+	__u8    x_reg:4; /* source register */
+	__s16	off;     /* signed offset */
+	__s32	imm;     /* signed immediate constant */
+};
+
+struct bpf_table {
+	__u32   type;
+	__u32   key_size;
+	__u32   elem_size;
+	__u32   max_entries;
+	__u32   param1;         /* meaning is table-dependent */
+};
+
+enum bpf_table_type {
+	BPF_TABLE_HASH = 1,
+	BPF_TABLE_LPM
+};
+
+/* maximum number of insns and tables in a BPF program */
+#define MAX_BPF_INSNS 4096
+#define MAX_BPF_TABLES 64
+#define MAX_BPF_STRTAB_SIZE 1024
+
+/* pointer to bpf_context is the first and only argument to BPF program
+ * its definition is use-case specific */
+struct bpf_context;
+
+/* bpf_add|sub|...: a += x
+ *         bpf_mov: a = x
+ *       bpf_bswap: bswap a */
+#define BPF_INSN_ALU(op, a, x) \
+	(struct bpf_insn){BPF_ALU|BPF_OP(op)|BPF_X, a, x, 0, 0}
+
+/* bpf_add|sub|...: a += imm
+ *         bpf_mov: a = imm */
+#define BPF_INSN_ALU_IMM(op, a, imm) \
+	(struct bpf_insn){BPF_ALU|BPF_OP(op)|BPF_K, a, 0, 0, imm}
+
+/* a = *(uint *) (x + off) */
+#define BPF_INSN_LD(size, a, x, off) \
+	(struct bpf_insn){BPF_LDX|BPF_SIZE(size)|BPF_REL, a, x, off, 0}
+
+/* *(uint *) (a + off) = x */
+#define BPF_INSN_ST(size, a, off, x) \
+	(struct bpf_insn){BPF_STX|BPF_SIZE(size)|BPF_REL, a, x, off, 0}
+
+/* *(uint *) (a + off) = imm */
+#define BPF_INSN_ST_IMM(size, a, off, imm) \
+	(struct bpf_insn){BPF_ST|BPF_SIZE(size)|BPF_REL, a, 0, off, imm}
+
+/* lock *(uint *) (a + off) += x */
+#define BPF_INSN_XADD(size, a, off, x) \
+	(struct bpf_insn){BPF_STX|BPF_SIZE(size)|BPF_XADD, a, x, off, 0}
+
+/* if (a 'op' x) pc += off else fall through */
+#define BPF_INSN_JUMP(op, a, x, off) \
+	(struct bpf_insn){BPF_JMP|BPF_OP(op)|BPF_X, a, x, off, 0}
+
+/* if (a 'op' imm) pc += off else fall through */
+#define BPF_INSN_JUMP_IMM(op, a, imm, off) \
+	(struct bpf_insn){BPF_JMP|BPF_OP(op)|BPF_K, a, 0, off, imm}
+
+#define BPF_INSN_RET() \
+	(struct bpf_insn){BPF_RET|BPF_K, 0, 0, 0, 0}
+
+#define BPF_INSN_CALL(fn_code) \
+	(struct bpf_insn){BPF_JMP|BPF_CALL, 0, 0, 0, fn_code}
+
+/* Instruction classes */
+#define BPF_CLASS(code) ((code) & 0x07)
+#define         BPF_LD          0x00
+#define         BPF_LDX         0x01
+#define         BPF_ST          0x02
+#define         BPF_STX         0x03
+#define         BPF_ALU         0x04
+#define         BPF_JMP         0x05
+#define         BPF_RET         0x06
+
+/* ld/ldx fields */
+#define BPF_SIZE(code)  ((code) & 0x18)
+#define         BPF_W           0x00
+#define         BPF_H           0x08
+#define         BPF_B           0x10
+#define         BPF_DW          0x18
+#define BPF_MODE(code)  ((code) & 0xe0)
+#define         BPF_IMM         0x00
+#define         BPF_ABS         0x20
+#define         BPF_IND         0x40
+#define         BPF_MEM         0x60
+#define         BPF_LEN         0x80
+#define         BPF_MSH         0xa0
+#define         BPF_REL         0xc0
+#define         BPF_XADD        0xe0 /* exclusive add */
+
+/* alu/jmp fields */
+#define BPF_OP(code)    ((code) & 0xf0)
+#define         BPF_ADD         0x00
+#define         BPF_SUB         0x10
+#define         BPF_MUL         0x20
+#define         BPF_DIV         0x30
+#define         BPF_OR          0x40
+#define         BPF_AND         0x50
+#define         BPF_LSH         0x60
+#define         BPF_RSH         0x70 /* logical shift right */
+#define         BPF_NEG         0x80
+#define         BPF_MOD         0x90
+#define         BPF_XOR         0xa0
+#define         BPF_MOV         0xb0 /* mov reg to reg */
+#define         BPF_ARSH        0xc0 /* sign extending arithmetic shift right */
+#define         BPF_BSWAP32     0xd0 /* swap lower 4 bytes of 64-bit register */
+#define         BPF_BSWAP64     0xe0 /* swap all 8 bytes of 64-bit register */
+
+#define         BPF_JA          0x00
+#define         BPF_JEQ         0x10 /* jump == */
+#define         BPF_JGT         0x20 /* GT is unsigned '>', JA in x86 */
+#define         BPF_JGE         0x30 /* GE is unsigned '>=', JAE in x86 */
+#define         BPF_JSET        0x40
+#define         BPF_JNE         0x50 /* jump != */
+#define         BPF_JSGT        0x60 /* SGT is signed '>', GT in x86 */
+#define         BPF_JSGE        0x70 /* SGE is signed '>=', GE in x86 */
+#define         BPF_CALL        0x80 /* function call */
+#define BPF_SRC(code)   ((code) & 0x08)
+#define         BPF_K           0x00
+#define         BPF_X           0x08
+
+/* 64-bit registers */
+#define         R0              0
+#define         R1              1
+#define         R2              2
+#define         R3              3
+#define         R4              4
+#define         R5              5
+#define         R6              6
+#define         R7              7
+#define         R8              8
+#define         R9              9
+#define         __fp__          10
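+
+/* calling convention, as verified by bpf_check():
+ * R0 - return value, R1-R5 - function arguments,
+ * R6-R9 - callee saved, R10 (__fp__) - read-only frame pointer
+ */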
+
+#endif /* __LINUX_BPF_H__ */
diff --git a/include/linux/bpf_jit.h b/include/linux/bpf_jit.h
new file mode 100644
index 0000000..170ea64
--- /dev/null
+++ b/include/linux/bpf_jit.h
@@ -0,0 +1,134 @@
+/* 64-bit BPF is Copyright (c) 2011-2014, PLUMgrid, http://plumgrid.com */
+
+#ifndef __LINUX_BPF_JIT_H__
+#define __LINUX_BPF_JIT_H__
+
+#include <linux/slab.h>
+#include <linux/workqueue.h>
+#include <linux/bpf.h>
+
+/*
+ * type of value stored in a BPF register or
+ * passed into function as an argument or
+ * returned from the function
+ */
+enum bpf_reg_type {
+	INVALID_PTR,  /* reg doesn't contain a valid pointer */
+	PTR_TO_CTX,   /* reg points to bpf_context */
+	PTR_TO_TABLE, /* reg points to table element */
+	PTR_TO_TABLE_CONDITIONAL, /* points to table element or NULL */
+	PTR_TO_STACK,     /* reg == frame_pointer */
+	PTR_TO_STACK_IMM, /* reg == frame_pointer + imm */
+	PTR_TO_STACK_IMM_TABLE_KEY, /* pointer to stack used as table key */
+	PTR_TO_STACK_IMM_TABLE_ELEM, /* pointer to stack used as table elem */
+	RET_INTEGER, /* function returns integer */
+	RET_VOID,    /* function returns void */
+	CONST_ARG,    /* function expects integer constant argument */
+	CONST_ARG_TABLE_ID, /* int const argument that is used as table_id */
+	/*
+	 * int const argument indicating number of bytes accessed from stack
+	 * previous function argument must be ptr_to_stack_imm
+	 */
+	CONST_ARG_STACK_IMM_SIZE,
+};
+
+/* BPF function prototype */
+struct bpf_func_proto {
+	enum bpf_reg_type ret_type;
+	enum bpf_reg_type arg1_type;
+	enum bpf_reg_type arg2_type;
+	enum bpf_reg_type arg3_type;
+	enum bpf_reg_type arg4_type;
+};
+
+/* struct bpf_context access type */
+enum bpf_access_type {
+	BPF_READ = 1,
+	BPF_WRITE = 2
+};
+
+struct bpf_context_access {
+	int size;
+	enum bpf_access_type type;
+};
+
+struct bpf_callbacks {
+	/* execute BPF func_id with given registers */
+	void (*execute_func)(char *strtab, int id, u64 *regs);
+
+	/* return address of func_id suitable to be called from JITed program */
+	void *(*jit_select_func)(char *strtab, int id);
+
+	/* return BPF function prototype for verification */
+	const struct bpf_func_proto* (*get_func_proto)(char *strtab, int id);
+
+	/* return expected bpf_context access size and permissions
+	 * for given byte offset within bpf_context */
+	const struct bpf_context_access *(*get_context_access)(int off);
+};
+
+struct bpf_program {
+	int   insn_cnt;
+	int   table_cnt;
+	int   strtab_size;
+	struct bpf_insn *insns;
+	struct bpf_table *tables;
+	char *strtab;
+	struct bpf_callbacks *cb;
+	void (*jit_image)(struct bpf_context *ctx);
+	struct work_struct work;
+};
+
+/*
+ * BPF image format:
+ * 4 bytes "bpf\0"
+ * 4 bytes - size of strtab section in bytes
+ * string table: zero separated ascii strings
+ * {
+ *   4 bytes - size of next section in bytes
+ *   4 bytes - index into strtab of section name
+ *   N bytes - of this section
+ * } repeated
+ * "license" section contains BPF license that must be GPL compatible
+ * "bpftables" section contains zero or more of 'struct bpf_table'
+ * "e ..." section contains one or more of 'struct bpf_insn'
+ *
+ * bpf_load_image() - load BPF image, setup callback extensions
+ * and run through verifier
+ */
+int bpf_load_image(const char *image, int image_len, struct bpf_callbacks *cb,
+		   struct bpf_program **prog);
+
+/* free BPF program */
+void bpf_free(struct bpf_program *prog);
+
+/* execute BPF program */
+void bpf_run(struct bpf_program *prog, struct bpf_context *ctx);
+
+/* verify correctness of BPF program */
+int bpf_check(struct bpf_program *prog);
+
+/* pr_info one BPF instruction, with register values if regs is not NULL */
+void pr_info_bpf_insn(struct bpf_insn *insn, u64 *regs);
+
+static inline void free_bpf_program(struct bpf_program *prog)
+{
+	kfree(prog->strtab);
+	kfree(prog->tables);
+	kfree(prog->insns);
+	kfree(prog);
+}
+#if defined(CONFIG_BPF64_JIT)
+void bpf_compile(struct bpf_program *prog);
+void __bpf_free(struct bpf_program *prog);
+#else
+static inline void bpf_compile(struct bpf_program *prog)
+{
+}
+static inline void __bpf_free(struct bpf_program *prog)
+{
+	free_bpf_program(prog);
+}
+#endif
+
+#endif
diff --git a/kernel/Makefile b/kernel/Makefile
index bc010ee..e63d81c 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -83,6 +83,7 @@ obj-$(CONFIG_TRACING) += trace/
 obj-$(CONFIG_TRACE_CLOCK) += trace/
 obj-$(CONFIG_RING_BUFFER) += trace/
 obj-$(CONFIG_TRACEPOINTS) += trace/
+obj-$(CONFIG_BPF64) += bpf_jit/
 obj-$(CONFIG_IRQ_WORK) += irq_work.o
 obj-$(CONFIG_CPU_PM) += cpu_pm.o
 
diff --git a/kernel/bpf_jit/Makefile b/kernel/bpf_jit/Makefile
new file mode 100644
index 0000000..2e576f9
--- /dev/null
+++ b/kernel/bpf_jit/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_BPF64) += bpf_check.o
+obj-$(CONFIG_BPF64) += bpf_run.o
+
diff --git a/kernel/bpf_jit/bpf_check.c b/kernel/bpf_jit/bpf_check.c
new file mode 100644
index 0000000..c3aa574
--- /dev/null
+++ b/kernel/bpf_jit/bpf_check.c
@@ -0,0 +1,1054 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/bpf_jit.h>
+
+/*
+ * bpf_check() is a static code analyzer that walks the BPF program
+ * instruction by instruction and updates register/stack state.
+ * All paths of conditional branches are analyzed until 'ret' insn.
+ *
+ * In the first pass a depth-first-search verifies that the BPF program is a DAG.
+ * It rejects the following programs:
+ * - larger than 4K insns or 64 tables
+ * - loops (detected via back-edges)
+ * - unreachable insns (shouldn't be a forest; a program is one function)
+ * - more than one ret insn
+ * - ret insn that is not the last insn
+ * - out of bounds or malformed jumps
+ * The second pass descends all possible paths from the 1st insn.
+ * Conditional branch target insns keep a linked list of verifier states.
+ * If the state was already visited, this path can be pruned.
+ * If it wasn't a DAG, such state pruning would be incorrect, since it would
+ * skip cycles. Since it's analyzing all paths through the program,
+ * the length of the analysis is limited to 32k insns, which may be hit even
+ * if insn_cnt < 4K when there are too many branches that change stack/regs.
+ * The number of 'branches to be analyzed' is limited to 1k.
+ *
+ * All registers are 64-bit (even on 32-bit arch)
+ * R0 - return register
+ * R1-R5 argument passing registers
+ * R6-R9 callee saved registers
+ * R10 - frame pointer read-only
+ *
+ * At the start of BPF program the register R1 contains a pointer to bpf_context
+ * and has type PTR_TO_CTX.
+ *
+ * R10 has type PTR_TO_STACK. The sequence 'mov Rx, R10; add Rx, imm' changes
+ * Rx state to PTR_TO_STACK_IMM and the immediate constant is saved for further
+ * stack bounds checking.
+ *
+ * registers used to pass pointers to function calls are verified against
+ * function prototypes
+ *
+ * Example: before the call to bpf_table_lookup(), R1 must have type PTR_TO_CTX,
+ * R2 must contain an integer constant and R3 PTR_TO_STACK_IMM_TABLE_KEY.
+ * The integer constant in R2 is a table_id. It's checked that 0 <= R2 < table_cnt
+ * and the corresponding table_info->key_size is fetched to check that
+ * [R3, R3 + table_info->key_size) is within stack limits and all that stack
+ * memory was initialized earlier by the BPF program.
+ * After the bpf_table_lookup() call insn, R0 is set to PTR_TO_TABLE_CONDITIONAL;
+ * R1-R5 are cleared and no longer readable (but still writeable).
+ *
+ * bpf_table_lookup() returns either a pointer to a table value or NULL,
+ * which is type PTR_TO_TABLE_CONDITIONAL. Once it passes through a !=0 insn,
+ * the register holding that pointer in the true branch changes state to
+ * PTR_TO_TABLE and the same register changes state to INVALID_PTR in the false
+ * branch. See check_cond_jmp_op().
+ *
+ * load/store alignment is checked
+ * Ex: stx [Rx + 3], (u32)Ry is rejected
+ *
+ * load/store to stack bounds checked and register spill is tracked
+ * Ex: stx [R10 + 0], (u8)Rx is rejected
+ *
+ * load/store to table bounds checked and table_id provides table size
+ * Ex: stx [Rx + 8], (u16)Ry is ok, if Rx is PTR_TO_TABLE and
+ * 8 + sizeof(u16) <= table_info->elem_size
+ *
+ * load/store to bpf_context checked against known fields
+ *
+ * Future improvements:
+ * stack size is hardcoded to 512 bytes maximum per program, relax it
+ */
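+/* shorthand: evaluate OP and return early from the caller on error */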
+#define _(OP) ({ int ret = OP; if (ret < 0) return ret; })
+
+/* JITed code allocates 512 bytes and uses the bottom 4 slots
+ * to save R6-R9
+ */
+#define MAX_BPF_STACK (512 - 4 * 8)
+
+struct reg_state {
+	enum bpf_reg_type ptr;
+	bool read_ok;
+	int imm;
+};
+
+#define MAX_REG 11
+
+enum bpf_stack_slot_type {
+	STACK_INVALID,    /* nothing was stored in this stack slot */
+	STACK_SPILL,      /* 1st byte of register spilled into stack */
+	STACK_SPILL_PART, /* other 7 bytes of register spill */
+	STACK_MISC	  /* BPF program wrote some data into this slot */
+};
+
+struct bpf_stack_slot {
+	enum bpf_stack_slot_type type;
+	enum bpf_reg_type ptr;
+	int imm;
+};
+
+/* state of the program:
+ * type of all registers and stack info
+ */
+struct verifier_state {
+	struct reg_state regs[MAX_REG];
+	struct bpf_stack_slot stack[MAX_BPF_STACK];
+};
+
+/* linked list of verifier states
+ * used to prune search
+ */
+struct verifier_state_list {
+	struct verifier_state state;
+	struct verifier_state_list *next;
+};
+
+/* verifier_state + insn_idx are pushed to stack
+ * when branch is encountered
+ */
+struct verifier_stack_elem {
+	struct verifier_state st;
+	int insn_idx; /* at insn 'insn_idx' the program state is 'st' */
+	struct verifier_stack_elem *next;
+};
+
+/* single container for all structs
+ * one verifier_env per bpf_check() call
+ */
+struct verifier_env {
+	struct bpf_program *prog;
+	struct verifier_stack_elem *head;
+	int stack_size;
+	struct verifier_state cur_state;
+	struct verifier_state_list **branch_landing;
+};
+
+static int pop_stack(struct verifier_env *env)
+{
+	int insn_idx;
+	struct verifier_stack_elem *elem;
+	if (env->head == NULL)
+		return -1;
+	memcpy(&env->cur_state, &env->head->st, sizeof(env->cur_state));
+	insn_idx = env->head->insn_idx;
+	elem = env->head->next;
+	kfree(env->head);
+	env->head = elem;
+	env->stack_size--;
+	return insn_idx;
+}
+
+static struct verifier_state *push_stack(struct verifier_env *env, int insn_idx)
+{
+	struct verifier_stack_elem *elem;
+	elem = kmalloc(sizeof(struct verifier_stack_elem), GFP_KERNEL);
+	if (!elem)
+		goto err;
+	memcpy(&elem->st, &env->cur_state, sizeof(env->cur_state));
+	elem->insn_idx = insn_idx;
+	elem->next = env->head;
+	env->head = elem;
+	env->stack_size++;
+	if (env->stack_size > 1024) {
+		pr_err("BPF program is too complex\n");
+		goto err;
+	}
+	return &elem->st;
+err:
+	/* pop all elements and return */
+	while (pop_stack(env) >= 0);
+	return NULL;
+}
+
+#define CALLER_SAVED_REGS 6
+static const int caller_saved[CALLER_SAVED_REGS] = { R0, R1, R2, R3, R4, R5 };
+
+static void init_reg_state(struct reg_state *regs)
+{
+	struct reg_state *reg;
+	int i;
+	for (i = 0; i < MAX_REG; i++) {
+		regs[i].ptr = INVALID_PTR;
+		regs[i].read_ok = false;
+		regs[i].imm = 0xbadbad;
+	}
+	reg = regs + __fp__;
+	reg->ptr = PTR_TO_STACK;
+	reg->read_ok = true;
+
+	reg = regs + R1;	/* 1st arg to a function */
+	reg->ptr = PTR_TO_CTX;
+	reg->read_ok = true;
+}
+
+static void mark_reg_no_ptr(struct reg_state *regs, int regno)
+{
+	regs[regno].ptr = INVALID_PTR;
+	regs[regno].imm = 0xbadbad;
+	regs[regno].read_ok = true;
+}
+
+static int check_reg_arg(struct reg_state *regs, int regno, bool is_src)
+{
+	if (is_src) {
+		if (!regs[regno].read_ok) {
+			pr_err("R%d !read_ok\n", regno);
+			return -EACCES;
+		}
+	} else {
+		if (regno == __fp__)
+			/* frame pointer is read only */
+			return -EACCES;
+		mark_reg_no_ptr(regs, regno);
+	}
+	return 0;
+}
+
+static int bpf_size_to_bytes(int bpf_size)
+{
+	if (bpf_size == BPF_W)
+		return 4;
+	else if (bpf_size == BPF_H)
+		return 2;
+	else if (bpf_size == BPF_B)
+		return 1;
+	else if (bpf_size == BPF_DW)
+		return 8;
+	else
+		return -EACCES;
+}
+
+static int check_stack_write(struct verifier_state *state, int off, int size,
+			     int value_regno)
+{
+	int i;
+	struct bpf_stack_slot *slot;
+	if (value_regno >= 0 &&
+	    (state->regs[value_regno].ptr == PTR_TO_TABLE ||
+	     state->regs[value_regno].ptr == PTR_TO_CTX)) {
+
+		/* register containing pointer is being spilled into stack */
+		if (size != 8) {
+			pr_err("invalid size of register spill\n");
+			return -EACCES;
+		}
+
+		slot = &state->stack[MAX_BPF_STACK + off];
+		slot->type = STACK_SPILL;
+		/* save register state */
+		slot->ptr = state->regs[value_regno].ptr;
+		slot->imm = state->regs[value_regno].imm;
+		for (i = 1; i < 8; i++) {
+			slot = &state->stack[MAX_BPF_STACK + off + i];
+			slot->type = STACK_SPILL_PART;
+		}
+	} else {
+
+		/* regular write of data into stack */
+		for (i = 0; i < size; i++) {
+			slot = &state->stack[MAX_BPF_STACK + off + i];
+			slot->type = STACK_MISC;
+		}
+	}
+	return 0;
+}
+
+static int check_stack_read(struct verifier_state *state, int off, int size,
+			    int value_regno)
+{
+	int i;
+	struct bpf_stack_slot *slot;
+
+	slot = &state->stack[MAX_BPF_STACK + off];
+
+	if (slot->type == STACK_SPILL) {
+		if (size != 8) {
+			pr_err("invalid size of register spill\n");
+			return -EACCES;
+		}
+		for (i = 1; i < 8; i++) {
+			if (state->stack[MAX_BPF_STACK + off + i].type !=
+			    STACK_SPILL_PART) {
+				pr_err("corrupted spill memory\n");
+				return -EACCES;
+			}
+		}
+
+		/* restore register state from stack */
+		state->regs[value_regno].ptr = slot->ptr;
+		state->regs[value_regno].imm = slot->imm;
+		state->regs[value_regno].read_ok = true;
+		return 0;
+	} else {
+		for (i = 0; i < size; i++) {
+			if (state->stack[MAX_BPF_STACK + off + i].type !=
+			    STACK_MISC) {
+				pr_err("invalid read from stack off %d+%d size %d\n",
+				       off, i, size);
+				return -EACCES;
+			}
+		}
+		/* have read misc data from the stack */
+		mark_reg_no_ptr(state->regs, value_regno);
+		return 0;
+	}
+}
+
+static int get_table_info(struct verifier_env *env, int table_id,
+			  struct bpf_table **tablep)
+{
+	/* if BPF program contains bpf_table_lookup(ctx, 1024, key)
+	 * the incorrect table_id will be caught here
+	 */
+	if (table_id < 0 || table_id >= env->prog->table_cnt) {
+		pr_err("invalid access to table_id=%d max_tables=%d\n",
+		       table_id, env->prog->table_cnt);
+		return -EACCES;
+	}
+	*tablep = &env->prog->tables[table_id];
+	return 0;
+}
+
+/* check read/write into table element returned by bpf_table_lookup() */
+static int check_table_access(struct verifier_env *env, int regno, int off,
+			      int size)
+{
+	struct bpf_table *table;
+	int table_id = env->cur_state.regs[regno].imm;
+
+	_(get_table_info(env, table_id, &table));
+
+	if (off < 0 || off + size > table->elem_size) {
+		pr_err("invalid access to table_id=%d leaf_size=%d off=%d size=%d\n",
+		       table_id, table->elem_size, off, size);
+		return -EACCES;
+	}
+	return 0;
+}
+
+/* check access to 'struct bpf_context' fields */
+static int check_ctx_access(struct verifier_env *env, int off, int size,
+			    enum bpf_access_type t)
+{
+	const struct bpf_context_access *access;
+
+	if (off < 0 || off >= 32768 /* struct bpf_context shouldn't be huge */)
+		goto error;
+
+	access = env->prog->cb->get_context_access(off);
+	if (!access)
+		goto error;
+
+	if (access->size == size && (access->type & t))
+		return 0;
+error:
+	pr_err("invalid bpf_context access off=%d size=%d\n", off, size);
+	return -EACCES;
+}
+
+static int check_mem_access(struct verifier_env *env, int regno, int off,
+			    int bpf_size, enum bpf_access_type t,
+			    int value_regno)
+{
+	struct verifier_state *state = &env->cur_state;
+	int size;
+	_(size = bpf_size_to_bytes(bpf_size));
+
+	if (off % size != 0) {
+		pr_err("misaligned access off %d size %d\n", off, size);
+		return -EACCES;
+	}
+
+	if (state->regs[regno].ptr == PTR_TO_TABLE) {
+		_(check_table_access(env, regno, off, size));
+		if (t == BPF_READ)
+			mark_reg_no_ptr(state->regs, value_regno);
+	} else if (state->regs[regno].ptr == PTR_TO_CTX) {
+		_(check_ctx_access(env, off, size, t));
+		if (t == BPF_READ)
+			mark_reg_no_ptr(state->regs, value_regno);
+	} else if (state->regs[regno].ptr == PTR_TO_STACK) {
+		if (off >= 0 || off < -MAX_BPF_STACK) {
+			pr_err("invalid stack off=%d size=%d\n", off, size);
+			return -EACCES;
+		}
+		if (t == BPF_WRITE)
+			_(check_stack_write(state, off, size, value_regno));
+		else
+			_(check_stack_read(state, off, size, value_regno));
+	} else {
+		pr_err("invalid mem access %d\n", state->regs[regno].ptr);
+		return -EACCES;
+	}
+	return 0;
+}
+
+/*
+ * when register 'regno' is passed into function that will read 'access_size'
+ * bytes from that pointer, make sure that it's within stack boundary
+ * and all elements of stack are initialized
+ */
+static int check_stack_boundary(struct verifier_env *env,
+				int regno, int access_size)
+{
+	struct verifier_state *state = &env->cur_state;
+	struct reg_state *regs = state->regs;
+	int off, i;
+
+	if (regs[regno].ptr != PTR_TO_STACK_IMM)
+		return -EACCES;
+
+	off = regs[regno].imm;
+	if (off >= 0 || off < -MAX_BPF_STACK || off + access_size > 0 ||
+	    access_size <= 0) {
+		pr_err("invalid stack ptr R%d off=%d access_size=%d\n",
+		       regno, off, access_size);
+		return -EACCES;
+	}
+
+	for (i = 0; i < access_size; i++) {
+		if (state->stack[MAX_BPF_STACK + off + i].type != STACK_MISC) {
+			pr_err("invalid indirect read from stack off %d+%d size %d\n",
+			       off, i, access_size);
+			return -EACCES;
+		}
+	}
+	return 0;
+}
+
+static int check_func_arg(struct verifier_env *env, int regno,
+			  enum bpf_reg_type arg_type, int *table_id,
+			  struct bpf_table **tablep)
+{
+	struct reg_state *reg = env->cur_state.regs + regno;
+	enum bpf_reg_type expected_type;
+
+	if (arg_type == INVALID_PTR)
+		return 0;
+
+	if (!reg->read_ok) {
+		pr_err("R%d !read_ok\n", regno);
+		return -EACCES;
+	}
+
+	if (arg_type == PTR_TO_STACK_IMM_TABLE_KEY ||
+	    arg_type == PTR_TO_STACK_IMM_TABLE_ELEM)
+		expected_type = PTR_TO_STACK_IMM;
+	else if (arg_type == CONST_ARG_TABLE_ID ||
+		 arg_type == CONST_ARG_STACK_IMM_SIZE)
+		expected_type = CONST_ARG;
+	else
+		expected_type = arg_type;
+
+	if (reg->ptr != expected_type) {
+		pr_err("R%d ptr=%d expected=%d\n", regno, reg->ptr,
+		       expected_type);
+		return -EACCES;
+	}
+
+	if (arg_type == CONST_ARG_TABLE_ID) {
+		/* bpf_table_xxx(table_id) call: check that table_id is valid */
+		*table_id = reg->imm;
+		_(get_table_info(env, reg->imm, tablep));
+	} else if (arg_type == PTR_TO_STACK_IMM_TABLE_KEY) {
+		/*
+		 * bpf_table_xxx(..., table_id, ..., key) call:
+		 * check that [key, key + table_info->key_size) are within
+		 * stack limits and initialized
+		 */
+		if (!*tablep) {
+			/*
+			 * in function declaration table_id must come before
+			 * table_key or table_elem, so that it's verified
+			 * and known before we have to check table_key here
+			 */
+			pr_err("invalid table_id to access table->key\n");
+			return -EACCES;
+		}
+		_(check_stack_boundary(env, regno, (*tablep)->key_size));
+	} else if (arg_type == PTR_TO_STACK_IMM_TABLE_ELEM) {
+		/*
+		 * bpf_table_xxx(..., table_id, ..., elem) call:
+		 * check [elem, elem + table_info->elem_size) validity
+		 */
+		if (!*tablep) {
+			pr_err("invalid table_id to access table->elem\n");
+			return -EACCES;
+		}
+		_(check_stack_boundary(env, regno, (*tablep)->elem_size));
+	} else if (arg_type == CONST_ARG_STACK_IMM_SIZE) {
+		/*
+		 * bpf_xxx(..., buf, len) call will access 'len' bytes
+		 * from stack pointer 'buf'. Check it
+		 * note: regno == len, regno - 1 == buf
+		 */
+		_(check_stack_boundary(env, regno - 1, reg->imm));
+	}
+
+	return 0;
+}
+
+static int check_call(struct verifier_env *env, int func_id)
+{
+	struct verifier_state *state = &env->cur_state;
+	const struct bpf_func_proto *fn = NULL;
+	struct reg_state *regs = state->regs;
+	struct bpf_table *table = NULL;
+	int table_id = -1;
+	struct reg_state *reg;
+	int i;
+
+	/* find function prototype */
+	if (func_id <= 0 || func_id >= env->prog->strtab_size) {
+		pr_err("invalid func %d\n", func_id);
+		return -EINVAL;
+	}
+
+	if (env->prog->cb->get_func_proto)
+		fn = env->prog->cb->get_func_proto(env->prog->strtab, func_id);
+
+	if (!fn || (fn->ret_type != RET_INTEGER &&
+		    fn->ret_type != PTR_TO_TABLE_CONDITIONAL &&
+		    fn->ret_type != RET_VOID)) {
+		pr_err("unknown func %d\n", func_id);
+		return -EINVAL;
+	}
+
+	/* check args */
+	_(check_func_arg(env, R1, fn->arg1_type, &table_id, &table));
+	_(check_func_arg(env, R2, fn->arg2_type, &table_id, &table));
+	_(check_func_arg(env, R3, fn->arg3_type, &table_id, &table));
+	_(check_func_arg(env, R4, fn->arg4_type, &table_id, &table));
+
+	/* reset caller saved regs */
+	for (i = 0; i < CALLER_SAVED_REGS; i++) {
+		reg = regs + caller_saved[i];
+		reg->read_ok = false;
+		reg->ptr = INVALID_PTR;
+		reg->imm = 0xbadbad;
+	}
+
+	/* update return register */
+	reg = regs + R0;
+	if (fn->ret_type == RET_INTEGER) {
+		reg->read_ok = true;
+		reg->ptr = INVALID_PTR;
+	} else if (fn->ret_type != RET_VOID) {
+		reg->read_ok = true;
+		reg->ptr = fn->ret_type;
+		if (fn->ret_type == PTR_TO_TABLE_CONDITIONAL)
+			/*
+			 * remember table_id, so that check_table_access()
+			 * can check 'elem_size' boundary of memory access
+			 * to table element returned from bpf_table_lookup()
+			 */
+			reg->imm = table_id;
+	}
+	return 0;
+}
+
+static int check_alu_op(struct reg_state *regs, struct bpf_insn *insn)
+{
+	u16 opcode = BPF_OP(insn->code);
+
+	if (opcode == BPF_BSWAP32 || opcode == BPF_BSWAP64 ||
+	    opcode == BPF_NEG) {
+		if (BPF_SRC(insn->code) != BPF_X)
+			return -EINVAL;
+		/* check src operand */
+		_(check_reg_arg(regs, insn->a_reg, 1));
+
+		/* check dest operand */
+		_(check_reg_arg(regs, insn->a_reg, 0));
+
+	} else if (opcode == BPF_MOV) {
+
+		if (BPF_SRC(insn->code) == BPF_X)
+			/* check src operand */
+			_(check_reg_arg(regs, insn->x_reg, 1));
+
+		/* check dest operand */
+		_(check_reg_arg(regs, insn->a_reg, 0));
+
+		if (BPF_SRC(insn->code) == BPF_X) {
+			/* case: R1 = R2
+			 * copy register state to dest reg
+			 */
+			regs[insn->a_reg].ptr = regs[insn->x_reg].ptr;
+			regs[insn->a_reg].imm = regs[insn->x_reg].imm;
+		} else {
+			/* case: R = imm
+			 * remember the value we stored into this reg
+			 */
+			regs[insn->a_reg].ptr = CONST_ARG;
+			regs[insn->a_reg].imm = insn->imm;
+		}
+
+	} else {	/* all other ALU ops: and, sub, xor, add, ... */
+
+		int stack_relative = 0;
+
+		if (BPF_SRC(insn->code) == BPF_X)
+			/* check src1 operand */
+			_(check_reg_arg(regs, insn->x_reg, 1));
+
+		/* check src2 operand */
+		_(check_reg_arg(regs, insn->a_reg, 1));
+
+		if (opcode == BPF_ADD &&
+		    regs[insn->a_reg].ptr == PTR_TO_STACK &&
+		    BPF_SRC(insn->code) == BPF_K)
+			stack_relative = 1;
+
+		/* check dest operand */
+		_(check_reg_arg(regs, insn->a_reg, 0));
+
+		if (stack_relative) {
+			regs[insn->a_reg].ptr = PTR_TO_STACK_IMM;
+			regs[insn->a_reg].imm = insn->imm;
+		}
+	}
+
+	return 0;
+}
+
+static int check_cond_jmp_op(struct verifier_env *env, struct bpf_insn *insn,
+			     int insn_idx)
+{
+	struct reg_state *regs = env->cur_state.regs;
+	struct verifier_state *other_branch;
+	u16 opcode = BPF_OP(insn->code);
+
+	if (BPF_SRC(insn->code) == BPF_X)
+		/* check src1 operand */
+		_(check_reg_arg(regs, insn->x_reg, 1));
+
+	/* check src2 operand */
+	_(check_reg_arg(regs, insn->a_reg, 1));
+
+	other_branch = push_stack(env, insn_idx + insn->off + 1);
+	if (!other_branch)
+		return -EFAULT;
+
+	/* detect if R == 0 where R is the value returned from table_lookup() */
+	if (BPF_SRC(insn->code) == BPF_K &&
+	    insn->imm == 0 && (opcode == BPF_JEQ ||
+			       opcode == BPF_JNE) &&
+	    regs[insn->a_reg].ptr == PTR_TO_TABLE_CONDITIONAL) {
+		if (opcode == BPF_JEQ) {
+			/*
+			 * next fallthrough insn can access memory via
+			 * this register
+			 */
+			regs[insn->a_reg].ptr = PTR_TO_TABLE;
+			/* branch target cannot access it, since reg == 0 */
+			other_branch->regs[insn->a_reg].ptr = INVALID_PTR;
+		} else {
+			other_branch->regs[insn->a_reg].ptr = PTR_TO_TABLE;
+			regs[insn->a_reg].ptr = INVALID_PTR;
+		}
+	}
+	return 0;
+}
+
+
+/*
+ * non-recursive DFS pseudo code
+ * 1  procedure DFS-iterative(G,v):
+ * 2      label v as discovered
+ * 3      let S be a stack
+ * 4      S.push(v)
+ * 5      while S is not empty
+ * 6            t <- S.pop()
+ * 7            if t is what we're looking for:
+ * 8                return t
+ * 9            for all edges e in G.adjacentEdges(t) do
+ * 10               if edge e is already labelled
+ * 11                   continue with the next edge
+ * 12               w <- G.adjacentVertex(t,e)
+ * 13               if vertex w is not discovered and not explored
+ * 14                   label e as tree-edge
+ * 15                   label w as discovered
+ * 16                   S.push(w)
+ * 17                   continue at 5
+ * 18               else if vertex w is discovered
+ * 19                   label e as back-edge
+ * 20               else
+ * 21                   // vertex w is explored
+ * 22                   label e as forward- or cross-edge
+ * 23           label t as explored
+ * 24           S.pop()
+ *
+ * convention:
+ * 1 - discovered
+ * 2 - discovered and 1st branch labelled
+ * 3 - discovered and 1st and 2nd branch labelled
+ * 4 - explored
+ */
+
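+/* sentinel terminating the linked list of visited states for an insn */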
+#define STATE_END ((struct verifier_state_list *)-1)
+
+#define PUSH_INT(I) \
+	do { \
+		if (cur_stack >= insn_cnt) { \
+			ret = -E2BIG; \
+			goto free_st; \
+		} \
+		stack[cur_stack++] = I; \
+	} while (0)
+
+#define PEEK_INT() \
+	({ \
+		int _ret; \
+		if (cur_stack == 0) \
+			_ret = -1; \
+		else \
+			_ret = stack[cur_stack - 1]; \
+		_ret; \
+	 })
+
+#define POP_INT() \
+	({ \
+		int _ret; \
+		if (cur_stack == 0) \
+			_ret = -1; \
+		else \
+			_ret = stack[--cur_stack]; \
+		_ret; \
+	 })
+
+#define PUSH_INSN(T, W, E) \
+	do { \
+		int w = W; \
+		if (E == 1 && st[T] >= 2) \
+			break; \
+		if (E == 2 && st[T] >= 3) \
+			break; \
+		if (w >= insn_cnt) { \
+			ret = -EACCES; \
+			goto free_st; \
+		} \
+		if (E == 2) \
+			/* mark branch target for state pruning */ \
+			env->branch_landing[w] = STATE_END; \
+		if (st[w] == 0) { \
+			/* tree-edge */ \
+			st[T] = 1 + E; \
+			st[w] = 1; /* discovered */ \
+			PUSH_INT(w); \
+			goto peek_stack; \
+		} else if (st[w] == 1 || st[w] == 2 || st[w] == 3) { \
+			pr_err("back-edge from insn %d to %d\n", t, w); \
+			ret = -EINVAL; \
+			goto free_st; \
+		} else if (st[w] == 4) { \
+			/* forward- or cross-edge */ \
+			st[T] = 1 + E; \
+		} else { \
+			pr_err("insn state internal bug\n"); \
+			ret = -EFAULT; \
+			goto free_st; \
+		} \
+	} while (0)
+
+/* non-recursive depth-first-search to detect loops in BPF program
+ * loop == back-edge in directed graph
+ */
+static int check_cfg(struct verifier_env *env)
+{
+	struct bpf_insn *insns = env->prog->insns;
+	int insn_cnt = env->prog->insn_cnt;
+	int cur_stack = 0;
+	int *stack;
+	int ret = 0;
+	int *st;
+	int i, t;
+
+	if (insns[insn_cnt - 1].code != (BPF_RET | BPF_K)) {
+		pr_err("last insn is not a 'ret'\n");
+		return -EINVAL;
+	}
+
+	st = kzalloc(sizeof(int) * insn_cnt, GFP_KERNEL);
+	if (!st)
+		return -ENOMEM;
+
+	stack = kzalloc(sizeof(int) * insn_cnt, GFP_KERNEL);
+	if (!stack) {
+		kfree(st);
+		return -ENOMEM;
+	}
+
+	st[0] = 1; /* mark 1st insn as discovered */
+	PUSH_INT(0);
+
+peek_stack:
+	while ((t = PEEK_INT()) != -1) {
+		if (t == insn_cnt - 1)
+			goto mark_explored;
+
+		if (BPF_CLASS(insns[t].code) == BPF_RET) {
+			pr_err("extraneous 'ret'\n");
+			ret = -EINVAL;
+			goto free_st;
+		}
+
+		if (BPF_CLASS(insns[t].code) == BPF_JMP) {
+			u16 opcode = BPF_OP(insns[t].code);
+			if (opcode == BPF_CALL) {
+				PUSH_INSN(t, t + 1, 1);
+			} else if (opcode == BPF_JA) {
+				if (BPF_SRC(insns[t].code) != BPF_X) {
+					ret = -EINVAL;
+					goto free_st;
+				}
+				PUSH_INSN(t, t + insns[t].off + 1, 1);
+			} else {
+				PUSH_INSN(t, t + 1, 1);
+				PUSH_INSN(t, t + insns[t].off + 1, 2);
+			}
+		} else {
+			PUSH_INSN(t, t + 1, 1);
+		}
+
+mark_explored:
+		st[t] = 4; /* explored */
+		if (POP_INT() == -1) {
+			pr_err("pop_int internal bug\n");
+			ret = -EFAULT;
+			goto free_st;
+		}
+	}
+
+
+	for (i = 0; i < insn_cnt; i++) {
+		if (st[i] != 4) {
+			pr_err("unreachable insn %d\n", i);
+			ret = -EINVAL;
+			goto free_st;
+		}
+	}
+
+free_st:
+	kfree(st);
+	kfree(stack);
+	return ret;
+}
+
+static int is_state_visited(struct verifier_env *env, int insn_idx)
+{
+	struct verifier_state_list *new_sl;
+	struct verifier_state_list *sl;
+
+	sl = env->branch_landing[insn_idx];
+	if (!sl)
+		/* no branch jump to this insn, ignore it */
+		return 0;
+
+	while (sl != STATE_END) {
+		if (memcmp(&sl->state, &env->cur_state,
+			   sizeof(env->cur_state)) == 0)
+			/* reached the same register/stack state,
+			 * prune the search
+			 */
+			return 1;
+		sl = sl->next;
+	}
+	new_sl = kmalloc(sizeof(struct verifier_state_list), GFP_KERNEL);
+
+	if (!new_sl)
+		/* ignore kmalloc error, since it's rare and doesn't affect
+		 * correctness of algorithm
+		 */
+		return 0;
+	/* add new state to the head of linked list */
+	memcpy(&new_sl->state, &env->cur_state, sizeof(env->cur_state));
+	new_sl->next = env->branch_landing[insn_idx];
+	env->branch_landing[insn_idx] = new_sl;
+	return 0;
+}
+
+#undef _
+#define _(OP) ({ err = OP; if (err < 0) goto err_print_insn; })
+
+static int __bpf_check(struct verifier_env *env)
+{
+	struct verifier_state *state = &env->cur_state;
+	struct bpf_insn *insns = env->prog->insns;
+	struct reg_state *regs = state->regs;
+	int insn_cnt = env->prog->insn_cnt;
+	int insn_processed = 0;
+	int insn_idx;
+	int err;
+
+	init_reg_state(regs);
+	insn_idx = 0;
+	for (;;) {
+		struct bpf_insn *insn;
+		u16 class;
+
+		if (insn_idx >= insn_cnt) {
+			pr_err("invalid insn idx %d insn_cnt %d\n",
+			       insn_idx, insn_cnt);
+			return -EFAULT;
+		}
+
+		insn = &insns[insn_idx];
+		class = BPF_CLASS(insn->code);
+
+		if (++insn_processed > 32768) {
+			pr_err("BPF program is too large. Processed %d insn\n",
+			       insn_processed);
+			return -E2BIG;
+		}
+
+		if (is_state_visited(env, insn_idx))
+			goto process_ret;
+
+		if (class == BPF_ALU) {
+			_(check_alu_op(regs, insn));
+
+		} else if (class == BPF_LDX) {
+			if (BPF_MODE(insn->code) != BPF_REL)
+				return -EINVAL;
+
+			/* check src operand */
+			_(check_reg_arg(regs, insn->x_reg, 1));
+
+			_(check_mem_access(env, insn->x_reg, insn->off,
+					   BPF_SIZE(insn->code), BPF_READ,
+					   insn->a_reg));
+
+			/* dest reg state will be updated by mem_access */
+
+		} else if (class == BPF_STX) {
+			/* check src1 operand */
+			_(check_reg_arg(regs, insn->x_reg, 1));
+			/* check src2 operand */
+			_(check_reg_arg(regs, insn->a_reg, 1));
+			_(check_mem_access(env, insn->a_reg, insn->off,
+					   BPF_SIZE(insn->code), BPF_WRITE,
+					   insn->x_reg));
+
+		} else if (class == BPF_ST) {
+			if (BPF_MODE(insn->code) != BPF_REL)
+				return -EINVAL;
+			/* check src operand */
+			_(check_reg_arg(regs, insn->a_reg, 1));
+			_(check_mem_access(env, insn->a_reg, insn->off,
+					   BPF_SIZE(insn->code), BPF_WRITE,
+					   -1));
+
+		} else if (class == BPF_JMP) {
+			u16 opcode = BPF_OP(insn->code);
+			if (opcode == BPF_CALL) {
+				_(check_call(env, insn->imm));
+			} else if (opcode == BPF_JA) {
+				if (BPF_SRC(insn->code) != BPF_X)
+					return -EINVAL;
+				insn_idx += insn->off + 1;
+				continue;
+			} else {
+				_(check_cond_jmp_op(env, insn, insn_idx));
+			}
+
+		} else if (class == BPF_RET) {
+process_ret:
+			insn_idx = pop_stack(env);
+			if (insn_idx < 0)
+				break;
+			else
+				continue;
+		}
+
+		insn_idx++;
+	}
+
+	pr_debug("insn_processed %d\n", insn_processed);
+	return 0;
+
+err_print_insn:
+	pr_info("insn #%d\n", insn_idx);
+	pr_info_bpf_insn(&insns[insn_idx], NULL);
+	return err;
+}
+
+static void free_states(struct verifier_env *env, int insn_cnt)
+{
+	int i;
+
+	for (i = 0; i < insn_cnt; i++) {
+		struct verifier_state_list *sl = env->branch_landing[i];
+		if (sl)
+			while (sl != STATE_END) {
+				struct verifier_state_list *sln = sl->next;
+				kfree(sl);
+				sl = sln;
+			}
+	}
+
+	kfree(env->branch_landing);
+}
+
+int bpf_check(struct bpf_program *prog)
+{
+	int ret;
+	struct verifier_env *env;
+
+	if (prog->insn_cnt <= 0 || prog->insn_cnt > MAX_BPF_INSNS ||
+	    prog->table_cnt < 0 || prog->table_cnt > MAX_BPF_TABLES ||
+	    prog->strtab_size <= 0 || prog->strtab_size > MAX_BPF_STRTAB_SIZE ||
+	    prog->strtab[prog->strtab_size - 1] != 0) {
+		pr_err("BPF program has %d insn and %d tables. Max is %d/%d\n",
+		       prog->insn_cnt, prog->table_cnt,
+		       MAX_BPF_INSNS, MAX_BPF_TABLES);
+		return -E2BIG;
+	}
+
+	env = kzalloc(sizeof(struct verifier_env), GFP_KERNEL);
+	if (!env)
+		return -ENOMEM;
+
+	env->prog = prog;
+	env->branch_landing = kzalloc(sizeof(struct verifier_state_list *) *
+				      prog->insn_cnt, GFP_KERNEL);
+
+	if (!env->branch_landing) {
+		kfree(env);
+		return -ENOMEM;
+	}
+
+	ret = check_cfg(env);
+	if (ret)
+		goto free_env;
+	ret = __bpf_check(env);
+free_env:
+	while (pop_stack(env) >= 0);
+	free_states(env, prog->insn_cnt);
+	kfree(env);
+	return ret;
+}
+EXPORT_SYMBOL(bpf_check);
diff --git a/kernel/bpf_jit/bpf_run.c b/kernel/bpf_jit/bpf_run.c
new file mode 100644
index 0000000..d3b51b6
--- /dev/null
+++ b/kernel/bpf_jit/bpf_run.c
@@ -0,0 +1,511 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/bpf_jit.h>
+#include <linux/license.h>
+
+static const char *const bpf_class_string[] = {
+	"ld", "ldx", "st", "stx", "alu", "jmp", "ret", "misc"
+};
+
+static const char *const bpf_alu_string[] = {
+	"+=", "-=", "*=", "/=", "|=", "&=", "<<=", ">>=", "neg",
+	"%=", "^=", "=", "s>>=", "bswap32", "bswap64", "BUG"
+};
+
+static const char *const bpf_ldst_string[] = {
+	"u32", "u16", "u8", "u64"
+};
+
+static const char *const bpf_jmp_string[] = {
+	"jmp", "==", ">", ">=", "&", "!=", "s>", "s>=", "call"
+};
+
+static const char *reg_to_str(int regno, u64 *regs)
+{
+	static char reg_value[16][32];
+	if (!regs)
+		return "";
+	snprintf(reg_value[regno], sizeof(reg_value[regno]), "(0x%llx)",
+		 regs[regno]);
+	return reg_value[regno];
+}
+
+#define R(regno) reg_to_str(regno, regs)
+
+void pr_info_bpf_insn(struct bpf_insn *insn, u64 *regs)
+{
+	u16 class = BPF_CLASS(insn->code);
+	if (class == BPF_ALU) {
+		if (BPF_SRC(insn->code) == BPF_X)
+			pr_info("code_%02x r%d%s %s r%d%s\n",
+				insn->code, insn->a_reg, R(insn->a_reg),
+				bpf_alu_string[BPF_OP(insn->code) >> 4],
+				insn->x_reg, R(insn->x_reg));
+		else
+			pr_info("code_%02x r%d%s %s %d\n",
+				insn->code, insn->a_reg, R(insn->a_reg),
+				bpf_alu_string[BPF_OP(insn->code) >> 4],
+				insn->imm);
+	} else if (class == BPF_STX) {
+		if (BPF_MODE(insn->code) == BPF_REL)
+			pr_info("code_%02x *(%s *)(r%d%s %+d) = r%d%s\n",
+				insn->code,
+				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+				insn->a_reg, R(insn->a_reg),
+				insn->off, insn->x_reg, R(insn->x_reg));
+		else if (BPF_MODE(insn->code) == BPF_XADD)
+			pr_info("code_%02x lock *(%s *)(r%d%s %+d) += r%d%s\n",
+				insn->code,
+				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+				insn->a_reg, R(insn->a_reg), insn->off,
+				insn->x_reg, R(insn->x_reg));
+		else
+			pr_info("BUG_%02x\n", insn->code);
+	} else if (class == BPF_ST) {
+		if (BPF_MODE(insn->code) != BPF_REL) {
+			pr_info("BUG_st_%02x\n", insn->code);
+			return;
+		}
+		pr_info("code_%02x *(%s *)(r%d%s %+d) = %d\n",
+			insn->code,
+			bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+			insn->a_reg, R(insn->a_reg),
+			insn->off, insn->imm);
+	} else if (class == BPF_LDX) {
+		if (BPF_MODE(insn->code) != BPF_REL) {
+			pr_info("BUG_ldx_%02x\n", insn->code);
+			return;
+		}
+		pr_info("code_%02x r%d = *(%s *)(r%d%s %+d)\n",
+			insn->code, insn->a_reg,
+			bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+			insn->x_reg, R(insn->x_reg), insn->off);
+	} else if (class == BPF_JMP) {
+		u16 opcode = BPF_OP(insn->code);
+		if (opcode == BPF_CALL) {
+			pr_info("code_%02x call %d\n", insn->code, insn->imm);
+		} else if (insn->code == (BPF_JMP | BPF_JA | BPF_X)) {
+			pr_info("code_%02x goto pc%+d\n",
+				insn->code, insn->off);
+		} else if (BPF_SRC(insn->code) == BPF_X) {
+			pr_info("code_%02x if r%d%s %s r%d%s goto pc%+d\n",
+				insn->code, insn->a_reg, R(insn->a_reg),
+				bpf_jmp_string[BPF_OP(insn->code) >> 4],
+				insn->x_reg, R(insn->x_reg), insn->off);
+		} else {
+			pr_info("code_%02x if r%d%s %s 0x%x goto pc%+d\n",
+				insn->code, insn->a_reg, R(insn->a_reg),
+				bpf_jmp_string[BPF_OP(insn->code) >> 4],
+				insn->imm, insn->off);
+		}
+	} else {
+		pr_info("code_%02x %s\n", insn->code, bpf_class_string[class]);
+	}
+}
+
+void bpf_run(struct bpf_program *prog, struct bpf_context *ctx)
+{
+	struct bpf_insn *insn = prog->insns;
+	u64 stack[64];
+	u64 regs[16] = { };
+	regs[__fp__] = (u64)(ulong)&stack[64];
+	regs[R1] = (u64)(ulong)ctx;
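+	/* per the BPF calling convention, R1 carries ctx on entry and
+	 * __fp__ points at the top of the 512-byte BPF stack
+	 */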
+
+	for (;; insn++) {
+		const s32 K = insn->imm;
+		u64 tmp;
+		u64 *a_reg = &regs[insn->a_reg];
+		u64 *x_reg = &regs[insn->x_reg];
+#define A (*a_reg)
+#define X (*x_reg)
+		/*pr_info_bpf_insn(insn, regs);*/
+		switch (insn->code) {
+			/* ALU */
+		case BPF_ALU | BPF_ADD | BPF_X:
+			A += X;
+			continue;
+		case BPF_ALU | BPF_ADD | BPF_K:
+			A += K;
+			continue;
+		case BPF_ALU | BPF_SUB | BPF_X:
+			A -= X;
+			continue;
+		case BPF_ALU | BPF_SUB | BPF_K:
+			A -= K;
+			continue;
+		case BPF_ALU | BPF_AND | BPF_X:
+			A &= X;
+			continue;
+		case BPF_ALU | BPF_AND | BPF_K:
+			A &= K;
+			continue;
+		case BPF_ALU | BPF_OR | BPF_X:
+			A |= X;
+			continue;
+		case BPF_ALU | BPF_OR | BPF_K:
+			A |= K;
+			continue;
+		case BPF_ALU | BPF_LSH | BPF_X:
+			A <<= X;
+			continue;
+		case BPF_ALU | BPF_LSH | BPF_K:
+			A <<= K;
+			continue;
+		case BPF_ALU | BPF_RSH | BPF_X:
+			A >>= X;
+			continue;
+		case BPF_ALU | BPF_RSH | BPF_K:
+			A >>= K;
+			continue;
+		case BPF_ALU | BPF_MOV | BPF_X:
+			A = X;
+			continue;
+		case BPF_ALU | BPF_MOV | BPF_K:
+			A = K;
+			continue;
+		case BPF_ALU | BPF_ARSH | BPF_X:
+			(*(s64 *) &A) >>= X;
+			continue;
+		case BPF_ALU | BPF_ARSH | BPF_K:
+			(*(s64 *) &A) >>= K;
+			continue;
+		case BPF_ALU | BPF_BSWAP32 | BPF_X:
+			A = __builtin_bswap32(A);
+			continue;
+		case BPF_ALU | BPF_BSWAP64 | BPF_X:
+			A = __builtin_bswap64(A);
+			continue;
+		case BPF_ALU | BPF_MOD | BPF_X:
+			tmp = A;
+			if (X)
+				A = do_div(tmp, X);
+			continue;
+		case BPF_ALU | BPF_MOD | BPF_K:
+			tmp = A;
+			if (K)
+				A = do_div(tmp, K);
+			continue;
+
+			/* CALL */
+		case BPF_JMP | BPF_CALL:
+			prog->cb->execute_func(prog->strtab, K, regs);
+			continue;
+
+			/* JMP */
+		case BPF_JMP | BPF_JA | BPF_X:
+			insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JEQ | BPF_X:
+			if (A == X)
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JEQ | BPF_K:
+			if (A == K)
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JNE | BPF_X:
+			if (A != X)
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JNE | BPF_K:
+			if (A != K)
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JGT | BPF_X:
+			if (A > X)
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JGT | BPF_K:
+			if (A > K)
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JGE | BPF_X:
+			if (A >= X)
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JGE | BPF_K:
+			if (A >= K)
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JSGT | BPF_X:
+			if (((s64)A) > ((s64)X))
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JSGT | BPF_K:
+			if (((s64)A) > ((s64)K))
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JSGE | BPF_X:
+			if (((s64)A) >= ((s64)X))
+				insn += insn->off;
+			continue;
+		case BPF_JMP | BPF_JSGE | BPF_K:
+			if (((s64)A) >= ((s64)K))
+				insn += insn->off;
+			continue;
+
+			/* STX */
+		case BPF_STX | BPF_REL | BPF_B:
+			*(u8 *)(ulong)(A + insn->off) = X;
+			continue;
+		case BPF_STX | BPF_REL | BPF_H:
+			*(u16 *)(ulong)(A + insn->off) = X;
+			continue;
+		case BPF_STX | BPF_REL | BPF_W:
+			*(u32 *)(ulong)(A + insn->off) = X;
+			continue;
+		case BPF_STX | BPF_REL | BPF_DW:
+			*(u64 *)(ulong)(A + insn->off) = X;
+			continue;
+
+			/* ST */
+		case BPF_ST | BPF_REL | BPF_B:
+			*(u8 *)(ulong)(A + insn->off) = K;
+			continue;
+		case BPF_ST | BPF_REL | BPF_H:
+			*(u16 *)(ulong)(A + insn->off) = K;
+			continue;
+		case BPF_ST | BPF_REL | BPF_W:
+			*(u32 *)(ulong)(A + insn->off) = K;
+			continue;
+		case BPF_ST | BPF_REL | BPF_DW:
+			*(u64 *)(ulong)(A + insn->off) = K;
+			continue;
+
+			/* LDX */
+		case BPF_LDX | BPF_REL | BPF_B:
+			A = *(u8 *)(ulong)(X + insn->off);
+			continue;
+		case BPF_LDX | BPF_REL | BPF_H:
+			A = *(u16 *)(ulong)(X + insn->off);
+			continue;
+		case BPF_LDX | BPF_REL | BPF_W:
+			A = *(u32 *)(ulong)(X + insn->off);
+			continue;
+		case BPF_LDX | BPF_REL | BPF_DW:
+			A = *(u64 *)(ulong)(X + insn->off);
+			continue;
+
+			/* STX XADD */
+		case BPF_STX | BPF_XADD | BPF_B:
+			__sync_fetch_and_add((u8 *)(ulong)(A + insn->off),
+					     (u8)X);
+			continue;
+		case BPF_STX | BPF_XADD | BPF_H:
+			__sync_fetch_and_add((u16 *)(ulong)(A + insn->off),
+					     (u16)X);
+			continue;
+		case BPF_STX | BPF_XADD | BPF_W:
+			__sync_fetch_and_add((u32 *)(ulong)(A + insn->off),
+					     (u32)X);
+			continue;
+		case BPF_STX | BPF_XADD | BPF_DW:
+			__sync_fetch_and_add((u64 *)(ulong)(A + insn->off),
+					     (u64)X);
+			continue;
+
+			/* RET */
+		case BPF_RET | BPF_K:
+			return;
+		default:
+			/*
+			 * bpf_check() will guarantee that
+			 * we never reach here
+			 */
+			pr_err("unknown opcode %02x\n", insn->code);
+			return;
+		}
+	}
+}
+EXPORT_SYMBOL(bpf_run);
+
+/*
+ * BPF image format:
+ * 4 bytes "bpf\0"
+ * 4 bytes - size of strtab section in bytes
+ * string table: zero separated ascii strings
+ * {
+ *   4 bytes - size of next section in bytes
+ *   4 bytes - index into strtab of section name
+ *   N bytes - of this section
+ * } repeated
+ * "license" section contains BPF license that must be GPL compatible
+ * "bpftables" section contains zero or more of 'struct bpf_table'
+ * "e skb:kfree_skb" section contains one or more of 'struct bpf_insn'
+ */
+#define BPF_HEADER_SIZE 8
+int bpf_load_image(const char *image, int image_len, struct bpf_callbacks *cb,
+		   struct bpf_program **p_prog)
+{
+	struct bpf_program *prog;
+	int sec_size, sec_name, strtab_size;
+	int ret;
+
+	BUILD_BUG_ON(sizeof(struct bpf_insn) != 8);
+
+	if (!image || !cb || !cb->execute_func || !cb->get_func_proto ||
+	    !cb->get_context_access)
+		return -EINVAL;
+
+	if (image_len < 8 || memcmp(image, "bpf", 4) != 0) {
+		pr_err("invalid bpf image, size=%d\n", image_len);
+		return -EINVAL;
+	}
+
+	/* eat 'bpf' header */
+	image += 4;
+	image_len -= 4;
+
+	memcpy(&strtab_size, image, 4);
+	/* eat strtab size */
+	image += 4;
+	image_len -= 4;
+
+	if (strtab_size <= 0 ||
+	    strtab_size > MAX_BPF_STRTAB_SIZE ||
+	    strtab_size >= image_len ||
+	    /*
+	     * check that strtab section is null terminated, so we can use
+	     * strcmp below even if sec_name points to strtab_size - 1
+	     */
+	    image[strtab_size - 1] != '\0') {
+		pr_err("BPF program strtab_size %d\n", strtab_size);
+		return -E2BIG;
+	}
+
+	prog = kzalloc(sizeof(struct bpf_program), GFP_KERNEL);
+	if (!prog)
+		return -ENOMEM;
+	prog->cb = cb;
+
+	prog->strtab_size = strtab_size;
+	prog->strtab = kmalloc(strtab_size, GFP_KERNEL);
+	if (!prog->strtab) {
+		ret = -ENOMEM;
+		goto free_prog;
+	}
+	memcpy(prog->strtab, image, strtab_size);
+	/* eat strtab section */
+	image += strtab_size;
+	image_len -= strtab_size;
+
+	/* now walk through all the sections */
+process_section:
+	if (image_len < 8) {
+		ret = -EINVAL;
+		goto free_strtab;
+	}
+	memcpy(&sec_size, image, 4);
+	memcpy(&sec_name, image + 4, 4);
+	image += 8;
+	image_len -= 8;
+	if (sec_name < 0 || sec_name >= strtab_size) {
+		ret = -EINVAL;
+		goto free_strtab;
+	}
+
+	if (prog->strtab[sec_name] == 'e' &&
+	    prog->strtab[sec_name + 1] == ' ' &&
+	    !prog->insns) {
+		/* got bpf_insn section */
+		prog->insn_cnt = sec_size / sizeof(struct bpf_insn);
+		if (prog->insn_cnt <= 0 ||
+		    sec_size % sizeof(struct bpf_insn) ||
+		    sec_size > image_len ||
+		    prog->insn_cnt > MAX_BPF_INSNS) {
+			pr_err("BPF program insn_size %d\n", sec_size);
+			ret = -E2BIG;
+			goto free_strtab;
+		}
+
+		prog->insns = kmalloc(sec_size, GFP_KERNEL);
+		if (!prog->insns) {
+			ret = -ENOMEM;
+			goto free_strtab;
+		}
+		memcpy(prog->insns, image, sec_size);
+		image += sec_size;
+		image_len -= sec_size;
+	} else if (strcmp(&prog->strtab[sec_name], "bpftables") == 0 &&
+		   !prog->tables) {
+		/* got bpf_tables section */
+		prog->table_cnt = sec_size / sizeof(struct bpf_table);
+		if (prog->table_cnt < 0 ||
+		    sec_size % sizeof(struct bpf_table) ||
+		    sec_size > image_len ||
+		    prog->table_cnt > MAX_BPF_TABLES) {
+			pr_err("BPF program table_size %d\n", sec_size);
+			ret = -E2BIG;
+			goto free_strtab;
+		}
+		prog->tables = kmalloc(sec_size, GFP_KERNEL);
+		if (!prog->tables) {
+			ret = -ENOMEM;
+			goto free_strtab;
+		}
+		memcpy(prog->tables, image, sec_size);
+		image += sec_size;
+		image_len -= sec_size;
+	} else if (strcmp(&prog->strtab[sec_name], "license") == 0) {
+		/* license section */
+		if (sec_size <= 0 || sec_size > image_len) {
+			pr_err("BPF program license_size %d\n", sec_size);
+			ret = -E2BIG;
+			goto free_strtab;
+		}
+		if (image[sec_size - 1] != '\0' ||
+		    !license_is_gpl_compatible(image)) {
+			pr_err("BPF program license is not GPL compatible\n");
+			ret = -EINVAL;
+			goto free_strtab;
+		}
+		image += sec_size;
+		image_len -= sec_size;
+	}
+
+	if (image_len)
+		goto process_section;
+
+	/* verify BPF program */
+	ret = bpf_check(prog);
+	if (ret)
+		goto free_strtab;
+
+	/* compile it (map BPF insns to native hw insns) */
+	bpf_compile(prog);
+
+	*p_prog = prog;
+
+	return 0;
+
+free_strtab:
+	kfree(prog->strtab);
+	kfree(prog->tables);
+	kfree(prog->insns);
+free_prog:
+	kfree(prog);
+	return ret;
+}
+EXPORT_SYMBOL(bpf_load_image);
+
+void bpf_free(struct bpf_program *prog)
+{
+	if (!prog)
+		return;
+	__bpf_free(prog);
+}
+EXPORT_SYMBOL(bpf_free);
+
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a48abea..5a8d2fd 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1615,3 +1615,18 @@ source "samples/Kconfig"
 
 source "lib/Kconfig.kgdb"
 
+# Used by archs to tell that they support 64-bit BPF JIT
+config HAVE_BPF64_JIT
+	bool
+
+config BPF64
+	bool "Enable 64-bit BPF instruction set support"
+	help
+	  Enable this option to support 64-bit BPF programs
+
+config BPF64_JIT
+	bool "Enable 64-bit BPF JIT compiler"
+	depends on BPF64 && HAVE_BPF64_JIT
+	help
+	  Enable Just-In-Time compiler for 64-bit BPF programs
+
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 tip 2/7] Extended BPF JIT for x86-64
  2014-02-06  1:10 [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Alexei Starovoitov
  2014-02-06  1:10 ` [RFC PATCH v2 tip 1/7] Extended BPF core framework Alexei Starovoitov
@ 2014-02-06  1:10 ` Alexei Starovoitov
  2014-02-06  1:10 ` [RFC PATCH v2 tip 3/7] Extended BPF (64-bit BPF) design document Alexei Starovoitov
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-06  1:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David S. Miller, Steven Rostedt, Peter Zijlstra, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
	Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
	Christoph Hellwig, linux-kernel, netdev

Just-In-Time compiler that maps 64-bit BPF instructions to x86-64 instructions.

Most BPF instructions have a one-to-one mapping.

Every BPF register maps to one x86-64 register:
R0 -> rax
R1 -> rdi
R2 -> rsi
R3 -> rdx
R4 -> rcx
R5 -> r8
R6 -> rbx
R7 -> r13
R8 -> r14
R9 -> r15
FP -> rbp

BPF calling convention is defined as:
R0 - return value from in-kernel function
R1-R5 - arguments from BPF program to in-kernel function
R6-R9 - callee saved registers that in-kernel function will preserve
R10 - read-only frame pointer to access stack
so BPF calling convention maps directly to x86-64 calling convention.

This allows zero-overhead calls between BPF filters and safe kernel functions.
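
As a sketch of why this matters (register names per the table above; the
callee name is illustrative), a BPF call sequence such as:

  R1 = ctx               /* lands in rdi */
  R2 = 42                /* lands in rsi */
  call bpf_table_lookup  /* JITed as a single x86-64 'call' insn */

needs no argument marshalling at the call site, because R1/R2 already
live in rdi/rsi where the x86-64 C ABI expects them.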

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 arch/x86/Kconfig              |    1 +
 arch/x86/net/Makefile         |    1 +
 arch/x86/net/bpf64_jit_comp.c |  625 +++++++++++++++++++++++++++++++++++++++++
 arch/x86/net/bpf_jit_comp.c   |   23 +-
 arch/x86/net/bpf_jit_comp.h   |   35 +++
 5 files changed, 665 insertions(+), 20 deletions(-)
 create mode 100644 arch/x86/net/bpf64_jit_comp.c
 create mode 100644 arch/x86/net/bpf_jit_comp.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index fe55897..ff97d4b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -94,6 +94,7 @@ config X86
 	select GENERIC_CLOCKEVENTS_MIN_ADJUST
 	select IRQ_FORCED_THREADING
 	select HAVE_BPF_JIT if X86_64
+	select HAVE_BPF64_JIT if X86_64
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select CLKEVT_I8253
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/x86/net/Makefile b/arch/x86/net/Makefile
index 90568c3..c3bb7d5 100644
--- a/arch/x86/net/Makefile
+++ b/arch/x86/net/Makefile
@@ -2,3 +2,4 @@
 # Arch-specific network modules
 #
 obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
+obj-$(CONFIG_BPF64_JIT) += bpf64_jit_comp.o
diff --git a/arch/x86/net/bpf64_jit_comp.c b/arch/x86/net/bpf64_jit_comp.c
new file mode 100644
index 0000000..5f7c331
--- /dev/null
+++ b/arch/x86/net/bpf64_jit_comp.c
@@ -0,0 +1,625 @@
+/*
+ * Copyright (c) 2011-2013 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/bpf_jit.h>
+#include <linux/moduleloader.h>
+#include "bpf_jit_comp.h"
+
+static inline u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
+{
+	if (len == 1)
+		*ptr = bytes;
+	else if (len == 2)
+		*(u16 *)ptr = bytes;
+	else
+		*(u32 *)ptr = bytes;
+	return ptr + len;
+}
+
+#define EMIT(bytes, len) (prog = emit_code(prog, (bytes), (len)))
+
+#define EMIT1(b1)		EMIT(b1, 1)
+#define EMIT2(b1, b2)		EMIT((b1) + ((b2) << 8), 2)
+#define EMIT3(b1, b2, b3)	EMIT((b1) + ((b2) << 8) + ((b3) << 16), 3)
+#define EMIT4(b1, b2, b3, b4)	EMIT((b1) + ((b2) << 8) + ((b3) << 16) + \
+				     ((b4) << 24), 4)
+/* imm32 is sign extended by cpu */
+#define EMIT1_off32(b1, off) \
+	do {EMIT1(b1); EMIT(off, 4); } while (0)
+#define EMIT2_off32(b1, b2, off) \
+	do {EMIT2(b1, b2); EMIT(off, 4); } while (0)
+#define EMIT3_off32(b1, b2, b3, off) \
+	do {EMIT3(b1, b2, b3); EMIT(off, 4); } while (0)
+#define EMIT4_off32(b1, b2, b3, b4, off) \
+	do {EMIT4(b1, b2, b3, b4); EMIT(off, 4); } while (0)
+
+/* mov A, X */
+#define EMIT_mov(A, X) \
+	EMIT3(add_2mod(0x48, A, X), 0x89, add_2reg(0xC0, A, X))
+
+#define X86_JAE 0x73
+#define X86_JE  0x74
+#define X86_JNE 0x75
+#define X86_JA  0x77
+#define X86_JGE 0x7D
+#define X86_JG  0x7F
+
+static inline bool is_imm8(__s32 value)
+{
+	return value <= 127 && value >= -128;
+}
+
+static inline bool is_simm32(__s64 value)
+{
+	return value == (__s64)(__s32)value;
+}
+
+static int bpf_size_to_x86_bytes(int bpf_size)
+{
+	if (bpf_size == BPF_W)
+		return 4;
+	else if (bpf_size == BPF_H)
+		return 2;
+	else if (bpf_size == BPF_B)
+		return 1;
+	else if (bpf_size == BPF_DW)
+		return 4; /* imm32 */
+	else
+		return 0;
+}
+
+#define AUX_REG 32
+
+/* avoid x86-64 R12, which, when used as a base address in a memory
+ * access, always needs an extra SIB byte for the index */
+static const int reg2hex[] = {
+	[R0] = 0, /* rax */
+	[R1] = 7, /* rdi */
+	[R2] = 6, /* rsi */
+	[R3] = 2, /* rdx */
+	[R4] = 1, /* rcx */
+	[R5] = 0, /* r8 */
+	[R6] = 3, /* rbx callee saved */
+	[R7] = 5, /* r13 callee saved */
+	[R8] = 6, /* r14 callee saved */
+	[R9] = 7, /* r15 callee saved */
+	[__fp__] = 5, /* rbp readonly */
+	[AUX_REG] = 1, /* r9 temp register */
+};
+
+/* is_ereg() == true if the BPF register maps to x86-64 r8..r15, which
+ * need an extra byte of encoding; rax,rcx,...,rbp don't */
+static inline bool is_ereg(u32 reg)
+{
+	if (reg == R5 || (reg >= R7 && reg <= R9) || reg == AUX_REG)
+		return true;
+	else
+		return false;
+}
+
+static inline u8 add_1mod(u8 byte, u32 reg)
+{
+	if (is_ereg(reg))
+		byte |= 1;
+	return byte;
+}
+static inline u8 add_2mod(u8 byte, u32 r1, u32 r2)
+{
+	if (is_ereg(r1))
+		byte |= 1;
+	if (is_ereg(r2))
+		byte |= 4;
+	return byte;
+}
+
+static inline u8 add_1reg(u8 byte, u32 a_reg)
+{
+	return byte + reg2hex[a_reg];
+}
+static inline u8 add_2reg(u8 byte, u32 a_reg, u32 x_reg)
+{
+	return byte + reg2hex[a_reg] + (reg2hex[x_reg] << 3);
+}
+
+static u8 *select_bpf_func(struct bpf_program *prog, int id)
+{
+	if (id <= 0 || id >= prog->strtab_size)
+		return NULL;
+	return prog->cb->jit_select_func(prog->strtab, id);
+}
+
+static int do_jit(struct bpf_program *bpf_prog, int *addrs, u8 *image,
+		  int oldproglen)
+{
+	struct bpf_insn *insn = bpf_prog->insns;
+	int insn_cnt = bpf_prog->insn_cnt;
+	u8 temp[64];
+	int i;
+	int proglen = 0;
+	u8 *prog = temp;
+	int stacksize = 512;
+
+	EMIT1(0x55); /* push rbp */
+	EMIT3(0x48, 0x89, 0xE5); /* mov rbp,rsp */
+
+	/* sub rsp, stacksize */
+	EMIT3_off32(0x48, 0x81, 0xEC, stacksize);
+	/* mov qword ptr [rbp-X],rbx */
+	EMIT3_off32(0x48, 0x89, 0x9D, -stacksize);
+	/* mov qword ptr [rbp-X],r13 */
+	EMIT3_off32(0x4C, 0x89, 0xAD, -stacksize + 8);
+	/* mov qword ptr [rbp-X],r14 */
+	EMIT3_off32(0x4C, 0x89, 0xB5, -stacksize + 16);
+	/* mov qword ptr [rbp-X],r15 */
+	EMIT3_off32(0x4C, 0x89, 0xBD, -stacksize + 24);
+
+	for (i = 0; i < insn_cnt; i++, insn++) {
+		const __s32 K = insn->imm;
+		__u32 a_reg = insn->a_reg;
+		__u32 x_reg = insn->x_reg;
+		u8 b1 = 0, b2 = 0, b3 = 0;
+		u8 jmp_cond;
+		__s64 jmp_offset;
+		int ilen;
+		u8 *func;
+
+		switch (insn->code) {
+			/* ALU */
+		case BPF_ALU | BPF_ADD | BPF_X:
+		case BPF_ALU | BPF_SUB | BPF_X:
+		case BPF_ALU | BPF_AND | BPF_X:
+		case BPF_ALU | BPF_OR | BPF_X:
+		case BPF_ALU | BPF_XOR | BPF_X:
+			b1 = 0x48;
+			b3 = 0xC0;
+			switch (BPF_OP(insn->code)) {
+			case BPF_ADD: b2 = 0x01; break;
+			case BPF_SUB: b2 = 0x29; break;
+			case BPF_AND: b2 = 0x21; break;
+			case BPF_OR: b2 = 0x09; break;
+			case BPF_XOR: b2 = 0x31; break;
+			}
+			EMIT3(add_2mod(b1, a_reg, x_reg), b2,
+			      add_2reg(b3, a_reg, x_reg));
+			break;
+
+			/* mov A, X */
+		case BPF_ALU | BPF_MOV | BPF_X:
+			EMIT_mov(a_reg, x_reg);
+			break;
+
+			/* neg A */
+		case BPF_ALU | BPF_NEG | BPF_X:
+			EMIT3(add_1mod(0x48, a_reg), 0xF7,
+			      add_1reg(0xD8, a_reg));
+			break;
+
+		case BPF_ALU | BPF_ADD | BPF_K:
+		case BPF_ALU | BPF_SUB | BPF_K:
+		case BPF_ALU | BPF_AND | BPF_K:
+		case BPF_ALU | BPF_OR | BPF_K:
+			b1 = add_1mod(0x48, a_reg);
+
+			switch (BPF_OP(insn->code)) {
+			case BPF_ADD: b3 = 0xC0; break;
+			case BPF_SUB: b3 = 0xE8; break;
+			case BPF_AND: b3 = 0xE0; break;
+			case BPF_OR: b3 = 0xC8; break;
+			}
+
+			if (is_imm8(K))
+				EMIT4(b1, 0x83, add_1reg(b3, a_reg), K);
+			else
+				EMIT3_off32(b1, 0x81, add_1reg(b3, a_reg), K);
+			break;
+
+		case BPF_ALU | BPF_MOV | BPF_K:
+			/* 'mov rax, imm32' sign extends imm32.
+			 * possible optimization: if imm32 is positive,
+			 * use 'mov eax, imm32' (which zero-extends imm32)
+			 * to save 2 bytes */
+			b1 = add_1mod(0x48, a_reg);
+			b2 = 0xC7;
+			b3 = 0xC0;
+			EMIT3_off32(b1, b2, add_1reg(b3, a_reg), K);
+			break;
+
+			/* A %= X
+			 * A /= X */
+		case BPF_ALU | BPF_MOD | BPF_X:
+		case BPF_ALU | BPF_DIV | BPF_X:
+			EMIT1(0x50); /* push rax */
+			EMIT1(0x52); /* push rdx */
+
+			/* mov r9, X */
+			EMIT_mov(AUX_REG, x_reg);
+
+			/* mov rax, A */
+			EMIT_mov(R0, a_reg);
+
+			/* xor rdx, rdx */
+			EMIT3(0x48, 0x31, 0xd2);
+
+			/* if X==0, skip divide, make A=0 */
+
+			/* cmp r9, 0 */
+			EMIT4(0x49, 0x83, 0xF9, 0x00);
+
+			/* je .+3 */
+			EMIT2(X86_JE, 3);
+
+			/* div r9 */
+			EMIT3(0x49, 0xF7, 0xF1);
+
+			if (BPF_OP(insn->code) == BPF_MOD) {
+				/* mov r9, rdx */
+				EMIT3(0x49, 0x89, 0xD1);
+			} else {
+				/* mov r9, rax */
+				EMIT3(0x49, 0x89, 0xC1);
+			}
+
+			EMIT1(0x5A); /* pop rdx */
+			EMIT1(0x58); /* pop rax */
+
+			/* mov A, r9 */
+			EMIT_mov(a_reg, AUX_REG);
+			break;
+
+			/* shifts */
+		case BPF_ALU | BPF_LSH | BPF_K:
+		case BPF_ALU | BPF_RSH | BPF_K:
+		case BPF_ALU | BPF_ARSH | BPF_K:
+			b1 = add_1mod(0x48, a_reg);
+			switch (BPF_OP(insn->code)) {
+			case BPF_LSH: b3 = 0xE0; break;
+			case BPF_RSH: b3 = 0xE8; break;
+			case BPF_ARSH: b3 = 0xF8; break;
+			}
+			EMIT4(b1, 0xC1, add_1reg(b3, a_reg), K);
+			break;
+
+		case BPF_ALU | BPF_BSWAP32 | BPF_X:
+			/* emit 'bswap eax' to swap lower 4-bytes */
+			if (is_ereg(a_reg))
+				EMIT2(0x41, 0x0F);
+			else
+				EMIT1(0x0F);
+			EMIT1(add_1reg(0xC8, a_reg));
+			break;
+
+		case BPF_ALU | BPF_BSWAP64 | BPF_X:
+			/* emit 'bswap rax' to swap 8-bytes */
+			EMIT3(add_1mod(0x48, a_reg), 0x0F,
+			      add_1reg(0xC8, a_reg));
+			break;
+
+			/* ST: *(u8*)(a_reg + off) = imm */
+		case BPF_ST | BPF_REL | BPF_B:
+			if (is_ereg(a_reg))
+				EMIT2(0x41, 0xC6);
+			else
+				EMIT1(0xC6);
+			goto st;
+		case BPF_ST | BPF_REL | BPF_H:
+			if (is_ereg(a_reg))
+				EMIT3(0x66, 0x41, 0xC7);
+			else
+				EMIT2(0x66, 0xC7);
+			goto st;
+		case BPF_ST | BPF_REL | BPF_W:
+			if (is_ereg(a_reg))
+				EMIT2(0x41, 0xC7);
+			else
+				EMIT1(0xC7);
+			goto st;
+		case BPF_ST | BPF_REL | BPF_DW:
+			EMIT2(add_1mod(0x48, a_reg), 0xC7);
+
+st:			if (is_imm8(insn->off))
+				EMIT2(add_1reg(0x40, a_reg), insn->off);
+			else
+				EMIT1_off32(add_1reg(0x80, a_reg), insn->off);
+
+			EMIT(K, bpf_size_to_x86_bytes(BPF_SIZE(insn->code)));
+			break;
+
+			/* STX: *(u8*)(a_reg + off) = x_reg */
+		case BPF_STX | BPF_REL | BPF_B:
+			/* emit 'mov byte ptr [rax + off], al' */
+			if (is_ereg(a_reg) || is_ereg(x_reg) ||
+			    /* have to add extra byte for x86 SIL, DIL regs */
+			    x_reg == R1 || x_reg == R2)
+				EMIT2(add_2mod(0x40, a_reg, x_reg), 0x88);
+			else
+				EMIT1(0x88);
+			goto stx;
+		case BPF_STX | BPF_REL | BPF_H:
+			if (is_ereg(a_reg) || is_ereg(x_reg))
+				EMIT3(0x66, add_2mod(0x40, a_reg, x_reg), 0x89);
+			else
+				EMIT2(0x66, 0x89);
+			goto stx;
+		case BPF_STX | BPF_REL | BPF_W:
+			if (is_ereg(a_reg) || is_ereg(x_reg))
+				EMIT2(add_2mod(0x40, a_reg, x_reg), 0x89);
+			else
+				EMIT1(0x89);
+			goto stx;
+		case BPF_STX | BPF_REL | BPF_DW:
+			EMIT2(add_2mod(0x48, a_reg, x_reg), 0x89);
+stx:			if (is_imm8(insn->off))
+				EMIT2(add_2reg(0x40, a_reg, x_reg), insn->off);
+			else
+				EMIT1_off32(add_2reg(0x80, a_reg, x_reg),
+					    insn->off);
+			break;
+
+			/* LDX: a_reg = *(u8*)(x_reg + off) */
+		case BPF_LDX | BPF_REL | BPF_B:
+			/* emit 'movzx rax, byte ptr [rax + off]' */
+			EMIT3(add_2mod(0x48, x_reg, a_reg), 0x0F, 0xB6);
+			goto ldx;
+		case BPF_LDX | BPF_REL | BPF_H:
+			/* emit 'movzx rax, word ptr [rax + off]' */
+			EMIT3(add_2mod(0x48, x_reg, a_reg), 0x0F, 0xB7);
+			goto ldx;
+		case BPF_LDX | BPF_REL | BPF_W:
+			/* emit 'mov eax, dword ptr [rax+0x14]' */
+			if (is_ereg(a_reg) || is_ereg(x_reg))
+				EMIT2(add_2mod(0x40, x_reg, a_reg), 0x8B);
+			else
+				EMIT1(0x8B);
+			goto ldx;
+		case BPF_LDX | BPF_REL | BPF_DW:
+			/* emit 'mov rax, qword ptr [rax+0x14]' */
+			EMIT2(add_2mod(0x48, x_reg, a_reg), 0x8B);
ldx:			/* if insn->off == 0 we could save one byte, but the
			 * special case of x86 R13, which always needs an
			 * offset, is not worth the pain */
+			if (is_imm8(insn->off))
+				EMIT2(add_2reg(0x40, x_reg, a_reg), insn->off);
+			else
+				EMIT1_off32(add_2reg(0x80, x_reg, a_reg),
+					    insn->off);
+			break;
+
+			/* STX XADD: lock *(u8*)(a_reg + off) += x_reg */
+		case BPF_STX | BPF_XADD | BPF_B:
+			/* emit 'lock add byte ptr [rax + off], al' */
+			if (is_ereg(a_reg) || is_ereg(x_reg) ||
+			    /* have to add extra byte for x86 SIL, DIL regs */
+			    x_reg == R1 || x_reg == R2)
+				EMIT3(0xF0, add_2mod(0x40, a_reg, x_reg), 0x00);
+			else
+				EMIT2(0xF0, 0x00);
+			goto xadd;
+		case BPF_STX | BPF_XADD | BPF_H:
+			if (is_ereg(a_reg) || is_ereg(x_reg))
+				EMIT4(0x66, 0xF0, add_2mod(0x40, a_reg, x_reg),
+				      0x01);
+			else
+				EMIT3(0x66, 0xF0, 0x01);
+			goto xadd;
+		case BPF_STX | BPF_XADD | BPF_W:
+			if (is_ereg(a_reg) || is_ereg(x_reg))
+				EMIT3(0xF0, add_2mod(0x40, a_reg, x_reg), 0x01);
+			else
+				EMIT2(0xF0, 0x01);
+			goto xadd;
+		case BPF_STX | BPF_XADD | BPF_DW:
+			EMIT3(0xF0, add_2mod(0x48, a_reg, x_reg), 0x01);
+xadd:			if (is_imm8(insn->off))
+				EMIT2(add_2reg(0x40, a_reg, x_reg), insn->off);
+			else
+				EMIT1_off32(add_2reg(0x80, a_reg, x_reg),
+					    insn->off);
+			break;
+
+			/* call */
+		case BPF_JMP | BPF_CALL:
+			func = select_bpf_func(bpf_prog, K);
+			jmp_offset = func - (image + addrs[i]);
+			if (!func || !is_simm32(jmp_offset)) {
+				pr_err("unsupported bpf func %d addr %p image %p\n",
+				       K, func, image);
+				return -EINVAL;
+			}
+			EMIT1_off32(0xE8, jmp_offset);
+			break;
+
+			/* cond jump */
+		case BPF_JMP | BPF_JEQ | BPF_X:
+		case BPF_JMP | BPF_JNE | BPF_X:
+		case BPF_JMP | BPF_JGT | BPF_X:
+		case BPF_JMP | BPF_JGE | BPF_X:
+		case BPF_JMP | BPF_JSGT | BPF_X:
+		case BPF_JMP | BPF_JSGE | BPF_X:
+			/* emit 'cmp a_reg, x_reg' insn */
+			b1 = 0x48;
+			b2 = 0x39;
+			b3 = 0xC0;
+			EMIT3(add_2mod(b1, a_reg, x_reg), b2,
+			      add_2reg(b3, a_reg, x_reg));
+			goto emit_jump;
+		case BPF_JMP | BPF_JEQ | BPF_K:
+		case BPF_JMP | BPF_JNE | BPF_K:
+		case BPF_JMP | BPF_JGT | BPF_K:
+		case BPF_JMP | BPF_JGE | BPF_K:
+		case BPF_JMP | BPF_JSGT | BPF_K:
+		case BPF_JMP | BPF_JSGE | BPF_K:
+			/* emit 'cmp a_reg, imm8/32' */
+			EMIT1(add_1mod(0x48, a_reg));
+
+			if (is_imm8(K))
+				EMIT3(0x83, add_1reg(0xF8, a_reg), K);
+			else
+				EMIT2_off32(0x81, add_1reg(0xF8, a_reg), K);
+
+emit_jump:		/* convert BPF opcode to x86 */
+			switch (BPF_OP(insn->code)) {
+			case BPF_JEQ:
+				jmp_cond = X86_JE;
+				break;
+			case BPF_JNE:
+				jmp_cond = X86_JNE;
+				break;
+			case BPF_JGT:
+				/* GT is unsigned '>', JA in x86 */
+				jmp_cond = X86_JA;
+				break;
+			case BPF_JGE:
+				/* GE is unsigned '>=', JAE in x86 */
+				jmp_cond = X86_JAE;
+				break;
+			case BPF_JSGT:
+				/* signed '>', GT in x86 */
+				jmp_cond = X86_JG;
+				break;
+			case BPF_JSGE:
+				/* signed '>=', GE in x86 */
+				jmp_cond = X86_JGE;
+				break;
+			default: /* to silence gcc warning */
+				return -EFAULT;
+			}
+			jmp_offset = addrs[i + insn->off] - addrs[i];
+			if (is_imm8(jmp_offset)) {
+				EMIT2(jmp_cond, jmp_offset);
+			} else if (is_simm32(jmp_offset)) {
+				EMIT2_off32(0x0F, jmp_cond + 0x10, jmp_offset);
+			} else {
+				pr_err("cond_jmp gen bug %llx\n", jmp_offset);
+				return -EFAULT;
+			}
+
+			break;
+
+		case BPF_JMP | BPF_JA | BPF_X:
+			jmp_offset = addrs[i + insn->off] - addrs[i];
+			if (is_imm8(jmp_offset)) {
+				EMIT2(0xEB, jmp_offset);
+			} else if (is_simm32(jmp_offset)) {
+				EMIT1_off32(0xE9, jmp_offset);
+			} else {
+				pr_err("jmp gen bug %llx\n", jmp_offset);
+				return -EFAULT;
+			}
+
+			break;
+
+		case BPF_RET | BPF_K:
+			/* mov rbx, qword ptr [rbp-X] */
+			EMIT3_off32(0x48, 0x8B, 0x9D, -stacksize);
+			/* mov r13, qword ptr [rbp-X] */
+			EMIT3_off32(0x4C, 0x8B, 0xAD, -stacksize + 8);
+			/* mov r14, qword ptr [rbp-X] */
+			EMIT3_off32(0x4C, 0x8B, 0xB5, -stacksize + 16);
+			/* mov r15, qword ptr [rbp-X] */
+			EMIT3_off32(0x4C, 0x8B, 0xBD, -stacksize + 24);
+
+			EMIT1(0xC9); /* leave */
+			EMIT1(0xC3); /* ret */
+			break;
+
+		default:
+			/*pr_debug_bpf_insn(insn, NULL);*/
+			pr_err("bpf_jit: unknown opcode %02x\n", insn->code);
+			return -EINVAL;
+		}
+
+		ilen = prog - temp;
+		if (image) {
+			if (proglen + ilen > oldproglen)
+				return -2;
+			memcpy(image + proglen, temp, ilen);
+		}
+		proglen += ilen;
+		addrs[i] = proglen;
+		prog = temp;
+	}
+	return proglen;
+}
+
+void bpf_compile(struct bpf_program *prog)
+{
+	struct bpf_binary_header *header = NULL;
+	int proglen, oldproglen = 0;
+	int *addrs;
+	u8 *image = NULL;
+	int pass;
+	int i;
+
+	if (!prog || !prog->cb || !prog->cb->jit_select_func)
+		return;
+
+	addrs = kmalloc(prog->insn_cnt * sizeof(*addrs), GFP_KERNEL);
+	if (!addrs)
+		return;
+
+	for (proglen = 0, i = 0; i < prog->insn_cnt; i++) {
+		proglen += 64;
+		addrs[i] = proglen;
+	}
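+	/* JIT in multiple passes: addrs[] starts with a pessimistic
+	 * 64 bytes per insn and shrinks as jump offsets are resolved;
+	 * once a pass no longer changes the total length, allocate the
+	 * image and emit the final code into it
+	 */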
+	for (pass = 0; pass < 10; pass++) {
+		proglen = do_jit(prog, addrs, image, oldproglen);
+		if (proglen <= 0) {
+			image = NULL;
+			goto out;
+		}
+		if (image) {
+			if (proglen != oldproglen)
+				pr_err("bpf_jit: proglen=%d != oldproglen=%d\n",
+				       proglen, oldproglen);
+			break;
+		}
+		if (proglen == oldproglen) {
+			header = bpf_alloc_binary(proglen, &image);
+			if (!header)
+				goto out;
+		}
+		oldproglen = proglen;
+	}
+
+	if (image) {
+		bpf_flush_icache(header, image + proglen);
+		set_memory_ro((unsigned long)header, header->pages);
+	}
+out:
+	kfree(addrs);
+	prog->jit_image = (void (*)(struct bpf_context *ctx))image;
+	return;
+}
+
+static void bpf_jit_free_deferred(struct work_struct *work)
+{
+	struct bpf_program *prog = container_of(work, struct bpf_program, work);
+	unsigned long addr = (unsigned long)prog->jit_image & PAGE_MASK;
+	struct bpf_binary_header *header = (void *)addr;
+
+	set_memory_rw(addr, header->pages);
+	module_free(NULL, header);
+	free_bpf_program(prog);
+}
+
+void __bpf_free(struct bpf_program *prog)
+{
+	if (prog->jit_image) {
+		INIT_WORK(&prog->work, bpf_jit_free_deferred);
+		schedule_work(&prog->work);
+	} else {
+		free_bpf_program(prog);
+	}
+}
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 4ed75dd..f9ece1e 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -13,6 +13,7 @@
 #include <linux/filter.h>
 #include <linux/if_vlan.h>
 #include <linux/random.h>
+#include "bpf_jit_comp.h"
 
 /*
  * Conventions :
@@ -112,16 +113,6 @@ do {								\
 #define SEEN_XREG    2 /* ebx is used */
 #define SEEN_MEM     4 /* use mem[] for temporary storage */
 
-static inline void bpf_flush_icache(void *start, void *end)
-{
-	mm_segment_t old_fs = get_fs();
-
-	set_fs(KERNEL_DS);
-	smp_wmb();
-	flush_icache_range((unsigned long)start, (unsigned long)end);
-	set_fs(old_fs);
-}
-
 #define CHOOSE_LOAD_FUNC(K, func) \
 	((int)K < 0 ? ((int)K >= SKF_LL_OFF ? func##_negative_offset : func) : func##_positive_offset)
 
@@ -145,16 +136,8 @@ static int pkt_type_offset(void)
 	return -1;
 }
 
-struct bpf_binary_header {
-	unsigned int	pages;
-	/* Note : for security reasons, bpf code will follow a randomly
-	 * sized amount of int3 instructions
-	 */
-	u8		image[];
-};
-
-static struct bpf_binary_header *bpf_alloc_binary(unsigned int proglen,
-						  u8 **image_ptr)
+struct bpf_binary_header *bpf_alloc_binary(unsigned int proglen,
+					   u8 **image_ptr)
 {
 	unsigned int sz, hole;
 	struct bpf_binary_header *header;
diff --git a/arch/x86/net/bpf_jit_comp.h b/arch/x86/net/bpf_jit_comp.h
new file mode 100644
index 0000000..74ff45d
--- /dev/null
+++ b/arch/x86/net/bpf_jit_comp.h
@@ -0,0 +1,35 @@
+/* bpf_jit_comp.h : BPF filter alloc/free routines
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+#ifndef __BPF_JIT_COMP_H
+#define __BPF_JIT_COMP_H
+
+#include <linux/uaccess.h>
+#include <asm/cacheflush.h>
+
+struct bpf_binary_header {
+	unsigned int	pages;
+	/* Note : for security reasons, bpf code will follow a randomly
+	 * sized amount of int3 instructions
+	 */
+	u8		image[];
+};
+
+static inline void bpf_flush_icache(void *start, void *end)
+{
+	mm_segment_t old_fs = get_fs();
+
+	set_fs(KERNEL_DS);
+	smp_wmb();
+	flush_icache_range((unsigned long)start, (unsigned long)end);
+	set_fs(old_fs);
+}
+
+struct bpf_binary_header *bpf_alloc_binary(unsigned int proglen,
+					   u8 **image_ptr);
+
+#endif
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 tip 3/7] Extended BPF (64-bit BPF) design document
  2014-02-06  1:10 [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Alexei Starovoitov
  2014-02-06  1:10 ` [RFC PATCH v2 tip 1/7] Extended BPF core framework Alexei Starovoitov
  2014-02-06  1:10 ` [RFC PATCH v2 tip 2/7] Extended BPF JIT for x86-64 Alexei Starovoitov
@ 2014-02-06  1:10 ` Alexei Starovoitov
  2014-02-06  1:10 ` [RFC PATCH v2 tip 4/7] Revert "x86/ptrace: Remove unused regs_get_argument_nth API" Alexei Starovoitov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-06  1:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David S. Miller, Steven Rostedt, Peter Zijlstra, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
	Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
	Christoph Hellwig, linux-kernel, netdev

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 Documentation/bpf_jit.txt |  204 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 204 insertions(+)
 create mode 100644 Documentation/bpf_jit.txt

diff --git a/Documentation/bpf_jit.txt b/Documentation/bpf_jit.txt
new file mode 100644
index 0000000..9c70f42
--- /dev/null
+++ b/Documentation/bpf_jit.txt
@@ -0,0 +1,204 @@
+Subject: extended BPF or 64-bit BPF
+
+Q: What is BPF?
+A: A safe, dynamically loadable 32-bit program that can access skb->data via
+sk_load_byte/half/word calls or seccomp_data. It can be attached to sockets,
+netfilter xtables and seccomp. In case of sockets/xtables the input is an skb;
+in case of seccomp the input is struct seccomp_data.
+
+Q: What is extended BPF?
+A: A safe, dynamically loadable 64-bit program that can call a fixed set
+of kernel functions and takes a generic bpf_context as input.
+A BPF program is the glue between kernel functions and bpf_context.
+Different kernel subsystems can define their own set of available functions
+and alter the BPF machinery for their specific use case.
+
+Example 1:
+when function set is {bpf_load_byte/half/word} and bpf_context=skb
+the extended BPF is equivalent to original BPF (w/o negative offset extensions),
+since any such extended BPF program will only be able to load data from skb
+and interpret it.
+
+Example 2:
+when function set is {empty} and bpf_context=seccomp_data,
+the extended BPF is equivalent to original seccomp BPF with simpler programs
+and can immediately take advantage of extended BPF-JIT.
+(original BPF-JIT doesn't work for seccomp)
+
+Example 3:
+when function set is {bpf_load_xxx + bpf_table_lookup} and bpf_context=skb
+the extended BPF can be used to implement network analytics in tcpdump.
+Like counting all tcp flows through the dev or filtering for specific
+set of IP addresses.
+
+Example 4:
+when function set is {load_xxx + table_lookup + trace_printk} and
+bpf_context=pt_regs, the extended BPF is used to implement systemtap-like
+tracing filters.
+
+Extended Instruction Set was designed with these goals:
+- write programs in restricted C and compile into BPF with GCC/LLVM
+- just-in-time map to modern 64-bit CPUs with minimal performance overhead
+  over two steps: C -> BPF -> native code
+- guarantee termination and safety of BPF program in kernel
+  with simple algorithm
+
+Writing filters in tcpdump syntax or in the systemtap language is difficult.
+The same filter written in C is easier to understand.
+The GCC/LLVM-bpf backend is optional.
+Extended BPF can be coded with macros from bpf.h just like original BPF.
+
+Minimal performance overhead is achieved by having a one-to-one mapping
+between BPF insns and native insns, and a one-to-one mapping between BPF
+registers and native registers on 64-bit CPUs.
+
+Extended BPF allows jumps forward and backward for two reasons:
+to reduce branch mispredict penalty the compiler moves cold basic blocks out
+of the fall-through path, and to reduce code duplication that would be
+unavoidable if only forward jumps were available.
+To guarantee termination a simple non-recursive depth-first-search verifies
+that there are no back-edges (no loops in the program), that the program is
+a DAG with its root at the first insn, that all branches end at the last RET
+insn and that all instructions are reachable.
+(Original BPF actually allows unreachable insns, but that's a bug)
+
+Original BPF has two registers (A and X) and hidden frame pointer.
+Extended BPF has ten registers and read-only frame pointer.
+Since 64-bit CPUs pass arguments to functions via registers,
+the number of args from a BPF program to an in-kernel function is restricted
+to 5, and one register is used to accept the return value from the in-kernel
+function.
+x86_64 passes first 6 arguments in registers.
+aarch64/sparcv9/mips64 have 7-8 registers for arguments.
+x86_64 has 6 callee saved registers.
+aarch64/sparcv9/mips64 have 11 or more callee saved registers.
+
+Therefore extended BPF calling convention is defined as:
+R0 - return value from in-kernel function
+R1-R5 - arguments from BPF program to in-kernel function
+R6-R9 - callee saved registers that in-kernel function will preserve
+R10 - read-only frame pointer to access stack
+
+so that all BPF registers map one to one to HW registers on x86_64,aarch64,etc
+and BPF calling convention maps directly to ABIs used by kernel on 64-bit
+architectures.
+
+R0-R5 are scratch registers and a BPF program needs to spill/fill them
+if necessary across calls.
+Note that there is only one BPF program == one BPF function and it cannot call
+other BPF functions. It can only call predefined in-kernel functions.
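+
+For example (a sketch in pseudo-assembler; the callee name 'foo' is
+illustrative):
+  R6 = R1           // preserve ctx in a callee-saved register
+  R2 = 10
+  call foo          // R0 = foo(ctx, 10); R1-R5 become unreadable
+  R1 = R6           // reload ctx from R6 for the next call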
+
+All BPF registers are 64-bit without subregs, which makes JITed x86 code
+less optimal, but matches sparc/mips architectures.
+Adding 32-bit subregs was considered, since JIT can map them to x86 and aarch64
+nicely, but read-modify-write overhead for sparc/mips is not worth the gains.
+
+Original BPF and extended BPF are two-operand instruction sets, which helps
+to do one-to-one mapping between BPF insn and x86 insn during JIT.
+
+Extended BPF doesn't have a pre-defined endianness, so as not to favor one
+architecture over another. Therefore a bswap insn was introduced.
+Original BPF doesn't have such insn and does bswap as part of sk_load_word call
+which is often unnecessary if we want to compare the value with the constant.
+Restricted C code might be written differently depending on endianness
+and GCC/LLVM-bpf will take an endianness flag.
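+
+For example, to compare a big-endian 32-bit field against a constant on a
+little-endian host, a filter can bswap the loaded value once (a sketch;
+offset and value are illustrative):
+  R0 = *(u32 *)(R6 + 8)
+  bswap32 R0
+  if R0 == 0x0a000001 goto ...
+or, alternatively, compare against a pre-swapped constant.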
+
+32-bit architectures run 64-bit extended BPF programs via interpreter
+
+Q: Why is extended BPF 64-bit? Can't we live with 32-bit?
+A: On 64-bit architectures, pointers are 64-bit and we want to pass 64-bit
+values in and out of kernel functions, so 32-bit BPF registers would require
+defining a register-pair ABI; there would be no direct BPF register to HW
+register mapping, and the JIT would need to do combine/split/move operations
+for every register in and out of the function, which is complex, bug prone
+and slow.
+Another reason is counters. To use a 64-bit counter, a BPF program would need
+to do complex math, which is again bug prone and not atomic.
+
+Q: Original BPF is safe, deterministic and kernel can easily prove that.
+   Does extended BPF keep these properties?
+A: Yes. The safety of the program is determined in two steps.
+First step does depth-first-search to disallow loops and other CFG validation.
+Second step starts from the first insn and descends all possible paths.
+It simulates execution of every insn and observes the state change of
+registers and stack.
+At the start of the program the register R1 contains a pointer to bpf_context
+and has type PTR_TO_CTX. If checker sees an insn that does R2=R1, then R2 has
+now type PTR_TO_CTX as well and can be used on right hand side of expression.
+If R1=PTR_TO_CTX and insn is R2=R1+1, then R2=INVALID_PTR and it is readable.
+If register was never written to, it's not readable.
+After kernel function call, R1-R5 are reset to unreadable and R0 has a return
+type of the function. Since R6-R9 are callee saved, their state is preserved
+across the call.
+load/store instructions are allowed only with registers of valid types, which
+are PTR_TO_CTX, PTR_TO_TABLE, PTR_TO_STACK. They are bounds- and
+alignment-checked.
+
+bpf_context structure is generic. Its contents are defined by specific use case.
+For seccomp it can be seccomp_data and through get_context_access callback
+BPF checker is customized, so that BPF program can only access certain fields
+of bpf_context with specified size and alignment.
+For example, the following insn:
+  BPF_INSN_LD(BPF_W, R0, R6, 8)
+intends to load a word from address R6 + 8 and store it into R0.
+If R6=PTR_TO_CTX, then get_context_access callback should let the checker know
+that offset 8 of size 4 bytes can be accessed for reading, otherwise the checker
+will reject the program.
+If R6=PTR_TO_STACK, then access should be aligned and be within stack bounds,
+which are hard coded to [-480, 0]. In this example offset is 8, so it will fail
+verification.
+The checker will allow a BPF program to read data from the stack only after
+the program has written into it.
+Pointer register spill/fill is tracked as well, since four (R6-R9) callee saved
+registers may not be enough for some programs.
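+For example (a sketch), a spilled pointer keeps its type:
+  *(u64 *)(R10 - 8) = R6    // R6=PTR_TO_CTX is spilled to the stack
+  ...
+  R6 = *(u64 *)(R10 - 8)    // R6 is known to be PTR_TO_CTX again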
+
+Allowed function calls are customized via get_func_proto callback.
+For example:
+  u64 bpf_load_byte(struct bpf_context *ctx, u32 offset);
+function will have the following definition:
+  struct bpf_func_proto proto = {RET_INTEGER, PTR_TO_CTX};
+and BPF checker will verify that bpf_load_byte is always called with first
+argument being a valid pointer to bpf_context. After the call BPF register R0
+will be set to readable state, so that BPF program can access it.
+
+One of the useful functions that can be made available to BPF program
+are bpf_table_lookup/bpf_table_update.
+Using them a tracing filter can collect any type of statistics.
+
+Therefore extended BPF program consists of instructions and tables.
+From BPF program the table is identified by constant table_id
+and access to a table in C looks like:
+elem = bpf_table_lookup(ctx, table_id, key);
+
+BPF checker matches 'table_id' against known tables, verifies that 'key' points
+to stack and table->key_size bytes are initialized.
+From there on bpf_table_lookup() is a normal kernel function. It needs to do
+a lookup by whatever means and return either valid pointer to the element
+or NULL. BPF checker will verify that the program accesses the pointer only
+after comparing it to NULL. That's the meaning of PTR_TO_TABLE_CONDITIONAL and
+PTR_TO_TABLE register types in bpf_check.c
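+In restricted C the same pattern would look like (a sketch; the table id,
+key and element layout are illustrative):
+  struct elem *e;
+  u32 key = 80;
+  e = bpf_table_lookup(ctx, MY_TABLE_ID, &key);
+  if (e)          /* the checker insists on this NULL test */
+          e->packets++;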
+
+If a kernel subsystem wants to use this BPF framework and decides to implement
+bpf_table_lookup, the checker will guarantee that argument 'ctx' is a valid
+pointer to bpf_context, 'table_id' is valid table_id and table->key_size bytes
+can be read from the pointer 'key'. It's up to implementation to decide how it
+wants to do the lookup and what is the key.
+
+Going back to the example BPF insn:
+  BPF_INSN_LD(BPF_W, R0, R6, 8)
+if R6=PTR_TO_TABLE, then offset and size of access must be within
+[0, table->elem_size] which is determined by constant table_id that was passed
+into bpf_table_lookup call prior to this insn.
+
+Just like the original, extended BPF is limited to 4096 insns, which means that
+any program will terminate quickly and will call a fixed number of kernel
+functions. An earlier implementation of the checker had a precise calculation
+of the worst case number of insns, but it was removed to simplify the code,
+since the worst number is always less than the number of insns in a program
+anyway (because it's a DAG).
+
+Since register/stack state tracking simulates execution of all insns in all
+possible branches, it will explode if not bounded. There are two bounds.
+The verifier_state stack is limited to 1k, therefore a BPF program cannot have
+more than 1k jump insns.
+The total number of insns to be analyzed is limited to 32k, which means that
+the checker will either prove correctness or reject the program within a few
+milliseconds on an average x86 CPU. Valid programs take microseconds to verify.
+
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 tip 4/7] Revert "x86/ptrace: Remove unused regs_get_argument_nth API"
  2014-02-06  1:10 [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Alexei Starovoitov
                   ` (2 preceding siblings ...)
  2014-02-06  1:10 ` [RFC PATCH v2 tip 3/7] Extended BPF (64-bit BPF) design document Alexei Starovoitov
@ 2014-02-06  1:10 ` Alexei Starovoitov
  2014-02-06  1:10 ` [RFC PATCH v2 tip 5/7] use BPF in tracing filters Alexei Starovoitov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-06  1:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David S. Miller, Steven Rostedt, Peter Zijlstra, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
	Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
	Christoph Hellwig, linux-kernel, netdev

This reverts commit aa5add93e92019018e905146f8c3d3f8e3c08300.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 arch/x86/include/asm/ptrace.h |    3 +++
 arch/x86/kernel/ptrace.c      |   24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 14fd6fd..e026176 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -222,6 +222,9 @@ static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
 		return 0;
 }
 
+/* Get Nth argument at function call */
+unsigned long regs_get_argument_nth(struct pt_regs *regs, unsigned int n);
+
 #define arch_has_single_step()	(1)
 #ifdef CONFIG_X86_DEBUGCTLMSR
 #define arch_has_block_step()	(1)
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 7461f50..ac1c705 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -141,6 +141,30 @@ static const int arg_offs_table[] = {
 #endif
 };
 
+/**
+ * regs_get_argument_nth() - get Nth argument at function call
+ * @regs:	pt_regs which contains registers at function entry.
+ * @n:		argument number.
+ *
+ * regs_get_argument_nth() returns @n th argument of a function call.
+ * Since usually the kernel stack will be changed right after function entry,
+ * you must use this at function entry. If the @n th entry is NOT in the
+ * kernel stack or pt_regs, this returns 0.
+ */
+unsigned long regs_get_argument_nth(struct pt_regs *regs, unsigned int n)
+{
+	if (n < ARRAY_SIZE(arg_offs_table))
+		return *(unsigned long *)((char *)regs + arg_offs_table[n]);
+	else {
+		/*
+		 * The typical case: arg n is on the stack.
+		 * (Note: stack[0] = return address, so skip it)
+		 */
+		n -= ARRAY_SIZE(arg_offs_table);
+		return regs_get_kernel_stack_nth(regs, 1 + n);
+	}
+}
+
 /*
  * does not yet catch signals sent when the child dies.
  * in exit.c or in signal.c.
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 tip 5/7] use BPF in tracing filters
  2014-02-06  1:10 [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Alexei Starovoitov
                   ` (3 preceding siblings ...)
  2014-02-06  1:10 ` [RFC PATCH v2 tip 4/7] Revert "x86/ptrace: Remove unused regs_get_argument_nth API" Alexei Starovoitov
@ 2014-02-06  1:10 ` Alexei Starovoitov
  2014-02-06  1:10 ` [RFC PATCH v2 tip 6/7] LLVM BPF backend Alexei Starovoitov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-06  1:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David S. Miller, Steven Rostedt, Peter Zijlstra, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
	Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
	Christoph Hellwig, linux-kernel, netdev

Such filters can be written in C and allow safe read-only access to
any kernel data structure.
Like systemtap, but with safety guaranteed by the kernel.

The user can do:
cat bpf_program > /sys/kernel/debug/tracing/.../filter
if tracing event is either static or dynamic via kprobe_events.

The program can be anything as long as bpf_check() can verify its safety.
For example, the user can create a kprobe_event on dst_discard()
and logically use the following code inside the BPF filter:
      skb = (struct sk_buff *)ctx->arg1;
      dev = bpf_load_pointer(&skb->dev);
to access 'struct net_device'.
Since the prototype is 'int dst_discard(struct sk_buff *skb);',
bpf_load_pointer() will try to fetch the 'dev' field of the 'sk_buff'
structure and will suppress the page fault if the pointer is invalid.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 include/linux/ftrace_event.h       |    5 +
 include/trace/bpf_trace.h          |   41 ++++++++
 include/trace/ftrace.h             |   17 ++++
 kernel/trace/Kconfig               |    1 +
 kernel/trace/Makefile              |    1 +
 kernel/trace/bpf_trace_callbacks.c |  193 ++++++++++++++++++++++++++++++++++++
 kernel/trace/trace.c               |    7 ++
 kernel/trace/trace.h               |   11 +-
 kernel/trace/trace_events.c        |    9 +-
 kernel/trace/trace_events_filter.c |   61 +++++++++++-
 kernel/trace/trace_kprobe.c        |   15 ++-
 11 files changed, 356 insertions(+), 5 deletions(-)
 create mode 100644 include/trace/bpf_trace.h
 create mode 100644 kernel/trace/bpf_trace_callbacks.c

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 4e4cc28..616ae01 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -204,6 +204,7 @@ enum {
 	TRACE_EVENT_FL_IGNORE_ENABLE_BIT,
 	TRACE_EVENT_FL_WAS_ENABLED_BIT,
 	TRACE_EVENT_FL_USE_CALL_FILTER_BIT,
+	TRACE_EVENT_FL_BPF_BIT,
 };
 
 /*
@@ -224,6 +225,7 @@ enum {
 	TRACE_EVENT_FL_IGNORE_ENABLE	= (1 << TRACE_EVENT_FL_IGNORE_ENABLE_BIT),
 	TRACE_EVENT_FL_WAS_ENABLED	= (1 << TRACE_EVENT_FL_WAS_ENABLED_BIT),
 	TRACE_EVENT_FL_USE_CALL_FILTER	= (1 << TRACE_EVENT_FL_USE_CALL_FILTER_BIT),
+	TRACE_EVENT_FL_BPF		= (1 << TRACE_EVENT_FL_BPF_BIT),
 };
 
 struct ftrace_event_call {
@@ -487,6 +489,9 @@ event_trigger_unlock_commit_regs(struct ftrace_event_file *file,
 		event_triggers_post_call(file, tt);
 }
 
+struct bpf_context;
+void filter_call_bpf(struct event_filter *filter, struct bpf_context *ctx);
+
 enum {
 	FILTER_OTHER = 0,
 	FILTER_STATIC_STRING,
diff --git a/include/trace/bpf_trace.h b/include/trace/bpf_trace.h
new file mode 100644
index 0000000..3402384
--- /dev/null
+++ b/include/trace/bpf_trace.h
@@ -0,0 +1,41 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#ifndef _LINUX_KERNEL_BPF_TRACE_H
+#define _LINUX_KERNEL_BPF_TRACE_H
+
+struct pt_regs;
+
+struct bpf_context {
+	long arg1;
+	long arg2;
+	long arg3;
+	long arg4;
+	long arg5;
+	struct pt_regs *regs;
+};
+
+static inline void init_bpf_context(struct bpf_context *ctx, long arg1,
+				    long arg2, long arg3, long arg4, long arg5)
+{
+	ctx->arg1 = arg1;
+	ctx->arg2 = arg2;
+	ctx->arg3 = arg3;
+	ctx->arg4 = arg4;
+	ctx->arg5 = arg5;
+}
+void *bpf_load_pointer(void *unsafe_ptr);
+long bpf_memcmp(void *unsafe_ptr, void *safe_ptr, long size);
+void bpf_dump_stack(struct bpf_context *ctx);
+void bpf_trace_printk(char *fmt, long fmt_size,
+		      long arg1, long arg2, long arg3);
+void *bpf_table_lookup(struct bpf_context *ctx, long table_id, const void *key);
+long bpf_table_update(struct bpf_context *ctx, long table_id, const void *key,
+		      const void *leaf);
+
+extern struct bpf_callbacks bpf_trace_cb;
+
+#endif /* _LINUX_KERNEL_BPF_TRACE_H */
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 1a8b28d..2348afd 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -17,6 +17,8 @@
  */
 
 #include <linux/ftrace_event.h>
+#include <linux/kexec.h>
+#include <trace/bpf_trace.h>
 
 /*
  * DECLARE_EVENT_CLASS can be used to add a generic function
@@ -556,6 +558,21 @@ ftrace_raw_event_##call(void *__data, proto)				\
 	if (ftrace_trigger_soft_disabled(ftrace_file))			\
 		return;							\
 									\
+	if (unlikely(ftrace_file->flags & FTRACE_EVENT_FL_FILTERED) &&	\
+	    unlikely(ftrace_file->event_call->flags & TRACE_EVENT_FL_BPF)) { \
+		struct bpf_context _ctx;				\
+		struct pt_regs _regs;					\
+		void (*_fn)(struct bpf_context *, proto,		\
+			    long, long, long, long);			\
+		crash_setup_regs(&_regs, NULL);				\
+		_fn = (void (*)(struct bpf_context *, proto, long, long,\
+				long, long))init_bpf_context;		\
+		_fn(&_ctx, args, 0, 0, 0, 0);				\
+		_ctx.regs = &_regs;					\
+		filter_call_bpf(ftrace_file->filter, &_ctx);		\
+		return;							\
+	}								\
+									\
 	local_save_flags(irq_flags);					\
 	pc = preempt_count();						\
 									\
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 015f85a..2809cd1 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -80,6 +80,7 @@ config FTRACE_NMI_ENTER
 
 config EVENT_TRACING
 	select CONTEXT_SWITCH_TRACER
+	select BPF64
 	bool
 
 config CONTEXT_SWITCH_TRACER
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 1378e84..dc4fb44 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -51,6 +51,7 @@ obj-$(CONFIG_EVENT_TRACING) += trace_event_perf.o
 endif
 obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
 obj-$(CONFIG_EVENT_TRACING) += trace_events_trigger.o
+obj-$(CONFIG_EVENT_TRACING) += bpf_trace_callbacks.o
 obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
 obj-$(CONFIG_TRACEPOINTS) += power-traces.o
 ifeq ($(CONFIG_PM_RUNTIME),y)
diff --git a/kernel/trace/bpf_trace_callbacks.c b/kernel/trace/bpf_trace_callbacks.c
new file mode 100644
index 0000000..2b7955d
--- /dev/null
+++ b/kernel/trace/bpf_trace_callbacks.c
@@ -0,0 +1,193 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/bpf_jit.h>
+#include <linux/uaccess.h>
+#include <trace/bpf_trace.h>
+#include "trace.h"
+
+#define MAX_CTX_OFF sizeof(struct bpf_context)
+
+static const struct bpf_context_access ctx_access[MAX_CTX_OFF] = {
+	[offsetof(struct bpf_context, arg1)] = {
+		FIELD_SIZEOF(struct bpf_context, arg1),
+		BPF_READ
+	},
+	[offsetof(struct bpf_context, arg2)] = {
+		FIELD_SIZEOF(struct bpf_context, arg2),
+		BPF_READ
+	},
+	[offsetof(struct bpf_context, arg3)] = {
+		FIELD_SIZEOF(struct bpf_context, arg3),
+		BPF_READ
+	},
+	[offsetof(struct bpf_context, arg4)] = {
+		FIELD_SIZEOF(struct bpf_context, arg4),
+		BPF_READ
+	},
+	[offsetof(struct bpf_context, arg5)] = {
+		FIELD_SIZEOF(struct bpf_context, arg5),
+		BPF_READ
+	},
+};
+
+static const struct bpf_context_access *get_context_access(int off)
+{
+	if (off >= MAX_CTX_OFF)
+		return NULL;
+	return &ctx_access[off];
+}
+
+void *bpf_load_pointer(void *unsafe_ptr)
+{
+	void *ptr = NULL;
+
+	probe_kernel_read(&ptr, unsafe_ptr, sizeof(void *));
+	return ptr;
+}
+
+long bpf_memcmp(void *unsafe_ptr, void *safe_ptr, long size)
+{
+	char buf[64];
+	int err;
+
+	if (size < 64) {
+		err = probe_kernel_read(buf, unsafe_ptr, size);
+		if (err)
+			return err;
+		return memcmp(buf, safe_ptr, size);
+	}
+	return -1;
+}
+
+void bpf_dump_stack(struct bpf_context *ctx)
+{
+	unsigned long flags;
+
+	local_save_flags(flags);
+
+	__trace_stack_regs(flags, 0, preempt_count(), ctx->regs);
+}
+
+/*
+ * limited trace_printk()
+ * only %d %u %p %x conversion specifiers allowed
+ */
+void bpf_trace_printk(char *fmt, long fmt_size, long arg1, long arg2, long arg3)
+{
+	int fmt_cnt = 0;
+	int i;
+
+	/*
+	 * bpf_check() guarantees that fmt points to bpf program stack and
+	 * fmt_size bytes of it were initialized by bpf program
+	 */
+	if (fmt[fmt_size - 1] != 0)
+		return;
+
+	for (i = 0; i < fmt_size; i++)
+		if (fmt[i] == '%') {
+			if (i + 1 >= fmt_size)
+				return;
+			if (fmt[i + 1] != 'p' && fmt[i + 1] != 'd' &&
+			    fmt[i + 1] != 'u' && fmt[i + 1] != 'x')
+				return;
+			fmt_cnt++;
+		}
+	if (fmt_cnt > 3)
+		return;
+	__trace_printk((unsigned long)__builtin_return_address(3), fmt,
+		       arg1, arg2, arg3);
+}
+
+
+static const struct bpf_func_proto *get_func_proto(char *strtab, int id)
+{
+	if (!strcmp(strtab + id, "bpf_load_pointer")) {
+		static const struct bpf_func_proto proto = {RET_INTEGER};
+		return &proto;
+	}
+	if (!strcmp(strtab + id, "bpf_memcmp")) {
+		static const struct bpf_func_proto proto = {RET_INTEGER,
+			INVALID_PTR, PTR_TO_STACK_IMM,
+			CONST_ARG_STACK_IMM_SIZE};
+		return &proto;
+	}
+	if (!strcmp(strtab + id, "bpf_dump_stack")) {
+		static const struct bpf_func_proto proto = {RET_VOID,
+			PTR_TO_CTX};
+		return &proto;
+	}
+	if (!strcmp(strtab + id, "bpf_trace_printk")) {
+		static const struct bpf_func_proto proto = {RET_VOID,
+			PTR_TO_STACK_IMM, CONST_ARG_STACK_IMM_SIZE};
+		return &proto;
+	}
+	if (!strcmp(strtab + id, "bpf_table_lookup")) {
+		static const struct bpf_func_proto proto = {
+			PTR_TO_TABLE_CONDITIONAL, PTR_TO_CTX,
+			CONST_ARG_TABLE_ID, PTR_TO_STACK_IMM_TABLE_KEY};
+		return &proto;
+	}
+	if (!strcmp(strtab + id, "bpf_table_update")) {
+		static const struct bpf_func_proto proto = {RET_INTEGER,
+			PTR_TO_CTX, CONST_ARG_TABLE_ID,
+			PTR_TO_STACK_IMM_TABLE_KEY,
+			PTR_TO_STACK_IMM_TABLE_ELEM};
+		return &proto;
+	}
+	return NULL;
+}
+
+static void execute_func(char *strtab, int id, u64 *regs)
+{
+	regs[R0] = 0;
+
+	/*
+	 * strcmp-approach is not efficient.
+	 * TODO: optimize it for poor archs that don't have JIT yet
+	 */
+	if (!strcmp(strtab + id, "bpf_load_pointer")) {
+		regs[R0] = (u64)bpf_load_pointer((void *)regs[R1]);
+	} else if (!strcmp(strtab + id, "bpf_memcmp")) {
+		regs[R0] = (u64)bpf_memcmp((void *)regs[R1], (void *)regs[R2],
+					   (long)regs[R3]);
+	} else if (!strcmp(strtab + id, "bpf_dump_stack")) {
+		bpf_dump_stack((struct bpf_context *)regs[R1]);
+	} else if (!strcmp(strtab + id, "bpf_trace_printk")) {
+		bpf_trace_printk((char *)regs[R1], (long)regs[R2],
+				 (long)regs[R3], (long)regs[R4],
+				 (long)regs[R5]);
+	} else {
+		pr_err_once("trace cannot execute unknown bpf function %d '%s'\n",
+			    id, strtab + id);
+	}
+}
+
+static void *jit_select_func(char *strtab, int id)
+{
+	if (!strcmp(strtab + id, "bpf_load_pointer"))
+		return bpf_load_pointer;
+
+	if (!strcmp(strtab + id, "bpf_memcmp"))
+		return bpf_memcmp;
+
+	if (!strcmp(strtab + id, "bpf_dump_stack"))
+		return bpf_dump_stack;
+
+	if (!strcmp(strtab + id, "bpf_trace_printk"))
+		return bpf_trace_printk;
+
+	return NULL;
+}
+
+struct bpf_callbacks bpf_trace_cb = {
+	execute_func, jit_select_func, get_func_proto, get_context_access
+};
+
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 815c878..1a7762b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1791,6 +1791,13 @@ void __trace_stack(struct trace_array *tr, unsigned long flags, int skip,
 	__ftrace_trace_stack(tr->trace_buffer.buffer, flags, skip, pc, NULL);
 }
 
+void __trace_stack_regs(unsigned long flags, int skip, int pc,
+			struct pt_regs *regs)
+{
+	__ftrace_trace_stack(global_trace.trace_buffer.buffer, flags, skip,
+			     pc, regs);
+}
+
 /**
  * trace_dump_stack - record a stack back trace in the trace buffer
  * @skip: Number of functions to skip (helper handlers)
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 02b592f..fa7db5f 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -619,6 +619,8 @@ void ftrace_trace_userstack(struct ring_buffer *buffer, unsigned long flags,
 
 void __trace_stack(struct trace_array *tr, unsigned long flags, int skip,
 		   int pc);
+void __trace_stack_regs(unsigned long flags, int skip, int pc,
+			struct pt_regs *regs);
 #else
 static inline void ftrace_trace_stack(struct ring_buffer *buffer,
 				      unsigned long flags, int skip, int pc)
@@ -640,6 +642,10 @@ static inline void __trace_stack(struct trace_array *tr, unsigned long flags,
 				 int skip, int pc)
 {
 }
+static inline void __trace_stack_regs(unsigned long flags, int skip, int pc,
+				      struct pt_regs *regs)
+{
+}
 #endif /* CONFIG_STACKTRACE */
 
 extern cycle_t ftrace_now(int cpu);
@@ -939,12 +945,15 @@ struct ftrace_event_field {
 	int			is_signed;
 };
 
+struct bpf_program;
+
 struct event_filter {
 	int			n_preds;	/* Number assigned */
 	int			a_preds;	/* allocated */
 	struct filter_pred	*preds;
 	struct filter_pred	*root;
 	char			*filter_string;
+	struct bpf_program	*prog;
 };
 
 struct event_subsystem {
@@ -1017,7 +1026,7 @@ filter_parse_regex(char *buff, int len, char **search, int *not);
 extern void print_event_filter(struct ftrace_event_file *file,
 			       struct trace_seq *s);
 extern int apply_event_filter(struct ftrace_event_file *file,
-			      char *filter_string);
+			      char *filter_string, int filter_len);
 extern int apply_subsystem_event_filter(struct ftrace_subsystem_dir *dir,
 					char *filter_string);
 extern void print_subsystem_event_filter(struct event_subsystem *system,
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index e71ffd4..b6aadc3 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1042,9 +1042,16 @@ event_filter_write(struct file *filp, const char __user *ubuf, size_t cnt,
 	mutex_lock(&event_mutex);
 	file = event_file_data(filp);
 	if (file)
-		err = apply_event_filter(file, buf);
+		err = apply_event_filter(file, buf, cnt);
 	mutex_unlock(&event_mutex);
 
+	if (file->event_call->flags & TRACE_EVENT_FL_BPF)
+		/*
+		 * allocate per-cpu printk buffers, since BPF program
+		 * might be calling bpf_trace_printk
+		 */
+		trace_printk_init_buffers();
+
 	free_page((unsigned long) buf);
 	if (err < 0)
 		return err;
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 8a86319..d4fb09c 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -23,6 +23,8 @@
 #include <linux/mutex.h>
 #include <linux/perf_event.h>
 #include <linux/slab.h>
+#include <linux/bpf_jit.h>
+#include <trace/bpf_trace.h>
 
 #include "trace.h"
 #include "trace_output.h"
@@ -535,6 +537,20 @@ static int filter_match_preds_cb(enum move_type move, struct filter_pred *pred,
 	return WALK_PRED_DEFAULT;
 }
 
+void filter_call_bpf(struct event_filter *filter, struct bpf_context *ctx)
+{
+	BUG_ON(!filter || !filter->prog);
+
+	if (!filter->prog->jit_image) {
+		pr_warn_once("BPF jit image is not available. Fallback to emulation\n");
+		bpf_run(filter->prog, ctx);
+		return;
+	}
+
+	filter->prog->jit_image(ctx);
+}
+EXPORT_SYMBOL_GPL(filter_call_bpf);
+
 /* return 1 if event matches, 0 otherwise (discard) */
 int filter_match_preds(struct event_filter *filter, void *rec)
 {
@@ -794,6 +810,7 @@ static void __free_filter(struct event_filter *filter)
 	if (!filter)
 		return;
 
+	bpf_free(filter->prog);
 	__free_preds(filter);
 	kfree(filter->filter_string);
 	kfree(filter);
@@ -1898,6 +1915,37 @@ static int create_filter_start(char *filter_str, bool set_str,
 	return err;
 }
 
+static int create_filter_bpf(char *filter_str, int filter_len,
+			     struct event_filter **filterp)
+{
+	struct event_filter *filter;
+	int err = 0;
+
+	*filterp = NULL;
+
+	filter = __alloc_filter();
+	if (filter)
+		err = replace_filter_string(filter, "bpf");
+
+	if (!filter || err) {
+		__free_filter(filter);
+		return -ENOMEM;
+	}
+
+	err = bpf_load_image(filter_str, filter_len, &bpf_trace_cb,
+			     &filter->prog);
+
+	if (err) {
+		pr_err("failed to load bpf %d\n", err);
+		__free_filter(filter);
+		return -EACCES;
+	}
+
+	*filterp = filter;
+
+	return err;
+}
+
 static void create_filter_finish(struct filter_parse_state *ps)
 {
 	if (ps) {
@@ -1985,7 +2033,8 @@ static int create_system_filter(struct event_subsystem *system,
 }
 
 /* caller must hold event_mutex */
-int apply_event_filter(struct ftrace_event_file *file, char *filter_string)
+int apply_event_filter(struct ftrace_event_file *file, char *filter_string,
+		       int filter_len)
 {
 	struct ftrace_event_call *call = file->event_call;
 	struct event_filter *filter;
@@ -2007,7 +2056,15 @@ int apply_event_filter(struct ftrace_event_file *file, char *filter_string)
 		return 0;
 	}
 
-	err = create_filter(call, filter_string, true, &filter);
+	if (!strcmp(filter_string, "bpf")) {
+		err = create_filter_bpf(filter_string, filter_len, &filter);
+		if (!err)
+			call->flags |= TRACE_EVENT_FL_BPF;
+	} else {
+		err = create_filter(call, filter_string, true, &filter);
+		if (!err)
+			call->flags &= ~TRACE_EVENT_FL_BPF;
+	}
 
 	/*
 	 * Always swap the call filter with the new filter
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index bdbae45..1e508d2 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -19,7 +19,7 @@
 
 #include <linux/module.h>
 #include <linux/uaccess.h>
-
+#include <trace/bpf_trace.h>
 #include "trace_probe.h"
 
 #define KPROBE_EVENT_SYSTEM "kprobes"
@@ -936,6 +936,19 @@ __kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
 	if (ftrace_trigger_soft_disabled(ftrace_file))
 		return;
 
+	if (unlikely(ftrace_file->flags & FTRACE_EVENT_FL_FILTERED) &&
+	    unlikely(ftrace_file->event_call->flags & TRACE_EVENT_FL_BPF)) {
+		struct bpf_context ctx;
+		ctx.regs = regs;
+		ctx.arg1 = regs_get_argument_nth(regs, 0);
+		ctx.arg2 = regs_get_argument_nth(regs, 1);
+		ctx.arg3 = regs_get_argument_nth(regs, 2);
+		ctx.arg4 = regs_get_argument_nth(regs, 3);
+		ctx.arg5 = regs_get_argument_nth(regs, 4);
+		filter_call_bpf(ftrace_file->filter, &ctx);
+		return;
+	}
+
 	local_save_flags(irq_flags);
 	pc = preempt_count();
 
-- 
1.7.9.5



* [RFC PATCH v2 tip 6/7] LLVM BPF backend
  2014-02-06  1:10 [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Alexei Starovoitov
                   ` (4 preceding siblings ...)
  2014-02-06  1:10 ` [RFC PATCH v2 tip 5/7] use BPF in tracing filters Alexei Starovoitov
@ 2014-02-06  1:10 ` Alexei Starovoitov
  2014-02-06  1:10 ` [RFC PATCH v2 tip 7/7] tracing filter examples in BPF Alexei Starovoitov
  2014-02-06 10:42 ` [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Daniel Borkmann
  7 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-06  1:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David S. Miller, Steven Rostedt, Peter Zijlstra, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
	Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
	Christoph Hellwig, linux-kernel, netdev

standalone BPF backend for LLVM 3.2, 3.3 and 3.4
See tools/bpf/llvm/README.txt

Written in LLVM coding style and under the LLVM license.

Most of lib/Target/BPF/* is boilerplate code that is required for
any LLVM backend.

The backend enforces the presence of a 'license' section in the
source C file.

Makefile* is a simplified version of the LLVM build system.
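
For illustration, the smallest C source the backend will accept might
look like the sketch below. Only the 'license' section requirement and
the clang/llc pipeline (see README.txt in this patch) are taken from
the series; the file itself is hypothetical:

	struct bpf_context;

	/* the backend refuses to emit code without a "license" section */
	char license[] __attribute__((section("license"), used)) = "GPL";

	void empty_filter(struct bpf_context *ctx)
	{
		/* build with:
		 * clang -O2 -emit-llvm -c min.c -o - | llc -o min.bpf
		 */
	}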

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 tools/bpf/llvm/LICENSE.TXT                         |   70 ++
 tools/bpf/llvm/Makefile.rules                      |  641 +++++++++++++++++++
 tools/bpf/llvm/README.txt                          |   23 +
 tools/bpf/llvm/bld/.gitignore                      |    2 +
 tools/bpf/llvm/bld/Makefile                        |   27 +
 tools/bpf/llvm/bld/Makefile.common                 |   14 +
 tools/bpf/llvm/bld/Makefile.config                 |  124 ++++
 .../llvm/bld/include/llvm/Config/AsmParsers.def    |    8 +
 .../llvm/bld/include/llvm/Config/AsmPrinters.def   |    9 +
 .../llvm/bld/include/llvm/Config/Disassemblers.def |    8 +
 tools/bpf/llvm/bld/include/llvm/Config/Targets.def |    9 +
 .../bpf/llvm/bld/include/llvm/Support/DataTypes.h  |   96 +++
 tools/bpf/llvm/bld/lib/Makefile                    |   11 +
 .../llvm/bld/lib/Target/BPF/InstPrinter/Makefile   |   10 +
 .../llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile  |   11 +
 tools/bpf/llvm/bld/lib/Target/BPF/Makefile         |   17 +
 .../llvm/bld/lib/Target/BPF/TargetInfo/Makefile    |   10 +
 tools/bpf/llvm/bld/lib/Target/Makefile             |   11 +
 tools/bpf/llvm/bld/tools/Makefile                  |   12 +
 tools/bpf/llvm/bld/tools/llc/Makefile              |   15 +
 tools/bpf/llvm/lib/Target/BPF/BPF.h                |   30 +
 tools/bpf/llvm/lib/Target/BPF/BPF.td               |   29 +
 tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp    |  100 +++
 tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp      |   62 ++
 tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td    |   24 +
 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp |   36 ++
 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h   |   35 +
 tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp  |  182 ++++++
 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp  |  676 ++++++++++++++++++++
 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h    |  105 +++
 tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td   |   29 +
 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp     |  162 +++++
 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h       |   53 ++
 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td      |  455 +++++++++++++
 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp   |   77 +++
 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h     |   40 ++
 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp  |  122 ++++
 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h    |   65 ++
 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td   |   39 ++
 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp     |   23 +
 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h       |   33 +
 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp |   72 +++
 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h   |   69 ++
 .../lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp  |   79 +++
 .../lib/Target/BPF/InstPrinter/BPFInstPrinter.h    |   34 +
 .../lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp  |   85 +++
 .../llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h |   33 +
 .../Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp |  119 ++++
 .../lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h     |   34 +
 .../Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp   |  120 ++++
 .../lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h |   67 ++
 .../Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp    |  115 ++++
 .../lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h  |   56 ++
 .../lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp    |   13 +
 tools/bpf/llvm/tools/llc/llc.cpp                   |  381 +++++++++++
 55 files changed, 4782 insertions(+)
 create mode 100644 tools/bpf/llvm/LICENSE.TXT
 create mode 100644 tools/bpf/llvm/Makefile.rules
 create mode 100644 tools/bpf/llvm/README.txt
 create mode 100644 tools/bpf/llvm/bld/.gitignore
 create mode 100644 tools/bpf/llvm/bld/Makefile
 create mode 100644 tools/bpf/llvm/bld/Makefile.common
 create mode 100644 tools/bpf/llvm/bld/Makefile.config
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/AsmParsers.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/AsmPrinters.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/Disassemblers.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/Targets.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Support/DataTypes.h
 create mode 100644 tools/bpf/llvm/bld/lib/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/InstPrinter/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/TargetInfo/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/Makefile
 create mode 100644 tools/bpf/llvm/bld/tools/Makefile
 create mode 100644 tools/bpf/llvm/bld/tools/llc/Makefile
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPF.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPF.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp
 create mode 100644 tools/bpf/llvm/tools/llc/llc.cpp

diff --git a/tools/bpf/llvm/LICENSE.TXT b/tools/bpf/llvm/LICENSE.TXT
new file mode 100644
index 0000000..00cf601
--- /dev/null
+++ b/tools/bpf/llvm/LICENSE.TXT
@@ -0,0 +1,70 @@
+==============================================================================
+LLVM Release License
+==============================================================================
+University of Illinois/NCSA
+Open Source License
+
+Copyright (c) 2003-2012 University of Illinois at Urbana-Champaign.
+All rights reserved.
+
+Developed by:
+
+    LLVM Team
+
+    University of Illinois at Urbana-Champaign
+
+    http://llvm.org
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+this software and associated documentation files (the "Software"), to deal with
+the Software without restriction, including without limitation the rights to
+use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
+of the Software, and to permit persons to whom the Software is furnished to do
+so, subject to the following conditions:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimers.
+
+    * Redistributions in binary form must reproduce the above copyright notice,
+      this list of conditions and the following disclaimers in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the names of the LLVM Team, University of Illinois at
+      Urbana-Champaign, nor the names of its contributors may be used to
+      endorse or promote products derived from this Software without specific
+      prior written permission.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
+FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
+CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS WITH THE
+SOFTWARE.
+
+==============================================================================
+Copyrights and Licenses for Third Party Software Distributed with LLVM:
+==============================================================================
+The LLVM software contains code written by third parties.  Such software will
+have its own individual LICENSE.TXT file in the directory in which it appears.
+This file will describe the copyrights, license, and restrictions which apply
+to that code.
+
+The disclaimer of warranty in the University of Illinois Open Source License
+applies to all code in the LLVM Distribution, and nothing in any of the
+other licenses gives permission to use the names of the LLVM Team or the
+University of Illinois to endorse or promote products derived from this
+Software.
+
+The following pieces of software have additional or alternate copyrights,
+licenses, and/or restrictions:
+
+Program             Directory
+-------             ---------
+Autoconf            llvm/autoconf
+                    llvm/projects/ModuleMaker/autoconf
+                    llvm/projects/sample/autoconf
+CellSPU backend     llvm/lib/Target/CellSPU/README.txt
+Google Test         llvm/utils/unittest/googletest
+OpenBSD regex       llvm/lib/Support/{reg*, COPYRIGHT.regex}
+pyyaml tests        llvm/test/YAMLParser/{*.data, LICENSE.TXT}
diff --git a/tools/bpf/llvm/Makefile.rules b/tools/bpf/llvm/Makefile.rules
new file mode 100644
index 0000000..9689527
--- /dev/null
+++ b/tools/bpf/llvm/Makefile.rules
@@ -0,0 +1,641 @@
+#===-- Makefile.rules - Common make rules for LLVM ---------*- Makefile -*--===#
+#
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+#
+# This file is included by all of the LLVM makefiles.  For details on how to use
+# it properly, please see the document MakefileGuide.html in the docs directory.
+
+# TARGETS: Define standard targets that can be invoked
+
+# Define the various target sets
+RecursiveTargets := all clean clean-all install uninstall
+LocalTargets     := all-local clean-local clean-all-local check-local \
+                    install-local uninstall-local
+TopLevelTargets  := check dist-clean
+UserTargets      := $(RecursiveTargets) $(LocalTargets) $(TopLevelTargets)
+InternalTargets  := preconditions
+
+# INITIALIZATION: Basic things the makefile needs
+
+# Set the VPATH so that we can find source files.
+VPATH=$(PROJ_SRC_DIR)
+
+# Reset the list of suffixes we know how to build.
+.SUFFIXES:
+.SUFFIXES: .c .cpp .cc .h .hpp .o .a
+.SUFFIXES: $(SHLIBEXT) $(SUFFIXES)
+
+# Mark all of these targets as phony to avoid implicit rule search
+.PHONY: $(UserTargets) $(InternalTargets)
+
+# Make sure all the user-target rules are double colon rules and
+# they are defined first.
+
+$(UserTargets)::
+
+# PRECONDITIONS: that which must be built/checked first
+
+SrcMakefiles       := $(filter %Makefile %Makefile.tests,\
+                      $(wildcard $(PROJ_SRC_DIR)/Makefile*))
+ObjMakefiles       := $(subst $(PROJ_SRC_DIR),$(PROJ_OBJ_DIR),$(SrcMakefiles))
+MakefileConfig     := $(PROJ_OBJ_ROOT)/Makefile.config
+MakefileCommon     := $(PROJ_OBJ_ROOT)/Makefile.common
+PreConditions      := $(ObjMakefiles)
+PreConditions      += $(MakefileCommon)
+PreConditions      += $(MakefileConfig)
+
+preconditions: $(PreConditions)
+
+# Make sure the BUILT_SOURCES are built first
+$(filter-out clean clean-local,$(UserTargets)):: $(BUILT_SOURCES)
+
+clean-all-local::
+ifneq ($(strip $(BUILT_SOURCES)),)
+	-$(Verb) $(RM) -f $(BUILT_SOURCES)
+endif
+
+$(BUILT_SOURCES) : $(ObjMakefiles)
+
+ifndef PROJ_MAKEFILE
+PROJ_MAKEFILE := $(PROJ_OBJ_DIR)/Makefile
+endif
+
+# Set up the basic dependencies
+$(UserTargets):: $(PreConditions)
+
+all:: all-local
+clean:: clean-local
+clean-all:: clean-local clean-all-local
+install:: install-local
+uninstall:: uninstall-local
+install-local:: all-local
+
+# VARIABLES: Set up various variables based on configuration data
+
+# Variable for if this make is for a "cleaning" target
+ifneq ($(strip $(filter clean clean-local dist-clean,$(MAKECMDGOALS))),)
+  IS_CLEANING_TARGET=1
+endif
+
+# Variables derived from configuration we are building
+
+CPP.Defines :=
+ifeq ($(ENABLE_OPTIMIZED),1)
+  BuildMode := Release
+  OmitFramePointer := -fomit-frame-pointer
+
+  CXX.Flags += $(OPTIMIZE_OPTION) $(OmitFramePointer)
+  C.Flags   += $(OPTIMIZE_OPTION) $(OmitFramePointer)
+  LD.Flags  += $(OPTIMIZE_OPTION)
+  ifdef DEBUG_SYMBOLS
+    BuildMode := $(BuildMode)+Debug
+    CXX.Flags += -g
+    C.Flags   += -g
+    LD.Flags  += -g
+    KEEP_SYMBOLS := 1
+  endif
+else
+  ifdef NO_DEBUG_SYMBOLS
+    BuildMode := Unoptimized
+    CXX.Flags +=
+    C.Flags   +=
+    LD.Flags  +=
+    KEEP_SYMBOLS := 1
+  else
+    BuildMode := Debug
+    CXX.Flags += -g
+    C.Flags   += -g
+    LD.Flags  += -g
+    KEEP_SYMBOLS := 1
+  endif
+endif
+
+ifeq ($(ENABLE_WERROR),1)
+  CXX.Flags += -Werror
+  C.Flags += -Werror
+endif
+
+ifeq ($(ENABLE_VISIBILITY_INLINES_HIDDEN),1)
+    CXX.Flags += -fvisibility-inlines-hidden
+endif
+
+CXX.Flags += -fno-exceptions
+
+CXX.Flags += -fno-rtti
+
+# If DISABLE_ASSERTIONS=1 is specified (make command line or configured),
+# then disable assertions by defining the appropriate preprocessor symbols.
+ifeq ($(DISABLE_ASSERTIONS),1)
+  CPP.Defines += -DNDEBUG
+else
+  BuildMode := $(BuildMode)+Asserts
+  CPP.Defines += -D_DEBUG
+endif
+
+# If ENABLE_EXPENSIVE_CHECKS=1 is specified (make command line or
+# configured), then enable expensive checks by defining the
+# appropriate preprocessor symbols.
+ifeq ($(ENABLE_EXPENSIVE_CHECKS),1)
+  BuildMode := $(BuildMode)+Checks
+  CPP.Defines += -DXDEBUG
+endif
+
+DOTDIR_TIMESTAMP_COMMAND := $(DATE)
+
+CXX.Flags     += -Woverloaded-virtual
+CPP.BaseFlags += $(CPP.Defines)
+AR.Flags      := cru
+
+# Directory locations
+
+ObjRootDir  := $(PROJ_OBJ_DIR)/$(BuildMode)
+ObjDir      := $(ObjRootDir)
+LibDir      := $(PROJ_OBJ_ROOT)/$(BuildMode)/lib
+ToolDir     := $(PROJ_OBJ_ROOT)/$(BuildMode)/bin
+ExmplDir    := $(PROJ_OBJ_ROOT)/$(BuildMode)/examples
+LLVMLibDir  := $(LLVM_OBJ_ROOT)/$(BuildMode)/lib
+LLVMToolDir := $(LLVM_OBJ_ROOT)/$(BuildMode)/bin
+LLVMExmplDir:= $(LLVM_OBJ_ROOT)/$(BuildMode)/examples
+
+# Locations of shared libraries
+SharedPrefix     := lib
+SharedLibDir     := $(LibDir)
+LLVMSharedLibDir := $(LLVMLibDir)
+
+# Full Paths To Compiled Tools and Utilities
+EchoCmd  := $(ECHO) llvm[$(MAKELEVEL)]:
+
+Echo     := @$(EchoCmd)
+LLVMToolDir := $(shell $(LLVM_CONFIG) --bindir)
+LLVMLibDir := $(shell $(LLVM_CONFIG) --libdir)
+LLVMIncludeDir := $(shell $(LLVM_CONFIG) --includedir)
+ifndef LLVM_TBLGEN
+LLVM_TBLGEN   := $(LLVMToolDir)/llvm-tblgen$(EXEEXT)
+endif
+
+SharedLinkOptions=-shared
+
+ifdef TOOL_VERBOSE
+  C.Flags += -v
+  CXX.Flags += -v
+  LD.Flags += -v
+  VERBOSE := 1
+endif
+
+# Adjust settings for verbose mode
+ifndef VERBOSE
+  Verb := @
+  AR.Flags += >/dev/null 2>/dev/null
+endif
+
+# By default, strip symbol information from executable
+ifndef KEEP_SYMBOLS
+  Strip := $(PLATFORMSTRIPOPTS)
+  StripWarnMsg := "(without symbols)"
+  Install.StripFlag += -s
+endif
+
+ifdef TOOL_NO_EXPORTS
+  DynamicFlags :=
+else
+  DynamicFlag := $(RDYNAMIC)
+endif
+
+# Adjust linker flags for building an executable
+ifdef TOOLNAME
+  LD.Flags += $(RPATH) -Wl,'$$ORIGIN/../lib'
+  LD.Flags += $(RPATH) -Wl,$(ToolDir) $(DynamicFlag)
+endif
+
+# Options To Invoke Tools
+ifdef EXTRA_LD_OPTIONS
+LD.Flags += $(EXTRA_LD_OPTIONS)
+endif
+
+ifndef NO_PEDANTIC
+CompileCommonOpts += -pedantic -Wno-long-long
+endif
+CompileCommonOpts += -Wall -W -Wno-unused-parameter -Wwrite-strings \
+                     $(EXTRA_OPTIONS)
+# Enable cast-qual for C++; the workaround is to use const_cast.
+CXX.Flags += -Wcast-qual
+
+LD.Flags    += -L$(LibDir) -L$(LLVMLibDir)
+
+CPP.BaseFlags += -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
+# All -I flags should go here, so that they don't confuse llvm-config.
+CPP.Flags     += $(sort -I$(PROJ_OBJ_DIR) -I$(PROJ_SRC_DIR) \
+	         $(patsubst %,-I%/include,\
+	         $(PROJ_OBJ_ROOT) $(PROJ_SRC_ROOT) \
+	         $(LLVM_OBJ_ROOT) $(LLVM_SRC_ROOT))) \
+	         -I$(LLVMIncludeDir) $(CPP.BaseFlags)
+
+Compile.Wrapper :=
+
+Compile.C     = $(Compile.Wrapper) \
+	          $(CC) $(CPP.Flags) $(C.Flags) $(CFLAGS) $(CPPFLAGS) \
+                $(TargetCommonOpts) $(CompileCommonOpts) -c
+Compile.CXX   = $(Compile.Wrapper) \
+	          $(CXX) $(CPP.Flags) $(CXX.Flags) $(CXXFLAGS) $(CPPFLAGS) \
+                $(TargetCommonOpts) $(CompileCommonOpts) -c
+Link          = $(Compile.Wrapper) \
+	          $(CXX) $(CPP.Flags) $(CXX.Flags) $(CXXFLAGS) $(LD.Flags) \
+                $(LDFLAGS) $(TargetCommonOpts)  $(CompileCommonOpts) $(Strip)
+
+ProgInstall   = $(INSTALL) $(Install.StripFlag) -m 0755
+ScriptInstall = $(INSTALL) -m 0755
+DataInstall   = $(INSTALL) -m 0644
+
+TableGen.Flags= -I $(call SYSPATH, $(PROJ_SRC_DIR)) \
+                -I $(call SYSPATH, $(LLVMIncludeDir)) \
+                -I $(call SYSPATH, $(PROJ_SRC_ROOT)/include) \
+                -I $(call SYSPATH, $(PROJ_SRC_ROOT)/lib/Target)
+LLVMTableGen  = $(LLVM_TBLGEN) $(TableGen.Flags)
+
+Archive       = $(AR) $(AR.Flags)
+ifdef RANLIB
+Ranlib        = $(RANLIB)
+else
+Ranlib        = ranlib
+endif
+
+AliasTool     = ln -s
+
+# Get the list of source files and compute object file
+# names from them.
+ifndef SOURCES
+  Sources := $(notdir $(wildcard $(PROJ_SRC_DIR)/*.cpp \
+             $(PROJ_SRC_DIR)/*.cc $(PROJ_SRC_DIR)/*.c))
+else
+  Sources := $(SOURCES)
+endif
+
+ifdef BUILT_SOURCES
+Sources += $(filter %.cpp %.c %.cc,$(BUILT_SOURCES))
+endif
+
+BaseNameSources := $(sort $(basename $(Sources)))
+
+ObjectsO  := $(BaseNameSources:%=$(ObjDir)/%.o)
+
+ECHOPATH := $(Verb)$(ECHO)
+
+# DIRECTORIES: Handle recursive descent of directory structure
+
+# Provide rules to make install dirs. This must be early
+# in the file so they get built before dependencies
+
+$(DESTDIR)$(PROJ_bindir)::
+	$(Verb) $(MKDIR) $@
+
+# To create other directories, as needed, and timestamp their creation
+%/.dir:
+	$(Verb) $(MKDIR) $* > /dev/null
+	$(Verb) $(DOTDIR_TIMESTAMP_COMMAND) > $@
+
+.PRECIOUS: $(ObjDir)/.dir $(LibDir)/.dir $(ToolDir)/.dir $(ExmplDir)/.dir
+.PRECIOUS: $(LLVMLibDir)/.dir $(LLVMToolDir)/.dir $(LLVMExmplDir)/.dir
+
+# Handle the DIRS options for sequential construction
+
+SubDirs :=
+ifdef DIRS
+SubDirs += $(DIRS)
+
+ifneq ($(PROJ_SRC_ROOT),$(PROJ_OBJ_ROOT))
+$(RecursiveTargets)::
+	$(Verb) for dir in $(DIRS); do \
+	  if ([ ! -f $$dir/Makefile ] || \
+	      command test $$dir/Makefile -ot $(PROJ_SRC_DIR)/$$dir/Makefile ); then \
+	    $(MKDIR) $$dir; \
+	    $(CP) $(PROJ_SRC_DIR)/$$dir/Makefile $$dir/Makefile; \
+	  fi; \
+	  ($(MAKE) -C $$dir $@ ) || exit 1; \
+	done
+else
+$(RecursiveTargets)::
+	$(Verb) for dir in $(DIRS); do \
+	  ($(MAKE) -C $$dir $@ ) || exit 1; \
+	done
+endif
+
+endif
+
+# Handle the PARALLEL_DIRS options for parallel construction
+ifdef PARALLEL_DIRS
+
+SubDirs += $(PARALLEL_DIRS)
+
+# Unfortunately, this list must be maintained if new recursive targets are added
+all      :: $(addsuffix /.makeall      ,$(PARALLEL_DIRS))
+clean    :: $(addsuffix /.makeclean    ,$(PARALLEL_DIRS))
+clean-all:: $(addsuffix /.makeclean-all,$(PARALLEL_DIRS))
+install  :: $(addsuffix /.makeinstall  ,$(PARALLEL_DIRS))
+uninstall:: $(addsuffix /.makeuninstall,$(PARALLEL_DIRS))
+
+ParallelTargets := $(foreach T,$(RecursiveTargets),%/.make$(T))
+
+$(ParallelTargets) :
+	$(Verb) \
+	  SD=$(PROJ_SRC_DIR)/$(@D); \
+	  DD=$(@D); \
+	  if [ ! -f $$SD/Makefile ]; then \
+	    SD=$(@D); \
+	    DD=$(notdir $(@D)); \
+	  fi; \
+	  if ([ ! -f $$DD/Makefile ] || \
+	            command test $$DD/Makefile -ot \
+                      $$SD/Makefile ); then \
+	  $(MKDIR) $$DD; \
+	  $(CP) $$SD/Makefile $$DD/Makefile; \
+	fi; \
+	$(MAKE) -C $$DD $(subst $(@D)/.make,,$@)
+endif
+
+# Set up variables for building libraries
+
+# Define various command line options pertaining to the
+# libraries needed when linking. There are "Proj" libs
+# (defined by the user's project) and "LLVM" libs (defined
+# by the LLVM project).
+
+ifdef USEDLIBS
+ProjLibsOptions := $(patsubst %.a.o, -l%, $(addsuffix .o, $(USEDLIBS)))
+ProjLibsOptions := $(patsubst %.o, $(LibDir)/%.o,  $(ProjLibsOptions))
+ProjUsedLibs    := $(patsubst %.a.o, lib%.a, $(addsuffix .o, $(USEDLIBS)))
+ProjLibsPaths   := $(addprefix $(LibDir)/,$(ProjUsedLibs))
+endif
+
+ifdef LLVMLIBS
+LLVMLibsOptions := $(patsubst %.a.o, -l%, $(addsuffix .o, $(LLVMLIBS)))
+LLVMLibsOptions := $(patsubst %.o, $(LLVMLibDir)/%.o, $(LLVMLibsOptions))
+LLVMUsedLibs    := $(patsubst %.a.o, lib%.a, $(addsuffix .o, $(LLVMLIBS)))
+LLVMLibsPaths   := $(addprefix $(LLVMLibDir)/,$(LLVMUsedLibs))
+endif
+
+ifndef IS_CLEANING_TARGET
+ifdef LINK_COMPONENTS
+
+LLVMConfigLibs := $(shell $(LLVM_CONFIG) --libs $(LINK_COMPONENTS) || echo Error)
+ifeq ($(LLVMConfigLibs),Error)
+$(error llvm-config --libs failed)
+endif
+LLVMLibsOptions += $(LLVMConfigLibs)
+LLVMConfigLibfiles := $(shell $(LLVM_CONFIG) --libfiles $(LINK_COMPONENTS) || echo Error)
+ifeq ($(LLVMConfigLibfiles),Error)
+$(error llvm-config --libfiles failed)
+endif
+LLVMLibsPaths += $(LLVMConfigLibfiles)
+
+endif
+endif
+
+# Library Build Rules: Four ways to build a library
+
+# if we're building a library ...
+ifdef LIBRARYNAME
+
+# Make sure there isn't any extraneous whitespace on the LIBRARYNAME option
+LIBRARYNAME := $(strip $(LIBRARYNAME))
+BaseLibName.A  := lib$(LIBRARYNAME).a
+BaseLibName.SO := $(SharedPrefix)$(LIBRARYNAME)$(SHLIBEXT)
+LibName.A  := $(LibDir)/$(BaseLibName.A)
+LibName.SO := $(SharedLibDir)/$(BaseLibName.SO)
+LibName.O  := $(LibDir)/$(LIBRARYNAME).o
+
+# Library Targets:
+#   If neither BUILD_ARCHIVE or LOADABLE_MODULE are specified, default to
+#   building an archive.
+ifndef NO_BUILD_ARCHIVE
+ifndef BUILD_ARCHIVE
+ifndef LOADABLE_MODULE
+BUILD_ARCHIVE = 1
+endif
+endif
+endif
+
+# Archive Library Targets:
+#   If the user wanted a regular archive library built,
+#   then we provide targets for building them.
+ifdef BUILD_ARCHIVE
+
+all-local:: $(LibName.A)
+
+$(LibName.A): $(ObjectsO) $(LibDir)/.dir
+	$(Echo) Building $(BuildMode) Archive Library $(notdir $@)
+	-$(Verb) $(RM) -f $@
+	$(Verb) $(Archive) $@ $(ObjectsO)
+	$(Verb) $(Ranlib) $@
+
+clean-local::
+ifneq ($(strip $(LibName.A)),)
+	-$(Verb) $(RM) -f $(LibName.A)
+endif
+
+install-local::
+	$(Echo) Install circumvented with NO_INSTALL
+uninstall-local::
+	$(Echo) Uninstall circumvented with NO_INSTALL
+endif
+
+# endif LIBRARYNAME
+endif
+
+# Tool Build Rules: Build executable tool based on TOOLNAME option
+
+ifdef TOOLNAME
+
+# Set up variables for building a tool.
+TOOLEXENAME := $(strip $(TOOLNAME))$(EXEEXT)
+ToolBuildPath   := $(ToolDir)/$(TOOLEXENAME)
+
+# Provide targets for building the tools
+all-local:: $(ToolBuildPath)
+
+clean-local::
+ifneq ($(strip $(ToolBuildPath)),)
+	-$(Verb) $(RM) -f $(ToolBuildPath)
+endif
+
+$(ToolBuildPath): $(ToolDir)/.dir
+
+$(ToolBuildPath): $(ObjectsO) $(ProjLibsPaths) $(LLVMLibsPaths)
+	$(Echo) Linking $(BuildMode) executable $(TOOLNAME) $(StripWarnMsg)
+	$(Verb) $(Link) -o $@ $(TOOLLINKOPTS) $(ObjectsO) $(ProjLibsOptions) \
+	$(LLVMLibsOptions) $(ExtraLibs) $(TOOLLINKOPTSB) $(LIBS)
+	$(Echo) ======= Finished Linking $(BuildMode) Executable $(TOOLNAME) \
+          $(StripWarnMsg)
+
+ifdef NO_INSTALL
+install-local::
+	$(Echo) Install circumvented with NO_INSTALL
+uninstall-local::
+	$(Echo) Uninstall circumvented with NO_INSTALL
+else
+
+ToolBinDir = $(DESTDIR)$(PROJ_bindir)
+DestTool = $(ToolBinDir)/$(program_prefix)$(TOOLEXENAME)
+
+install-local:: $(DestTool)
+
+$(DestTool): $(ToolBuildPath)
+	$(Echo) Installing $(BuildMode) $(DestTool)
+	$(Verb) $(MKDIR) $(ToolBinDir)
+	$(Verb) $(ProgInstall) $(ToolBuildPath) $(DestTool)
+
+uninstall-local::
+	$(Echo) Uninstalling $(BuildMode) $(DestTool)
+	-$(Verb) $(RM) -f $(DestTool)
+
+endif
+endif
+
+# Create .o files in the ObjDir directory from the .cpp and .c files...
+
+DEPEND_OPTIONS = -MMD -MP -MF "$(ObjDir)/$*.d.tmp" \
+         -MT "$(ObjDir)/$*.o" -MT "$(ObjDir)/$*.d"
+
+# If the build succeeded, move the dependency file over, otherwise
+# remove it.
+DEPEND_MOVEFILE = then $(MV) -f "$(ObjDir)/$*.d.tmp" "$(ObjDir)/$*.d"; \
+                  else $(RM) "$(ObjDir)/$*.d.tmp"; exit 1; fi
+
+$(ObjDir)/%.o: %.cpp $(ObjDir)/.dir $(BUILT_SOURCES) $(PROJ_MAKEFILE)
+	$(Echo) "Compiling $*.cpp for $(BuildMode) build" $(PIC_FLAG)
+	$(Verb) if $(Compile.CXX) $(DEPEND_OPTIONS) $< -o $(ObjDir)/$*.o ; \
+	        $(DEPEND_MOVEFILE)
+
+$(ObjDir)/%.o: %.cc $(ObjDir)/.dir $(BUILT_SOURCES) $(PROJ_MAKEFILE)
+	$(Echo) "Compiling $*.cc for $(BuildMode) build" $(PIC_FLAG)
+	$(Verb) if $(Compile.CXX) $(DEPEND_OPTIONS) $< -o $(ObjDir)/$*.o ; \
+	        $(DEPEND_MOVEFILE)
+
+$(ObjDir)/%.o: %.c $(ObjDir)/.dir $(BUILT_SOURCES) $(PROJ_MAKEFILE)
+	$(Echo) "Compiling $*.c for $(BuildMode) build" $(PIC_FLAG)
+	$(Verb) if $(Compile.C) $(DEPEND_OPTIONS) $< -o $(ObjDir)/$*.o ; \
+	        $(DEPEND_MOVEFILE)
+
+# TABLEGEN: Provide rules for running tblgen to produce *.inc files
+
+ifdef TARGET
+TABLEGEN_INC_FILES_COMMON = 1
+endif
+
+ifdef TABLEGEN_INC_FILES_COMMON
+
+INCFiles := $(filter %.inc,$(BUILT_SOURCES))
+INCTMPFiles := $(INCFiles:%=$(ObjDir)/%.tmp)
+.PRECIOUS: $(INCTMPFiles) $(INCFiles)
+
+# INCFiles rule: All of the tblgen generated files are emitted to
+# $(ObjDir)/%.inc.tmp, instead of emitting them directly to %.inc.  This allows
+# us to only "touch" the real file if the contents of it change.  IOW, if
+# tblgen is modified, all of the .inc.tmp files are regenerated, but no
+# dependencies of the .inc files are, unless the contents of the .inc file
+# changes.
+$(INCFiles) : %.inc : $(ObjDir)/%.inc.tmp
+	$(Verb) $(CMP) -s $@ $< || $(CP) $< $@
+
+endif # TABLEGEN_INC_FILES_COMMON
+
+ifdef TARGET
+
+TDFiles := $(strip $(wildcard $(PROJ_SRC_DIR)/*.td) \
+           $(LLVMIncludeDir)/llvm/Target/Target.td \
+           $(LLVMIncludeDir)/llvm/Target/TargetCallingConv.td \
+           $(LLVMIncludeDir)/llvm/Target/TargetSchedule.td \
+           $(LLVMIncludeDir)/llvm/Target/TargetSelectionDAG.td \
+           $(LLVMIncludeDir)/llvm/CodeGen/ValueTypes.td) \
+           $(wildcard $(LLVMIncludeDir)/llvm/Intrinsics*.td)
+
+# All .inc.tmp files depend on the .td files.
+$(INCTMPFiles) : $(TDFiles)
+
+$(TARGET:%=$(ObjDir)/%GenRegisterInfo.inc.tmp): \
+$(ObjDir)/%GenRegisterInfo.inc.tmp : %.td $(ObjDir)/.dir $(LLVM_TBLGEN)
+	$(Echo) "Building $(<F) register info implementation with tblgen"
+	$(Verb) $(LLVMTableGen) -gen-register-info -o $(call SYSPATH, $@) $<
+
+$(TARGET:%=$(ObjDir)/%GenInstrInfo.inc.tmp): \
+$(ObjDir)/%GenInstrInfo.inc.tmp : %.td $(ObjDir)/.dir $(LLVM_TBLGEN)
+	$(Echo) "Building $(<F) instruction information with tblgen"
+	$(Verb) $(LLVMTableGen) -gen-instr-info -o $(call SYSPATH, $@) $<
+
+$(TARGET:%=$(ObjDir)/%GenAsmWriter.inc.tmp): \
+$(ObjDir)/%GenAsmWriter.inc.tmp : %.td $(ObjDir)/.dir $(LLVM_TBLGEN)
+	$(Echo) "Building $(<F) assembly writer with tblgen"
+	$(Verb) $(LLVMTableGen) -gen-asm-writer -o $(call SYSPATH, $@) $<
+
+$(TARGET:%=$(ObjDir)/%GenAsmMatcher.inc.tmp): \
+$(ObjDir)/%GenAsmMatcher.inc.tmp : %.td $(ObjDir)/.dir $(LLVM_TBLGEN)
+	$(Echo) "Building $(<F) assembly matcher with tblgen"
+	$(Verb) $(LLVMTableGen) -gen-asm-matcher -o $(call SYSPATH, $@) $<
+
+$(TARGET:%=$(ObjDir)/%GenMCCodeEmitter.inc.tmp): \
+$(ObjDir)/%GenMCCodeEmitter.inc.tmp: %.td $(ObjDir)/.dir $(LLVM_TBLGEN)
+	$(Echo) "Building $(<F) MC code emitter with tblgen"
+	$(Verb) $(LLVMTableGen) -gen-emitter -mc-emitter -o $(call SYSPATH, $@) $<
+
+$(TARGET:%=$(ObjDir)/%GenCodeEmitter.inc.tmp): \
+$(ObjDir)/%GenCodeEmitter.inc.tmp: %.td $(ObjDir)/.dir $(LLVM_TBLGEN)
+	$(Echo) "Building $(<F) code emitter with tblgen"
+	$(Verb) $(LLVMTableGen) -gen-emitter -o $(call SYSPATH, $@) $<
+
+$(TARGET:%=$(ObjDir)/%GenDAGISel.inc.tmp): \
+$(ObjDir)/%GenDAGISel.inc.tmp : %.td $(ObjDir)/.dir $(LLVM_TBLGEN)
+	$(Echo) "Building $(<F) DAG instruction selector implementation with tblgen"
+	$(Verb) $(LLVMTableGen) -gen-dag-isel -o $(call SYSPATH, $@) $<
+
+$(TARGET:%=$(ObjDir)/%GenSubtargetInfo.inc.tmp): \
+$(ObjDir)/%GenSubtargetInfo.inc.tmp : %.td $(ObjDir)/.dir $(LLVM_TBLGEN)
+	$(Echo) "Building $(<F) subtarget information with tblgen"
+	$(Verb) $(LLVMTableGen) -gen-subtarget -o $(call SYSPATH, $@) $<
+
+$(TARGET:%=$(ObjDir)/%GenCallingConv.inc.tmp): \
+$(ObjDir)/%GenCallingConv.inc.tmp : %.td $(ObjDir)/.dir $(LLVM_TBLGEN)
+	$(Echo) "Building $(<F) calling convention information with tblgen"
+	$(Verb) $(LLVMTableGen) -gen-callingconv -o $(call SYSPATH, $@) $<
+
+clean-local::
+	-$(Verb) $(RM) -f $(INCFiles)
+
+endif # TARGET
+
+# This rule ensures that header files that are removed still have a rule for
+# which they can be "generated."  This allows make to ignore them and
+# reproduce the dependency lists.
+%.h:: ;
+%.hpp:: ;
+
+# Define clean-local to clean the current directory. Note that this uses a
+# very conservative approach ensuring that empty variables do not cause
+# errors or disastrous removal.
+clean-local::
+ifneq ($(strip $(ObjRootDir)),)
+	-$(Verb) $(RM) -rf $(ObjRootDir)
+endif
+ifneq ($(strip $(SHLIBEXT)),) # Extra paranoia - make real sure SHLIBEXT is set
+	-$(Verb) $(RM) -f *$(SHLIBEXT)
+endif
+
+clean-all-local::
+	-$(Verb) $(RM) -rf Debug Release Profile
+
+
+# DEPENDENCIES: Include the dependency files if we should
+ifndef DISABLE_AUTO_DEPENDENCIES
+
+# If its not one of the cleaning targets
+ifndef IS_CLEANING_TARGET
+
+# Get the list of dependency files
+DependSourceFiles := $(basename $(filter %.cpp %.c %.cc %.m %.mm, $(Sources)))
+DependFiles := $(DependSourceFiles:%=$(PROJ_OBJ_DIR)/$(BuildMode)/%.d)
+
+-include $(DependFiles) ""
+
+endif
+
+endif
+
diff --git a/tools/bpf/llvm/README.txt b/tools/bpf/llvm/README.txt
new file mode 100644
index 0000000..6085afb
--- /dev/null
+++ b/tools/bpf/llvm/README.txt
@@ -0,0 +1,23 @@
+LLVM BPF backend:
+lib/Target/BPF/*.cpp
+
+Links with LLVM 3.2, 3.3 and 3.4
+
+prerequisites:
+apt-get install clang llvm-3.[234]-dev
+
+To build:
+$cd bld
+$make
+if 'llvm-config-3.2' is not found in PATH, build with:
+$make -j4 LLVM_CONFIG=/path_to/llvm-config
+
+To run:
+$clang -O2 -emit-llvm -c file.c -o -|./bld/Debug+Asserts/bin/llc -o file.bpf
+
+'clang' - unmodified clang, the same compiler used to build x86 code
+'llc' - LLVM bitcode to BPF compiler
+file.bpf - BPF binary image, see include/linux/bpf_jit.h
+
+$clang -O2 -emit-llvm -c file.c -o -|llc -filetype=asm -o file.s
+will emit human readable BPF assembler instead.
diff --git a/tools/bpf/llvm/bld/.gitignore b/tools/bpf/llvm/bld/.gitignore
new file mode 100644
index 0000000..c3fc209
--- /dev/null
+++ b/tools/bpf/llvm/bld/.gitignore
@@ -0,0 +1,2 @@
+*.inc
+Debug+Asserts
diff --git a/tools/bpf/llvm/bld/Makefile b/tools/bpf/llvm/bld/Makefile
new file mode 100644
index 0000000..7ac0938
--- /dev/null
+++ b/tools/bpf/llvm/bld/Makefile
@@ -0,0 +1,27 @@
+#===- ./Makefile -------------------------------------------*- Makefile -*--===#
+#
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+
+ifndef LLVM_CONFIG
+LLVM_CONFIG := llvm-config-3.2
+export LLVM_CONFIG
+endif
+
+LEVEL := .
+
+DIRS := lib tools
+
+include $(LEVEL)/Makefile.config
+
+# Include the main makefile machinery.
+include $(LLVM_SRC_ROOT)/Makefile.rules
+
+# NOTE: This needs to remain as the last target definition in this file so
+# that it gets executed last.
+all::
+	$(Echo) '*****' Completed $(BuildMode) Build
+
+# declare all targets at this level to be serial:
+.NOTPARALLEL:
+
diff --git a/tools/bpf/llvm/bld/Makefile.common b/tools/bpf/llvm/bld/Makefile.common
new file mode 100644
index 0000000..624f7d3
--- /dev/null
+++ b/tools/bpf/llvm/bld/Makefile.common
@@ -0,0 +1,14 @@
+#===-- Makefile.common - Common make rules for LLVM --------*- Makefile -*--===#
+#
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+
+# Configuration file to set paths specific to local installation of LLVM
+ifndef LLVM_OBJ_ROOT
+include $(LEVEL)/Makefile.config
+else
+include $(LLVM_OBJ_ROOT)/Makefile.config
+endif
+
+# Include all of the build rules used for making LLVM
+include $(LLVM_SRC_ROOT)/Makefile.rules
diff --git a/tools/bpf/llvm/bld/Makefile.config b/tools/bpf/llvm/bld/Makefile.config
new file mode 100644
index 0000000..d8eda05
--- /dev/null
+++ b/tools/bpf/llvm/bld/Makefile.config
@@ -0,0 +1,124 @@
+#===-- Makefile.config - Local configuration for LLVM ------*- Makefile -*--===#
+#
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+#
+# This file is included by Makefile.common.  It defines paths and other
+# values specific to a particular installation of LLVM.
+#
+
+# Directory Configuration
+#	This section of the Makefile determines what is where.  To be
+#	specific, there are several locations that need to be defined:
+#
+#	o LLVM_SRC_ROOT  : The root directory of the LLVM source code.
+#	o LLVM_OBJ_ROOT  : The root directory containing the built LLVM code.
+#
+#	o PROJ_SRC_DIR  : The directory containing the code to build.
+#	o PROJ_SRC_ROOT : The root directory of the code to build.
+#
+#	o PROJ_OBJ_DIR  : The directory in which compiled code will be placed.
+#	o PROJ_OBJ_ROOT : The root directory in which compiled code is placed.
+
+PWD := /bin/pwd
+
+# The macro below is expanded when 'realpath' is not built-in.
+# Built-in 'realpath' is available on GNU Make 3.81.
+realpath = $(shell cd $(1); $(PWD))
+
+PROJ_OBJ_DIR  := $(call realpath, .)
+PROJ_OBJ_ROOT := $(call realpath, $(PROJ_OBJ_DIR)/$(LEVEL))
+
+LLVM_SRC_ROOT   := $(call realpath, $(PROJ_OBJ_DIR)/$(LEVEL)/..)
+LLVM_OBJ_ROOT   := $(call realpath, $(PROJ_OBJ_DIR)/$(LEVEL))
+PROJ_SRC_ROOT   := $(LLVM_SRC_ROOT)
+PROJ_SRC_DIR    := $(LLVM_SRC_ROOT)$(patsubst $(PROJ_OBJ_ROOT)%,%,$(PROJ_OBJ_DIR))
+
+prefix          := /usr/local
+PROJ_prefix     := $(prefix)
+program_prefix  := 
+
+PROJ_bindir     := $(PROJ_prefix)/bin
+
+# Extra options to compile LLVM with
+EXTRA_OPTIONS=
+
+# Extra options to link LLVM with
+EXTRA_LD_OPTIONS=
+
+# Path to the C++ compiler to use.  This is an optional setting, which defaults
+# to whatever your gmake defaults to.
+CXX = g++
+
+# Path to the CC binary, which is used by testcases for native builds.
+CC := gcc
+
+# Linker flags.
+LDFLAGS+=
+
+# Path to the library archiver program.
+AR_PATH = ar
+AR = ar
+
+# The pathnames of the programs we require to build
+CMP        := /usr/bin/cmp
+CP         := /bin/cp
+DATE       := /bin/date
+INSTALL    := /usr/bin/install -c
+MKDIR      := mkdir -p
+MV         := /bin/mv
+RANLIB     := ranlib
+RM         := /bin/rm
+
+LIBS       := -lncurses -lpthread -ldl -lm
+
+# Targets that we should build
+TARGETS_TO_BUILD=BPF 
+
+# What to pass as rpath flag to g++
+RPATH := -Wl,-R
+
+# What to pass as -rdynamic flag to g++
+RDYNAMIC := -Wl,-export-dynamic
+
+# When ENABLE_WERROR is enabled, we'll pass -Werror on the command line
+ENABLE_WERROR = 0
+
+# When ENABLE_OPTIMIZED is enabled, LLVM code is optimized and output is put
+# into the "Release" directories. Otherwise, LLVM code is not optimized and
+# output is put in the "Debug" directories.
+#ENABLE_OPTIMIZED = 1
+
+# When DISABLE_ASSERTIONS is enabled, builds of all of the LLVM code will
+# exclude assertion checks, otherwise they are included.
+#DISABLE_ASSERTIONS = 1
+
+# When DEBUG_SYMBOLS is enabled, the compiler libraries will retain debug
+# symbols.
+#DEBUG_SYMBOLS = 1
+
+# When KEEP_SYMBOLS is enabled, installed executables will never have their
+# symbols stripped.
+#KEEP_SYMBOLS = 1
+
+# The compiler flags to use for optimized builds.
+OPTIMIZE_OPTION := -O3
+
+# Use -fvisibility-inlines-hidden?
+ENABLE_VISIBILITY_INLINES_HIDDEN := 1
+
+# This option tells the Makefiles to produce verbose output.
+# It essentially prints the commands that make is executing
+#VERBOSE = 1
+
+# Shared library extension for host platform.
+SHLIBEXT = .so
+
+# Executable file extension for host platform.
+EXEEXT = 
+
+# Things we just assume are "there"
+ECHO := echo
+
+SYSPATH = $(1)
+
diff --git a/tools/bpf/llvm/bld/include/llvm/Config/AsmParsers.def b/tools/bpf/llvm/bld/include/llvm/Config/AsmParsers.def
new file mode 100644
index 0000000..9efd8f4
--- /dev/null
+++ b/tools/bpf/llvm/bld/include/llvm/Config/AsmParsers.def
@@ -0,0 +1,8 @@
+/*===- llvm/Config/AsmParsers.def - LLVM Assembly Parsers -------*- C++ -*-===*\
+|* This file is distributed under the University of Illinois Open Source      *|
+|* License. See LICENSE.TXT for details.                                      *|
+\*===----------------------------------------------------------------------===*/
+#ifndef LLVM_ASM_PARSER
+#  error Please define the macro LLVM_ASM_PARSER(TargetName)
+#endif
+#undef LLVM_ASM_PARSER
diff --git a/tools/bpf/llvm/bld/include/llvm/Config/AsmPrinters.def b/tools/bpf/llvm/bld/include/llvm/Config/AsmPrinters.def
new file mode 100644
index 0000000..f212afa
--- /dev/null
+++ b/tools/bpf/llvm/bld/include/llvm/Config/AsmPrinters.def
@@ -0,0 +1,9 @@
+/*===- llvm/Config/AsmPrinters.def - LLVM Assembly Printers -----*- C++ -*-===*\
+|* This file is distributed under the University of Illinois Open Source      *|
+|* License. See LICENSE.TXT for details.                                      *|
+\*===----------------------------------------------------------------------===*/
+#ifndef LLVM_ASM_PRINTER
+#  error Please define the macro LLVM_ASM_PRINTER(TargetName)
+#endif
+LLVM_ASM_PRINTER(BPF) 
+#undef LLVM_ASM_PRINTER
diff --git a/tools/bpf/llvm/bld/include/llvm/Config/Disassemblers.def b/tools/bpf/llvm/bld/include/llvm/Config/Disassemblers.def
new file mode 100644
index 0000000..527473f
--- /dev/null
+++ b/tools/bpf/llvm/bld/include/llvm/Config/Disassemblers.def
@@ -0,0 +1,8 @@
+/*===- llvm/Config/Disassemblers.def - LLVM Assembly Parsers ----*- C++ -*-===*\
+|* This file is distributed under the University of Illinois Open Source      *|
+|* License. See LICENSE.TXT for details.                                      *|
+\*===----------------------------------------------------------------------===*/
+#ifndef LLVM_DISASSEMBLER
+#  error Please define the macro LLVM_DISASSEMBLER(TargetName)
+#endif
+#undef LLVM_DISASSEMBLER
diff --git a/tools/bpf/llvm/bld/include/llvm/Config/Targets.def b/tools/bpf/llvm/bld/include/llvm/Config/Targets.def
new file mode 100644
index 0000000..cb2852c
--- /dev/null
+++ b/tools/bpf/llvm/bld/include/llvm/Config/Targets.def
@@ -0,0 +1,9 @@
+/*===- llvm/Config/Targets.def - LLVM Target Architectures ------*- C++ -*-===*\
+|* This file is distributed under the University of Illinois Open Source      *|
+|* License. See LICENSE.TXT for details.                                      *|
+\*===----------------------------------------------------------------------===*/
+#ifndef LLVM_TARGET
+#  error Please define the macro LLVM_TARGET(TargetName)
+#endif
+LLVM_TARGET(BPF) 
+#undef LLVM_TARGET
diff --git a/tools/bpf/llvm/bld/include/llvm/Support/DataTypes.h b/tools/bpf/llvm/bld/include/llvm/Support/DataTypes.h
new file mode 100644
index 0000000..81328a6
--- /dev/null
+++ b/tools/bpf/llvm/bld/include/llvm/Support/DataTypes.h
@@ -0,0 +1,96 @@
+/* include/llvm/Support/DataTypes.h.  Generated from DataTypes.h.in by configure.  */
+/*===-- include/Support/DataTypes.h - Define fixed size types -----*- C -*-===*\
+|*                                                                            *|
+|*                     The LLVM Compiler Infrastructure                       *|
+|*                                                                            *|
+|* This file is distributed under the University of Illinois Open Source      *|
+|* License. See LICENSE.TXT for details.                                      *|
+|*                                                                            *|
+|*===----------------------------------------------------------------------===*|
+|*                                                                            *|
+|* This file contains definitions to figure out the size of _HOST_ data types.*|
+|* This file is important because different host OS's define different macros,*|
+|* which makes portability tough.  This file exports the following            *|
+|* definitions:                                                               *|
+|*                                                                            *|
+|*   [u]int(32|64)_t : typedefs for signed and unsigned 32/64 bit system types*|
+|*   [U]INT(8|16|32|64)_(MIN|MAX) : Constants for the min and max values.     *|
+|*                                                                            *|
+|* No library is required when using these functions.                         *|
+|*                                                                            *|
+|*===----------------------------------------------------------------------===*/
+
+/* Please leave this file C-compatible. */
+
+#ifndef SUPPORT_DATATYPES_H
+#define SUPPORT_DATATYPES_H
+
+#define HAVE_SYS_TYPES_H 1
+#define HAVE_INTTYPES_H 1
+#define HAVE_STDINT_H 1
+#define HAVE_UINT64_T 1
+/* #undef HAVE_U_INT64_T */
+
+#ifdef __cplusplus
+#include <cmath>
+#else
+#include <math.h>
+#endif
+
+/* Note that this header's correct operation depends on __STDC_LIMIT_MACROS
+   being defined.  We would define it here, but in order to prevent Bad Things
+   happening when system headers or C++ STL headers include stdint.h before we
+   define it here, we define it on the g++ command line (in Makefile.rules). */
+#if !defined(__STDC_LIMIT_MACROS)
+# error "Must #define __STDC_LIMIT_MACROS before #including Support/DataTypes.h"
+#endif
+
+#if !defined(__STDC_CONSTANT_MACROS)
+# error "Must #define __STDC_CONSTANT_MACROS before " \
+        "#including Support/DataTypes.h"
+#endif
+
+/* Note that <inttypes.h> includes <stdint.h>, if this is a C99 system. */
+#ifdef HAVE_SYS_TYPES_H
+#include <sys/types.h>
+#endif
+
+#ifdef HAVE_INTTYPES_H
+#include <inttypes.h>
+#endif
+
+#ifdef HAVE_STDINT_H
+#include <stdint.h>
+#endif
+
+/* Handle incorrect definition of uint64_t as u_int64_t */
+#ifndef HAVE_UINT64_T
+#ifdef HAVE_U_INT64_T
+typedef u_int64_t uint64_t;
+#else
+# error "Don't have a definition for uint64_t on this platform"
+#endif
+#endif
+
+/* Set defaults for constants which we cannot find. */
+#if !defined(INT64_MAX)
+# define INT64_MAX 9223372036854775807LL
+#endif
+#if !defined(INT64_MIN)
+# define INT64_MIN ((-INT64_MAX)-1)
+#endif
+#if !defined(UINT64_MAX)
+# define UINT64_MAX 0xffffffffffffffffULL
+#endif
+
+#if __GNUC__ > 3
+#define END_WITH_NULL __attribute__((sentinel))
+#else
+#define END_WITH_NULL
+#endif
+
+#ifndef HUGE_VALF
+#define HUGE_VALF (float)HUGE_VAL
+#endif
+
+#endif  /* SUPPORT_DATATYPES_H */
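
Note: a translation unit that includes this header directly has to define
the two guard macros first (Makefile.rules normally passes them on the
g++ command line). A minimal sketch (example ours):

#define __STDC_LIMIT_MACROS
#define __STDC_CONSTANT_MACROS
#include "llvm/Support/DataTypes.h"
#include <stdio.h>

int main(void)
{
	printf("%lld %llu\n", (long long)INT64_MAX,
	       (unsigned long long)UINT64_MAX);
	return 0;
}
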
diff --git a/tools/bpf/llvm/bld/lib/Makefile b/tools/bpf/llvm/bld/lib/Makefile
new file mode 100644
index 0000000..5c7e219
--- /dev/null
+++ b/tools/bpf/llvm/bld/lib/Makefile
@@ -0,0 +1,11 @@
+##===- lib/Makefile ----------------------------------------*- Makefile -*-===##
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+LEVEL = ..
+
+include $(LEVEL)/Makefile.config
+
+PARALLEL_DIRS := Target
+
+include $(LEVEL)/Makefile.common
+
diff --git a/tools/bpf/llvm/bld/lib/Target/BPF/InstPrinter/Makefile b/tools/bpf/llvm/bld/lib/Target/BPF/InstPrinter/Makefile
new file mode 100644
index 0000000..d9a4522
--- /dev/null
+++ b/tools/bpf/llvm/bld/lib/Target/BPF/InstPrinter/Makefile
@@ -0,0 +1,10 @@
+##===- lib/Target/BPF/InstPrinter/Makefile ----------------*- Makefile -*-===##
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+LEVEL = ../../../..
+LIBRARYNAME = LLVMBPFAsmPrinter
+
+# Hack: we need to include 'main' BPF target directory to grab private headers
+CPP.Flags += -I$(PROJ_OBJ_DIR)/.. -I$(PROJ_SRC_DIR)/..
+
+include $(LEVEL)/Makefile.common
diff --git a/tools/bpf/llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile b/tools/bpf/llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile
new file mode 100644
index 0000000..5f2e209
--- /dev/null
+++ b/tools/bpf/llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile
@@ -0,0 +1,11 @@
+##===- lib/Target/BPF/MCTargetDesc/Makefile --------------*- Makefile -*-===##
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+
+LEVEL = ../../../..
+LIBRARYNAME = LLVMBPFDesc
+
+# Hack: we need to include 'main' target directory to grab private headers
+CPP.Flags += -I$(PROJ_OBJ_DIR)/.. -I$(PROJ_SRC_DIR)/..
+
+include $(LEVEL)/Makefile.common
diff --git a/tools/bpf/llvm/bld/lib/Target/BPF/Makefile b/tools/bpf/llvm/bld/lib/Target/BPF/Makefile
new file mode 100644
index 0000000..14dea1a3
--- /dev/null
+++ b/tools/bpf/llvm/bld/lib/Target/BPF/Makefile
@@ -0,0 +1,17 @@
+##===- lib/Target/BPF/Makefile ---------------------------*- Makefile -*-===##
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+
+LEVEL = ../../..
+LIBRARYNAME = LLVMBPFCodeGen
+TARGET = BPF
+
+# Make sure that tblgen is run, first thing.
+BUILT_SOURCES = BPFGenRegisterInfo.inc BPFGenInstrInfo.inc \
+		BPFGenAsmWriter.inc BPFGenAsmMatcher.inc BPFGenDAGISel.inc \
+		BPFGenMCCodeEmitter.inc BPFGenSubtargetInfo.inc BPFGenCallingConv.inc
+
+DIRS = InstPrinter TargetInfo MCTargetDesc
+
+include $(LEVEL)/Makefile.common
+
diff --git a/tools/bpf/llvm/bld/lib/Target/BPF/TargetInfo/Makefile b/tools/bpf/llvm/bld/lib/Target/BPF/TargetInfo/Makefile
new file mode 100644
index 0000000..fdf9056
--- /dev/null
+++ b/tools/bpf/llvm/bld/lib/Target/BPF/TargetInfo/Makefile
@@ -0,0 +1,10 @@
+##===- lib/Target/BPF/TargetInfo/Makefile ----------------*- Makefile -*-===##
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+LEVEL = ../../../..
+LIBRARYNAME = LLVMBPFInfo
+
+# Hack: we need to include 'main' target directory to grab private headers
+CPP.Flags += -I$(PROJ_OBJ_DIR)/.. -I$(PROJ_SRC_DIR)/..
+
+include $(LEVEL)/Makefile.common
diff --git a/tools/bpf/llvm/bld/lib/Target/Makefile b/tools/bpf/llvm/bld/lib/Target/Makefile
new file mode 100644
index 0000000..06e5185
--- /dev/null
+++ b/tools/bpf/llvm/bld/lib/Target/Makefile
@@ -0,0 +1,11 @@
+##===- lib/Target/Makefile ---------------------------------*- Makefile -*-===##
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+
+LEVEL = ../..
+
+include $(LEVEL)/Makefile.config
+
+PARALLEL_DIRS := $(TARGETS_TO_BUILD)
+
+include $(LLVM_SRC_ROOT)/Makefile.rules
diff --git a/tools/bpf/llvm/bld/tools/Makefile b/tools/bpf/llvm/bld/tools/Makefile
new file mode 100644
index 0000000..6613681
--- /dev/null
+++ b/tools/bpf/llvm/bld/tools/Makefile
@@ -0,0 +1,12 @@
+##===- tools/Makefile --------------------------------------*- Makefile -*-===##
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+
+LEVEL := ..
+
+include $(LEVEL)/Makefile.config
+
+DIRS :=
+PARALLEL_DIRS := llc
+
+include $(LEVEL)/Makefile.common
diff --git a/tools/bpf/llvm/bld/tools/llc/Makefile b/tools/bpf/llvm/bld/tools/llc/Makefile
new file mode 100644
index 0000000..499feb0
--- /dev/null
+++ b/tools/bpf/llvm/bld/tools/llc/Makefile
@@ -0,0 +1,15 @@
+##===- tools/llc/Makefile ----------------------------------*- Makefile -*-===##
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+
+LEVEL := ../..
+TOOLNAME := llc
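+# llvm 3.3 and 3.4 ship IR parsing/reading as a separate 'irreader' component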
+ifneq (,$(filter $(shell $(LLVM_CONFIG) --version),3.3 3.4))
+LINK_COMPONENTS := asmparser asmprinter codegen bitreader core mc selectiondag support target irreader
+else
+LINK_COMPONENTS := asmparser asmprinter codegen bitreader core mc selectiondag support target
+endif
+USEDLIBS := LLVMBPFCodeGen.a LLVMBPFDesc.a LLVMBPFInfo.a LLVMBPFAsmPrinter.a
+
+include $(LEVEL)/Makefile.common
+
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPF.h b/tools/bpf/llvm/lib/Target/BPF/BPF.h
new file mode 100644
index 0000000..7412b51
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPF.h
@@ -0,0 +1,30 @@
+//===-- BPF.h - Top-level interface for BPF representation ----*- C++ -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#ifndef TARGET_BPF_H
+#define TARGET_BPF_H
+#include "llvm/Config/config.h"
+#undef LLVM_NATIVE_TARGET
+#undef LLVM_NATIVE_ASMPRINTER
+#undef LLVM_NATIVE_ASMPARSER
+#undef LLVM_NATIVE_DISASSEMBLER
+#include "MCTargetDesc/BPFBaseInfo.h"
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "llvm/Target/TargetMachine.h"
+
+namespace llvm {
+class FunctionPass;
+class TargetMachine;
+class BPFTargetMachine;
+
+/// createBPFISelDag - This pass converts a legalized DAG into a
+/// BPF-specific DAG, ready for instruction scheduling.
+FunctionPass *createBPFISelDag(BPFTargetMachine &TM);
+
+FunctionPass *createBPFCFGFixup(BPFTargetMachine &TM);
+
+extern Target TheBPFTarget;
+}
+
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPF.td b/tools/bpf/llvm/lib/Target/BPF/BPF.td
new file mode 100644
index 0000000..867c7f8
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPF.td
@@ -0,0 +1,29 @@
+//===- BPF.td - Describe the BPF Target Machine --------*- tablegen -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+// Target-independent interfaces which we are implementing
+include "llvm/Target/Target.td"
+
+// BPF Subtarget features.
+include "BPFRegisterInfo.td"
+include "BPFCallingConv.td"
+include "BPFInstrInfo.td"
+
+def BPFInstrInfo : InstrInfo;
+
+class Proc<string Name, list<SubtargetFeature> Features>
+ : Processor<Name, NoItineraries, Features>;
+
+def : Proc<"generic", []>;
+
+def BPFInstPrinter : AsmWriter {
+  string AsmWriterClassName  = "InstPrinter";
+  bit isMCAsmWriter = 1;
+}
+
+// Declare the target which we are implementing
+def BPF : Target {
+  let InstructionSet = BPFInstrInfo;
+  let AssemblyWriters = [BPFInstPrinter];
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp
new file mode 100644
index 0000000..9740d87
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp
@@ -0,0 +1,100 @@
+//===-- BPFAsmPrinter.cpp - BPF LLVM assembly writer --------------------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file contains a printer that converts from our internal representation
+// of machine-dependent LLVM code to the BPF assembly language.
+
+#define DEBUG_TYPE "asm-printer"
+#include "BPF.h"
+#include "BPFInstrInfo.h"
+#include "BPFMCInstLower.h"
+#include "BPFTargetMachine.h"
+#include "InstPrinter/BPFInstPrinter.h"
+#include "llvm/Assembly/Writer.h"
+#include "llvm/CodeGen/AsmPrinter.h"
+#include "llvm/CodeGen/MachineModuleInfo.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachineConstantPool.h"
+#include "llvm/CodeGen/MachineInstr.h"
+#include "llvm/MC/MCAsmInfo.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCStreamer.h"
+#include "llvm/MC/MCSymbol.h"
+#include "llvm/Target/Mangler.h"
+#include "llvm/Support/TargetRegistry.h"
+#include "llvm/Support/raw_ostream.h"
+using namespace llvm;
+
+namespace {
+  class BPFAsmPrinter : public AsmPrinter {
+  public:
+    explicit BPFAsmPrinter(TargetMachine &TM, MCStreamer &Streamer)
+      : AsmPrinter(TM, Streamer) {}
+
+    virtual const char *getPassName() const {
+      return "BPF Assembly Printer";
+    }
+
+    void printOperand(const MachineInstr *MI, int OpNum,
+                      raw_ostream &O, const char* Modifier = 0);
+    void EmitInstruction(const MachineInstr *MI);
+  private:
+    void customEmitInstruction(const MachineInstr *MI);
+  };
+}
+
+void BPFAsmPrinter::printOperand(const MachineInstr *MI, int OpNum,
+                                  raw_ostream &O, const char *Modifier) {
+  const MachineOperand &MO = MI->getOperand(OpNum);
+
+  switch (MO.getType()) {
+  case MachineOperand::MO_Register:
+    O << BPFInstPrinter::getRegisterName(MO.getReg());
+    break;
+
+  case MachineOperand::MO_Immediate:
+    O << MO.getImm();
+    break;
+
+  case MachineOperand::MO_MachineBasicBlock:
+    O << *MO.getMBB()->getSymbol();
+    break;
+
+  case MachineOperand::MO_GlobalAddress:
+#if LLVM_VERSION_MINOR==4
+      O << *getSymbol(MO.getGlobal());
+#else
+      O << *Mang->getSymbol(MO.getGlobal());
+#endif
+    break;
+
+  default:
+    llvm_unreachable("<unknown operand type>");
+    O << "bug";
+    return;
+  }
+}
+
+void BPFAsmPrinter::customEmitInstruction(const MachineInstr *MI) {
+  BPFMCInstLower MCInstLowering(OutContext, *Mang, *this);
+
+  MCInst TmpInst;
+  MCInstLowering.Lower(MI, TmpInst);
+  OutStreamer.EmitInstruction(TmpInst);
+}
+
+void BPFAsmPrinter::EmitInstruction(const MachineInstr *MI) {
+
+  MachineBasicBlock::const_instr_iterator I = MI;
+  MachineBasicBlock::const_instr_iterator E = MI->getParent()->instr_end();
+
+  do {
+    customEmitInstruction(I++);
+  } while ((I != E) && I->isInsideBundle());
+}
+
+// Force static initialization.
+extern "C" void LLVMInitializeBPFAsmPrinter() {
+  RegisterAsmPrinter<BPFAsmPrinter> X(TheBPFTarget);
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp
new file mode 100644
index 0000000..18401ba
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp
@@ -0,0 +1,62 @@
+//===-- BPFCFGFixup.cpp - CFG fixup pass -----------------------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#define DEBUG_TYPE "bpf_cfg"
+#include "BPF.h"
+#include "BPFInstrInfo.h"
+#include "BPFSubtarget.h"
+#include "BPFTargetMachine.h"
+#include "BPFSubtarget.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+
+using namespace llvm;
+
+namespace {
+
+class BPFCFGFixup : public MachineFunctionPass {
+ private:
+  BPFTargetMachine& QTM;
+  const BPFSubtarget &QST;
+
+  void InvertAndChangeJumpTarget(MachineInstr*, MachineBasicBlock*);
+
+ public:
+  static char ID;
+  BPFCFGFixup(BPFTargetMachine& TM) : MachineFunctionPass(ID),
+                                                  QTM(TM),
+                                                  QST(*TM.getSubtargetImpl()) {}
+
+  const char *getPassName() const {
+    return "BPF RET insn fixup";
+  }
+  bool runOnMachineFunction(MachineFunction &Fn);
+};
+
+char BPFCFGFixup::ID = 0;
+
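+/* Move the basic block that ends in RET to the end of the function, so
+ * that the emitted program finishes with the RET instruction. */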
+bool BPFCFGFixup::runOnMachineFunction(MachineFunction &Fn) {
+
+  // Loop over all of the basic blocks.
+  for (MachineFunction::iterator MBBb = Fn.begin(), MBBe = Fn.end();
+       MBBb != MBBe; ++MBBb) {
+    MachineBasicBlock* MBB = MBBb;
+
+    MachineBasicBlock::iterator MII = MBB->getFirstTerminator();
+    if (MII != MBB->end()) {
+      /* if last insn of this basic block is RET, make this BB last */
+      if (MII->getOpcode() == BPF::RET) {
+        MBBe--;
+        if (MBB != MBBe)
+          MBB->moveAfter(MBBe);
+        break;
+      }
+    }
+  }
+  return true;
+}
+}
+
+FunctionPass *llvm::createBPFCFGFixup(BPFTargetMachine &TM) {
+  return new BPFCFGFixup(TM);
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td b/tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td
new file mode 100644
index 0000000..27c327e
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td
@@ -0,0 +1,24 @@
+//===- BPFCallingConv.td - Calling Conventions BPF -------*- tablegen -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This describes the calling conventions for the BPF architectures.
+
+// BPF 64-bit C return-value convention.
+def RetCC_BPF64 : CallingConv<[
+  CCIfType<[i64], CCAssignToReg<[R0]>>
+]>;
+
+// BPF 64-bit C Calling convention.
+def CC_BPF64 : CallingConv<[
+  // Promote i8/i16/i32 args to i64
+  CCIfType<[i8, i16, i32], CCPromoteToType<i64>>,
+
+  // All arguments get passed in integer registers if there is space.
+  CCIfType<[i64], CCAssignToReg<[R1, R2, R3, R4, R5]>>,
+
+  // Alternatively, they are assigned to the stack in 8-byte aligned units.
+  CCAssignToStack<8, 8>
+]>;
+
+def CSR: CalleeSavedRegs<(add R6, R7, R8, R9, R10)>;
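
For illustration (example ours): under CC_BPF64 a function like the one
below takes its promoted i64 arguments in R1..R5 and returns in R0; a
sixth argument would fall through to CCAssignToStack, which the lowering
code rejects with "too many args":

long f(char a, short b, int c, long d, void *e)
{
	/* a..e are promoted to i64 and passed in R1, R2, R3, R4, R5 */
	return d;	/* the i64 result comes back in R0 */
}
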
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp
new file mode 100644
index 0000000..b263b5f
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp
@@ -0,0 +1,36 @@
+//===-- BPFFrameLowering.cpp - BPF Frame Information --------------------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file contains the BPF implementation of TargetFrameLowering class.
+
+#include "BPFFrameLowering.h"
+#include "BPFInstrInfo.h"
+#include "llvm/CodeGen/MachineFrameInfo.h"
+#include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/MachineRegisterInfo.h"
+
+using namespace llvm;
+
+bool BPFFrameLowering::hasFP(const MachineFunction &MF) const {
+  return true;
+}
+
+void BPFFrameLowering::emitPrologue(MachineFunction &MF) const {
+}
+
+void BPFFrameLowering::emitEpilogue(MachineFunction &MF,
+                                    MachineBasicBlock &MBB) const {
+}
+
+void BPFFrameLowering::
+processFunctionBeforeCalleeSavedScan(MachineFunction &MF,
+                                     RegScavenger *RS) const {
+  MachineRegisterInfo& MRI = MF.getRegInfo();
+
+  MRI.setPhysRegUnused(BPF::R6);
+  MRI.setPhysRegUnused(BPF::R7);
+  MRI.setPhysRegUnused(BPF::R8);
+  MRI.setPhysRegUnused(BPF::R9);
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h b/tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h
new file mode 100644
index 0000000..3e3d9ad
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h
@@ -0,0 +1,35 @@
+//===-- BPFFrameLowering.h - Define frame lowering for BPF ---*- C++ -*--===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#ifndef BPF_FRAMEINFO_H
+#define BPF_FRAMEINFO_H
+
+#include "BPF.h"
+#include "BPFSubtarget.h"
+#include "llvm/Target/TargetFrameLowering.h"
+
+namespace llvm {
+class BPFSubtarget;
+
+class BPFFrameLowering : public TargetFrameLowering {
+public:
+  explicit BPFFrameLowering(const BPFSubtarget &sti)
+    : TargetFrameLowering(TargetFrameLowering::StackGrowsDown, 8, 0) {
+  }
+
+  void emitPrologue(MachineFunction &MF) const;
+  void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const;
+
+  bool hasFP(const MachineFunction &MF) const;
+  virtual void processFunctionBeforeCalleeSavedScan(MachineFunction &MF,
+                                                    RegScavenger *RS) const;
+
+  // llvm 3.3 defines it here
+  void eliminateCallFramePseudoInstr(MachineFunction &MF, MachineBasicBlock &MBB,
+                                     MachineBasicBlock::iterator MI) const {
+    MBB.erase(MI);
+  }
+};
+}
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp
new file mode 100644
index 0000000..85f905b
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp
@@ -0,0 +1,182 @@
+//===-- BPFISelDAGToDAG.cpp - A dag to dag inst selector for BPF --------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file defines an instruction selector for the BPF target.
+
+#define DEBUG_TYPE "bpf-isel"
+#include "BPF.h"
+#include "BPFRegisterInfo.h"
+#include "BPFSubtarget.h"
+#include "BPFTargetMachine.h"
+#include "llvm/Support/CFG.h"
+#include "llvm/CodeGen/MachineConstantPool.h"
+#include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/MachineFrameInfo.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/MachineRegisterInfo.h"
+#include "llvm/CodeGen/SelectionDAGISel.h"
+#include "llvm/Target/TargetMachine.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/raw_ostream.h"
+using namespace llvm;
+
+// Instruction Selector Implementation
+namespace {
+
+class BPFDAGToDAGISel : public SelectionDAGISel {
+
+  /// TM - Keep a reference to BPFTargetMachine.
+  BPFTargetMachine &TM;
+
+  /// Subtarget - Keep a pointer to the BPFSubtarget around so that we can
+  /// make the right decision when generating code for different targets.
+  const BPFSubtarget &Subtarget;
+
+public:
+  explicit BPFDAGToDAGISel(BPFTargetMachine &tm) :
+  SelectionDAGISel(tm),
+  TM(tm), Subtarget(tm.getSubtarget<BPFSubtarget>()) {}
+
+  // Pass Name
+  virtual const char *getPassName() const {
+    return "BPF DAG->DAG Pattern Instruction Selection";
+  }
+
+private:
+  // Include the pieces autogenerated from the target description.
+  #include "BPFGenDAGISel.inc"
+
+  /// getTargetMachine - Return a reference to the TargetMachine, casted
+  /// to the target-specific type.
+  const BPFTargetMachine &getTargetMachine() {
+    return static_cast<const BPFTargetMachine &>(TM);
+  }
+
+  /// getInstrInfo - Return a reference to the TargetInstrInfo, casted
+  /// to the target-specific type.
+  const BPFInstrInfo *getInstrInfo() {
+    return getTargetMachine().getInstrInfo();
+  }
+
+  SDNode *Select(SDNode *N);
+
+  // Complex Pattern for address selection.
+  bool SelectAddr(SDValue Addr, SDValue &Base, SDValue &Offset);
+
+  // getI32Imm - Return a target constant with the specified value; BPF keeps all immediates as i64.
+  inline SDValue getI32Imm(unsigned Imm) {
+    return CurDAG->getTargetConstant(Imm, MVT::i64);
+  }
+};
+
+}
+
+/// ComplexPattern used on BPFInstrInfo
+/// Used on BPF Load/Store instructions
+bool BPFDAGToDAGISel::
+SelectAddr(SDValue Addr, SDValue &Base, SDValue &Offset) {
+  // if Address is FI, get the TargetFrameIndex.
+  if (FrameIndexSDNode *FIN = dyn_cast<FrameIndexSDNode>(Addr)) {
+    Base   = CurDAG->getTargetFrameIndex(FIN->getIndex(), MVT::i64);
+    Offset = CurDAG->getTargetConstant(0, MVT::i64);
+    return true;
+  }
+
+  if (Addr.getOpcode() == ISD::TargetExternalSymbol ||
+      Addr.getOpcode() == ISD::TargetGlobalAddress)
+    return false;
+
+  // Addresses of the form FI+const or FI|const
+  if (CurDAG->isBaseWithConstantOffset(Addr)) {
+    ConstantSDNode *CN = dyn_cast<ConstantSDNode>(Addr.getOperand(1));
+    if (isInt<32>(CN->getSExtValue())) {
+
+      // If the first operand is a FI, get the TargetFI Node
+      if (FrameIndexSDNode *FIN = dyn_cast<FrameIndexSDNode>
+                                  (Addr.getOperand(0)))
+        Base = CurDAG->getTargetFrameIndex(FIN->getIndex(), MVT::i64);
+      else
+        Base = Addr.getOperand(0);
+
+      Offset = CurDAG->getTargetConstant(CN->getSExtValue(), MVT::i64);
+      return true;
+    }
+  }
+
+  // Operand is a result from an ADD.
+  if (Addr.getOpcode() == ISD::ADD) {
+    if (ConstantSDNode *CN = dyn_cast<ConstantSDNode>(Addr.getOperand(1))) {
+      if (isInt<32>(CN->getSExtValue())) {
+
+        // If the first operand is a FI, get the TargetFI Node
+        if (FrameIndexSDNode *FIN = dyn_cast<FrameIndexSDNode>
+                                    (Addr.getOperand(0))) {
+          Base = CurDAG->getTargetFrameIndex(FIN->getIndex(), MVT::i64);
+        } else {
+          Base = Addr.getOperand(0);
+        }
+
+        Offset = CurDAG->getTargetConstant(CN->getSExtValue(), MVT::i64);
+        return true;
+      }
+    }
+  }
+
+  Base   = Addr;
+  Offset = CurDAG->getTargetConstant(0, MVT::i64);
+  return true;
+}
+
+/// Select - default instruction selection for nodes with no custom
+/// handling: expanded, promoted and normal instructions
+SDNode* BPFDAGToDAGISel::Select(SDNode *Node) {
+  unsigned Opcode = Node->getOpcode();
+
+  // Dump information about the Node being selected
+  DEBUG(errs() << "Selecting: "; Node->dump(CurDAG); errs() << "\n");
+
+  // If we have a custom node, we already have selected!
+  if (Node->isMachineOpcode()) {
+    DEBUG(errs() << "== "; Node->dump(CurDAG); errs() << "\n");
+    return NULL;
+  }
+
+  // tablegen selection should be handled here.
+  switch(Opcode) {
+    default: break;
+
+    case ISD::FrameIndex: {
+        int FI = dyn_cast<FrameIndexSDNode>(Node)->getIndex();
+        EVT VT = Node->getValueType(0);
+        SDValue TFI = CurDAG->getTargetFrameIndex(FI, VT);
+        unsigned Opc = BPF::MOV_rr;
+        if (Node->hasOneUse())
+          return CurDAG->SelectNodeTo(Node, Opc, VT, TFI);
+#if LLVM_VERSION_MINOR==4
+        return CurDAG->getMachineNode(Opc, SDLoc(Node), VT, TFI);
+#else
+        return CurDAG->getMachineNode(Opc, Node->getDebugLoc(), VT, TFI);
+#endif
+
+    }
+  }
+
+  // Select the default instruction
+  SDNode *ResNode = SelectCode(Node);
+
+  DEBUG(errs() << "=> ");
+  if (ResNode == NULL || ResNode == Node)
+    DEBUG(Node->dump(CurDAG));
+  else
+    DEBUG(ResNode->dump(CurDAG));
+  DEBUG(errs() << "\n");
+  return ResNode;
+}
+
+/// createBPFISelDag - This pass converts a legalized DAG into a
+/// BPF-specific DAG, ready for instruction scheduling.
+FunctionPass *llvm::createBPFISelDag(BPFTargetMachine &TM) {
+  return new BPFDAGToDAGISel(TM);
+}
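
For reference (example ours): a stack access like the one below reaches
SelectAddr as a FrameIndex plus constant and is split into
Base = TargetFrameIndex, Offset = 8:

long g(void)
{
	long v[2] = { 1, 2 };
	return v[1];	/* matched as (FrameIndex of v, offset 8) */
}
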
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp
new file mode 100644
index 0000000..b065d31
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp
@@ -0,0 +1,676 @@
+//===-- BPFISelLowering.cpp - BPF DAG Lowering Implementation  ----------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file implements the BPFTargetLowering class.
+
+#define DEBUG_TYPE "bpf-lower"
+
+#include "BPFISelLowering.h"
+#include "BPF.h"
+#include "BPFTargetMachine.h"
+#include "BPFSubtarget.h"
+#include "llvm/CodeGen/CallingConvLower.h"
+#include "llvm/CodeGen/MachineFrameInfo.h"
+#include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/MachineRegisterInfo.h"
+#include "llvm/CodeGen/SelectionDAGISel.h"
+#include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"
+#include "llvm/CodeGen/ValueTypes.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/raw_ostream.h"
+using namespace llvm;
+
+BPFTargetLowering::BPFTargetLowering(BPFTargetMachine &tm) :
+  TargetLowering(tm, new TargetLoweringObjectFileELF()),
+  Subtarget(*tm.getSubtargetImpl()), TM(tm) {
+
+  // Set up the register classes.
+  addRegisterClass(MVT::i64, &BPF::GPRRegClass);
+
+  // Compute derived properties from the register classes
+  computeRegisterProperties();
+
+  setStackPointerRegisterToSaveRestore(BPF::R11);
+
+  setOperationAction(ISD::BR_CC,             MVT::i64, Custom);
+  setOperationAction(ISD::BR_JT,             MVT::Other, Expand);
+  setOperationAction(ISD::BRCOND,            MVT::Other, Expand);
+  setOperationAction(ISD::SETCC,             MVT::i64, Expand);
+  setOperationAction(ISD::SELECT,            MVT::i64, Expand);
+  setOperationAction(ISD::SELECT_CC,         MVT::i64, Custom);
+
+//  setCondCodeAction(ISD::SETLT,             MVT::i64, Expand);
+
+  setOperationAction(ISD::GlobalAddress,     MVT::i64, Custom);
+  /*setOperationAction(ISD::BlockAddress,      MVT::i64, Custom);
+  setOperationAction(ISD::JumpTable,         MVT::i64, Custom);
+  setOperationAction(ISD::ConstantPool,      MVT::i64, Custom);*/
+
+  setOperationAction(ISD::DYNAMIC_STACKALLOC, MVT::i64,   Custom);
+  setOperationAction(ISD::STACKSAVE,          MVT::Other, Expand);
+  setOperationAction(ISD::STACKRESTORE,       MVT::Other, Expand);
+
+/*  setOperationAction(ISD::VASTART,            MVT::Other, Custom);
+  setOperationAction(ISD::VAARG,              MVT::Other, Expand);
+  setOperationAction(ISD::VACOPY,             MVT::Other, Expand);
+  setOperationAction(ISD::VAEND,              MVT::Other, Expand);*/
+
+//    setOperationAction(ISD::SDIV,            MVT::i64, Expand);
+//  setOperationAction(ISD::UDIV,            MVT::i64, Expand);
+
+  setOperationAction(ISD::SDIVREM,           MVT::i64, Expand);
+  setOperationAction(ISD::UDIVREM,           MVT::i64, Expand);
+  setOperationAction(ISD::SREM,              MVT::i64, Expand);
+  setOperationAction(ISD::UREM,              MVT::i64, Expand);
+
+//  setOperationAction(ISD::MUL,             MVT::i64, Expand);
+
+  setOperationAction(ISD::MULHU,             MVT::i64, Expand);
+  setOperationAction(ISD::MULHS,             MVT::i64, Expand);
+  setOperationAction(ISD::UMUL_LOHI,         MVT::i64, Expand);
+  setOperationAction(ISD::SMUL_LOHI,         MVT::i64, Expand);
+
+  setOperationAction(ISD::ADDC, MVT::i64, Expand);
+  setOperationAction(ISD::ADDE, MVT::i64, Expand);
+  setOperationAction(ISD::SUBC, MVT::i64, Expand);
+  setOperationAction(ISD::SUBE, MVT::i64, Expand);
+
+  setOperationAction(ISD::ROTR,              MVT::i64, Expand);
+  setOperationAction(ISD::ROTL,              MVT::i64, Expand);
+  setOperationAction(ISD::SHL_PARTS,         MVT::i64, Expand);
+  setOperationAction(ISD::SRL_PARTS,         MVT::i64, Expand);
+  setOperationAction(ISD::SRA_PARTS,         MVT::i64, Expand);
+
+  setOperationAction(ISD::BSWAP,             MVT::i64, Expand);
+  setOperationAction(ISD::CTTZ,              MVT::i64, Custom);
+  setOperationAction(ISD::CTLZ,              MVT::i64, Custom);
+  setOperationAction(ISD::CTTZ_ZERO_UNDEF,   MVT::i64, Custom);
+  setOperationAction(ISD::CTLZ_ZERO_UNDEF,   MVT::i64, Custom);
+  setOperationAction(ISD::CTPOP,             MVT::i64, Expand);
+
+
+  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1,   Expand);
+  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8,   Expand);
+  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16,  Expand);
+  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32,  Expand);
+
+  // Extended load operations for i1 types must be promoted
+  setLoadExtAction(ISD::EXTLOAD,             MVT::i1,   Promote);
+  setLoadExtAction(ISD::ZEXTLOAD,            MVT::i1,   Promote);
+  setLoadExtAction(ISD::SEXTLOAD,            MVT::i1,   Promote);
+
+  setLoadExtAction(ISD::SEXTLOAD,            MVT::i8,   Expand);
+  setLoadExtAction(ISD::SEXTLOAD,            MVT::i16,   Expand);
+  setLoadExtAction(ISD::SEXTLOAD,            MVT::i32,   Expand);
+
+  // Function alignments (log2)
+  setMinFunctionAlignment(3);
+  setPrefFunctionAlignment(3);
+
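+  // Allow memcpy/memset to be expanded inline into individual stores
+  // (up to 128) rather than emitted as a library call.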
+#if LLVM_VERSION_MINOR==3 || LLVM_VERSION_MINOR==4
+  MaxStoresPerMemcpy = 128;
+  MaxStoresPerMemcpyOptSize = 128;
+  MaxStoresPerMemset = 128;
+#else
+  maxStoresPerMemcpy = 128;
+  maxStoresPerMemcpyOptSize = 128;
+  maxStoresPerMemset = 128;
+#endif
+}
+
+SDValue BPFTargetLowering::LowerOperation(SDValue Op,
+                                          SelectionDAG &DAG) const {
+  switch (Op.getOpcode()) {
+  case ISD::BR_CC:              return LowerBR_CC(Op, DAG);
+  case ISD::GlobalAddress:      return LowerGlobalAddress(Op, DAG);
+  case ISD::SELECT_CC:          return LowerSELECT_CC(Op, DAG);
+  default:
+    llvm_unreachable("unimplemented operand");
+  }
+}
+
+//                      Calling Convention Implementation
+#include "BPFGenCallingConv.inc"
+
+SDValue
+BPFTargetLowering::LowerFormalArguments(SDValue Chain, CallingConv::ID CallConv,
+                                        bool isVarArg,
+                                        const SmallVectorImpl<ISD::InputArg>
+                                        &Ins,
+#if LLVM_VERSION_MINOR==4
+                                        SDLoc dl,
+#else
+                                        DebugLoc dl,
+#endif
+                                        SelectionDAG &DAG,
+                                        SmallVectorImpl<SDValue> &InVals)
+                                          const {
+  switch (CallConv) {
+  default:
+    llvm_unreachable("Unsupported calling convention");
+  case CallingConv::C:
+  case CallingConv::Fast:
+    break;
+  }
+
+/// LowerCCCArguments - transform physical registers into virtual registers and
+/// generate load operations for arguments placed on the stack.
+  MachineFunction &MF = DAG.getMachineFunction();
+  MachineRegisterInfo &RegInfo = MF.getRegInfo();
+
+  // Assign locations to all of the incoming arguments.
+  SmallVector<CCValAssign, 16> ArgLocs;
+  CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(),
+                 getTargetMachine(), ArgLocs, *DAG.getContext());
+  CCInfo.AnalyzeFormalArguments(Ins, CC_BPF64);
+
+  for (unsigned i = 0, e = ArgLocs.size(); i != e; ++i) {
+    CCValAssign &VA = ArgLocs[i];
+    if (VA.isRegLoc()) {
+      // Arguments passed in registers
+      EVT RegVT = VA.getLocVT();
+      switch (RegVT.getSimpleVT().SimpleTy) {
+      default:
+        {
+#ifndef NDEBUG
+          errs() << "LowerFormalArguments Unhandled argument type: "
+               << RegVT.getSimpleVT().SimpleTy << "\n";
+#endif
+          llvm_unreachable(0);
+        }
+      case MVT::i64:
+        unsigned VReg = RegInfo.createVirtualRegister(&BPF::GPRRegClass);
+        RegInfo.addLiveIn(VA.getLocReg(), VReg);
+        SDValue ArgValue = DAG.getCopyFromReg(Chain, dl, VReg, RegVT);
+
+        // If this is an 8/16/32-bit value, it is really passed promoted to 64
+        // bits. Insert an assert[sz]ext to capture this, then truncate to the
+        // right size.
+        if (VA.getLocInfo() == CCValAssign::SExt)
+          ArgValue = DAG.getNode(ISD::AssertSext, dl, RegVT, ArgValue,
+                                 DAG.getValueType(VA.getValVT()));
+        else if (VA.getLocInfo() == CCValAssign::ZExt)
+          ArgValue = DAG.getNode(ISD::AssertZext, dl, RegVT, ArgValue,
+                                 DAG.getValueType(VA.getValVT()));
+
+        if (VA.getLocInfo() != CCValAssign::Full)
+          ArgValue = DAG.getNode(ISD::TRUNCATE, dl, VA.getValVT(), ArgValue);
+
+        InVals.push_back(ArgValue);
+      }
+    } else {
+      assert(VA.isMemLoc());
+      errs() << "Function: " << MF.getName() << " ";
+      MF.getFunction()->getFunctionType()->dump();
+      errs() << "\n";
+      report_fatal_error("too many function args");
+    }
+  }
+
+  if (isVarArg || MF.getFunction()->hasStructRetAttr()) {
+    errs() << "Function: " << MF.getName() << " ";
+    MF.getFunction()->getFunctionType()->dump();
+    errs() << "\n";
+    report_fatal_error("functions with VarArgs or StructRet are not supported");
+  }
+
+  return Chain;
+}
+
+SDValue
+BPFTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
+                              SmallVectorImpl<SDValue> &InVals) const {
+  SelectionDAG &DAG                     = CLI.DAG;
+  SmallVector<ISD::OutputArg, 32> &Outs = CLI.Outs;
+  SmallVector<SDValue, 32> &OutVals     = CLI.OutVals;
+  SmallVector<ISD::InputArg, 32> &Ins   = CLI.Ins;
+  SDValue Chain                         = CLI.Chain;
+  SDValue Callee                        = CLI.Callee;
+  bool &isTailCall                      = CLI.IsTailCall;
+  CallingConv::ID CallConv              = CLI.CallConv;
+  bool isVarArg                         = CLI.IsVarArg;
+
+  // BPF target does not support tail call optimization.
+  isTailCall = false;
+
+  switch (CallConv) {
+  default:
+    report_fatal_error("Unsupported calling convention");
+  case CallingConv::Fast:
+  case CallingConv::C:
+    break;
+  }
+
+/// LowerCCCCallTo - function arguments are copied from virtual regs to
+/// (physical regs)/(stack frame), CALLSEQ_START and CALLSEQ_END are emitted.
+
+  // Analyze operands of the call, assigning locations to each operand.
+  SmallVector<CCValAssign, 16> ArgLocs;
+  CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(),
+                 getTargetMachine(), ArgLocs, *DAG.getContext());
+
+  CCInfo.AnalyzeCallOperands(Outs, CC_BPF64);
+
+  // Get a count of how many bytes are to be pushed on the stack.
+  unsigned NumBytes = CCInfo.getNextStackOffset();
+
+  // Create local copies for byval args.
+  SmallVector<SDValue, 8> ByValArgs;
+
+  if (Outs.size() >= 6) {
+    errs() << "too many arguments to a function ";
+    Callee.dump();
+    report_fatal_error("too many args\n");
+  }
+
+  for (unsigned i = 0,  e = Outs.size(); i != e; ++i) {
+    ISD::ArgFlagsTy Flags = Outs[i].Flags;
+    if (!Flags.isByVal())
+      continue;
+
+    Callee.dump();
+    report_fatal_error("cannot pass by value");
+  }
+
+  Chain = DAG.getCALLSEQ_START(Chain, DAG.getConstant(NumBytes,
+                                                      getPointerTy(), true)
+#if LLVM_VERSION_MINOR==4
+                                                      , CLI.DL
+#endif
+                                                      );
+
+  SmallVector<std::pair<unsigned, SDValue>, 4> RegsToPass;
+  SDValue StackPtr;
+
+  // Walk the register/memloc assignments, inserting copies/loads.
+  for (unsigned i = 0, j = 0, e = ArgLocs.size(); i != e; ++i) {
+    CCValAssign &VA = ArgLocs[i];
+    SDValue Arg = OutVals[i];
+    ISD::ArgFlagsTy Flags = Outs[i].Flags;
+
+
+    // Promote the value if needed.
+    switch (VA.getLocInfo()) {
+      default: llvm_unreachable("Unknown loc info!");
+      case CCValAssign::Full: break;
+      case CCValAssign::SExt:
+        Arg = DAG.getNode(ISD::SIGN_EXTEND, CLI.DL, VA.getLocVT(), Arg);
+        break;
+      case CCValAssign::ZExt:
+        Arg = DAG.getNode(ISD::ZERO_EXTEND, CLI.DL, VA.getLocVT(), Arg);
+        break;
+      case CCValAssign::AExt:
+        Arg = DAG.getNode(ISD::ANY_EXTEND, CLI.DL, VA.getLocVT(), Arg);
+        break;
+    }
+
+    // Use local copy if it is a byval arg.
+    if (Flags.isByVal())
+      Arg = ByValArgs[j++];
+
+    // Arguments that can be passed on register must be kept at RegsToPass
+    // vector
+    if (VA.isRegLoc()) {
+      RegsToPass.push_back(std::make_pair(VA.getLocReg(), Arg));
+    } else {
+      llvm_unreachable("call arg pass bug");
+    }
+  }
+
+  SDValue InFlag;
+
+  // Build a sequence of copy-to-reg nodes chained together with token chain and
+  // flag operands which copy the outgoing args into registers.  The InFlag is
+  // necessary since all emitted instructions must be stuck together.
+  for (unsigned i = 0, e = RegsToPass.size(); i != e; ++i) {
+    Chain = DAG.getCopyToReg(Chain, CLI.DL, RegsToPass[i].first,
+                             RegsToPass[i].second, InFlag);
+    InFlag = Chain.getValue(1);
+  }
+
+  // If the callee is a GlobalAddress node (quite common, every direct call is)
+  // turn it into a TargetGlobalAddress node so that legalize doesn't hack it.
+  // Likewise ExternalSymbol -> TargetExternalSymbol.
+  if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee)) {
+    Callee = DAG.getTargetGlobalAddress(G->getGlobal(), CLI.DL, getPointerTy(), G->getOffset()/*0*/,
+                                        0);
+  } else if (ExternalSymbolSDNode *E = dyn_cast<ExternalSymbolSDNode>(Callee)) {
+    Callee = DAG.getTargetExternalSymbol(E->getSymbol(), getPointerTy(),
+                                         0);
+  }
+
+  // Returns a chain & a flag for retval copy to use.
+  SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Glue);
+  SmallVector<SDValue, 8> Ops;
+  Ops.push_back(Chain);
+  Ops.push_back(Callee);
+
+  // Add argument registers to the end of the list so that they are
+  // known live into the call.
+  for (unsigned i = 0, e = RegsToPass.size(); i != e; ++i)
+    Ops.push_back(DAG.getRegister(RegsToPass[i].first,
+                                  RegsToPass[i].second.getValueType()));
+
+  if (InFlag.getNode())
+    Ops.push_back(InFlag);
+
+  Chain = DAG.getNode(BPFISD::CALL, CLI.DL, NodeTys, &Ops[0], Ops.size());
+  InFlag = Chain.getValue(1);
+
+  // Create the CALLSEQ_END node.
+  Chain = DAG.getCALLSEQ_END(Chain,
+                             DAG.getConstant(NumBytes, getPointerTy(), true),
+                             DAG.getConstant(0, getPointerTy(), true),
+                             InFlag
+#if LLVM_VERSION_MINOR==4
+                             , CLI.DL
+#endif
+                             );
+  InFlag = Chain.getValue(1);
+
+  // Handle result values, copying them out of physregs into vregs that we
+  // return.
+  return LowerCallResult(Chain, InFlag, CallConv, isVarArg, Ins, CLI.DL,
+                         DAG, InVals);
+}
+
+SDValue
+BPFTargetLowering::LowerReturn(SDValue Chain,
+                               CallingConv::ID CallConv, bool isVarArg,
+                               const SmallVectorImpl<ISD::OutputArg> &Outs,
+                               const SmallVectorImpl<SDValue> &OutVals,
+#if LLVM_VERSION_MINOR==4
+                               SDLoc dl,
+#else
+                               DebugLoc dl,
+#endif
+                               SelectionDAG &DAG) const {
+
+  // CCValAssign - represent the assignment of the return value to a location
+  SmallVector<CCValAssign, 16> RVLocs;
+
+  // CCState - Info about the registers and stack slot.
+  CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(),
+                 getTargetMachine(), RVLocs, *DAG.getContext());
+
+  // Analyze return values.
+  CCInfo.AnalyzeReturn(Outs, RetCC_BPF64);
+
+  // If this is the first return lowered for this function, add the regs to the
+  // liveout set for the function.
+#if LLVM_VERSION_MINOR==2
+  if (DAG.getMachineFunction().getRegInfo().liveout_empty()) {
+    for (unsigned i = 0; i != RVLocs.size(); ++i)
+      if (RVLocs[i].isRegLoc())
+        DAG.getMachineFunction().getRegInfo().addLiveOut(RVLocs[i].getLocReg());
+  }
+#endif
+
+  SDValue Flag;
+#if LLVM_VERSION_MINOR==3 || LLVM_VERSION_MINOR==4
+  SmallVector<SDValue, 4> RetOps(1, Chain);
+#endif
+
+  // Copy the result values into the output registers.
+  for (unsigned i = 0; i != RVLocs.size(); ++i) {
+    CCValAssign &VA = RVLocs[i];
+    assert(VA.isRegLoc() && "Can only return in registers!");
+
+    Chain = DAG.getCopyToReg(Chain, dl, VA.getLocReg(),
+                             OutVals[i], Flag);
+
+    // Guarantee that all emitted copies are stuck together,
+    // avoiding something bad.
+    Flag = Chain.getValue(1);
+#if LLVM_VERSION_MINOR==3 || LLVM_VERSION_MINOR==4
+    RetOps.push_back(DAG.getRegister(VA.getLocReg(), VA.getLocVT()));
+#endif
+  }
+
+  if (DAG.getMachineFunction().getFunction()->hasStructRetAttr()) {
+    errs() << "Function: " << DAG.getMachineFunction().getName() << " ";
+    DAG.getMachineFunction().getFunction()->getFunctionType()->dump();
+    errs() << "\n";
+    report_fatal_error("BPF doesn't support struct return");
+  }
+
+  unsigned Opc = BPFISD::RET_FLAG;
+#if LLVM_VERSION_MINOR==3 || LLVM_VERSION_MINOR==4
+  RetOps[0] = Chain;  // Update chain.
+
+  // Add the flag if we have it.
+  if (Flag.getNode())
+    RetOps.push_back(Flag);
+
+  return DAG.getNode(Opc, dl, MVT::Other, &RetOps[0], RetOps.size());
+#else
+  if (Flag.getNode())
+    return DAG.getNode(Opc, dl, MVT::Other, Chain, Flag);
+
+  // Return Void
+  return DAG.getNode(Opc, dl, MVT::Other, Chain);
+#endif
+}
+
+/// LowerCallResult - Lower the result values of a call into the
+/// appropriate copies out of appropriate physical registers.
+SDValue
+BPFTargetLowering::LowerCallResult(SDValue Chain, SDValue InFlag,
+                                   CallingConv::ID CallConv, bool isVarArg,
+                                   const SmallVectorImpl<ISD::InputArg> &Ins,
+#if LLVM_VERSION_MINOR==4
+                                   SDLoc dl,
+#else
+                                   DebugLoc dl,
+#endif
+                                   SelectionDAG &DAG,
+                                   SmallVectorImpl<SDValue> &InVals) const {
+
+  // Assign locations to each value returned by this call.
+  SmallVector<CCValAssign, 16> RVLocs;
+  CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(),
+                 getTargetMachine(), RVLocs, *DAG.getContext());
+
+  CCInfo.AnalyzeCallResult(Ins, RetCC_BPF64);
+
+  // Copy all of the result registers out of their specified physreg.
+  for (unsigned i = 0; i != RVLocs.size(); ++i) {
+    Chain = DAG.getCopyFromReg(Chain, dl, RVLocs[i].getLocReg(),
+                               RVLocs[i].getValVT(), InFlag).getValue(1);
+    InFlag = Chain.getValue(2);
+    InVals.push_back(Chain.getValue(0));
+  }
+
+  return Chain;
+}
+
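+/* Only the "greater" comparison forms (JSGT/JSGE and their unsigned
+ * variants) have jump instructions, so rewrite LT/LE conditions into
+ * GT/GE by swapping the operands. */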
+static bool NegateCC(SDValue &LHS, SDValue &RHS, ISD::CondCode &CC)
+{
+  switch (CC) {
+  default:
+    return false;
+  case ISD::SETULT:
+    CC = ISD::SETUGT;
+    std::swap(LHS, RHS);
+    return true;
+  case ISD::SETULE:
+    CC = ISD::SETUGE;
+    std::swap(LHS, RHS);
+    return true;
+  case ISD::SETLT:
+    CC = ISD::SETGT;
+    std::swap(LHS, RHS);
+    return true;
+  case ISD::SETLE:
+    CC = ISD::SETGE;
+    std::swap(LHS, RHS);
+    return true;
+  }
+}
+
+SDValue BPFTargetLowering::LowerBR_CC(SDValue Op,
+                                      SelectionDAG &DAG) const {
+  SDValue Chain  = Op.getOperand(0);
+  ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(1))->get();
+  SDValue LHS   = Op.getOperand(2);
+  SDValue RHS   = Op.getOperand(3);
+  SDValue Dest  = Op.getOperand(4);
+#if LLVM_VERSION_MINOR==4
+  SDLoc    dl(Op);
+#else
+  DebugLoc dl   = Op.getDebugLoc();
+#endif
+
+  NegateCC(LHS, RHS, CC);
+
+  return DAG.getNode(BPFISD::BR_CC, dl, Op.getValueType(),
+                     Chain, LHS, RHS, DAG.getConstant(CC, MVT::i64), Dest);
+}
+
+SDValue BPFTargetLowering::LowerSELECT_CC(SDValue Op,
+                                          SelectionDAG &DAG) const {
+  SDValue LHS    = Op.getOperand(0);
+  SDValue RHS    = Op.getOperand(1);
+  SDValue TrueV  = Op.getOperand(2);
+  SDValue FalseV = Op.getOperand(3);
+  ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(4))->get();
+#if LLVM_VERSION_MINOR==4
+  SDLoc    dl(Op);
+#else
+  DebugLoc dl    = Op.getDebugLoc();
+#endif
+
+  NegateCC(LHS, RHS, CC);
+
+  SDValue TargetCC = DAG.getConstant(CC, MVT::i64);
+
+  SDVTList VTs = DAG.getVTList(Op.getValueType(), MVT::Glue);
+  SmallVector<SDValue, 5> Ops;
+  Ops.push_back(LHS);
+  Ops.push_back(RHS);
+  Ops.push_back(TargetCC);
+  Ops.push_back(TrueV);
+  Ops.push_back(FalseV);
+
+  SDValue sel = DAG.getNode(BPFISD::SELECT_CC, dl, VTs, &Ops[0], Ops.size());
+  DEBUG(errs() << "LowerSELECT_CC:\n"; sel.dumpr(); errs() << "\n");
+  return sel;
+}
+
+const char *BPFTargetLowering::getTargetNodeName(unsigned Opcode) const {
+  switch (Opcode) {
+  default: return NULL;
+  case BPFISD::ADJDYNALLOC:        return "BPFISD::ADJDYNALLOC";
+  case BPFISD::RET_FLAG:           return "BPFISD::RET_FLAG";
+  case BPFISD::CALL:               return "BPFISD::CALL";
+  case BPFISD::SELECT_CC:          return "BPFISD::SELECT_CC";
+  case BPFISD::BR_CC:              return "BPFISD::BR_CC";
+  case BPFISD::Wrapper:            return "BPFISD::Wrapper";
+  }
+}
+
+SDValue BPFTargetLowering::LowerGlobalAddress(SDValue Op,
+                                              SelectionDAG &DAG) const {
+  Op.dump();
+  report_fatal_error("LowerGlobalAddress: BPF cannot access global variables");
+  return SDValue();
+}
+
+MachineBasicBlock*
+BPFTargetLowering::EmitInstrWithCustomInserter(MachineInstr *MI,
+                                               MachineBasicBlock *BB) const {
+  unsigned Opc = MI->getOpcode();
+
+  const TargetInstrInfo &TII = *getTargetMachine().getInstrInfo();
+  DebugLoc dl = MI->getDebugLoc();
+
+  assert(Opc == BPF::Select && "Unexpected instr type to insert");
+
+  // To "insert" a SELECT instruction, we actually have to insert the diamond
+  // control-flow pattern.  The incoming instruction knows the destination vreg
+  // to set, the condition code register to branch on, the true/false values to
+  // select between, and a branch opcode to use.
+  const BasicBlock *LLVM_BB = BB->getBasicBlock();
+  MachineFunction::iterator I = BB;
+  ++I;
+
+  //  thisMBB:
+  //  ...
+  //   TrueVal = ...
+  //   jmp_XX r1, r2 goto copy1MBB
+  //   fallthrough --> copy0MBB
+  MachineBasicBlock *thisMBB = BB;
+  MachineFunction *F = BB->getParent();
+  MachineBasicBlock *copy0MBB = F->CreateMachineBasicBlock(LLVM_BB);
+  MachineBasicBlock *copy1MBB = F->CreateMachineBasicBlock(LLVM_BB);
+
+  F->insert(I, copy0MBB);
+  F->insert(I, copy1MBB);
+  // Update machine-CFG edges by transferring all successors of the current
+  // block to the new block which will contain the Phi node for the select.
+  copy1MBB->splice(copy1MBB->begin(), BB,
+                   llvm::next(MachineBasicBlock::iterator(MI)),
+                   BB->end());
+  copy1MBB->transferSuccessorsAndUpdatePHIs(BB);
+  // Next, add the true and fallthrough blocks as its successors.
+  BB->addSuccessor(copy0MBB);
+  BB->addSuccessor(copy1MBB);
+
+  // Insert Branch if Flag
+  unsigned LHS = MI->getOperand(1).getReg();
+  unsigned RHS = MI->getOperand(2).getReg();
+  int CC  = MI->getOperand(3).getImm();
+  switch (CC) {
+  case ISD::SETGT:
+    BuildMI(BB, dl, TII.get(BPF::JSGT_rr))
+      .addReg(LHS).addReg(RHS).addMBB(copy1MBB);
+    break;
+  case ISD::SETUGT:
+    BuildMI(BB, dl, TII.get(BPF::JUGT_rr))
+      .addReg(LHS).addReg(RHS).addMBB(copy1MBB);
+    break;
+  case ISD::SETGE:
+    BuildMI(BB, dl, TII.get(BPF::JSGE_rr))
+      .addReg(LHS).addReg(RHS).addMBB(copy1MBB);
+    break;
+  case ISD::SETUGE:
+    BuildMI(BB, dl, TII.get(BPF::JUGE_rr))
+      .addReg(LHS).addReg(RHS).addMBB(copy1MBB);
+    break;
+  case ISD::SETEQ:
+    BuildMI(BB, dl, TII.get(BPF::JEQ_rr))
+      .addReg(LHS).addReg(RHS).addMBB(copy1MBB);
+    break;
+  case ISD::SETNE:
+    BuildMI(BB, dl, TII.get(BPF::JNE_rr))
+      .addReg(LHS).addReg(RHS).addMBB(copy1MBB);
+    break;
+  default:
+    report_fatal_error("unimplemented select CondCode " + Twine(CC));
+  }
+
+  //  copy0MBB:
+  //   %FalseValue = ...
+  //   # fallthrough to copy1MBB
+  BB = copy0MBB;
+
+  // Update machine-CFG edges
+  BB->addSuccessor(copy1MBB);
+
+  //  copy1MBB:
+  //   %Result = phi [ %FalseValue, copy0MBB ], [ %TrueValue, thisMBB ]
+  //  ...
+  BB = copy1MBB;
+  BuildMI(*BB, BB->begin(), dl, TII.get(BPF::PHI),
+          MI->getOperand(0).getReg())
+    .addReg(MI->getOperand(5).getReg()).addMBB(copy0MBB)
+    .addReg(MI->getOperand(4).getReg()).addMBB(thisMBB);
+
+  MI->eraseFromParent();   // The pseudo instruction is gone now.
+  return BB;
+}
+
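For reference (example ours): BPF has no conditional-move instruction, so
a ternary like the one below is lowered through the Select pseudo and the
diamond above; the signed '>' takes the JSGT_rr case and the PHI in
copy1MBB merges the two values:

long max(long a, long b)
{
	return a > b ? a : b;
}
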
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h b/tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h
new file mode 100644
index 0000000..0850a9e
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h
@@ -0,0 +1,105 @@
+//===-- BPFISelLowering.h - BPF DAG Lowering Interface --------*- C++ -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file defines the interfaces that BPF uses to lower LLVM code into a
+// selection DAG.
+
+#ifndef LLVM_TARGET_BPF_ISELLOWERING_H
+#define LLVM_TARGET_BPF_ISELLOWERING_H
+
+#include "BPF.h"
+#include "llvm/CodeGen/SelectionDAG.h"
+#include "llvm/Target/TargetLowering.h"
+
+namespace llvm {
+  namespace BPFISD {
+    enum {
+      FIRST_NUMBER = ISD::BUILTIN_OP_END,
+
+      ADJDYNALLOC,
+
+      /// Return with a flag operand. Operand 0 is the chain operand.
+      RET_FLAG,
+
+      /// CALL - These operations represent an abstract call instruction, which
+      /// includes a bunch of information.
+      CALL,
+
+      /// SELECT_CC - Operands 0 and 1 are the values to compare, operand 2
+      /// is the condition code, and operands 3 and 4 are the true/false
+      /// values to select between.
+      SELECT_CC,
+
+      /// BR_CC - Conditional branch. Operand 0 is the chain, operands 1
+      /// and 2 are the values to compare, operand 3 is the condition code
+      /// and operand 4 is the destination block.
+      BR_CC,
+
+      /// Wrapper - A wrapper node for TargetConstantPool, TargetExternalSymbol,
+      /// and TargetGlobalAddress.
+      Wrapper
+    };
+  }
+
+  class BPFSubtarget;
+  class BPFTargetMachine;
+
+  class BPFTargetLowering : public TargetLowering {
+  public:
+    explicit BPFTargetLowering(BPFTargetMachine &TM);
+
+    /// LowerOperation - Provide custom lowering hooks for some operations.
+    virtual SDValue LowerOperation(SDValue Op, SelectionDAG &DAG) const;
+
+    /// getTargetNodeName - This method returns the name of a target specific
+    /// DAG node.
+    virtual const char *getTargetNodeName(unsigned Opcode) const;
+
+    SDValue LowerBR_CC(SDValue Op, SelectionDAG &DAG) const;
+    SDValue LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const;
+    SDValue LowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;
+
+    MachineBasicBlock* EmitInstrWithCustomInserter(MachineInstr *MI,
+                                                   MachineBasicBlock *BB) const;
+
+  private:
+    const BPFSubtarget &Subtarget;
+    const BPFTargetMachine &TM;
+
+    SDValue LowerCallResult(SDValue Chain, SDValue InFlag,
+                            CallingConv::ID CallConv, bool isVarArg,
+                            const SmallVectorImpl<ISD::InputArg> &Ins,
+#if LLVM_VERSION_MINOR==4
+                            SDLoc dl,
+#else
+                            DebugLoc dl,
+#endif
+                            SelectionDAG &DAG,
+                            SmallVectorImpl<SDValue> &InVals) const;
+
+    SDValue LowerCall(TargetLowering::CallLoweringInfo &CLI,
+                      SmallVectorImpl<SDValue> &InVals) const;
+
+    SDValue LowerFormalArguments(SDValue Chain,
+                                 CallingConv::ID CallConv, bool isVarArg,
+                                 const SmallVectorImpl<ISD::InputArg> &Ins,
+#if LLVM_VERSION_MINOR==4
+                                 SDLoc dl,
+#else
+                                 DebugLoc dl,
+#endif
+                                 SelectionDAG &DAG,
+                                 SmallVectorImpl<SDValue> &InVals) const;
+
+    SDValue LowerReturn(SDValue Chain,
+                        CallingConv::ID CallConv, bool isVarArg,
+                        const SmallVectorImpl<ISD::OutputArg> &Outs,
+                        const SmallVectorImpl<SDValue> &OutVals,
+#if LLVM_VERSION_MINOR==4
+                        SDLoc dl,
+#else
+                        DebugLoc dl,
+#endif
+                        SelectionDAG &DAG) const;
+  };
+}
+
+#endif // LLVM_TARGET_BPF_ISELLOWERING_H
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td b/tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td
new file mode 100644
index 0000000..122ff19
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td
@@ -0,0 +1,29 @@
+//===- BPFInstrFormats.td - BPF Instruction Formats ----*- tablegen -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+class InstBPF<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : Instruction {
+  field bits<64> Inst;
+  field bits<64> SoftFail = 0;
+  let Size = 8;
+
+  let Namespace = "BPF";
+  let DecoderNamespace = "BPF";
+
+  bits<3> bpfClass;
+  let Inst{58-56} = bpfClass;
+
+  dag OutOperandList = outs;
+  dag InOperandList = ins;
+  let AsmString = asmstr;
+  let Pattern = pattern;
+}
+
+// Pseudo instructions
+class Pseudo<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstBPF<outs, ins, asmstr, pattern> {
+  let Inst{63-0} = 0;
+  let isPseudo = 1;
+}
+
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp
new file mode 100644
index 0000000..943de85
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp
@@ -0,0 +1,162 @@
+//===-- BPFInstrInfo.cpp - BPF Instruction Information --------*- C++ -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file contains the BPF implementation of the TargetInstrInfo class.
+
+#include "BPF.h"
+#include "BPFInstrInfo.h"
+#include "BPFSubtarget.h"
+#include "BPFTargetMachine.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/MachineRegisterInfo.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/TargetRegistry.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/SmallVector.h"
+
+#define GET_INSTRINFO_CTOR_DTOR /* for 3.4 */
+#define GET_INSTRINFO_CTOR /* for 3.2, 3.3 */
+#include "BPFGenInstrInfo.inc"
+
+using namespace llvm;
+
+BPFInstrInfo::BPFInstrInfo()
+  : BPFGenInstrInfo(BPF::ADJCALLSTACKDOWN, BPF::ADJCALLSTACKUP),
+    RI(*this) {
+}
+
+void BPFInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
+                                MachineBasicBlock::iterator I, DebugLoc DL,
+                                unsigned DestReg, unsigned SrcReg,
+                                bool KillSrc) const {
+  if (BPF::GPRRegClass.contains(DestReg, SrcReg))
+    BuildMI(MBB, I, DL, get(BPF::MOV_rr), DestReg)
+      .addReg(SrcReg, getKillRegState(KillSrc));
+  else
+    llvm_unreachable("Impossible reg-to-reg copy");
+}
+
+void BPFInstrInfo::
+storeRegToStackSlot(MachineBasicBlock &MBB, MachineBasicBlock::iterator I,
+                    unsigned SrcReg, bool isKill, int FI,
+                    const TargetRegisterClass *RC,
+                    const TargetRegisterInfo *TRI) const {
+  DebugLoc DL;
+  if (I != MBB.end()) DL = I->getDebugLoc();
+
+  if (RC == &BPF::GPRRegClass)
+    BuildMI(MBB, I, DL, get(BPF::STD)).addReg(SrcReg, getKillRegState(isKill))
+      .addFrameIndex(FI).addImm(0);
+  else
+    llvm_unreachable("Can't store this register to stack slot");
+}
+
+void BPFInstrInfo::
+loadRegFromStackSlot(MachineBasicBlock &MBB, MachineBasicBlock::iterator I,
+                     unsigned DestReg, int FI,
+                     const TargetRegisterClass *RC,
+                     const TargetRegisterInfo *TRI) const {
+  DebugLoc DL;
+  if (I != MBB.end()) DL = I->getDebugLoc();
+
+  if (RC == &BPF::GPRRegClass)
+    BuildMI(MBB, I, DL, get(BPF::LDD), DestReg).addFrameIndex(FI).addImm(0);
+  else
+    llvm_unreachable("Can't load this register from stack slot");
+}
+
+bool BPFInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
+                                  MachineBasicBlock *&TBB,
+                                  MachineBasicBlock *&FBB,
+                                  SmallVectorImpl<MachineOperand> &Cond,
+                                  bool AllowModify) const {
+  // Start from the bottom of the block and work up, examining the
+  // terminator instructions.
+  MachineBasicBlock::iterator I = MBB.end();
+  while (I != MBB.begin()) {
+    --I;
+    if (I->isDebugValue())
+      continue;
+
+    // Working from the bottom, when we see a non-terminator
+    // instruction, we're done.
+    if (!isUnpredicatedTerminator(I))
+      break;
+
+    // A terminator that isn't a branch can't easily be handled
+    // by this analysis.
+    if (!I->isBranch())
+      return true;
+
+    // Handle unconditional branches.
+    if (I->getOpcode() == BPF::JMP) {
+      if (!AllowModify) {
+        TBB = I->getOperand(0).getMBB();
+        continue;
+      }
+
+      // If the block has any instructions after a JMP, delete them.
+      while (llvm::next(I) != MBB.end())
+        llvm::next(I)->eraseFromParent();
+      Cond.clear();
+      FBB = 0;
+
+      // Delete the JMP if it's equivalent to a fall-through.
+      if (MBB.isLayoutSuccessor(I->getOperand(0).getMBB())) {
+        TBB = 0;
+        I->eraseFromParent();
+        I = MBB.end();
+        continue;
+      }
+
+      // TBB is used to indicate the unconditional destination.
+      TBB = I->getOperand(0).getMBB();
+      continue;
+    }
+    // Cannot handle conditional branches
+    return true;
+  }
+
+  return false;
+}
+
+unsigned
+BPFInstrInfo::InsertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
+                            MachineBasicBlock *FBB,
+                            const SmallVectorImpl<MachineOperand> &Cond,
+                            DebugLoc DL) const {
+  // Shouldn't be a fall through.
+  assert(TBB && "InsertBranch must not be told to insert a fallthrough");
+
+  if (Cond.empty()) {
+    // Unconditional branch
+    assert(!FBB && "Unconditional branch with multiple successors!");
+    BuildMI(&MBB, DL, get(BPF::JMP)).addMBB(TBB);
+    return 1;
+  }
+
+  llvm_unreachable("Unexpected conditional branch");
+  return 0;
+}
+
+unsigned BPFInstrInfo::RemoveBranch(MachineBasicBlock &MBB) const {
+  MachineBasicBlock::iterator I = MBB.end();
+  unsigned Count = 0;
+
+  while (I != MBB.begin()) {
+    --I;
+    if (I->isDebugValue())
+      continue;
+    if (I->getOpcode() != BPF::JMP)
+      break;
+    // Remove the branch.
+    I->eraseFromParent();
+    I = MBB.end();
+    ++Count;
+  }
+
+  return Count;
+}
+
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h b/tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h
new file mode 100644
index 0000000..911387d
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h
@@ -0,0 +1,53 @@
+//===- BPFInstrInfo.h - BPF Instruction Information ---------*- C++ -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#ifndef BPFINSTRUCTIONINFO_H
+#define BPFINSTRUCTIONINFO_H
+
+#include "BPFRegisterInfo.h"
+#include "BPFSubtarget.h"
+#include "llvm/Target/TargetInstrInfo.h"
+
+#define GET_INSTRINFO_HEADER
+#include "BPFGenInstrInfo.inc"
+
+namespace llvm {
+
+class BPFInstrInfo : public BPFGenInstrInfo {
+  const BPFRegisterInfo RI;
+public:
+  BPFInstrInfo();
+
+  virtual const BPFRegisterInfo &getRegisterInfo() const { return RI; }
+
+  virtual void copyPhysReg(MachineBasicBlock &MBB,
+                           MachineBasicBlock::iterator I, DebugLoc DL,
+                           unsigned DestReg, unsigned SrcReg,
+                           bool KillSrc) const;
+
+  virtual void storeRegToStackSlot(MachineBasicBlock &MBB,
+                                   MachineBasicBlock::iterator MBBI,
+                                   unsigned SrcReg, bool isKill, int FrameIndex,
+                                   const TargetRegisterClass *RC,
+                                   const TargetRegisterInfo *TRI) const;
+
+  virtual void loadRegFromStackSlot(MachineBasicBlock &MBB,
+                                    MachineBasicBlock::iterator MBBI,
+                                    unsigned DestReg, int FrameIndex,
+                                    const TargetRegisterClass *RC,
+                                    const TargetRegisterInfo *TRI) const;
+  bool AnalyzeBranch(MachineBasicBlock &MBB,
+                     MachineBasicBlock *&TBB, MachineBasicBlock *&FBB,
+                     SmallVectorImpl<MachineOperand> &Cond,
+                     bool AllowModify) const;
+
+  unsigned RemoveBranch(MachineBasicBlock &MBB) const;
+  unsigned InsertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
+                        MachineBasicBlock *FBB,
+                        const SmallVectorImpl<MachineOperand> &Cond,
+                        DebugLoc DL) const;
+};
+}
+
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td b/tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td
new file mode 100644
index 0000000..ca95d9c
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -0,0 +1,455 @@
+//===-- BPFInstrInfo.td - Target Description for BPF Target -------------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file describes the BPF instructions in TableGen format.
+
+include "BPFInstrFormats.td"
+
+// Instruction Operands and Patterns
+
+//  These are target-independent nodes, but have target-specific formats.
+def SDT_BPFCallSeqStart : SDCallSeqStart<[ SDTCisVT<0, iPTR> ]>;
+def SDT_BPFCallSeqEnd   : SDCallSeqEnd<[ SDTCisVT<0, iPTR>,
+                                          SDTCisVT<1, iPTR> ]>;
+def SDT_BPFCall         : SDTypeProfile<0, -1, [SDTCisVT<0, iPTR>]>;
+def SDT_BPFSetFlag      : SDTypeProfile<0, 3, [SDTCisSameAs<0, 1>]>;
+def SDT_BPFSelectCC     : SDTypeProfile<1, 5, [SDTCisSameAs<1, 2>, SDTCisSameAs<0, 4>,
+                                                SDTCisSameAs<4, 5>]>;
+def SDT_BPFBrCC         : SDTypeProfile<0, 4, [SDTCisSameAs<0, 1>, SDTCisVT<3, OtherVT>]>;
+
+def SDT_BPFWrapper      : SDTypeProfile<1, 1, [SDTCisSameAs<0, 1>,
+                                                SDTCisPtrTy<0>]>;
+//def SDT_BPFAdjDynAlloc  : SDTypeProfile<1, 1, [SDTCisVT<0, i64>,
+//                                                SDTCisVT<1, i64>]>;
+
+def call            : SDNode<"BPFISD::CALL", SDT_BPFCall,
+                             [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
+                              SDNPVariadic]>;
+def retflag         : SDNode<"BPFISD::RET_FLAG", SDTNone,
+                             [SDNPHasChain, SDNPOptInGlue, SDNPVariadic]>;
+def callseq_start   : SDNode<"ISD::CALLSEQ_START", SDT_BPFCallSeqStart,
+                             [SDNPHasChain, SDNPOutGlue]>;
+def callseq_end     : SDNode<"ISD::CALLSEQ_END",   SDT_BPFCallSeqEnd,
+                             [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue]>;
+def BPFbrcc        : SDNode<"BPFISD::BR_CC", SDT_BPFBrCC,
+                              [SDNPHasChain, SDNPOutGlue, SDNPInGlue]>;
+
+def BPFselectcc    : SDNode<"BPFISD::SELECT_CC", SDT_BPFSelectCC, [SDNPInGlue]>;
+def BPFWrapper     : SDNode<"BPFISD::Wrapper", SDT_BPFWrapper>;
+
+//def BPFadjdynalloc : SDNode<"BPFISD::ADJDYNALLOC", SDT_BPFAdjDynAlloc>;
+
+// helper macros to produce a 64-bit constant
+// 0x11223344 55667788 ->
+// reg = 0x11223344
+// reg <<= 32
+// reg += 0x55667788
+//
+// 0x11223344 FF667788 ->
+// reg = 0x11223345
+// reg <<= 32
+// reg += (long long)(int)0xFF667788
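+// (the low half is added as a sign-extended 32-bit value, so when its
+// bit 31 is set, HI32 carries one into the high half to compensate)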
+def LO32 : SDNodeXForm<imm, [{
+  return CurDAG->getTargetConstant((int64_t)(int32_t)(uint64_t)N->getZExtValue(),
+                                   MVT::i64);
+}]>;
+def HI32 : SDNodeXForm<imm, [{
+  return CurDAG->getTargetConstant(((int64_t)N->getZExtValue() -
+         (int64_t)(int32_t)(uint64_t)N->getZExtValue()) >> 32, MVT::i64);
+}]>;
+
+
+def brtarget : Operand<OtherVT>;
+def calltarget : Operand<i64>;
+
+def s32imm   : Operand<i64> {
+  let PrintMethod = "printS32ImmOperand";
+}
+
+def immSExt32 : PatLeaf<(imm),
+                [{return isInt<32>(N->getSExtValue()); }]>;
+
+// Addressing modes.
+def ADDRri : ComplexPattern<i64, 2, "SelectAddr", [frameindex], []>;
+
+// Address operands
+def MEMri : Operand<i64> {
+  let PrintMethod = "printMemOperand";
+  let EncoderMethod = "getMemoryOpValue";
+  let DecoderMethod = "todo_decode_memri";
+  let MIOperandInfo = (ops GPR, i16imm);
+}
+
+// Conditional code predicates - used for pattern matching on BPF_JUMP condition codes
+def BPF_CC_EQ  : PatLeaf<(imm),
+                  [{return (N->getZExtValue() == ISD::SETEQ);}]>;
+def BPF_CC_NE  : PatLeaf<(imm),
+                  [{return (N->getZExtValue() == ISD::SETNE);}]>;
+def BPF_CC_GE  : PatLeaf<(imm),
+                  [{return (N->getZExtValue() == ISD::SETGE);}]>;
+def BPF_CC_GT  : PatLeaf<(imm),
+                  [{return (N->getZExtValue() == ISD::SETGT);}]>;
+def BPF_CC_GTU : PatLeaf<(imm),
+                  [{return (N->getZExtValue() == ISD::SETUGT);}]>;
+def BPF_CC_GEU : PatLeaf<(imm),
+                  [{return (N->getZExtValue() == ISD::SETUGE);}]>;
+
+// jump instructions
+class JMP_RR<bits<4> br_op, string asmstr, PatLeaf Cond>
+  : InstBPF<(outs), (ins GPR:$rA, GPR:$rX, brtarget:$dst),
+           !strconcat(asmstr, "\t$rA, $rX goto $dst"),
+           [(BPFbrcc (i64 GPR:$rA), (i64 GPR:$rX), Cond, bb:$dst)]> {
+  bits<4> op;
+  bits<1> src;
+  bits<4> rA;
+  bits<4> rX;
+  bits<16> dst;
+
+  let Inst{63-60} = op;
+  let Inst{59} = src;
+  let Inst{55-52} = rX;
+  let Inst{51-48} = rA;
+  let Inst{47-32} = dst;
+
+  let op = br_op;
+  let src = 1;
+  let bpfClass = 5; // BPF_JUMP
+}
+
+class JMP_RI<bits<4> br_op, string asmstr, PatLeaf Cond>
+  : InstBPF<(outs), (ins GPR:$rA, s32imm:$imm, brtarget:$dst),
+           !strconcat(asmstr, "i\t$rA, $imm goto $dst"),
+           [(BPFbrcc (i64 GPR:$rA), immSExt32:$imm, Cond, bb:$dst)]> {
+  bits<4> op;
+  bits<1> src;
+  bits<4> rA;
+  bits<16> dst;
+  bits<32> imm;
+
+  let Inst{63-60} = op;
+  let Inst{59} = src;
+  let Inst{51-48} = rA;
+  let Inst{47-32} = dst;
+  let Inst{31-0} = imm;
+
+  let op = br_op;
+  let src = 0;
+  let bpfClass = 5; // BPF_JUMP
+}
+
+multiclass J<bits<4> op2Val, string asmstr, PatLeaf Cond> {
+  def _rr : JMP_RR<op2Val, asmstr, Cond>;
+  def _ri : JMP_RI<op2Val, asmstr, Cond>;
+}
+
+let isBranch = 1, isTerminator = 1, hasDelaySlot=0 in {
+// cmp+goto instructions
+defm JEQ  : J<0x1, "jeq",  BPF_CC_EQ>;
+defm JUGT : J<0x2, "jgt", BPF_CC_GTU>;
+defm JUGE : J<0x3, "jge", BPF_CC_GEU>;
+defm JNE  : J<0x5, "jne",  BPF_CC_NE>;
+defm JSGT : J<0x6, "jsgt", BPF_CC_GT>;
+defm JSGE : J<0x7, "jsge", BPF_CC_GE>;
+}
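+// Only ==, !=, > and >= (signed and unsigned) have encodings here;
+// < and <= are presumably obtained by swapping operands when lowering
+// br_cc/select_cc.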
+
+// ALU instructions
+class ALU_RI<bits<4> aluOp, string asmstr, SDNode OpNode>
+  : InstBPF<(outs GPR:$rA), (ins GPR:$rS, s32imm:$imm),
+            !strconcat(asmstr, "i\t$rA, $imm"),
+            [(set GPR:$rA, (OpNode GPR:$rS, immSExt32:$imm))]> {
+  bits<4> op;
+  bits<1> src;
+  bits<4> rA;
+  bits<32> imm;
+
+  let Inst{63-60} = op;
+  let Inst{59} = src;
+  let Inst{51-48} = rA;
+  let Inst{31-0} = imm;
+
+  let op = aluOp;
+  let src = 0;
+  let bpfClass = 4;
+}
+
+class ALU_RR<bits<4> aluOp, string asmstr, SDNode OpNode>
+  : InstBPF<(outs GPR:$rA), (ins GPR:$rS, GPR:$rX),
+            !strconcat(asmstr, "\t$rA, $rX"),
+            [(set GPR:$rA, (OpNode (i64 GPR:$rS), (i64 GPR:$rX)))]> {
+  bits<4> op;
+  bits<1> src;
+  bits<4> rA;
+  bits<4> rX;
+
+  let Inst{63-60} = op;
+  let Inst{59} = src;
+  let Inst{55-52} = rX;
+  let Inst{51-48} = rA;
+
+  let op = aluOp;
+  let src = 1;
+  let bpfClass = 4;
+}
+
+multiclass ALU<bits<4> opVal, string asmstr, SDNode OpNode> {
+  def _rr : ALU_RR<opVal, asmstr, OpNode>;
+  def _ri : ALU_RI<opVal, asmstr, OpNode>;
+}
+
+let Constraints = "$rA = $rS" in {
+let isAsCheapAsAMove = 1 in {
+  defm ADD : ALU<0x0, "add", add>;
+  defm SUB : ALU<0x1, "sub", sub>;
+  defm OR  : ALU<0x4, "or", or>;
+  defm AND : ALU<0x5, "and", and>;
+  defm SLL : ALU<0x6, "sll", shl>;
+  defm SRL : ALU<0x7, "srl", srl>;
+  defm XOR : ALU<0xa, "xor", xor>;
+  defm SRA : ALU<0xc, "sra", sra>;
+}
+  defm MUL : ALU<0x2, "mul", mul>;
+  defm DIV : ALU<0x3, "div", udiv>;
+}
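+// The "$rA = $rS" constraint makes these two-address instructions:
+// the destination doubles as the left source, matching BPF's in-place
+// ALU semantics.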
+
+class MOV_RR<string asmstr>
+  : InstBPF<(outs GPR:$rA), (ins GPR:$rX),
+            !strconcat(asmstr, "\t$rA, $rX"),
+            []> {
+  bits<4> op;
+  bits<1> src;
+  bits<4> rA;
+  bits<4> rX;
+
+  let Inst{63-60} = op;
+  let Inst{59} = src;
+  let Inst{55-52} = rX;
+  let Inst{51-48} = rA;
+
+  let op = 0xb;
+  let src = 1;
+  let bpfClass = 4;
+}
+
+class MOV_RI<string asmstr>
+  : InstBPF<(outs GPR:$rA), (ins s32imm:$imm),
+            !strconcat(asmstr, "\t$rA, $imm"),
+            [(set GPR:$rA, (i64 immSExt32:$imm))]> {
+  bits<4> op;
+  bits<1> src;
+  bits<4> rA;
+  bits<32> imm;
+
+  let Inst{63-60} = op;
+  let Inst{59} = src;
+  let Inst{51-48} = rA;
+  let Inst{31-0} = imm;
+
+  let op = 0xb;
+  let src = 0;
+  let bpfClass = 4;
+}
+def MOV_rr : MOV_RR<"mov">;
+def MOV_ri : MOV_RI<"mov">;
+
+// STORE instructions
+class STORE<bits<2> sizeOp, string asmstring, list<dag> pattern>
+  : InstBPF<(outs), (ins GPR:$rX, MEMri:$addr),
+          !strconcat(asmstring, "\t$addr, $rX"), pattern> {
+  bits<3> mode;
+  bits<2> size;
+  bits<4> rX;
+  bits<20> addr;
+
+  let Inst{63-61} = mode;
+  let Inst{60-59} = size;
+  let Inst{51-48} = addr{19-16}; // base reg
+  let Inst{55-52} = rX;
+  let Inst{47-32} = addr{15-0}; // offset
+
+  let mode = 6; // BPF_REL
+  let size = sizeOp;
+  let bpfClass = 3; // BPF_STX
+}
+
+class STOREi64<bits<2> subOp, string asmstring, PatFrag opNode>
+  : STORE<subOp, asmstring, [(opNode (i64 GPR:$rX), ADDRri:$addr)]>;
+
+def STW : STOREi64<0x0, "stw", truncstorei32>;
+def STH : STOREi64<0x1, "sth", truncstorei16>;
+def STB : STOREi64<0x2, "stb", truncstorei8>;
+def STD : STOREi64<0x3, "std", store>;
+
+// LOAD instructions
+class LOAD<bits<2> sizeOp, string asmstring, list<dag> pattern>
+  : InstBPF<(outs GPR:$rA), (ins MEMri:$addr),
+           !strconcat(asmstring, "\t$rA, $addr"), pattern> {
+  bits<3> mode;
+  bits<2> size;
+  bits<4> rA;
+  bits<20> addr;
+
+  let Inst{63-61} = mode;
+  let Inst{60-59} = size;
+  let Inst{51-48} = rA;
+  let Inst{55-52} = addr{19-16};
+  let Inst{47-32} = addr{15-0};
+
+  let mode = 6; // BPF_REL
+  let size = sizeOp;
+  let bpfClass = 1; // BPF_LDX
+}
+
+class LOADi64<bits<2> sizeOp, string asmstring, PatFrag opNode>
+  : LOAD<sizeOp, asmstring, [(set (i64 GPR:$rA), (opNode ADDRri:$addr))]>;
+
+def LDW : LOADi64<0x0, "ldw", zextloadi32>;
+def LDH : LOADi64<0x1, "ldh", zextloadi16>;
+def LDB : LOADi64<0x2, "ldb", zextloadi8>;
+def LDD : LOADi64<0x3, "ldd", load>;
+
+//def LDBS : LOADi64<0x2, "ldbs", sextloadi8>;
+//def LDHS : LOADi64<0x1, "ldhs", sextloadi16>;
+//def LDWS : LOADi64<0x0, "ldws", sextloadi32>;
+
+class BRANCH<bits<4> subOp, string asmstring, list<dag> pattern>
+  : InstBPF<(outs), (ins brtarget:$dst),
+           !strconcat(asmstring, "\t$dst"), pattern> {
+  bits<4> op;
+  bits<16> dst;
+  bits<1> src;
+
+  let Inst{63-60} = op;
+  let Inst{59} = src;
+  let Inst{47-32} = dst;
+
+  let op = subOp;
+  let src = 1;
+  let bpfClass = 5; // BPF_JUMP
+}
+
+class CALL<string asmstring>
+  : InstBPF<(outs), (ins calltarget:$dst),
+           !strconcat(asmstring, "\t$dst"), []> {
+  bits<4> op;
+  bits<32> dst;
+  bits<1> src;
+
+  let Inst{63-60} = op;
+  let Inst{59} = src;
+  let Inst{31-0} = dst;
+
+  let op = 8; // BPF_CALL
+  let src = 0;
+  let bpfClass = 5; // BPF_JUMP
+}
+
+// Jump always
+let isBranch = 1, isTerminator = 1, hasDelaySlot=0, isBarrier = 1 in {
+  def JMP : BRANCH<0x0, "jmp", [(br bb:$dst)]>;
+}
+
+// Jump and link
+let isCall=1, hasDelaySlot=0,
+    Uses = [R11],
+    // Potentially clobbered registers
+    Defs = [R0, R1, R2, R3, R4, R5] in {
+  def JAL  : CALL<"call">;
+}
+
+class NOP_I<string asmstr>
+  : InstBPF<(outs), (ins i32imm:$imm),
+           !strconcat(asmstr, "\t$imm"), []> {
+  bits<32> imm;
+
+  let Inst{31-0} = imm;
+
+  let Inst{63-59} = 0;
+  let bpfClass = 7; // BPF_MISC
+}
+
+let neverHasSideEffects = 1 in
+  def NOP : NOP_I<"nop">;
+
+class RET<string asmstring>
+  : InstBPF<(outs), (ins),
+           !strconcat(asmstring, ""), [(retflag)]> {
+  let Inst{63-59} = 0;
+  let bpfClass = 6; // BPF_RET
+}
+
+let isReturn = 1, isTerminator = 1, hasDelaySlot=0, isBarrier = 1, isNotDuplicable = 1 in {
+  def RET : RET<"ret">;
+}
+
+// ADJCALLSTACKDOWN/UP pseudo insns
+let Defs = [R11], Uses = [R11] in {
+def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i64imm:$amt),
+                              "#ADJCALLSTACKDOWN $amt",
+                              [(callseq_start timm:$amt)]>;
+def ADJCALLSTACKUP   : Pseudo<(outs), (ins i64imm:$amt1, i64imm:$amt2),
+                              "#ADJCALLSTACKUP $amt1 $amt2",
+                              [(callseq_end timm:$amt1, timm:$amt2)]>;
+}
+
+
+//let Defs = [R11], Uses = [R11] in {
+//  def ADJDYNALLOC : Pseudo<(outs GPR:$dst), (ins GPR:$src),
+//                    "#ADJDYNALLOC $dst $src",
+//                    [(set GPR:$dst, (BPFadjdynalloc GPR:$src))]>;
+//}
+
+
+let usesCustomInserter = 1 in {
+  def Select : Pseudo<(outs GPR:$dst), (ins GPR:$lhs, GPR:$rhs, s32imm:$imm, GPR:$src, GPR:$src2),
+                       "# Select PSEUDO $dst = $lhs $imm $rhs ? $src : $src2",
+                       [(set (i64 GPR:$dst),
+                        (BPFselectcc (i64 GPR:$lhs), (i64 GPR:$rhs), (i64 imm:$imm), (i64 GPR:$src), (i64 GPR:$src2)))]>;
+}
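+// Select has no native encoding; it is expanded by
+// EmitInstrWithCustomInserter (declared in BPFISelLowering.h),
+// presumably into a conditional-branch diamond joined by a PHI.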
+
+// Non-Instruction Patterns
+
+// arbitrary immediate
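+// e.g. (i64 0x11223344FF667788) becomes:
+//   mov rA, 0x11223345; slli rA, 32; addi rA, 0xFF667788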
+def : Pat<(i64 imm:$imm), (ADD_ri (SLL_ri (MOV_ri (HI32 imm:$imm)), 32), (LO32 imm:$imm))>;
+
+// 0xffffFFFF doesn't fit into simm32; optimize this common zero-extension case
+def : Pat<(i64 (and (i64 GPR:$src), 0xffffFFFF)), (SRL_ri (SLL_ri (i64 GPR:$src), 32), 32)>;
+
+// Calls
+def : Pat<(call tglobaladdr:$dst), (JAL tglobaladdr:$dst)>;
+//def : Pat<(call texternalsym:$dst), (JAL texternalsym:$dst)>;
+//def : Pat<(call (i32 imm:$dst)), (JAL (i32 imm:$dst))>;
+//def : Pat<(call imm:$dst), (JAL imm:$dst)>;
+
+// Loads
+def : Pat<(extloadi8  ADDRri:$src), (i64 (LDB ADDRri:$src))>;
+def : Pat<(extloadi16 ADDRri:$src), (i64 (LDH ADDRri:$src))>;
+def : Pat<(extloadi32 ADDRri:$src), (i64 (LDW ADDRri:$src))>;
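+// anyext loads are satisfied by the zero-extending LDB/LDH/LDW above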
+
+// Atomics
+class XADD<bits<2> sizeOp, string asmstr, PatFrag opNode>
+  : InstBPF<(outs GPR:$dst), (ins MEMri:$addr, GPR:$val),
+            !strconcat(asmstr, "\t$dst, $addr, $val"),
+            [(set GPR:$dst, (opNode ADDRri:$addr, GPR:$val))]> {
+  bits<3> mode;
+  bits<2> size;
+  bits<4> rX;
+  bits<20> addr;
+
+  let Inst{63-61} = mode;
+  let Inst{60-59} = size;
+  let Inst{51-48} = addr{19-16}; // base reg
+  let Inst{55-52} = rX;
+  let Inst{47-32} = addr{15-0}; // offset
+
+  let mode = 7; // BPF_XADD
+  let size = sizeOp;
+  let bpfClass = 3; // BPF_STX
+}
+
+let Constraints = "$dst = $val" in {
+def XADD32 : XADD<0, "xadd32", atomic_load_add_32>;
+def XADD16 : XADD<1, "xadd16", atomic_load_add_16>;
+def XADD8  : XADD<2, "xadd8", atomic_load_add_8>;
+def XADD64 : XADD<3, "xadd64", atomic_load_add_64>;
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp
new file mode 100644
index 0000000..5c15ed7
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp
@@ -0,0 +1,77 @@
+//=-- BPFMCInstLower.cpp - Convert BPF MachineInstr to an MCInst ----------=//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file contains code to lower BPF MachineInstrs to their corresponding
+// MCInst records.
+
+#include "BPFMCInstLower.h"
+#include "MCTargetDesc/BPFBaseInfo.h"
+#include "llvm/CodeGen/AsmPrinter.h"
+#include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineInstr.h"
+#include "llvm/MC/MCAsmInfo.h"
+#include "llvm/MC/MCContext.h"
+#include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/Target/Mangler.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/ADT/SmallString.h"
+using namespace llvm;
+
+MCSymbol *BPFMCInstLower::
+GetGlobalAddressSymbol(const MachineOperand &MO) const {
+#if LLVM_VERSION_MINOR==4
+  return Printer.getSymbol(MO.getGlobal());
+#else
+  return Printer.Mang->getSymbol(MO.getGlobal());
+#endif
+}
+
+MCOperand BPFMCInstLower::
+LowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym) const {
+
+  const MCExpr *Expr = MCSymbolRefExpr::Create(Sym, Ctx);
+
+  if (!MO.isJTI() && MO.getOffset())
+    llvm_unreachable("unknown symbol op");
+//    Expr = MCBinaryExpr::CreateAdd(Expr,
+//                                   MCConstantExpr::Create(MO.getOffset(), Ctx),
+//                                   Ctx);
+  return MCOperand::CreateExpr(Expr);
+}
+
+void BPFMCInstLower::Lower(const MachineInstr *MI, MCInst &OutMI) const {
+  OutMI.setOpcode(MI->getOpcode());
+
+  for (unsigned i = 0, e = MI->getNumOperands(); i != e; ++i) {
+    const MachineOperand &MO = MI->getOperand(i);
+
+    MCOperand MCOp;
+    switch (MO.getType()) {
+    default:
+      MI->dump();
+      llvm_unreachable("unknown operand type");
+    case MachineOperand::MO_Register:
+      // Ignore all implicit register operands.
+      if (MO.isImplicit()) continue;
+      MCOp = MCOperand::CreateReg(MO.getReg());
+      break;
+    case MachineOperand::MO_Immediate:
+      MCOp = MCOperand::CreateImm(MO.getImm());
+      break;
+    case MachineOperand::MO_MachineBasicBlock:
+      MCOp = MCOperand::CreateExpr(MCSymbolRefExpr::Create(
+                                   MO.getMBB()->getSymbol(), Ctx));
+      break;
+    case MachineOperand::MO_RegisterMask:
+      continue;
+    case MachineOperand::MO_GlobalAddress:
+      MCOp = LowerSymbolOperand(MO, GetGlobalAddressSymbol(MO));
+      break;
+    }
+
+    OutMI.addOperand(MCOp);
+  }
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h b/tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h
new file mode 100644
index 0000000..aaff0c3
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h
@@ -0,0 +1,40 @@
+//===-- BPFMCInstLower.h - Lower MachineInstr to MCInst --------*- C++ -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#ifndef BPF_MCINSTLOWER_H
+#define BPF_MCINSTLOWER_H
+
+#include "llvm/Support/Compiler.h"
+
+namespace llvm {
+  class AsmPrinter;
+  class MCContext;
+  class MCInst;
+  class MCOperand;
+  class MCSymbol;
+  class MachineInstr;
+  class MachineModuleInfoMachO;
+  class MachineOperand;
+  class Mangler;
+
+  /// BPFMCInstLower - This class is used to lower a MachineInstr
+  /// into an MCInst.
+class LLVM_LIBRARY_VISIBILITY BPFMCInstLower {
+  MCContext &Ctx;
+  Mangler &Mang;
+
+  AsmPrinter &Printer;
+public:
+  BPFMCInstLower(MCContext &ctx, Mangler &mang, AsmPrinter &printer)
+    : Ctx(ctx), Mang(mang), Printer(printer) {}
+  void Lower(const MachineInstr *MI, MCInst &OutMI) const;
+
+  MCOperand LowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym) const;
+
+  MCSymbol *GetGlobalAddressSymbol(const MachineOperand &MO) const;
+};
+
+}
+
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp
new file mode 100644
index 0000000..7d46041
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp
@@ -0,0 +1,122 @@
+//===-- BPFRegisterInfo.cpp - BPF Register Information --------*- C++ -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file contains the BPF implementation of the TargetRegisterInfo class.
+
+#include "BPF.h"
+#include "BPFRegisterInfo.h"
+#include "BPFSubtarget.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/MachineFrameInfo.h"
+#include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/RegisterScavenging.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Target/TargetFrameLowering.h"
+#include "llvm/Target/TargetInstrInfo.h"
+
+#define GET_REGINFO_TARGET_DESC
+#include "BPFGenRegisterInfo.inc"
+using namespace llvm;
+
+BPFRegisterInfo::BPFRegisterInfo(const TargetInstrInfo &tii)
+  : BPFGenRegisterInfo(BPF::R0), TII(tii) {
+}
+
+const uint16_t*
+BPFRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
+  return CSR_SaveList;
+}
+
+BitVector BPFRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
+  BitVector Reserved(getNumRegs());
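+  // R10 (frame pointer) and R11 (stack pointer) are never allocatable;
+  // see the register classes in BPFRegisterInfo.td.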
+  Reserved.set(BPF::R10);
+  Reserved.set(BPF::R11);
+  return Reserved;
+}
+
+bool
+BPFRegisterInfo::requiresRegisterScavenging(const MachineFunction &MF) const {
+  return true;
+}
+
+void
+#if LLVM_VERSION_MINOR==3 || LLVM_VERSION_MINOR==4
+BPFRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator II,
+                           int SPAdj, unsigned FIOperandNum,
+                           RegScavenger *RS) const {
+#else
+BPFRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator II,
+                                       int SPAdj, RegScavenger *RS) const {
+#endif
+  assert(SPAdj == 0 && "Unexpected");
+
+  unsigned i = 0;
+  MachineInstr &MI = *II;
+  MachineFunction &MF = *MI.getParent()->getParent();
+  DebugLoc dl = MI.getDebugLoc();
+
+  while (!MI.getOperand(i).isFI()) {
+    ++i;
+    assert(i < MI.getNumOperands() && "Instr doesn't have FrameIndex operand!");
+  }
+
+  unsigned FrameReg = getFrameRegister(MF);
+  int FrameIndex = MI.getOperand(i).getIndex();
+
+  if (MI.getOpcode() == BPF::MOV_rr) {
+    int Offset = MF.getFrameInfo()->getObjectOffset(FrameIndex);
+
+    MI.getOperand(i).ChangeToRegister(FrameReg, false);
+
+    MachineBasicBlock &MBB = *MI.getParent();
+    unsigned reg = MI.getOperand(i - 1).getReg();
+    BuildMI(MBB, ++ II, dl, TII.get(BPF::ADD_ri), reg)
+       .addReg(reg).addImm(Offset);
+    return;
+  }
+
+  int Offset = MF.getFrameInfo()->getObjectOffset(FrameIndex) +
+               MI.getOperand(i+1).getImm();
+
+  if (!isInt<32>(Offset)) {
+    llvm_unreachable("bug in frame offset");
+  }
+
+  MI.getOperand(i).ChangeToRegister(FrameReg, false);
+  MI.getOperand(i+1).ChangeToImmediate(Offset);
+}
+
+void BPFRegisterInfo::
+processFunctionBeforeFrameFinalized(MachineFunction &MF) const {}
+
+bool BPFRegisterInfo::hasBasePointer(const MachineFunction &MF) const {
+   return false;
+}
+
+bool BPFRegisterInfo::needsStackRealignment(const MachineFunction &MF) const {
+  return false;
+}
+
+unsigned BPFRegisterInfo::getRARegister() const {
+  return BPF::R0;
+}
+
+unsigned BPFRegisterInfo::getFrameRegister(const MachineFunction &MF) const {
+  return BPF::R10;
+}
+
+unsigned BPFRegisterInfo::getBaseRegister() const {
+  llvm_unreachable("What is the base register");
+  return 0;
+}
+
+unsigned BPFRegisterInfo::getEHExceptionRegister() const {
+  llvm_unreachable("What is the exception register");
+  return 0;
+}
+
+unsigned BPFRegisterInfo::getEHHandlerRegister() const {
+  llvm_unreachable("What is the exception handler register");
+  return 0;
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h b/tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h
new file mode 100644
index 0000000..8aeb341
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h
@@ -0,0 +1,65 @@
+//===- BPFRegisterInfo.h - BPF Register Information Impl ------*- C++ -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file contains the BPF implementation of the TargetRegisterInfo class.
+
+#ifndef BPFREGISTERINFO_H
+#define BPFREGISTERINFO_H
+
+#include "llvm/Target/TargetRegisterInfo.h"
+
+#define GET_REGINFO_HEADER
+#include "BPFGenRegisterInfo.inc"
+
+namespace llvm {
+
+class TargetInstrInfo;
+class Type;
+
+struct BPFRegisterInfo : public BPFGenRegisterInfo {
+  const TargetInstrInfo &TII;
+
+  BPFRegisterInfo(const TargetInstrInfo &tii);
+
+  /// Code Generation virtual methods...
+  const uint16_t *getCalleeSavedRegs(const MachineFunction *MF = 0) const;
+
+  BitVector getReservedRegs(const MachineFunction &MF) const;
+
+  bool requiresRegisterScavenging(const MachineFunction &MF) const;
+
+  // llvm 3.2 still declares this hook in TargetRegisterInfo
+  // (llvm 3.3+ moved it to TargetFrameLowering)
+  void eliminateCallFramePseudoInstr(MachineFunction &MF,
+                                     MachineBasicBlock &MBB,
+                                     MachineBasicBlock::iterator I) const {
+    // Discard ADJCALLSTACKDOWN, ADJCALLSTACKUP instructions.
+    MBB.erase(I);
+  }
+
+#if LLVM_VERSION_MINOR==3 || LLVM_VERSION_MINOR==4
+  void eliminateFrameIndex(MachineBasicBlock::iterator MI,
+                           int SPAdj, unsigned FIOperandNum,
+                           RegScavenger *RS = NULL) const;
+#else
+  void eliminateFrameIndex(MachineBasicBlock::iterator II,
+                           int SPAdj, RegScavenger *RS = NULL) const;
+#endif
+
+  void processFunctionBeforeFrameFinalized(MachineFunction &MF) const;
+
+  bool hasBasePointer(const MachineFunction &MF) const;
+  bool needsStackRealignment(const MachineFunction &MF) const;
+
+  // Debug information queries.
+  unsigned getRARegister() const;
+  unsigned getFrameRegister(const MachineFunction &MF) const;
+  unsigned getBaseRegister() const;
+
+  // Exception handling queries.
+  unsigned getEHExceptionRegister() const;
+  unsigned getEHHandlerRegister() const;
+  int getDwarfRegNum(unsigned RegNum, bool isEH) const;
+};
+}
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td b/tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td
new file mode 100644
index 0000000..fac0817
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td
@@ -0,0 +1,39 @@
+//===- BPFRegisterInfo.td - BPF Register defs ------------*- tablegen -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//  Declarations that describe the BPF register file
+
+class BPFReg<string n> : Register<n> {
+  field bits<4> Num;
+  let Namespace = "BPF";
+}
+
+// Registers are identified with 4-bit ID numbers.
+// Ri - 64-bit integer registers
+class Ri<bits<4> num, string n> : BPFReg<n> {
+  let Num = num;
+}
+
+// Integer registers
+def R0 : Ri< 0, "r0">, DwarfRegNum<[0]>;
+def R1 : Ri< 1, "r1">, DwarfRegNum<[1]>;
+def R2 : Ri< 2, "r2">, DwarfRegNum<[2]>;
+def R3 : Ri< 3, "r3">, DwarfRegNum<[3]>;
+def R4 : Ri< 4, "r4">, DwarfRegNum<[4]>;
+def R5 : Ri< 5, "r5">, DwarfRegNum<[5]>;
+def R6 : Ri< 6, "r6">, DwarfRegNum<[6]>;
+def R7 : Ri< 7, "r7">, DwarfRegNum<[7]>;
+def R8 : Ri< 8, "r8">, DwarfRegNum<[8]>;
+def R9 : Ri< 9, "r9">, DwarfRegNum<[9]>;
+def R10 : Ri<10, "r10">, DwarfRegNum<[10]>;
+def R11 : Ri<11, "r11">, DwarfRegNum<[11]>;
+
+// Register classes.
+def GPR : RegisterClass<"BPF", [i64], 64, (add R1, R2, R3, R4, R5,
+                                           R6, R7, R8, R9, // callee saved
+                                           R0, // return value
+                                           R11,  // stack ptr
+                                           R10  // frame ptr
+                                          )>;
+
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp
new file mode 100644
index 0000000..6e98f6d
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp
@@ -0,0 +1,23 @@
+//===- BPFSubtarget.cpp - BPF Subtarget Information -----------*- C++ -*-=//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#include "BPF.h"
+#include "BPFSubtarget.h"
+#define GET_SUBTARGETINFO_TARGET_DESC
+#define GET_SUBTARGETINFO_CTOR
+#include "BPFGenSubtargetInfo.inc"
+using namespace llvm;
+
+void BPFSubtarget::anchor() { }
+
+BPFSubtarget::BPFSubtarget(const std::string &TT,
+                           const std::string &CPU, const std::string &FS)
+  : BPFGenSubtargetInfo(TT, CPU, FS)
+{
+  std::string CPUName = CPU;
+  if (CPUName.empty())
+    CPUName = "generic";
+
+  ParseSubtargetFeatures(CPUName, FS);
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h b/tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h
new file mode 100644
index 0000000..cd5d875
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h
@@ -0,0 +1,33 @@
+//=====-- BPFSubtarget.h - Define Subtarget for the BPF -----*- C++ -*--==//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#ifndef BPFSUBTARGET_H
+#define BPFSUBTARGET_H
+
+#include "llvm/Target/TargetSubtargetInfo.h"
+#include "llvm/Target/TargetMachine.h"
+
+#include <string>
+
+#define GET_SUBTARGETINFO_HEADER
+#include "BPFGenSubtargetInfo.inc"
+
+namespace llvm {
+
+class BPFSubtarget : public BPFGenSubtargetInfo {
+  virtual void anchor();
+public:
+  /// This constructor initializes the data members to match those
+  /// of the specified triple.
+  ///
+  BPFSubtarget(const std::string &TT, const std::string &CPU,
+                 const std::string &FS);
+  
+  /// ParseSubtargetFeatures - Parses the feature string, setting the
+  /// specified subtarget options. The definition of this function is
+  /// auto-generated by tblgen.
+  void ParseSubtargetFeatures(StringRef CPU, StringRef FS);
+};
+}
+
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp b/tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp
new file mode 100644
index 0000000..bd811fd
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp
@@ -0,0 +1,72 @@
+//===-- BPFTargetMachine.cpp - Define TargetMachine for BPF ---------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// Implements the info about BPF target spec.
+
+#include "BPF.h"
+#include "BPFTargetMachine.h"
+#include "llvm/PassManager.h"
+#include "llvm/CodeGen/Passes.h"
+#include "llvm/Support/FormattedStream.h"
+#include "llvm/Support/TargetRegistry.h"
+#include "llvm/Target/TargetOptions.h"
+using namespace llvm;
+
+extern "C" void LLVMInitializeBPFTarget() {
+  // Register the target.
+  RegisterTargetMachine<BPFTargetMachine> X(TheBPFTarget);
+}
+
+// DataLayout --> Little-endian, 64-bit pointer/ABI/alignment
+// The stack is always 8 byte aligned
+// On function prologue, the stack is created by decrementing
+// its pointer. Once decremented, all references are done with positive
+// offset from the stack/frame pointer.
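+// In the layout string below: "e" = little-endian, "p:64:64" = 64-bit
+// pointers with 64-bit alignment, "i64:64:64"/"f64:64:64" = 64-bit
+// integers/doubles aligned to 64 bits, "n8:16:32:64" = native integer
+// widths, "S128" = 128-bit stack alignment.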
+BPFTargetMachine::
+BPFTargetMachine(const Target &T, StringRef TT,
+                    StringRef CPU, StringRef FS, const TargetOptions &Options,
+                    Reloc::Model RM, CodeModel::Model CM,
+                    CodeGenOpt::Level OL)
+  : LLVMTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL),
+  Subtarget(TT, CPU, FS),
+  // x86-64 like
+  DL("e-p:64:64-s:64-f64:64:64-i64:64:64-n8:16:32:64-S128"),
+  InstrInfo(), TLInfo(*this), TSInfo(*this),
+  FrameLowering(Subtarget) {
+#if LLVM_VERSION_MINOR==4
+  initAsmInfo();
+#endif
+}
+namespace {
+/// BPF Code Generator Pass Configuration Options.
+class BPFPassConfig : public TargetPassConfig {
+public:
+  BPFPassConfig(BPFTargetMachine *TM, PassManagerBase &PM)
+    : TargetPassConfig(TM, PM) {}
+
+  BPFTargetMachine &getBPFTargetMachine() const {
+    return getTM<BPFTargetMachine>();
+  }
+
+  virtual bool addInstSelector();
+  virtual bool addPreEmitPass();
+};
+}
+
+TargetPassConfig *BPFTargetMachine::createPassConfig(PassManagerBase &PM) {
+  return new BPFPassConfig(this, PM);
+}
+
+// Install an instruction selector pass that uses the SelectionDAG
+// to generate BPF code.
+bool BPFPassConfig::addInstSelector() {
+  addPass(createBPFISelDag(getBPFTargetMachine()));
+
+  return false;
+}
+
+bool BPFPassConfig::addPreEmitPass() {
+  addPass(createBPFCFGFixup(getBPFTargetMachine()));
+  return true;
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h b/tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h
new file mode 100644
index 0000000..1d6b070
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h
@@ -0,0 +1,69 @@
+//===-- BPFTargetMachine.h - Define TargetMachine for BPF --- C++ ---===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file declares the BPF specific subclass of TargetMachine.
+
+#ifndef BPF_TARGETMACHINE_H
+#define BPF_TARGETMACHINE_H
+
+#include "BPFSubtarget.h"
+#include "BPFInstrInfo.h"
+#include "BPFISelLowering.h"
+#include "llvm/Target/TargetSelectionDAGInfo.h"
+#include "BPFFrameLowering.h"
+#include "llvm/Target/TargetMachine.h"
+#if !defined(LLVM_VERSION_MINOR)
+#error "Unknown LLVM version"
+#endif
+#if LLVM_VERSION_MINOR==3 || LLVM_VERSION_MINOR==4
+#include "llvm/IR/DataLayout.h"
+#else
+#include "llvm/DataLayout.h"
+#endif
+#include "llvm/Target/TargetFrameLowering.h"
+
+namespace llvm {
+  class formatted_raw_ostream;
+
+  class BPFTargetMachine : public LLVMTargetMachine {
+    BPFSubtarget       Subtarget;
+    const DataLayout   DL; // Calculates type size & alignment
+    BPFInstrInfo       InstrInfo;
+    BPFTargetLowering  TLInfo;
+    TargetSelectionDAGInfo TSInfo;
+    BPFFrameLowering   FrameLowering;
+  public:
+    BPFTargetMachine(const Target &T, StringRef TT,
+                        StringRef CPU, StringRef FS,
+                        const TargetOptions &Options,
+                        Reloc::Model RM, CodeModel::Model CM,
+                        CodeGenOpt::Level OL);
+
+    virtual const BPFInstrInfo *getInstrInfo() const
+    { return &InstrInfo; }
+
+    virtual const TargetFrameLowering *getFrameLowering() const
+    { return &FrameLowering; }
+
+    virtual const BPFSubtarget *getSubtargetImpl() const
+    { return &Subtarget; }
+
+    virtual const DataLayout *getDataLayout() const
+    { return &DL;}
+
+    virtual const BPFRegisterInfo *getRegisterInfo() const
+    { return &InstrInfo.getRegisterInfo(); }
+
+    virtual const BPFTargetLowering *getTargetLowering() const
+    { return &TLInfo; }
+
+    virtual const TargetSelectionDAGInfo* getSelectionDAGInfo() const
+    { return &TSInfo; }
+
+    // Pass Pipeline Configuration
+    virtual TargetPassConfig *createPassConfig(PassManagerBase &PM);
+  };
+}
+
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp b/tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp
new file mode 100644
index 0000000..89d5cdb
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp
@@ -0,0 +1,79 @@
+//===-- BPFInstPrinter.cpp - Convert BPF MCInst to asm syntax -----------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This class prints a BPF MCInst to a .s file.
+
+#define DEBUG_TYPE "asm-printer"
+#include "BPF.h"
+#include "BPFInstPrinter.h"
+#include "llvm/MC/MCAsmInfo.h"
+#include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCSymbol.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/FormattedStream.h"
+using namespace llvm;
+
+
+// Include the auto-generated portion of the assembly writer.
+#include "BPFGenAsmWriter.inc"
+
+void BPFInstPrinter::printInst(const MCInst *MI, raw_ostream &O,
+                                StringRef Annot) {
+  printInstruction(MI, O);
+  printAnnotation(O, Annot);
+}
+
+static void printExpr(const MCExpr *Expr, raw_ostream &O) {
+  const MCSymbolRefExpr *SRE;
+
+  if (const MCBinaryExpr *BE = dyn_cast<MCBinaryExpr>(Expr))
+    SRE = dyn_cast<MCSymbolRefExpr>(BE->getLHS());
+  else
+    SRE = dyn_cast<MCSymbolRefExpr>(Expr);
+  assert(SRE && "Unexpected MCExpr type.");
+
+  MCSymbolRefExpr::VariantKind Kind = SRE->getKind();
+
+  assert(Kind == MCSymbolRefExpr::VK_None);
+  O << *Expr;
+}
+
+void BPFInstPrinter::printOperand(const MCInst *MI, unsigned OpNo,
+                                   raw_ostream &O, const char *Modifier) {
+  assert((Modifier == 0 || Modifier[0] == 0) && "No modifiers supported");
+  const MCOperand &Op = MI->getOperand(OpNo);
+  if (Op.isReg()) {
+    O << getRegisterName(Op.getReg());
+  } else if (Op.isImm()) {
+    O << (int32_t)Op.getImm();
+  } else {
+    assert(Op.isExpr() && "Expected an expression");
+    printExpr(Op.getExpr(), O);
+  }
+}
+
+void BPFInstPrinter::printMemOperand(const MCInst *MI, int OpNo,
+                                      raw_ostream &O, const char *Modifier) {
+  const MCOperand &RegOp = MI->getOperand(OpNo);
+  const MCOperand &OffsetOp = MI->getOperand(OpNo+1);
+  // offset
+  if (OffsetOp.isImm()) {
+    O << OffsetOp.getImm();
+  } else {
+    assert(0 && "Expected an immediate");
+//    assert(OffsetOp.isExpr() && "Expected an expression");
+//    printExpr(OffsetOp.getExpr(), O);
+  }
+  // register
+  assert(RegOp.isReg() && "Register operand not a register");
+  O  << "(" << getRegisterName(RegOp.getReg()) << ")";
+}
+
+void BPFInstPrinter::printS32ImmOperand(const MCInst *MI, unsigned OpNo,
+                                         raw_ostream &O) {
+  const MCOperand &Op = MI->getOperand(OpNo);
+  assert(Op.isImm() && "Immediate operand not an immediate");
+  O << (int32_t)Op.getImm();
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.h b/tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.h
new file mode 100644
index 0000000..4f0cba5
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.h
@@ -0,0 +1,34 @@
+//= BPFInstPrinter.h - Convert BPF MCInst to asm syntax ---------*- C++ -*--//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This class prints a BPF MCInst to a .s file.
+
+#ifndef BPFINSTPRINTER_H
+#define BPFINSTPRINTER_H
+
+#include "llvm/MC/MCInstPrinter.h"
+
+namespace llvm {
+  class MCOperand;
+
+  class BPFInstPrinter : public MCInstPrinter {
+  public:
+    BPFInstPrinter(const MCAsmInfo &MAI, const MCInstrInfo &MII,
+                    const MCRegisterInfo &MRI)
+      : MCInstPrinter(MAI, MII, MRI) {}
+
+    void printInst(const MCInst *MI, raw_ostream &O, StringRef Annot);
+    void printOperand(const MCInst *MI, unsigned OpNo,
+                      raw_ostream &O, const char *Modifier = 0);
+    void printMemOperand(const MCInst *MI, int OpNo,raw_ostream &O,
+                         const char *Modifier = 0);
+    void printS32ImmOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O);
+
+    // Autogenerated by tblgen.
+    void printInstruction(const MCInst *MI, raw_ostream &O);
+    static const char *getRegisterName(unsigned RegNo);
+  };
+}
+
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
new file mode 100644
index 0000000..8d5b5c9
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
@@ -0,0 +1,85 @@
+//===-- BPFAsmBackend.cpp - BPF Assembler Backend -----------------------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "llvm/MC/MCAsmBackend.h"
+#include "llvm/MC/MCAssembler.h"
+#include "llvm/MC/MCDirectives.h"
+#include "llvm/MC/MCELFObjectWriter.h"
+#include "llvm/MC/MCFixupKindInfo.h"
+#include "llvm/MC/MCObjectWriter.h"
+#include "llvm/MC/MCSubtargetInfo.h"
+#include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCSymbol.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace llvm;
+
+namespace {
+class BPFAsmBackend : public MCAsmBackend {
+public:
+  BPFAsmBackend(): MCAsmBackend() {}
+  virtual ~BPFAsmBackend() {}
+
+  void applyFixup(const MCFixup &Fixup, char *Data, unsigned DataSize,
+                  uint64_t Value) const;
+
+  MCObjectWriter *createObjectWriter(raw_ostream &OS) const;
+
+  // No instruction requires relaxation
+#if LLVM_VERSION_MINOR==3 || LLVM_VERSION_MINOR==4
+  bool fixupNeedsRelaxation(const MCFixup &Fixup, uint64_t Value,
+                            const MCRelaxableFragment *DF,
+                            const MCAsmLayout &Layout) const { return false; }
+#else
+  bool fixupNeedsRelaxation(const MCFixup &Fixup, uint64_t Value, 
+                            const MCInstFragment *DF,
+                            const MCAsmLayout &Layout) const { return false; }
+#endif
+  
+  unsigned getNumFixupKinds() const { return 1; }
+
+  bool mayNeedRelaxation(const MCInst &Inst) const { return false; }
+
+  void relaxInstruction(const MCInst &Inst, MCInst &Res) const {}
+
+  bool writeNopData(uint64_t Count, MCObjectWriter *OW) const;
+};
+
+bool BPFAsmBackend::writeNopData(uint64_t Count, MCObjectWriter *OW) const {
+  if ((Count % 8) != 0)
+    return false;
+
+  for (uint64_t i = 0; i < Count; i += 8)
+    OW->Write64(0x15000000);
+
+  return true;
+}
+
+void BPFAsmBackend::applyFixup(const MCFixup &Fixup, char *Data,
+                               unsigned DataSize, uint64_t Value) const {
+
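+  // The only fixup this backend produces is FK_PCRel_2: a branch target
+  // stored in the 16-bit offset field (bytes 2-3 of the insn), counted
+  // in 8-byte instructions relative to the next instruction, hence the
+  // "- 8" and "/ 8" below.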
+  assert (Fixup.getKind() == FK_PCRel_2);
+  *(uint16_t*)&Data[Fixup.getOffset() + 2] = (uint16_t) ((Value - 8) / 8);
+
+  if (0)
+   errs() << "<MCFixup" << " Offset:" << Fixup.getOffset() << " Value:" <<
+     *(Fixup.getValue()) << " Kind:" << Fixup.getKind() <<
+     " val " << Value << ">\n";
+}
+
+MCObjectWriter *BPFAsmBackend::createObjectWriter(raw_ostream &OS) const {
+  return createBPFELFObjectWriter(OS, 0);
+}
+
+}
+
+MCAsmBackend *llvm::createBPFAsmBackend(const Target &T,
+#if LLVM_VERSION_MINOR==4
+                                        const MCRegisterInfo &MRI,
+#endif
+                                        StringRef TT, StringRef CPU) {
+  return new BPFAsmBackend();
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h
new file mode 100644
index 0000000..9d03073
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h
@@ -0,0 +1,33 @@
+//===-- BPFBaseInfo.h - Top level definitions for BPF MC ------*- C++ -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+#ifndef BPFBASEINFO_H
+#define BPFBASEINFO_H
+
+#include "BPFMCTargetDesc.h"
+#include "llvm/MC/MCExpr.h"
+#include "llvm/Support/DataTypes.h"
+#include "llvm/Support/ErrorHandling.h"
+
+namespace llvm {
+
+static inline unsigned getBPFRegisterNumbering(unsigned Reg) {
+  switch(Reg) {
+    case BPF::R0  : return 0;
+    case BPF::R1  : return 1;
+    case BPF::R2  : return 2;
+    case BPF::R3  : return 3;
+    case BPF::R4  : return 4;
+    case BPF::R5  : return 5;
+    case BPF::R6  : return 6;
+    case BPF::R7  : return 7;
+    case BPF::R8  : return 8;
+    case BPF::R9  : return 9;
+    case BPF::R10 : return 10;
+    case BPF::R11 : return 11;
+    default: llvm_unreachable("Unknown register number!");
+  }
+}
+
+}
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp
new file mode 100644
index 0000000..22cf0d6
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp
@@ -0,0 +1,119 @@
+//===-- BPFELFObjectWriter.cpp - BPF Writer -------------------------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#include "MCTargetDesc/BPFBaseInfo.h"
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "MCTargetDesc/BPFMCCodeEmitter.h"
+#include "llvm/MC/MCObjectWriter.h"
+#include "llvm/MC/MCValue.h"
+#include "llvm/MC/MCAssembler.h"
+#include "llvm/MC/MCSectionELF.h"
+#include "llvm/MC/MCAsmLayout.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace llvm;
+
+namespace {
+class BPFObjectWriter : public MCObjectWriter {
+  public:
+    BPFObjectWriter(raw_ostream &_OS):
+      MCObjectWriter(_OS, true/*isLittleEndian*/) {}
+    virtual ~BPFObjectWriter() {}
+    virtual void WriteObject(MCAssembler &Asm, const MCAsmLayout &Layout);
+    virtual void RecordRelocation(const MCAssembler &Asm,
+                                  const MCAsmLayout &Layout,
+                                  const MCFragment *Fragment,
+                                  const MCFixup &Fixup,
+                                  MCValue Target, uint64_t &FixedValue) {}
+    virtual void ExecutePostLayoutBinding(MCAssembler &Asm,
+                                          const MCAsmLayout &Layout) {}
+
+};
+}
+
+static void WriteSectionData(MCAssembler &Asm, const MCSectionData &SD) {
+  MCObjectWriter *OW = &Asm.getWriter();
+  for (MCSectionData::const_iterator it = SD.begin(),
+         ie = SD.end(); it != ie; ++it) {
+    const MCFragment &F = *it;
+    switch (F.getKind()) {
+    case MCFragment::FT_Align:
+      continue;
+    case MCFragment::FT_Data: {
+      const MCDataFragment &DF = cast<MCDataFragment>(F);
+      OW->WriteBytes(DF.getContents());
+      break;
+    }
+    case MCFragment::FT_Fill: {
+      const MCFillFragment &FF = cast<MCFillFragment>(F);
+
+      assert(FF.getValueSize() && "Invalid virtual align in concrete fragment!");
+
+      for (uint64_t i = 0, e = FF.getSize() / FF.getValueSize(); i != e; ++i) {
+        switch (FF.getValueSize()) {
+        default: llvm_unreachable("Invalid size!");
+        case 1: OW->Write8 (uint8_t (FF.getValue())); break;
+        case 2: OW->Write16(uint16_t(FF.getValue())); break;
+        case 4: OW->Write32(uint32_t(FF.getValue())); break;
+        case 8: OW->Write64(uint64_t(FF.getValue())); break;
+        }
+      }
+      break;
+    }
+    default:
+      errs() << "MCFrag " << F.getKind() << "\n";
+    }
+  }
+}
+
+void BPFObjectWriter::WriteObject(MCAssembler &Asm,
+                                  const MCAsmLayout &Layout) {
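+  // Output is not ELF but a simple custom image: a 4-byte magic
+  // "bpf\0", a 4-byte string-table length followed by the string table
+  // itself, then for each non-empty section a 4-byte size, a 4-byte
+  // name offset into the string table, and the raw section data.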
+  bool LicenseSeen = false;
+  MCObjectWriter *OW = &Asm.getWriter();
+  OW->WriteBytes(StringRef("bpf"), 4);
+
+  BPFMCCodeEmitter *CE = (BPFMCCodeEmitter*)(&Asm.getEmitter());
+//  Asm.dump();
+  for (MCAssembler::const_iterator i = Asm.begin(), e = Asm.end(); i != e;
+       ++i) {
+    const MCSectionELF &Section =
+      static_cast<const MCSectionELF&>(i->getSection());
+    const StringRef SectionName = Section.getSectionName();
+    const MCSectionData &SD = Asm.getSectionData(Section);
+    int SectionSize = Layout.getSectionAddressSize(&SD);
+    if (SectionSize > 0) {
+      CE->getStrtabIndex(SectionName);
+      if (SectionName == "license")
+        LicenseSeen = true;
+    }
+  }
+
+  if (!LicenseSeen)
+      report_fatal_error("BPF source is missing license");
+
+  OW->Write32(CE->Strtab->length());
+  OW->WriteBytes(StringRef(*CE->Strtab));
+
+  for (MCAssembler::const_iterator i = Asm.begin(), e = Asm.end(); i != e;
+       ++i) {
+    const MCSectionELF &Section =
+      static_cast<const MCSectionELF&>(i->getSection());
+    const StringRef SectionName = Section.getSectionName();
+    const MCSectionData &SD = Asm.getSectionData(Section);
+    int SectionSize = Layout.getSectionAddressSize(&SD);
+    if (SectionSize > 0 &&
+        /* ignore .rodata.* for now */
+        !(Section.getFlags() & ELF::SHF_STRINGS)) {
+      OW->Write32(SectionSize);
+      OW->Write32(CE->getStrtabIndex(SectionName));
+      WriteSectionData(Asm, SD);
+    }
+  }
+}
+
+MCObjectWriter *llvm::createBPFELFObjectWriter(raw_ostream &OS,
+                                               uint8_t OSABI) {
+  return new BPFObjectWriter(OS);
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h
new file mode 100644
index 0000000..99132ee
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h
@@ -0,0 +1,34 @@
+//=====-- BPFMCAsmInfo.h - BPF asm properties -----------*- C++ -*--====//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#ifndef BPF_MCASM_INFO_H
+#define BPF_MCASM_INFO_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/MC/MCAsmInfo.h"
+
+namespace llvm {
+  class Target;
+  
+  class BPFMCAsmInfo : public MCAsmInfo {
+  public:
+#if LLVM_VERSION_MINOR==4
+    explicit BPFMCAsmInfo(StringRef TT) {
+#else
+    explicit BPFMCAsmInfo(const Target &T, StringRef TT) {
+#endif
+      PrivateGlobalPrefix         = ".L";
+      WeakRefDirective            = "\t.weak\t";
+
+      // BPF assembly requires ".section" before ".bss"
+      UsesELFSectionDirectiveForBSS = true;
+
+      HasSingleParameterDotFile = false;
+      HasDotTypeDotSizeDirective = false;
+    }
+  };
+
+}
+
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
new file mode 100644
index 0000000..9e3f52c
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
@@ -0,0 +1,120 @@
+//===-- BPFMCCodeEmitter.cpp - Convert BPF code to machine code ---------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#define DEBUG_TYPE "mccodeemitter"
+#include "MCTargetDesc/BPFBaseInfo.h"
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "MCTargetDesc/BPFMCCodeEmitter.h"
+#include "llvm/MC/MCCodeEmitter.h"
+#include "llvm/MC/MCFixup.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCInstrInfo.h"
+#include "llvm/MC/MCRegisterInfo.h"
+#include "llvm/MC/MCSubtargetInfo.h"
+#include "llvm/MC/MCSymbol.h"
+#include "llvm/ADT/Statistic.h"
+#include "llvm/Support/raw_ostream.h"
+using namespace llvm;
+
+STATISTIC(MCNumEmitted, "Number of MC instructions emitted");
+
+MCCodeEmitter *llvm::createBPFMCCodeEmitter(const MCInstrInfo &MCII,
+                                             const MCRegisterInfo &MRI,
+                                             const MCSubtargetInfo &STI,
+                                             MCContext &Ctx) {
+  return new BPFMCCodeEmitter(MCII, STI, Ctx);
+}
+
+/// getMachineOpValue - Return binary encoding of operand. If the machine
+/// operand requires relocation, record the relocation and return zero.
+unsigned BPFMCCodeEmitter::
+getMachineOpValue(const MCInst &MI, const MCOperand &MO,
+                  SmallVectorImpl<MCFixup> &Fixups) const {
+  if (MO.isReg())
+    return getBPFRegisterNumbering(MO.getReg());
+  if (MO.isImm())
+    return static_cast<unsigned>(MO.getImm());
+  
+  assert(MO.isExpr());
+
+  const MCExpr *Expr = MO.getExpr();
+  MCExpr::ExprKind Kind = Expr->getKind();
+
+  assert(Kind == MCExpr::SymbolRef);
+
+  if (MI.getOpcode() == BPF::JAL) {
+    /* func call name */
+    const MCSymbolRefExpr *SRE = dyn_cast<MCSymbolRefExpr>(Expr);
+    return getStrtabIndex(SRE->getSymbol().getName());
+
+  } else {
+    /* bb label */
+    Fixups.push_back(MCFixup::Create(0, MO.getExpr(), FK_PCRel_2));
+    return 0;
+  }
+}
+
+// Emit one byte through output stream
+static void EmitByte(unsigned char C, unsigned &CurByte, raw_ostream &OS) {
+  OS << (char)C;
+  ++CurByte;
+}
+
+// Emit a series of bytes (little endian)
+static void EmitLEConstant(uint64_t Val, unsigned Size, unsigned &CurByte,
+                           raw_ostream &OS) {
+  assert(Size <= 8 && "size too big in emit constant");
+
+  for (unsigned i = 0; i != Size; ++i) {
+    EmitByte(Val & 255, CurByte, OS);
+    Val >>= 8;
+  }
+}
+
+// Emit a series of bytes (big endian)
+static void EmitBEConstant(uint64_t Val, unsigned Size, unsigned &CurByte,
+                           raw_ostream &OS) {
+  assert(Size <= 8 && "size too big in emit constant");
+
+  for (int i = (Size-1)*8; i >= 0; i-=8)
+    EmitByte((Val >> i) & 255, CurByte, OS);
+}
+
+void BPFMCCodeEmitter::EncodeInstruction(const MCInst &MI, raw_ostream &OS,
+                                         SmallVectorImpl<MCFixup> &Fixups) const {
+  // Keep track of the current byte being emitted
+  unsigned CurByte = 0;
+
+  // Get instruction encoding and emit it
+  ++MCNumEmitted;       // Keep track of the number of emitted insns.
+  uint64_t Value = getBinaryCodeForInstr(MI, Fixups);
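+  // Emit the encoding as two raw high bytes (the opcode and register
+  // fields) followed by a little-endian 16-bit offset and a
+  // little-endian 32-bit immediate.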
+  EmitByte(Value >> 56, CurByte, OS);
+  EmitByte((Value >> 48) & 0xff, CurByte, OS);
+  EmitLEConstant((Value >> 32) & 0xffff, 2, CurByte, OS);
+  EmitLEConstant(Value & 0xffffFFFF, 4, CurByte, OS);
+}
+
+// Encode BPF Memory Operand
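+// (base register number in bits [31:16], 16-bit offset in the low bits)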
+uint64_t BPFMCCodeEmitter::getMemoryOpValue(const MCInst &MI, unsigned Op,
+                                            SmallVectorImpl<MCFixup> &Fixups) const {
+  uint64_t encoding;
+  const MCOperand op1 = MI.getOperand(1);
+  assert(op1.isReg() && "First operand is not register.");
+  encoding = getBPFRegisterNumbering(op1.getReg());
+  encoding <<= 16;
+  MCOperand op2 = MI.getOperand(2);
+  assert(op2.isImm() && "Second operand is not immediate.");
+  encoding |= op2.getImm() & 0xffff;
+  return encoding;
+}
+
+#include "BPFGenMCCodeEmitter.inc"
diff --git a/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h
new file mode 100644
index 0000000..84d86c0
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h
@@ -0,0 +1,67 @@
+//===-- BPFMCCodeEmitter.h - Convert BPF code to machine code ---------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#include "MCTargetDesc/BPFBaseInfo.h"
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "llvm/MC/MCCodeEmitter.h"
+#include "llvm/MC/MCFixup.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCInstrInfo.h"
+#include "llvm/MC/MCRegisterInfo.h"
+#include "llvm/MC/MCSubtargetInfo.h"
+#include "llvm/MC/MCSymbol.h"
+#include "llvm/ADT/Statistic.h"
+#include "llvm/Support/raw_ostream.h"
+using namespace llvm;
+
+namespace {
+class BPFMCCodeEmitter : public MCCodeEmitter {
+  BPFMCCodeEmitter(const BPFMCCodeEmitter &);
+  void operator=(const BPFMCCodeEmitter &);
+  const MCInstrInfo &MCII;
+  const MCSubtargetInfo &STI;
+  MCContext &Ctx;
+
+public:
+  BPFMCCodeEmitter(const MCInstrInfo &mcii, const MCSubtargetInfo &sti,
+                    MCContext &ctx)
+    : MCII(mcii), STI(sti), Ctx(ctx) {
+      Strtab = new std::string;
+      Strtab->push_back('\0');
+    }
+
+  ~BPFMCCodeEmitter() {delete Strtab;}
+
+  std::string *Strtab;
+
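+  // Returns the byte offset of Name within the private string table,
+  // appending it (NUL-terminated) on first use.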
+  int getStrtabIndex(const StringRef Name) const {
+    std::string Sym = Name.str();
+    Sym.push_back('\0');
+
+    std::string::size_type pos = Strtab->find(Sym);
+    if (pos == std::string::npos) {
+      Strtab->append(Sym);
+      pos = Strtab->find(Sym);
+      assert (pos != std::string::npos);
+    }
+    return pos;
+  }
+
+  // getBinaryCodeForInstr - TableGen'erated function for getting the
+  // binary encoding for an instruction.
+  uint64_t getBinaryCodeForInstr(const MCInst &MI,
+                                 SmallVectorImpl<MCFixup> &Fixups) const;
+
+  // getMachineOpValue - Return binary encoding of operand. If the machine
+  // operand requires relocation, record the relocation and return zero.
+  unsigned getMachineOpValue(const MCInst &MI,const MCOperand &MO,
+                             SmallVectorImpl<MCFixup> &Fixups) const;
+
+  uint64_t getMemoryOpValue(const MCInst &MI, unsigned Op,
+                            SmallVectorImpl<MCFixup> &Fixups) const;
+
+  void EncodeInstruction(const MCInst &MI, raw_ostream &OS,
+                         SmallVectorImpl<MCFixup> &Fixups) const;
+};
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp
new file mode 100644
index 0000000..db043d7
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp
@@ -0,0 +1,115 @@
+//===-- BPFMCTargetDesc.cpp - BPF Target Descriptions -----------------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file provides BPF specific target descriptions.
+
+#include "BPF.h"
+#include "BPFMCTargetDesc.h"
+#include "BPFMCAsmInfo.h"
+#include "InstPrinter/BPFInstPrinter.h"
+#include "llvm/MC/MCCodeGenInfo.h"
+#include "llvm/MC/MCInstrInfo.h"
+#include "llvm/MC/MCRegisterInfo.h"
+#include "llvm/MC/MCStreamer.h"
+#include "llvm/MC/MCSubtargetInfo.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/TargetRegistry.h"
+
+#define GET_INSTRINFO_MC_DESC
+#include "BPFGenInstrInfo.inc"
+
+#define GET_SUBTARGETINFO_MC_DESC
+#include "BPFGenSubtargetInfo.inc"
+
+#define GET_REGINFO_MC_DESC
+#include "BPFGenRegisterInfo.inc"
+
+using namespace llvm;
+
+static MCInstrInfo *createBPFMCInstrInfo() {
+  MCInstrInfo *X = new MCInstrInfo();
+  InitBPFMCInstrInfo(X);
+  return X;
+}
+
+static MCRegisterInfo *createBPFMCRegisterInfo(StringRef TT) {
+  MCRegisterInfo *X = new MCRegisterInfo();
+  InitBPFMCRegisterInfo(X, BPF::R9);
+  return X;
+}
+
+static MCSubtargetInfo *createBPFMCSubtargetInfo(StringRef TT, StringRef CPU,
+                                                   StringRef FS) {
+  MCSubtargetInfo *X = new MCSubtargetInfo();
+  InitBPFMCSubtargetInfo(X, TT, CPU, FS);
+  return X;
+}
+
+static MCCodeGenInfo *createBPFMCCodeGenInfo(StringRef TT, Reloc::Model RM,
+                                               CodeModel::Model CM,
+                                               CodeGenOpt::Level OL) {
+  MCCodeGenInfo *X = new MCCodeGenInfo();
+  X->InitMCCodeGenInfo(RM, CM, OL);
+  return X;
+}
+
+static MCStreamer *createBPFMCStreamer(const Target &T, StringRef TT,
+                                    MCContext &Ctx, MCAsmBackend &MAB,
+                                    raw_ostream &_OS,
+                                    MCCodeEmitter *_Emitter,
+                                    bool RelaxAll,
+                                    bool NoExecStack) {
+#if LLVM_VERSION_MINOR==4
+  return createELFStreamer(Ctx, 0, MAB, _OS, _Emitter, RelaxAll, NoExecStack);
+#else
+  return createELFStreamer(Ctx, MAB, _OS, _Emitter, RelaxAll, NoExecStack);
+#endif
+}
+
+static MCInstPrinter *createBPFMCInstPrinter(const Target &T,
+                                              unsigned SyntaxVariant,
+                                              const MCAsmInfo &MAI,
+                                              const MCInstrInfo &MII,
+                                              const MCRegisterInfo &MRI,
+                                              const MCSubtargetInfo &STI) {
+  if (SyntaxVariant == 0)
+    return new BPFInstPrinter(MAI, MII, MRI);
+  return 0;
+}
+
+extern "C" void LLVMInitializeBPFTargetMC() {
+  // Register the MC asm info.
+  RegisterMCAsmInfo<BPFMCAsmInfo> X(TheBPFTarget);
+
+  // Register the MC codegen info.
+  TargetRegistry::RegisterMCCodeGenInfo(TheBPFTarget,
+                                       createBPFMCCodeGenInfo);
+
+  // Register the MC instruction info.
+  TargetRegistry::RegisterMCInstrInfo(TheBPFTarget, createBPFMCInstrInfo);
+
+  // Register the MC register info.
+  TargetRegistry::RegisterMCRegInfo(TheBPFTarget, createBPFMCRegisterInfo);
+
+  // Register the MC subtarget info.
+  TargetRegistry::RegisterMCSubtargetInfo(TheBPFTarget,
+                                          createBPFMCSubtargetInfo);
+
+  // Register the MC code emitter
+  TargetRegistry::RegisterMCCodeEmitter(TheBPFTarget,
+                                        llvm::createBPFMCCodeEmitter);
+
+  // Register the ASM Backend
+  TargetRegistry::RegisterMCAsmBackend(TheBPFTarget,
+                                       createBPFAsmBackend);
+
+  // Register the object streamer
+  TargetRegistry::RegisterMCObjectStreamer(TheBPFTarget,
+                                           createBPFMCStreamer);
+
+  // Register the MCInstPrinter.
+  TargetRegistry::RegisterMCInstPrinter(TheBPFTarget,
+                                        createBPFMCInstPrinter);
+}
diff --git a/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h
new file mode 100644
index 0000000..b337a00
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h
@@ -0,0 +1,56 @@
+//===-- BPFMCTargetDesc.h - BPF Target Descriptions -----------*- C++ -*-===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This file provides BPF specific target descriptions.
+
+#ifndef BPFMCTARGETDESC_H
+#define BPFMCTARGETDESC_H
+
+#include "llvm/Support/DataTypes.h"
+#include "llvm/Config/config.h"
+
+namespace llvm {
+class MCAsmBackend;
+class MCCodeEmitter;
+class MCContext;
+class MCInstrInfo;
+class MCObjectWriter;
+class MCRegisterInfo;
+class MCSubtargetInfo;
+class Target;
+class StringRef;
+class raw_ostream;
+
+extern Target TheBPFTarget;
+
+MCCodeEmitter *createBPFMCCodeEmitter(const MCInstrInfo &MCII,
+                                       const MCRegisterInfo &MRI,
+                                       const MCSubtargetInfo &STI,
+                                       MCContext &Ctx);
+
+MCAsmBackend *createBPFAsmBackend(const Target &T,
+#if LLVM_VERSION_MINOR==4
+                                  const MCRegisterInfo &MRI,
+#endif
+                                  StringRef TT, StringRef CPU);
+
+
+MCObjectWriter *createBPFELFObjectWriter(raw_ostream &OS, uint8_t OSABI);
+}
+
+// Defines symbolic names for BPF registers.  This defines a mapping from
+// register name to register number.
+//
+#define GET_REGINFO_ENUM
+#include "BPFGenRegisterInfo.inc"
+
+// Defines symbolic names for the BPF instructions.
+//
+#define GET_INSTRINFO_ENUM
+#include "BPFGenInstrInfo.inc"
+
+#define GET_SUBTARGETINFO_ENUM
+#include "BPFGenSubtargetInfo.inc"
+
+#endif
diff --git a/tools/bpf/llvm/lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp b/tools/bpf/llvm/lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp
new file mode 100644
index 0000000..4d16305
--- /dev/null
+++ b/tools/bpf/llvm/lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp
@@ -0,0 +1,13 @@
+//===-- BPFTargetInfo.cpp - BPF Target Implementation -----------------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+
+#include "BPF.h"
+#include "llvm/Support/TargetRegistry.h"
+using namespace llvm;
+
+Target llvm::TheBPFTarget;
+
+extern "C" void LLVMInitializeBPFTargetInfo() {
+  RegisterTarget<Triple::x86_64> X(TheBPFTarget, "bpf", "BPF");
+}
diff --git a/tools/bpf/llvm/tools/llc/llc.cpp b/tools/bpf/llvm/tools/llc/llc.cpp
new file mode 100644
index 0000000..517a7a8
--- /dev/null
+++ b/tools/bpf/llvm/tools/llc/llc.cpp
@@ -0,0 +1,381 @@
+//===-- llc.cpp - Implement the LLVM Native Code Generator ----------------===//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This is the llc code generator driver. It provides a convenient
+// command-line interface for generating native assembly-language code
+// or C code, given LLVM bitcode.
+
+#include "llvm/Config/config.h"
+#undef LLVM_NATIVE_TARGET
+#undef LLVM_NATIVE_ASMPRINTER
+#undef LLVM_NATIVE_ASMPARSER
+#undef LLVM_NATIVE_DISASSEMBLER
+#if LLVM_VERSION_MINOR==3 || LLVM_VERSION_MINOR==4
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/DataLayout.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Support/SourceMgr.h"
+#else
+#include "llvm/LLVMContext.h"
+#include "llvm/Module.h"
+#include "llvm/DataLayout.h"
+#include "llvm/Support/IRReader.h"
+#endif
+#include "llvm/PassManager.h"
+#include "llvm/Pass.h"
+#include "llvm/ADT/Triple.h"
+#include "llvm/Assembly/PrintModulePass.h"
+#include "llvm/CodeGen/LinkAllAsmWriterComponents.h"
+#include "llvm/CodeGen/LinkAllCodegenComponents.h"
+#include "llvm/MC/SubtargetFeature.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/FormattedStream.h"
+#include "llvm/Support/ManagedStatic.h"
+#include "llvm/Support/PluginLoader.h"
+#include "llvm/Support/PrettyStackTrace.h"
+#include "llvm/Support/ToolOutputFile.h"
+#include "llvm/Support/Host.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/TargetRegistry.h"
+#include "llvm/Support/TargetSelect.h"
+#include "llvm/Target/TargetLibraryInfo.h"
+#include "llvm/Target/TargetMachine.h"
+#include <memory>
+using namespace llvm;
+
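+// Stub out sanitizer annotations and LLVM debug hooks that prebuilt
+// LLVM libraries may reference when linking this standalone tool.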
+extern "C" {
+void AnnotateHappensBefore(const char *file, int line,
+                           const volatile void *cv) {}
+void AnnotateHappensAfter(const char *file, int line,
+                          const volatile void *cv) {}
+void AnnotateIgnoreWritesBegin(const char *file, int line) {}
+void AnnotateIgnoreWritesEnd(const char *file, int line) {}
+}
+
+__attribute__((weak)) bool llvm::DebugFlag;
+
+__attribute__((weak)) bool llvm::isCurrentDebugType(const char *Type) {
+ return false;
+}
+
+// General options for llc.  Other pass-specific options are specified
+// within the corresponding llc passes, and target-specific options
+// and back-end code generation options are specified with the target machine.
+//
+static cl::opt<std::string>
+InputFilename(cl::Positional, cl::desc("<input bitcode>"), cl::init("-"));
+
+static cl::opt<std::string>
+OutputFilename("o", cl::desc("Output filename"), cl::value_desc("filename"));
+
+// Determine optimization level.
+static cl::opt<char>
+OptLevel("O",
+         cl::desc("Optimization level. [-O0, -O1, -O2, or -O3] "
+                  "(default = '-O2')"),
+         cl::Prefix,
+         cl::ZeroOrMore,
+         cl::init(' '));
+
+static cl::opt<std::string>
+TargetTriple("mtriple", cl::desc("Override target triple for module"));
+
+static cl::list<std::string>
+MAttrs("mattr",
+  cl::CommaSeparated,
+  cl::desc("Target specific attributes (-mattr=help for details)"),
+  cl::value_desc("a1,+a2,-a3,..."));
+
+cl::opt<TargetMachine::CodeGenFileType>
+FileType("filetype", cl::init(TargetMachine::CGFT_ObjectFile),
+  cl::desc("Choose a file type (not all types are supported by all targets):"),
+  cl::values(
+       clEnumValN(TargetMachine::CGFT_AssemblyFile, "asm",
+                  "Emit an assembly ('.s') file"),
+       clEnumValN(TargetMachine::CGFT_ObjectFile, "obj",
+                  "Emit a native object ('.o') file"),
+       clEnumValN(TargetMachine::CGFT_Null, "null",
+                  "Emit nothing, for performance testing"),
+       clEnumValEnd));
+
+cl::opt<bool> NoVerify("disable-verify", cl::Hidden,
+                       cl::desc("Do not verify input module"));
+
+static cl::opt<bool>
+DontPlaceZerosInBSS("nozero-initialized-in-bss",
+  cl::desc("Don't place zero-initialized symbols into bss section"),
+  cl::init(false));
+
+static cl::opt<bool>
+DisableSimplifyLibCalls("disable-simplify-libcalls",
+  cl::desc("Disable simplify-libcalls"),
+  cl::init(false));
+
+static cl::opt<bool>
+EnableGuaranteedTailCallOpt("tailcallopt",
+  cl::desc("Turn fastcc calls into tail calls by (potentially) changing ABI."),
+  cl::init(false));
+
+static cl::opt<bool>
+DisableTailCalls("disable-tail-calls",
+  cl::desc("Never emit tail calls"),
+  cl::init(false));
+
+static cl::opt<std::string> StopAfter("stop-after",
+  cl::desc("Stop compilation after a specific pass"),
+  cl::value_desc("pass-name"),
+  cl::init(""));
+static cl::opt<std::string> StartAfter("start-after",
+  cl::desc("Resume compilation after a specific pass"),
+  cl::value_desc("pass-name"),
+  cl::init(""));
+
+// GetFileNameRoot - Helper function to get the basename of a filename.
+static inline std::string
+GetFileNameRoot(const std::string &InputFilename) {
+  std::string IFN = InputFilename;
+  std::string outputFilename;
+  int Len = IFN.length();
+  if ((Len > 2) &&
+      IFN[Len-3] == '.' &&
+      ((IFN[Len-2] == 'b' && IFN[Len-1] == 'c') ||
+       (IFN[Len-2] == 'l' && IFN[Len-1] == 'l'))) {
+    outputFilename = std::string(IFN.begin(), IFN.end()-3); // s/.bc/.s/
+  } else {
+    outputFilename = IFN;
+  }
+  return outputFilename;
+}
+
+static tool_output_file *GetOutputStream(const char *TargetName,
+                                         Triple::OSType OS,
+                                         const char *ProgName) {
+  // If we don't yet have an output filename, make one.
+  if (OutputFilename.empty()) {
+    if (InputFilename == "-")
+      OutputFilename = "-";
+    else {
+      OutputFilename = GetFileNameRoot(InputFilename);
+
+      switch (FileType) {
+      case TargetMachine::CGFT_AssemblyFile:
+        if (TargetName[0] == 'c') {
+          if (TargetName[1] == 0)
+            OutputFilename += ".cbe.c";
+          else if (TargetName[1] == 'p' && TargetName[2] == 'p')
+            OutputFilename += ".cpp";
+          else
+            OutputFilename += ".s";
+        } else
+          OutputFilename += ".s";
+        break;
+      case TargetMachine::CGFT_ObjectFile:
+        OutputFilename += ".o";
+        break;
+      case TargetMachine::CGFT_Null:
+        OutputFilename += ".null";
+        break;
+      }
+    }
+  }
+
+  // Decide if we need "binary" output.
+  bool Binary = false;
+  switch (FileType) {
+  case TargetMachine::CGFT_AssemblyFile:
+    break;
+  case TargetMachine::CGFT_ObjectFile:
+  case TargetMachine::CGFT_Null:
+    Binary = true;
+    break;
+  }
+
+  // Open the file.
+  std::string error;
+#if LLVM_VERSION_MINOR==4
+  sys::fs::OpenFlags OpenFlags = sys::fs::F_None;
+  if (Binary)
+    OpenFlags |= sys::fs::F_Binary;
+#else
+  unsigned OpenFlags = 0;
+  if (Binary) OpenFlags |= raw_fd_ostream::F_Binary;
+#endif
+  tool_output_file *FDOut = new tool_output_file(OutputFilename.c_str(), error,
+                                                 OpenFlags);
+  if (!error.empty()) {
+    errs() << error << '\n';
+    delete FDOut;
+    return 0;
+  }
+
+  return FDOut;
+}
+
+// main - Entry point for the llc compiler.
+//
+int main(int argc, char **argv) {
+  sys::PrintStackTraceOnErrorSignal();
+  PrettyStackTraceProgram X(argc, argv);
+
+  // Enable debug stream buffering.
+  EnableDebugBuffering = true;
+
+  LLVMContext &Context = getGlobalContext();
+  llvm_shutdown_obj Y;  // Call llvm_shutdown() on exit.
+
+  // Initialize targets first, so that --version shows registered targets.
+  InitializeAllTargets();
+  InitializeAllTargetMCs();
+  InitializeAllAsmPrinters();
+  InitializeAllAsmParsers();
+
+  // Initialize codegen and IR passes used by llc so that the -print-after,
+  // -print-before, and -stop-after options work.
+  PassRegistry *Registry = PassRegistry::getPassRegistry();
+  initializeCore(*Registry);
+  initializeCodeGen(*Registry);
+  initializeLoopStrengthReducePass(*Registry);
+  initializeLowerIntrinsicsPass(*Registry);
+  initializeUnreachableBlockElimPass(*Registry);
+
+  // Register the target printer for --version.
+  cl::AddExtraVersionPrinter(TargetRegistry::printRegisteredTargetsForVersion);
+
+  cl::ParseCommandLineOptions(argc, argv, "llvm system compiler\n");
+
+  // Load the module to be compiled...
+  SMDiagnostic Err;
+  std::auto_ptr<Module> M;
+  Module *mod = 0;
+  Triple TheTriple;
+
+  M.reset(ParseIRFile(InputFilename, Err, Context));
+  mod = M.get();
+  if (mod == 0) {
+    Err.print(argv[0], errs());
+    return 1;
+  }
+
+  // If we are supposed to override the target triple, do so now.
+  if (!TargetTriple.empty())
+    mod->setTargetTriple(Triple::normalize(TargetTriple));
+  TheTriple = Triple(mod->getTargetTriple());
+
+  if (TheTriple.getTriple().empty())
+    TheTriple.setTriple(sys::getDefaultTargetTriple());
+
+  // Get the target specific parser.
+  std::string Error;
+  const Target *TheTarget = TargetRegistry::lookupTarget("bpf", TheTriple,
+                                                         Error);
+  if (!TheTarget) {
+    errs() << argv[0] << ": " << Error;
+    return 1;
+  }
+
+  // Package up features to be passed to target/subtarget
+  std::string FeaturesStr;
+  if (MAttrs.size()) {
+    SubtargetFeatures Features;
+    for (unsigned i = 0; i != MAttrs.size(); ++i)
+      Features.AddFeature(MAttrs[i]);
+    FeaturesStr = Features.getString();
+  }
+
+  CodeGenOpt::Level OLvl = CodeGenOpt::Default;
+  switch (OptLevel) {
+  default:
+    errs() << argv[0] << ": invalid optimization level.\n";
+    return 1;
+  case ' ': break;
+  case '0': OLvl = CodeGenOpt::None; break;
+  case '1': OLvl = CodeGenOpt::Less; break;
+  case '2': OLvl = CodeGenOpt::Default; break;
+  case '3': OLvl = CodeGenOpt::Aggressive; break;
+  }
+
+  TargetOptions Options;
+  Options.NoZerosInBSS = DontPlaceZerosInBSS;
+  Options.GuaranteedTailCallOpt = EnableGuaranteedTailCallOpt;
+  Options.DisableTailCalls = DisableTailCalls;
+
+  std::auto_ptr<TargetMachine>
+    target(TheTarget->createTargetMachine(TheTriple.getTriple(),
+                                          "", FeaturesStr, Options,
+                                          Reloc::Default, CodeModel::Default, OLvl));
+  assert(target.get() && "Could not allocate target machine!");
+  assert(mod && "Should have exited after outputting help!");
+  TargetMachine &Target = *target.get();
+
+  Target.setMCUseLoc(false);
+
+  Target.setMCUseCFI(false);
+
+  // Figure out where we are going to send the output.
+  OwningPtr<tool_output_file> Out
+    (GetOutputStream(TheTarget->getName(), TheTriple.getOS(), argv[0]));
+  if (!Out) return 1;
+
+  // Build up all of the passes that we want to do to the module.
+  PassManager PM;
+
+  // Add an appropriate TargetLibraryInfo pass for the module's triple.
+  TargetLibraryInfo *TLI = new TargetLibraryInfo(TheTriple);
+  if (DisableSimplifyLibCalls)
+    TLI->disableAllFunctions();
+  PM.add(TLI);
+
+  // Add the target data from the target machine, if it exists, or the module.
+  if (const DataLayout *TD = Target.getDataLayout())
+    PM.add(new DataLayout(*TD));
+  else
+    PM.add(new DataLayout(mod));
+
+  // Override default to generate verbose assembly.
+  Target.setAsmVerbosityDefault(true);
+
+  {
+    formatted_raw_ostream FOS(Out->os());
+
+    AnalysisID StartAfterID = 0;
+    AnalysisID StopAfterID = 0;
+    const PassRegistry *PR = PassRegistry::getPassRegistry();
+    if (!StartAfter.empty()) {
+      const PassInfo *PI = PR->getPassInfo(StartAfter);
+      if (!PI) {
+        errs() << argv[0] << ": start-after pass is not registered.\n";
+        return 1;
+      }
+      StartAfterID = PI->getTypeInfo();
+    }
+    if (!StopAfter.empty()) {
+      const PassInfo *PI = PR->getPassInfo(StopAfter);
+      if (!PI) {
+        errs() << argv[0] << ": stop-after pass is not registered.\n";
+        return 1;
+      }
+      StopAfterID = PI->getTypeInfo();
+    }
+
+    // Ask the target to add backend passes as necessary.
+    if (Target.addPassesToEmitFile(PM, FOS, FileType, NoVerify,
+                                   StartAfterID, StopAfterID)) {
+      errs() << argv[0] << ": target does not support generation of this"
+             << " file type!\n";
+      return 1;
+    }
+
+    // Before executing passes, print the final values of the LLVM options.
+    cl::PrintOptionValues();
+
+    PM.run(*mod);
+  }
+
+  // Declare success.
+  Out->keep();
+
+  return 0;
+}
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 tip 7/7] tracing filter examples in BPF
  2014-02-06  1:10 [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Alexei Starovoitov
                   ` (5 preceding siblings ...)
  2014-02-06  1:10 ` [RFC PATCH v2 tip 6/7] LLVM BPF backend Alexei Starovoitov
@ 2014-02-06  1:10 ` Alexei Starovoitov
  2014-02-06 10:42 ` [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Daniel Borkmann
  7 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-06  1:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David S. Miller, Steven Rostedt, Peter Zijlstra, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
	Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
	Christoph Hellwig, linux-kernel, netdev

filter_check/ - userspace correctness checker of BPF filters
examples/ - BPF filter examples in C

The examples are compiled by LLVM into .bpf files:
$cd examples
$make - compile .c into .bpf
$make check - check correctness of *.bpf
$make try - apply netif_rcv.bpf as a tracing filter

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 tools/bpf/examples/Makefile                 |   71 +++++++++++++++++
 tools/bpf/examples/README.txt               |   59 ++++++++++++++
 tools/bpf/examples/dropmon.c                |   40 ++++++++++
 tools/bpf/examples/netif_rcv.c              |   34 ++++++++
 tools/bpf/filter_check/Makefile             |   32 ++++++++
 tools/bpf/filter_check/README.txt           |    3 +
 tools/bpf/filter_check/trace_filter_check.c |  115 +++++++++++++++++++++++++++
 7 files changed, 354 insertions(+)
 create mode 100644 tools/bpf/examples/Makefile
 create mode 100644 tools/bpf/examples/README.txt
 create mode 100644 tools/bpf/examples/dropmon.c
 create mode 100644 tools/bpf/examples/netif_rcv.c
 create mode 100644 tools/bpf/filter_check/Makefile
 create mode 100644 tools/bpf/filter_check/README.txt
 create mode 100644 tools/bpf/filter_check/trace_filter_check.c

diff --git a/tools/bpf/examples/Makefile b/tools/bpf/examples/Makefile
new file mode 100644
index 0000000..1da6fd5
--- /dev/null
+++ b/tools/bpf/examples/Makefile
@@ -0,0 +1,71 @@
+KOBJ := $(PWD)/../../..
+
+VERSION_FILE := $(KOBJ)/include/generated/uapi/linux/version.h
+
+ifeq (,$(wildcard $(VERSION_FILE)))
+  $(error Linux kernel source not configured - missing version.h)
+endif
+
+BLD=$(PWD)
+LLC=$(BLD)/../llvm/bld/Debug+Asserts/bin/llc
+CHK=$(BLD)/../filter_check/trace_filter_check
+
+EXTRA_CFLAGS=
+
+ifeq ($(NESTED),1)
+# to get NOSTDINC_FLAGS and LINUXINCLUDE from kernel build
+# have to trick top Makefile
+# pretend that we're building a module
+KBUILD_EXTMOD=$(PWD)
+# and include main kernel Makefile
+include Makefile
+
+# cannot have other targets (like all, clean) here
+# since they will conflict
+%.bpf: %.c
+	clang $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
+	  -D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
+	  -O2 -emit-llvm -c $< -o -| $(LLC) -o $@
+
+else
+
+SRCS := $(notdir $(wildcard *.c))
+BPFS = $(patsubst %.c,$(BLD)/%.bpf,$(SRCS))
+
+all: $(LLC)
+# invoke make recursively with current Makefile, but
+# for specific .bpf targets
+	$(MAKE) -C $(KOBJ) -f $(BLD)/Makefile NESTED=1 $(BPFS)
+
+$(LLC):
+	$(MAKE) -C ../llvm/bld -j4
+
+$(CHK):
+	$(MAKE) -C ../filter_check
+
+check: $(CHK)
+	@$(foreach bpf,$(patsubst %.c,%.bpf,$(SRCS)),echo Checking $(bpf) ...;$(CHK) $(bpf);)
+
+try:
+	@echo --- BPF filter for static tracepoint net:netif_receive_skb ---
+	@echo | sudo tee /sys/kernel/debug/tracing/trace > /dev/null
+	@cat netif_rcv.bpf | sudo tee /sys/kernel/debug/tracing/events/net/netif_receive_skb/filter > /dev/null
+	@echo 1 | sudo tee /sys/kernel/debug/tracing/events/net/netif_receive_skb/enable > /dev/null
+	ping -c1 localhost | grep req
+	sudo cat /sys/kernel/debug/tracing/trace
+	@echo 0 | sudo tee /sys/kernel/debug/tracing/events/net/netif_receive_skb/enable > /dev/null
+	@echo 0 | sudo tee /sys/kernel/debug/tracing/events/net/netif_receive_skb/filter > /dev/null
+	@echo | sudo tee /sys/kernel/debug/tracing/trace
+	@echo --- BPF filter for dynamic kprobe __netif_receive_skb ---
+	@echo "p:my __netif_receive_skb" | sudo tee /sys/kernel/debug/tracing/kprobe_events > /dev/null
+	@cat netif_rcv.bpf | sudo tee /sys/kernel/debug/tracing/events/kprobes/my/filter > /dev/null
+	@echo 1 | sudo tee /sys/kernel/debug/tracing/events/kprobes/my/enable > /dev/null
+	ping -c1 localhost | grep req
+	sudo cat /sys/kernel/debug/tracing/trace
+	@echo 0 | sudo tee /sys/kernel/debug/tracing/events/kprobes/my/filter > /dev/null
+	@echo 0 | sudo tee /sys/kernel/debug/tracing/events/kprobes/my/enable > /dev/null
+	@echo | sudo tee /sys/kernel/debug/tracing/kprobe_events > /dev/null
+
+clean:
+	rm -f *.bpf
+endif
diff --git a/tools/bpf/examples/README.txt b/tools/bpf/examples/README.txt
new file mode 100644
index 0000000..0768ae1
--- /dev/null
+++ b/tools/bpf/examples/README.txt
@@ -0,0 +1,59 @@
+Tracing filter examples
+
+netif_rcv: tracing filter example that prints events for the loopback device only
+
+$ cat netif_rcv.bpf > /sys/kernel/debug/tracing/events/net/netif_receive_skb/filter
+$ echo 1 > /sys/kernel/debug/tracing/events/net/netif_receive_skb/enable
+$ ping -c1 localhost
+$ cat /sys/kernel/debug/tracing/trace
+            ping-5913  [003] ..s2  3779.285726: __netif_receive_skb_core: skb ffff880808e3a300 dev ffff88080bbf8000
+            ping-5913  [003] ..s2  3779.285744: __netif_receive_skb_core: skb ffff880808e3a900 dev ffff88080bbf8000
+
+Alternatively do:
+
+$make - compile .c into .bpf
+
+$make check - check correctness of *.bpf
+
+$make try - to apply netif_rcv.bpf as a tracing filter
+
+Should see output like:
+
+--- BPF filter for static tracepoint net:netif_receive_skb ---
+ping -c1 localhost | grep req
+64 bytes from localhost (127.0.0.1): icmp_req=1 ttl=64 time=0.040 ms
+sudo cat /sys/kernel/debug/tracing/trace
+# tracer: nop
+#
+# entries-in-buffer/entries-written: 2/2   #P:4
+#
+#                              _-----=> irqs-off
+#                             / _----=> need-resched
+#                            | / _---=> hardirq/softirq
+#                            || / _--=> preempt-depth
+#                            ||| /     delay
+#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
+#              | |       |   ||||       |         |
+            ping-5023  [001] ..s2  3554.532361: __netif_receive_skb_core: skb ffff8807f88bcc00 dev ffff88080b4d0000
+            ping-5023  [001] ..s2  3554.532378: __netif_receive_skb_core: skb ffff8807f88bcd00 dev ffff88080b4d0000
+
+--- BPF filter for dynamic kprobe __netif_receive_skb ---
+ping -c1 localhost | grep req
+64 bytes from localhost (127.0.0.1): icmp_req=1 ttl=64 time=0.061 ms
+sudo cat /sys/kernel/debug/tracing/trace
+# tracer: nop
+#
+# entries-in-buffer/entries-written: 2/2   #P:4
+#
+#                              _-----=> irqs-off
+#                             / _----=> need-resched
+#                            | / _---=> hardirq/softirq
+#                            || / _--=> preempt-depth
+#                            ||| /     delay
+#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
+#              | |       |   ||||       |         |
+            ping-5053  [002] d.s2  3554.902215: kprobe_ftrace_handler: skb ffff8807ae6f7700 dev ffff88080b4d0000
+            ping-5053  [002] d.s2  3554.902236: kprobe_ftrace_handler: skb ffff8807ae6f7200 dev ffff88080b4d0000
+
+dropmon: a faster version of tools/perf/scripts/python/net_dropmonitor.py
+(work in progress)
diff --git a/tools/bpf/examples/dropmon.c b/tools/bpf/examples/dropmon.c
new file mode 100644
index 0000000..3ed3f41
--- /dev/null
+++ b/tools/bpf/examples/dropmon.c
@@ -0,0 +1,40 @@
+/*
+ * drop monitor in BPF, faster version of
+ * tools/perf/scripts/python/net_dropmonitor.py
+ */
+#include <linux/bpf.h>
+#include <trace/bpf_trace.h>
+
+#define DESC(NAME) __attribute__((section(NAME), used))
+
+DESC("e skb:kfree_skb")
+/* attaches to /sys/kernel/debug/tracing/events/skb/kfree_skb */
+void dropmon(struct bpf_context *ctx)
+{
+	void *loc;
+	uint64_t *drop_cnt;
+
+	/*
+	 * skb:kfree_skb is defined as:
+	 * TRACE_EVENT(kfree_skb,
+	 *         TP_PROTO(struct sk_buff *skb, void *location),
+	 * so ctx->arg2 is 'location'
+	 */
+	loc = (void *)ctx->arg2;
+
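+	/*
+	 * bump the per-location drop counter if present, otherwise
+	 * insert a fresh zero-initialized entry for this drop site
+	 */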
+	drop_cnt = bpf_table_lookup(ctx, 0, &loc);
+	if (drop_cnt) {
+		__sync_fetch_and_add(drop_cnt, 1);
+	} else {
+		uint64_t init = 0;
+		bpf_table_update(ctx, 0, &loc, &init);
+	}
+}
+
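+/*
+ * one hash table keyed by the kfree_skb 'location' pointer, holding a
+ * u64 counter per entry; the initializer order (type, key size, value
+ * size, max entries, flags) is assumed from the sizes used above
+ */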
+struct bpf_table t[] DESC("bpftables") = {
+	{BPF_TABLE_HASH, sizeof(void *), sizeof(uint64_t), 4096, 0}
+};
+
+/* filter code license: */
+char l[] DESC("license") = "GPL v2";
+
diff --git a/tools/bpf/examples/netif_rcv.c b/tools/bpf/examples/netif_rcv.c
new file mode 100644
index 0000000..cd69f5c
--- /dev/null
+++ b/tools/bpf/examples/netif_rcv.c
@@ -0,0 +1,34 @@
+/*
+ * tracing filter example
+ * attaches to /sys/kernel/debug/tracing/events/net/netif_receive_skb
+ * prints events for the loopback device only
+ */
+#include <linux/skbuff.h>
+#include <linux/netdevice.h>
+#include <linux/bpf.h>
+#include <trace/bpf_trace.h>
+
+#define DESC(NAME) __attribute__((section(NAME), used))
+
+DESC("e net:netif_receive_skb")
+void my_filter(struct bpf_context *ctx)
+{
+	char devname[4] = "lo";
+	struct net_device *dev;
+	struct sk_buff *skb = 0;
+
+	/*
+	 * for tracepoints arg1 is the 1st arg of TP_ARGS() macro
+	 * defined in include/trace/events/.h
+	 * for kprobe events arg1 is the 1st arg of probed function
+	 */
+	skb = (struct sk_buff *)ctx->arg1;
+	dev = bpf_load_pointer(&skb->dev);
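+	/* only the first two bytes are compared, so any "lo*" name matches */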
+	if (bpf_memcmp(dev->name, devname, 2) == 0) {
+		char fmt[] = "skb %p dev %p \n";
+		bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)dev, 0);
+	}
+}
+
+/* filter code license: */
+char license[] DESC("license") = "GPL";
diff --git a/tools/bpf/filter_check/Makefile b/tools/bpf/filter_check/Makefile
new file mode 100644
index 0000000..b0ac7aa
--- /dev/null
+++ b/tools/bpf/filter_check/Makefile
@@ -0,0 +1,32 @@
+CC = gcc
+
+all: trace_filter_check
+
+srctree=../../..
+src-perf=../../perf
+ARCH=x86
+
+CFLAGS += -I$(src-perf)/util/include
+CFLAGS += -I$(src-perf)/arch/$(ARCH)/include
+CFLAGS += -I$(srctree)/arch/$(ARCH)/include/uapi
+CFLAGS += -I$(srctree)/arch/$(ARCH)/include
+CFLAGS += -I$(srctree)/include/uapi
+CFLAGS += -I$(srctree)/include
+CFLAGS += -O2 -w
+
+$(srctree)/kernel/bpf_jit/bpf_check.o: $(srctree)/kernel/bpf_jit/bpf_check.c
+	$(MAKE) -C $(srctree) kernel/bpf_jit/bpf_check.o
+$(srctree)/kernel/bpf_jit/bpf_run.o: $(srctree)/kernel/bpf_jit/bpf_run.c
+	$(MAKE) -C $(srctree) kernel/bpf_jit/bpf_run.o
+$(srctree)/kernel/trace/bpf_trace_callbacks.o: $(srctree)/kernel/trace/bpf_trace_callbacks.c
+	$(MAKE) -C $(srctree) kernel/trace/bpf_trace_callbacks.o
+
+trace_filter_check: LDLIBS = -Wl,--unresolved-symbols=ignore-all
+trace_filter_check: trace_filter_check.o \
+	$(srctree)/kernel/bpf_jit/bpf_check.o \
+	$(srctree)/kernel/bpf_jit/bpf_run.o \
+	$(srctree)/kernel/trace/bpf_trace_callbacks.o
+
+clean:
+	rm -rf *.o trace_filter_check
+
diff --git a/tools/bpf/filter_check/README.txt b/tools/bpf/filter_check/README.txt
new file mode 100644
index 0000000..f5badcd
--- /dev/null
+++ b/tools/bpf/filter_check/README.txt
@@ -0,0 +1,3 @@
+To pre-check correctness of the filter do:
+$ trace_filter_check filter_ex1.bpf
+(final filter check always happens in kernel)
diff --git a/tools/bpf/filter_check/trace_filter_check.c b/tools/bpf/filter_check/trace_filter_check.c
new file mode 100644
index 0000000..32ac7ff
--- /dev/null
+++ b/tools/bpf/filter_check/trace_filter_check.c
@@ -0,0 +1,115 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include <linux/bpf.h>
+#include <trace/bpf_trace.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <errno.h>
+
+/* for i386 use kernel ABI, this attr ignored by gcc in 64-bit */
+#define REGPARM __attribute__((regparm(3)))
+
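+/*
+ * Minimal userspace stand-ins for kernel symbols referenced by the
+ * bpf_check/bpf_run/bpf_trace_callbacks objects linked in by the
+ * Makefile. <string.h> is deliberately not included so that the
+ * memcmp/memcpy/strcmp overrides below don't clash with the libc
+ * prototypes (this file builds with -w); any leftover unresolved
+ * symbols are ignored at link time via --unresolved-symbols=ignore-all.
+ */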
+REGPARM
+void *__kmalloc(size_t size, int flags)
+{
+	return calloc(size, 1);
+}
+
+REGPARM
+void kfree(void *objp)
+{
+	free(objp);
+}
+
+int kmalloc_caches[128];
+REGPARM
+void *kmem_cache_alloc_trace(void *caches, int flags, size_t size)
+{
+	return calloc(size, 1);
+}
+
+void bpf_compile(void *prog)
+{
+}
+
+void __bpf_free(void *prog)
+{
+}
+
+REGPARM
+int memcmp(char *p1, char *p2, int len)
+{
+	int i;
+	for (i = 0; i < len; i++)
+		if (*p1++ != *p2++)
+			return 1;
+	return 0;
+}
+
+REGPARM
+int memcpy(char *p1, char *p2, int len)
+{
+	int i;
+	for (i = 0; i < len; i++)
+		*p1++ = *p2++;
+	return 0;
+}
+
+REGPARM
+int strcmp(char *p1, char *p2)
+{
+	return memcmp(p1, p2, strlen(p1));
+}
+
+
+REGPARM
+int printk(const char *fmt, ...)
+{
+	int ret;
+	va_list ap;
+
+	va_start(ap, fmt);
+	ret = vprintf(fmt, ap);
+	va_end(ap);
+	return ret;
+}
+
+char buf[16000];
+REGPARM
+int bpf_load_image(const char *image, int image_len, struct bpf_callbacks *cb,
+		   void **p_prog);
+
+int main(int ac, char **av)
+{
+	FILE *f;
+	int size, err;
+	void *prog;
+
+	if (ac < 2) {
+		printf("Usage: %s bpf_binary_image\n", av[0]);
+		return 1;
+	}
+
+	f = fopen(av[1], "r");
+	if (!f) {
+		printf("fopen %s\n", strerror(errno));
+		return 2;
+	}
+	size = fread(buf, 1, sizeof(buf), f);
+	if (size <= 0) {
+		printf("fread %s\n", strerror(errno));
+		return 3;
+	}
+	err = bpf_load_image(buf, size, &bpf_trace_cb, &prog);
+	if (!err)
+		printf("OK\n");
+	else
+		printf("err %s\n", strerror(-err));
+	fclose(f);
+	return err ? 4 : 0;
+}
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-06  1:10 [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Alexei Starovoitov
                   ` (6 preceding siblings ...)
  2014-02-06  1:10 ` [RFC PATCH v2 tip 7/7] tracing filter examples in BPF Alexei Starovoitov
@ 2014-02-06 10:42 ` Daniel Borkmann
  2014-02-07  1:20   ` Alexei Starovoitov
  7 siblings, 1 reply; 26+ messages in thread
From: Daniel Borkmann @ 2014-02-06 10:42 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

Hi Alexei,

On 02/06/2014 02:10 AM, Alexei Starovoitov wrote:
> Hi All,
>
> this patch set addresses main sticking points of the previous discussion:
> http://thread.gmane.org/gmane.linux.kernel/1605783
>
> Main difference:
> . all components are now in one place
>    tools/bpf/llvm - standalone LLVM backend for extended BPF instruction set
>
> . regs.si, regs.di accessors are replaced with arg1, arg2
>
> . compiler enforces presence of 'license' string in source C code
>    kernel enforces GPL compatibility of BPF program
>
> Why bother with it?
> Current 32-bit BPF is safe, but limited.
> kernel modules are 'all-goes', but not-safe.
> Extended 64-bit BPF provides safe and restricted kernel modules.
>
> Just like the first two, extended BPF can be used for all sorts of things.
> Initially for tracing/debugging/[ks]tap-like without vmlinux around,
> then for networking, security, etc
>
> To make exising kernel modules safe the x86 disassembler and code analyzer
> are needed. We've tried to follow that path. Disassembler was straight forward,
> but x86 analyzer was becoming unbearably complex due to variety of addressing
> modes, so we started to hack GCC to reduce output x86 insns and facing
> the headache of redoing disasm/analyzer for arm and other arhcs.
> Plus there is old 32-bit bpf insn set already.
> On one side extended BPF is a 64-bit extension to current BPF.
> On the other side it's a common subset of x86-64/aarch64/... ISAs:
> a generic 64-bit insn set that can be JITed to native HW one to one.

First of all, I think it's very interesting work! I'm just a bit concerned
that this _huge_ patchset with 64 bit BPF, or whatever we end up calling it,
will line up next to the BPF code we currently have and next to the new
nftables engine, so we end up with three such engines which do quite similar
things and are all exposed to user space, thus needing to be maintained
_forever_, adding up legacy even more. What would be the long-term use
cases where the 64 bit engine comes into play compared to the current BPF
engine? What are the concrete killer features? I didn't go through your code
in detail, but while we might/could get _some_ performance benefits, they
come at the _huge_ cost of added complexity. The current BPF I find okay to
debug and to follow, but how would the debug'ability of 64 bit programs end
up, given that, as you mention, things become "unbearably complex"? Did you
consider replacing the current BPF engine instead, adding a sort of built-in
compatibility mode for current BPF programs? I think that would be the far
better option instead of adding a new engine next to the other. For
maintainability, replacing the old one might be harder in the short term but
easier to maintain in the long run for everyone, no?

Best,

Daniel

> Tested on x86-64 and i386.
> BPF core was tested on arm-v7.
>
> V2 vs V1 details:
> 0001-Extended-BPF-core-framework:
>    no difference to instruction set
>    new bpf image format to include license string and enforcement during load
>
> 0002-Extended-BPF-JIT-for-x86-64: no changes
>
> 0003-Extended-BPF-64-bit-BPF-design-document: no changes
>
> 0004-Revert-x86-ptrace-Remove-unused-regs_get_argument:
>    restoring Masami's get_Nth_argument accessor to simplify kprobe filters
>
> 0005-use-BPF-in-tracing-filters: minor changes to switch from si/di to argN
>
> 0006-LLVM-BPF-backend: standalone BPF backend for LLVM
>    requires: apt-get install llvm-3.2-dev clang
>    compiles in 7 seconds, links with the rest of llvm infra
>    compatible with llvm 3.2, 3.3 and just released 3.4
>    Written in llvm coding style and llvm license, so it can be
>    upstreamed into llvm tree
>
> 0007-tracing-filter-examples-in-BPF:
>    tools/bpf/filter_check: userspace pre-checker of BPF filter
>    runs the same bpf_check() code as kernel does
>
>    tools/bpf/examples/netif_rcv.c:
> -----
> #define DESC(NAME) __attribute__((section(NAME), used))
> void my_filter(struct bpf_context *ctx)
> {
>          char devname[4] = "lo";
>          struct net_device *dev;
>          struct sk_buff *skb = 0;
>
>          /*
>           * for tracepoints arg1 is the 1st arg of TP_ARGS() macro
>           * defined in include/trace/events/.h
>           * for kprobe events arg1 is the 1st arg of probed function
>           */
>          skb = (struct sk_buff *)ctx->arg1;
>
>          dev = bpf_load_pointer(&skb->dev);
>          if (bpf_memcmp(dev->name, devname, 2) == 0) {
>                  char fmt[] = "skb %p dev %p \n";
>                  bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)dev, 0);
>          }
> }
> /* filter code license: */
> char license[] DESC("license") = "GPL";
> -----
>
> $cd tools/bpf/examples
> $make
>    compile it using clang+llvm_bpf
> $make check
>    check safety
> $make try
>    attach this filter to net:netif_receive_skb and kprobe __netif_receive_skb
>    and try ping
>
> dropmon.c is a demo of faster version of net_dropmonitor:
> -----
> /* attaches to /sys/kernel/debug/tracing/events/skb/kfree_skb */
> void dropmon(struct bpf_context *ctx)
> {
>          void *loc;
>          uint64_t *drop_cnt;
>
>          /*
>           * skb:kfree_skb is defined as:
>           * TRACE_EVENT(kfree_skb,
>           *         TP_PROTO(struct sk_buff *skb, void *location),
>           * so ctx->arg2 is 'location'
>           */
>          loc = (void *)ctx->arg2;
>
>          drop_cnt = bpf_table_lookup(ctx, 0, &loc);
>          if (drop_cnt) {
>                  __sync_fetch_and_add(drop_cnt, 1);
>          } else {
>                  uint64_t init = 0;
>                  bpf_table_update(ctx, 0, &loc, &init);
>          }
> }
> struct bpf_table t[] DESC("bpftables") = {
>          {BPF_TABLE_HASH, sizeof(void *), sizeof(uint64_t), 4096, 0}
> };
> /* filter code license: */
> char l[] DESC("license") = "GPL v2";
> -----
> It's not fully functional yet. Minimal work remaining to implement
> bpf_table_lookup()/bpf_table_update() in kernel
> and userspace access to filter's table.
>
> This example demonstrates that some interesting events don't have to be
> always fed into userspace, but can be pre-processed in kernel.
> tools/perf/scripts/python/net_dropmonitor.py would need to read bpf table
> from kernel (via debugfs or netlink) and print it in a nice format.
>
> Same as in V1 BPF filters are called before tracepoints store the TP_STRUCT
> fields, since performance advantage is significant.
>
> TODO:
>
> - complete 'dropmonitor': finish bpf hashtable and userspace access to it
>
> - add multi-probe support, so that one C program can specify multiple
>    functions for different probe points (similar to [ks]tap)
>
> - add 'lsmod' like facility to list all loaded BPF filters
>
> - add -m32 flag to llvm, so that C pointers are 32-bit,
>    but emitted BPF is still 64-bit.
>    Useful for kernel struct walking in BPF program on 32-bit archs
>
> - finish testing on arm
>
> - teach llvm to store line numbers in BPF image, so that bpf_check()
>    can print nice errors when program is not safe
>
> - allow read-only "strings" in C code
>    today analyzer can only verify safety of: char s[] = "string"; bpf_print(s);
>    but bpf_print("string"); cannot be proven yet
>
> - write JIT from BPF to aarch64
>
> - refactor openvswitch + BPF proposal
>
> If direction is ok, I would like to commit this part to a branch of tip tree
> or staging tree and continue working there.
> Future deltas will be easier to review.
>
> Thanks
>
> Alexei Starovoitov (7):
>    Extended BPF core framework
>    Extended BPF JIT for x86-64
>    Extended BPF (64-bit BPF) design document
>    Revert "x86/ptrace: Remove unused regs_get_argument_nth API"
>    use BPF in tracing filters
>    LLVM BPF backend
>    tracing filter examples in BPF
>
>   Documentation/bpf_jit.txt                          |  204 ++++
>   arch/x86/Kconfig                                   |    1 +
>   arch/x86/include/asm/ptrace.h                      |    3 +
>   arch/x86/kernel/ptrace.c                           |   24 +
>   arch/x86/net/Makefile                              |    1 +
>   arch/x86/net/bpf64_jit_comp.c                      |  625 ++++++++++++
>   arch/x86/net/bpf_jit_comp.c                        |   23 +-
>   arch/x86/net/bpf_jit_comp.h                        |   35 +
>   include/linux/bpf.h                                |  149 +++
>   include/linux/bpf_jit.h                            |  134 +++
>   include/linux/ftrace_event.h                       |    5 +
>   include/trace/bpf_trace.h                          |   41 +
>   include/trace/ftrace.h                             |   17 +
>   kernel/Makefile                                    |    1 +
>   kernel/bpf_jit/Makefile                            |    3 +
>   kernel/bpf_jit/bpf_check.c                         | 1054 ++++++++++++++++++++
>   kernel/bpf_jit/bpf_run.c                           |  511 ++++++++++
>   kernel/trace/Kconfig                               |    1 +
>   kernel/trace/Makefile                              |    1 +
>   kernel/trace/bpf_trace_callbacks.c                 |  193 ++++
>   kernel/trace/trace.c                               |    7 +
>   kernel/trace/trace.h                               |   11 +-
>   kernel/trace/trace_events.c                        |    9 +-
>   kernel/trace/trace_events_filter.c                 |   61 +-
>   kernel/trace/trace_kprobe.c                        |   15 +-
>   lib/Kconfig.debug                                  |   15 +
>   tools/bpf/examples/Makefile                        |   71 ++
>   tools/bpf/examples/README.txt                      |   59 ++
>   tools/bpf/examples/dropmon.c                       |   40 +
>   tools/bpf/examples/netif_rcv.c                     |   34 +
>   tools/bpf/filter_check/Makefile                    |   32 +
>   tools/bpf/filter_check/README.txt                  |    3 +
>   tools/bpf/filter_check/trace_filter_check.c        |  115 +++
>   tools/bpf/llvm/LICENSE.TXT                         |   70 ++
>   tools/bpf/llvm/Makefile.rules                      |  641 ++++++++++++
>   tools/bpf/llvm/README.txt                          |   23 +
>   tools/bpf/llvm/bld/.gitignore                      |    2 +
>   tools/bpf/llvm/bld/Makefile                        |   27 +
>   tools/bpf/llvm/bld/Makefile.common                 |   14 +
>   tools/bpf/llvm/bld/Makefile.config                 |  124 +++
>   .../llvm/bld/include/llvm/Config/AsmParsers.def    |    8 +
>   .../llvm/bld/include/llvm/Config/AsmPrinters.def   |    9 +
>   .../llvm/bld/include/llvm/Config/Disassemblers.def |    8 +
>   tools/bpf/llvm/bld/include/llvm/Config/Targets.def |    9 +
>   .../bpf/llvm/bld/include/llvm/Support/DataTypes.h  |   96 ++
>   tools/bpf/llvm/bld/lib/Makefile                    |   11 +
>   .../llvm/bld/lib/Target/BPF/InstPrinter/Makefile   |   10 +
>   .../llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile  |   11 +
>   tools/bpf/llvm/bld/lib/Target/BPF/Makefile         |   17 +
>   .../llvm/bld/lib/Target/BPF/TargetInfo/Makefile    |   10 +
>   tools/bpf/llvm/bld/lib/Target/Makefile             |   11 +
>   tools/bpf/llvm/bld/tools/Makefile                  |   12 +
>   tools/bpf/llvm/bld/tools/llc/Makefile              |   15 +
>   tools/bpf/llvm/lib/Target/BPF/BPF.h                |   30 +
>   tools/bpf/llvm/lib/Target/BPF/BPF.td               |   29 +
>   tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp    |  100 ++
>   tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp      |   62 ++
>   tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td    |   24 +
>   tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp |   36 +
>   tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h   |   35 +
>   tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp  |  182 ++++
>   tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp  |  676 +++++++++++++
>   tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h    |  105 ++
>   tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td   |   29 +
>   tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp     |  162 +++
>   tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h       |   53 +
>   tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td      |  455 +++++++++
>   tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp   |   77 ++
>   tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h     |   40 +
>   tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp  |  122 +++
>   tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h    |   65 ++
>   tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td   |   39 +
>   tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp     |   23 +
>   tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h       |   33 +
>   tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp |   72 ++
>   tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h   |   69 ++
>   .../lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp  |   79 ++
>   .../lib/Target/BPF/InstPrinter/BPFInstPrinter.h    |   34 +
>   .../lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp  |   85 ++
>   .../llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h |   33 +
>   .../Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp |  119 +++
>   .../lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h     |   34 +
>   .../Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp   |  120 +++
>   .../lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h |   67 ++
>   .../Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp    |  115 +++
>   .../lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h  |   56 ++
>   .../lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp    |   13 +
>   tools/bpf/llvm/tools/llc/llc.cpp                   |  381 +++++++
>   88 files changed, 8255 insertions(+), 25 deletions(-)
>   create mode 100644 Documentation/bpf_jit.txt
>   create mode 100644 arch/x86/net/bpf64_jit_comp.c
>   create mode 100644 arch/x86/net/bpf_jit_comp.h
>   create mode 100644 include/linux/bpf.h
>   create mode 100644 include/linux/bpf_jit.h
>   create mode 100644 include/trace/bpf_trace.h
>   create mode 100644 kernel/bpf_jit/Makefile
>   create mode 100644 kernel/bpf_jit/bpf_check.c
>   create mode 100644 kernel/bpf_jit/bpf_run.c
>   create mode 100644 kernel/trace/bpf_trace_callbacks.c
>   create mode 100644 tools/bpf/examples/Makefile
>   create mode 100644 tools/bpf/examples/README.txt
>   create mode 100644 tools/bpf/examples/dropmon.c
>   create mode 100644 tools/bpf/examples/netif_rcv.c
>   create mode 100644 tools/bpf/filter_check/Makefile
>   create mode 100644 tools/bpf/filter_check/README.txt
>   create mode 100644 tools/bpf/filter_check/trace_filter_check.c
>   create mode 100644 tools/bpf/llvm/LICENSE.TXT
>   create mode 100644 tools/bpf/llvm/Makefile.rules
>   create mode 100644 tools/bpf/llvm/README.txt
>   create mode 100644 tools/bpf/llvm/bld/.gitignore
>   create mode 100644 tools/bpf/llvm/bld/Makefile
>   create mode 100644 tools/bpf/llvm/bld/Makefile.common
>   create mode 100644 tools/bpf/llvm/bld/Makefile.config
>   create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/AsmParsers.def
>   create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/AsmPrinters.def
>   create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/Disassemblers.def
>   create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/Targets.def
>   create mode 100644 tools/bpf/llvm/bld/include/llvm/Support/DataTypes.h
>   create mode 100644 tools/bpf/llvm/bld/lib/Makefile
>   create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/InstPrinter/Makefile
>   create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile
>   create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/Makefile
>   create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/TargetInfo/Makefile
>   create mode 100644 tools/bpf/llvm/bld/lib/Target/Makefile
>   create mode 100644 tools/bpf/llvm/bld/tools/Makefile
>   create mode 100644 tools/bpf/llvm/bld/tools/llc/Makefile
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPF.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPF.td
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h
>   create mode 100644 tools/bpf/llvm/lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp
>   create mode 100644 tools/bpf/llvm/tools/llc/llc.cpp
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-06 10:42 ` [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Daniel Borkmann
@ 2014-02-07  1:20   ` Alexei Starovoitov
  2014-02-13 20:20     ` Daniel Borkmann
  2014-02-13 22:32     ` H. Peter Anvin
  0 siblings, 2 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-07  1:20 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On Thu, Feb 6, 2014 at 2:42 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
> Hi Alexei,
>
>
> On 02/06/2014 02:10 AM, Alexei Starovoitov wrote:
>>
>> Hi All,
>>
>> this patch set addresses main sticking points of the previous discussion:
>> http://thread.gmane.org/gmane.linux.kernel/1605783
>>
>> Main difference:
>> . all components are now in one place
>>    tools/bpf/llvm - standalone LLVM backend for extended BPF instruction
>> set
>>
>> . regs.si, regs.di accessors are replaced with arg1, arg2
>>
>> . compiler enforces presence of 'license' string in source C code
>>    kernel enforces GPL compatibility of BPF program
>>
>> Why bother with it?
>> Current 32-bit BPF is safe, but limited.
>> kernel modules are 'all-goes', but not-safe.
>> Extended 64-bit BPF provides safe and restricted kernel modules.
>>
>> Just like the first two, extended BPF can be used for all sorts of things.
>> Initially for tracing/debugging/[ks]tap-like without vmlinux around,
>> then for networking, security, etc
>>
>> To make exising kernel modules safe the x86 disassembler and code analyzer
>> are needed. We've tried to follow that path. Disassembler was straight
>> forward,
>> but x86 analyzer was becoming unbearably complex due to variety of
>> addressing
>> modes, so we started to hack GCC to reduce output x86 insns and facing
>> the headache of redoing disasm/analyzer for arm and other arhcs.
>> Plus there is old 32-bit bpf insn set already.
>> On one side extended BPF is a 64-bit extension to current BPF.
>> On the other side it's a common subset of x86-64/aarch64/... ISAs:
>> a generic 64-bit insn set that can be JITed to native HW one to one.

Hi Daniel,

Thank you for taking a look. Good questions. I had the same concerns.
Old BPF was carefully extended in specific places.
End result may look big at first glance, but every extension has specific
reason behind it. I tried to explain the reasoning in Documentation/bpf_jit.txt

I'm planning to write an on-the-fly converter from old BPF to BPF64
when BPF64 manages to demonstrate that it is equally safe.
It is straightforward to convert. The encoding is very similar.
Core concepts are the same.
Try diff include/uapi/linux/filter.h include/linux/bpf.h
to see how much is reused.

I believe that old BPF has outlived itself and BPF64 should
replace it in all current use cases plus a lot more.
It just cannot happen at once.
BPF64 can come in, the bpf32->bpf64 converter starts functioning,
a JIT from bpf64 to aarch64 and maybe sparc64 needs to be in place,
then old BPF can fade away.

> First of all, I think it's very interesting work ! I'm just a bit concerned
> that this _huge_ patchset with 64 bit BPF, or however we call it, will line

Huge?
The kernel part is only 2k;
the rest is 6k of userspace LLVM backend where most of it is llvm's
boilerplate code. The GCC backend for BPF is 3k.
The goal is to have both GCC and LLVM backends upstreamed
once the kernel pieces are agreed upon.
For comparison, the existing tools/net/bpf* is 2.5k,
but here with 6k we get an optimizing compiler from C and an assembler.

> up in one row next to the BPF code we currently have and next to new
> nftables
> engine and we will end up with three such engines which do quite similar
> things and are all exposed to user space thus they need to be maintained
> _forever_, adding up legacy even more. What would be the long-term future
> use
> cases where the 64 bit engine comes into place compared to the current BPF
> engine? What are the concrete killer features? I didn't went through your

killer features vs old bpf are:
- zero-cost function calls
- 32-bit vs 64-bit
- optimizing compiler that can compile C into BPF64

Why call kernel functions from BPF?
So that the BPF instruction set has to be extended only once and JITs are
written only once.
Over the years many extensions crept into old BPF as 'negative offsets',
but JITs don't support all of them and assume the BPF input is an 'skb' only.
seccomp is using old BPF, but, because of these limitations, cannot use the JIT.
BPF64 allows seccomp to be JITed, since the BPF input is generalized
as 'struct bpf_context'.
A new 'negative offset' extension for old BPF would mean implementing it in
the JITs of all architectures. Painful, but doable. We can do better.

A fixed instruction set that allows zero-overhead calls into kernel functions
is much more flexible and extensible in a clean way.
Take a look at kernel/trace/bpf_trace_callbacks.c:
it is a customization of the generic BPF64 core for 'tracing filters'.
The set of functions for networking and the definition of 'bpf_context'
will be different.
So BPF64 for tracing needs X extensions and BPF64 for networking needs Y
extensions, but the core framework stays the same and the JIT stays the same.
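
To make that concrete, a minimal sketch of what such a per-subsystem
callback set could look like (signatures here are illustrative; only
get_context_access() is actually named in the patchset):

struct bpf_callbacks {
	/* tell the verifier which fields of 'struct bpf_context'
	 * a program may access, and at what size
	 */
	int (*get_context_access)(int off, int size);
	/* map a function id used by a 'call' insn to the kernel
	 * helper this subsystem allows programs to call
	 */
	void *(*get_func_addr)(int func_id);
};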

How to do zero-overhead call?
Map BPF registers to native registers one to one
and have compatible calling convention between BPF and native.
Then BPF asm code:
mov R1, 1
mov R2, 2
call foo
will be JITed into x86-64:
mov rdi, 1
mov rsi, 2
call foo
That makes BPF64 calls into the kernel as fast as possible.
Especially for networking we don't want the overhead of FFI mechanisms.

That's why the A and X regs and the lack of callee-saved regs make it
impractical for old BPF to support generic function calls.

BPF64 defines R1-R5 as function arguments and R6-R9 as
callee-saved, so the kernel can natively call into JITed BPF and back
with no extra argument shuffling.
gcc/llvm backends know that R6-R9 will be preserved while BPF is
calling into kernel functions and can make proper optimizations.
R6-R9 map to rbx-r15 on x86-64. On aarch64 we have
even more freedom of mapping.
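
Spelled out as a table, that mapping could look like this (a sketch;
R6-R9 -> rbx..r15 as said above, the exact per-register assignment is
up to the JIT):

static const char * const bpf64_reg_to_x86_64[] = {
	[0]  = "rax",	/* R0: return value */
	[1]  = "rdi",	/* R1: 1st argument */
	[2]  = "rsi",	/* R2: 2nd argument */
	[3]  = "rdx",	/* R3: 3rd argument */
	[4]  = "rcx",	/* R4: 4th argument */
	[5]  = "r8",	/* R5: 5th argument */
	[6]  = "rbx",	/* R6: callee-saved */
	[7]  = "r13",	/* R7: callee-saved */
	[8]  = "r14",	/* R8: callee-saved */
	[9]  = "r15",	/* R9: callee-saved */
	[10] = "rbp",	/* R10: frame pointer */
};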

> code
> in detail, but although we might/could have _some_ performance benefits but
> at
> the _huge_ cost of adding complexity. The current BPF I find okay to debug
> and
> to follow, but how would be debug'ability of 64 bit programs end up, as you
> mention, it becomes "unbearably complex"?

"unbearably complex" was the reference to x86 static analyzer :)
It's difficult to reconstruct and verify control and data flow of x86 asm code.
Binary compilers do that (like transmeta and others), but that's not suitable
for kernel.

Both old bpf asm and bpf64 asm code I find equivalent in readability.

clang dropmon.c ...|llc -filetype=asm
will produce the following bpf64 asm code:
        mov     r6, r1
        ldd     r1, 8(r6)
        std     -8(r10), r1
        mov     r7, 0
        mov     r3, r10
        addi    r3, -8
        mov     r1, r6
        mov     r2, r7
        call    bpf_table_lookup
        jeqi    r0, 0 goto .LBB0_2

which corresponds to C:
void dropmon(struct bpf_context *ctx)
{       void *loc;
        uint64_t *drop_cnt;
        loc = (void *)ctx->arg2;
        drop_cnt = bpf_table_lookup(ctx, 0, &loc);
        if (drop_cnt) ...

I think restricted C is easier to program and debug,
which is another killer feature of bpf64.

An interesting use case would be if some kernel subsystem
decides to generate BPF64 insns on the fly and JIT them.
Sort of self-modifying kernel code.
It's certainly easier to generate a BPF64 binary with macros
from linux/bpf.h than an x86 binary...
I may be dreaming here :)
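
For illustration, the "mov R1, 1; mov R2, 2; call foo" sequence from above
could be built roughly like this (the macro names are made up here; the
real ones live in linux/bpf.h of the patchset):

	struct bpf_insn prog[] = {
		BPF_MOV64_IMM(R1, 1),	/* mov R1, 1 */
		BPF_MOV64_IMM(R2, 2),	/* mov R2, 2 */
		BPF_CALL_FUNC(foo),	/* call foo */
		BPF_EXIT_INSN(),	/* return to caller */
	};

and the subsystem would then feed 'prog' through bpf_check() and the JIT.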

> Did you instead consider to
> replace
> the current BPF engine instead, and add a sort of built-in compatibility
> mode for current BPF programs? I think that this would be the way better
> option to go with instead of adding a new engine next to the other. For
> maintainability, trying to replace the old one might be harder to do on the
> short term but better to maintain on the long run for everyone, no?

Exactly. I think the on-the-fly converter from bpf32->bpf64 is this built-in
compatibility layer. I completely agree that replacing bpf32 is hard
short term, since it will raise too many concerns about
stability/safety, but long term it's the way to go.

I'm open to all suggestions on how to make it more generic, useful,
faster.

Thank you for feedback.

Regards,
Alexei

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-07  1:20   ` Alexei Starovoitov
@ 2014-02-13 20:20     ` Daniel Borkmann
  2014-02-13 22:22       ` Daniel Borkmann
  2014-02-14  4:47       ` Alexei Starovoitov
  2014-02-13 22:32     ` H. Peter Anvin
  1 sibling, 2 replies; 26+ messages in thread
From: Daniel Borkmann @ 2014-02-13 20:20 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On 02/07/2014 02:20 AM, Alexei Starovoitov wrote:
...
> Hi Daniel,

Thanks for your answer and sorry for the late reply.

> Thank you for taking a look. Good questions. I had the same concerns.
> Old BPF was carefully extended in specific places.
> End result may look big at first glance, but every extension has specific
> reason behind it. I tried to explain the reasoning in Documentation/bpf_jit.txt
>
> I'm planning to write an on-the-fly converter from old BPF to BPF64
> when BPF64 manages to demonstrate that it is equally safe.
> It is straight forward to convert. Encoding is very similar.
> Core concepts are the same.
> Try diff include/uapi/linux/filter.h include/linux/bpf.h
> to see how much is reused.
>
> I believe that old BPF outlived itself and BPF64 should
> replace it in all current use cases plus a lot more.
> It just cannot happen at once.
> BPF64 can come in. bpf32->bpf64 converter functioning.
> JIT from bpf64->aarch64 and may be sparc64 needs to be in place.
> Then old bpf can fade away.

Do you see a possibility to integrate your work step by step? That is,
to first integrate the interpreter part only; meaning, to detect "old"
BPF programs e.g. coming from SO_ATTACH_FILTER et al and run them in
compatibility mode while extended BPF is fully integrated and replaces
the old engine in net/core/filter.c. Maybe, "old" programs can be
transformed transparently into the new representation and then executed
in eBPF. If possible, in such a way that in the first
step JIT compilers won't need any upgrades. Once that is resolved,
JIT compilers could successively migrate, arch by arch, to compile the
new code? And last but not least the existing tools as well for handling
eBPF. I think, if possible, that would be great. Also, I unfortunately
haven't looked into your code too deeply yet due to time constraints,
but I'm wondering e.g. for accessing some skb fields we currently use
the "hack" to "overload" load instructions with negative arguments. Do
we have a sort of "meta" instruction that is extensible in eBPF to avoid
such things in the future?

>> First of all, I think it's very interesting work ! I'm just a bit concerned
>> that this _huge_ patchset with 64 bit BPF, or however we call it, will line
>
> Huge?
> kernel is only 2k
> the rest is 6k of userspace LLVM backend where most of it is llvm's
> boilerplate code. GCC backend for BPF is 3k.
> The goal is to have both GCC and LLVM backends to be upstreamed
> when kernel pieces are agreed upon.
> For comparison existing tools/net/bpf* is 2.5k
> but here with 6k we get optimizing compiler from C and assembler.
>
>> up in one row next to the BPF code we currently have and next to new
>> nftables
>> engine and we will end up with three such engines which do quite similar
>> things and are all exposed to user space thus they need to be maintained
>> _forever_, adding up legacy even more. What would be the long-term future
>> use
>> cases where the 64 bit engine comes into place compared to the current BPF
>> engine? What are the concrete killer features? I didn't went through your
>
> killer features vs old bpf are:
> - zero-cost function calls
> - 32-bit vs 64-bit
> - optimizing compiler that can compile C into BPF64
>
> Why call kernel function from BPF?
> So that BPF instruction set has to be extended only once and JITs are
> written only once.
> Over the years many extensions crept into old BPF as 'negative offsets'.
> but JITs don't support all of them and assume bpf input as 'skb' only.
> seccomp is using old bpf, but, because of these limitations, cannot use JIT.
> BPF64 allows seccomp to be JITed, since bpf input is generalized
> as 'struct bpf_context'.
> New 'negative offset' extension for old bpf would mean implementing it in
> JITs of all architectures? Painful, but doable. We can do better.
>
> Fixed instruction set that allows zero-overhead calls into kernel functions
> is much more flexible and extendable in a clean way.
> Take a look at kernel/trace/bpf_trace_callbacks.c
> It is a customization of generic BPF64 core for 'tracing filters'.
> The set of functions for networking and definition of 'bpf_context'
> will be different.
> So BPF64 for tracing need X extensions, BPF64 for networking needs Y
> extensions, but core framework stays the same and JIT stays the same.
>
> How to do zero-overhead call?
> Map BPF registers to native registers one to one
> and have compatible calling convention between BPF and native.
> Then BPF asm code:
> mov R1, 1
> mov R2, 2
> call foo
> will be JITed into x86-64:
> mov rdi, 1
> mov rsi, 2
> call foo
> That makes BPF64 calls into kernel as fast as possible.
> Especially for networking we don't want overhead of FFI mechanisms.
>
> That's why A and X regs and lack of callee-saved regs make old BPF
> impractical to support generic function calls.
>
> BPF64 defines R1-R5 as function arguments and R6-R9 as
> callee-saved, so kernel can natively call into JIT-ed BPF and back
> with no extra argument shuffling.
> gcc/llvm backends know that R6-R9 will be preserved while BPF is
> calling into kernel functions and can make proper optimizations.
> R6-R9 map to rbx-r15 on x86-64. On aarch64 we have
> even more freedom of mapping.
>
>> code
>> in detail, but although we might/could have _some_ performance benefits but
>> at
>> the _huge_ cost of adding complexity. The current BPF I find okay to debug
>> and
>> to follow, but how would be debug'ability of 64 bit programs end up, as you
>> mention, it becomes "unbearably complex"?
>
> "unbearably complex" was the reference to x86 static analyzer :)
> It's difficult to reconstruct and verify control and data flow of x86 asm code.
> Binary compilers do that (like transmeta and others), but that's not suitable
> for kernel.
>
> Both old bpf asm and bpf64 asm code I find equivalent in readability.
>
> clang dropmon.c ...|llc -filetype=asm
> will produce the following bpf64 asm code:
>          mov     r6, r1
>          ldd     r1, 8(r6)
>          std     -8(r10), r1
>          mov     r7, 0
>          mov     r3, r10
>          addi    r3, -8
>          mov     r1, r6
>          mov     r2, r7
>          call    bpf_table_lookup
>          jeqi    r0, 0 goto .LBB0_2
>
> which corresponds to C:
> void dropmon(struct bpf_context *ctx)
> {       void *loc;
>          uint64_t *drop_cnt;
>          loc = (void *)ctx->arg2;
>          drop_cnt = bpf_table_lookup(ctx, 0, &loc);
>          if (drop_cnt) ...
>
> I think restricted C is easier to program and debug.
> Which is another killer feature of bpf64.
>
> Interesting use case would be if some kernel subsystem
> decides to generate BPF64 insns on the fly and JIT them.
> Sort of self-modifieable kernel code.
> It's certainly easier to generate BPF64 binary with macroses
> from linux/bpf.h instead of x86 binary...
> I may be dreaming here :)
>
>> Did you instead consider to
>> replace
>> the current BPF engine instead, and add a sort of built-in compatibility
>> mode for current BPF programs? I think that this would be the way better
>> option to go with instead of adding a new engine next to the other. For
>> maintainability, trying to replace the old one might be harder to do on the
>> short term but better to maintain on the long run for everyone, no?
>
> Exactly. I think on-the-fly converter from bpf32->bpf64 is this built-in
> compatibility layer. I completely agree that replacing bpf32 is hard
> short term, since it will raise too many concerns about
> stability/safety, but long term it's a way to go.

Yes, I agree.

> I'm open to all suggestions on how to make it more generic, useful,
> faster.
>
> Thank you for feedback.

Thank you, must have been really fun to implement this. :)

> Regards,
> Alexei
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-13 20:20     ` Daniel Borkmann
@ 2014-02-13 22:22       ` Daniel Borkmann
  2014-02-14  0:59         ` Alexei Starovoitov
  2014-02-14  4:47       ` Alexei Starovoitov
  1 sibling, 1 reply; 26+ messages in thread
From: Daniel Borkmann @ 2014-02-13 22:22 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On 02/13/2014 09:20 PM, Daniel Borkmann wrote:
> On 02/07/2014 02:20 AM, Alexei Starovoitov wrote:
> ...
>> Hi Daniel,
>
> Thanks for your answer and sorry for the late reply.
>
>> Thank you for taking a look. Good questions. I had the same concerns.
>> Old BPF was carefully extended in specific places.
>> End result may look big at first glance, but every extension has specific
>> reason behind it. I tried to explain the reasoning in Documentation/bpf_jit.txt
>>
>> I'm planning to write an on-the-fly converter from old BPF to BPF64
>> when BPF64 manages to demonstrate that it is equally safe.
>> It is straight forward to convert. Encoding is very similar.
>> Core concepts are the same.
>> Try diff include/uapi/linux/filter.h include/linux/bpf.h
>> to see how much is reused.
>>
>> I believe that old BPF outlived itself and BPF64 should
>> replace it in all current use cases plus a lot more.
>> It just cannot happen at once.
>> BPF64 can come in. bpf32->bpf64 converter functioning.
>> JIT from bpf64->aarch64 and may be sparc64 needs to be in place.
>> Then old bpf can fade away.
>
> Do you see a possibility to integrate your work step by step? That is,
> to first integrate the interpreter part only; meaning, to detect "old"
> BPF programs e.g. coming from SO_ATTACH_FILTER et al and run them in
> compatibility mode while extended BPF is fully integrated and replaces
> the old engine in net/core/filter.c. Maybe, "old" programs can be
> transformed transparently to the new representation and then would be
> good to execute in eBPF. If possible, in such a way that in the first
> step JIT compilers won't need any upgrades. Once that is resolved,
> JIT compilers could successively migrate, arch by arch, to compile the
> new code? And last but not least the existing tools as well for handling
> eBPF. I think, if possible, that would be great. Also, I unfortunately
> haven't looked into your code too deeply yet due to time constraints,
> but I'm wondering e.g. for accessing some skb fields we currently use
> the "hack" to "overload" load instructions with negative arguments. Do
> we have a sort of "meta" instruction that is extendible in eBPF to avoid
> such things in future?
>
>>> First of all, I think it's very interesting work ! I'm just a bit concerned
>>> that this _huge_ patchset with 64 bit BPF, or however we call it, will line
>>
>> Huge?
>> kernel is only 2k
>> the rest is 6k of userspace LLVM backend where most of it is llvm's
>> boilerplate code. GCC backend for BPF is 3k.
>> The goal is to have both GCC and LLVM backends to be upstreamed
>> when kernel pieces are agreed upon.
>> For comparison existing tools/net/bpf* is 2.5k
>> but here with 6k we get optimizing compiler from C and assembler.
>>
>>> up in one row next to the BPF code we currently have and next to new
>>> nftables
>>> engine and we will end up with three such engines which do quite similar
>>> things and are all exposed to user space thus they need to be maintained
>>> _forever_, adding up legacy even more. What would be the long-term future
>>> use
>>> cases where the 64 bit engine comes into place compared to the current BPF
>>> engine? What are the concrete killer features? I didn't went through your
>>
>> killer features vs old bpf are:
>> - zero-cost function calls
>> - 32-bit vs 64-bit
>> - optimizing compiler that can compile C into BPF64
>>
>> Why call kernel function from BPF?
>> So that BPF instruction set has to be extended only once and JITs are
>> written only once.
>> Over the years many extensions crept into old BPF as 'negative offsets'.
>> but JITs don't support all of them and assume bpf input as 'skb' only.
>> seccomp is using old bpf, but, because of these limitations, cannot use JIT.
>> BPF64 allows seccomp to be JITed, since bpf input is generalized
>> as 'struct bpf_context'.
>> New 'negative offset' extension for old bpf would mean implementing it in
>> JITs of all architectures? Painful, but doable. We can do better.

I'm very curious, do you also have any performance numbers, e.g. for
networking by taking JIT'ed/non-JIT'ed BPF filters and compare them against
JIT'ed/non-JIT'ed eBPF filters to see how many pps we gain or loose e.g.
for a scenario with a middle box running cls_bpf .. or some other macro/
micro benchmark just to get a picture where both stand in terms of
performance? Who knows, maybe it would outperform nftables engine as
well? ;-) How would that look on a 32bit arch with eBPF that is 64bit?

>> Fixed instruction set that allows zero-overhead calls into kernel functions
>> is much more flexible and extendable in a clean way.
>> Take a look at kernel/trace/bpf_trace_callbacks.c
>> It is a customization of generic BPF64 core for 'tracing filters'.
>> The set of functions for networking and definition of 'bpf_context'
>> will be different.
>> So BPF64 for tracing need X extensions, BPF64 for networking needs Y
>> extensions, but core framework stays the same and JIT stays the same.
>>
>> How to do zero-overhead call?
>> Map BPF registers to native registers one to one
>> and have compatible calling convention between BPF and native.
>> Then BPF asm code:
>> mov R1, 1
>> mov R2, 2
>> call foo
>> will be JITed into x86-64:
>> mov rdi, 1
>> mov rsi, 2
>> call foo
>> That makes BPF64 calls into kernel as fast as possible.
>> Especially for networking we don't want overhead of FFI mechanisms.
>>
>> That's why A and X regs and lack of callee-saved regs make old BPF
>> impractical to support generic function calls.
>>
>> BPF64 defines R1-R5 as function arguments and R6-R9 as
>> callee-saved, so kernel can natively call into JIT-ed BPF and back
>> with no extra argument shuffling.
>> gcc/llvm backends know that R6-R9 will be preserved while BPF is
>> calling into kernel functions and can make proper optimizations.
>> R6-R9 map to rbx-r15 on x86-64. On aarch64 we have
>> even more freedom of mapping.
>>
>>> code
>>> in detail, but although we might/could have _some_ performance benefits but
>>> at
>>> the _huge_ cost of adding complexity. The current BPF I find okay to debug
>>> and
>>> to follow, but how would be debug'ability of 64 bit programs end up, as you
>>> mention, it becomes "unbearably complex"?
>>
>> "unbearably complex" was the reference to x86 static analyzer :)
>> It's difficult to reconstruct and verify control and data flow of x86 asm code.
>> Binary compilers do that (like transmeta and others), but that's not suitable
>> for kernel.
>>
>> Both old bpf asm and bpf64 asm code I find equivalent in readability.
>>
>> clang dropmon.c ...|llc -filetype=asm
>> will produce the following bpf64 asm code:
>>          mov     r6, r1
>>          ldd     r1, 8(r6)
>>          std     -8(r10), r1
>>          mov     r7, 0
>>          mov     r3, r10
>>          addi    r3, -8
>>          mov     r1, r6
>>          mov     r2, r7
>>          call    bpf_table_lookup
>>          jeqi    r0, 0 goto .LBB0_2
>>
>> which corresponds to C:
>> void dropmon(struct bpf_context *ctx)
>> {       void *loc;
>>          uint64_t *drop_cnt;
>>          loc = (void *)ctx->arg2;
>>          drop_cnt = bpf_table_lookup(ctx, 0, &loc);
>>          if (drop_cnt) ...
>>
>> I think restricted C is easier to program and debug.
>> Which is another killer feature of bpf64.
>>
>> Interesting use case would be if some kernel subsystem
>> decides to generate BPF64 insns on the fly and JIT them.
>> Sort of self-modifieable kernel code.
>> It's certainly easier to generate BPF64 binary with macroses
>> from linux/bpf.h instead of x86 binary...
>> I may be dreaming here :)
>>
>>> Did you instead consider to
>>> replace
>>> the current BPF engine instead, and add a sort of built-in compatibility
>>> mode for current BPF programs? I think that this would be the way better
>>> option to go with instead of adding a new engine next to the other. For
>>> maintainability, trying to replace the old one might be harder to do on the
>>> short term but better to maintain on the long run for everyone, no?
>>
>> Exactly. I think on-the-fly converter from bpf32->bpf64 is this built-in
>> compatibility layer. I completely agree that replacing bpf32 is hard
>> short term, since it will raise too many concerns about
>> stability/safety, but long term it's a way to go.
>
> Yes, I agree.
>
>> I'm open to all suggestions on how to make it more generic, useful,
>> faster.
>>
>> Thank you for feedback.
>
> Thank you, must have been really fun to implement this. :)
>
>> Regards,
>> Alexei
>>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-07  1:20   ` Alexei Starovoitov
  2014-02-13 20:20     ` Daniel Borkmann
@ 2014-02-13 22:32     ` H. Peter Anvin
  2014-02-13 22:44       ` Daniel Borkmann
  1 sibling, 1 reply; 26+ messages in thread
From: H. Peter Anvin @ 2014-02-13 22:32 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
	Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
	Christoph Hellwig, linux-kernel, netdev

On 02/06/2014 05:20 PM, Alexei Starovoitov wrote:
> 
> I believe that old BPF outlived itself and BPF64 should
> replace it in all current use cases plus a lot more.
> It just cannot happen at once.
> BPF64 can come in. bpf32->bpf64 converter functioning.
> JIT from bpf64->aarch64 and may be sparc64 needs to be in place.
> Then old bpf can fade away.
> 

I don't think that is doable any time soon.  Right now pretty much all
mobile devices, for example, are 32 bits and they really want to use
syscall filtering for security.  Performance matters greatly there.

As such, 32-bit JIT support is going to be very important for a long
time to come.

	-hpa



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-13 22:32     ` H. Peter Anvin
@ 2014-02-13 22:44       ` Daniel Borkmann
  2014-02-13 22:47         ` H. Peter Anvin
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Borkmann @ 2014-02-13 22:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Alexei Starovoitov, Ingo Molnar, David S. Miller, Steven Rostedt,
	Peter Zijlstra, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On 02/13/2014 11:32 PM, H. Peter Anvin wrote:
> On 02/06/2014 05:20 PM, Alexei Starovoitov wrote:
>>
>> I believe that old BPF outlived itself and BPF64 should
>> replace it in all current use cases plus a lot more.
>> It just cannot happen at once.
>> BPF64 can come in. bpf32->bpf64 converter functioning.
>> JIT from bpf64->aarch64 and may be sparc64 needs to be in place.
>> Then old bpf can fade away.
>
> I don't think that is doable any time soon.  Right now pretty much all
> mobile devices, for example, are 32 bits and they really want to use
> syscall filtering for security.  Performance matters greatly there.

Well, if that were the case, then seccomp would have had JIT support
long ago. ;-) Right now BPF filters with seccomp are not JIT-compiled
for _any_ architecture.

> As such, 32-bit JIT support is going to be very important for a long
> time to come.

True, I think that pretty much depends on whether we can manage to find a way
to cleanly integrate it into net/core/filter.c while still supporting
the old instructions, as I've mentioned earlier.

> 	-hpa
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-13 22:44       ` Daniel Borkmann
@ 2014-02-13 22:47         ` H. Peter Anvin
  2014-02-13 22:55           ` Daniel Borkmann
  0 siblings, 1 reply; 26+ messages in thread
From: H. Peter Anvin @ 2014-02-13 22:47 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, Ingo Molnar, David S. Miller, Steven Rostedt,
	Peter Zijlstra, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On 02/13/2014 02:44 PM, Daniel Borkmann wrote:
> 
> Well, if that would be the case, then seccomp would have had JIT support
> long ago. ;-) Right now BPF filters with seccomp are not JIT compiled
> for _any_ architecture.
> 

Really, I was under the impression there were.  They *should be*, that
was an important concept in the development of the seccomp filters.

	-hpa


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-13 22:47         ` H. Peter Anvin
@ 2014-02-13 22:55           ` Daniel Borkmann
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Borkmann @ 2014-02-13 22:55 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Alexei Starovoitov, Ingo Molnar, David S. Miller, Steven Rostedt,
	Peter Zijlstra, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On 02/13/2014 11:47 PM, H. Peter Anvin wrote:
> On 02/13/2014 02:44 PM, Daniel Borkmann wrote:
>>
>> Well, if that would be the case, then seccomp would have had JIT support
>> long ago. ;-) Right now BPF filters with seccomp are not JIT compiled
>> for _any_ architecture.
>
> Really, I was under the impression there were.  They *should be*, that
> was an important concept in the development of the seccomp filters.

$ git grep -n BPF_S_ANC_SECCOMP_LD_W
include/linux/filter.h:153:     BPF_S_ANC_SECCOMP_LD_W,
kernel/seccomp.c:136:                   ftest->code = BPF_S_ANC_SECCOMP_LD_W;
net/core/filter.c:389:          case BPF_S_ANC_SECCOMP_LD_W:
net/core/filter.c:812:          [BPF_S_ANC_SECCOMP_LD_W] = BPF_LD|BPF_B|BPF_ABS,

Afaik, there have been attempts to support it, but they had flaws.

> 	-hpa
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-13 22:22       ` Daniel Borkmann
@ 2014-02-14  0:59         ` Alexei Starovoitov
  2014-02-14 17:02           ` Daniel Borkmann
  0 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-14  0:59 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On Thu, Feb 13, 2014 at 2:22 PM, Daniel Borkmann <dborkman@redhat.com> wrote:
> On 02/13/2014 09:20 PM, Daniel Borkmann wrote:
>>
>> On 02/07/2014 02:20 AM, Alexei Starovoitov wrote:
>> ...
>>>
>>> Hi Daniel,
>>
>>
>> Thanks for your answer and sorry for the late reply.
>>
>>> Thank you for taking a look. Good questions. I had the same concerns.
>>> Old BPF was carefully extended in specific places.
>>> End result may look big at first glance, but every extension has specific
>>> reason behind it. I tried to explain the reasoning in
>>> Documentation/bpf_jit.txt
>>>
>>> I'm planning to write an on-the-fly converter from old BPF to BPF64
>>> when BPF64 manages to demonstrate that it is equally safe.
>>> It is straight forward to convert. Encoding is very similar.
>>> Core concepts are the same.
>>> Try diff include/uapi/linux/filter.h include/linux/bpf.h
>>> to see how much is reused.
>>>
>>> I believe that old BPF outlived itself and BPF64 should
>>> replace it in all current use cases plus a lot more.
>>> It just cannot happen at once.
>>> BPF64 can come in. bpf32->bpf64 converter functioning.
>>> JIT from bpf64->aarch64 and may be sparc64 needs to be in place.
>>> Then old bpf can fade away.
>>
>>
>> Do you see a possibility to integrate your work step by step? That is,
>> to first integrate the interpreter part only; meaning, to detect "old"
>> BPF programs e.g. coming from SO_ATTACH_FILTER et al and run them in
>> compatibility mode while extended BPF is fully integrated and replaces
>> the old engine in net/core/filter.c. Maybe, "old" programs can be
>> transformed transparently to the new representation and then would be
>> good to execute in eBPF. If possible, in such a way that in the first
>> step JIT compilers won't need any upgrades. Once that is resolved,
>> JIT compilers could successively migrate, arch by arch, to compile the
>> new code? And last but not least the existing tools as well for handling
>> eBPF. I think, if possible, that would be great. Also, I unfortunately
>> haven't looked into your code too deeply yet due to time constraints,
>> but I'm wondering e.g. for accessing some skb fields we currently use
>> the "hack" to "overload" load instructions with negative arguments. Do
>> we have a sort of "meta" instruction that is extendible in eBPF to avoid
>> such things in future?
>>
>>>> First of all, I think it's very interesting work ! I'm just a bit
>>>> concerned
>>>> that this _huge_ patchset with 64 bit BPF, or however we call it, will
>>>> line
>>>
>>>
>>> Huge?
>>> kernel is only 2k
>>> the rest is 6k of userspace LLVM backend where most of it is llvm's
>>> boilerplate code. GCC backend for BPF is 3k.
>>> The goal is to have both GCC and LLVM backends to be upstreamed
>>> when kernel pieces are agreed upon.
>>> For comparison existing tools/net/bpf* is 2.5k
>>> but here with 6k we get optimizing compiler from C and assembler.
>>>
>>>> up in one row next to the BPF code we currently have and next to new
>>>> nftables
>>>> engine and we will end up with three such engines which do quite similar
>>>> things and are all exposed to user space thus they need to be maintained
>>>> _forever_, adding up legacy even more. What would be the long-term
>>>> future
>>>> use
>>>> cases where the 64 bit engine comes into place compared to the current
>>>> BPF
>>>> engine? What are the concrete killer features? I didn't went through
>>>> your
>>>
>>>
>>> killer features vs old bpf are:
>>> - zero-cost function calls
>>> - 32-bit vs 64-bit
>>> - optimizing compiler that can compile C into BPF64
>>>
>>> Why call kernel function from BPF?
>>> So that BPF instruction set has to be extended only once and JITs are
>>> written only once.
>>> Over the years many extensions crept into old BPF as 'negative offsets'.
>>> but JITs don't support all of them and assume bpf input as 'skb' only.
>>> seccomp is using old bpf, but, because of these limitations, cannot use
>>> JIT.
>>> BPF64 allows seccomp to be JITed, since bpf input is generalized
>>> as 'struct bpf_context'.
>>> New 'negative offset' extension for old bpf would mean implementing it in
>>> JITs of all architectures? Painful, but doable. We can do better.
>
>
> I'm very curious, do you also have any performance numbers, e.g. for
> networking by taking JIT'ed/non-JIT'ed BPF filters and compare them against
> JIT'ed/non-JIT'ed eBPF filters to see how many pps we gain or loose e.g.
> for a scenario with a middle box running cls_bpf .. or some other macro/
> micro benchmark just to get a picture where both stand in terms of
> performance? Who knows, maybe it would outperform nftables engine as
> well? ;-) How would that look on a 32bit arch with eBPF that is 64bit?

I don't have JITed/non-JITed numbers, but I suspect for micro-benchmarks
the gap should be big. I was shooting for near-native performance after JIT.

So I took the flow_dissector() function, tweaked it a bit and compiled it into BPF.
x86_64 skb_flow_dissect() same skb (all cached)          -  42 nsec per call
x86_64 skb_flow_dissect() different skbs (cache misses)  - 141 nsec per call
bpf_jit skb_flow_dissect() same skb (all cached)         -  51 nsec per call
bpf_jit skb_flow_dissect() different skbs (cache misses) - 135 nsec per call

C->BPF64->x86_64 is slower than C->x86_64 when all data is in cache,
but the presence of cache misses hides the extra insns.

For GRE, flow_dissector() looks into the inner packet, but for VXLAN it does not,
since it needs to know the UDP port number. We can extend it with if (static_key)
and walk the list of udp_offload_base->offload->port like we do in
udp_gro_receive(), but for RPS we just need a hash. I think a custom loadable
flow_dissector() is the way to go.
If we know that the majority of the traffic on the given machine is VXLAN to port N,
we can hard-code this into the BPF program. We don't need to walk the outer
packet either; just pick ip/port from the inner one. It's doable with old BPF too.

What we used to think of as dynamic can, with BPF, be hard-coded.
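
In restricted C such a hard-coded dissector could look roughly like this
(a sketch only; the helper name and offset constant are hypothetical):

void vxlan_dissector(struct bpf_context *ctx)
{
	struct sk_buff *skb = (struct sk_buff *)ctx->arg1;
	/* hypothetical 2-byte load helper; 4789 is the VXLAN port
	 * we hard-coded at compile time for this machine
	 */
	int dport = bpf_load_half(skb, UDP_DEST_OFF);

	if (dport != 4789)
		return;	/* not our VXLAN traffic, fall back */
	/* hash inner ip/ports at fixed offsets and hand it to RPS */
}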

As soon as I have time I'm planning to play with nftables. The idea is:
rules change rarely, but a lot of traffic goes through them,
so we can afford to spend time optimizing them.

Either user input or an nft program can be converted to C, then LLVM invoked
to optimize the whole thing, generate BPF and load it.
Adding a rule will take time, but if execution of such ip/nftables rules
is faster, the end user will benefit.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-13 20:20     ` Daniel Borkmann
  2014-02-13 22:22       ` Daniel Borkmann
@ 2014-02-14  4:47       ` Alexei Starovoitov
  2014-02-14 17:27         ` Daniel Borkmann
  1 sibling, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-14  4:47 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On Thu, Feb 13, 2014 at 12:20 PM, Daniel Borkmann <dborkman@redhat.com> wrote:
> On 02/07/2014 02:20 AM, Alexei Starovoitov wrote:
> ...
>>
>> Hi Daniel,
>
>
> Thanks for your answer and sorry for the late reply.
>
>
>> Thank you for taking a look. Good questions. I had the same concerns.
>> Old BPF was carefully extended in specific places.
>> End result may look big at first glance, but every extension has specific
>> reason behind it. I tried to explain the reasoning in
>> Documentation/bpf_jit.txt
>>
>> I'm planning to write an on-the-fly converter from old BPF to BPF64
>> when BPF64 manages to demonstrate that it is equally safe.
>> It is straight forward to convert. Encoding is very similar.
>> Core concepts are the same.
>> Try diff include/uapi/linux/filter.h include/linux/bpf.h
>> to see how much is reused.
>>
>> I believe that old BPF outlived itself and BPF64 should
>> replace it in all current use cases plus a lot more.
>> It just cannot happen at once.
>> BPF64 can come in. bpf32->bpf64 converter functioning.
>> JIT from bpf64->aarch64 and may be sparc64 needs to be in place.
>> Then old bpf can fade away.
>
>
> Do you see a possibility to integrate your work step by step? That is,

Sure. Let's see how we can do it.

> to first integrate the interpreter part only; meaning, to detect "old"
> BPF programs e.g. coming from SO_ATTACH_FILTER et al and run them in
> compatibility mode while extended BPF is fully integrated and replaces
> the old engine in net/core/filter.c. Maybe, "old" programs can be

do you mean drop the bpf64 JIT and checker, and just have the bpf32->bpf64
converter and the bpf64 interpreter as phase 1?
Checking is done by the old bpf32 checker;
all existing bpf32 JITs, if available, can convert bpf32 to native,
but the interpreter will be running on bpf64?
Phase 2 would introduce the bpf64_x86 JIT and so on?
Sounds fine.

So far I haven't tried to optimize the bpf64 interpreter, since the insn set
is designed for eventual JITing and the interpreter is there to support archs
that don't have a JIT yet.
I guess I have to tweak it to perform at bpf32 interpreter speeds.

> transformed transparently to the new representation and then would be
> good to execute in eBPF. If possible, in such a way that in the first
> step JIT compilers won't need any upgrades. Once that is resolved,
> JIT compilers could successively migrate, arch by arch, to compile the
> new code? And last but not least the existing tools as well for handling
> eBPF. I think, if possible, that would be great. Also, I unfortunately
> haven't looked into your code too deeply yet due to time constraints,
> but I'm wondering e.g. for accessing some skb fields we currently use
> the "hack" to "overload" load instructions with negative arguments. Do
> we have a sort of "meta" instruction that is extendible in eBPF to avoid
> such things in future?

Exactly.
This 'negative offset' hack of bpf32 isn't very clean, since the JITs for all
archs need to change when new offsets are added.
For bpf64 I'm proposing a customizable 'bpf_context' and a variable set
of BPF-callable functions, so JITs don't need to change and the verifier
stays the same.
That's the idea behind 'bpf_callbacks' in include/linux/bpf_jit.h.

Some metadata makes sense to pass as input into the BPF program.
For example, for seccomp 'bpf_context' can be 'struct seccomp_data'.
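
(For reference, 'struct seccomp_data' from include/uapi/linux/seccomp.h:

struct seccomp_data {
	int nr;				/* syscall number */
	__u32 arch;			/* AUDIT_ARCH_* token */
	__u64 instruction_pointer;	/* CPU instruction pointer */
	__u64 args[6];			/* up to 6 syscall arguments */
};

so a seccomp bpf64 program would read the syscall number and arguments
as plain loads from its bpf_context.)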

For networking, bpf_context can be the 'skb';
then bpf_s_anc_protocol becomes a normal 2-byte bpf64 load
from the skb->protocol field. Allowing access to other fields of the skb
is just a matter of defining the permissions of 'struct bpf_context' in
bpf_callback->get_context_access().

Some other metadata and extensions are cleaner when defined
as function calls from BPF, since calls are free.
I think bpf_table_lookup() is a fundamental one that allows defining
arbitrary tables within BPF and accessing them from the program.
(Here I need feedback the most on whether to access the tables
via netlink from userspace or via debugfs...)

It will probably be easier to read the code of the bpf32->bpf64 converter
to understand the differences between the two.
I guess I have to start working on the converter sooner than I thought...

Thanks
Alexei

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-14  0:59         ` Alexei Starovoitov
@ 2014-02-14 17:02           ` Daniel Borkmann
  2014-02-14 17:55             ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Borkmann @ 2014-02-14 17:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On 02/14/2014 01:59 AM, Alexei Starovoitov wrote:
...
>> I'm very curious, do you also have any performance numbers, e.g. for
>> networking by taking JIT'ed/non-JIT'ed BPF filters and compare them against
>> JIT'ed/non-JIT'ed eBPF filters to see how many pps we gain or loose e.g.
>> for a scenario with a middle box running cls_bpf .. or some other macro/
>> micro benchmark just to get a picture where both stand in terms of
>> performance? Who knows, maybe it would outperform nftables engine as
>> well? ;-) How would that look on a 32bit arch with eBPF that is 64bit?
>
> I don't have jited/non-jited numbers, but I suspect for micro-benchmarks
> the gap should be big. I was shooting for near native performance after JIT.

Ohh, I meant it would be interesting to see a comparison of e.g. common libpcap
high-level filters that are in 32-bit BPF + JIT (current code) vs 64-bit BPF + JIT
(new code). I'm wondering how 32-bit-only archs should be handled so as not to
regress in evaluation performance relative to the current code.

> So I took flow_dissector() function, tweaked it a bit and compiled into BPF.
> x86_64 skb_flow_dissect() same skb (all cached)          -  42 nsec per call
> x86_64 skb_flow_dissect() different skbs (cache misses)  - 141 nsec per call
> bpf_jit skb_flow_dissect() same skb (all cached)         -  51 nsec per call
> bpf_jit skb_flow_dissect() different skbs (cache misses) - 135 nsec per call
>
> C->BPF64->x86_64 is slower than C->x86_64 when all data is in cache,
> but presence of cache misses hide extra insns.
>
> For gre flow_dissector() looks into inner packet, but for vxlan it does not,
> since it needs to know udp port number. We can extend it with if (static_key)
> and walk the list of udp_offload_base->offload->port like we do in
> udp_gro_receive(),
> but for RPS we just need a hash. I think custom loadable
> flow_dissector() is the way to go.
> If we know that majority of the traffic on the given machine is vxlan to port N
> we can hard code this into BPF program. Don't need to walk outer packet either.
> Just pick ip/port from inner. It's doable with old BPF too.
>
> What we used to think as dynamic, with BPF can be hard coded.
>
> As soon as I have time I'm thinking to play with nftables. The idea is:
> rules are changed rarely, but a lot of traffic goes through them,
> so we can spend time optimizing them.
>
> Either user input or nft program can be converted to C, then LLVM invoked
> to optimize the whole thing, generate BPF and load it.
> Adding a rule will take time, but if execution of such ip/nftables
> will be faster
> the end user will benefit.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-14  4:47       ` Alexei Starovoitov
@ 2014-02-14 17:27         ` Daniel Borkmann
  2014-02-14 20:17           ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Borkmann @ 2014-02-14 17:27 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On 02/14/2014 05:47 AM, Alexei Starovoitov wrote:
...
>> Do you see a possibility to integrate your work step by step? That is,
>
> Sure. let's see how we can do it.
>
>> to first integrate the interpreter part only; meaning, to detect "old"
>> BPF programs e.g. coming from SO_ATTACH_FILTER et al and run them in
>> compatibility mode while extended BPF is fully integrated and replaces
>> the old engine in net/core/filter.c. Maybe, "old" programs can be
>
> do you mean drop bfp64_jit, checker and just have bpf32->bpf64 converter
> and bpf64 interpreter as phase 1 ?
> Checking is done by old bpf32,
> all existing bpf32 jits, if available, can convert bpf32 to native,
> but interpreter will be running on bpf64 ?
> phase 2 to introduce bpf64_x86 jit and so on?
> Sounds fine.

If that's possible, the first step would be to migrate bpf_run() from patch 1
into sk_run_filter() from net/core/filter.c, and also bring the related
include file into include/linux/filter.h resp. include/uapi/linux/filter.h.
Plus the code that is needed to verify the image in the new (and old) format,
e.g. bpf_load_image() et al, and to either convert old programs into the new
format or, generally, find a way to still handle them (bpf/seccomp)
while having the new code included and leaving new JITs aside. That I think
could be phase 1. Phase 2 would be to successively replace current JITs, etc.

> Today I didn't try to optimize bpf64 interpreter, since insn set is designed
> for eventual JITing and interpreter is there to support archs that don't
> have jit yet.
> I guess I have to tweak it to perform at bpf32 interpreter speeds.
>
>> transformed transparently to the new representation and then would be
>> good to execute in eBPF. If possible, in such a way that in the first
>> step JIT compilers won't need any upgrades. Once that is resolved,
>> JIT compilers could successively migrate, arch by arch, to compile the
>> new code? And last but not least the existing tools as well for handling
>> eBPF. I think, if possible, that would be great. Also, I unfortunately
>> haven't looked into your code too deeply yet due to time constraints,
>> but I'm wondering e.g. for accessing some skb fields we currently use
>> the "hack" to "overload" load instructions with negative arguments. Do
>> we have a sort of "meta" instruction that is extendible in eBPF to avoid
>> such things in future?
>
> Exactly.
> This 'negative offset' hack of bpf32 isn't very clean, since jits for all archs
> need to change when new offsets added.
> For bpf64 I'm proposing a customizable 'bpf_context' and variable set
> of bpf-callable functions, so JITs don't need to change and verifier
> stays the same.
> That's the idea behind 'bpf_callbacks' in include/linux/bpf_jit.h
>
> Some meta data makes sense to pass as input into bpf program.
> Like for seccomp 'bpf_context' can be 'struct seccomp_data'
>
> For networking, bpf_context can be 'skb',
> then bpf_s_anc_protocol becomes a normal 2-byte bpf64 load
> from skb->protocol field. Allowing access to other fields of skb
> is just a matter of defining permissions of 'struct bpf_context' in
> bpf_callback->get_context_access()
>
> Some other meta data and extensions are cleaner when defined
> as function calls from bpf, since calls are free.
> I think bpf_table_lookup() is a fundamental one that allows to define
> arbitrary tables within bpf and access them from the program.
> (here I need feedback the most whether to access tables
> via netlink from userspace or via debugfs...)
>
> It probably will be easier to read the code of bpf32-bpf64 converter
> to understand the differences between the two.
> I guess I have to start working on the converter sooner than I thought...
>
> Thanks
> Alexei

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-14 17:02           ` Daniel Borkmann
@ 2014-02-14 17:55             ` Alexei Starovoitov
  2014-02-15 16:13               ` Daniel Borkmann
  0 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-14 17:55 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On Fri, Feb 14, 2014 at 9:02 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
> On 02/14/2014 01:59 AM, Alexei Starovoitov wrote:
> ...
>>>
>>> I'm very curious, do you also have any performance numbers, e.g. for
>>>
>>> networking by taking JIT'ed/non-JIT'ed BPF filters and compare them
>>> against
>>> JIT'ed/non-JIT'ed eBPF filters to see how many pps we gain or loose e.g.
>>> for a scenario with a middle box running cls_bpf .. or some other macro/
>>> micro benchmark just to get a picture where both stand in terms of
>>> performance? Who knows, maybe it would outperform nftables engine as
>>> well? ;-) How would that look on a 32bit arch with eBPF that is 64bit?
>>
>>
>> I don't have jited/non-jited numbers, but I suspect for micro-benchmarks
>> the gap should be big. I was shooting for near native performance after
>> JIT.
>
>
> Ohh, I meant it would be interesting to see a comparison of e.g. common
> libpcap
> high-level filters that are in 32bit BPF + JIT (current code) vs 64bit BPF +
> JIT
> (new code). I'm wondering how 32bit-only archs should be handled to not
> regress
> in evaluation performance to the current code.

Agreed. If we want to rip out the old BPF interpreter and replace it with an
old->new converter + the new BPF interpreter, the performance should be very close.
In the grand scheme some differences are ok, since libpcap BPF filters are not hot:
so much is happening before and after that tcpdump won't notice whether
the filter was JITed or not. cls_bpf is a different story, though I don't know
what specific use case you have there.
Could you define a BPF micro-benchmark? cls_bpf with pktgen?

Thanks
Alexei

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-14 17:27         ` Daniel Borkmann
@ 2014-02-14 20:17           ` Alexei Starovoitov
  0 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-14 20:17 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On Fri, Feb 14, 2014 at 9:27 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
> On 02/14/2014 05:47 AM, Alexei Starovoitov wrote:
> ...
>>>
>>> Do you see a possibility to integrate your work step by step? That is,
>>
>>
>> Sure. let's see how we can do it.
>>
>>> to first integrate the interpreter part only; meaning, to detect "old"
>>> BPF programs e.g. coming from SO_ATTACH_FILTER et al and run them in
>>> compatibility mode while extended BPF is fully integrated and replaces
>>> the old engine in net/core/filter.c. Maybe, "old" programs can be
>>
>>
>> do you mean drop bpf64_jit and the checker, and just have a bpf32->bpf64
>> converter and bpf64 interpreter as phase 1?
>> Checking is done by the old bpf32 checker,
>> all existing bpf32 jits, if available, can convert bpf32 to native,
>> but the interpreter will be running on bpf64?
>> phase 2 to introduce the bpf64_x86 jit and so on?
>> Sounds fine.
>
>
> If that's possible, the first step would be to migrate bpf_run() from patch 1
> into sk_run_filter() in net/core/filter.c, and also bring the related
> include file into include/linux/filter.h resp. include/uapi/linux/filter.h.
> Plus the code needed to verify the image in the new (and old) format, e.g.
> bpf_load_image() et al, and to convert old programs into the new format;
> generally, to find a way to still handle them (bpf/seccomp) while having
> the new code included and leaving the new JITs aside. That I think could
> be phase 1. Phase 2 would be to successively replace the current JITs, etc.

Sounds good.
Let me rephrase.
step 1:
sk_attach_filter() -> __sk_prepare_filter() -> sk_chk_filter() all stay as-is.
sk_chk_filter() calls a new bpf_convert() that converts old bpf insns into
new bpf insns. The old sk_run_filter() is gone, replaced with bpf_run()
that iterates over the new insns.
Here we would need to make sure that all sk_run_filter() users
(seccomp, ppp, isdn, team) are unaffected.

step 2:
use the 'len' field of 'struct sock_fprog' to differentiate between old and
new bpf:
len < 4096 -> old bpf insns, go to step 1
len > 4096 -> new bpf insns, verify them through the new bpf_check()
and run them via the same new sk_run_filter()==bpf_run().
This way all current users of bpf can load new programs through the same
interfaces (a sketch of this dispatch follows the steps below).

step 3:
replace bpf32_x86 jit with bpf64_jit.

step 4:
the old filter attach interfaces do not allow the most interesting bpf64
programs with bpf_tables (like the one for kernel tracing); extend them or
add new ones.
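
To make steps 1 and 2 concrete, here is a minimal userspace sketch of the
proposed dispatch; bpf_convert(), bpf_check() and the struct layouts are
stand-ins for the discussion above, not the posted kernel code:
-----
#include <stdio.h>

/* hypothetical stand-ins, not the posted kernel code */
struct sock_filter { unsigned short code; };    /* old 32-bit insn */
struct bpf_insn    { unsigned char code[8]; };  /* new 64-bit insn */

/* step 1: old programs pass the existing sk_chk_filter()-style checks,
 * then get translated insn-by-insn into the new format */
static int bpf_convert(const struct sock_filter *old, unsigned int len)
{
	return old && len ? 0 : -1;
}

/* step 2: new programs are verified directly by the new checker */
static int bpf_check(const struct bpf_insn *prog, unsigned int len)
{
	return prog && len ? 0 : -1;
}

/* the 'len' field of struct sock_fprog picks the format */
static int prepare_filter(const void *insns, unsigned int len)
{
	if (len < 4096)
		return bpf_convert(insns, len);  /* old bpf, step 1 */
	return bpf_check(insns, len);            /* new bpf, step 2 */
}

int main(void)
{
	struct sock_filter old_prog[2] = { { 6 }, { 6 } };

	/* 2 insns < 4096 -> old path */
	printf("old path: %d\n", prepare_filter(old_prog, 2));
	return 0;
}
-----
Either way the program ends up in the new insn format and runs on the same
bpf_run(), as in step 1.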

Initially I extended include/uapi/linux/filter.h, but then decided it was too
aggressive to change a uapi header, and split it into include/linux/bpf.h
instead. It's definitely cleaner to have one. I guess with a comment that the
bpf64 insn set may change, it should be ok. I'll go back to a single filter.h.

As far as making the bpf64 interpreter perform at bpf32 speeds on i386 and
arm32, I think I have to reconsider 32-bit subregs. Peter Anvin should be
happy :)
If old bpf is like the 8086, bpf64 with 32-bit subregs and 64-bit registers
is like x86-64 with x32.
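
A rough illustration of the subreg idea (the zero-extension semantics here
are an assumption mirroring x86-64, not something the patches define yet):
a 32-bit ALU op touches only the low half of a 64-bit BPF register, so a
32-bit host can implement it with one native operation:
-----
#include <stdint.h>
#include <stdio.h>

/* hypothetical: 64-bit BPF registers where a 32-bit ALU op writes
 * the low 32 bits and zero-extends the result, like x86-64/x32 */
static uint64_t regs[16];

static void alu32_add(int dst, int src)
{
	/* one native 32-bit add on a 32-bit host; upper half cleared */
	regs[dst] = (uint32_t)((uint32_t)regs[dst] + (uint32_t)regs[src]);
}

int main(void)
{
	regs[1] = 0xffffffff00000001ull;
	regs[2] = 2;
	alu32_add(1, 2);
	printf("%llx\n", (unsigned long long)regs[1]);  /* prints 3 */
	return 0;
}
-----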

Sounds like we're converging. What do other stakeholders have to say?

Thanks
Alexei

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-14 17:55             ` Alexei Starovoitov
@ 2014-02-15 16:13               ` Daniel Borkmann
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Borkmann @ 2014-02-15 16:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Ingo Molnar, David S. Miller, Steven Rostedt, Peter Zijlstra,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu, Tom Zanussi,
	Jovi Zhangwei, Eric Dumazet, Linus Torvalds, Andrew Morton,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Pekka Enberg,
	Arjan van de Ven, Christoph Hellwig, linux-kernel, netdev

On 02/14/2014 06:55 PM, Alexei Starovoitov wrote:
> On Fri, Feb 14, 2014 at 9:02 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>> On 02/14/2014 01:59 AM, Alexei Starovoitov wrote:
>> ...
>>>>
>>>> I'm very curious, do you also have any performance numbers, e.g. for
>>>>
>>>> networking by taking JIT'ed/non-JIT'ed BPF filters and comparing them
>>>> against
>>>> JIT'ed/non-JIT'ed eBPF filters to see how many pps we gain or lose e.g.
>>>> for a scenario with a middle box running cls_bpf .. or some other macro/
>>>> micro benchmark just to get a picture where both stand in terms of
>>>> performance? Who knows, maybe it would outperform nftables engine as
>>>> well? ;-) How would that look on a 32bit arch with eBPF that is 64bit?
>>>
>>> I don't have jited/non-jited numbers, but I suspect for micro-benchmarks
>>> the gap should be big. I was shooting for near native performance after
>>> JIT.
>>
>> Ohh, I meant it would be interesting to see a comparison of e.g. common
>> libpcap high-level filters that are in 32bit BPF + JIT (current code) vs
>> 64bit BPF + JIT (new code). I'm wondering how 32bit-only archs should be
>> handled so as not to regress in evaluation performance compared to the
>> current code.
>
> Agreed. If we want to rip out the old bpf interpreter and replace it with an
> old->new converter plus the new bpf interpreter, their performance should be
> very close. In the grand scheme some difference is ok, since libpcap bpf
> filters are not on a hot path: so much is happening before and after that
> tcpdump won't notice whether the filter was jited or not. cls_bpf is a
> different story, though I don't know what specific use case you have there.
> Could you define a bpf micro benchmark? cls_bpf with pktgen?

Well, that's just one example; it's not necessarily about cls_bpf, e.g. just
a modified pktgen where the skb goes through new/old BPF filters on the local
output path and the remote machine queries nic counters (e.g. ifpps, an ixia
box or something else suitable). There's probably something even simpler and
suitable for comparing both, so it doesn't necessarily have to be this
scenario. It's just to have a basic comparison to see where we would stand
e.g. in the case of 32/64bit architectures. Otherwise, regarding your other
email, sounds like convergence at least to _me_. Details can then still be
discussed at particular steps, but integrating this step by step into the
existing architecture would be a good start, imho.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-06  0:27 ` David Miller
@ 2014-02-06  0:57   ` Alexei Starovoitov
  0 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-06  0:57 UTC (permalink / raw)
  To: David Miller
  Cc: mingo, rostedt, a.p.zijlstra, hpa, tglx, masami.hiramatsu.pt,
	tom.zanussi, jovi.zhangwei, Eric Dumazet, torvalds, akpm,
	fweisbec, acme, penberg, arjan, hch, linux-kernel

On Wed, Feb 5, 2014 at 4:27 PM, David Miller <davem@redhat.com> wrote:
>
> From: Alexei Starovoitov <ast@plumgrid.com>
> Date: Wed,  5 Feb 2014 16:10:00 -0800
>
> > this patch set addresses main sticking points of the previous discussion:
> > http://thread.gmane.org/gmane.linux.kernel/1605783
>
> You really need to properly CC: netdev on this patch series.

Sure. Happy to extend the audience.
Will repost with netdev included.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
  2014-02-06  0:10 Alexei Starovoitov
@ 2014-02-06  0:27 ` David Miller
  2014-02-06  0:57   ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: David Miller @ 2014-02-06  0:27 UTC (permalink / raw)
  To: ast
  Cc: mingo, rostedt, a.p.zijlstra, hpa, tglx, masami.hiramatsu.pt,
	tom.zanussi, jovi.zhangwei, edumazet, torvalds, akpm, fweisbec,
	acme, penberg, arjan, hch, linux-kernel

From: Alexei Starovoitov <ast@plumgrid.com>
Date: Wed,  5 Feb 2014 16:10:00 -0800

> this patch set addresses main sticking points of the previous discussion:
> http://thread.gmane.org/gmane.linux.kernel/1605783

You really need to properly CC: netdev on this patch series.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters
@ 2014-02-06  0:10 Alexei Starovoitov
  2014-02-06  0:27 ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2014-02-06  0:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, Peter Zijlstra, H. Peter Anvin, Thomas Gleixner,
	Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei, Eric Dumazet,
	Linus Torvalds, Andrew Morton, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Pekka Enberg, David S. Miller,
	Arjan van de Ven, Christoph Hellwig, linux-kernel

Hi All,

this patch set addresses main sticking points of the previous discussion:
http://thread.gmane.org/gmane.linux.kernel/1605783

Main difference:
. all components are now in one place
  tools/bpf/llvm - standalone LLVM backend for extended BPF instruction set

. regs.si, regs.di accessors are replaced with arg1, arg2

. compiler enforces presence of 'license' string in source C code
  kernel enforces GPL compatibility of BPF program

Why bother with it?
Current 32-bit BPF is safe, but limited.
Kernel modules are 'all-goes', but not safe.
Extended 64-bit BPF provides safe and restricted kernel modules.

Just like the first two, extended BPF can be used for all sorts of things:
initially for tracing/debugging/[ks]tap-like use without vmlinux around,
then for networking, security, etc.

To make existing kernel modules safe, an x86 disassembler and code analyzer
are needed. We've tried to follow that path. The disassembler was
straightforward, but the x86 analyzer was becoming unbearably complex due to
the variety of addressing modes, so we started to hack GCC to reduce the
output x86 insns, and faced the headache of redoing the disasm/analyzer for
arm and other archs. Plus there is the old 32-bit bpf insn set already.
On one side extended BPF is a 64-bit extension to current BPF.
On the other side it's a common subset of x86-64/aarch64/... ISAs:
a generic 64-bit insn set that can be JITed to native HW one to one.

Tested on x86-64 and i386.
BPF core was tested on arm-v7.

V2 vs V1 details:
0001-Extended-BPF-core-framework:
  no difference to instruction set
  new bpf image format to include license string and enforcement during load

0002-Extended-BPF-JIT-for-x86-64: no changes

0003-Extended-BPF-64-bit-BPF-design-document: no changes

0004-Revert-x86-ptrace-Remove-unused-regs_get_argument:
  restoring Masami's get_Nth_argument accessor to simplify kprobe filters

0005-use-BPF-in-tracing-filters: minor changes to switch from si/di to argN

0006-LLVM-BPF-backend: standalone BPF backend for LLVM
  requires: apt-get install llvm-3.2-dev clang
  compiles in 7 seconds, links with the rest of llvm infra
  compatible with llvm 3.2, 3.3 and the just-released 3.4
  Written in llvm coding style and under the llvm license, so it can be
  upstreamed into the llvm tree

0007-tracing-filter-examples-in-BPF:
  tools/bpf/filter_check: userspace pre-checker of BPF filter
  runs the same bpf_check() code as kernel does

  tools/bpf/examples/netif_rcv.c:
-----
#define DESC(NAME) __attribute__((section(NAME), used))
void my_filter(struct bpf_context *ctx)
{
        char devname[4] = "lo";
        struct net_device *dev;
        struct sk_buff *skb = 0;

        /*
         * for tracepoints arg1 is the 1st arg of TP_ARGS() macro
         * defined in include/trace/events/.h
         * for kprobe events arg1 is the 1st arg of probed function
         */
        skb = (struct sk_buff *)ctx->arg1;

        dev = bpf_load_pointer(&skb->dev);
        if (bpf_memcmp(dev->name, devname, 2) == 0) {
                char fmt[] = "skb %p dev %p \n";
                bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)dev, 0);
        }
}
/* filter code license: */
char license[] DESC("license") = "GPL";
-----

$cd tools/bpf/examples
$make
  compile it using clang+llvm_bpf
$make check
  check safety
$make try
  attach this filter to net:netif_receive_skb and kprobe __netif_receive_skb
  and try ping

dropmon.c is a demo of a faster version of net_dropmonitor:
-----
/* attaches to /sys/kernel/debug/tracing/events/skb/kfree_skb */
void dropmon(struct bpf_context *ctx)
{
        void *loc;
        uint64_t *drop_cnt;

        /*
         * skb:kfree_skb is defined as:
         * TRACE_EVENT(kfree_skb,
         *         TP_PROTO(struct sk_buff *skb, void *location),
         * so ctx->arg2 is 'location'
         */
        loc = (void *)ctx->arg2;

        drop_cnt = bpf_table_lookup(ctx, 0, &loc);
        if (drop_cnt) {
                __sync_fetch_and_add(drop_cnt, 1);
        } else {
                uint64_t init = 0;
                bpf_table_update(ctx, 0, &loc, &init);
        }
}
struct bpf_table t[] DESC("bpftables") = {
        {BPF_TABLE_HASH, sizeof(void *), sizeof(uint64_t), 4096, 0}
};
/* filter code license: */
char l[] DESC("license") = "GPL v2";
-----
It's not fully functional yet. Minimal work remains to implement
bpf_table_lookup()/bpf_table_update() in the kernel
and userspace access to the filter's table.

This example demonstrates that some interesting events don't always have to
be fed into userspace, but can be pre-processed in the kernel.
tools/perf/scripts/python/net_dropmonitor.py would need to read the bpf table
from the kernel (via debugfs or netlink) and print it in a nice format.
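
As an illustration only: if the kernel were to expose the table through a
debugfs file of "<location> <count>" lines (a made-up interface and path;
the kernel side doesn't exist yet, see TODO below), a userspace reader could
be as small as:
-----
#include <stdio.h>
#include <inttypes.h>

int main(void)
{
	/* hypothetical path -- not implemented in this patch set */
	FILE *f = fopen("/sys/kernel/debug/bpf/dropmon_table", "r");
	uint64_t loc, cnt;

	if (!f) {
		perror("dropmon_table");
		return 1;
	}
	/* each line: <kernel address in hex> <drop count> */
	while (fscanf(f, "%" SCNx64 " %" SCNu64, &loc, &cnt) == 2)
		printf("location 0x%" PRIx64 ": %" PRIu64 " drops\n",
		       loc, cnt);
	fclose(f);
	return 0;
}
-----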

Same as in V1, BPF filters are called before tracepoints store the TP_STRUCT
fields, since the performance advantage is significant.

TODO:

- complete 'dropmonitor': finish bpf hashtable and userspace access to it

- add multi-probe support, so that one C program can specify multiple
  functions for different probe points (similar to [ks]tap)

- add 'lsmod' like facility to list all loaded BPF filters

- add an -m32 flag to llvm, so that C pointers are 32-bit,
  but the emitted BPF is still 64-bit.
  Useful for kernel struct walking in a BPF program on 32-bit archs

- finish testing on arm

- teach llvm to store line numbers in the BPF image, so that bpf_check()
  can print nice errors when a program is not safe

- allow read-only "strings" in C code
  today the analyzer can only verify the safety of: char s[] = "string"; bpf_print(s);
  but bpf_print("string"); cannot be proven safe yet

- write JIT from BPF to aarch64

If the direction is ok, I would like to commit this part to a branch of the
tip tree or the staging tree and continue working there.
Future deltas will be easier to review.

Thanks

Alexei Starovoitov (7):
  Extended BPF core framework
  Extended BPF JIT for x86-64
  Extended BPF (64-bit BPF) design document
  Revert "x86/ptrace: Remove unused regs_get_argument_nth API"
  use BPF in tracing filters
  LLVM BPF backend
  tracing filter examples in BPF

 Documentation/bpf_jit.txt                          |  204 ++++
 arch/x86/Kconfig                                   |    1 +
 arch/x86/include/asm/ptrace.h                      |    3 +
 arch/x86/kernel/ptrace.c                           |   24 +
 arch/x86/net/Makefile                              |    1 +
 arch/x86/net/bpf64_jit_comp.c                      |  625 ++++++++++++
 arch/x86/net/bpf_jit_comp.c                        |   23 +-
 arch/x86/net/bpf_jit_comp.h                        |   35 +
 include/linux/bpf.h                                |  149 +++
 include/linux/bpf_jit.h                            |  134 +++
 include/linux/ftrace_event.h                       |    5 +
 include/trace/bpf_trace.h                          |   41 +
 include/trace/ftrace.h                             |   17 +
 kernel/Makefile                                    |    1 +
 kernel/bpf_jit/Makefile                            |    3 +
 kernel/bpf_jit/bpf_check.c                         | 1054 ++++++++++++++++++++
 kernel/bpf_jit/bpf_run.c                           |  511 ++++++++++
 kernel/trace/Kconfig                               |    1 +
 kernel/trace/Makefile                              |    1 +
 kernel/trace/bpf_trace_callbacks.c                 |  193 ++++
 kernel/trace/trace.c                               |    7 +
 kernel/trace/trace.h                               |   11 +-
 kernel/trace/trace_events.c                        |    9 +-
 kernel/trace/trace_events_filter.c                 |   61 +-
 kernel/trace/trace_kprobe.c                        |   15 +-
 lib/Kconfig.debug                                  |   15 +
 tools/bpf/examples/Makefile                        |   71 ++
 tools/bpf/examples/README.txt                      |   59 ++
 tools/bpf/examples/dropmon.c                       |   40 +
 tools/bpf/examples/netif_rcv.c                     |   34 +
 tools/bpf/filter_check/Makefile                    |   32 +
 tools/bpf/filter_check/README.txt                  |    3 +
 tools/bpf/filter_check/trace_filter_check.c        |  115 +++
 tools/bpf/llvm/LICENSE.TXT                         |   70 ++
 tools/bpf/llvm/Makefile.rules                      |  641 ++++++++++++
 tools/bpf/llvm/README.txt                          |   23 +
 tools/bpf/llvm/bld/.gitignore                      |    2 +
 tools/bpf/llvm/bld/Makefile                        |   27 +
 tools/bpf/llvm/bld/Makefile.common                 |   14 +
 tools/bpf/llvm/bld/Makefile.config                 |  124 +++
 .../llvm/bld/include/llvm/Config/AsmParsers.def    |    8 +
 .../llvm/bld/include/llvm/Config/AsmPrinters.def   |    9 +
 .../llvm/bld/include/llvm/Config/Disassemblers.def |    8 +
 tools/bpf/llvm/bld/include/llvm/Config/Targets.def |    9 +
 .../bpf/llvm/bld/include/llvm/Support/DataTypes.h  |   96 ++
 tools/bpf/llvm/bld/lib/Makefile                    |   11 +
 .../llvm/bld/lib/Target/BPF/InstPrinter/Makefile   |   10 +
 .../llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile  |   11 +
 tools/bpf/llvm/bld/lib/Target/BPF/Makefile         |   17 +
 .../llvm/bld/lib/Target/BPF/TargetInfo/Makefile    |   10 +
 tools/bpf/llvm/bld/lib/Target/Makefile             |   11 +
 tools/bpf/llvm/bld/tools/Makefile                  |   12 +
 tools/bpf/llvm/bld/tools/llc/Makefile              |   15 +
 tools/bpf/llvm/lib/Target/BPF/BPF.h                |   30 +
 tools/bpf/llvm/lib/Target/BPF/BPF.td               |   29 +
 tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp    |  100 ++
 tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp      |   62 ++
 tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td    |   24 +
 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp |   36 +
 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h   |   35 +
 tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp  |  182 ++++
 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp  |  676 +++++++++++++
 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h    |  105 ++
 tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td   |   29 +
 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp     |  162 +++
 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h       |   53 +
 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td      |  455 +++++++++
 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp   |   77 ++
 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h     |   40 +
 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp  |  122 +++
 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h    |   65 ++
 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td   |   39 +
 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp     |   23 +
 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h       |   33 +
 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp |   72 ++
 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h   |   69 ++
 .../lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp  |   79 ++
 .../lib/Target/BPF/InstPrinter/BPFInstPrinter.h    |   34 +
 .../lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp  |   85 ++
 .../llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h |   33 +
 .../Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp |  119 +++
 .../lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h     |   34 +
 .../Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp   |  120 +++
 .../lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h |   67 ++
 .../Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp    |  115 +++
 .../lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h  |   56 ++
 .../lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp    |   13 +
 tools/bpf/llvm/tools/llc/llc.cpp                   |  381 +++++++
 88 files changed, 8255 insertions(+), 25 deletions(-)
 create mode 100644 Documentation/bpf_jit.txt
 create mode 100644 arch/x86/net/bpf64_jit_comp.c
 create mode 100644 arch/x86/net/bpf_jit_comp.h
 create mode 100644 include/linux/bpf.h
 create mode 100644 include/linux/bpf_jit.h
 create mode 100644 include/trace/bpf_trace.h
 create mode 100644 kernel/bpf_jit/Makefile
 create mode 100644 kernel/bpf_jit/bpf_check.c
 create mode 100644 kernel/bpf_jit/bpf_run.c
 create mode 100644 kernel/trace/bpf_trace_callbacks.c
 create mode 100644 tools/bpf/examples/Makefile
 create mode 100644 tools/bpf/examples/README.txt
 create mode 100644 tools/bpf/examples/dropmon.c
 create mode 100644 tools/bpf/examples/netif_rcv.c
 create mode 100644 tools/bpf/filter_check/Makefile
 create mode 100644 tools/bpf/filter_check/README.txt
 create mode 100644 tools/bpf/filter_check/trace_filter_check.c
 create mode 100644 tools/bpf/llvm/LICENSE.TXT
 create mode 100644 tools/bpf/llvm/Makefile.rules
 create mode 100644 tools/bpf/llvm/README.txt
 create mode 100644 tools/bpf/llvm/bld/.gitignore
 create mode 100644 tools/bpf/llvm/bld/Makefile
 create mode 100644 tools/bpf/llvm/bld/Makefile.common
 create mode 100644 tools/bpf/llvm/bld/Makefile.config
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/AsmParsers.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/AsmPrinters.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/Disassemblers.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Config/Targets.def
 create mode 100644 tools/bpf/llvm/bld/include/llvm/Support/DataTypes.h
 create mode 100644 tools/bpf/llvm/bld/lib/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/InstPrinter/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/MCTargetDesc/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/BPF/TargetInfo/Makefile
 create mode 100644 tools/bpf/llvm/bld/lib/Target/Makefile
 create mode 100644 tools/bpf/llvm/bld/tools/Makefile
 create mode 100644 tools/bpf/llvm/bld/tools/llc/Makefile
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPF.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPF.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFAsmPrinter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFCFGFixup.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFCallingConv.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFFrameLowering.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFISelLowering.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrFormats.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFMCInstLower.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFRegisterInfo.td
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFSubtarget.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/BPFTargetMachine.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/InstPrinter/BPFInstPrinter.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFBaseInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFELFObjectWriter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.h
 create mode 100644 tools/bpf/llvm/lib/Target/BPF/TargetInfo/BPFTargetInfo.cpp
 create mode 100644 tools/bpf/llvm/tools/llc/llc.cpp

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2014-02-15 16:14 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-06  1:10 [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Alexei Starovoitov
2014-02-06  1:10 ` [RFC PATCH v2 tip 1/7] Extended BPF core framework Alexei Starovoitov
2014-02-06  1:10 ` [RFC PATCH v2 tip 2/7] Extended BPF JIT for x86-64 Alexei Starovoitov
2014-02-06  1:10 ` [RFC PATCH v2 tip 3/7] Extended BPF (64-bit BPF) design document Alexei Starovoitov
2014-02-06  1:10 ` [RFC PATCH v2 tip 4/7] Revert "x86/ptrace: Remove unused regs_get_argument_nth API" Alexei Starovoitov
2014-02-06  1:10 ` [RFC PATCH v2 tip 5/7] use BPF in tracing filters Alexei Starovoitov
2014-02-06  1:10 ` [RFC PATCH v2 tip 6/7] LLVM BPF backend Alexei Starovoitov
2014-02-06  1:10 ` [RFC PATCH v2 tip 7/7] tracing filter examples in BPF Alexei Starovoitov
2014-02-06 10:42 ` [RFC PATCH v2 tip 0/7] 64-bit BPF insn set and tracing filters Daniel Borkmann
2014-02-07  1:20   ` Alexei Starovoitov
2014-02-13 20:20     ` Daniel Borkmann
2014-02-13 22:22       ` Daniel Borkmann
2014-02-14  0:59         ` Alexei Starovoitov
2014-02-14 17:02           ` Daniel Borkmann
2014-02-14 17:55             ` Alexei Starovoitov
2014-02-15 16:13               ` Daniel Borkmann
2014-02-14  4:47       ` Alexei Starovoitov
2014-02-14 17:27         ` Daniel Borkmann
2014-02-14 20:17           ` Alexei Starovoitov
2014-02-13 22:32     ` H. Peter Anvin
2014-02-13 22:44       ` Daniel Borkmann
2014-02-13 22:47         ` H. Peter Anvin
2014-02-13 22:55           ` Daniel Borkmann
  -- strict thread matches above, loose matches on Subject: below --
2014-02-06  0:10 Alexei Starovoitov
2014-02-06  0:27 ` David Miller
2014-02-06  0:57   ` Alexei Starovoitov
