* [PATCH v2 net-next 0/4] net: filter: BPF updates
@ 2013-03-19 14:33 Daniel Borkmann
  2013-03-19 14:34 ` [PATCH v2 net-next 1/4] flow_keys: include thoff into flow_keys for later usage Daniel Borkmann
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Daniel Borkmann @ 2013-03-19 14:33 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, jasowang

This set adds i) an ancillary operation to the BPF engine and ii) a
BPF JIT image disassembler in order to verify or debug the BPF JIT
compilers under arch/*/net/.

v1 -> v2:
	- No need to reorder choke_skb_cb structure

Daniel Borkmann (4):
  flow_keys: include thoff into flow_keys for later usage
  net: flow_dissector: add __skb_get_poff to get a start offset to payload
  filter: add ANC_PAY_OFFSET instruction for loading payload start offset
  filter: add minimal BPF JIT emitted image disassembler

 include/linux/filter.h      |   1 +
 include/linux/skbuff.h      |   2 +
 include/net/flow_keys.h     |   1 +
 include/uapi/linux/filter.h |   3 +-
 net/core/filter.c           |   5 +
 net/core/flow_dissector.c   |  62 ++++++++++++-
 scripts/bpf_jit_disasm.c    | 216 ++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 288 insertions(+), 2 deletions(-)
 create mode 100644 scripts/bpf_jit_disasm.c

-- 
1.7.11.7


* [PATCH v2 net-next 1/4] flow_keys: include thoff into flow_keys for later usage
  2013-03-19 14:33 [PATCH v2 net-next 0/4] net: filter: BPF updates Daniel Borkmann
@ 2013-03-19 14:34 ` Daniel Borkmann
  2013-03-19 15:03   ` Eric Dumazet
  2013-03-19 14:34 ` [PATCH v2 net-next 2/4] net: flow_dissector: add __skb_get_poff to get a start offset to payload Daniel Borkmann
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2013-03-19 14:34 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, jasowang

In skb_flow_dissect(), we dissect an skbuff. Since we're doing the work
here anyway, also store thoff (the transport header offset) for later use,
e.g. in the BPF filter. Also, because thoff is only 16 bits wide, we do not
need to pack flow_keys or reorder choke_skb_cb.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
 This patch also needs to go into the net tree, since Eric or Jason will
 post a bug fix on top of this one.

 include/net/flow_keys.h   | 1 +
 net/core/flow_dissector.c | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/net/flow_keys.h b/include/net/flow_keys.h
index 80461c1..bb8271d 100644
--- a/include/net/flow_keys.h
+++ b/include/net/flow_keys.h
@@ -9,6 +9,7 @@ struct flow_keys {
 		__be32 ports;
 		__be16 port16[2];
 	};
+	u16 thoff;
 	u8 ip_proto;
 };
 
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index f8d9e03..eb9dde1 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -23,7 +23,8 @@ static void iph_to_flow_copy_addrs(struct flow_keys *flow, const struct iphdr *i
 
 bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys *flow)
 {
-	int poff, nhoff = skb_network_offset(skb);
+	int poff;
+	u16 nhoff = skb_network_offset(skb);
 	u8 ip_proto;
 	__be16 proto = skb->protocol;
 
@@ -151,6 +152,8 @@ ipv6:
 			flow->ports = *ports;
 	}
 
+	flow->thoff = nhoff;
+
 	return true;
 }
 EXPORT_SYMBOL(skb_flow_dissect);
-- 
1.7.11.7


* [PATCH v2 net-next 2/4] net: flow_dissector: add __skb_get_poff to get a start offset to payload
  2013-03-19 14:33 [PATCH v2 net-next 0/4] net: filter: BPF updates Daniel Borkmann
  2013-03-19 14:34 ` [PATCH v2 net-next 1/4] flow_keys: include thoff into flow_keys for later usage Daniel Borkmann
@ 2013-03-19 14:34 ` Daniel Borkmann
  2013-03-19 15:03   ` Eric Dumazet
  2013-03-19 14:34 ` [PATCH v2 net-next 3/4] filter: add ANC_PAY_OFFSET instruction for loading payload start offset Daniel Borkmann
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2013-03-19 14:34 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, jasowang

__skb_get_poff() returns the offset to the payload, as far as the packet
could be dissected. The main user is currently BPF, which can use it to
truncate packets dynamically so that only the headers, and not the actual
payload, are pushed to user space for analysis.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
 include/linux/skbuff.h    |  2 ++
 net/core/flow_dissector.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index eb2106f..0e84fd8 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2835,6 +2835,8 @@ static inline void skb_checksum_none_assert(const struct sk_buff *skb)
 
 bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off);
 
+u32 __skb_get_poff(const struct sk_buff *skb);
+
 /**
  * skb_head_is_locked - Determine if the skb->head is locked down
  * @skb: skb to check
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index eb9dde1..8213da7 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -5,6 +5,10 @@
 #include <linux/if_vlan.h>
 #include <net/ip.h>
 #include <net/ipv6.h>
+#include <linux/igmp.h>
+#include <linux/icmp.h>
+#include <linux/sctp.h>
+#include <linux/dccp.h>
 #include <linux/if_tunnel.h>
 #include <linux/if_pppox.h>
 #include <linux/ppp_defs.h>
@@ -229,6 +233,59 @@ u16 __skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb,
 }
 EXPORT_SYMBOL(__skb_tx_hash);
 
+/* __skb_get_poff() returns the offset to the payload as far as it could
+ * be dissected. The main user is currently BPF, so that we can dynamically
+ * truncate packets without needing to push actual payload to the user
+ * space and can analyze headers only, instead.
+ */
+u32 __skb_get_poff(const struct sk_buff *skb)
+{
+	struct flow_keys keys;
+	u32 poff = 0;
+
+	if (!skb_flow_dissect(skb, &keys))
+		return 0;
+
+	poff += keys.thoff;
+	switch (keys.ip_proto) {
+	case IPPROTO_TCP: {
+		const struct tcphdr *tcph;
+		struct tcphdr _tcph;
+
+		tcph = skb_header_pointer(skb, poff, sizeof(_tcph), &_tcph);
+		if (!tcph)
+			return poff;
+
+		poff += max_t(u32, sizeof(struct tcphdr), tcph->doff * 4);
+		break;
+	}
+	case IPPROTO_UDP:
+	case IPPROTO_UDPLITE:
+		poff += sizeof(struct udphdr);
+		break;
+	/* For the rest, we do not really care about header
+	 * extensions at this point for now.
+	 */
+	case IPPROTO_ICMP:
+		poff += sizeof(struct icmphdr);
+		break;
+	case IPPROTO_ICMPV6:
+		poff += sizeof(struct icmp6hdr);
+		break;
+	case IPPROTO_IGMP:
+		poff += sizeof(struct igmphdr);
+		break;
+	case IPPROTO_DCCP:
+		poff += sizeof(struct dccp_hdr);
+		break;
+	case IPPROTO_SCTP:
+		poff += sizeof(struct sctphdr);
+		break;
+	}
+
+	return poff;
+}
+
 static inline u16 dev_cap_txqueue(struct net_device *dev, u16 queue_index)
 {
 	if (unlikely(queue_index >= dev->real_num_tx_queues)) {
-- 
1.7.11.7
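
For readers who want to see the TCP branch in isolation, here is a rough
user space sketch of the same arithmetic on a raw byte buffer. It is not
the kernel code: tcp_payload_offset() and TCP_MIN_HDR_LEN are names made up
for this illustration, and the bounds check only stands in for what
skb_header_pointer() does in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant for this sketch: the minimal TCP header length in
 * bytes, i.e. what sizeof(struct tcphdr) evaluates to in the patch. */
#define TCP_MIN_HDR_LEN 20u

/* Return the payload offset of a TCP segment whose TCP header starts at
 * byte thoff of pkt. If the fixed header cannot be read in full, return
 * thoff unchanged, mirroring the early "return poff" in the patch. */
static uint32_t tcp_payload_offset(const uint8_t *pkt, size_t len,
				   uint32_t thoff)
{
	uint32_t hdr_len;

	assert(pkt != NULL);
	if (len < thoff || len - thoff < TCP_MIN_HDR_LEN)
		return thoff;

	/* Data offset: upper 4 bits of byte 12, counted in 32-bit words. */
	hdr_len = (uint32_t) (pkt[thoff + 12] >> 4) * 4;

	/* Clamp bogus values below the minimum (max_t() in the patch). */
	if (hdr_len < TCP_MIN_HDR_LEN)
		hdr_len = TCP_MIN_HDR_LEN;

	return thoff + hdr_len;
}
```

With thoff = 34 (Ethernet plus an option-less IPv4 header) and a data
offset of 8 words, the sketch yields 34 + 32 = 66. A data offset below 5 is
clamped to the 20-byte minimum, so a malformed header cannot pull the
result back in front of the fixed TCP header.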


* [PATCH v2 net-next 3/4] filter: add ANC_PAY_OFFSET instruction for loading payload start offset
  2013-03-19 14:33 [PATCH v2 net-next 0/4] net: filter: BPF updates Daniel Borkmann
  2013-03-19 14:34 ` [PATCH v2 net-next 1/4] flow_keys: include thoff into flow_keys for later usage Daniel Borkmann
  2013-03-19 14:34 ` [PATCH v2 net-next 2/4] net: flow_dissector: add __skb_get_poff to get a start offset to payload Daniel Borkmann
@ 2013-03-19 14:34 ` Daniel Borkmann
  2013-03-19 15:04   ` Eric Dumazet
  2013-03-19 14:34 ` [PATCH v2 net-next 4/4] filter: add minimal BPF JIT emitted image disassembler Daniel Borkmann
  2013-03-19 14:38 ` [PATCH v2 net-next 0/4] net: filter: BPF updates David Miller
  4 siblings, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2013-03-19 14:34 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, jasowang

It is very useful to truncate packets dynamically. In particular, we're
interested in pushing only the necessary header bytes to user space and
cutting off user payload that should not be transferred for some reason
(e.g. privacy, speed, or others). With the ancillary extension PAY_OFFSET,
we can load the payload start offset into the accumulator and return it.
E.g. in bpfc syntax ...

        ld #poff        ; { 0x20, 0, 0, 0xfffff034 },
        ret a           ; { 0x16, 0, 0, 0x00000000 },

... as a filter will accomplish this without any elaborate hackery in the
BPF filter itself. Follow-up JIT implementations are welcome.

Thanks to Eric Dumazet for suggesting and discussing this during the
Netfilter Workshop in Copenhagen.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
 include/linux/filter.h      | 1 +
 include/uapi/linux/filter.h | 3 ++-
 net/core/filter.c           | 5 +++++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index c45eabc..d2059cb 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -126,6 +126,7 @@ enum {
 	BPF_S_ANC_SECCOMP_LD_W,
 	BPF_S_ANC_VLAN_TAG,
 	BPF_S_ANC_VLAN_TAG_PRESENT,
+	BPF_S_ANC_PAY_OFFSET,
 };
 
 #endif /* __LINUX_FILTER_H__ */
diff --git a/include/uapi/linux/filter.h b/include/uapi/linux/filter.h
index 9cfde69..8eb9cca 100644
--- a/include/uapi/linux/filter.h
+++ b/include/uapi/linux/filter.h
@@ -129,7 +129,8 @@ struct sock_fprog {	/* Required for SO_ATTACH_FILTER. */
 #define SKF_AD_ALU_XOR_X	40
 #define SKF_AD_VLAN_TAG	44
 #define SKF_AD_VLAN_TAG_PRESENT 48
-#define SKF_AD_MAX	52
+#define SKF_AD_PAY_OFFSET	52
+#define SKF_AD_MAX	56
 #define SKF_NET_OFF   (-0x100000)
 #define SKF_LL_OFF    (-0x200000)
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 2e20b55..dad2a17 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -348,6 +348,9 @@ load_b:
 		case BPF_S_ANC_VLAN_TAG_PRESENT:
 			A = !!vlan_tx_tag_present(skb);
 			continue;
+		case BPF_S_ANC_PAY_OFFSET:
+			A = __skb_get_poff(skb);
+			continue;
 		case BPF_S_ANC_NLATTR: {
 			struct nlattr *nla;
 
@@ -612,6 +615,7 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
 			ANCILLARY(ALU_XOR_X);
 			ANCILLARY(VLAN_TAG);
 			ANCILLARY(VLAN_TAG_PRESENT);
+			ANCILLARY(PAY_OFFSET);
 			}
 
 			/* ancillary operation unknown or unsupported */
@@ -814,6 +818,7 @@ static void sk_decode_filter(struct sock_filter *filt, struct sock_filter *to)
 		[BPF_S_ANC_SECCOMP_LD_W] = BPF_LD|BPF_B|BPF_ABS,
 		[BPF_S_ANC_VLAN_TAG]	= BPF_LD|BPF_B|BPF_ABS,
 		[BPF_S_ANC_VLAN_TAG_PRESENT] = BPF_LD|BPF_B|BPF_ABS,
+		[BPF_S_ANC_PAY_OFFSET]	= BPF_LD|BPF_B|BPF_ABS,
 		[BPF_S_LD_W_LEN]	= BPF_LD|BPF_W|BPF_LEN,
 		[BPF_S_LD_W_IND]	= BPF_LD|BPF_W|BPF_IND,
 		[BPF_S_LD_H_IND]	= BPF_LD|BPF_H|BPF_IND,
-- 
1.7.11.7
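
As a cross-check of the encoding quoted in the changelog, the two-instruction
filter can be written with the standard BPF_STMT() macro from the uapi
header. This is only a sketch: build_poff_filter() is a hypothetical helper
name, and the #ifndef fallback assumes the value 52 that this patch assigns
to SKF_AD_PAY_OFFSET, for headers that predate it.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/filter.h>

/* Fallback for older headers; 52 is the value this patch introduces. */
#ifndef SKF_AD_PAY_OFFSET
#define SKF_AD_PAY_OFFSET 52
#endif

/* Fill prog[] with the "keep headers only" filter from the changelog and
 * return the number of instructions written. */
static unsigned short build_poff_filter(struct sock_filter prog[2])
{
	assert(prog != NULL);

	/* ld #poff: absolute word load from the ancillary offset range */
	prog[0] = (struct sock_filter) BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
						SKF_AD_OFF + SKF_AD_PAY_OFFSET);
	/* ret a: accept the packet, truncated to A bytes */
	prog[1] = (struct sock_filter) BPF_STMT(BPF_RET | BPF_A, 0);

	return 2;
}
```

SKF_AD_OFF is -0x1000, so the k field of the first instruction becomes
0xfffff000 + 0x34 = 0xfffff034, matching the bpfc output above. The array
would then be wrapped in a struct sock_fprog and handed to
setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, ...) as usual.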


* [PATCH v2 net-next 4/4] filter: add minimal BPF JIT emitted image disassembler
  2013-03-19 14:33 [PATCH v2 net-next 0/4] net: filter: BPF updates Daniel Borkmann
                   ` (2 preceding siblings ...)
  2013-03-19 14:34 ` [PATCH v2 net-next 3/4] filter: add ANC_PAY_OFFSET instruction for loading payload start offset Daniel Borkmann
@ 2013-03-19 14:34 ` Daniel Borkmann
  2013-03-19 15:05   ` Eric Dumazet
  2013-03-19 14:38 ` [PATCH v2 net-next 0/4] net: filter: BPF updates David Miller
  4 siblings, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2013-03-19 14:34 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, jasowang, Eric Dumazet

This is a minimal stand-alone user space helper that allows debugging or
verifying emitted BPF JIT images. It is particularly useful for debugging
emitted opcodes, since even minor bugs in the JIT compiler can be fatal.
The disassembler is architecture-generic and uses libopcodes and libbfd.

Example of how to get the disassembly:

  1) `echo 2 > /proc/sys/net/core/bpf_jit_enable`
  2) Load a BPF filter (e.g. `tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24`)
  3) Run e.g. `bpf_jit_disasm -o` to disassemble the most recent JIT code output

`bpf_jit_disasm -o` also displays the opcodes belonging to each instruction.
Example for x86_64:

$ ./bpf_jit_disasm
94 bytes emitted from JIT compiler (pass:3, flen:9)
ffffffffa0356000 + <x>:
   0:	push   %rbp
   1:	mov    %rsp,%rbp
   4:	sub    $0x60,%rsp
   8:	mov    %rbx,-0x8(%rbp)
   c:	mov    0x68(%rdi),%r9d
  10:	sub    0x6c(%rdi),%r9d
  14:	mov    0xe0(%rdi),%r8
  1b:	mov    $0xc,%esi
  20:	callq  0xffffffffe0d01b71
  25:	cmp    $0x86dd,%eax
  2a:	jne    0x000000000000003d
  2c:	mov    $0x14,%esi
  31:	callq  0xffffffffe0d01b8d
  36:	cmp    $0x6,%eax
[...]
  5c:	leaveq
  5d:	retq

$ ./bpf_jit_disasm -o
94 bytes emitted from JIT compiler (pass:3, flen:9)
ffffffffa0356000 + <x>:
   0:	push   %rbp
	55
   1:	mov    %rsp,%rbp
	48 89 e5
   4:	sub    $0x60,%rsp
	48 83 ec 60
   8:	mov    %rbx,-0x8(%rbp)
	48 89 5d f8
   c:	mov    0x68(%rdi),%r9d
	44 8b 4f 68
  10:	sub    0x6c(%rdi),%r9d
	44 2b 4f 6c
[...]
  5c:	leaveq
	c9
  5d:	retq
	c3

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
 scripts/bpf_jit_disasm.c | 216 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 216 insertions(+)
 create mode 100644 scripts/bpf_jit_disasm.c

diff --git a/scripts/bpf_jit_disasm.c b/scripts/bpf_jit_disasm.c
new file mode 100644
index 0000000..1fe9fb5
--- /dev/null
+++ b/scripts/bpf_jit_disasm.c
@@ -0,0 +1,216 @@
+/*
+ * Minimal BPF JIT image disassembler
+ *
+ * Disassembles BPF JIT compiler emitted opcodes back to asm insn's for
+ * debugging or verification purposes.
+ *
+ * There is no Makefile. Compile with
+ *
+ *   `gcc -Wall -O2 bpf_jit_disasm.c -o bpf_jit_disasm -lopcodes -lbfd -ldl`
+ *
+ * or similar.
+ *
+ * To get the disassembly of the JIT code, do the following:
+ *
+ *  1) `echo 2 > /proc/sys/net/core/bpf_jit_enable`
+ *  2) Load a BPF filter (e.g. `tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24`)
+ *  3) Run e.g. `./bpf_jit_disasm -o` to read out the last JIT code
+ *
+ * Copyright 2013 Daniel Borkmann <borkmann@redhat.com>
+ * Licensed under the GNU General Public License, version 2.0 (GPLv2)
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <assert.h>
+#include <unistd.h>
+#include <string.h>
+#include <bfd.h>
+#include <dis-asm.h>
+#include <sys/klog.h>
+#include <sys/types.h>
+#include <regex.h>
+
+#define VERSION_STRING	"1.0"
+
+static void get_exec_path(char *tpath, size_t size)
+{
+	char *path;
+	ssize_t len;
+
+	snprintf(tpath, size, "/proc/%d/exe", (int) getpid());
+	tpath[size - 1] = 0;
+
+	path = strdup(tpath);
+	assert(path);
+
+	len = readlink(path, tpath, size);
+	tpath[len] = 0;
+
+	free(path);
+}
+
+static void get_asm_insns(uint8_t *image, size_t len, unsigned long base,
+			  int opcodes)
+{
+	int count, i, pc = 0;
+	char tpath[256];
+	struct disassemble_info info;
+	disassembler_ftype disassemble;
+	bfd *bfdf;
+
+	memset(tpath, 0, sizeof(tpath));
+	get_exec_path(tpath, sizeof(tpath));
+
+	bfdf = bfd_openr(tpath, NULL);
+	assert(bfdf);
+	assert(bfd_check_format(bfdf, bfd_object));
+
+	init_disassemble_info(&info, stdout, (fprintf_ftype) fprintf);
+	info.arch = bfd_get_arch(bfdf);
+	info.mach = bfd_get_mach(bfdf);
+	info.buffer = image;
+	info.buffer_length = len;
+
+	disassemble_init_for_target(&info);
+
+	disassemble = disassembler(bfdf);
+	assert(disassemble);
+
+	do {
+		printf("%4x:\t", pc);
+
+		count = disassemble(pc, &info);
+
+		if (opcodes) {
+			printf("\n\t");
+			for (i = 0; i < count; ++i)
+				printf("%02x ", (uint8_t) image[pc + i]);
+		}
+		printf("\n");
+
+		pc += count;
+	} while(count > 0 && pc < len);
+
+	bfd_close(bfdf);
+}
+
+static char *get_klog_buff(int *klen)
+{
+	int ret, len = klogctl(10, NULL, 0);
+	char *buff = malloc(len);
+
+	assert(buff && klen);
+	ret = klogctl(3, buff, len);
+	assert(ret >= 0);
+	*klen = ret;
+
+	return buff;
+}
+
+static void put_klog_buff(char *buff)
+{
+	free(buff);
+}
+
+static int get_last_jit_image(char *haystack, size_t hlen,
+			      uint8_t *image, size_t ilen,
+			      unsigned long *base)
+{
+	char *ptr, *pptr, *tmp;
+	off_t off = 0;
+	int ret, flen, proglen, pass, ulen = 0;
+	regmatch_t pmatch[1];
+	regex_t regex;
+
+	if (hlen == 0)
+		return 0;
+
+	ret = regcomp(&regex, "flen=[[:alnum:]]+ proglen=[[:digit:]]+ "
+		      "pass=[[:digit:]]+ image=[[:xdigit:]]+", REG_EXTENDED);
+	assert(ret == 0);
+
+	ptr = haystack;
+	while (1) {
+		ret = regexec(&regex, ptr, 1, pmatch, 0);
+		if (ret == 0) {
+			ptr += pmatch[0].rm_eo;
+			off += pmatch[0].rm_eo;
+			assert(off < hlen);
+		} else
+			break;
+	}
+
+	ptr = haystack + off - (pmatch[0].rm_eo - pmatch[0].rm_so);
+	ret = sscanf(ptr, "flen=%d proglen=%d pass=%d image=%lx",
+		     &flen, &proglen, &pass, base);
+	if (ret != 4)
+		return 0;
+
+	tmp = ptr = haystack + off;
+	while ((ptr = strtok(tmp, "\n")) != NULL && ulen < ilen) {
+		tmp = NULL;
+		if (!strstr(ptr, "JIT code"))
+			continue;
+		pptr = ptr;
+		while ((ptr = strstr(pptr, ":")))
+			pptr = ptr + 1;
+		ptr = pptr;
+		do {
+			image[ulen++] = (uint8_t) strtoul(pptr, &pptr, 16);
+			if (ptr == pptr || ulen >= ilen) {
+				ulen--;
+				break;
+			}
+			ptr = pptr;
+		} while (1);
+	}
+
+	assert(ulen == proglen);
+	printf("%d bytes emitted from JIT compiler (pass:%d, flen:%d)\n",
+	       proglen, pass, flen);
+	printf("%lx + <x>:\n", *base);
+
+	regfree(&regex);
+	return ulen;
+}
+
+static void help(void)
+{
+	printf("Usage: bpf_jit_disasm [-ohv]\n");
+	printf("Version %s, written by Daniel Borkmann <borkmann@redhat.com>\n",
+	       VERSION_STRING);
+	printf("  -o                             Include opcodes in output\n");
+	printf("  -h|-v                          Show help/version\n");
+	exit(0);
+}
+
+int main(int argc, char **argv)
+{
+	int len, klen, opcodes = 0;
+	char *kbuff;
+	unsigned long base;
+	uint8_t image[4096];
+
+	if (argc > 1) {
+		if (!strncmp("-o", argv[argc - 1], 2))
+			opcodes = 1;
+		if (!strncmp("-h", argv[argc - 1], 2) ||
+		    !strncmp("-v", argv[argc - 1], 2))
+			help();
+	}
+
+	bfd_init();
+	memset(image, 0, sizeof(image));
+
+	kbuff = get_klog_buff(&klen);
+
+	len = get_last_jit_image(kbuff, klen, image, sizeof(image), &base);
+	if (len > 0 && base > 0)
+		get_asm_insns(image, len, base, opcodes);
+
+	put_klog_buff(kbuff);
+
+	return 0;
+}
-- 
1.7.11.7
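
The log-parsing step of the tool can be tried in isolation with a sketch
like the following. It is a simplification, not the code from the patch:
parse_jit_header() and parse_jit_code_line() are hypothetical helpers, and
the "JIT code: <offset>: <hex bytes>" line layout is an assumption about
what the kernel prints when bpf_jit_enable is set to 2.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse the summary line that precedes the hex dump.
 * Returns 1 if all four fields were found, 0 otherwise. */
static int parse_jit_header(const char *line, int *flen, int *proglen,
			    int *pass, unsigned long *image)
{
	return sscanf(line, "flen=%d proglen=%d pass=%d image=%lx",
		      flen, proglen, pass, image) == 4;
}

/* Decode the hex bytes following the last ':' of one "JIT code" line
 * into out[]. Returns the number of bytes stored (at most cap). */
static size_t parse_jit_code_line(const char *line, uint8_t *out, size_t cap)
{
	const char *p = strrchr(line, ':');
	char *end;
	size_t n = 0;

	assert(out != NULL);
	if (!p)
		return 0;
	p++;

	while (n < cap) {
		unsigned long v = strtoul(p, &end, 16);

		if (end == p)	/* no further hex digits on this line */
			break;
		out[n++] = (uint8_t) v;
		p = end;
	}

	return n;
}
```

Checking the cap before each store avoids the increment-then-undo pattern
of the original loop. The number of decoded bytes summed over all lines
should equal proglen from the header, which is what the tool asserts.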


* Re: [PATCH v2 net-next 0/4] net: filter: BPF updates
  2013-03-19 14:33 [PATCH v2 net-next 0/4] net: filter: BPF updates Daniel Borkmann
                   ` (3 preceding siblings ...)
  2013-03-19 14:34 ` [PATCH v2 net-next 4/4] filter: add minimal BPF JIT emitted image disassembler Daniel Borkmann
@ 2013-03-19 14:38 ` David Miller
  2013-03-19 14:42   ` Daniel Borkmann
  4 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2013-03-19 14:38 UTC (permalink / raw)
  To: dborkman; +Cc: netdev, eric.dumazet, jasowang

From: Daniel Borkmann <dborkman@redhat.com>
Date: Tue, 19 Mar 2013 15:33:59 +0100

> This set adds i) an ancillary operation to the BPF engine and ii) a
> BPF JIT image disassembler in order to verify or debug the BPF JIT
> compilers under arch/*/net/.
> 
> v1 -> v2:
> 	- No need to reorder choke_skb_cb structure

So we want the first patch in 'net' right?


* Re: [PATCH v2 net-next 0/4] net: filter: BPF updates
  2013-03-19 14:38 ` [PATCH v2 net-next 0/4] net: filter: BPF updates David Miller
@ 2013-03-19 14:42   ` Daniel Borkmann
  2013-03-19 14:51     ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2013-03-19 14:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, eric.dumazet, jasowang

On 03/19/2013 03:38 PM, David Miller wrote:
> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Tue, 19 Mar 2013 15:33:59 +0100
>
>> This set adds i) an ancillary operation to the BPF engine and ii) a
>> BPF JIT image disassembler in order to verify or debug the BPF JIT
>> compilers under arch/*/net/.
>>
>> v1 -> v2:
>> 	- No need to reorder choke_skb_cb structure
>
> So we want the first patch in 'net' right?

Eric and Jason mentioned that they need to do a bug fix, where they
could simplify code for the fix by having the 1st patch of this set
in the net tree as well.

However, all other patches of this set except the last one also depend
on the first one, so it would be net and net-next then. Eric, please
correct me if I'm wrong?


* Re: [PATCH v2 net-next 0/4] net: filter: BPF updates
  2013-03-19 14:42   ` Daniel Borkmann
@ 2013-03-19 14:51     ` David Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2013-03-19 14:51 UTC (permalink / raw)
  To: dborkman; +Cc: netdev, eric.dumazet, jasowang

From: Daniel Borkmann <dborkman@redhat.com>
Date: Tue, 19 Mar 2013 15:42:07 +0100

> On 03/19/2013 03:38 PM, David Miller wrote:
>> From: Daniel Borkmann <dborkman@redhat.com>
>> Date: Tue, 19 Mar 2013 15:33:59 +0100
>>
>>> This set adds i) an ancillary operation to the BPF engine and ii) a
>>> BPF JIT image disassembler in order to verify or debug the BPF JIT
>>> compilers under arch/*/net/.
>>>
>>> v1 -> v2:
>>> 	- No need to reorder choke_skb_cb structure
>>
>> So we want the first patch in 'net' right?
> 
> Eric and Jason mentioned that they need to do a bug fix, where they
> could simplify code for the fix by having the 1st patch of this set
> in the net tree as well.
> 
> However, all other patches of this set except the last one also depend
> on the first one, so it would be net and net-next then. Eric, please
> correct me if I'm wrong?

That's fine, I can handle such dependencies without any problems.


* Re: [PATCH v2 net-next 1/4] flow_keys: include thoff into flow_keys for later usage
  2013-03-19 14:34 ` [PATCH v2 net-next 1/4] flow_keys: include thoff into flow_keys for later usage Daniel Borkmann
@ 2013-03-19 15:03   ` Eric Dumazet
  2013-03-19 15:06     ` Daniel Borkmann
  2013-03-19 15:38     ` Eric Dumazet
  0 siblings, 2 replies; 14+ messages in thread
From: Eric Dumazet @ 2013-03-19 15:03 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: netdev, davem, jasowang

On Tue, 2013-03-19 at 15:34 +0100, Daniel Borkmann wrote:
> In skb_flow_dissect(), we perform a dissection of a skbuff. Since we're
> doing the work here anyway, also store thoff for a later usage, e.g. in
> the BPF filter. Also, by having thoff 16 Bit, we do not need to pack
> flow_keys and reorder choke_skb_cb.
> 
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> ---
>  This patch also needs to go into the net tree, since Eric or Jason will
>  post a bug fix on top of this one.
> 
>  include/net/flow_keys.h   | 1 +
>  net/core/flow_dissector.c | 5 ++++-
>  2 files changed, 5 insertions(+), 1 deletion(-)

Oh well, you left the choke_skb_cb description in changelog

Acked-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH v2 net-next 2/4] net: flow_dissector: add __skb_get_poff to get a start offset to payload
  2013-03-19 14:34 ` [PATCH v2 net-next 2/4] net: flow_dissector: add __skb_get_poff to get a start offset to payload Daniel Borkmann
@ 2013-03-19 15:03   ` Eric Dumazet
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2013-03-19 15:03 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: netdev, davem, jasowang

On Tue, 2013-03-19 at 15:34 +0100, Daniel Borkmann wrote:
> __skb_get_poff() returns the offset to the payload as far as it could
> be dissected. The main user is currently BPF, so that we can dynamically
> truncate packets without needing to push actual payload to the user
> space and instead can analyze headers only.
> 
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> ---
>  include/linux/skbuff.h    |  2 ++
>  net/core/flow_dissector.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 59 insertions(+)

Acked-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH v2 net-next 3/4] filter: add ANC_PAY_OFFSET instruction for loading payload start offset
  2013-03-19 14:34 ` [PATCH v2 net-next 3/4] filter: add ANC_PAY_OFFSET instruction for loading payload start offset Daniel Borkmann
@ 2013-03-19 15:04   ` Eric Dumazet
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2013-03-19 15:04 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: netdev, davem, jasowang

On Tue, 2013-03-19 at 15:34 +0100, Daniel Borkmann wrote:
> It is very useful to do dynamic truncation of packets. In particular,
> we're interested to push the necessary header bytes to the user space and
> cut off user payload that should probably not be transferred for some reasons
> (e.g. privacy, speed, or others). With the ancillary extension PAY_OFFSET,
> we can load it into the accumulator, and return it. E.g. in bpfc syntax ...
> 
>         ld #poff        ; { 0x20, 0, 0, 0xfffff034 },
>         ret a           ; { 0x16, 0, 0, 0x00000000 },
> 
> ... as a filter will accomplish this without having to do a big hackery in
> a BPF filter itself. Follow-up JIT implementations are welcome.
> 
> Thanks to Eric Dumazet for suggesting and discussing this during the
> Netfilter Workshop in Copenhagen.
> 
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> ---


Thanks a lot Daniel

Acked-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH v2 net-next 4/4] filter: add minimal BPF JIT emitted image disassembler
  2013-03-19 14:34 ` [PATCH v2 net-next 4/4] filter: add minimal BPF JIT emitted image disassembler Daniel Borkmann
@ 2013-03-19 15:05   ` Eric Dumazet
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2013-03-19 15:05 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: netdev, davem, jasowang, Eric Dumazet

On Tue, 2013-03-19 at 15:34 +0100, Daniel Borkmann wrote:
> This is a minimal stand-alone user space helper, that allows for debugging or
> verification of emitted BPF JIT images. This is in particular useful for
> emitted opcode debugging, since minor bugs in the JIT compiler can be fatal.
> The disassembler is architecture generic and uses libopcodes and libbfd.
> 
> How to get to the disassembly, example:
> 
>   1) `echo 2 > /proc/sys/net/core/bpf_jit_enable`
>   2) Load a BPF filter (e.g. `tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24`)
>   3) Run e.g. `bpf_jit_disasm -o` to disassemble the most recent JIT code output
> 
> `bpf_jit_disasm -o` will display the related opcodes to a particular instruction
> as well. Example for x86_64:
> 
> $./bpf_jit_disasm
> 94 bytes emitted from JIT compiler (pass:3, flen:9)
> ffffffffa0356000 + <x>:
>    0:	push   %rbp
>    1:	mov    %rsp,%rbp
>    4:	sub    $0x60,%rsp
>    8:	mov    %rbx,-0x8(%rbp)
>    c:	mov    0x68(%rdi),%r9d
>   10:	sub    0x6c(%rdi),%r9d
>   14:	mov    0xe0(%rdi),%r8
>   1b:	mov    $0xc,%esi
>   20:	callq  0xffffffffe0d01b71
>   25:	cmp    $0x86dd,%eax
>   2a:	jne    0x000000000000003d
>   2c:	mov    $0x14,%esi
>   31:	callq  0xffffffffe0d01b8d
>   36:	cmp    $0x6,%eax
> [...]
>   5c:	leaveq
>   5d:	retq
> 
> $ ./bpf_jit_disasm -o
> 94 bytes emitted from JIT compiler (pass:3, flen:9)
> ffffffffa0356000 + <x>:
>    0:	push   %rbp
> 	55
>    1:	mov    %rsp,%rbp
> 	48 89 e5
>    4:	sub    $0x60,%rsp
> 	48 83 ec 60
>    8:	mov    %rbx,-0x8(%rbp)
> 	48 89 5d f8
>    c:	mov    0x68(%rdi),%r9d
> 	44 8b 4f 68
>   10:	sub    0x6c(%rdi),%r9d
> 	44 2b 4f 6c
> [...]
>   5c:	leaveq
> 	c9
>   5d:	retq
> 	c3
> 
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> ---

Very useful, thanks Daniel !

Acked-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH v2 net-next 1/4] flow_keys: include thoff into flow_keys for later usage
  2013-03-19 15:03   ` Eric Dumazet
@ 2013-03-19 15:06     ` Daniel Borkmann
  2013-03-19 15:38     ` Eric Dumazet
  1 sibling, 0 replies; 14+ messages in thread
From: Daniel Borkmann @ 2013-03-19 15:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, davem, jasowang

On 03/19/2013 04:03 PM, Eric Dumazet wrote:
> On Tue, 2013-03-19 at 15:34 +0100, Daniel Borkmann wrote:
>> In skb_flow_dissect(), we perform a dissection of a skbuff. Since we're
>> doing the work here anyway, also store thoff for a later usage, e.g. in
>> the BPF filter. Also, by having thoff 16 Bit, we do not need to pack
>> flow_keys and reorder choke_skb_cb.
>>
>> Suggested-by: Eric Dumazet <edumazet@google.com>
>> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
>> ---
>>   This patch also needs to go into the net tree, since Eric or Jason will
>>   post a bug fix on top of this one.
>>
>>   include/net/flow_keys.h   | 1 +
>>   net/core/flow_dissector.c | 5 ++++-
>>   2 files changed, 5 insertions(+), 1 deletion(-)
>
> Oh well, you left the choke_skb_cb description in changelog

But it's not the old changelog. ;-)

  Also, by having thoff 16 Bit, we do *not* need to pack flow_keys and
  reorder choke_skb_cb.

> Acked-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH v2 net-next 1/4] flow_keys: include thoff into flow_keys for later usage
  2013-03-19 15:03   ` Eric Dumazet
  2013-03-19 15:06     ` Daniel Borkmann
@ 2013-03-19 15:38     ` Eric Dumazet
  1 sibling, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2013-03-19 15:38 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: netdev, davem, jasowang

On Tue, 2013-03-19 at 08:03 -0700, Eric Dumazet wrote:
> On Tue, 2013-03-19 at 15:34 +0100, Daniel Borkmann wrote:
> > In skb_flow_dissect(), we perform a dissection of a skbuff. Since we're
> > doing the work here anyway, also store thoff for a later usage, e.g. in
> > the BPF filter. Also, by having thoff 16 Bit, we do not need to pack
> > flow_keys and reorder choke_skb_cb.
> > 
> > Suggested-by: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> > ---
> >  This patch also needs to go into the net tree, since Eric or Jason will
> >  post a bug fix on top of this one.
> > 
> >  include/net/flow_keys.h   | 1 +
> >  net/core/flow_dissector.c | 5 ++++-
> >  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> Oh well, you left the choke_skb_cb description in changelog
> 
> Acked-by: Eric Dumazet <edumazet@google.com>
> 

Actually, this patch has a bug.

The nhoff variable should stay an int. As a u16, nhoff could wrap on a big
packet, and a malicious user could use that to trigger an infinite loop.
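
The wrap-around can be reproduced in user space with a deliberately
simplified model. The sketch below is not the dissector loop itself:
walk_u16() and walk_int() are hypothetical helpers that advance an offset
in fixed steps until it reaches a given end, and max_steps stands in for
"would never terminate".

```c
#include <assert.h>
#include <stdint.h>

/* Advance a 16-bit offset in step-byte increments until it reaches end.
 * Returns the number of steps, or -1 once max_steps is hit. */
static int walk_u16(uint32_t end, uint16_t step, int max_steps)
{
	uint16_t off = 0;	/* 16 bits wide, like nhoff in the patch */
	int steps = 0;

	assert(step > 0);
	while (off < end) {
		if (steps == max_steps)
			return -1;
		off += step;	/* silently wraps modulo 65536 */
		steps++;
	}

	return steps;
}

/* Same walk with an int offset, which does not wrap at 64 KiB. */
static int walk_int(uint32_t end, uint16_t step, int max_steps)
{
	int off = 0;
	int steps = 0;

	assert(step > 0);
	while ((uint32_t) off < end) {
		if (steps == max_steps)
			return -1;
		off += step;
		steps++;
	}

	return steps;
}
```

For an end of 70000 bytes and 8-byte steps, walk_int() finishes after 8750
steps, while walk_u16() can never get past 65535 and only stops because of
the max_steps guard.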

