Netdev Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing
@ 2019-10-10  4:14 Alexei Starovoitov
  2019-10-10  4:14 ` [PATCH v2 bpf-next 01/12] bpf: add typecast to raw_tracepoints to help BTF generation Alexei Starovoitov
                   ` (11 more replies)
  0 siblings, 12 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:14 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

v1->v2:
- addressed feedback from Andrii and Eric. Thanks a lot for review!
- added missing check at raw_tp attach time.
- Andrii noticed that expected_attach_type cannot be reused.
  Had to introduce new field to bpf_attr.
- cleaned up logging nicely by introducing bpf_log() helper.
- rebased.

Revolutionize bpf tracing and bpf C programming.
C language allows any pointer to be typecasted to any other pointer
or convert integer to a pointer.
Though bpf verifier is operating at assembly level it has strict type
checking for fixed number of types.
Known types are defined in 'enum bpf_reg_type'.
For example:
PTR_TO_FLOW_KEYS is a pointer to 'struct bpf_flow_keys'
PTR_TO_SOCKET is a pointer to 'struct bpf_sock',
and so on.

When it comes to bpf tracing there are no types to track.
bpf+kprobe receives 'struct pt_regs' as input.
bpf+raw_tracepoint receives raw kernel arguments as an array of u64 values.
It was up to bpf program to interpret these integers.
Typical tracing program looks like:
int bpf_prog(struct pt_regs *ctx)
{
    struct net_device *dev;
    struct sk_buff *skb;
    int ifindex;

    skb = (struct sk_buff *) ctx->di;
    bpf_probe_read(&dev, sizeof(dev), &skb->dev);
    bpf_probe_read(&ifindex, sizeof(ifindex), &dev->ifindex);
}
Addressing mistakes will not be caught by C compiler or by the verifier.
The program above could have typecasted ctx->si to skb and page faulted
on every bpf_probe_read().
bpf_probe_read() allows reading any address and suppresses page faults.
Typical program has hundreds of bpf_probe_read() calls to walk
kernel data structures.
Not only tracing program would be slow, but there was always a risk
that bpf_probe_read() would read mmio region of memory and cause
unpredictable hw behavior.

With introduction of Compile Once Run Everywhere technology in libbpf
and in LLVM and BPF Type Format (BTF) the verifier is finally ready
for the next step in program verification.
Now it can use in-kernel BTF to type check bpf assembly code.

Equivalent program will look like:
struct trace_kfree_skb {
    struct sk_buff *skb;
    void *location;
};
SEC("raw_tracepoint/kfree_skb")
int trace_kfree_skb(struct trace_kfree_skb* ctx)
{
    struct sk_buff *skb = ctx->skb;
    struct net_device *dev;
    int ifindex;

    __builtin_preserve_access_index(({
        dev = skb->dev;
        ifindex = dev->ifindex;
    }));
}

These patches teach bpf verifier to recognize kfree_skb's first argument
as 'struct sk_buff *' because this is what kernel C code is doing.
The bpf program cannot 'cheat' and say that the first argument
to kfree_skb raw_tracepoint is some other type.
The verifier will catch such type mismatch between bpf program
assumption of kernel code and the actual type in the kernel.

Furthermore skb->dev access is type tracked as well.
The verifier can see which field of skb is being read
in bpf assembly. It will match offset to type.
If bpf program has code:
struct net_device *dev = (void *)skb->len;
C compiler will not complain and generate bpf assembly code,
but the verifier will recognize that integer 'len' field
is being accessed at offsetof(struct sk_buff, len) and will reject
further dereference of 'dev' variable because it contains
integer value instead of a pointer.

Such sophisticated type tracking allows calling networking
bpf helpers from tracing programs.
This patchset allows calling bpf_skb_event_output() that dumps
skb data into perf ring buffer.
It greatly improves observability.
Now users can not only see packet lenth of the skb
about to be freed in kfree_skb() kernel function, but can
dump it to user space via perf ring buffer using bpf helper
that was previously available only to TC and socket filters.
See patch 10 for full example.

The end result is safer and faster bpf tracing.
Safer - because direct calls to bpf_probe_read() are disallowed and
arbitrary addresses cannot be read.
Faster - because normal loads are used to walk kernel data structures
instead of bpf_probe_read() calls.
Note that such loads can page fault and are supported by
hidden bpf_probe_read() in interpreter and via exception table
if program is JITed.

See patches for details.

Alexei Starovoitov (12):
  bpf: add typecast to raw_tracepoints to help BTF generation
  bpf: add typecast to bpf helpers to help BTF generation
  bpf: process in-kernel BTF
  bpf: add attach_btf_id attribute to program load
  libbpf: auto-detect btf_id of raw_tracepoint
  bpf: implement accurate raw_tp context access via BTF
  bpf: attach raw_tp program with BTF via type name
  bpf: add support for BTF pointers to interpreter
  bpf: add support for BTF pointers to x86 JIT
  bpf: check types of arguments passed into helpers
  bpf: disallow bpf_probe_read[_str] helpers
  selftests/bpf: add kfree_skb raw_tp test

 arch/x86/net/bpf_jit_comp.c                   |  97 +++++-
 include/linux/bpf.h                           |  39 ++-
 include/linux/bpf_verifier.h                  |   8 +-
 include/linux/btf.h                           |   1 +
 include/linux/extable.h                       |  10 +
 include/linux/filter.h                        |   6 +-
 include/trace/bpf_probe.h                     |   3 +-
 include/uapi/linux/bpf.h                      |  28 +-
 kernel/bpf/btf.c                              | 325 +++++++++++++++++-
 kernel/bpf/core.c                             |  39 ++-
 kernel/bpf/syscall.c                          |  85 +++--
 kernel/bpf/verifier.c                         | 159 ++++++++-
 kernel/extable.c                              |   2 +
 kernel/trace/bpf_trace.c                      |  10 +-
 net/core/filter.c                             |  15 +-
 tools/include/uapi/linux/bpf.h                |  28 +-
 tools/lib/bpf/bpf.c                           |   3 +
 tools/lib/bpf/libbpf.c                        |  17 +
 .../selftests/bpf/prog_tests/kfree_skb.c      |  90 +++++
 tools/testing/selftests/bpf/progs/kfree_skb.c |  74 ++++
 20 files changed, 975 insertions(+), 64 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/kfree_skb.c
 create mode 100644 tools/testing/selftests/bpf/progs/kfree_skb.c

-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 01/12] bpf: add typecast to raw_tracepoints to help BTF generation
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
@ 2019-10-10  4:14 ` Alexei Starovoitov
  2019-10-10  4:14 ` [PATCH v2 bpf-next 02/12] bpf: add typecast to bpf helpers " Alexei Starovoitov
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:14 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

When pahole converts dwarf to btf it emits only used types.
Wrap existing __bpf_trace_##template() function into
btf_trace_##template typedef and use it in type cast to
make gcc emits this type into dwarf. Then pahole will convert it to btf.
The "btf_trace_" prefix will be used to identify BTF enabled raw tracepoints.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 include/trace/bpf_probe.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
index d6e556c0a085..ff1a879773df 100644
--- a/include/trace/bpf_probe.h
+++ b/include/trace/bpf_probe.h
@@ -74,11 +74,12 @@ static inline void bpf_test_probe_##call(void)				\
 {									\
 	check_trace_callback_type_##call(__bpf_trace_##template);	\
 }									\
+typedef void (*btf_trace_##template)(void *__data, proto);		\
 static struct bpf_raw_event_map	__used					\
 	__attribute__((section("__bpf_raw_tp_map")))			\
 __bpf_trace_tp_map_##call = {						\
 	.tp		= &__tracepoint_##call,				\
-	.bpf_func	= (void *)__bpf_trace_##template,		\
+	.bpf_func	= (void *)(btf_trace_##template)__bpf_trace_##template,	\
 	.num_args	= COUNT_ARGS(args),				\
 	.writable_size	= size,						\
 };
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 02/12] bpf: add typecast to bpf helpers to help BTF generation
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
  2019-10-10  4:14 ` [PATCH v2 bpf-next 01/12] bpf: add typecast to raw_tracepoints to help BTF generation Alexei Starovoitov
@ 2019-10-10  4:14 ` " Alexei Starovoitov
  2019-10-10  4:14 ` [PATCH v2 bpf-next 03/12] bpf: process in-kernel BTF Alexei Starovoitov
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:14 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

When pahole converts dwarf to btf it emits only used types.
Wrap existing bpf helper functions into typedef and use it in
typecast to make gcc emits this type into dwarf.
Then pahole will convert it to btf.
The "btf_#name_of_helper" types will be used to figure out
types of arguments of bpf helpers.
The generated code before and after is the same.
Only dwarf and btf sections are different.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/filter.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 2ce57645f3cd..d3d51d7aff2c 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -464,10 +464,11 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
 #define BPF_CALL_x(x, name, ...)					       \
 	static __always_inline						       \
 	u64 ____##name(__BPF_MAP(x, __BPF_DECL_ARGS, __BPF_V, __VA_ARGS__));   \
+	typedef u64 (*btf_##name)(__BPF_MAP(x, __BPF_DECL_ARGS, __BPF_V, __VA_ARGS__)); \
 	u64 name(__BPF_REG(x, __BPF_DECL_REGS, __BPF_N, __VA_ARGS__));	       \
 	u64 name(__BPF_REG(x, __BPF_DECL_REGS, __BPF_N, __VA_ARGS__))	       \
 	{								       \
-		return ____##name(__BPF_MAP(x,__BPF_CAST,__BPF_N,__VA_ARGS__));\
+		return ((btf_##name)____##name)(__BPF_MAP(x,__BPF_CAST,__BPF_N,__VA_ARGS__));\
 	}								       \
 	static __always_inline						       \
 	u64 ____##name(__BPF_MAP(x, __BPF_DECL_ARGS, __BPF_V, __VA_ARGS__))
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 03/12] bpf: process in-kernel BTF
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
  2019-10-10  4:14 ` [PATCH v2 bpf-next 01/12] bpf: add typecast to raw_tracepoints to help BTF generation Alexei Starovoitov
  2019-10-10  4:14 ` [PATCH v2 bpf-next 02/12] bpf: add typecast to bpf helpers " Alexei Starovoitov
@ 2019-10-10  4:14 ` Alexei Starovoitov
  2019-10-11 17:56   ` Andrii Nakryiko
  2019-10-10  4:14 ` [PATCH v2 bpf-next 04/12] bpf: add attach_btf_id attribute to program load Alexei Starovoitov
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:14 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

If in-kernel BTF exists parse it and prepare 'struct btf *btf_vmlinux'
for further use by the verifier.
In-kernel BTF is trusted just like kallsyms and other build artifacts
embedded into vmlinux.
Yet run this BTF image through BTF verifier to make sure
that it is valid and it wasn't mangled during the build.
If in-kernel BTF is incorrect it means either gcc or pahole or kernel
are buggy. In such case disallow loading BPF programs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf_verifier.h |  4 +-
 include/linux/btf.h          |  1 +
 kernel/bpf/btf.c             | 71 +++++++++++++++++++++++++++++++++++-
 kernel/bpf/verifier.c        | 20 ++++++++++
 4 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 26a6d58ca78c..713efae62e96 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -330,10 +330,12 @@ static inline bool bpf_verifier_log_full(const struct bpf_verifier_log *log)
 #define BPF_LOG_STATS	4
 #define BPF_LOG_LEVEL	(BPF_LOG_LEVEL1 | BPF_LOG_LEVEL2)
 #define BPF_LOG_MASK	(BPF_LOG_LEVEL | BPF_LOG_STATS)
+#define BPF_LOG_KERNEL	(BPF_LOG_MASK + 1) /* kernel internal flag */
 
 static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log)
 {
-	return log->level && log->ubuf && !bpf_verifier_log_full(log);
+	return (log->level && log->ubuf && !bpf_verifier_log_full(log)) ||
+		log->level == BPF_LOG_KERNEL;
 }
 
 #define BPF_MAX_SUBPROGS 256
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 64cdf2a23d42..55d43bc856be 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -56,6 +56,7 @@ bool btf_type_is_void(const struct btf_type *t);
 #ifdef CONFIG_BPF_SYSCALL
 const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
 const char *btf_name_by_offset(const struct btf *btf, u32 offset);
+struct btf *btf_parse_vmlinux(void);
 #else
 static inline const struct btf_type *btf_type_by_id(const struct btf *btf,
 						    u32 type_id)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 29c7c06c6bd6..ddeab1e8d21e 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -698,6 +698,13 @@ __printf(4, 5) static void __btf_verifier_log_type(struct btf_verifier_env *env,
 	if (!bpf_verifier_log_needed(log))
 		return;
 
+	/* btf verifier prints all types it is processing via
+	 * btf_verifier_log_type(..., fmt = NULL).
+	 * Skip those prints for in-kernel BTF verification.
+	 */
+	if (log->level == BPF_LOG_KERNEL && !fmt)
+		return;
+
 	__btf_verifier_log(log, "[%u] %s %s%s",
 			   env->log_type_id,
 			   btf_kind_str[kind],
@@ -735,6 +742,8 @@ static void btf_verifier_log_member(struct btf_verifier_env *env,
 	if (!bpf_verifier_log_needed(log))
 		return;
 
+	if (log->level == BPF_LOG_KERNEL && !fmt)
+		return;
 	/* The CHECK_META phase already did a btf dump.
 	 *
 	 * If member is logged again, it must hit an error in
@@ -777,6 +786,8 @@ static void btf_verifier_log_vsi(struct btf_verifier_env *env,
 
 	if (!bpf_verifier_log_needed(log))
 		return;
+	if (log->level == BPF_LOG_KERNEL && !fmt)
+		return;
 	if (env->phase != CHECK_META)
 		btf_verifier_log_type(env, datasec_type, NULL);
 
@@ -802,6 +813,8 @@ static void btf_verifier_log_hdr(struct btf_verifier_env *env,
 	if (!bpf_verifier_log_needed(log))
 		return;
 
+	if (log->level == BPF_LOG_KERNEL)
+		return;
 	hdr = &btf->hdr;
 	__btf_verifier_log(log, "magic: 0x%x\n", hdr->magic);
 	__btf_verifier_log(log, "version: %u\n", hdr->version);
@@ -2405,7 +2418,8 @@ static s32 btf_enum_check_meta(struct btf_verifier_env *env,
 			return -EINVAL;
 		}
 
-
+		if (env->log.level == BPF_LOG_KERNEL)
+			continue;
 		btf_verifier_log(env, "\t%s val=%d\n",
 				 __btf_name_by_offset(btf, enums[i].name_off),
 				 enums[i].val);
@@ -3367,6 +3381,61 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
 	return ERR_PTR(err);
 }
 
+extern char __weak _binary__btf_vmlinux_bin_start[];
+extern char __weak _binary__btf_vmlinux_bin_end[];
+
+struct btf *btf_parse_vmlinux(void)
+{
+	struct btf_verifier_env *env = NULL;
+	struct bpf_verifier_log *log;
+	struct btf *btf = NULL;
+	int err;
+
+	env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN);
+	if (!env)
+		return ERR_PTR(-ENOMEM);
+
+	log = &env->log;
+	log->level = BPF_LOG_KERNEL;
+
+	btf = kzalloc(sizeof(*btf), GFP_KERNEL | __GFP_NOWARN);
+	if (!btf) {
+		err = -ENOMEM;
+		goto errout;
+	}
+	env->btf = btf;
+
+	btf->data = _binary__btf_vmlinux_bin_start;
+	btf->data_size = _binary__btf_vmlinux_bin_end -
+		_binary__btf_vmlinux_bin_start;
+
+	err = btf_parse_hdr(env);
+	if (err)
+		goto errout;
+
+	btf->nohdr_data = btf->data + btf->hdr.hdr_len;
+
+	err = btf_parse_str_sec(env);
+	if (err)
+		goto errout;
+
+	err = btf_check_all_metas(env);
+	if (err)
+		goto errout;
+
+	btf_verifier_env_free(env);
+	refcount_set(&btf->refcnt, 1);
+	return btf;
+
+errout:
+	btf_verifier_env_free(env);
+	if (btf) {
+		kvfree(btf->types);
+		kfree(btf);
+	}
+	return ERR_PTR(err);
+}
+
 void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
 		       struct seq_file *m)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ffc3e53f5300..051a355037bf 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -207,6 +207,8 @@ struct bpf_call_arg_meta {
 	int func_id;
 };
 
+struct btf *btf_vmlinux;
+
 static DEFINE_MUTEX(bpf_verifier_lock);
 
 static const struct bpf_line_info *
@@ -243,6 +245,10 @@ void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
 	n = min(log->len_total - log->len_used - 1, n);
 	log->kbuf[n] = '\0';
 
+	if (log->level == BPF_LOG_KERNEL) {
+		pr_err("BPF:%s\n", log->kbuf);
+		return;
+	}
 	if (!copy_to_user(log->ubuf + log->len_used, log->kbuf, n + 1))
 		log->len_used += n;
 	else
@@ -9241,6 +9247,13 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 	env->ops = bpf_verifier_ops[env->prog->type];
 	is_priv = capable(CAP_SYS_ADMIN);
 
+	if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) {
+		mutex_lock(&bpf_verifier_lock);
+		if (!btf_vmlinux)
+			btf_vmlinux = btf_parse_vmlinux();
+		mutex_unlock(&bpf_verifier_lock);
+	}
+
 	/* grab the mutex to protect few globals used by verifier */
 	if (!is_priv)
 		mutex_lock(&bpf_verifier_lock);
@@ -9260,6 +9273,13 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 			goto err_unlock;
 	}
 
+	if (IS_ERR(btf_vmlinux)) {
+		/* Either gcc or pahole or kernel are broken. */
+		verbose(env, "in-kernel BTF is malformed\n");
+		ret = PTR_ERR(btf_vmlinux);
+		goto err_unlock;
+	}
+
 	env->strict_alignment = !!(attr->prog_flags & BPF_F_STRICT_ALIGNMENT);
 	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
 		env->strict_alignment = true;
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 04/12] bpf: add attach_btf_id attribute to program load
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
                   ` (2 preceding siblings ...)
  2019-10-10  4:14 ` [PATCH v2 bpf-next 03/12] bpf: process in-kernel BTF Alexei Starovoitov
@ 2019-10-10  4:14 ` Alexei Starovoitov
  2019-10-11 17:58   ` Andrii Nakryiko
  2019-10-10  4:14 ` [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint Alexei Starovoitov
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:14 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

Add attach_btf_id attribute to prog_load command.
It's similar to existing expected_attach_type attribute which is
used in several cgroup based program types.
Unfortunately expected_attach_type is ignored for
tracing programs and cannot be reused for new purpose.
Hence introduce attach_btf_id to verify bpf programs against
given in-kernel BTF type id at load time.
It is strictly checked to be valid for raw_tp programs only.
In a later patches it will become:
btf_id == 0 semantics of existing raw_tp progs.
btd_id > 0 raw_tp with BTF and additional type safety.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h            |  1 +
 include/uapi/linux/bpf.h       |  1 +
 kernel/bpf/syscall.c           | 18 ++++++++++++++----
 tools/include/uapi/linux/bpf.h |  1 +
 4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5b9d22338606..a254327c62e9 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -375,6 +375,7 @@ struct bpf_prog_aux {
 	u32 id;
 	u32 func_cnt; /* used by non-func prog as the number of func progs */
 	u32 func_idx; /* 0 for non-func prog, the index in func array for func prog */
+	u32 attach_btf_id; /* in-kernel BTF type id to attach to */
 	bool verifier_zext; /* Zero extensions has been inserted by verifier. */
 	bool offload_requested;
 	struct bpf_prog **func;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index a65c3b0c6935..3bb2cd1de341 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -420,6 +420,7 @@ union bpf_attr {
 		__u32		line_info_rec_size;	/* userspace bpf_line_info size */
 		__aligned_u64	line_info;	/* line info */
 		__u32		line_info_cnt;	/* number of bpf_line_info records */
+		__u32		attach_btf_id;	/* in-kernel BTF type id to attach to */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 82eabd4e38ad..b56c482c9760 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -23,6 +23,7 @@
 #include <linux/timekeeping.h>
 #include <linux/ctype.h>
 #include <linux/nospec.h>
+#include <uapi/linux/btf.h>
 
 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PROG_ARRAY || \
 			   (map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
@@ -1565,8 +1566,9 @@ static void bpf_prog_load_fixup_attach_type(union bpf_attr *attr)
 }
 
 static int
-bpf_prog_load_check_attach_type(enum bpf_prog_type prog_type,
-				enum bpf_attach_type expected_attach_type)
+bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
+			   enum bpf_attach_type expected_attach_type,
+			   u32 btf_id)
 {
 	switch (prog_type) {
 	case BPF_PROG_TYPE_CGROUP_SOCK:
@@ -1608,13 +1610,19 @@ bpf_prog_load_check_attach_type(enum bpf_prog_type prog_type,
 		default:
 			return -EINVAL;
 		}
+	case BPF_PROG_TYPE_RAW_TRACEPOINT:
+		if (btf_id > BTF_MAX_TYPE)
+			return -EINVAL;
+		return 0;
 	default:
+		if (btf_id)
+			return -EINVAL;
 		return 0;
 	}
 }
 
 /* last field in 'union bpf_attr' used by this command */
-#define	BPF_PROG_LOAD_LAST_FIELD line_info_cnt
+#define	BPF_PROG_LOAD_LAST_FIELD attach_btf_id
 
 static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 {
@@ -1656,7 +1664,8 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 		return -EPERM;
 
 	bpf_prog_load_fixup_attach_type(attr);
-	if (bpf_prog_load_check_attach_type(type, attr->expected_attach_type))
+	if (bpf_prog_load_check_attach(type, attr->expected_attach_type,
+				       attr->attach_btf_id))
 		return -EINVAL;
 
 	/* plain bpf_prog allocation */
@@ -1665,6 +1674,7 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 		return -ENOMEM;
 
 	prog->expected_attach_type = attr->expected_attach_type;
+	prog->aux->attach_btf_id = attr->attach_btf_id;
 
 	prog->aux->offload_requested = !!attr->prog_ifindex;
 
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index a65c3b0c6935..3bb2cd1de341 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -420,6 +420,7 @@ union bpf_attr {
 		__u32		line_info_rec_size;	/* userspace bpf_line_info size */
 		__aligned_u64	line_info;	/* line info */
 		__u32		line_info_cnt;	/* number of bpf_line_info records */
+		__u32		attach_btf_id;	/* in-kernel BTF type id to attach to */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
                   ` (3 preceding siblings ...)
  2019-10-10  4:14 ` [PATCH v2 bpf-next 04/12] bpf: add attach_btf_id attribute to program load Alexei Starovoitov
@ 2019-10-10  4:14 ` Alexei Starovoitov
  2019-10-11 18:02   ` Andrii Nakryiko
  2019-10-11 18:07   ` Andrii Nakryiko
  2019-10-10  4:14 ` [PATCH v2 bpf-next 06/12] bpf: implement accurate raw_tp context access via BTF Alexei Starovoitov
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:14 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

For raw tracepoint program types libbpf will try to find
btf_id of raw tracepoint in vmlinux's BTF.
It's a responsiblity of bpf program author to annotate the program
with SEC("raw_tracepoint/name") where "name" is a valid raw tracepoint.
If "name" is indeed a valid raw tracepoint then in-kernel BTF
will have "btf_trace_##name" typedef that points to function
prototype of that raw tracepoint. BTF description captures
exact argument the kernel C code is passing into raw tracepoint.
The kernel verifier will check the types while loading bpf program.

libbpf keeps BTF type id in expected_attach_type, but since
kernel ignores this attribute for tracing programs copy it
into attach_btf_id attribute before loading.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/bpf.c    |  3 +++
 tools/lib/bpf/libbpf.c | 17 +++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index cbb933532981..79046067720f 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -228,6 +228,9 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
 	memset(&attr, 0, sizeof(attr));
 	attr.prog_type = load_attr->prog_type;
 	attr.expected_attach_type = load_attr->expected_attach_type;
+	if (attr.prog_type == BPF_PROG_TYPE_RAW_TRACEPOINT)
+		/* expected_attach_type is ignored for tracing progs */
+		attr.attach_btf_id = attr.expected_attach_type;
 	attr.insn_cnt = (__u32)load_attr->insns_cnt;
 	attr.insns = ptr_to_u64(load_attr->insns);
 	attr.license = ptr_to_u64(load_attr->license);
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index a02cdedc4e3f..8bf30a67428c 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4586,6 +4586,23 @@ int libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
 			continue;
 		*prog_type = section_names[i].prog_type;
 		*expected_attach_type = section_names[i].expected_attach_type;
+		if (*prog_type == BPF_PROG_TYPE_RAW_TRACEPOINT) {
+			struct btf *btf = bpf_core_find_kernel_btf();
+			char raw_tp_btf_name[128] = "btf_trace_";
+			char *dst = raw_tp_btf_name + sizeof("btf_trace_") - 1;
+			int ret;
+
+			if (IS_ERR(btf))
+				/* lack of kernel BTF is not a failure */
+				return 0;
+			/* prepend "btf_trace_" prefix per kernel convention */
+			strncat(dst, name + section_names[i].len,
+				sizeof(raw_tp_btf_name) - (dst - raw_tp_btf_name));
+			ret = btf__find_by_name(btf, raw_tp_btf_name);
+			if (ret > 0)
+				*expected_attach_type = ret;
+			btf__free(btf);
+		}
 		return 0;
 	}
 	pr_warning("failed to guess program type based on ELF section name '%s'\n", name);
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 06/12] bpf: implement accurate raw_tp context access via BTF
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
                   ` (4 preceding siblings ...)
  2019-10-10  4:14 ` [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint Alexei Starovoitov
@ 2019-10-10  4:14 ` Alexei Starovoitov
  2019-10-11 18:31   ` Andrii Nakryiko
  2019-10-10  4:14 ` [PATCH v2 bpf-next 07/12] bpf: attach raw_tp program with BTF via type name Alexei Starovoitov
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:14 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

libbpf analyzes bpf C program, searches in-kernel BTF for given type name
and stores it into expected_attach_type.
The kernel verifier expects this btf_id to point to something like:
typedef void (*btf_trace_kfree_skb)(void *, struct sk_buff *skb, void *loc);
which represents signature of raw_tracepoint "kfree_skb".

Then btf_ctx_access() matches ctx+0 access in bpf program with 'skb'
and 'ctx+8' access with 'loc' arguments of "kfree_skb" tracepoint.
In first case it passes btf_id of 'struct sk_buff *' back to the verifier core
and 'void *' in second case.

Then the verifier tracks PTR_TO_BTF_ID as any other pointer type.
Like PTR_TO_SOCKET points to 'struct bpf_sock',
PTR_TO_TCP_SOCK points to 'struct bpf_tcp_sock', and so on.
PTR_TO_BTF_ID points to in-kernel structs.
If 1234 is btf_id of 'struct sk_buff' in vmlinux's BTF
then PTR_TO_BTF_ID#1234 points to one of in kernel skbs.

When PTR_TO_BTF_ID#1234 is dereferenced (like r2 = *(u64 *)r1 + 32)
the btf_struct_access() checks which field of 'struct sk_buff' is
at offset 32. Checks that size of access matches type definition
of the field and continues to track the dereferenced type.
If that field was a pointer to 'struct net_device' the r2's type
will be PTR_TO_BTF_ID#456. Where 456 is btf_id of 'struct net_device'
in vmlinux's BTF.

Such verifier analysis prevents "cheating" in BPF C program.
The program cannot cast arbitrary pointer to 'struct sk_buff *'
and access it. C compiler would allow type cast, of course,
but the verifier will notice type mismatch based on BPF assembly
and in-kernel BTF.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h          |  17 +++-
 include/linux/bpf_verifier.h |   4 +
 kernel/bpf/btf.c             | 186 +++++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c        |  86 +++++++++++++++-
 kernel/trace/bpf_trace.c     |   2 +-
 5 files changed, 290 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a254327c62e9..4218e269be59 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -16,6 +16,7 @@
 #include <linux/u64_stats_sync.h>
 
 struct bpf_verifier_env;
+struct bpf_verifier_log;
 struct perf_event;
 struct bpf_prog;
 struct bpf_map;
@@ -281,6 +282,7 @@ enum bpf_reg_type {
 	PTR_TO_TCP_SOCK_OR_NULL, /* reg points to struct tcp_sock or NULL */
 	PTR_TO_TP_BUFFER,	 /* reg points to a writable raw tp's buffer */
 	PTR_TO_XDP_SOCK,	 /* reg points to struct xdp_sock */
+	PTR_TO_BTF_ID,		 /* reg points to kernel struct */
 };
 
 /* The information passed from prog-specific *_is_valid_access
@@ -288,7 +290,11 @@ enum bpf_reg_type {
  */
 struct bpf_insn_access_aux {
 	enum bpf_reg_type reg_type;
-	int ctx_field_size;
+	union {
+		int ctx_field_size;
+		u32 btf_id;
+	};
+	struct bpf_verifier_log *log; /* for verbose logs */
 };
 
 static inline void
@@ -483,6 +489,7 @@ struct bpf_event_entry {
 
 bool bpf_prog_array_compatible(struct bpf_array *array, const struct bpf_prog *fp);
 int bpf_prog_calc_tag(struct bpf_prog *fp);
+const char *kernel_type_name(u32 btf_type_id);
 
 const struct bpf_func_proto *bpf_get_trace_printk_proto(void);
 
@@ -748,6 +755,14 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 				     const union bpf_attr *kattr,
 				     union bpf_attr __user *uattr);
+bool btf_ctx_access(int off, int size, enum bpf_access_type type,
+		    const struct bpf_prog *prog,
+		    struct bpf_insn_access_aux *info);
+int btf_struct_access(struct bpf_verifier_log *log,
+		      const struct btf_type *t, int off, int size,
+		      enum bpf_access_type atype,
+		      u32 *next_btf_id);
+
 #else /* !CONFIG_BPF_SYSCALL */
 static inline struct bpf_prog *bpf_prog_get(u32 ufd)
 {
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 713efae62e96..6e7284ea1468 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -52,6 +52,8 @@ struct bpf_reg_state {
 		 */
 		struct bpf_map *map_ptr;
 
+		u32 btf_id; /* for PTR_TO_BTF_ID */
+
 		/* Max size from any of the above. */
 		unsigned long raw;
 	};
@@ -399,6 +401,8 @@ __printf(2, 0) void bpf_verifier_vlog(struct bpf_verifier_log *log,
 				      const char *fmt, va_list args);
 __printf(2, 3) void bpf_verifier_log_write(struct bpf_verifier_env *env,
 					   const char *fmt, ...);
+__printf(2, 3) void bpf_log(struct bpf_verifier_log *log,
+			    const char *fmt, ...);
 
 static inline struct bpf_func_state *cur_func(struct bpf_verifier_env *env)
 {
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index ddeab1e8d21e..01f929566e8d 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3436,6 +3436,192 @@ struct btf *btf_parse_vmlinux(void)
 	return ERR_PTR(err);
 }
 
+extern struct btf *btf_vmlinux;
+
+bool btf_ctx_access(int off, int size, enum bpf_access_type type,
+		    const struct bpf_prog *prog,
+		    struct bpf_insn_access_aux *info)
+{
+	struct bpf_verifier_log *log = info->log;
+	u32 btf_id = prog->aux->attach_btf_id;
+	const struct btf_param *args;
+	const struct btf_type *t;
+	const char prefix[] = "btf_trace_";
+	const char *tname;
+	u32 nr_args, arg;
+
+	if (!btf_id)
+		return true;
+
+	if (IS_ERR(btf_vmlinux)) {
+		bpf_log(log, "btf_vmlinux is malformed\n");
+		return false;
+	}
+
+	t = btf_type_by_id(btf_vmlinux, btf_id);
+	if (!t || BTF_INFO_KIND(t->info) != BTF_KIND_TYPEDEF) {
+		bpf_log(log, "btf_id is invalid\n");
+		return false;
+	}
+
+	tname = __btf_name_by_offset(btf_vmlinux, t->name_off);
+	if (strncmp(prefix, tname, sizeof(prefix) - 1)) {
+		bpf_log(log, "btf_id points to wrong type name %s\n", tname);
+		return false;
+	}
+	tname += sizeof(prefix) - 1;
+
+	t = btf_type_by_id(btf_vmlinux, t->type);
+	if (!btf_type_is_ptr(t))
+		return false;
+	t = btf_type_by_id(btf_vmlinux, t->type);
+	if (!btf_type_is_func_proto(t))
+		return false;
+
+	if (off % 8) {
+		bpf_log(log, "raw_tp '%s' offset %d is not multiple of 8\n",
+			tname, off);
+		return false;
+	}
+	arg = off / 8;
+	args = (const struct btf_param *)(t + 1);
+	/* skip first 'void *__data' argument in btf_trace_##name typedef */
+	args++;
+	nr_args = btf_type_vlen(t) - 1;
+	if (arg >= nr_args) {
+		bpf_log(log, "raw_tp '%s' doesn't have %d-th argument\n",
+			tname, arg);
+		return false;
+	}
+
+	t = btf_type_by_id(btf_vmlinux, args[arg].type);
+	/* skip modifiers */
+	while (btf_type_is_modifier(t))
+		t = btf_type_by_id(btf_vmlinux, t->type);
+	if (btf_type_is_int(t))
+		/* accessing a scalar */
+		return true;
+	if (!btf_type_is_ptr(t)) {
+		bpf_log(log,
+			"raw_tp '%s' arg%d '%s' has type %s. Only pointer access is allowed\n",
+			tname, arg,
+			__btf_name_by_offset(btf_vmlinux, t->name_off),
+			btf_kind_str[BTF_INFO_KIND(t->info)]);
+		return false;
+	}
+	if (t->type == 0)
+		/* This is a pointer to void.
+		 * It is the same as scalar from the verifier safety pov.
+		 * No further pointer walking is allowed.
+		 */
+		return true;
+
+	/* this is a pointer to another type */
+	info->reg_type = PTR_TO_BTF_ID;
+	info->btf_id = t->type;
+
+	t = btf_type_by_id(btf_vmlinux, t->type);
+	/* skip modifiers */
+	while (btf_type_is_modifier(t))
+		t = btf_type_by_id(btf_vmlinux, t->type);
+	if (!btf_type_is_struct(t)) {
+		bpf_log(log,
+			"raw_tp '%s' arg%d type %s is not a struct\n",
+			tname, arg, btf_kind_str[BTF_INFO_KIND(t->info)]);
+		return false;
+	}
+	bpf_log(log, "raw_tp '%s' arg%d has btf_id %d type %s '%s'\n",
+		tname, arg, info->btf_id, btf_kind_str[BTF_INFO_KIND(t->info)],
+		__btf_name_by_offset(btf_vmlinux, t->name_off));
+	return true;
+}
+
+int btf_struct_access(struct bpf_verifier_log *log,
+		      const struct btf_type *t, int off, int size,
+		      enum bpf_access_type atype,
+		      u32 *next_btf_id)
+{
+	const struct btf_member *member;
+	const struct btf_type *mtype;
+	const char *tname, *mname;
+	int i, moff = 0, msize;
+
+again:
+	tname = __btf_name_by_offset(btf_vmlinux, t->name_off);
+	if (!btf_type_is_struct(t)) {
+		bpf_log(log, "Type '%s' is not a struct", tname);
+		return -EINVAL;
+	}
+
+	for_each_member(i, t, member) {
+		/* offset of the field in bits */
+		moff = btf_member_bit_offset(t, member);
+
+		if (off + size <= moff / 8)
+			/* won't find anything, field is already too far */
+			break;
+
+		/* type of the field */
+		mtype = btf_type_by_id(btf_vmlinux, member->type);
+		mname = __btf_name_by_offset(btf_vmlinux, member->name_off);
+
+		/* skip modifiers */
+		while (btf_type_is_modifier(mtype))
+			mtype = btf_type_by_id(btf_vmlinux, mtype->type);
+
+		if (btf_type_is_array(mtype))
+			/* array deref is not supported yet */
+			continue;
+
+		if (!btf_type_has_size(mtype) && !btf_type_is_ptr(mtype)) {
+			bpf_log(log, "field %s doesn't have size\n", mname);
+			return -EFAULT;
+		}
+		if (btf_type_is_ptr(mtype))
+			msize = 8;
+		else
+			msize = mtype->size;
+		if (off >= moff / 8 + msize)
+			/* no overlap with member, keep iterating */
+			continue;
+		/* the 'off' we're looking for is either equal to start
+		 * of this field or inside of this struct
+		 */
+		if (btf_type_is_struct(mtype)) {
+			/* our field must be inside that union or struct */
+			t = mtype;
+
+			/* adjust offset we're looking for */
+			off -= moff / 8;
+			goto again;
+		}
+		if (msize != size) {
+			/* field access size doesn't match */
+			bpf_log(log,
+				"cannot access %d bytes in struct %s field %s that has size %d\n",
+				size, tname, mname, msize);
+			return -EACCES;
+		}
+
+		if (btf_type_is_ptr(mtype)) {
+			const struct btf_type *stype;
+
+			stype = btf_type_by_id(btf_vmlinux, mtype->type);
+			/* skip modifiers */
+			while (btf_type_is_modifier(stype))
+				stype = btf_type_by_id(btf_vmlinux, stype->type);
+			if (btf_type_is_struct(stype)) {
+				*next_btf_id = mtype->type;
+				return PTR_TO_BTF_ID;
+			}
+		}
+		/* all other fields are treated as scalars */
+		return SCALAR_VALUE;
+	}
+	bpf_log(log, "struct %s doesn't have field at offset %d\n", tname, off);
+	return -EINVAL;
+}
+
 void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
 		       struct seq_file *m)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 051a355037bf..8246275704aa 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -286,6 +286,19 @@ __printf(2, 3) static void verbose(void *private_data, const char *fmt, ...)
 	va_end(args);
 }
 
+__printf(2, 3) void bpf_log(struct bpf_verifier_log *log,
+			    const char *fmt, ...)
+{
+	va_list args;
+
+	if (!bpf_verifier_log_needed(log))
+		return;
+
+	va_start(args, fmt);
+	bpf_verifier_vlog(log, fmt, args);
+	va_end(args);
+}
+
 static const char *ltrim(const char *s)
 {
 	while (isspace(*s))
@@ -406,6 +419,7 @@ static const char * const reg_type_str[] = {
 	[PTR_TO_TCP_SOCK_OR_NULL] = "tcp_sock_or_null",
 	[PTR_TO_TP_BUFFER]	= "tp_buffer",
 	[PTR_TO_XDP_SOCK]	= "xdp_sock",
+	[PTR_TO_BTF_ID]		= "ptr_",
 };
 
 static char slot_type_char[] = {
@@ -436,6 +450,12 @@ static struct bpf_func_state *func(struct bpf_verifier_env *env,
 	return cur->frame[reg->frameno];
 }
 
+const char *kernel_type_name(u32 id)
+{
+	return btf_name_by_offset(btf_vmlinux,
+				  btf_type_by_id(btf_vmlinux, id)->name_off);
+}
+
 static void print_verifier_state(struct bpf_verifier_env *env,
 				 const struct bpf_func_state *state)
 {
@@ -460,6 +480,8 @@ static void print_verifier_state(struct bpf_verifier_env *env,
 			/* reg->off should be 0 for SCALAR_VALUE */
 			verbose(env, "%lld", reg->var_off.value + reg->off);
 		} else {
+			if (t == PTR_TO_BTF_ID)
+				verbose(env, "%s", kernel_type_name(reg->btf_id));
 			verbose(env, "(id=%d", reg->id);
 			if (reg_type_may_be_refcounted_or_null(t))
 				verbose(env, ",ref_obj_id=%d", reg->ref_obj_id);
@@ -2337,10 +2359,12 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off,
 
 /* check access to 'struct bpf_context' fields.  Supports fixed offsets only */
 static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off, int size,
-			    enum bpf_access_type t, enum bpf_reg_type *reg_type)
+			    enum bpf_access_type t, enum bpf_reg_type *reg_type,
+			    u32 *btf_id)
 {
 	struct bpf_insn_access_aux info = {
 		.reg_type = *reg_type,
+		.log = &env->log,
 	};
 
 	if (env->ops->is_valid_access &&
@@ -2354,7 +2378,10 @@ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off,
 		 */
 		*reg_type = info.reg_type;
 
-		env->insn_aux_data[insn_idx].ctx_field_size = info.ctx_field_size;
+		if (*reg_type == PTR_TO_BTF_ID)
+			*btf_id = info.btf_id;
+		else
+			env->insn_aux_data[insn_idx].ctx_field_size = info.ctx_field_size;
 		/* remember the offset of last byte accessed in ctx */
 		if (env->prog->aux->max_ctx_offset < off + size)
 			env->prog->aux->max_ctx_offset = off + size;
@@ -2745,6 +2772,53 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
 	reg->smax_value = reg->umax_value;
 }
 
+static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
+				   struct bpf_reg_state *regs,
+				   int regno, int off, int size,
+				   enum bpf_access_type atype,
+				   int value_regno)
+{
+	struct bpf_reg_state *reg = regs + regno;
+	const struct btf_type *t = btf_type_by_id(btf_vmlinux, reg->btf_id);
+	const char *tname = btf_name_by_offset(btf_vmlinux, t->name_off);
+	u32 btf_id;
+	int ret;
+
+	if (atype != BPF_READ) {
+		verbose(env, "only read is supported\n");
+		return -EACCES;
+	}
+
+	if (off < 0) {
+		verbose(env,
+			"R%d is ptr_%s invalid negative access: off=%d\n",
+			regno, tname, off);
+		return -EACCES;
+	}
+	if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
+		char tn_buf[48];
+
+		tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
+		verbose(env,
+			"R%d is ptr_%s invalid variable offset: off=%d, var_off=%s\n",
+			regno, tname, off, tn_buf);
+		return -EACCES;
+	}
+
+	ret = btf_struct_access(&env->log, t, off, size, atype, &btf_id);
+	if (ret < 0)
+		return ret;
+
+	if (ret == SCALAR_VALUE) {
+		mark_reg_unknown(env, regs, value_regno);
+		return 0;
+	}
+	mark_reg_known_zero(env, regs, value_regno);
+	regs[value_regno].type = PTR_TO_BTF_ID;
+	regs[value_regno].btf_id = btf_id;
+	return 0;
+}
+
 /* check whether memory at (regno + off) is accessible for t = (read | write)
  * if t==write, value_regno is a register which value is stored into memory
  * if t==read, value_regno is a register which will receive the value from memory
@@ -2787,6 +2861,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 
 	} else if (reg->type == PTR_TO_CTX) {
 		enum bpf_reg_type reg_type = SCALAR_VALUE;
+		u32 btf_id = 0;
 
 		if (t == BPF_WRITE && value_regno >= 0 &&
 		    is_pointer_value(env, value_regno)) {
@@ -2798,7 +2873,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 		if (err < 0)
 			return err;
 
-		err = check_ctx_access(env, insn_idx, off, size, t, &reg_type);
+		err = check_ctx_access(env, insn_idx, off, size, t, &reg_type, &btf_id);
 		if (!err && t == BPF_READ && value_regno >= 0) {
 			/* ctx access returns either a scalar, or a
 			 * PTR_TO_PACKET[_META,_END]. In the latter
@@ -2817,6 +2892,8 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 				 * a sub-register.
 				 */
 				regs[value_regno].subreg_def = DEF_NOT_SUBREG;
+				if (reg_type == PTR_TO_BTF_ID)
+					regs[value_regno].btf_id = btf_id;
 			}
 			regs[value_regno].type = reg_type;
 		}
@@ -2876,6 +2953,9 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 		err = check_tp_buffer_access(env, reg, regno, off, size);
 		if (!err && t == BPF_READ && value_regno >= 0)
 			mark_reg_unknown(env, regs, value_regno);
+	} else if (reg->type == PTR_TO_BTF_ID) {
+		err = check_ptr_to_btf_access(env, regs, regno, off, size, t,
+					      value_regno);
 	} else {
 		verbose(env, "R%d invalid mem access '%s'\n", regno,
 			reg_type_str[reg->type]);
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 44bd08f2443b..6221e8c6ecc3 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1074,7 +1074,7 @@ static bool raw_tp_prog_is_valid_access(int off, int size,
 		return false;
 	if (off % size != 0)
 		return false;
-	return true;
+	return btf_ctx_access(off, size, type, prog, info);
 }
 
 const struct bpf_verifier_ops raw_tracepoint_verifier_ops = {
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 07/12] bpf: attach raw_tp program with BTF via type name
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
                   ` (5 preceding siblings ...)
  2019-10-10  4:14 ` [PATCH v2 bpf-next 06/12] bpf: implement accurate raw_tp context access via BTF Alexei Starovoitov
@ 2019-10-10  4:14 ` Alexei Starovoitov
  2019-10-11 18:44   ` Andrii Nakryiko
  2019-10-10  4:14 ` [PATCH v2 bpf-next 08/12] bpf: add support for BTF pointers to interpreter Alexei Starovoitov
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:14 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

BTF type id specified at program load time has all
necessary information to attach that program to raw tracepoint.
Use kernel type name to find raw tracepoint.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/bpf/syscall.c | 67 +++++++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 23 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b56c482c9760..03f36e73d84a 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1816,17 +1816,49 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
 	struct bpf_raw_tracepoint *raw_tp;
 	struct bpf_raw_event_map *btp;
 	struct bpf_prog *prog;
-	char tp_name[128];
+	const char *tp_name;
+	char buf[128];
 	int tp_fd, err;
 
-	if (strncpy_from_user(tp_name, u64_to_user_ptr(attr->raw_tracepoint.name),
-			      sizeof(tp_name) - 1) < 0)
-		return -EFAULT;
-	tp_name[sizeof(tp_name) - 1] = 0;
+	prog = bpf_prog_get(attr->raw_tracepoint.prog_fd);
+	if (IS_ERR(prog))
+		return PTR_ERR(prog);
+
+	if (prog->type != BPF_PROG_TYPE_RAW_TRACEPOINT &&
+	    prog->type != BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE) {
+		err = -EINVAL;
+		goto out_put_prog;
+	}
+
+	if (prog->type == BPF_PROG_TYPE_RAW_TRACEPOINT &&
+	    prog->aux->attach_btf_id) {
+		if (attr->raw_tracepoint.name) {
+			/* raw_tp name should not be specified in raw_tp
+			 * programs that were verified via in-kernel BTF info
+			 */
+			err = -EINVAL;
+			goto out_put_prog;
+		}
+		/* raw_tp name is taken from type name instead */
+		tp_name = kernel_type_name(prog->aux->attach_btf_id);
+		/* skip the prefix */
+		tp_name += sizeof("btf_trace_") - 1;
+	} else {
+		if (strncpy_from_user(buf,
+				      u64_to_user_ptr(attr->raw_tracepoint.name),
+				      sizeof(buf) - 1) < 0) {
+			err = -EFAULT;
+			goto out_put_prog;
+		}
+		buf[sizeof(buf) - 1] = 0;
+		tp_name = buf;
+	}
 
 	btp = bpf_get_raw_tracepoint(tp_name);
-	if (!btp)
-		return -ENOENT;
+	if (!btp) {
+		err = -ENOENT;
+		goto out_put_prog;
+	}
 
 	raw_tp = kzalloc(sizeof(*raw_tp), GFP_USER);
 	if (!raw_tp) {
@@ -1834,38 +1866,27 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
 		goto out_put_btp;
 	}
 	raw_tp->btp = btp;
-
-	prog = bpf_prog_get(attr->raw_tracepoint.prog_fd);
-	if (IS_ERR(prog)) {
-		err = PTR_ERR(prog);
-		goto out_free_tp;
-	}
-	if (prog->type != BPF_PROG_TYPE_RAW_TRACEPOINT &&
-	    prog->type != BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE) {
-		err = -EINVAL;
-		goto out_put_prog;
-	}
+	raw_tp->prog = prog;
 
 	err = bpf_probe_register(raw_tp->btp, prog);
 	if (err)
-		goto out_put_prog;
+		goto out_free_tp;
 
-	raw_tp->prog = prog;
 	tp_fd = anon_inode_getfd("bpf-raw-tracepoint", &bpf_raw_tp_fops, raw_tp,
 				 O_CLOEXEC);
 	if (tp_fd < 0) {
 		bpf_probe_unregister(raw_tp->btp, prog);
 		err = tp_fd;
-		goto out_put_prog;
+		goto out_free_tp;
 	}
 	return tp_fd;
 
-out_put_prog:
-	bpf_prog_put(prog);
 out_free_tp:
 	kfree(raw_tp);
 out_put_btp:
 	bpf_put_raw_tracepoint(btp);
+out_put_prog:
+	bpf_prog_put(prog);
 	return err;
 }
 
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 08/12] bpf: add support for BTF pointers to interpreter
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
                   ` (6 preceding siblings ...)
  2019-10-10  4:14 ` [PATCH v2 bpf-next 07/12] bpf: attach raw_tp program with BTF via type name Alexei Starovoitov
@ 2019-10-10  4:14 ` Alexei Starovoitov
  2019-10-10  4:15 ` [PATCH v2 bpf-next 09/12] bpf: add support for BTF pointers to x86 JIT Alexei Starovoitov
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:14 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

Pointer to BTF object is a pointer to kernel object or NULL.
The memory access in the interpreter has to be done via probe_kernel_read
to avoid page faults.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
---
 include/linux/filter.h |  3 +++
 kernel/bpf/core.c      | 19 +++++++++++++++++++
 kernel/bpf/verifier.c  |  8 ++++++++
 3 files changed, 30 insertions(+)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index d3d51d7aff2c..22ebea2e64ea 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -65,6 +65,9 @@ struct ctl_table_header;
 /* unused opcode to mark special call to bpf_tail_call() helper */
 #define BPF_TAIL_CALL	0xf0
 
+/* unused opcode to mark special load instruction. Same as BPF_ABS */
+#define BPF_PROBE_MEM	0x20
+
 /* unused opcode to mark call to interpreter with arguments */
 #define BPF_CALL_ARGS	0xe0
 
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 66088a9e9b9e..8a765bbd33f0 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1291,6 +1291,11 @@ bool bpf_opcode_in_insntable(u8 code)
 }
 
 #ifndef CONFIG_BPF_JIT_ALWAYS_ON
+u64 __weak bpf_probe_read(void * dst, u32 size, const void * unsafe_ptr)
+{
+	memset(dst, 0, size);
+	return -EFAULT;
+}
 /**
  *	__bpf_prog_run - run eBPF program on a given context
  *	@regs: is the array of MAX_BPF_EXT_REG eBPF pseudo-registers
@@ -1310,6 +1315,10 @@ static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u6
 		/* Non-UAPI available opcodes. */
 		[BPF_JMP | BPF_CALL_ARGS] = &&JMP_CALL_ARGS,
 		[BPF_JMP | BPF_TAIL_CALL] = &&JMP_TAIL_CALL,
+		[BPF_LDX | BPF_PROBE_MEM | BPF_B] = &&LDX_PROBE_MEM_B,
+		[BPF_LDX | BPF_PROBE_MEM | BPF_H] = &&LDX_PROBE_MEM_H,
+		[BPF_LDX | BPF_PROBE_MEM | BPF_W] = &&LDX_PROBE_MEM_W,
+		[BPF_LDX | BPF_PROBE_MEM | BPF_DW] = &&LDX_PROBE_MEM_DW,
 	};
 #undef BPF_INSN_3_LBL
 #undef BPF_INSN_2_LBL
@@ -1542,6 +1551,16 @@ static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u6
 	LDST(W,  u32)
 	LDST(DW, u64)
 #undef LDST
+#define LDX_PROBE(SIZEOP, SIZE)						\
+	LDX_PROBE_MEM_##SIZEOP:						\
+		bpf_probe_read(&DST, SIZE, (const void *)(long) SRC);	\
+		CONT;
+	LDX_PROBE(B,  1)
+	LDX_PROBE(H,  2)
+	LDX_PROBE(W,  4)
+	LDX_PROBE(DW, 8)
+#undef LDX_PROBE
+
 	STX_XADD_W: /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */
 		atomic_add((u32) SRC, (atomic_t *)(unsigned long)
 			   (DST + insn->off));
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8246275704aa..2ade5193b76c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7526,6 +7526,7 @@ static bool reg_type_mismatch_ok(enum bpf_reg_type type)
 	case PTR_TO_TCP_SOCK:
 	case PTR_TO_TCP_SOCK_OR_NULL:
 	case PTR_TO_XDP_SOCK:
+	case PTR_TO_BTF_ID:
 		return false;
 	default:
 		return true;
@@ -8667,6 +8668,13 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 		case PTR_TO_XDP_SOCK:
 			convert_ctx_access = bpf_xdp_sock_convert_ctx_access;
 			break;
+		case PTR_TO_BTF_ID:
+			if (type == BPF_WRITE) {
+				verbose(env, "Writes through BTF pointers are not allowed\n");
+				return -EINVAL;
+			}
+			insn->code = BPF_LDX | BPF_PROBE_MEM | BPF_SIZE((insn)->code);
+			continue;
 		default:
 			continue;
 		}
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 09/12] bpf: add support for BTF pointers to x86 JIT
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
                   ` (7 preceding siblings ...)
  2019-10-10  4:14 ` [PATCH v2 bpf-next 08/12] bpf: add support for BTF pointers to interpreter Alexei Starovoitov
@ 2019-10-10  4:15 ` Alexei Starovoitov
  2019-10-11 18:48   ` Andrii Nakryiko
  2019-10-10  4:15 ` [PATCH v2 bpf-next 10/12] bpf: check types of arguments passed into helpers Alexei Starovoitov
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:15 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

Pointer to BTF object is a pointer to kernel object or NULL.
Such pointers can only be used by BPF_LDX instructions.
The verifier changed their opcode from LDX|MEM|size
to LDX|PROBE_MEM|size to make JITing easier.
The number of entries in extable is the number of BPF_LDX insns
that access kernel memory via "pointer to BTF type".
Only these load instructions can fault.
Since x86 extable is relative it has to be allocated in the same
memory region as JITed code.
Allocate it prior to last pass of JITing and let the last pass populate it.
Pointer to extable in bpf_prog_aux is necessary to make page fault
handling fast.
Page fault handling is done in two steps:
1. bpf_prog_kallsyms_find() finds BPF program that page faulted.
   It's done by walking rb tree.
2. then extable for given bpf program is binary searched.
This process is similar to how page faulting is done for kernel modules.
The exception handler skips over faulting x86 instruction and
initializes destination register with zero. This mimics exact
behavior of bpf_probe_read (when probe_kernel_read faults dest is zeroed).

JITs for other architectures can add support in similar way.
Until then they will reject unknown opcode and fallback to interpreter.

Since extable should be aligned and placed near JITed code
make bpf_jit_binary_alloc() return 4 byte aligned image offset,
so that extable aligning formula in bpf_int_jit_compile() doesn't need
to rely on internal implementation of bpf_jit_binary_alloc().
On x86 gcc defaults to 16-byte alignment for regular kernel functions
due to better performance. JITed code may be aligned to 16 in the future,
but it will use 4 in the meantime.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 arch/x86/net/bpf_jit_comp.c | 97 +++++++++++++++++++++++++++++++++++--
 include/linux/bpf.h         |  3 ++
 include/linux/extable.h     | 10 ++++
 kernel/bpf/core.c           | 20 +++++++-
 kernel/bpf/verifier.c       |  1 +
 kernel/extable.c            |  2 +
 6 files changed, 128 insertions(+), 5 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 3ad2ba1ad855..8cd23d8309bf 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -9,7 +9,7 @@
 #include <linux/filter.h>
 #include <linux/if_vlan.h>
 #include <linux/bpf.h>
-
+#include <asm/extable.h>
 #include <asm/set_memory.h>
 #include <asm/nospec-branch.h>
 
@@ -123,6 +123,19 @@ static const int reg2hex[] = {
 	[AUX_REG] = 3,    /* R11 temp register */
 };
 
+static const int reg2pt_regs[] = {
+	[BPF_REG_0] = offsetof(struct pt_regs, ax),
+	[BPF_REG_1] = offsetof(struct pt_regs, di),
+	[BPF_REG_2] = offsetof(struct pt_regs, si),
+	[BPF_REG_3] = offsetof(struct pt_regs, dx),
+	[BPF_REG_4] = offsetof(struct pt_regs, cx),
+	[BPF_REG_5] = offsetof(struct pt_regs, r8),
+	[BPF_REG_6] = offsetof(struct pt_regs, bx),
+	[BPF_REG_7] = offsetof(struct pt_regs, r13),
+	[BPF_REG_8] = offsetof(struct pt_regs, r14),
+	[BPF_REG_9] = offsetof(struct pt_regs, r15),
+};
+
 /*
  * is_ereg() == true if BPF register 'reg' maps to x86-64 r8..r15
  * which need extra byte of encoding.
@@ -377,6 +390,19 @@ static void emit_mov_reg(u8 **pprog, bool is64, u32 dst_reg, u32 src_reg)
 	*pprog = prog;
 }
 
+
+static bool ex_handler_bpf(const struct exception_table_entry *x,
+			   struct pt_regs *regs, int trapnr,
+			   unsigned long error_code, unsigned long fault_addr)
+{
+	u32 reg = x->fixup >> 8;
+
+	/* jump over faulting load and clear dest register */
+	*(unsigned long *)((void *)regs + reg) = 0;
+	regs->ip += x->fixup & 0xff;
+	return true;
+}
+
 static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 		  int oldproglen, struct jit_context *ctx)
 {
@@ -384,7 +410,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 	int insn_cnt = bpf_prog->len;
 	bool seen_exit = false;
 	u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
-	int i, cnt = 0;
+	int i, cnt = 0, excnt = 0;
 	int proglen = 0;
 	u8 *prog = temp;
 
@@ -778,14 +804,17 @@ stx:			if (is_imm8(insn->off))
 
 			/* LDX: dst_reg = *(u8*)(src_reg + off) */
 		case BPF_LDX | BPF_MEM | BPF_B:
+		case BPF_LDX | BPF_PROBE_MEM | BPF_B:
 			/* Emit 'movzx rax, byte ptr [rax + off]' */
 			EMIT3(add_2mod(0x48, src_reg, dst_reg), 0x0F, 0xB6);
 			goto ldx;
 		case BPF_LDX | BPF_MEM | BPF_H:
+		case BPF_LDX | BPF_PROBE_MEM | BPF_H:
 			/* Emit 'movzx rax, word ptr [rax + off]' */
 			EMIT3(add_2mod(0x48, src_reg, dst_reg), 0x0F, 0xB7);
 			goto ldx;
 		case BPF_LDX | BPF_MEM | BPF_W:
+		case BPF_LDX | BPF_PROBE_MEM | BPF_W:
 			/* Emit 'mov eax, dword ptr [rax+0x14]' */
 			if (is_ereg(dst_reg) || is_ereg(src_reg))
 				EMIT2(add_2mod(0x40, src_reg, dst_reg), 0x8B);
@@ -793,6 +822,7 @@ stx:			if (is_imm8(insn->off))
 				EMIT1(0x8B);
 			goto ldx;
 		case BPF_LDX | BPF_MEM | BPF_DW:
+		case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
 			/* Emit 'mov rax, qword ptr [rax+0x14]' */
 			EMIT2(add_2mod(0x48, src_reg, dst_reg), 0x8B);
 ldx:			/*
@@ -805,6 +835,48 @@ stx:			if (is_imm8(insn->off))
 			else
 				EMIT1_off32(add_2reg(0x80, src_reg, dst_reg),
 					    insn->off);
+			if (BPF_MODE(insn->code) == BPF_PROBE_MEM) {
+				struct exception_table_entry *ex;
+				u8 *_insn = image + proglen;
+				s64 delta;
+
+				if (!bpf_prog->aux->extable)
+					break;
+
+				if (excnt >= bpf_prog->aux->num_exentries) {
+					pr_err("ex gen bug\n");
+					return -EFAULT;
+				}
+				ex = &bpf_prog->aux->extable[excnt++];
+
+				delta = _insn - (u8 *)&ex->insn;
+				if (!is_simm32(delta)) {
+					pr_err("extable->insn doesn't fit into 32-bit\n");
+					return -EFAULT;
+				}
+				ex->insn = delta;
+
+				delta = (u8 *)ex_handler_bpf - (u8 *)&ex->handler;
+				if (!is_simm32(delta)) {
+					pr_err("extable->handler doesn't fit into 32-bit\n");
+					return -EFAULT;
+				}
+				ex->handler = delta;
+
+				if (dst_reg > BPF_REG_9) {
+					pr_err("verifier error\n");
+					return -EFAULT;
+				}
+				/*
+				 * Compute size of x86 insn and its target dest x86 register.
+				 * ex_handler_bpf() will use lower 8 bits to adjust
+				 * pt_regs->ip to jump over this x86 instruction
+				 * and upper bits to figure out which pt_regs to zero out.
+				 * End result: x86 insn "mov rbx, qword ptr [rax+0x14]"
+				 * of 4 bytes will be ignored and rbx will be zero inited.
+				 */
+				ex->fixup = (prog - temp) | (reg2pt_regs[dst_reg] << 8);
+			}
 			break;
 
 			/* STX XADD: lock *(u32*)(dst_reg + off) += src_reg */
@@ -1058,6 +1130,11 @@ xadd:			if (is_imm8(insn->off))
 		addrs[i] = proglen;
 		prog = temp;
 	}
+
+	if (image && excnt != bpf_prog->aux->num_exentries) {
+		pr_err("extable is not populated\n");
+		return -EFAULT;
+	}
 	return proglen;
 }
 
@@ -1158,12 +1235,24 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 			break;
 		}
 		if (proglen == oldproglen) {
-			header = bpf_jit_binary_alloc(proglen, &image,
-						      1, jit_fill_hole);
+			/*
+			 * The number of entries in extable is the number of BPF_LDX
+			 * insns that access kernel memory via "pointer to BTF type".
+			 * The verifier changed their opcode from LDX|MEM|size
+			 * to LDX|PROBE_MEM|size to make JITing easier.
+			 */
+			u32 align = __alignof__(struct exception_table_entry);
+			u32 extable_size = prog->aux->num_exentries *
+				sizeof(struct exception_table_entry);
+
+			/* allocate module memory for x86 insns and extable */
+			header = bpf_jit_binary_alloc(roundup(proglen, align) + extable_size,
+						      &image, align, jit_fill_hole);
 			if (!header) {
 				prog = orig_prog;
 				goto out_addrs;
 			}
+			prog->aux->extable = (void *) image + roundup(proglen, align);
 		}
 		oldproglen = proglen;
 		cond_resched();
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 4218e269be59..6edfe50f1c2c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -24,6 +24,7 @@ struct sock;
 struct seq_file;
 struct btf;
 struct btf_type;
+struct exception_table_entry;
 
 extern struct idr btf_idr;
 extern spinlock_t btf_idr_lock;
@@ -423,6 +424,8 @@ struct bpf_prog_aux {
 	 * main prog always has linfo_idx == 0
 	 */
 	u32 linfo_idx;
+	u32 num_exentries;
+	struct exception_table_entry *extable;
 	struct bpf_prog_stats __percpu *stats;
 	union {
 		struct work_struct work;
diff --git a/include/linux/extable.h b/include/linux/extable.h
index 81ecfaa83ad3..4ab9e78f313b 100644
--- a/include/linux/extable.h
+++ b/include/linux/extable.h
@@ -33,4 +33,14 @@ search_module_extables(unsigned long addr)
 }
 #endif /*CONFIG_MODULES*/
 
+#ifdef CONFIG_BPF_JIT
+const struct exception_table_entry *search_bpf_extables(unsigned long addr);
+#else
+static inline const struct exception_table_entry *
+search_bpf_extables(unsigned long addr)
+{
+	return NULL;
+}
+#endif
+
 #endif /* _LINUX_EXTABLE_H */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 8a765bbd33f0..673f5d40a93e 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -30,7 +30,7 @@
 #include <linux/kallsyms.h>
 #include <linux/rcupdate.h>
 #include <linux/perf_event.h>
-
+#include <linux/extable.h>
 #include <asm/unaligned.h>
 
 /* Registers */
@@ -712,6 +712,24 @@ bool is_bpf_text_address(unsigned long addr)
 	return ret;
 }
 
+const struct exception_table_entry *search_bpf_extables(unsigned long addr)
+{
+	const struct exception_table_entry *e = NULL;
+	struct bpf_prog *prog;
+
+	rcu_read_lock();
+	prog = bpf_prog_kallsyms_find(addr);
+	if (!prog)
+		goto out;
+	if (!prog->aux->num_exentries)
+		goto out;
+
+	e = search_extable(prog->aux->extable, prog->aux->num_exentries, addr);
+out:
+	rcu_read_unlock();
+	return e;
+}
+
 int bpf_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
 		    char *sym)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2ade5193b76c..3404caa2f196 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8674,6 +8674,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 				return -EINVAL;
 			}
 			insn->code = BPF_LDX | BPF_PROBE_MEM | BPF_SIZE((insn)->code);
+			env->prog->aux->num_exentries++;
 			continue;
 		default:
 			continue;
diff --git a/kernel/extable.c b/kernel/extable.c
index f6c9406eec7d..f6920a11e28a 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -56,6 +56,8 @@ const struct exception_table_entry *search_exception_tables(unsigned long addr)
 	e = search_kernel_exception_table(addr);
 	if (!e)
 		e = search_module_extables(addr);
+	if (!e)
+		e = search_bpf_extables(addr);
 	return e;
 }
 
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 10/12] bpf: check types of arguments passed into helpers
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
                   ` (8 preceding siblings ...)
  2019-10-10  4:15 ` [PATCH v2 bpf-next 09/12] bpf: add support for BTF pointers to x86 JIT Alexei Starovoitov
@ 2019-10-10  4:15 ` Alexei Starovoitov
  2019-10-11 19:02   ` Andrii Nakryiko
  2019-10-10  4:15 ` [PATCH v2 bpf-next 11/12] bpf: disallow bpf_probe_read[_str] helpers Alexei Starovoitov
  2019-10-10  4:15 ` [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test Alexei Starovoitov
  11 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:15 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

Introduce new helper that reuses existing skb perf_event output
implementation, but can be called from raw_tracepoint programs
that receive 'struct sk_buff *' as tracepoint argument or
can walk other kernel data structures to skb pointer.

In order to do that teach verifier to resolve true C types
of bpf helpers into in-kernel BTF ids.
The type of kernel pointer passed by raw tracepoint into bpf
program will be tracked by the verifier all the way until
it's passed into helper function.
For example:
kfree_skb() kernel function calls trace_kfree_skb(skb, loc);
bpf programs receives that skb pointer and may eventually
pass it into bpf_skb_output() bpf helper which in-kernel is
implemented via bpf_skb_event_output() kernel function.
Its first argument in the kernel is 'struct sk_buff *'.
The verifier makes sure that types match all the way.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h            | 18 ++++++---
 include/uapi/linux/bpf.h       | 27 +++++++++++++-
 kernel/bpf/btf.c               | 68 ++++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c          | 44 ++++++++++++++--------
 kernel/trace/bpf_trace.c       |  4 ++
 net/core/filter.c              | 15 +++++++-
 tools/include/uapi/linux/bpf.h | 27 +++++++++++++-
 7 files changed, 180 insertions(+), 23 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6edfe50f1c2c..d3df073f374a 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -213,6 +213,7 @@ enum bpf_arg_type {
 	ARG_PTR_TO_INT,		/* pointer to int */
 	ARG_PTR_TO_LONG,	/* pointer to long */
 	ARG_PTR_TO_SOCKET,	/* pointer to bpf_sock (fullsock) */
+	ARG_PTR_TO_BTF_ID,	/* pointer to in-kernel struct */
 };
 
 /* type of values returned from helper functions */
@@ -235,11 +236,17 @@ struct bpf_func_proto {
 	bool gpl_only;
 	bool pkt_access;
 	enum bpf_return_type ret_type;
-	enum bpf_arg_type arg1_type;
-	enum bpf_arg_type arg2_type;
-	enum bpf_arg_type arg3_type;
-	enum bpf_arg_type arg4_type;
-	enum bpf_arg_type arg5_type;
+	union {
+		struct {
+			enum bpf_arg_type arg1_type;
+			enum bpf_arg_type arg2_type;
+			enum bpf_arg_type arg3_type;
+			enum bpf_arg_type arg4_type;
+			enum bpf_arg_type arg5_type;
+		};
+		enum bpf_arg_type arg_type[5];
+	};
+	u32 *btf_id; /* BTF ids of arguments */
 };
 
 /* bpf_context is intentionally undefined structure. Pointer to bpf_context is
@@ -765,6 +772,7 @@ int btf_struct_access(struct bpf_verifier_log *log,
 		      const struct btf_type *t, int off, int size,
 		      enum bpf_access_type atype,
 		      u32 *next_btf_id);
+u32 btf_resolve_helper_id(struct bpf_verifier_log *log, void *, int);
 
 #else /* !CONFIG_BPF_SYSCALL */
 static inline struct bpf_prog *bpf_prog_get(u32 ufd)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 3bb2cd1de341..b0454440186f 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2751,6 +2751,30 @@ union bpf_attr {
  *		**-EOPNOTSUPP** kernel configuration does not enable SYN cookies
  *
  *		**-EPROTONOSUPPORT** IP packet version is not 4 or 6
+ *
+ * int bpf_skb_output(void *ctx, struct bpf_map *map, u64 flags, void *data, u64 size)
+ * 	Description
+ * 		Write raw *data* blob into a special BPF perf event held by
+ * 		*map* of type **BPF_MAP_TYPE_PERF_EVENT_ARRAY**. This perf
+ * 		event must have the following attributes: **PERF_SAMPLE_RAW**
+ * 		as **sample_type**, **PERF_TYPE_SOFTWARE** as **type**, and
+ * 		**PERF_COUNT_SW_BPF_OUTPUT** as **config**.
+ *
+ * 		The *flags* are used to indicate the index in *map* for which
+ * 		the value must be put, masked with **BPF_F_INDEX_MASK**.
+ * 		Alternatively, *flags* can be set to **BPF_F_CURRENT_CPU**
+ * 		to indicate that the index of the current CPU core should be
+ * 		used.
+ *
+ * 		The value to write, of *size*, is passed through eBPF stack and
+ * 		pointed by *data*.
+ *
+ * 		*ctx* is a pointer to in-kernel sutrct sk_buff.
+ *
+ * 		This helper is similar to **bpf_perf_event_output**\ () but
+ * 		restricted to raw_tracepoint bpf programs.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2863,7 +2887,8 @@ union bpf_attr {
 	FN(sk_storage_get),		\
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
-	FN(tcp_gen_syncookie),
+	FN(tcp_gen_syncookie),		\
+	FN(skb_output),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 01f929566e8d..45b71a73356d 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3622,6 +3622,74 @@ int btf_struct_access(struct bpf_verifier_log *log,
 	return -EINVAL;
 }
 
+u32 btf_resolve_helper_id(struct bpf_verifier_log *log, void *fn, int arg)
+{
+	char fnname[KSYM_SYMBOL_LEN + 4] = "btf_";
+	const struct btf_param *args;
+	const struct btf_type *t;
+	const char *tname, *sym;
+	u32 btf_id, i;
+
+	if (IS_ERR(btf_vmlinux)) {
+		bpf_log(log, "btf_vmlinux is malformed\n");
+		return -EINVAL;
+	}
+
+	sym = kallsyms_lookup((long)fn, NULL, NULL, NULL, fnname + 4);
+	if (!sym) {
+		bpf_log(log, "kernel doesn't have kallsyms\n");
+		return -EFAULT;
+	}
+
+	for (i = 1; i <= btf_vmlinux->nr_types; i++) {
+		t = btf_type_by_id(btf_vmlinux, i);
+		if (BTF_INFO_KIND(t->info) != BTF_KIND_TYPEDEF)
+			continue;
+		tname = __btf_name_by_offset(btf_vmlinux, t->name_off);
+		if (!strcmp(tname, fnname))
+			break;
+	}
+	if (i > btf_vmlinux->nr_types) {
+		bpf_log(log, "helper %s type is not found\n", fnname);
+		return -ENOENT;
+	}
+
+	t = btf_type_by_id(btf_vmlinux, t->type);
+	if (!btf_type_is_ptr(t))
+		return -EFAULT;
+	t = btf_type_by_id(btf_vmlinux, t->type);
+	if (!btf_type_is_func_proto(t))
+		return -EFAULT;
+
+	args = (const struct btf_param *)(t + 1);
+	if (arg >= btf_type_vlen(t)) {
+		bpf_log(log, "bpf helper %s doesn't have %d-th argument\n",
+			fnname, arg);
+		return -EINVAL;
+	}
+
+	t = btf_type_by_id(btf_vmlinux, args[arg].type);
+	if (!btf_type_is_ptr(t) || !t->type) {
+		/* anything but the pointer to struct is a helper config bug */
+		bpf_log(log, "ARG_PTR_TO_BTF is misconfigured\n");
+		return -EFAULT;
+	}
+	btf_id = t->type;
+	t = btf_type_by_id(btf_vmlinux, t->type);
+	/* skip modifiers */
+	while (btf_type_is_modifier(t)) {
+		btf_id = t->type;
+		t = btf_type_by_id(btf_vmlinux, t->type);
+	}
+	if (!btf_type_is_struct(t)) {
+		bpf_log(log, "ARG_PTR_TO_BTF is not a struct\n");
+		return -EFAULT;
+	}
+	bpf_log(log, "helper %s arg%d has btf_id %d struct %s\n", fnname + 4,
+		arg, btf_id, __btf_name_by_offset(btf_vmlinux, t->name_off));
+	return btf_id;
+}
+
 void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
 		       struct seq_file *m)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3404caa2f196..d04eb66b815a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -205,6 +205,7 @@ struct bpf_call_arg_meta {
 	u64 msize_umax_value;
 	int ref_obj_id;
 	int func_id;
+	u32 btf_id;
 };
 
 struct btf *btf_vmlinux;
@@ -3384,6 +3385,22 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 		expected_type = PTR_TO_SOCKET;
 		if (type != expected_type)
 			goto err_type;
+	} else if (arg_type == ARG_PTR_TO_BTF_ID) {
+		expected_type = PTR_TO_BTF_ID;
+		if (type != expected_type)
+			goto err_type;
+		if (reg->btf_id != meta->btf_id) {
+			verbose(env, "Helper has type %s got %s in R%d\n",
+				kernel_type_name(meta->btf_id),
+				kernel_type_name(reg->btf_id), regno);
+
+			return -EACCES;
+		}
+		if (!tnum_is_const(reg->var_off) || reg->var_off.value || reg->off) {
+			verbose(env, "R%d is a pointer to in-kernel struct with non-zero offset\n",
+				regno);
+			return -EACCES;
+		}
 	} else if (arg_type == ARG_PTR_TO_SPIN_LOCK) {
 		if (meta->func_id == BPF_FUNC_spin_lock) {
 			if (process_spin_lock(env, regno, true))
@@ -3531,6 +3548,7 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 	case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
 		if (func_id != BPF_FUNC_perf_event_read &&
 		    func_id != BPF_FUNC_perf_event_output &&
+		    func_id != BPF_FUNC_skb_output &&
 		    func_id != BPF_FUNC_perf_event_read_value)
 			goto error;
 		break;
@@ -3618,6 +3636,7 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 	case BPF_FUNC_perf_event_read:
 	case BPF_FUNC_perf_event_output:
 	case BPF_FUNC_perf_event_read_value:
+	case BPF_FUNC_skb_output:
 		if (map->map_type != BPF_MAP_TYPE_PERF_EVENT_ARRAY)
 			goto error;
 		break;
@@ -4072,21 +4091,16 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 
 	meta.func_id = func_id;
 	/* check args */
-	err = check_func_arg(env, BPF_REG_1, fn->arg1_type, &meta);
-	if (err)
-		return err;
-	err = check_func_arg(env, BPF_REG_2, fn->arg2_type, &meta);
-	if (err)
-		return err;
-	err = check_func_arg(env, BPF_REG_3, fn->arg3_type, &meta);
-	if (err)
-		return err;
-	err = check_func_arg(env, BPF_REG_4, fn->arg4_type, &meta);
-	if (err)
-		return err;
-	err = check_func_arg(env, BPF_REG_5, fn->arg5_type, &meta);
-	if (err)
-		return err;
+	for (i = 0; i < 5; i++) {
+		if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID) {
+			if (!fn->btf_id[i])
+				fn->btf_id[i] = btf_resolve_helper_id(&env->log, fn->func, 0);
+			meta.btf_id = fn->btf_id[i];
+		}
+		err = check_func_arg(env, BPF_REG_1 + i, fn->arg_type[i], &meta);
+		if (err)
+			return err;
+	}
 
 	err = record_func_map(env, &meta, func_id, insn_idx);
 	if (err)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 6221e8c6ecc3..52f7e9d8c29b 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -995,6 +995,8 @@ static const struct bpf_func_proto bpf_perf_event_output_proto_raw_tp = {
 	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
 };
 
+extern const struct bpf_func_proto bpf_skb_output_proto;
+
 BPF_CALL_3(bpf_get_stackid_raw_tp, struct bpf_raw_tracepoint_args *, args,
 	   struct bpf_map *, map, u64, flags)
 {
@@ -1053,6 +1055,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	switch (func_id) {
 	case BPF_FUNC_perf_event_output:
 		return &bpf_perf_event_output_proto_raw_tp;
+	case BPF_FUNC_skb_output:
+		return &bpf_skb_output_proto;
 	case BPF_FUNC_get_stackid:
 		return &bpf_get_stackid_proto_raw_tp;
 	case BPF_FUNC_get_stack:
diff --git a/net/core/filter.c b/net/core/filter.c
index ed6563622ce3..c48fe0971b25 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3798,7 +3798,7 @@ BPF_CALL_5(bpf_skb_event_output, struct sk_buff *, skb, struct bpf_map *, map,
 
 	if (unlikely(flags & ~(BPF_F_CTXLEN_MASK | BPF_F_INDEX_MASK)))
 		return -EINVAL;
-	if (unlikely(skb_size > skb->len))
+	if (unlikely(!skb || skb_size > skb->len))
 		return -EFAULT;
 
 	return bpf_event_output(map, flags, meta, meta_size, skb, skb_size,
@@ -3816,6 +3816,19 @@ static const struct bpf_func_proto bpf_skb_event_output_proto = {
 	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
 };
 
+static u32 bpf_skb_output_btf_ids[5];
+const struct bpf_func_proto bpf_skb_output_proto = {
+	.func		= bpf_skb_event_output,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_BTF_ID,
+	.arg2_type	= ARG_CONST_MAP_PTR,
+	.arg3_type	= ARG_ANYTHING,
+	.arg4_type	= ARG_PTR_TO_MEM,
+	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
+	.btf_id		= bpf_skb_output_btf_ids,
+};
+
 static unsigned short bpf_tunnel_key_af(u64 flags)
 {
 	return flags & BPF_F_TUNINFO_IPV6 ? AF_INET6 : AF_INET;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 3bb2cd1de341..b0454440186f 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2751,6 +2751,30 @@ union bpf_attr {
  *		**-EOPNOTSUPP** kernel configuration does not enable SYN cookies
  *
  *		**-EPROTONOSUPPORT** IP packet version is not 4 or 6
+ *
+ * int bpf_skb_output(void *ctx, struct bpf_map *map, u64 flags, void *data, u64 size)
+ * 	Description
+ * 		Write raw *data* blob into a special BPF perf event held by
+ * 		*map* of type **BPF_MAP_TYPE_PERF_EVENT_ARRAY**. This perf
+ * 		event must have the following attributes: **PERF_SAMPLE_RAW**
+ * 		as **sample_type**, **PERF_TYPE_SOFTWARE** as **type**, and
+ * 		**PERF_COUNT_SW_BPF_OUTPUT** as **config**.
+ *
+ * 		The *flags* are used to indicate the index in *map* for which
+ * 		the value must be put, masked with **BPF_F_INDEX_MASK**.
+ * 		Alternatively, *flags* can be set to **BPF_F_CURRENT_CPU**
+ * 		to indicate that the index of the current CPU core should be
+ * 		used.
+ *
+ * 		The value to write, of *size*, is passed through eBPF stack and
+ * 		pointed by *data*.
+ *
+ * 		*ctx* is a pointer to in-kernel sutrct sk_buff.
+ *
+ * 		This helper is similar to **bpf_perf_event_output**\ () but
+ * 		restricted to raw_tracepoint bpf programs.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2863,7 +2887,8 @@ union bpf_attr {
 	FN(sk_storage_get),		\
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
-	FN(tcp_gen_syncookie),
+	FN(tcp_gen_syncookie),		\
+	FN(skb_output),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 11/12] bpf: disallow bpf_probe_read[_str] helpers
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
                   ` (9 preceding siblings ...)
  2019-10-10  4:15 ` [PATCH v2 bpf-next 10/12] bpf: check types of arguments passed into helpers Alexei Starovoitov
@ 2019-10-10  4:15 ` Alexei Starovoitov
  2019-10-11 19:03   ` Andrii Nakryiko
  2019-10-10  4:15 ` [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test Alexei Starovoitov
  11 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:15 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

Disallow bpf_probe_read() and bpf_probe_read_str() helpers in
raw_tracepoint bpf programs that use in-kernel BTF to track
types of memory accesses.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/trace/bpf_trace.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 52f7e9d8c29b..fa5743abf842 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -700,6 +700,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_map_peek_elem:
 		return &bpf_map_peek_elem_proto;
 	case BPF_FUNC_probe_read:
+		if (prog->aux->attach_btf_id)
+			return NULL;
 		return &bpf_probe_read_proto;
 	case BPF_FUNC_ktime_get_ns:
 		return &bpf_ktime_get_ns_proto;
@@ -728,6 +730,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_get_prandom_u32:
 		return &bpf_get_prandom_u32_proto;
 	case BPF_FUNC_probe_read_str:
+		if (prog->aux->attach_btf_id)
+			return NULL;
 		return &bpf_probe_read_str_proto;
 #ifdef CONFIG_CGROUPS
 	case BPF_FUNC_get_current_cgroup_id:
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test
  2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
                   ` (10 preceding siblings ...)
  2019-10-10  4:15 ` [PATCH v2 bpf-next 11/12] bpf: disallow bpf_probe_read[_str] helpers Alexei Starovoitov
@ 2019-10-10  4:15 ` Alexei Starovoitov
  2019-10-10 11:07   ` Ido Schimmel
  2019-10-11 19:05   ` Andrii Nakryiko
  11 siblings, 2 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10  4:15 UTC (permalink / raw)
  To: davem; +Cc: daniel, x86, netdev, bpf, kernel-team

Load basic cls_bpf program.
Load raw_tracepoint program and attach to kfree_skb raw tracepoint.
Trigger cls_bpf via prog_test_run.
At the end of test_run kernel will call kfree_skb
which will trigger trace_kfree_skb tracepoint.
Which will call our raw_tracepoint program.
Which will take that skb and will dump it into perf ring buffer.
Check that user space received correct packet.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
---
 .../selftests/bpf/prog_tests/kfree_skb.c      | 90 +++++++++++++++++++
 tools/testing/selftests/bpf/progs/kfree_skb.c | 74 +++++++++++++++
 2 files changed, 164 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/kfree_skb.c
 create mode 100644 tools/testing/selftests/bpf/progs/kfree_skb.c

diff --git a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
new file mode 100644
index 000000000000..0cf91b6bf276
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
@@ -0,0 +1,90 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+
+static void on_sample(void *ctx, int cpu, void *data, __u32 size)
+{
+	int ifindex = *(int *)data, duration = 0;
+	struct ipv6_packet *pkt_v6 = data + 4;
+
+	if (ifindex != 1)
+		/* spurious kfree_skb not on loopback device */
+		return;
+	if (CHECK(size != 76, "check_size", "size %u != 76\n", size))
+		return;
+	if (CHECK(pkt_v6->eth.h_proto != 0xdd86, "check_eth",
+		  "h_proto %x\n", pkt_v6->eth.h_proto))
+		return;
+	if (CHECK(pkt_v6->iph.nexthdr != 6, "check_ip",
+		  "iph.nexthdr %x\n", pkt_v6->iph.nexthdr))
+		return;
+	if (CHECK(pkt_v6->tcp.doff != 5, "check_tcp",
+		  "tcp.doff %x\n", pkt_v6->tcp.doff))
+		return;
+
+	*(bool *)ctx = true;
+}
+
+void test_kfree_skb(void)
+{
+	struct bpf_prog_load_attr attr = {
+		.file = "./kfree_skb.o",
+		.log_level = 2,
+	};
+
+	struct bpf_object *obj, *obj2 = NULL;
+	struct perf_buffer_opts pb_opts = {};
+	struct perf_buffer *pb = NULL;
+	struct bpf_link *link = NULL;
+	struct bpf_map *perf_buf_map;
+	struct bpf_program *prog;
+	__u32 duration, retval;
+	int err, pkt_fd, kfree_skb_fd;
+	bool passed = false;
+
+	err = bpf_prog_load("./test_pkt_access.o", BPF_PROG_TYPE_SCHED_CLS, &obj, &pkt_fd);
+	if (CHECK(err, "prog_load sched cls", "err %d errno %d\n", err, errno))
+		return;
+
+	err = bpf_prog_load_xattr(&attr, &obj2, &kfree_skb_fd);
+	if (CHECK(err, "prog_load raw tp", "err %d errno %d\n", err, errno))
+		goto close_prog;
+
+	prog = bpf_object__find_program_by_title(obj2, "raw_tracepoint/kfree_skb");
+	if (CHECK(!prog, "find_prog", "prog kfree_skb not found\n"))
+		goto close_prog;
+	link = bpf_program__attach_raw_tracepoint(prog, NULL);
+	if (CHECK(IS_ERR(link), "attach_raw_tp", "err %ld\n", PTR_ERR(link)))
+		goto close_prog;
+
+	perf_buf_map = bpf_object__find_map_by_name(obj2, "perf_buf_map");
+	if (CHECK(!perf_buf_map, "find_perf_buf_map", "not found\n"))
+		goto close_prog;
+
+	/* set up perf buffer */
+	pb_opts.sample_cb = on_sample;
+	pb_opts.ctx = &passed;
+	pb = perf_buffer__new(bpf_map__fd(perf_buf_map), 1, &pb_opts);
+	if (CHECK(IS_ERR(pb), "perf_buf__new", "err %ld\n", PTR_ERR(pb)))
+		goto close_prog;
+
+	err = bpf_prog_test_run(pkt_fd, 1, &pkt_v6, sizeof(pkt_v6),
+				NULL, NULL, &retval, &duration);
+	CHECK(err || retval, "ipv6",
+	      "err %d errno %d retval %d duration %d\n",
+	      err, errno, retval, duration);
+
+	/* read perf buffer */
+	err = perf_buffer__poll(pb, 100);
+	if (CHECK(err < 0, "perf_buffer__poll", "err %d\n", err))
+		goto close_prog;
+	/* make sure kfree_skb program was triggered
+	 * and it sent expected skb into ring buffer
+	 */
+	CHECK_FAIL(!passed);
+close_prog:
+	perf_buffer__free(pb);
+	if (!IS_ERR_OR_NULL(link))
+		bpf_link__destroy(link);
+	bpf_object__close(obj);
+	bpf_object__close(obj2);
+}
diff --git a/tools/testing/selftests/bpf/progs/kfree_skb.c b/tools/testing/selftests/bpf/progs/kfree_skb.c
new file mode 100644
index 000000000000..fc25797cc64d
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/kfree_skb.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2019 Facebook
+#include <linux/bpf.h>
+#include "bpf_helpers.h"
+
+char _license[] SEC("license") = "GPL";
+struct {
+	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
+	__uint(key_size, sizeof(int));
+	__uint(value_size, sizeof(int));
+} perf_buf_map SEC(".maps");
+
+#define _(P) (__builtin_preserve_access_index(P))
+
+/* define few struct-s that bpf program needs to access */
+struct callback_head {
+	struct callback_head *next;
+	void (*func)(struct callback_head *head);
+};
+struct dev_ifalias {
+	struct callback_head rcuhead;
+};
+
+struct net_device /* same as kernel's struct net_device */ {
+	int ifindex;
+	struct dev_ifalias *ifalias;
+};
+
+struct sk_buff {
+	/* field names and sizes should match to those in the kernel */
+	unsigned int len, data_len;
+	__u16 mac_len, hdr_len, queue_mapping;
+	struct net_device *dev;
+	/* order of the fields doesn't matter */
+};
+
+/* copy arguments from
+ * include/trace/events/skb.h:
+ * TRACE_EVENT(kfree_skb,
+ *         TP_PROTO(struct sk_buff *skb, void *location),
+ *
+ * into struct below:
+ */
+struct trace_kfree_skb {
+	struct sk_buff *skb;
+	void *location;
+};
+
+SEC("raw_tracepoint/kfree_skb")
+int trace_kfree_skb(struct trace_kfree_skb *ctx)
+{
+	struct sk_buff *skb = ctx->skb;
+	struct net_device *dev;
+	int ifindex;
+	struct callback_head *ptr;
+	void *func;
+
+	__builtin_preserve_access_index(({
+		dev = skb->dev;
+		ifindex = dev->ifindex;
+		ptr = dev->ifalias->rcuhead.next;
+		func = ptr->func;
+	}));
+
+	bpf_printk("rcuhead.next %llx func %llx\n", ptr, func);
+	bpf_printk("skb->len %d\n", _(skb->len));
+	bpf_printk("skb->queue_mapping %d\n", _(skb->queue_mapping));
+	bpf_printk("dev->ifindex %d\n", ifindex);
+
+	/* send first 72 byte of the packet to user space */
+	bpf_skb_output(skb, &perf_buf_map, (72ull << 32) | BPF_F_CURRENT_CPU,
+		       &ifindex, sizeof(ifindex));
+	return 0;
+}
-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test
  2019-10-10  4:15 ` [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test Alexei Starovoitov
@ 2019-10-10 11:07   ` Ido Schimmel
  2019-10-10 19:07     ` Alexei Starovoitov
  2019-10-11 19:05   ` Andrii Nakryiko
  1 sibling, 1 reply; 33+ messages in thread
From: Ido Schimmel @ 2019-10-10 11:07 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: davem, daniel, x86, netdev, bpf, kernel-team

On Wed, Oct 09, 2019 at 09:15:03PM -0700, Alexei Starovoitov wrote:
> Load basic cls_bpf program.
> Load raw_tracepoint program and attach to kfree_skb raw tracepoint.
> Trigger cls_bpf via prog_test_run.
> At the end of test_run kernel will call kfree_skb
> which will trigger trace_kfree_skb tracepoint.
> Which will call our raw_tracepoint program.
> Which will take that skb and will dump it into perf ring buffer.
> Check that user space received correct packet.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  .../selftests/bpf/prog_tests/kfree_skb.c      | 90 +++++++++++++++++++
>  tools/testing/selftests/bpf/progs/kfree_skb.c | 74 +++++++++++++++
>  2 files changed, 164 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/kfree_skb.c
>  create mode 100644 tools/testing/selftests/bpf/progs/kfree_skb.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
> new file mode 100644
> index 000000000000..0cf91b6bf276
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
> @@ -0,0 +1,90 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <test_progs.h>
> +
> +static void on_sample(void *ctx, int cpu, void *data, __u32 size)
> +{
> +	int ifindex = *(int *)data, duration = 0;
> +	struct ipv6_packet *pkt_v6 = data + 4;
> +
> +	if (ifindex != 1)
> +		/* spurious kfree_skb not on loopback device */
> +		return;
> +	if (CHECK(size != 76, "check_size", "size %u != 76\n", size))
> +		return;
> +	if (CHECK(pkt_v6->eth.h_proto != 0xdd86, "check_eth",
> +		  "h_proto %x\n", pkt_v6->eth.h_proto))
> +		return;
> +	if (CHECK(pkt_v6->iph.nexthdr != 6, "check_ip",
> +		  "iph.nexthdr %x\n", pkt_v6->iph.nexthdr))
> +		return;
> +	if (CHECK(pkt_v6->tcp.doff != 5, "check_tcp",
> +		  "tcp.doff %x\n", pkt_v6->tcp.doff))
> +		return;
> +
> +	*(bool *)ctx = true;
> +}
> +
> +void test_kfree_skb(void)
> +{
> +	struct bpf_prog_load_attr attr = {
> +		.file = "./kfree_skb.o",
> +		.log_level = 2,
> +	};
> +
> +	struct bpf_object *obj, *obj2 = NULL;
> +	struct perf_buffer_opts pb_opts = {};
> +	struct perf_buffer *pb = NULL;
> +	struct bpf_link *link = NULL;
> +	struct bpf_map *perf_buf_map;
> +	struct bpf_program *prog;
> +	__u32 duration, retval;
> +	int err, pkt_fd, kfree_skb_fd;
> +	bool passed = false;
> +
> +	err = bpf_prog_load("./test_pkt_access.o", BPF_PROG_TYPE_SCHED_CLS, &obj, &pkt_fd);
> +	if (CHECK(err, "prog_load sched cls", "err %d errno %d\n", err, errno))
> +		return;
> +
> +	err = bpf_prog_load_xattr(&attr, &obj2, &kfree_skb_fd);
> +	if (CHECK(err, "prog_load raw tp", "err %d errno %d\n", err, errno))
> +		goto close_prog;
> +
> +	prog = bpf_object__find_program_by_title(obj2, "raw_tracepoint/kfree_skb");
> +	if (CHECK(!prog, "find_prog", "prog kfree_skb not found\n"))
> +		goto close_prog;
> +	link = bpf_program__attach_raw_tracepoint(prog, NULL);
> +	if (CHECK(IS_ERR(link), "attach_raw_tp", "err %ld\n", PTR_ERR(link)))
> +		goto close_prog;
> +
> +	perf_buf_map = bpf_object__find_map_by_name(obj2, "perf_buf_map");
> +	if (CHECK(!perf_buf_map, "find_perf_buf_map", "not found\n"))
> +		goto close_prog;
> +
> +	/* set up perf buffer */
> +	pb_opts.sample_cb = on_sample;
> +	pb_opts.ctx = &passed;
> +	pb = perf_buffer__new(bpf_map__fd(perf_buf_map), 1, &pb_opts);
> +	if (CHECK(IS_ERR(pb), "perf_buf__new", "err %ld\n", PTR_ERR(pb)))
> +		goto close_prog;
> +
> +	err = bpf_prog_test_run(pkt_fd, 1, &pkt_v6, sizeof(pkt_v6),
> +				NULL, NULL, &retval, &duration);
> +	CHECK(err || retval, "ipv6",
> +	      "err %d errno %d retval %d duration %d\n",
> +	      err, errno, retval, duration);
> +
> +	/* read perf buffer */
> +	err = perf_buffer__poll(pb, 100);
> +	if (CHECK(err < 0, "perf_buffer__poll", "err %d\n", err))
> +		goto close_prog;
> +	/* make sure kfree_skb program was triggered
> +	 * and it sent expected skb into ring buffer
> +	 */
> +	CHECK_FAIL(!passed);
> +close_prog:
> +	perf_buffer__free(pb);
> +	if (!IS_ERR_OR_NULL(link))
> +		bpf_link__destroy(link);
> +	bpf_object__close(obj);
> +	bpf_object__close(obj2);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/kfree_skb.c b/tools/testing/selftests/bpf/progs/kfree_skb.c
> new file mode 100644
> index 000000000000..fc25797cc64d
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/kfree_skb.c
> @@ -0,0 +1,74 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2019 Facebook
> +#include <linux/bpf.h>
> +#include "bpf_helpers.h"
> +
> +char _license[] SEC("license") = "GPL";
> +struct {
> +	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
> +	__uint(key_size, sizeof(int));
> +	__uint(value_size, sizeof(int));
> +} perf_buf_map SEC(".maps");
> +
> +#define _(P) (__builtin_preserve_access_index(P))
> +
> +/* define few struct-s that bpf program needs to access */
> +struct callback_head {
> +	struct callback_head *next;
> +	void (*func)(struct callback_head *head);
> +};
> +struct dev_ifalias {
> +	struct callback_head rcuhead;
> +};
> +
> +struct net_device /* same as kernel's struct net_device */ {
> +	int ifindex;
> +	struct dev_ifalias *ifalias;
> +};
> +
> +struct sk_buff {
> +	/* field names and sizes should match to those in the kernel */
> +	unsigned int len, data_len;
> +	__u16 mac_len, hdr_len, queue_mapping;
> +	struct net_device *dev;
> +	/* order of the fields doesn't matter */
> +};
> +
> +/* copy arguments from
> + * include/trace/events/skb.h:
> + * TRACE_EVENT(kfree_skb,
> + *         TP_PROTO(struct sk_buff *skb, void *location),
> + *
> + * into struct below:
> + */
> +struct trace_kfree_skb {
> +	struct sk_buff *skb;
> +	void *location;
> +};
> +
> +SEC("raw_tracepoint/kfree_skb")
> +int trace_kfree_skb(struct trace_kfree_skb *ctx)
> +{
> +	struct sk_buff *skb = ctx->skb;
> +	struct net_device *dev;
> +	int ifindex;
> +	struct callback_head *ptr;
> +	void *func;
> +
> +	__builtin_preserve_access_index(({
> +		dev = skb->dev;
> +		ifindex = dev->ifindex;

Hi Alexei,

The patchset looks very useful. One question: Is it always safe to
access 'skb->dev->ifindex' here? I'm asking because 'dev' is inside a
union with 'dev_scratch' which is 'unsigned long' and therefore might
not always be a valid memory address. Consider for example the following
code path:

...
__udp_queue_rcv_skb(sk, skb)
	__udp_enqueue_schedule_skb(sk, skb)
		udp_set_dev_scratch(skb)
		// returns error
	...
	kfree_skb(skb) // ebpf program is invoked

How is this handled by eBPF?

Thanks

> +		ptr = dev->ifalias->rcuhead.next;
> +		func = ptr->func;
> +	}));
> +
> +	bpf_printk("rcuhead.next %llx func %llx\n", ptr, func);
> +	bpf_printk("skb->len %d\n", _(skb->len));
> +	bpf_printk("skb->queue_mapping %d\n", _(skb->queue_mapping));
> +	bpf_printk("dev->ifindex %d\n", ifindex);
> +
> +	/* send first 72 byte of the packet to user space */
> +	bpf_skb_output(skb, &perf_buf_map, (72ull << 32) | BPF_F_CURRENT_CPU,
> +		       &ifindex, sizeof(ifindex));
> +	return 0;
> +}
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test
  2019-10-10 11:07   ` Ido Schimmel
@ 2019-10-10 19:07     ` Alexei Starovoitov
  0 siblings, 0 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-10 19:07 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: davem, daniel, x86, netdev, bpf, kernel-team, edumazet

On Thu, Oct 10, 2019 at 02:07:29PM +0300, Ido Schimmel wrote:
> On Wed, Oct 09, 2019 at 09:15:03PM -0700, Alexei Starovoitov wrote:
> > +SEC("raw_tracepoint/kfree_skb")
> > +int trace_kfree_skb(struct trace_kfree_skb *ctx)
> > +{
> > +	struct sk_buff *skb = ctx->skb;
> > +	struct net_device *dev;
> > +	int ifindex;
> > +	struct callback_head *ptr;
> > +	void *func;
> > +
> > +	__builtin_preserve_access_index(({
> > +		dev = skb->dev;
> > +		ifindex = dev->ifindex;
> 
> Hi Alexei,
> 
> The patchset looks very useful. One question: Is it always safe to
> access 'skb->dev->ifindex' here? I'm asking because 'dev' is inside a
> union with 'dev_scratch' which is 'unsigned long' and therefore might
> not always be a valid memory address. Consider for example the following
> code path:
> 
> ...
> __udp_queue_rcv_skb(sk, skb)
> 	__udp_enqueue_schedule_skb(sk, skb)
> 		udp_set_dev_scratch(skb)
> 		// returns error
> 	...
> 	kfree_skb(skb) // ebpf program is invoked
> 
> How is this handled by eBPF?

Excellent question. There are cases like this where the verifier cannot possibly
track semantics of the kernel code and union of pointer with scratch area
like this. That's why every access through btf pointer is a hidden probe_read.
Comparing to old school tracing all memory accesses were probe_read
and bpf prog was free to read anything and page fault everywhere.
Now bpf prog will almost always access correct data. Yet corner cases like
this are inevitable. I'm working on few ideas how to improve it further
with btf-tagged slab allocations and kasan-like memory shadowing.

Your question made me thinking whether we have a long standing issue
with dev_scratch, since even classic bpf has SKF_AD_IFINDEX hack
which is implemented as:
    *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, dev),
                          BPF_REG_TMP, BPF_REG_CTX,
                          offsetof(struct sk_buff, dev));
    /* if (tmp != 0) goto pc + 1 */
    *insn++ = BPF_JMP_IMM(BPF_JNE, BPF_REG_TMP, 0, 1);
    *insn++ = BPF_EXIT_INSN();
    if (fp->k == SKF_AD_OFF + SKF_AD_IFINDEX)
            *insn = BPF_LDX_MEM(BPF_W, BPF_REG_A, BPF_REG_TMP,
                                offsetof(struct net_device, ifindex));

That means for long time [c|e]BPF code was checking skb->dev for NULL only.
I've analyzed the code where socket filters can be called and I think it's good
there. dev_scratch is used after sk_filter has run.
But there are other hooks: lwt, various cgroups.
I've checked lwt and cgroup inet/egress. I think dev_scratch should not be
used in these paths. So should be good there as well.
But I think the whole idea of aliasing scratch into 'dev' pointer is dangerous.
There are plenty of tracepoints that do skb->dev->foo. Hard to track where
everything is called.
I think udp code need to move this dev_scratch into some other place in skb.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 03/12] bpf: process in-kernel BTF
  2019-10-10  4:14 ` [PATCH v2 bpf-next 03/12] bpf: process in-kernel BTF Alexei Starovoitov
@ 2019-10-11 17:56   ` Andrii Nakryiko
  0 siblings, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 17:56 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:15 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> If in-kernel BTF exists parse it and prepare 'struct btf *btf_vmlinux'
> for further use by the verifier.
> In-kernel BTF is trusted just like kallsyms and other build artifacts
> embedded into vmlinux.
> Yet run this BTF image through BTF verifier to make sure
> that it is valid and it wasn't mangled during the build.
> If in-kernel BTF is incorrect it means either gcc or pahole or kernel
> are buggy. In such case disallow loading BPF programs.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  include/linux/bpf_verifier.h |  4 +-
>  include/linux/btf.h          |  1 +
>  kernel/bpf/btf.c             | 71 +++++++++++++++++++++++++++++++++++-
>  kernel/bpf/verifier.c        | 20 ++++++++++
>  4 files changed, 94 insertions(+), 2 deletions(-)
>

[...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 04/12] bpf: add attach_btf_id attribute to program load
  2019-10-10  4:14 ` [PATCH v2 bpf-next 04/12] bpf: add attach_btf_id attribute to program load Alexei Starovoitov
@ 2019-10-11 17:58   ` Andrii Nakryiko
  0 siblings, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 17:58 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:16 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> Add attach_btf_id attribute to prog_load command.
> It's similar to existing expected_attach_type attribute which is
> used in several cgroup based program types.
> Unfortunately expected_attach_type is ignored for
> tracing programs and cannot be reused for new purpose.
> Hence introduce attach_btf_id to verify bpf programs against
> given in-kernel BTF type id at load time.
> It is strictly checked to be valid for raw_tp programs only.
> In a later patches it will become:
> btf_id == 0 semantics of existing raw_tp progs.
> btd_id > 0 raw_tp with BTF and additional type safety.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Looks good!

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  include/linux/bpf.h            |  1 +
>  include/uapi/linux/bpf.h       |  1 +
>  kernel/bpf/syscall.c           | 18 ++++++++++++++----
>  tools/include/uapi/linux/bpf.h |  1 +
>  4 files changed, 17 insertions(+), 4 deletions(-)
>

[...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint
  2019-10-10  4:14 ` [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint Alexei Starovoitov
@ 2019-10-11 18:02   ` Andrii Nakryiko
  2019-10-11 18:07   ` Andrii Nakryiko
  1 sibling, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 18:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:17 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> For raw tracepoint program types libbpf will try to find
> btf_id of raw tracepoint in vmlinux's BTF.
> It's a responsiblity of bpf program author to annotate the program
> with SEC("raw_tracepoint/name") where "name" is a valid raw tracepoint.
> If "name" is indeed a valid raw tracepoint then in-kernel BTF
> will have "btf_trace_##name" typedef that points to function
> prototype of that raw tracepoint. BTF description captures
> exact argument the kernel C code is passing into raw tracepoint.
> The kernel verifier will check the types while loading bpf program.
>
> libbpf keeps BTF type id in expected_attach_type, but since
> kernel ignores this attribute for tracing programs copy it
> into attach_btf_id attribute before loading.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  tools/lib/bpf/bpf.c    |  3 +++
>  tools/lib/bpf/libbpf.c | 17 +++++++++++++++++
>  2 files changed, 20 insertions(+)
>

[...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint
  2019-10-10  4:14 ` [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint Alexei Starovoitov
  2019-10-11 18:02   ` Andrii Nakryiko
@ 2019-10-11 18:07   ` Andrii Nakryiko
  2019-10-12  0:40     ` Alexei Starovoitov
  1 sibling, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 18:07 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:17 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> For raw tracepoint program types libbpf will try to find
> btf_id of raw tracepoint in vmlinux's BTF.
> It's a responsiblity of bpf program author to annotate the program
> with SEC("raw_tracepoint/name") where "name" is a valid raw tracepoint.
> If "name" is indeed a valid raw tracepoint then in-kernel BTF
> will have "btf_trace_##name" typedef that points to function
> prototype of that raw tracepoint. BTF description captures
> exact argument the kernel C code is passing into raw tracepoint.
> The kernel verifier will check the types while loading bpf program.
>
> libbpf keeps BTF type id in expected_attach_type, but since
> kernel ignores this attribute for tracing programs copy it
> into attach_btf_id attribute before loading.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  tools/lib/bpf/bpf.c    |  3 +++
>  tools/lib/bpf/libbpf.c | 17 +++++++++++++++++
>  2 files changed, 20 insertions(+)
>
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index cbb933532981..79046067720f 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -228,6 +228,9 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
>         memset(&attr, 0, sizeof(attr));
>         attr.prog_type = load_attr->prog_type;
>         attr.expected_attach_type = load_attr->expected_attach_type;
> +       if (attr.prog_type == BPF_PROG_TYPE_RAW_TRACEPOINT)
> +               /* expected_attach_type is ignored for tracing progs */
> +               attr.attach_btf_id = attr.expected_attach_type;
>         attr.insn_cnt = (__u32)load_attr->insns_cnt;
>         attr.insns = ptr_to_u64(load_attr->insns);
>         attr.license = ptr_to_u64(load_attr->license);
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index a02cdedc4e3f..8bf30a67428c 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -4586,6 +4586,23 @@ int libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
>                         continue;
>                 *prog_type = section_names[i].prog_type;
>                 *expected_attach_type = section_names[i].expected_attach_type;
> +               if (*prog_type == BPF_PROG_TYPE_RAW_TRACEPOINT) {
> +                       struct btf *btf = bpf_core_find_kernel_btf();
> +                       char raw_tp_btf_name[128] = "btf_trace_";
> +                       char *dst = raw_tp_btf_name + sizeof("btf_trace_") - 1;
> +                       int ret;
> +
> +                       if (IS_ERR(btf))
> +                               /* lack of kernel BTF is not a failure */
> +                               return 0;
> +                       /* prepend "btf_trace_" prefix per kernel convention */
> +                       strncat(dst, name + section_names[i].len,
> +                               sizeof(raw_tp_btf_name) - (dst - raw_tp_btf_name));
> +                       ret = btf__find_by_name(btf, raw_tp_btf_name);
> +                       if (ret > 0)
> +                               *expected_attach_type = ret;

Well, actually, I realized after I gave Acked-by, so not yet :)

This needs kernel feature probe of whether kernel supports specifying
attach_btf_id, otherwise on older kernels we'll stop successfully
loading valid program.

But even if kernel supports attach_btf_id, I think users still need to
opt in into specifying attach_btf_id by libbpf. Think about existing
raw_tp programs that are using bpf_probe_read() because they were not
created with this kernel feature in mind. They will suddenly stop
working without any of user's fault.

One way to do this would be another section prefix for raw tracepoint
w/ BTF. E.g., "raw_tp_typed" or something along those lines? I'll say
it upfront, I don't think "raw_tp_btf" is great name, because we rely
on BTF for a lot of other stuff already, so just adding _btf prefix
doesn't mean specifically BTF type tracking in kernel ;)

> +                       btf__free(btf);
> +               }
>                 return 0;
>         }
>         pr_warning("failed to guess program type based on ELF section name '%s'\n", name);
> --
> 2.23.0
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 06/12] bpf: implement accurate raw_tp context access via BTF
  2019-10-10  4:14 ` [PATCH v2 bpf-next 06/12] bpf: implement accurate raw_tp context access via BTF Alexei Starovoitov
@ 2019-10-11 18:31   ` Andrii Nakryiko
  2019-10-11 23:13     ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 18:31 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:17 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> libbpf analyzes bpf C program, searches in-kernel BTF for given type name
> and stores it into expected_attach_type.
> The kernel verifier expects this btf_id to point to something like:
> typedef void (*btf_trace_kfree_skb)(void *, struct sk_buff *skb, void *loc);
> which represents signature of raw_tracepoint "kfree_skb".
>
> Then btf_ctx_access() matches ctx+0 access in bpf program with 'skb'
> and 'ctx+8' access with 'loc' arguments of "kfree_skb" tracepoint.
> In first case it passes btf_id of 'struct sk_buff *' back to the verifier core
> and 'void *' in second case.
>
> Then the verifier tracks PTR_TO_BTF_ID as any other pointer type.
> Like PTR_TO_SOCKET points to 'struct bpf_sock',
> PTR_TO_TCP_SOCK points to 'struct bpf_tcp_sock', and so on.
> PTR_TO_BTF_ID points to in-kernel structs.
> If 1234 is btf_id of 'struct sk_buff' in vmlinux's BTF
> then PTR_TO_BTF_ID#1234 points to one of in kernel skbs.
>
> When PTR_TO_BTF_ID#1234 is dereferenced (like r2 = *(u64 *)r1 + 32)
> the btf_struct_access() checks which field of 'struct sk_buff' is
> at offset 32. Checks that size of access matches type definition
> of the field and continues to track the dereferenced type.
> If that field was a pointer to 'struct net_device' the r2's type
> will be PTR_TO_BTF_ID#456. Where 456 is btf_id of 'struct net_device'
> in vmlinux's BTF.
>
> Such verifier analysis prevents "cheating" in BPF C program.
> The program cannot cast arbitrary pointer to 'struct sk_buff *'
> and access it. C compiler would allow type cast, of course,
> but the verifier will notice type mismatch based on BPF assembly
> and in-kernel BTF.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  include/linux/bpf.h          |  17 +++-
>  include/linux/bpf_verifier.h |   4 +
>  kernel/bpf/btf.c             | 186 +++++++++++++++++++++++++++++++++++
>  kernel/bpf/verifier.c        |  86 +++++++++++++++-
>  kernel/trace/bpf_trace.c     |   2 +-
>  5 files changed, 290 insertions(+), 5 deletions(-)
>

[...]

> +int btf_struct_access(struct bpf_verifier_log *log,
> +                     const struct btf_type *t, int off, int size,
> +                     enum bpf_access_type atype,
> +                     u32 *next_btf_id)
> +{
> +       const struct btf_member *member;
> +       const struct btf_type *mtype;
> +       const char *tname, *mname;
> +       int i, moff = 0, msize;
> +
> +again:
> +       tname = __btf_name_by_offset(btf_vmlinux, t->name_off);
> +       if (!btf_type_is_struct(t)) {
> +               bpf_log(log, "Type '%s' is not a struct", tname);
> +               return -EINVAL;
> +       }
> +
> +       for_each_member(i, t, member) {
> +               /* offset of the field in bits */
> +               moff = btf_member_bit_offset(t, member);

This whole logic of offset/size checking doesn't work for bitfields.
Your moff % 8 might be non-zero (most probably, actually, for
bitfield). Also, msize of underlying integer type is not the same as
member's bit size. So probably just check that it's a bitfield and
skip it?

The check is surprisingly subtle and not straightforward, btw. You
need to get btf_member_bitfield_size(t, member) and check that it's
not equal to underlying type's size (which is in bytes, so * 8). It's
unfortunate it's so non-straightforward. But if you don't filter that,
all those `moff / 8` and `msize` checks are bogus.

> +
> +               if (off + size <= moff / 8)
> +                       /* won't find anything, field is already too far */
> +                       break;
> +
> +               /* type of the field */
> +               mtype = btf_type_by_id(btf_vmlinux, member->type);
> +               mname = __btf_name_by_offset(btf_vmlinux, member->name_off);
> +
> +               /* skip modifiers */
> +               while (btf_type_is_modifier(mtype))
> +                       mtype = btf_type_by_id(btf_vmlinux, mtype->type);
> +
> +               if (btf_type_is_array(mtype))
> +                       /* array deref is not supported yet */
> +                       continue;
> +
> +               if (!btf_type_has_size(mtype) && !btf_type_is_ptr(mtype)) {
> +                       bpf_log(log, "field %s doesn't have size\n", mname);
> +                       return -EFAULT;
> +               }
> +               if (btf_type_is_ptr(mtype))
> +                       msize = 8;
> +               else
> +                       msize = mtype->size;
> +               if (off >= moff / 8 + msize)
> +                       /* no overlap with member, keep iterating */
> +                       continue;
> +               /* the 'off' we're looking for is either equal to start
> +                * of this field or inside of this struct
> +                */
> +               if (btf_type_is_struct(mtype)) {
> +                       /* our field must be inside that union or struct */
> +                       t = mtype;
> +
> +                       /* adjust offset we're looking for */
> +                       off -= moff / 8;
> +                       goto again;
> +               }
> +               if (msize != size) {
> +                       /* field access size doesn't match */
> +                       bpf_log(log,
> +                               "cannot access %d bytes in struct %s field %s that has size %d\n",
> +                               size, tname, mname, msize);
> +                       return -EACCES;

Are you sure this has to be an error? Why not just default to
SCALAR_VALUE here? E.g., if compiler generated one read for few
smaller fields, or user wants to read lower 1 byte of int field, etc.
I think if you move this size check into the following ptr check, it
should be fine. Pointer is the only case where you care about correct
read/value, isn't it?

> +               }
> +
> +               if (btf_type_is_ptr(mtype)) {
> +                       const struct btf_type *stype;
> +
> +                       stype = btf_type_by_id(btf_vmlinux, mtype->type);
> +                       /* skip modifiers */
> +                       while (btf_type_is_modifier(stype))
> +                               stype = btf_type_by_id(btf_vmlinux, stype->type);
> +                       if (btf_type_is_struct(stype)) {
> +                               *next_btf_id = mtype->type;
> +                               return PTR_TO_BTF_ID;
> +                       }
> +               }
> +               /* all other fields are treated as scalars */
> +               return SCALAR_VALUE;
> +       }
> +       bpf_log(log, "struct %s doesn't have field at offset %d\n", tname, off);
> +       return -EINVAL;
> +}
> +

[...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 07/12] bpf: attach raw_tp program with BTF via type name
  2019-10-10  4:14 ` [PATCH v2 bpf-next 07/12] bpf: attach raw_tp program with BTF via type name Alexei Starovoitov
@ 2019-10-11 18:44   ` Andrii Nakryiko
  0 siblings, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 18:44 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:16 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> BTF type id specified at program load time has all
> necessary information to attach that program to raw tracepoint.
> Use kernel type name to find raw tracepoint.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  kernel/bpf/syscall.c | 67 +++++++++++++++++++++++++++++---------------
>  1 file changed, 44 insertions(+), 23 deletions(-)
>
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index b56c482c9760..03f36e73d84a 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1816,17 +1816,49 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
>         struct bpf_raw_tracepoint *raw_tp;
>         struct bpf_raw_event_map *btp;
>         struct bpf_prog *prog;
> -       char tp_name[128];
> +       const char *tp_name;
> +       char buf[128];
>         int tp_fd, err;

Shouldn't there be CHECK_ATTR(BPF_RAW_TRACEPOINT_OPEN) somewhere here?

Other than this pre-existing issue, everything else looks reasonable.

[...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 09/12] bpf: add support for BTF pointers to x86 JIT
  2019-10-10  4:15 ` [PATCH v2 bpf-next 09/12] bpf: add support for BTF pointers to x86 JIT Alexei Starovoitov
@ 2019-10-11 18:48   ` Andrii Nakryiko
  0 siblings, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 18:48 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:15 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> Pointer to BTF object is a pointer to kernel object or NULL.
> Such pointers can only be used by BPF_LDX instructions.
> The verifier changed their opcode from LDX|MEM|size
> to LDX|PROBE_MEM|size to make JITing easier.
> The number of entries in extable is the number of BPF_LDX insns
> that access kernel memory via "pointer to BTF type".
> Only these load instructions can fault.
> Since x86 extable is relative it has to be allocated in the same
> memory region as JITed code.
> Allocate it prior to last pass of JITing and let the last pass populate it.
> Pointer to extable in bpf_prog_aux is necessary to make page fault
> handling fast.
> Page fault handling is done in two steps:
> 1. bpf_prog_kallsyms_find() finds BPF program that page faulted.
>    It's done by walking rb tree.
> 2. then extable for given bpf program is binary searched.
> This process is similar to how page faulting is done for kernel modules.
> The exception handler skips over faulting x86 instruction and
> initializes destination register with zero. This mimics exact
> behavior of bpf_probe_read (when probe_kernel_read faults dest is zeroed).
>
> JITs for other architectures can add support in similar way.
> Until then they will reject unknown opcode and fallback to interpreter.
>
> Since extable should be aligned and placed near JITed code
> make bpf_jit_binary_alloc() return 4 byte aligned image offset,
> so that extable aligning formula in bpf_int_jit_compile() doesn't need
> to rely on internal implementation of bpf_jit_binary_alloc().
> On x86 gcc defaults to 16-byte alignment for regular kernel functions
> due to better performance. JITed code may be aligned to 16 in the future,
> but it will use 4 in the meantime.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  arch/x86/net/bpf_jit_comp.c | 97 +++++++++++++++++++++++++++++++++++--
>  include/linux/bpf.h         |  3 ++
>  include/linux/extable.h     | 10 ++++
>  kernel/bpf/core.c           | 20 +++++++-
>  kernel/bpf/verifier.c       |  1 +
>  kernel/extable.c            |  2 +
>  6 files changed, 128 insertions(+), 5 deletions(-)
>

[...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 10/12] bpf: check types of arguments passed into helpers
  2019-10-10  4:15 ` [PATCH v2 bpf-next 10/12] bpf: check types of arguments passed into helpers Alexei Starovoitov
@ 2019-10-11 19:02   ` Andrii Nakryiko
  2019-10-12  1:39     ` Alexei Starovoitov
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 19:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:15 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> Introduce new helper that reuses existing skb perf_event output
> implementation, but can be called from raw_tracepoint programs
> that receive 'struct sk_buff *' as tracepoint argument or
> can walk other kernel data structures to skb pointer.
>
> In order to do that teach verifier to resolve true C types
> of bpf helpers into in-kernel BTF ids.
> The type of kernel pointer passed by raw tracepoint into bpf
> program will be tracked by the verifier all the way until
> it's passed into helper function.
> For example:
> kfree_skb() kernel function calls trace_kfree_skb(skb, loc);
> bpf programs receives that skb pointer and may eventually
> pass it into bpf_skb_output() bpf helper which in-kernel is
> implemented via bpf_skb_event_output() kernel function.
> Its first argument in the kernel is 'struct sk_buff *'.
> The verifier makes sure that types match all the way.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  include/linux/bpf.h            | 18 ++++++---
>  include/uapi/linux/bpf.h       | 27 +++++++++++++-
>  kernel/bpf/btf.c               | 68 ++++++++++++++++++++++++++++++++++
>  kernel/bpf/verifier.c          | 44 ++++++++++++++--------
>  kernel/trace/bpf_trace.c       |  4 ++
>  net/core/filter.c              | 15 +++++++-
>  tools/include/uapi/linux/bpf.h | 27 +++++++++++++-
>  7 files changed, 180 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 6edfe50f1c2c..d3df073f374a 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -213,6 +213,7 @@ enum bpf_arg_type {
>         ARG_PTR_TO_INT,         /* pointer to int */
>         ARG_PTR_TO_LONG,        /* pointer to long */
>         ARG_PTR_TO_SOCKET,      /* pointer to bpf_sock (fullsock) */
> +       ARG_PTR_TO_BTF_ID,      /* pointer to in-kernel struct */
>  };
>
>  /* type of values returned from helper functions */
> @@ -235,11 +236,17 @@ struct bpf_func_proto {
>         bool gpl_only;
>         bool pkt_access;
>         enum bpf_return_type ret_type;
> -       enum bpf_arg_type arg1_type;
> -       enum bpf_arg_type arg2_type;
> -       enum bpf_arg_type arg3_type;
> -       enum bpf_arg_type arg4_type;
> -       enum bpf_arg_type arg5_type;
> +       union {
> +               struct {
> +                       enum bpf_arg_type arg1_type;
> +                       enum bpf_arg_type arg2_type;
> +                       enum bpf_arg_type arg3_type;
> +                       enum bpf_arg_type arg4_type;
> +                       enum bpf_arg_type arg5_type;
> +               };
> +               enum bpf_arg_type arg_type[5];
> +       };
> +       u32 *btf_id; /* BTF ids of arguments */

are you trying to save memory with this? otherwise not sure why it's
not just `u32 btf_id[5]`? Even in that case it will save at most 12
bytes (and I haven't even check alignment padding and stuff). So
doesn't seem worth it?

>  };
>
>  /* bpf_context is intentionally undefined structure. Pointer to bpf_context is
> @@ -765,6 +772,7 @@ int btf_struct_access(struct bpf_verifier_log *log,
>                       const struct btf_type *t, int off, int size,
>                       enum bpf_access_type atype,
>                       u32 *next_btf_id);
> +u32 btf_resolve_helper_id(struct bpf_verifier_log *log, void *, int);
>
>  #else /* !CONFIG_BPF_SYSCALL */
>  static inline struct bpf_prog *bpf_prog_get(u32 ufd)
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 3bb2cd1de341..b0454440186f 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2751,6 +2751,30 @@ union bpf_attr {
>   *             **-EOPNOTSUPP** kernel configuration does not enable SYN cookies
>   *
>   *             **-EPROTONOSUPPORT** IP packet version is not 4 or 6
> + *
> + * int bpf_skb_output(void *ctx, struct bpf_map *map, u64 flags, void *data, u64 size)
> + *     Description
> + *             Write raw *data* blob into a special BPF perf event held by
> + *             *map* of type **BPF_MAP_TYPE_PERF_EVENT_ARRAY**. This perf
> + *             event must have the following attributes: **PERF_SAMPLE_RAW**
> + *             as **sample_type**, **PERF_TYPE_SOFTWARE** as **type**, and
> + *             **PERF_COUNT_SW_BPF_OUTPUT** as **config**.
> + *
> + *             The *flags* are used to indicate the index in *map* for which
> + *             the value must be put, masked with **BPF_F_INDEX_MASK**.
> + *             Alternatively, *flags* can be set to **BPF_F_CURRENT_CPU**
> + *             to indicate that the index of the current CPU core should be
> + *             used.
> + *
> + *             The value to write, of *size*, is passed through eBPF stack and
> + *             pointed by *data*.

typo? pointed __to__ by *data*?

> + *
> + *             *ctx* is a pointer to in-kernel sutrct sk_buff.
> + *
> + *             This helper is similar to **bpf_perf_event_output**\ () but
> + *             restricted to raw_tracepoint bpf programs.

nit: with BTF type tracking enabled?

> + *     Return
> + *             0 on success, or a negative error in case of failure.
>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -2863,7 +2887,8 @@ union bpf_attr {
>         FN(sk_storage_get),             \
>         FN(sk_storage_delete),          \
>         FN(send_signal),                \
> -       FN(tcp_gen_syncookie),
> +       FN(tcp_gen_syncookie),          \
> +       FN(skb_output),
>

[...]

> @@ -4072,21 +4091,16 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
>
>         meta.func_id = func_id;
>         /* check args */
> -       err = check_func_arg(env, BPF_REG_1, fn->arg1_type, &meta);
> -       if (err)
> -               return err;
> -       err = check_func_arg(env, BPF_REG_2, fn->arg2_type, &meta);
> -       if (err)
> -               return err;
> -       err = check_func_arg(env, BPF_REG_3, fn->arg3_type, &meta);
> -       if (err)
> -               return err;
> -       err = check_func_arg(env, BPF_REG_4, fn->arg4_type, &meta);
> -       if (err)
> -               return err;
> -       err = check_func_arg(env, BPF_REG_5, fn->arg5_type, &meta);
> -       if (err)
> -               return err;
> +       for (i = 0; i < 5; i++) {
> +               if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID) {
> +                       if (!fn->btf_id[i])
> +                               fn->btf_id[i] = btf_resolve_helper_id(&env->log, fn->func, 0);

bug: 0 -> i  :)



> +                       meta.btf_id = fn->btf_id[i];
> +               }
> +               err = check_func_arg(env, BPF_REG_1 + i, fn->arg_type[i], &meta);
> +               if (err)
> +                       return err;
> +       }
>

[...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 11/12] bpf: disallow bpf_probe_read[_str] helpers
  2019-10-10  4:15 ` [PATCH v2 bpf-next 11/12] bpf: disallow bpf_probe_read[_str] helpers Alexei Starovoitov
@ 2019-10-11 19:03   ` Andrii Nakryiko
  0 siblings, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 19:03 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:15 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> Disallow bpf_probe_read() and bpf_probe_read_str() helpers in
> raw_tracepoint bpf programs that use in-kernel BTF to track
> types of memory accesses.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

I like it much better, thanks!

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  kernel/trace/bpf_trace.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 52f7e9d8c29b..fa5743abf842 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -700,6 +700,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>         case BPF_FUNC_map_peek_elem:
>                 return &bpf_map_peek_elem_proto;
>         case BPF_FUNC_probe_read:
> +               if (prog->aux->attach_btf_id)
> +                       return NULL;
>                 return &bpf_probe_read_proto;
>         case BPF_FUNC_ktime_get_ns:
>                 return &bpf_ktime_get_ns_proto;
> @@ -728,6 +730,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>         case BPF_FUNC_get_prandom_u32:
>                 return &bpf_get_prandom_u32_proto;
>         case BPF_FUNC_probe_read_str:
> +               if (prog->aux->attach_btf_id)
> +                       return NULL;
>                 return &bpf_probe_read_str_proto;
>  #ifdef CONFIG_CGROUPS
>         case BPF_FUNC_get_current_cgroup_id:
> --
> 2.23.0
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test
  2019-10-10  4:15 ` [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test Alexei Starovoitov
  2019-10-10 11:07   ` Ido Schimmel
@ 2019-10-11 19:05   ` Andrii Nakryiko
  1 sibling, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 19:05 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Wed, Oct 9, 2019 at 9:15 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> Load basic cls_bpf program.
> Load raw_tracepoint program and attach to kfree_skb raw tracepoint.
> Trigger cls_bpf via prog_test_run.
> At the end of test_run kernel will call kfree_skb
> which will trigger trace_kfree_skb tracepoint.
> Which will call our raw_tracepoint program.
> Which will take that skb and will dump it into perf ring buffer.
> Check that user space received correct packet.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  .../selftests/bpf/prog_tests/kfree_skb.c      | 90 +++++++++++++++++++
>  tools/testing/selftests/bpf/progs/kfree_skb.c | 74 +++++++++++++++
>  2 files changed, 164 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/kfree_skb.c
>  create mode 100644 tools/testing/selftests/bpf/progs/kfree_skb.c
>

[...]

> +
> +void test_kfree_skb(void)
> +{
> +       struct bpf_prog_load_attr attr = {
> +               .file = "./kfree_skb.o",
> +               .log_level = 2,

This is rather verbose and memory-consuming. Do you really want to
leave it at 2?


> +       };
> +
> +       struct bpf_object *obj, *obj2 = NULL;
> +       struct perf_buffer_opts pb_opts = {};
> +       struct perf_buffer *pb = NULL;
> +       struct bpf_link *link = NULL;
> +       struct bpf_map *perf_buf_map;
> +       struct bpf_program *prog;
> +       __u32 duration, retval;
> +       int err, pkt_fd, kfree_skb_fd;
> +       bool passed = false;
> +

[...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 06/12] bpf: implement accurate raw_tp context access via BTF
  2019-10-11 18:31   ` Andrii Nakryiko
@ 2019-10-11 23:13     ` Andrii Nakryiko
  0 siblings, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-11 23:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On Fri, Oct 11, 2019 at 11:31 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Oct 9, 2019 at 9:17 PM Alexei Starovoitov <ast@kernel.org> wrote:
> >
> > libbpf analyzes bpf C program, searches in-kernel BTF for given type name
> > and stores it into expected_attach_type.
> > The kernel verifier expects this btf_id to point to something like:
> > typedef void (*btf_trace_kfree_skb)(void *, struct sk_buff *skb, void *loc);
> > which represents signature of raw_tracepoint "kfree_skb".
> >
> > Then btf_ctx_access() matches ctx+0 access in bpf program with 'skb'
> > and 'ctx+8' access with 'loc' arguments of "kfree_skb" tracepoint.
> > In first case it passes btf_id of 'struct sk_buff *' back to the verifier core
> > and 'void *' in second case.
> >
> > Then the verifier tracks PTR_TO_BTF_ID as any other pointer type.
> > Like PTR_TO_SOCKET points to 'struct bpf_sock',
> > PTR_TO_TCP_SOCK points to 'struct bpf_tcp_sock', and so on.
> > PTR_TO_BTF_ID points to in-kernel structs.
> > If 1234 is btf_id of 'struct sk_buff' in vmlinux's BTF
> > then PTR_TO_BTF_ID#1234 points to one of in kernel skbs.
> >
> > When PTR_TO_BTF_ID#1234 is dereferenced (like r2 = *(u64 *)r1 + 32)
> > the btf_struct_access() checks which field of 'struct sk_buff' is
> > at offset 32. Checks that size of access matches type definition
> > of the field and continues to track the dereferenced type.
> > If that field was a pointer to 'struct net_device' the r2's type
> > will be PTR_TO_BTF_ID#456. Where 456 is btf_id of 'struct net_device'
> > in vmlinux's BTF.
> >
> > Such verifier analysis prevents "cheating" in BPF C program.
> > The program cannot cast arbitrary pointer to 'struct sk_buff *'
> > and access it. C compiler would allow type cast, of course,
> > but the verifier will notice type mismatch based on BPF assembly
> > and in-kernel BTF.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> >  include/linux/bpf.h          |  17 +++-
> >  include/linux/bpf_verifier.h |   4 +
> >  kernel/bpf/btf.c             | 186 +++++++++++++++++++++++++++++++++++
> >  kernel/bpf/verifier.c        |  86 +++++++++++++++-
> >  kernel/trace/bpf_trace.c     |   2 +-
> >  5 files changed, 290 insertions(+), 5 deletions(-)
> >
>
> [...]
>
> > +int btf_struct_access(struct bpf_verifier_log *log,
> > +                     const struct btf_type *t, int off, int size,
> > +                     enum bpf_access_type atype,
> > +                     u32 *next_btf_id)
> > +{
> > +       const struct btf_member *member;
> > +       const struct btf_type *mtype;
> > +       const char *tname, *mname;
> > +       int i, moff = 0, msize;
> > +
> > +again:
> > +       tname = __btf_name_by_offset(btf_vmlinux, t->name_off);
> > +       if (!btf_type_is_struct(t)) {
> > +               bpf_log(log, "Type '%s' is not a struct", tname);
> > +               return -EINVAL;
> > +       }
> > +
> > +       for_each_member(i, t, member) {
> > +               /* offset of the field in bits */
> > +               moff = btf_member_bit_offset(t, member);
>
> This whole logic of offset/size checking doesn't work for bitfields.
> Your moff % 8 might be non-zero (most probably, actually, for
> bitfield). Also, msize of underlying integer type is not the same as
> member's bit size. So probably just check that it's a bitfield and
> skip it?
>
> The check is surprisingly subtle and not straightforward, btw. You

Well, this part is not true, checking btf_member_bitfield_size(t,
member) for non-zero is enough to derive it's bitfield.

> need to get btf_member_bitfield_size(t, member) and check that it's
> not equal to underlying type's size (which is in bytes, so * 8). It's
> unfortunate it's so non-straightforward. But if you don't filter that,
> all those `moff / 8` and `msize` checks are bogus.
>
> > +
> > +               if (off + size <= moff / 8)
> > +                       /* won't find anything, field is already too far */
> > +                       break;
> > +
> > +               /* type of the field */
> > +               mtype = btf_type_by_id(btf_vmlinux, member->type);
> > +               mname = __btf_name_by_offset(btf_vmlinux, member->name_off);
> > +
> > +               /* skip modifiers */
> > +               while (btf_type_is_modifier(mtype))
> > +                       mtype = btf_type_by_id(btf_vmlinux, mtype->type);
> > +
> > +               if (btf_type_is_array(mtype))
> > +                       /* array deref is not supported yet */
> > +                       continue;
> > +
> > +               if (!btf_type_has_size(mtype) && !btf_type_is_ptr(mtype)) {
> > +                       bpf_log(log, "field %s doesn't have size\n", mname);
> > +                       return -EFAULT;
> > +               }
> > +               if (btf_type_is_ptr(mtype))
> > +                       msize = 8;
> > +               else
> > +                       msize = mtype->size;
> > +               if (off >= moff / 8 + msize)
> > +                       /* no overlap with member, keep iterating */
> > +                       continue;
> > +               /* the 'off' we're looking for is either equal to start
> > +                * of this field or inside of this struct
> > +                */
> > +               if (btf_type_is_struct(mtype)) {
> > +                       /* our field must be inside that union or struct */
> > +                       t = mtype;
> > +
> > +                       /* adjust offset we're looking for */
> > +                       off -= moff / 8;
> > +                       goto again;
> > +               }
> > +               if (msize != size) {
> > +                       /* field access size doesn't match */
> > +                       bpf_log(log,
> > +                               "cannot access %d bytes in struct %s field %s that has size %d\n",
> > +                               size, tname, mname, msize);
> > +                       return -EACCES;
>
> Are you sure this has to be an error? Why not just default to
> SCALAR_VALUE here? E.g., if compiler generated one read for few
> smaller fields, or user wants to read lower 1 byte of int field, etc.
> I think if you move this size check into the following ptr check, it
> should be fine. Pointer is the only case where you care about correct
> read/value, isn't it?
>
> > +               }
> > +
> > +               if (btf_type_is_ptr(mtype)) {
> > +                       const struct btf_type *stype;
> > +
> > +                       stype = btf_type_by_id(btf_vmlinux, mtype->type);
> > +                       /* skip modifiers */
> > +                       while (btf_type_is_modifier(stype))
> > +                               stype = btf_type_by_id(btf_vmlinux, stype->type);
> > +                       if (btf_type_is_struct(stype)) {
> > +                               *next_btf_id = mtype->type;
> > +                               return PTR_TO_BTF_ID;
> > +                       }
> > +               }
> > +               /* all other fields are treated as scalars */
> > +               return SCALAR_VALUE;
> > +       }
> > +       bpf_log(log, "struct %s doesn't have field at offset %d\n", tname, off);
> > +       return -EINVAL;
> > +}
> > +
>
> [...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint
  2019-10-11 18:07   ` Andrii Nakryiko
@ 2019-10-12  0:40     ` Alexei Starovoitov
  2019-10-12  1:29       ` Alexei Starovoitov
  2019-10-12  4:39       ` Andrii Nakryiko
  0 siblings, 2 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-12  0:40 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On 10/11/19 11:07 AM, Andrii Nakryiko wrote:
> On Wed, Oct 9, 2019 at 9:17 PM Alexei Starovoitov <ast@kernel.org> wrote:
>>
>> For raw tracepoint program types libbpf will try to find
>> btf_id of raw tracepoint in vmlinux's BTF.
>> It's a responsiblity of bpf program author to annotate the program
>> with SEC("raw_tracepoint/name") where "name" is a valid raw tracepoint.
>> If "name" is indeed a valid raw tracepoint then in-kernel BTF
>> will have "btf_trace_##name" typedef that points to function
>> prototype of that raw tracepoint. BTF description captures
>> exact argument the kernel C code is passing into raw tracepoint.
>> The kernel verifier will check the types while loading bpf program.
>>
>> libbpf keeps BTF type id in expected_attach_type, but since
>> kernel ignores this attribute for tracing programs copy it
>> into attach_btf_id attribute before loading.
>>
>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>> ---
>>   tools/lib/bpf/bpf.c    |  3 +++
>>   tools/lib/bpf/libbpf.c | 17 +++++++++++++++++
>>   2 files changed, 20 insertions(+)
>>
>> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
>> index cbb933532981..79046067720f 100644
>> --- a/tools/lib/bpf/bpf.c
>> +++ b/tools/lib/bpf/bpf.c
>> @@ -228,6 +228,9 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
>>          memset(&attr, 0, sizeof(attr));
>>          attr.prog_type = load_attr->prog_type;
>>          attr.expected_attach_type = load_attr->expected_attach_type;
>> +       if (attr.prog_type == BPF_PROG_TYPE_RAW_TRACEPOINT)
>> +               /* expected_attach_type is ignored for tracing progs */
>> +               attr.attach_btf_id = attr.expected_attach_type;
>>          attr.insn_cnt = (__u32)load_attr->insns_cnt;
>>          attr.insns = ptr_to_u64(load_attr->insns);
>>          attr.license = ptr_to_u64(load_attr->license);
>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>> index a02cdedc4e3f..8bf30a67428c 100644
>> --- a/tools/lib/bpf/libbpf.c
>> +++ b/tools/lib/bpf/libbpf.c
>> @@ -4586,6 +4586,23 @@ int libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
>>                          continue;
>>                  *prog_type = section_names[i].prog_type;
>>                  *expected_attach_type = section_names[i].expected_attach_type;
>> +               if (*prog_type == BPF_PROG_TYPE_RAW_TRACEPOINT) {
>> +                       struct btf *btf = bpf_core_find_kernel_btf();
>> +                       char raw_tp_btf_name[128] = "btf_trace_";
>> +                       char *dst = raw_tp_btf_name + sizeof("btf_trace_") - 1;
>> +                       int ret;
>> +
>> +                       if (IS_ERR(btf))
>> +                               /* lack of kernel BTF is not a failure */
>> +                               return 0;
>> +                       /* prepend "btf_trace_" prefix per kernel convention */
>> +                       strncat(dst, name + section_names[i].len,
>> +                               sizeof(raw_tp_btf_name) - (dst - raw_tp_btf_name));
>> +                       ret = btf__find_by_name(btf, raw_tp_btf_name);
>> +                       if (ret > 0)
>> +                               *expected_attach_type = ret;
> 
> Well, actually, I realized after I gave Acked-by, so not yet :)
> 
> This needs kernel feature probe of whether kernel supports specifying
> attach_btf_id, otherwise on older kernels we'll stop successfully
> loading valid program.

The code above won't find anything on older kernels.
The patch 1 of the series has to be there for proper btf to be
generated by pahole.
Before that happens expected_attach_type will stay zero
and corresponding copy in attach_btf_id will be zero as well.
I see no issues being compatible with older kernels.

> But even if kernel supports attach_btf_id, I think users still need to
> opt in into specifying attach_btf_id by libbpf. Think about existing
> raw_tp programs that are using bpf_probe_read() because they were not
> created with this kernel feature in mind. They will suddenly stop
> working without any of user's fault.

This one is excellent catch.
loop1.c should have caught it, since it has
SEC("raw_tracepoint/kfree_skb")
{
   int nested_loops(volatile struct pt_regs* ctx)
    .. = PT_REGS_RC(ctx);

and verifier would have rejected it.
But the way the test is written it's not using libbpf's autodetect
of program type, so everything is passing.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint
  2019-10-12  0:40     ` Alexei Starovoitov
@ 2019-10-12  1:29       ` Alexei Starovoitov
  2019-10-12  4:38         ` Andrii Nakryiko
  2019-10-12  4:39       ` Andrii Nakryiko
  1 sibling, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-12  1:29 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On 10/11/19 5:40 PM, Alexei Starovoitov wrote:
>> But even if kernel supports attach_btf_id, I think users still need to
>> opt in into specifying attach_btf_id by libbpf. Think about existing
>> raw_tp programs that are using bpf_probe_read() because they were not
>> created with this kernel feature in mind. They will suddenly stop
>> working without any of user's fault.
> 
> This one is excellent catch.
> loop1.c should have caught it, since it has
> SEC("raw_tracepoint/kfree_skb")
> {
>    int nested_loops(volatile struct pt_regs* ctx)
>     .. = PT_REGS_RC(ctx);
> 
> and verifier would have rejected it.
> But the way the test is written it's not using libbpf's autodetect
> of program type, so everything is passing.

With:
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c 
b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
index 1c01ee2600a9..e27156dce10d 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
@@ -67,7 +67,7 @@ void test_bpf_verif_scale(void)
                  */
                 { "pyperf600_nounroll.o", BPF_PROG_TYPE_RAW_TRACEPOINT },

-               { "loop1.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
+               { "loop1.o", BPF_PROG_TYPE_UNSPEC},
                 { "loop2.o", BPF_PROG_TYPE_RAW_TRACEPOINT },

libbpf prog auto-detection kicks in and ...
# ./test_progs -n 3/10
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf:
raw_tp 'kfree_skb' doesn't have 10-th argument
invalid bpf_context access off=80 size=8

Good :) The verifier is doing its job.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 10/12] bpf: check types of arguments passed into helpers
  2019-10-11 19:02   ` Andrii Nakryiko
@ 2019-10-12  1:39     ` Alexei Starovoitov
  2019-10-12  4:25       ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-12  1:39 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, x86, Networking, bpf, Kernel Team

On 10/11/19 12:02 PM, Andrii Nakryiko wrote:
> On Wed, Oct 9, 2019 at 9:15 PM Alexei Starovoitov <ast@kernel.org> wrote:
>>
>>   /* type of values returned from helper functions */
>> @@ -235,11 +236,17 @@ struct bpf_func_proto {
>>          bool gpl_only;
>>          bool pkt_access;
>>          enum bpf_return_type ret_type;
>> -       enum bpf_arg_type arg1_type;
>> -       enum bpf_arg_type arg2_type;
>> -       enum bpf_arg_type arg3_type;
>> -       enum bpf_arg_type arg4_type;
>> -       enum bpf_arg_type arg5_type;
>> +       union {
>> +               struct {
>> +                       enum bpf_arg_type arg1_type;
>> +                       enum bpf_arg_type arg2_type;
>> +                       enum bpf_arg_type arg3_type;
>> +                       enum bpf_arg_type arg4_type;
>> +                       enum bpf_arg_type arg5_type;
>> +               };
>> +               enum bpf_arg_type arg_type[5];
>> +       };
>> +       u32 *btf_id; /* BTF ids of arguments */
> 
> are you trying to save memory with this? otherwise not sure why it's
> not just `u32 btf_id[5]`? Even in that case it will save at most 12
> bytes (and I haven't even check alignment padding and stuff). So
> doesn't seem worth it?

Glad you asked :)
It cannot be "u32 btf_id[5];".
Guess why?
I think it's a cool trick.
I was happy when I finally figured out to solve it this way
after analyzing a bunch of ugly solutions.

>> + *
>> + *             The value to write, of *size*, is passed through eBPF stack and
>> + *             pointed by *data*.
> 
> typo? pointed __to__ by *data*?

I'm not an grammar expert. That was a copy paste from existing comment.

>> + *
>> + *             *ctx* is a pointer to in-kernel sutrct sk_buff.
>> + *
>> + *             This helper is similar to **bpf_perf_event_output**\ () but
>> + *             restricted to raw_tracepoint bpf programs.
> 
> nit: with BTF type tracking enabled?

sure.

>> +       for (i = 0; i < 5; i++) {
>> +               if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID) {
>> +                       if (!fn->btf_id[i])
>> +                               fn->btf_id[i] = btf_resolve_helper_id(&env->log, fn->func, 0);
> 
> bug: 0 -> i  :)

Nice catch.
Clearly I don't have a use case yet for 2nd arg being ptr_to_btf.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 10/12] bpf: check types of arguments passed into helpers
  2019-10-12  1:39     ` Alexei Starovoitov
@ 2019-10-12  4:25       ` Andrii Nakryiko
  0 siblings, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-12  4:25 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann, x86,
	Networking, bpf, Kernel Team

On Fri, Oct 11, 2019 at 6:39 PM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 10/11/19 12:02 PM, Andrii Nakryiko wrote:
> > On Wed, Oct 9, 2019 at 9:15 PM Alexei Starovoitov <ast@kernel.org> wrote:
> >>
> >>   /* type of values returned from helper functions */
> >> @@ -235,11 +236,17 @@ struct bpf_func_proto {
> >>          bool gpl_only;
> >>          bool pkt_access;
> >>          enum bpf_return_type ret_type;
> >> -       enum bpf_arg_type arg1_type;
> >> -       enum bpf_arg_type arg2_type;
> >> -       enum bpf_arg_type arg3_type;
> >> -       enum bpf_arg_type arg4_type;
> >> -       enum bpf_arg_type arg5_type;
> >> +       union {
> >> +               struct {
> >> +                       enum bpf_arg_type arg1_type;
> >> +                       enum bpf_arg_type arg2_type;
> >> +                       enum bpf_arg_type arg3_type;
> >> +                       enum bpf_arg_type arg4_type;
> >> +                       enum bpf_arg_type arg5_type;
> >> +               };
> >> +               enum bpf_arg_type arg_type[5];
> >> +       };
> >> +       u32 *btf_id; /* BTF ids of arguments */
> >
> > are you trying to save memory with this? otherwise not sure why it's
> > not just `u32 btf_id[5]`? Even in that case it will save at most 12
> > bytes (and I haven't even check alignment padding and stuff). So
> > doesn't seem worth it?
>
> Glad you asked :)
> It cannot be "u32 btf_id[5];".
> Guess why?

/data/users/andriin/linux/kernel/bpf/verifier.c: In function
‘check_helper_call’:
/data/users/andriin/linux/kernel/bpf/verifier.c:4097:19: error:
assignment of read-only location ‘fn->btf_id[i]’
     fn->btf_id[i] = btf_resolve_helper_id(&env->log, fn->func, 0);
                   ^
That answers it :) Yeah, indirection w/ pointer is a clever hack :)

> I think it's a cool trick.
> I was happy when I finally figured out to solve it this way
> after analyzing a bunch of ugly solutions.
>
> >> + *
> >> + *             The value to write, of *size*, is passed through eBPF stack and
> >> + *             pointed by *data*.
> >
> > typo? pointed __to__ by *data*?
>
> I'm not an grammar expert. That was a copy paste from existing comment.
>
> >> + *
> >> + *             *ctx* is a pointer to in-kernel sutrct sk_buff.

randomly spotted "sutrct" here :)

> >> + *
> >> + *             This helper is similar to **bpf_perf_event_output**\ () but
> >> + *             restricted to raw_tracepoint bpf programs.
> >
> > nit: with BTF type tracking enabled?
>
> sure.
>
> >> +       for (i = 0; i < 5; i++) {
> >> +               if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID) {
> >> +                       if (!fn->btf_id[i])
> >> +                               fn->btf_id[i] = btf_resolve_helper_id(&env->log, fn->func, 0);
> >
> > bug: 0 -> i  :)
>
> Nice catch.
> Clearly I don't have a use case yet for 2nd arg being ptr_to_btf.

This actually brings an interesting question. There are a bunch of
helpers that track stuff like iphdr and so on. You could use that to
test, except you can't, because their args are not marked as
ARG_PTR_TO_BTF_ID. But marking it as such would break usual program
types that don't track BTF. I wonder if it's possible to have some
arrangement, that make the same helper be, sort of "BTF-enabled" for
BTF-enabled type of programs (so far just raw tracepoint with
attach_btf_id set), but for other programs where BTF types are not
tracked it still allows normal semantics. Thoughts?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint
  2019-10-12  1:29       ` Alexei Starovoitov
@ 2019-10-12  4:38         ` Andrii Nakryiko
  2019-10-12  4:53           ` Alexei Starovoitov
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-12  4:38 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann, x86,
	Networking, bpf, Kernel Team

On Fri, Oct 11, 2019 at 6:29 PM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 10/11/19 5:40 PM, Alexei Starovoitov wrote:
> >> But even if kernel supports attach_btf_id, I think users still need to
> >> opt in into specifying attach_btf_id by libbpf. Think about existing
> >> raw_tp programs that are using bpf_probe_read() because they were not
> >> created with this kernel feature in mind. They will suddenly stop
> >> working without any of user's fault.
> >
> > This one is excellent catch.
> > loop1.c should have caught it, since it has
> > SEC("raw_tracepoint/kfree_skb")
> > {
> >    int nested_loops(volatile struct pt_regs* ctx)
> >     .. = PT_REGS_RC(ctx);
> >
> > and verifier would have rejected it.
> > But the way the test is written it's not using libbpf's autodetect
> > of program type, so everything is passing.
>
> With:
> diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> index 1c01ee2600a9..e27156dce10d 100644
> --- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> +++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> @@ -67,7 +67,7 @@ void test_bpf_verif_scale(void)
>                   */
>                  { "pyperf600_nounroll.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
>
> -               { "loop1.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
> +               { "loop1.o", BPF_PROG_TYPE_UNSPEC},
>                  { "loop2.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
>
> libbpf prog auto-detection kicks in and ...
> # ./test_progs -n 3/10
> libbpf: load bpf program failed: Permission denied
> libbpf: -- BEGIN DUMP LOG ---
> libbpf:
> raw_tp 'kfree_skb' doesn't have 10-th argument
> invalid bpf_context access off=80 size=8
>
> Good :) The verifier is doing its job.

oh, another super intuitive error from verifier ;) 10th argument, what?..

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint
  2019-10-12  0:40     ` Alexei Starovoitov
  2019-10-12  1:29       ` Alexei Starovoitov
@ 2019-10-12  4:39       ` Andrii Nakryiko
  1 sibling, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2019-10-12  4:39 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann, x86,
	Networking, bpf, Kernel Team

On Fri, Oct 11, 2019 at 5:40 PM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 10/11/19 11:07 AM, Andrii Nakryiko wrote:
> > On Wed, Oct 9, 2019 at 9:17 PM Alexei Starovoitov <ast@kernel.org> wrote:
> >>
> >> For raw tracepoint program types libbpf will try to find
> >> btf_id of raw tracepoint in vmlinux's BTF.
> >> It's a responsiblity of bpf program author to annotate the program
> >> with SEC("raw_tracepoint/name") where "name" is a valid raw tracepoint.
> >> If "name" is indeed a valid raw tracepoint then in-kernel BTF
> >> will have "btf_trace_##name" typedef that points to function
> >> prototype of that raw tracepoint. BTF description captures
> >> exact argument the kernel C code is passing into raw tracepoint.
> >> The kernel verifier will check the types while loading bpf program.
> >>
> >> libbpf keeps BTF type id in expected_attach_type, but since
> >> kernel ignores this attribute for tracing programs copy it
> >> into attach_btf_id attribute before loading.
> >>
> >> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> >> ---
> >>   tools/lib/bpf/bpf.c    |  3 +++
> >>   tools/lib/bpf/libbpf.c | 17 +++++++++++++++++
> >>   2 files changed, 20 insertions(+)
> >>
> >> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> >> index cbb933532981..79046067720f 100644
> >> --- a/tools/lib/bpf/bpf.c
> >> +++ b/tools/lib/bpf/bpf.c
> >> @@ -228,6 +228,9 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
> >>          memset(&attr, 0, sizeof(attr));
> >>          attr.prog_type = load_attr->prog_type;
> >>          attr.expected_attach_type = load_attr->expected_attach_type;
> >> +       if (attr.prog_type == BPF_PROG_TYPE_RAW_TRACEPOINT)
> >> +               /* expected_attach_type is ignored for tracing progs */
> >> +               attr.attach_btf_id = attr.expected_attach_type;
> >>          attr.insn_cnt = (__u32)load_attr->insns_cnt;
> >>          attr.insns = ptr_to_u64(load_attr->insns);
> >>          attr.license = ptr_to_u64(load_attr->license);
> >> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> >> index a02cdedc4e3f..8bf30a67428c 100644
> >> --- a/tools/lib/bpf/libbpf.c
> >> +++ b/tools/lib/bpf/libbpf.c
> >> @@ -4586,6 +4586,23 @@ int libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
> >>                          continue;
> >>                  *prog_type = section_names[i].prog_type;
> >>                  *expected_attach_type = section_names[i].expected_attach_type;
> >> +               if (*prog_type == BPF_PROG_TYPE_RAW_TRACEPOINT) {
> >> +                       struct btf *btf = bpf_core_find_kernel_btf();
> >> +                       char raw_tp_btf_name[128] = "btf_trace_";
> >> +                       char *dst = raw_tp_btf_name + sizeof("btf_trace_") - 1;
> >> +                       int ret;
> >> +
> >> +                       if (IS_ERR(btf))
> >> +                               /* lack of kernel BTF is not a failure */
> >> +                               return 0;
> >> +                       /* prepend "btf_trace_" prefix per kernel convention */
> >> +                       strncat(dst, name + section_names[i].len,
> >> +                               sizeof(raw_tp_btf_name) - (dst - raw_tp_btf_name));
> >> +                       ret = btf__find_by_name(btf, raw_tp_btf_name);
> >> +                       if (ret > 0)
> >> +                               *expected_attach_type = ret;
> >
> > Well, actually, I realized after I gave Acked-by, so not yet :)
> >
> > This needs kernel feature probe of whether kernel supports specifying
> > attach_btf_id, otherwise on older kernels we'll stop successfully
> > loading valid program.
>
> The code above won't find anything on older kernels.
> The patch 1 of the series has to be there for proper btf to be
> generated by pahole.
> Before that happens expected_attach_type will stay zero
> and corresponding copy in attach_btf_id will be zero as well.
> I see no issues being compatible with older kernels.

indeed, this one is not an issue.

>
> > But even if kernel supports attach_btf_id, I think users still need to
> > opt in into specifying attach_btf_id by libbpf. Think about existing
> > raw_tp programs that are using bpf_probe_read() because they were not
> > created with this kernel feature in mind. They will suddenly stop
> > working without any of user's fault.
>
> This one is excellent catch.
> loop1.c should have caught it, since it has
> SEC("raw_tracepoint/kfree_skb")
> {
>    int nested_loops(volatile struct pt_regs* ctx)
>     .. = PT_REGS_RC(ctx);
>
> and verifier would have rejected it.
> But the way the test is written it's not using libbpf's autodetect
> of program type, so everything is passing.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint
  2019-10-12  4:38         ` Andrii Nakryiko
@ 2019-10-12  4:53           ` Alexei Starovoitov
  0 siblings, 0 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2019-10-12  4:53 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann, x86,
	Networking, bpf, Kernel Team

On 10/11/19 9:38 PM, Andrii Nakryiko wrote:
> On Fri, Oct 11, 2019 at 6:29 PM Alexei Starovoitov <ast@fb.com> wrote:
>>
>> On 10/11/19 5:40 PM, Alexei Starovoitov wrote:
>>>> But even if kernel supports attach_btf_id, I think users still need to
>>>> opt in into specifying attach_btf_id by libbpf. Think about existing
>>>> raw_tp programs that are using bpf_probe_read() because they were not
>>>> created with this kernel feature in mind. They will suddenly stop
>>>> working without any of user's fault.
>>>
>>> This one is excellent catch.
>>> loop1.c should have caught it, since it has
>>> SEC("raw_tracepoint/kfree_skb")
>>> {
>>>     int nested_loops(volatile struct pt_regs* ctx)
>>>      .. = PT_REGS_RC(ctx);
>>>
>>> and verifier would have rejected it.
>>> But the way the test is written it's not using libbpf's autodetect
>>> of program type, so everything is passing.
>>
>> With:
>> diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
>> b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
>> index 1c01ee2600a9..e27156dce10d 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
>> @@ -67,7 +67,7 @@ void test_bpf_verif_scale(void)
>>                    */
>>                   { "pyperf600_nounroll.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
>>
>> -               { "loop1.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
>> +               { "loop1.o", BPF_PROG_TYPE_UNSPEC},
>>                   { "loop2.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
>>
>> libbpf prog auto-detection kicks in and ...
>> # ./test_progs -n 3/10
>> libbpf: load bpf program failed: Permission denied
>> libbpf: -- BEGIN DUMP LOG ---
>> libbpf:
>> raw_tp 'kfree_skb' doesn't have 10-th argument
>> invalid bpf_context access off=80 size=8
>>
>> Good :) The verifier is doing its job.
> 
> oh, another super intuitive error from verifier ;) 10th argument, what?..

I know, but there is no env->linfo and no insn_idx to call
verbose_linfo() from there. That's even bigger refactoring
that I'd rather to later.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, back to index

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
2019-10-10  4:14 ` [PATCH v2 bpf-next 01/12] bpf: add typecast to raw_tracepoints to help BTF generation Alexei Starovoitov
2019-10-10  4:14 ` [PATCH v2 bpf-next 02/12] bpf: add typecast to bpf helpers " Alexei Starovoitov
2019-10-10  4:14 ` [PATCH v2 bpf-next 03/12] bpf: process in-kernel BTF Alexei Starovoitov
2019-10-11 17:56   ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 04/12] bpf: add attach_btf_id attribute to program load Alexei Starovoitov
2019-10-11 17:58   ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint Alexei Starovoitov
2019-10-11 18:02   ` Andrii Nakryiko
2019-10-11 18:07   ` Andrii Nakryiko
2019-10-12  0:40     ` Alexei Starovoitov
2019-10-12  1:29       ` Alexei Starovoitov
2019-10-12  4:38         ` Andrii Nakryiko
2019-10-12  4:53           ` Alexei Starovoitov
2019-10-12  4:39       ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 06/12] bpf: implement accurate raw_tp context access via BTF Alexei Starovoitov
2019-10-11 18:31   ` Andrii Nakryiko
2019-10-11 23:13     ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 07/12] bpf: attach raw_tp program with BTF via type name Alexei Starovoitov
2019-10-11 18:44   ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 08/12] bpf: add support for BTF pointers to interpreter Alexei Starovoitov
2019-10-10  4:15 ` [PATCH v2 bpf-next 09/12] bpf: add support for BTF pointers to x86 JIT Alexei Starovoitov
2019-10-11 18:48   ` Andrii Nakryiko
2019-10-10  4:15 ` [PATCH v2 bpf-next 10/12] bpf: check types of arguments passed into helpers Alexei Starovoitov
2019-10-11 19:02   ` Andrii Nakryiko
2019-10-12  1:39     ` Alexei Starovoitov
2019-10-12  4:25       ` Andrii Nakryiko
2019-10-10  4:15 ` [PATCH v2 bpf-next 11/12] bpf: disallow bpf_probe_read[_str] helpers Alexei Starovoitov
2019-10-11 19:03   ` Andrii Nakryiko
2019-10-10  4:15 ` [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test Alexei Starovoitov
2019-10-10 11:07   ` Ido Schimmel
2019-10-10 19:07     ` Alexei Starovoitov
2019-10-11 19:05   ` Andrii Nakryiko

Netdev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git
	git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \
		netdev@vger.kernel.org netdev@archiver.kernel.org
	public-inbox-index netdev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netdev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox