* [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks @ 2019-06-27 20:38 Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 1/9] bpf: implement " Stanislav Fomichev ` (9 more replies) 0 siblings, 10 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev This series implements two new per-cgroup hooks: getsockopt and setsockopt along with a new sockopt program type. The idea is pretty similar to recently introduced cgroup sysctl hooks, but implementation is simpler (no need to convert to/from strings). What this can be applied to: * move business logic of what tos/priority/etc can be set by containers (either pass or reject) * handle existing options (or introduce new ones) differently by propagating some information in cgroup/socket local storage Compared to a simple syscall/{g,s}etsockopt tracepoint, those hooks are context aware. Meaning, they can access underlying socket and use cgroup and socket local storage. v9: * allow overwriting setsocktop arguments (Alexei Starovoitov) (see individual changes for more changelog details) Stanislav Fomichev (9): bpf: implement getsockopt and setsockopt hooks bpf: sync bpf.h to tools/ libbpf: support sockopt hooks selftests/bpf: test sockopt section name selftests/bpf: add sockopt test selftests/bpf: add sockopt test that exercises sk helpers selftests/bpf: add sockopt test that exercises BPF_F_ALLOW_MULTI bpf: add sockopt documentation bpftool: support cgroup sockopt Documentation/bpf/index.rst | 1 + Documentation/bpf/prog_cgroup_sockopt.rst | 93 ++ include/linux/bpf-cgroup.h | 45 + include/linux/bpf.h | 2 + include/linux/bpf_types.h | 1 + include/linux/filter.h | 10 + include/uapi/linux/bpf.h | 14 + kernel/bpf/cgroup.c | 333 ++++++ kernel/bpf/core.c | 9 + kernel/bpf/syscall.c | 19 + kernel/bpf/verifier.c | 8 + net/core/filter.c | 2 +- net/socket.c | 30 + .../bpftool/Documentation/bpftool-cgroup.rst | 7 +- .../bpftool/Documentation/bpftool-prog.rst | 3 +- tools/bpf/bpftool/bash-completion/bpftool | 9 +- tools/bpf/bpftool/cgroup.c | 5 +- tools/bpf/bpftool/main.h | 1 + tools/bpf/bpftool/prog.c | 3 +- tools/include/uapi/linux/bpf.h | 14 + tools/lib/bpf/libbpf.c | 5 + tools/lib/bpf/libbpf_probes.c | 1 + tools/testing/selftests/bpf/.gitignore | 3 + tools/testing/selftests/bpf/Makefile | 6 +- .../selftests/bpf/progs/sockopt_multi.c | 71 ++ .../testing/selftests/bpf/progs/sockopt_sk.c | 111 ++ .../selftests/bpf/test_section_names.c | 10 + tools/testing/selftests/bpf/test_sockopt.c | 1021 +++++++++++++++++ .../selftests/bpf/test_sockopt_multi.c | 374 ++++++ tools/testing/selftests/bpf/test_sockopt_sk.c | 214 ++++ 30 files changed, 2415 insertions(+), 10 deletions(-) create mode 100644 Documentation/bpf/prog_cgroup_sockopt.rst create mode 100644 tools/testing/selftests/bpf/progs/sockopt_multi.c create mode 100644 tools/testing/selftests/bpf/progs/sockopt_sk.c create mode 100644 tools/testing/selftests/bpf/test_sockopt.c create mode 100644 tools/testing/selftests/bpf/test_sockopt_multi.c create mode 100644 tools/testing/selftests/bpf/test_sockopt_sk.c -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH bpf-next v9 1/9] bpf: implement getsockopt and setsockopt hooks 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev @ 2019-06-27 20:38 ` Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 2/9] bpf: sync bpf.h to tools/ Stanislav Fomichev ` (8 subsequent siblings) 9 siblings, 0 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko, Martin Lau Implement new BPF_PROG_TYPE_CGROUP_SOCKOPT program type and BPF_CGROUP_{G,S}ETSOCKOPT cgroup hooks. BPF_CGROUP_SETSOCKOPT can modify user setsockopt arguments before passing them down to the kernel or bypass kernel completely. BPF_CGROUP_GETSOCKOPT can can inspect/modify getsockopt arguments that kernel returns. Both hooks reuse existing PTR_TO_PACKET{,_END} infrastructure. The buffer memory is pre-allocated (because I don't think there is a precedent for working with __user memory from bpf). This might be slow to do for each {s,g}etsockopt call, that's why I've added __cgroup_bpf_prog_array_is_empty that exits early if there is nothing attached to a cgroup. Note, however, that there is a race between __cgroup_bpf_prog_array_is_empty and BPF_PROG_RUN_ARRAY where cgroup program layout might have changed; this should not be a problem because in general there is a race between multiple calls to {s,g}etsocktop and user adding/removing bpf progs from a cgroup. The return code of the BPF program is handled as follows: * 0: EPERM * 1: success, continue with next BPF program in the cgroup chain v9: * allow overwriting setsockopt arguments (Alexei Starovoitov): * use set_fs (same as kernel_setsockopt) * buffer is always kzalloc'd (no small on-stack buffer) v8: * use s32 for optlen (Andrii Nakryiko) v7: * return only 0 or 1 (Alexei Starovoitov) * always run all progs (Alexei Starovoitov) * use optval=0 as kernel bypass in setsockopt (Alexei Starovoitov) (decided to use optval=-1 instead, optval=0 might be a valid input) * call getsockopt hook after kernel handlers (Alexei Starovoitov) v6: * rework cgroup chaining; stop as soon as bpf program returns 0 or 2; see patch with the documentation for the details * drop Andrii's and Martin's Acked-by (not sure they are comfortable with the new state of things) v5: * skip copy_to_user() and put_user() when ret == 0 (Martin Lau) v4: * don't export bpf_sk_fullsock helper (Martin Lau) * size != sizeof(__u64) for uapi pointers (Martin Lau) * offsetof instead of bpf_ctx_range when checking ctx access (Martin Lau) v3: * typos in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY comments (Andrii Nakryiko) * reverse christmas tree in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY (Andrii Nakryiko) * use __bpf_md_ptr instead of __u32 for optval{,_end} (Martin Lau) * use BPF_FIELD_SIZEOF() for consistency (Martin Lau) * new CG_SOCKOPT_ACCESS macro to wrap repeated parts v2: * moved bpf_sockopt_kern fields around to remove a hole (Martin Lau) * aligned bpf_sockopt_kern->buf to 8 bytes (Martin Lau) * bpf_prog_array_is_empty instead of bpf_prog_array_length (Martin Lau) * added [0,2] return code check to verifier (Martin Lau) * dropped unused buf[64] from the stack (Martin Lau) * use PTR_TO_SOCKET for bpf_sockopt->sk (Martin Lau) * dropped bpf_target_off from ctx rewrites (Martin Lau) * use return code for kernel bypass (Martin Lau & Andrii Nakryiko) Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> --- include/linux/bpf-cgroup.h | 45 +++++ include/linux/bpf.h | 2 + include/linux/bpf_types.h | 1 + include/linux/filter.h | 10 ++ include/uapi/linux/bpf.h | 14 ++ kernel/bpf/cgroup.c | 333 +++++++++++++++++++++++++++++++++++++ kernel/bpf/core.c | 9 + kernel/bpf/syscall.c | 19 +++ kernel/bpf/verifier.c | 8 + net/core/filter.c | 2 +- net/socket.c | 30 ++++ 11 files changed, 472 insertions(+), 1 deletion(-) diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h index bd79ae32909a..169fd25f6bc2 100644 --- a/include/linux/bpf-cgroup.h +++ b/include/linux/bpf-cgroup.h @@ -124,6 +124,14 @@ int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head, loff_t *ppos, void **new_buf, enum bpf_attach_type type); +int __cgroup_bpf_run_filter_setsockopt(struct sock *sock, int *level, + int *optname, char __user *optval, + int *optlen, char **kernel_optval); +int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level, + int optname, char __user *optval, + int __user *optlen, int max_optlen, + int retval); + static inline enum bpf_cgroup_storage_type cgroup_storage_type( struct bpf_map *map) { @@ -286,6 +294,38 @@ int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key, __ret; \ }) +#define BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock, level, optname, optval, optlen, \ + kernel_optval) \ +({ \ + int __ret = 0; \ + if (cgroup_bpf_enabled) \ + __ret = __cgroup_bpf_run_filter_setsockopt(sock, level, \ + optname, optval, \ + optlen, \ + kernel_optval); \ + __ret; \ +}) + +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen) \ +({ \ + int __ret = 0; \ + if (cgroup_bpf_enabled) \ + get_user(__ret, optlen); \ + __ret; \ +}) + +#define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname, optval, optlen, \ + max_optlen, retval) \ +({ \ + int __ret = retval; \ + if (cgroup_bpf_enabled) \ + __ret = __cgroup_bpf_run_filter_getsockopt(sock, level, \ + optname, optval, \ + optlen, max_optlen, \ + retval); \ + __ret; \ +}) + int cgroup_bpf_prog_attach(const union bpf_attr *attr, enum bpf_prog_type ptype, struct bpf_prog *prog); int cgroup_bpf_prog_detach(const union bpf_attr *attr, @@ -357,6 +397,11 @@ static inline int bpf_percpu_cgroup_storage_update(struct bpf_map *map, #define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) ({ 0; }) #define BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(type,major,minor,access) ({ 0; }) #define BPF_CGROUP_RUN_PROG_SYSCTL(head,table,write,buf,count,pos,nbuf) ({ 0; }) +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen) ({ 0; }) +#define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname, optval, \ + optlen, max_optlen, retval) ({ retval; }) +#define BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock, level, optname, optval, optlen, \ + kernel_optval) ({ 0; }) #define for_each_cgroup_storage_type(stype) for (; false; ) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index a62e7889b0b6..18f4cc2c6acd 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -518,6 +518,7 @@ struct bpf_prog_array { struct bpf_prog_array *bpf_prog_array_alloc(u32 prog_cnt, gfp_t flags); void bpf_prog_array_free(struct bpf_prog_array *progs); int bpf_prog_array_length(struct bpf_prog_array *progs); +bool bpf_prog_array_is_empty(struct bpf_prog_array *array); int bpf_prog_array_copy_to_user(struct bpf_prog_array *progs, __u32 __user *prog_ids, u32 cnt); @@ -1051,6 +1052,7 @@ extern const struct bpf_func_proto bpf_spin_unlock_proto; extern const struct bpf_func_proto bpf_get_local_storage_proto; extern const struct bpf_func_proto bpf_strtol_proto; extern const struct bpf_func_proto bpf_strtoul_proto; +extern const struct bpf_func_proto bpf_tcp_sock_proto; /* Shared helpers among cBPF and eBPF. */ void bpf_user_rnd_init_once(void); diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index 5a9975678d6f..eec5aeeeaf92 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -30,6 +30,7 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, raw_tracepoint_writable) #ifdef CONFIG_CGROUP_BPF BPF_PROG_TYPE(BPF_PROG_TYPE_CGROUP_DEVICE, cg_dev) BPF_PROG_TYPE(BPF_PROG_TYPE_CGROUP_SYSCTL, cg_sysctl) +BPF_PROG_TYPE(BPF_PROG_TYPE_CGROUP_SOCKOPT, cg_sockopt) #endif #ifdef CONFIG_BPF_LIRC_MODE2 BPF_PROG_TYPE(BPF_PROG_TYPE_LIRC_MODE2, lirc_mode2) diff --git a/include/linux/filter.h b/include/linux/filter.h index 43b45d6db36d..340f7d648974 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1199,4 +1199,14 @@ struct bpf_sysctl_kern { u64 tmp_reg; }; +struct bpf_sockopt_kern { + struct sock *sk; + u8 *optval; + u8 *optval_end; + s32 level; + s32 optname; + s32 optlen; + s32 retval; +}; + #endif /* __LINUX_FILTER_H__ */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index b077507efa3f..a396b516a2b2 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -170,6 +170,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_FLOW_DISSECTOR, BPF_PROG_TYPE_CGROUP_SYSCTL, BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, + BPF_PROG_TYPE_CGROUP_SOCKOPT, }; enum bpf_attach_type { @@ -194,6 +195,8 @@ enum bpf_attach_type { BPF_CGROUP_SYSCTL, BPF_CGROUP_UDP4_RECVMSG, BPF_CGROUP_UDP6_RECVMSG, + BPF_CGROUP_GETSOCKOPT, + BPF_CGROUP_SETSOCKOPT, __MAX_BPF_ATTACH_TYPE }; @@ -3541,4 +3544,15 @@ struct bpf_sysctl { */ }; +struct bpf_sockopt { + __bpf_md_ptr(struct bpf_sock *, sk); + __bpf_md_ptr(void *, optval); + __bpf_md_ptr(void *, optval_end); + + __s32 level; + __s32 optname; + __s32 optlen; + __s32 retval; +}; + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c index 077ed3a19848..76fa0076f20d 100644 --- a/kernel/bpf/cgroup.c +++ b/kernel/bpf/cgroup.c @@ -15,6 +15,7 @@ #include <linux/bpf.h> #include <linux/bpf-cgroup.h> #include <net/sock.h> +#include <net/bpf_sk_storage.h> #include "../cgroup/cgroup-internal.h" @@ -938,6 +939,188 @@ int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head, } EXPORT_SYMBOL(__cgroup_bpf_run_filter_sysctl); +static bool __cgroup_bpf_prog_array_is_empty(struct cgroup *cgrp, + enum bpf_attach_type attach_type) +{ + struct bpf_prog_array *prog_array; + bool empty; + + rcu_read_lock(); + prog_array = rcu_dereference(cgrp->bpf.effective[attach_type]); + empty = bpf_prog_array_is_empty(prog_array); + rcu_read_unlock(); + + return empty; +} + +static int sockopt_alloc_buf(struct bpf_sockopt_kern *ctx, int max_optlen) +{ + if (unlikely(max_optlen > PAGE_SIZE) || max_optlen < 0) + return -EINVAL; + + ctx->optval = kzalloc(max_optlen, GFP_USER); + if (!ctx->optval) + return -ENOMEM; + + ctx->optval_end = ctx->optval + max_optlen; + ctx->optlen = max_optlen; + + return 0; +} + +static void sockopt_free_buf(struct bpf_sockopt_kern *ctx) +{ + kfree(ctx->optval); +} + +int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level, + int *optname, char __user *optval, + int *optlen, char **kernel_optval) +{ + struct cgroup *cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data); + struct bpf_sockopt_kern ctx = { + .sk = sk, + .level = *level, + .optname = *optname, + }; + int ret; + + /* Opportunistic check to see whether we have any BPF program + * attached to the hook so we don't waste time allocating + * memory and locking the socket. + */ + if (!cgroup_bpf_enabled || + __cgroup_bpf_prog_array_is_empty(cgrp, BPF_CGROUP_SETSOCKOPT)) + return 0; + + ret = sockopt_alloc_buf(&ctx, *optlen); + if (ret) + return ret; + + if (copy_from_user(ctx.optval, optval, *optlen) != 0) { + ret = -EFAULT; + goto out; + } + + lock_sock(sk); + ret = BPF_PROG_RUN_ARRAY(cgrp->bpf.effective[BPF_CGROUP_SETSOCKOPT], + &ctx, BPF_PROG_RUN); + release_sock(sk); + + if (!ret) { + ret = -EPERM; + goto out; + } + + if (ctx.optlen == -1) { + /* optlen set to -1, bypass kernel */ + ret = 1; + } else if (ctx.optlen > *optlen || ctx.optlen < -1) { + /* optlen is out of bounds */ + ret = -EFAULT; + } else { + /* optlen within bounds, run kernel handler */ + ret = 0; + + /* export any potential modifications */ + *level = ctx.level; + *optname = ctx.optname; + *optlen = ctx.optlen; + *kernel_optval = ctx.optval; + } + +out: + if (ret) + sockopt_free_buf(&ctx); + return ret; +} +EXPORT_SYMBOL(__cgroup_bpf_run_filter_setsockopt); + +int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level, + int optname, char __user *optval, + int __user *optlen, int max_optlen, + int retval) +{ + struct cgroup *cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data); + struct bpf_sockopt_kern ctx = { + .sk = sk, + .level = level, + .optname = optname, + .retval = retval, + }; + int ret; + + /* Opportunistic check to see whether we have any BPF program + * attached to the hook so we don't waste time allocating + * memory and locking the socket. + */ + if (!cgroup_bpf_enabled || + __cgroup_bpf_prog_array_is_empty(cgrp, BPF_CGROUP_GETSOCKOPT)) + return retval; + + ret = sockopt_alloc_buf(&ctx, max_optlen); + if (ret) + return ret; + + if (!retval) { + /* If kernel getsockopt finished successfully, + * copy whatever was returned to the user back + * into our temporary buffer. Set optlen to the + * one that kernel returned as well to let + * BPF programs inspect the value. + */ + + if (get_user(ctx.optlen, optlen)) { + ret = -EFAULT; + goto out; + } + + if (ctx.optlen > max_optlen) + ctx.optlen = max_optlen; + + if (copy_from_user(ctx.optval, optval, ctx.optlen) != 0) { + ret = -EFAULT; + goto out; + } + } + + lock_sock(sk); + ret = BPF_PROG_RUN_ARRAY(cgrp->bpf.effective[BPF_CGROUP_GETSOCKOPT], + &ctx, BPF_PROG_RUN); + release_sock(sk); + + if (!ret) { + ret = -EPERM; + goto out; + } + + if (ctx.optlen > max_optlen) { + ret = -EFAULT; + goto out; + } + + /* BPF programs only allowed to set retval to 0, not some + * arbitrary value. + */ + if (ctx.retval != 0 && ctx.retval != retval) { + ret = -EFAULT; + goto out; + } + + if (copy_to_user(optval, ctx.optval, ctx.optlen) || + put_user(ctx.optlen, optlen)) { + ret = -EFAULT; + goto out; + } + + ret = ctx.retval; + +out: + sockopt_free_buf(&ctx); + return ret; +} +EXPORT_SYMBOL(__cgroup_bpf_run_filter_getsockopt); + static ssize_t sysctl_cpy_dir(const struct ctl_dir *dir, char **bufp, size_t *lenp) { @@ -1198,3 +1381,153 @@ const struct bpf_verifier_ops cg_sysctl_verifier_ops = { const struct bpf_prog_ops cg_sysctl_prog_ops = { }; + +static const struct bpf_func_proto * +cg_sockopt_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) +{ + switch (func_id) { + case BPF_FUNC_sk_storage_get: + return &bpf_sk_storage_get_proto; + case BPF_FUNC_sk_storage_delete: + return &bpf_sk_storage_delete_proto; +#ifdef CONFIG_INET + case BPF_FUNC_tcp_sock: + return &bpf_tcp_sock_proto; +#endif + default: + return cgroup_base_func_proto(func_id, prog); + } +} + +static bool cg_sockopt_is_valid_access(int off, int size, + enum bpf_access_type type, + const struct bpf_prog *prog, + struct bpf_insn_access_aux *info) +{ + const int size_default = sizeof(__u32); + + if (off < 0 || off >= sizeof(struct bpf_sockopt)) + return false; + + if (off % size != 0) + return false; + + if (type == BPF_WRITE) { + switch (off) { + case offsetof(struct bpf_sockopt, retval): + if (size != size_default) + return false; + return prog->expected_attach_type == + BPF_CGROUP_GETSOCKOPT; + case offsetof(struct bpf_sockopt, optname): + /* fallthrough */ + case offsetof(struct bpf_sockopt, level): + if (size != size_default) + return false; + return prog->expected_attach_type == + BPF_CGROUP_SETSOCKOPT; + case offsetof(struct bpf_sockopt, optlen): + return size == size_default; + default: + return false; + } + } + + switch (off) { + case offsetof(struct bpf_sockopt, sk): + if (size != sizeof(__u64)) + return false; + info->reg_type = PTR_TO_SOCKET; + break; + case offsetof(struct bpf_sockopt, optval): + if (size != sizeof(__u64)) + return false; + info->reg_type = PTR_TO_PACKET; + break; + case offsetof(struct bpf_sockopt, optval_end): + if (size != sizeof(__u64)) + return false; + info->reg_type = PTR_TO_PACKET_END; + break; + case offsetof(struct bpf_sockopt, retval): + if (size != size_default) + return false; + return prog->expected_attach_type == BPF_CGROUP_GETSOCKOPT; + default: + if (size != size_default) + return false; + break; + } + return true; +} + +#define CG_SOCKOPT_ACCESS_FIELD(T, F) \ + T(BPF_FIELD_SIZEOF(struct bpf_sockopt_kern, F), \ + si->dst_reg, si->src_reg, \ + offsetof(struct bpf_sockopt_kern, F)) + +static u32 cg_sockopt_convert_ctx_access(enum bpf_access_type type, + const struct bpf_insn *si, + struct bpf_insn *insn_buf, + struct bpf_prog *prog, + u32 *target_size) +{ + struct bpf_insn *insn = insn_buf; + + switch (si->off) { + case offsetof(struct bpf_sockopt, sk): + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_LDX_MEM, sk); + break; + case offsetof(struct bpf_sockopt, level): + if (type == BPF_WRITE) + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_STX_MEM, level); + else + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_LDX_MEM, level); + break; + case offsetof(struct bpf_sockopt, optname): + if (type == BPF_WRITE) + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_STX_MEM, optname); + else + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_LDX_MEM, optname); + break; + case offsetof(struct bpf_sockopt, optlen): + if (type == BPF_WRITE) + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_STX_MEM, optlen); + else + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_LDX_MEM, optlen); + break; + case offsetof(struct bpf_sockopt, retval): + if (type == BPF_WRITE) + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_STX_MEM, retval); + else + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_LDX_MEM, retval); + break; + case offsetof(struct bpf_sockopt, optval): + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_LDX_MEM, optval); + break; + case offsetof(struct bpf_sockopt, optval_end): + *insn++ = CG_SOCKOPT_ACCESS_FIELD(BPF_LDX_MEM, optval_end); + break; + } + + return insn - insn_buf; +} + +static int cg_sockopt_get_prologue(struct bpf_insn *insn_buf, + bool direct_write, + const struct bpf_prog *prog) +{ + /* Nothing to do for sockopt argument. The data is kzalloc'ated. + */ + return 0; +} + +const struct bpf_verifier_ops cg_sockopt_verifier_ops = { + .get_func_proto = cg_sockopt_func_proto, + .is_valid_access = cg_sockopt_is_valid_access, + .convert_ctx_access = cg_sockopt_convert_ctx_access, + .gen_prologue = cg_sockopt_get_prologue, +}; + +const struct bpf_prog_ops cg_sockopt_prog_ops = { +}; diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 561ed07d3007..e2c1b43728da 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1809,6 +1809,15 @@ int bpf_prog_array_length(struct bpf_prog_array *array) return cnt; } +bool bpf_prog_array_is_empty(struct bpf_prog_array *array) +{ + struct bpf_prog_array_item *item; + + for (item = array->items; item->prog; item++) + if (item->prog != &dummy_bpf_prog.prog) + return false; + return true; +} static bool bpf_prog_array_copy_core(struct bpf_prog_array *array, u32 *prog_ids, diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 7713cf39795a..b0f545e07425 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1590,6 +1590,14 @@ bpf_prog_load_check_attach_type(enum bpf_prog_type prog_type, default: return -EINVAL; } + case BPF_PROG_TYPE_CGROUP_SOCKOPT: + switch (expected_attach_type) { + case BPF_CGROUP_SETSOCKOPT: + case BPF_CGROUP_GETSOCKOPT: + return 0; + default: + return -EINVAL; + } default: return 0; } @@ -1840,6 +1848,7 @@ static int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog, switch (prog->type) { case BPF_PROG_TYPE_CGROUP_SOCK: case BPF_PROG_TYPE_CGROUP_SOCK_ADDR: + case BPF_PROG_TYPE_CGROUP_SOCKOPT: return attach_type == prog->expected_attach_type ? 0 : -EINVAL; case BPF_PROG_TYPE_CGROUP_SKB: return prog->enforce_expected_attach_type && @@ -1912,6 +1921,10 @@ static int bpf_prog_attach(const union bpf_attr *attr) case BPF_CGROUP_SYSCTL: ptype = BPF_PROG_TYPE_CGROUP_SYSCTL; break; + case BPF_CGROUP_GETSOCKOPT: + case BPF_CGROUP_SETSOCKOPT: + ptype = BPF_PROG_TYPE_CGROUP_SOCKOPT; + break; default: return -EINVAL; } @@ -1995,6 +2008,10 @@ static int bpf_prog_detach(const union bpf_attr *attr) case BPF_CGROUP_SYSCTL: ptype = BPF_PROG_TYPE_CGROUP_SYSCTL; break; + case BPF_CGROUP_GETSOCKOPT: + case BPF_CGROUP_SETSOCKOPT: + ptype = BPF_PROG_TYPE_CGROUP_SOCKOPT; + break; default: return -EINVAL; } @@ -2031,6 +2048,8 @@ static int bpf_prog_query(const union bpf_attr *attr, case BPF_CGROUP_SOCK_OPS: case BPF_CGROUP_DEVICE: case BPF_CGROUP_SYSCTL: + case BPF_CGROUP_GETSOCKOPT: + case BPF_CGROUP_SETSOCKOPT: break; case BPF_LIRC_MODE2: return lirc_prog_query(attr, uattr); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 0e079b2298f8..6b5623d320f9 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2215,6 +2215,13 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env, env->seen_direct_write = true; return true; + + case BPF_PROG_TYPE_CGROUP_SOCKOPT: + if (t == BPF_WRITE) + env->seen_direct_write = true; + + return true; + default: return false; } @@ -6066,6 +6073,7 @@ static int check_return_code(struct bpf_verifier_env *env) case BPF_PROG_TYPE_SOCK_OPS: case BPF_PROG_TYPE_CGROUP_DEVICE: case BPF_PROG_TYPE_CGROUP_SYSCTL: + case BPF_PROG_TYPE_CGROUP_SOCKOPT: break; default: return 0; diff --git a/net/core/filter.c b/net/core/filter.c index 2014d76e0d2a..dc8534be12fc 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5651,7 +5651,7 @@ BPF_CALL_1(bpf_tcp_sock, struct sock *, sk) return (unsigned long)NULL; } -static const struct bpf_func_proto bpf_tcp_sock_proto = { +const struct bpf_func_proto bpf_tcp_sock_proto = { .func = bpf_tcp_sock, .gpl_only = false, .ret_type = RET_PTR_TO_TCP_SOCK_OR_NULL, diff --git a/net/socket.c b/net/socket.c index 963df5dbdd54..0ddfbfb761d9 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2051,6 +2051,8 @@ SYSCALL_DEFINE4(recv, int, fd, void __user *, ubuf, size_t, size, static int __sys_setsockopt(int fd, int level, int optname, char __user *optval, int optlen) { + mm_segment_t oldfs = get_fs(); + char *kernel_optval = NULL; int err, fput_needed; struct socket *sock; @@ -2063,6 +2065,22 @@ static int __sys_setsockopt(int fd, int level, int optname, if (err) goto out_put; + err = BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock->sk, &level, + &optname, optval, &optlen, + &kernel_optval); + + if (err < 0) { + goto out_put; + } else if (err > 0) { + err = 0; + goto out_put; + } + + if (kernel_optval) { + set_fs(KERNEL_DS); + optval = (char __user __force *)kernel_optval; + } + if (level == SOL_SOCKET) err = sock_setsockopt(sock, level, optname, optval, @@ -2071,6 +2089,11 @@ static int __sys_setsockopt(int fd, int level, int optname, err = sock->ops->setsockopt(sock, level, optname, optval, optlen); + + if (kernel_optval) { + set_fs(oldfs); + kfree(kernel_optval); + } out_put: fput_light(sock->file, fput_needed); } @@ -2093,6 +2116,7 @@ static int __sys_getsockopt(int fd, int level, int optname, { int err, fput_needed; struct socket *sock; + int max_optlen; sock = sockfd_lookup_light(fd, &err, &fput_needed); if (sock != NULL) { @@ -2100,6 +2124,8 @@ static int __sys_getsockopt(int fd, int level, int optname, if (err) goto out_put; + max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen); + if (level == SOL_SOCKET) err = sock_getsockopt(sock, level, optname, optval, @@ -2108,6 +2134,10 @@ static int __sys_getsockopt(int fd, int level, int optname, err = sock->ops->getsockopt(sock, level, optname, optval, optlen); + + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname, + optval, optlen, + max_optlen, err); out_put: fput_light(sock->file, fput_needed); } -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH bpf-next v9 2/9] bpf: sync bpf.h to tools/ 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 1/9] bpf: implement " Stanislav Fomichev @ 2019-06-27 20:38 ` Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 3/9] libbpf: support sockopt hooks Stanislav Fomichev ` (7 subsequent siblings) 9 siblings, 0 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko, Martin Lau Export new prog type and hook points to the libbpf. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> --- tools/include/uapi/linux/bpf.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index b077507efa3f..a396b516a2b2 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -170,6 +170,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_FLOW_DISSECTOR, BPF_PROG_TYPE_CGROUP_SYSCTL, BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, + BPF_PROG_TYPE_CGROUP_SOCKOPT, }; enum bpf_attach_type { @@ -194,6 +195,8 @@ enum bpf_attach_type { BPF_CGROUP_SYSCTL, BPF_CGROUP_UDP4_RECVMSG, BPF_CGROUP_UDP6_RECVMSG, + BPF_CGROUP_GETSOCKOPT, + BPF_CGROUP_SETSOCKOPT, __MAX_BPF_ATTACH_TYPE }; @@ -3541,4 +3544,15 @@ struct bpf_sysctl { */ }; +struct bpf_sockopt { + __bpf_md_ptr(struct bpf_sock *, sk); + __bpf_md_ptr(void *, optval); + __bpf_md_ptr(void *, optval_end); + + __s32 level; + __s32 optname; + __s32 optlen; + __s32 retval; +}; + #endif /* _UAPI__LINUX_BPF_H__ */ -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH bpf-next v9 3/9] libbpf: support sockopt hooks 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 1/9] bpf: implement " Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 2/9] bpf: sync bpf.h to tools/ Stanislav Fomichev @ 2019-06-27 20:38 ` Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 4/9] selftests/bpf: test sockopt section name Stanislav Fomichev ` (6 subsequent siblings) 9 siblings, 0 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko, Martin Lau Make libbpf aware of new sockopt hooks so it can derive prog type and hook point from the section names. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> --- tools/lib/bpf/libbpf.c | 5 +++++ tools/lib/bpf/libbpf_probes.c | 1 + 2 files changed, 6 insertions(+) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 5186b7710430..6e6ebef11ba3 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2646,6 +2646,7 @@ static bool bpf_prog_type__needs_kver(enum bpf_prog_type type) case BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE: case BPF_PROG_TYPE_PERF_EVENT: case BPF_PROG_TYPE_CGROUP_SYSCTL: + case BPF_PROG_TYPE_CGROUP_SOCKOPT: return false; case BPF_PROG_TYPE_KPROBE: default: @@ -3604,6 +3605,10 @@ static const struct { BPF_CGROUP_UDP6_RECVMSG), BPF_EAPROG_SEC("cgroup/sysctl", BPF_PROG_TYPE_CGROUP_SYSCTL, BPF_CGROUP_SYSCTL), + BPF_EAPROG_SEC("cgroup/getsockopt", BPF_PROG_TYPE_CGROUP_SOCKOPT, + BPF_CGROUP_GETSOCKOPT), + BPF_EAPROG_SEC("cgroup/setsockopt", BPF_PROG_TYPE_CGROUP_SOCKOPT, + BPF_CGROUP_SETSOCKOPT), }; #undef BPF_PROG_SEC_IMPL diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c index 6635a31a7a16..ace1a0708d99 100644 --- a/tools/lib/bpf/libbpf_probes.c +++ b/tools/lib/bpf/libbpf_probes.c @@ -101,6 +101,7 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns, case BPF_PROG_TYPE_SK_REUSEPORT: case BPF_PROG_TYPE_FLOW_DISSECTOR: case BPF_PROG_TYPE_CGROUP_SYSCTL: + case BPF_PROG_TYPE_CGROUP_SOCKOPT: default: break; } -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH bpf-next v9 4/9] selftests/bpf: test sockopt section name 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev ` (2 preceding siblings ...) 2019-06-27 20:38 ` [PATCH bpf-next v9 3/9] libbpf: support sockopt hooks Stanislav Fomichev @ 2019-06-27 20:38 ` Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 5/9] selftests/bpf: add sockopt test Stanislav Fomichev ` (5 subsequent siblings) 9 siblings, 0 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko, Martin Lau Add tests that make sure libbpf section detection works. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> --- tools/testing/selftests/bpf/test_section_names.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/tools/testing/selftests/bpf/test_section_names.c b/tools/testing/selftests/bpf/test_section_names.c index dee2f2eceb0f..29833aeaf0de 100644 --- a/tools/testing/selftests/bpf/test_section_names.c +++ b/tools/testing/selftests/bpf/test_section_names.c @@ -134,6 +134,16 @@ static struct sec_name_test tests[] = { {0, BPF_PROG_TYPE_CGROUP_SYSCTL, BPF_CGROUP_SYSCTL}, {0, BPF_CGROUP_SYSCTL}, }, + { + "cgroup/getsockopt", + {0, BPF_PROG_TYPE_CGROUP_SOCKOPT, BPF_CGROUP_GETSOCKOPT}, + {0, BPF_CGROUP_GETSOCKOPT}, + }, + { + "cgroup/setsockopt", + {0, BPF_PROG_TYPE_CGROUP_SOCKOPT, BPF_CGROUP_SETSOCKOPT}, + {0, BPF_CGROUP_SETSOCKOPT}, + }, }; static int test_prog_type_by_name(const struct sec_name_test *test) -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH bpf-next v9 5/9] selftests/bpf: add sockopt test 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev ` (3 preceding siblings ...) 2019-06-27 20:38 ` [PATCH bpf-next v9 4/9] selftests/bpf: test sockopt section name Stanislav Fomichev @ 2019-06-27 20:38 ` Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 6/9] selftests/bpf: add sockopt test that exercises sk helpers Stanislav Fomichev ` (4 subsequent siblings) 9 siblings, 0 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko, Martin Lau Add sockopt selftests: * require proper expected_attach_type * enforce context field read/write access * test bpf_sockopt_handled handler * test EPERM * test limiting optlen from getsockopt * test out-of-bounds access v9: * add tests for setsockopt argument mangling v7: * remove return 2; test retval=0 and optlen=-1 v3: * use DW for optval{,_end} loads v2: * use return code 2 for kernel bypass Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> --- tools/testing/selftests/bpf/.gitignore | 1 + tools/testing/selftests/bpf/Makefile | 3 +- tools/testing/selftests/bpf/test_sockopt.c | 1021 ++++++++++++++++++++ 3 files changed, 1024 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/test_sockopt.c diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore index 7470327edcfe..3fe92601223d 100644 --- a/tools/testing/selftests/bpf/.gitignore +++ b/tools/testing/selftests/bpf/.gitignore @@ -39,3 +39,4 @@ libbpf.so.* test_hashmap test_btf_dump xdping +test_sockopt diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index faeb8220daa2..7f2210402e57 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -26,7 +26,7 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test test_sock test_btf test_sockmap get_cgroup_id_user test_socket_cookie \ test_cgroup_storage test_select_reuseport test_section_names \ test_netcnt test_tcpnotify_user test_sock_fields test_sysctl test_hashmap \ - test_btf_dump test_cgroup_attach xdping + test_btf_dump test_cgroup_attach xdping test_sockopt BPF_OBJ_FILES = $(patsubst %.c,%.o, $(notdir $(wildcard progs/*.c))) TEST_GEN_FILES = $(BPF_OBJ_FILES) @@ -103,6 +103,7 @@ $(OUTPUT)/test_netcnt: cgroup_helpers.c $(OUTPUT)/test_sock_fields: cgroup_helpers.c $(OUTPUT)/test_sysctl: cgroup_helpers.c $(OUTPUT)/test_cgroup_attach: cgroup_helpers.c +$(OUTPUT)/test_sockopt: cgroup_helpers.c .PHONY: force diff --git a/tools/testing/selftests/bpf/test_sockopt.c b/tools/testing/selftests/bpf/test_sockopt.c new file mode 100644 index 000000000000..23bd0819382d --- /dev/null +++ b/tools/testing/selftests/bpf/test_sockopt.c @@ -0,0 +1,1021 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <errno.h> +#include <stdio.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <netinet/in.h> + +#include <linux/filter.h> +#include <bpf/bpf.h> +#include <bpf/libbpf.h> + +#include "bpf_rlimit.h" +#include "bpf_util.h" +#include "cgroup_helpers.h" + +#define CG_PATH "/sockopt" + +static char bpf_log_buf[4096]; +static bool verbose; + +enum sockopt_test_error { + OK = 0, + DENY_LOAD, + DENY_ATTACH, + EPERM_GETSOCKOPT, + EFAULT_GETSOCKOPT, + EPERM_SETSOCKOPT, + EFAULT_SETSOCKOPT, +}; + +static struct sockopt_test { + const char *descr; + const struct bpf_insn insns[64]; + enum bpf_attach_type attach_type; + enum bpf_attach_type expected_attach_type; + + int set_optname; + int set_level; + const char set_optval[64]; + socklen_t set_optlen; + + int get_optname; + int get_level; + const char get_optval[64]; + socklen_t get_optlen; + socklen_t get_optlen_ret; + + enum sockopt_test_error error; +} tests[] = { + + /* ==================== getsockopt ==================== */ + + { + .descr = "getsockopt: no expected_attach_type", + .insns = { + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = 0, + .error = DENY_LOAD, + }, + { + .descr = "getsockopt: wrong expected_attach_type", + .insns = { + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + .error = DENY_ATTACH, + }, + { + .descr = "getsockopt: bypass bpf hook", + .insns = { + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_level = SOL_IP, + .set_level = SOL_IP, + + .get_optname = IP_TOS, + .set_optname = IP_TOS, + + .set_optval = { 1 << 3 }, + .set_optlen = 1, + + .get_optval = { 1 << 3 }, + .get_optlen = 1, + }, + { + .descr = "getsockopt: return EPERM from bpf hook", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_level = SOL_IP, + .get_optname = IP_TOS, + + .get_optlen = 1, + .error = EPERM_GETSOCKOPT, + }, + { + .descr = "getsockopt: no optval bounds check, deny loading", + .insns = { + /* r6 = ctx->optval */ + BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optval)), + + /* ctx->optval[0] = 0x80 */ + BPF_MOV64_IMM(BPF_REG_0, 0x80), + BPF_STX_MEM(BPF_W, BPF_REG_6, BPF_REG_0, 0), + + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + .error = DENY_LOAD, + }, + { + .descr = "getsockopt: read ctx->level", + .insns = { + /* r6 = ctx->level */ + BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, level)), + + /* if (ctx->level == 123) { */ + BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 123, 4), + /* ctx->retval = 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, retval)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_JMP_A(1), + /* } else { */ + /* return 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + /* } */ + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_level = 123, + + .get_optlen = 1, + }, + { + .descr = "getsockopt: deny writing to ctx->level", + .insns = { + /* ctx->level = 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, level)), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .error = DENY_LOAD, + }, + { + .descr = "getsockopt: read ctx->optname", + .insns = { + /* r6 = ctx->optname */ + BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optname)), + + /* if (ctx->optname == 123) { */ + BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 123, 4), + /* ctx->retval = 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, retval)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_JMP_A(1), + /* } else { */ + /* return 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + /* } */ + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_optname = 123, + + .get_optlen = 1, + }, + { + .descr = "getsockopt: read ctx->retval", + .insns = { + /* r6 = ctx->retval */ + BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, retval)), + + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_level = SOL_IP, + .get_optname = IP_TOS, + .get_optlen = 1, + }, + { + .descr = "getsockopt: deny writing to ctx->optname", + .insns = { + /* ctx->optname = 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optname)), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .error = DENY_LOAD, + }, + { + .descr = "getsockopt: read ctx->optlen", + .insns = { + /* r6 = ctx->optlen */ + BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optlen)), + + /* if (ctx->optlen == 64) { */ + BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 64, 4), + /* ctx->retval = 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, retval)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_JMP_A(1), + /* } else { */ + /* return 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + /* } */ + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_optlen = 64, + }, + { + .descr = "getsockopt: deny bigger ctx->optlen", + .insns = { + /* ctx->optlen = 65 */ + BPF_MOV64_IMM(BPF_REG_0, 65), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optlen)), + + /* ctx->retval = 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, retval)), + + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_optlen = 64, + + .error = EFAULT_GETSOCKOPT, + }, + { + .descr = "getsockopt: deny arbitrary ctx->retval", + .insns = { + /* ctx->retval = 123 */ + BPF_MOV64_IMM(BPF_REG_0, 123), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, retval)), + + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_optlen = 64, + + .error = EFAULT_GETSOCKOPT, + }, + { + .descr = "getsockopt: support smaller ctx->optlen", + .insns = { + /* ctx->optlen = 32 */ + BPF_MOV64_IMM(BPF_REG_0, 32), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optlen)), + /* ctx->retval = 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, retval)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_optlen = 64, + .get_optlen_ret = 32, + }, + { + .descr = "getsockopt: deny writing to ctx->optval", + .insns = { + /* ctx->optval = 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optval)), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .error = DENY_LOAD, + }, + { + .descr = "getsockopt: deny writing to ctx->optval_end", + .insns = { + /* ctx->optval_end = 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optval_end)), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .error = DENY_LOAD, + }, + { + .descr = "getsockopt: rewrite value", + .insns = { + /* r6 = ctx->optval */ + BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optval)), + /* r2 = ctx->optval */ + BPF_MOV64_REG(BPF_REG_2, BPF_REG_6), + /* r6 = ctx->optval + 1 */ + BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1), + + /* r7 = ctx->optval_end */ + BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1, + offsetof(struct bpf_sockopt, optval_end)), + + /* if (ctx->optval + 1 <= ctx->optval_end) { */ + BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1), + /* ctx->optval[0] = 0xF0 */ + BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 0xF0), + /* } */ + + /* ctx->retval = 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, retval)), + + /* return 1*/ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_GETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + + .get_level = SOL_IP, + .get_optname = IP_TOS, + + .get_optval = { 0xF0 }, + .get_optlen = 1, + }, + + /* ==================== setsockopt ==================== */ + + { + .descr = "setsockopt: no expected_attach_type", + .insns = { + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = 0, + .error = DENY_LOAD, + }, + { + .descr = "setsockopt: wrong expected_attach_type", + .insns = { + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_GETSOCKOPT, + .error = DENY_ATTACH, + }, + { + .descr = "setsockopt: bypass bpf hook", + .insns = { + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .get_level = SOL_IP, + .set_level = SOL_IP, + + .get_optname = IP_TOS, + .set_optname = IP_TOS, + + .set_optval = { 1 << 3 }, + .set_optlen = 1, + + .get_optval = { 1 << 3 }, + .get_optlen = 1, + }, + { + .descr = "setsockopt: return EPERM from bpf hook", + .insns = { + /* return 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .set_level = SOL_IP, + .set_optname = IP_TOS, + + .set_optlen = 1, + .error = EPERM_SETSOCKOPT, + }, + { + .descr = "setsockopt: no optval bounds check, deny loading", + .insns = { + /* r6 = ctx->optval */ + BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optval)), + + /* r0 = ctx->optval[0] */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6, 0), + + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + .error = DENY_LOAD, + }, + { + .descr = "setsockopt: read ctx->level", + .insns = { + /* r6 = ctx->level */ + BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, level)), + + /* if (ctx->level == 123) { */ + BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 123, 4), + /* ctx->optlen = -1 */ + BPF_MOV64_IMM(BPF_REG_0, -1), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optlen)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_JMP_A(1), + /* } else { */ + /* return 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + /* } */ + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .set_level = 123, + + .set_optlen = 1, + }, + { + .descr = "setsockopt: allow changing ctx->level", + .insns = { + /* ctx->level = SOL_IP */ + BPF_MOV64_IMM(BPF_REG_0, SOL_IP), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, level)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .get_level = SOL_IP, + .set_level = 234, /* should be rewritten to SOL_IP */ + + .get_optname = IP_TOS, + .set_optname = IP_TOS, + + .set_optval = { 1 << 3 }, + .set_optlen = 1, + .get_optval = { 1 << 3 }, + .get_optlen = 1, + }, + { + .descr = "setsockopt: read ctx->optname", + .insns = { + /* r6 = ctx->optname */ + BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optname)), + + /* if (ctx->optname == 123) { */ + BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 123, 4), + /* ctx->optlen = -1 */ + BPF_MOV64_IMM(BPF_REG_0, -1), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optlen)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_JMP_A(1), + /* } else { */ + /* return 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + /* } */ + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .set_optname = 123, + + .set_optlen = 1, + }, + { + .descr = "setsockopt: allow changing ctx->optname", + .insns = { + /* ctx->optname = IP_TOS */ + BPF_MOV64_IMM(BPF_REG_0, IP_TOS), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optname)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .get_level = SOL_IP, + .set_level = SOL_IP, + + .get_optname = IP_TOS, + .set_optname = 456, /* should be rewritten to IP_TOS */ + + .set_optval = { 1 << 3 }, + .set_optlen = 1, + .get_optval = { 1 << 3 }, + .get_optlen = 1, + }, + { + .descr = "setsockopt: read ctx->optlen", + .insns = { + /* r6 = ctx->optlen */ + BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optlen)), + + /* if (ctx->optlen == 64) { */ + BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 64, 4), + /* ctx->optlen = -1 */ + BPF_MOV64_IMM(BPF_REG_0, -1), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optlen)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_JMP_A(1), + /* } else { */ + /* return 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + /* } */ + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .set_optlen = 64, + }, + { + .descr = "setsockopt: ctx->optlen == -1 is ok", + .insns = { + /* ctx->optlen = -1 */ + BPF_MOV64_IMM(BPF_REG_0, -1), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optlen)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .set_optlen = 64, + }, + { + .descr = "setsockopt: deny ctx->optlen < 0 (except -1)", + .insns = { + /* ctx->optlen = -2 */ + BPF_MOV64_IMM(BPF_REG_0, -2), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optlen)), + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .set_optlen = 4, + + .error = EFAULT_SETSOCKOPT, + }, + { + .descr = "setsockopt: deny ctx->optlen > input optlen", + .insns = { + /* ctx->optlen = 65 */ + BPF_MOV64_IMM(BPF_REG_0, 65), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optlen)), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .set_optlen = 64, + + .error = EFAULT_SETSOCKOPT, + }, + { + .descr = "setsockopt: allow changing ctx->optlen within bounds", + .insns = { + /* r6 = ctx->optval */ + BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optval)), + /* r2 = ctx->optval */ + BPF_MOV64_REG(BPF_REG_2, BPF_REG_6), + /* r6 = ctx->optval + 1 */ + BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1), + + /* r7 = ctx->optval_end */ + BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1, + offsetof(struct bpf_sockopt, optval_end)), + + /* if (ctx->optval + 1 <= ctx->optval_end) { */ + BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1), + /* ctx->optval[0] = 1 << 3 */ + BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 1 << 3), + /* } */ + + /* ctx->optlen = 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optlen)), + + /* return 1*/ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .get_level = SOL_IP, + .set_level = SOL_IP, + + .get_optname = IP_TOS, + .set_optname = IP_TOS, + + .set_optval = { 1, 1, 1, 1 }, + .set_optlen = 4, + .get_optval = { 1 << 3 }, + .get_optlen = 1, + }, + { + .descr = "setsockopt: deny write ctx->retval", + .insns = { + /* ctx->retval = 0 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, retval)), + + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .error = DENY_LOAD, + }, + { + .descr = "setsockopt: deny read ctx->retval", + .insns = { + /* r6 = ctx->retval */ + BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, retval)), + + /* return 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .error = DENY_LOAD, + }, + { + .descr = "setsockopt: deny writing to ctx->optval", + .insns = { + /* ctx->optval = 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optval)), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .error = DENY_LOAD, + }, + { + .descr = "setsockopt: deny writing to ctx->optval_end", + .insns = { + /* ctx->optval_end = 1 */ + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, + offsetof(struct bpf_sockopt, optval_end)), + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .error = DENY_LOAD, + }, + { + .descr = "setsockopt: allow IP_TOS <= 128", + .insns = { + /* r6 = ctx->optval */ + BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optval)), + /* r7 = ctx->optval + 1 */ + BPF_MOV64_REG(BPF_REG_7, BPF_REG_6), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, 1), + + /* r8 = ctx->optval_end */ + BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_1, + offsetof(struct bpf_sockopt, optval_end)), + + /* if (ctx->optval + 1 <= ctx->optval_end) { */ + BPF_JMP_REG(BPF_JGT, BPF_REG_7, BPF_REG_8, 4), + + /* r9 = ctx->optval[0] */ + BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_6, 0), + + /* if (ctx->optval[0] < 128) */ + BPF_JMP_IMM(BPF_JGT, BPF_REG_9, 128, 2), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_JMP_A(1), + /* } */ + + /* } else { */ + BPF_MOV64_IMM(BPF_REG_0, 0), + /* } */ + + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .get_level = SOL_IP, + .set_level = SOL_IP, + + .get_optname = IP_TOS, + .set_optname = IP_TOS, + + .set_optval = { 0x80 }, + .set_optlen = 1, + .get_optval = { 0x80 }, + .get_optlen = 1, + }, + { + .descr = "setsockopt: deny IP_TOS > 128", + .insns = { + /* r6 = ctx->optval */ + BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, + offsetof(struct bpf_sockopt, optval)), + /* r7 = ctx->optval + 1 */ + BPF_MOV64_REG(BPF_REG_7, BPF_REG_6), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, 1), + + /* r8 = ctx->optval_end */ + BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_1, + offsetof(struct bpf_sockopt, optval_end)), + + /* if (ctx->optval + 1 <= ctx->optval_end) { */ + BPF_JMP_REG(BPF_JGT, BPF_REG_7, BPF_REG_8, 4), + + /* r9 = ctx->optval[0] */ + BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_6, 0), + + /* if (ctx->optval[0] < 128) */ + BPF_JMP_IMM(BPF_JGT, BPF_REG_9, 128, 2), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_JMP_A(1), + /* } */ + + /* } else { */ + BPF_MOV64_IMM(BPF_REG_0, 0), + /* } */ + + BPF_EXIT_INSN(), + }, + .attach_type = BPF_CGROUP_SETSOCKOPT, + .expected_attach_type = BPF_CGROUP_SETSOCKOPT, + + .get_level = SOL_IP, + .set_level = SOL_IP, + + .get_optname = IP_TOS, + .set_optname = IP_TOS, + + .set_optval = { 0x81 }, + .set_optlen = 1, + .get_optval = { 0x00 }, + .get_optlen = 1, + + .error = EPERM_SETSOCKOPT, + }, +}; + +static int load_prog(const struct bpf_insn *insns, + enum bpf_attach_type expected_attach_type) +{ + struct bpf_load_program_attr attr = { + .prog_type = BPF_PROG_TYPE_CGROUP_SOCKOPT, + .expected_attach_type = expected_attach_type, + .insns = insns, + .license = "GPL", + .log_level = 2, + }; + int fd; + + for (; + insns[attr.insns_cnt].code != (BPF_JMP | BPF_EXIT); + attr.insns_cnt++) { + } + attr.insns_cnt++; + + fd = bpf_load_program_xattr(&attr, bpf_log_buf, sizeof(bpf_log_buf)); + if (verbose && fd < 0) + fprintf(stderr, "%s\n", bpf_log_buf); + + return fd; +} + +static int run_test(int cgroup_fd, struct sockopt_test *test) +{ + int sock_fd, err, prog_fd; + void *optval = NULL; + int ret = 0; + + prog_fd = load_prog(test->insns, test->expected_attach_type); + if (prog_fd < 0) { + if (test->error == DENY_LOAD) + return 0; + + log_err("Failed to load BPF program"); + return -1; + } + + err = bpf_prog_attach(prog_fd, cgroup_fd, test->attach_type, 0); + if (err < 0) { + if (test->error == DENY_ATTACH) + goto close_prog_fd; + + log_err("Failed to attach BPF program"); + ret = -1; + goto close_prog_fd; + } + + sock_fd = socket(AF_INET, SOCK_STREAM, 0); + if (sock_fd < 0) { + log_err("Failed to create AF_INET socket"); + ret = -1; + goto detach_prog; + } + + if (test->set_optlen) { + err = setsockopt(sock_fd, test->set_level, test->set_optname, + test->set_optval, test->set_optlen); + if (err) { + if (errno == EPERM && test->error == EPERM_SETSOCKOPT) + goto close_sock_fd; + if (errno == EFAULT && test->error == EFAULT_SETSOCKOPT) + goto free_optval; + + log_err("Failed to call setsockopt"); + ret = -1; + goto close_sock_fd; + } + } + + if (test->get_optlen) { + optval = malloc(test->get_optlen); + socklen_t optlen = test->get_optlen; + socklen_t expected_get_optlen = test->get_optlen_ret ?: + test->get_optlen; + + err = getsockopt(sock_fd, test->get_level, test->get_optname, + optval, &optlen); + if (err) { + if (errno == EPERM && test->error == EPERM_GETSOCKOPT) + goto free_optval; + if (errno == EFAULT && test->error == EFAULT_GETSOCKOPT) + goto free_optval; + + log_err("Failed to call getsockopt"); + ret = -1; + goto free_optval; + } + + if (optlen != expected_get_optlen) { + errno = 0; + log_err("getsockopt returned unexpected optlen"); + ret = -1; + goto free_optval; + } + + if (memcmp(optval, test->get_optval, optlen) != 0) { + errno = 0; + log_err("getsockopt returned unexpected optval"); + ret = -1; + goto free_optval; + } + } + + ret = test->error != OK; + +free_optval: + free(optval); +close_sock_fd: + close(sock_fd); +detach_prog: + bpf_prog_detach2(prog_fd, cgroup_fd, test->attach_type); +close_prog_fd: + close(prog_fd); + return ret; +} + +int main(int args, char **argv) +{ + int err = EXIT_FAILURE, error_cnt = 0; + int cgroup_fd, i; + + if (setup_cgroup_environment()) + goto cleanup_obj; + + cgroup_fd = create_and_get_cgroup(CG_PATH); + if (cgroup_fd < 0) + goto cleanup_cgroup_env; + + if (join_cgroup(CG_PATH)) + goto cleanup_cgroup; + + for (i = 0; i < ARRAY_SIZE(tests); i++) { + int err = run_test(cgroup_fd, &tests[i]); + + if (err) + error_cnt++; + + printf("#%d %s: %s\n", i, err ? "FAIL" : "PASS", + tests[i].descr); + } + + printf("Summary: %ld PASSED, %d FAILED\n", + ARRAY_SIZE(tests) - error_cnt, error_cnt); + err = error_cnt ? EXIT_FAILURE : EXIT_SUCCESS; + +cleanup_cgroup: + close(cgroup_fd); +cleanup_cgroup_env: + cleanup_cgroup_environment(); +cleanup_obj: + return err; +} -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH bpf-next v9 6/9] selftests/bpf: add sockopt test that exercises sk helpers 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev ` (4 preceding siblings ...) 2019-06-27 20:38 ` [PATCH bpf-next v9 5/9] selftests/bpf: add sockopt test Stanislav Fomichev @ 2019-06-27 20:38 ` Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 7/9] selftests/bpf: add sockopt test that exercises BPF_F_ALLOW_MULTI Stanislav Fomichev ` (3 subsequent siblings) 9 siblings, 0 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko, Martin Lau socktop test that introduces new SOL_CUSTOM sockopt level and stores whatever users sets in sk storage. Whenever getsockopt is called, the original value is retrieved. v9: * SO_SNDBUF example to override user-supplied buffer v7: * use retval=0 and optlen-1 v6: * test 'ret=1' use-case as well (Alexei Starovoitov) v4: * don't call bpf_sk_fullsock helper v3: * drop (__u8 *)(long) casts for optval{,_end} v2: * new test Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> --- tools/testing/selftests/bpf/.gitignore | 1 + tools/testing/selftests/bpf/Makefile | 3 +- .../testing/selftests/bpf/progs/sockopt_sk.c | 111 +++++++++ tools/testing/selftests/bpf/test_sockopt_sk.c | 214 ++++++++++++++++++ 4 files changed, 328 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/progs/sockopt_sk.c create mode 100644 tools/testing/selftests/bpf/test_sockopt_sk.c diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore index 3fe92601223d..8ac076c311d4 100644 --- a/tools/testing/selftests/bpf/.gitignore +++ b/tools/testing/selftests/bpf/.gitignore @@ -40,3 +40,4 @@ test_hashmap test_btf_dump xdping test_sockopt +test_sockopt_sk diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 7f2210402e57..57ae4e168ace 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -26,7 +26,7 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test test_sock test_btf test_sockmap get_cgroup_id_user test_socket_cookie \ test_cgroup_storage test_select_reuseport test_section_names \ test_netcnt test_tcpnotify_user test_sock_fields test_sysctl test_hashmap \ - test_btf_dump test_cgroup_attach xdping test_sockopt + test_btf_dump test_cgroup_attach xdping test_sockopt test_sockopt_sk BPF_OBJ_FILES = $(patsubst %.c,%.o, $(notdir $(wildcard progs/*.c))) TEST_GEN_FILES = $(BPF_OBJ_FILES) @@ -104,6 +104,7 @@ $(OUTPUT)/test_sock_fields: cgroup_helpers.c $(OUTPUT)/test_sysctl: cgroup_helpers.c $(OUTPUT)/test_cgroup_attach: cgroup_helpers.c $(OUTPUT)/test_sockopt: cgroup_helpers.c +$(OUTPUT)/test_sockopt_sk: cgroup_helpers.c .PHONY: force diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c new file mode 100644 index 000000000000..076122c898e9 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c @@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <netinet/in.h> +#include <linux/bpf.h> +#include "bpf_helpers.h" + +char _license[] SEC("license") = "GPL"; +__u32 _version SEC("version") = 1; + +#define SOL_CUSTOM 0xdeadbeef + +struct sockopt_sk { + __u8 val; +}; + +struct bpf_map_def SEC("maps") socket_storage_map = { + .type = BPF_MAP_TYPE_SK_STORAGE, + .key_size = sizeof(int), + .value_size = sizeof(struct sockopt_sk), + .map_flags = BPF_F_NO_PREALLOC, +}; +BPF_ANNOTATE_KV_PAIR(socket_storage_map, int, struct sockopt_sk); + +SEC("cgroup/getsockopt") +int _getsockopt(struct bpf_sockopt *ctx) +{ + __u8 *optval_end = ctx->optval_end; + __u8 *optval = ctx->optval; + struct sockopt_sk *storage; + + if (ctx->level == SOL_IP && ctx->optname == IP_TOS) + /* Not interested in SOL_IP:IP_TOS; + * let next BPF program in the cgroup chain or kernel + * handle it. + */ + return 1; + + if (ctx->level == SOL_SOCKET && ctx->optname == SO_SNDBUF) { + /* Not interested in SOL_SOCKET:SO_SNDBUF; + * let next BPF program in the cgroup chain or kernel + * handle it. + */ + return 1; + } + + if (ctx->level != SOL_CUSTOM) + return 0; /* EPERM, deny everything except custom level */ + + if (optval + 1 > optval_end) + return 0; /* EPERM, bounds check */ + + storage = bpf_sk_storage_get(&socket_storage_map, ctx->sk, 0, + BPF_SK_STORAGE_GET_F_CREATE); + if (!storage) + return 0; /* EPERM, couldn't get sk storage */ + + if (!ctx->retval) + return 0; /* EPERM, kernel should not have handled + * SOL_CUSTOM, something is wrong! + */ + ctx->retval = 0; /* Reset system call return value to zero */ + + optval[0] = storage->val; + ctx->optlen = 1; + + return 1; +} + +SEC("cgroup/setsockopt") +int _setsockopt(struct bpf_sockopt *ctx) +{ + __u8 *optval_end = ctx->optval_end; + __u8 *optval = ctx->optval; + struct sockopt_sk *storage; + + if (ctx->level == SOL_IP && ctx->optname == IP_TOS) + /* Not interested in SOL_IP:IP_TOS; + * let next BPF program in the cgroup chain or kernel + * handle it. + */ + return 1; + + if (ctx->level == SOL_SOCKET && ctx->optname == SO_SNDBUF) { + /* Overwrite SO_SNDBUF value */ + + if (optval + sizeof(__u32) > optval_end) + return 0; /* EPERM, bounds check */ + + *(__u32 *)optval = 0x55AA; + ctx->optlen = 4; + + return 1; + } + + if (ctx->level != SOL_CUSTOM) + return 0; /* EPERM, deny everything except custom level */ + + if (optval + 1 > optval_end) + return 0; /* EPERM, bounds check */ + + storage = bpf_sk_storage_get(&socket_storage_map, ctx->sk, 0, + BPF_SK_STORAGE_GET_F_CREATE); + if (!storage) + return 0; /* EPERM, couldn't get sk storage */ + + storage->val = optval[0]; + ctx->optlen = -1; /* BPF has consumed this option, don't call kernel + * setsockopt handler. + */ + + return 1; +} diff --git a/tools/testing/selftests/bpf/test_sockopt_sk.c b/tools/testing/selftests/bpf/test_sockopt_sk.c new file mode 100644 index 000000000000..12e79ed075ce --- /dev/null +++ b/tools/testing/selftests/bpf/test_sockopt_sk.c @@ -0,0 +1,214 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <errno.h> +#include <stdio.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <netinet/in.h> + +#include <linux/filter.h> +#include <bpf/bpf.h> +#include <bpf/libbpf.h> + +#include "bpf_rlimit.h" +#include "bpf_util.h" +#include "cgroup_helpers.h" + +#define CG_PATH "/sockopt" + +#define SOL_CUSTOM 0xdeadbeef + +static int getsetsockopt(void) +{ + int fd, err; + char buf[4] = {}; + socklen_t optlen; + + fd = socket(AF_INET, SOCK_STREAM, 0); + if (fd < 0) { + log_err("Failed to create socket"); + return -1; + } + + /* IP_TOS - BPF bypass */ + + buf[0] = 0x08; + err = setsockopt(fd, SOL_IP, IP_TOS, buf, 1); + if (err) { + log_err("Failed to call setsockopt(IP_TOS)"); + goto err; + } + + buf[0] = 0x00; + optlen = 1; + err = getsockopt(fd, SOL_IP, IP_TOS, buf, &optlen); + if (err) { + log_err("Failed to call getsockopt(IP_TOS)"); + goto err; + } + + if (buf[0] != 0x08) { + log_err("Unexpected getsockopt(IP_TOS) buf[0] 0x%02x != 0x08", + buf[0]); + goto err; + } + + /* IP_TTL - EPERM */ + + buf[0] = 1; + err = setsockopt(fd, SOL_IP, IP_TTL, buf, 1); + if (!err || errno != EPERM) { + log_err("Unexpected success from setsockopt(IP_TTL)"); + goto err; + } + + /* SOL_CUSTOM - handled by BPF */ + + buf[0] = 0x01; + err = setsockopt(fd, SOL_CUSTOM, 0, buf, 1); + if (err) { + log_err("Failed to call setsockopt"); + goto err; + } + + buf[0] = 0x00; + optlen = 4; + err = getsockopt(fd, SOL_CUSTOM, 0, buf, &optlen); + if (err) { + log_err("Failed to call getsockopt"); + goto err; + } + + if (optlen != 1) { + log_err("Unexpected optlen %d != 1", optlen); + goto err; + } + if (buf[0] != 0x01) { + log_err("Unexpected buf[0] 0x%02x != 0x01", buf[0]); + goto err; + } + + /* SO_SNDBUF is overwritten */ + + buf[0] = 0x01; + buf[1] = 0x01; + buf[2] = 0x01; + buf[3] = 0x01; + err = setsockopt(fd, SOL_SOCKET, SO_SNDBUF, buf, 4); + if (err) { + log_err("Failed to call setsockopt(SO_SNDBUF)"); + goto err; + } + + buf[0] = 0x00; + buf[1] = 0x00; + buf[2] = 0x00; + buf[3] = 0x00; + optlen = 4; + err = getsockopt(fd, SOL_SOCKET, SO_SNDBUF, buf, &optlen); + if (err) { + log_err("Failed to call getsockopt(SO_SNDBUF)"); + goto err; + } + + if (*(__u32 *)buf != 0x55AA*2) { + log_err("Unexpected getsockopt(SO_SNDBUF) 0x%x != 0x55AA*2", + *(__u32 *)buf); + goto err; + } + + close(fd); + return 0; +err: + close(fd); + return -1; +} + +static int prog_attach(struct bpf_object *obj, int cgroup_fd, const char *title) +{ + enum bpf_attach_type attach_type; + enum bpf_prog_type prog_type; + struct bpf_program *prog; + int err; + + err = libbpf_prog_type_by_name(title, &prog_type, &attach_type); + if (err) { + log_err("Failed to deduct types for %s BPF program", title); + return -1; + } + + prog = bpf_object__find_program_by_title(obj, title); + if (!prog) { + log_err("Failed to find %s BPF program", title); + return -1; + } + + err = bpf_prog_attach(bpf_program__fd(prog), cgroup_fd, + attach_type, 0); + if (err) { + log_err("Failed to attach %s BPF program", title); + return -1; + } + + return 0; +} + +static int run_test(int cgroup_fd) +{ + struct bpf_prog_load_attr attr = { + .file = "./sockopt_sk.o", + }; + struct bpf_object *obj; + int ignored; + int err; + + err = bpf_prog_load_xattr(&attr, &obj, &ignored); + if (err) { + log_err("Failed to load BPF object"); + return -1; + } + + err = prog_attach(obj, cgroup_fd, "cgroup/getsockopt"); + if (err) + goto close_bpf_object; + + err = prog_attach(obj, cgroup_fd, "cgroup/setsockopt"); + if (err) + goto close_bpf_object; + + err = getsetsockopt(); + +close_bpf_object: + bpf_object__close(obj); + return err; +} + +int main(int args, char **argv) +{ + int cgroup_fd; + int err = EXIT_SUCCESS; + + if (setup_cgroup_environment()) + goto cleanup_obj; + + cgroup_fd = create_and_get_cgroup(CG_PATH); + if (cgroup_fd < 0) + goto cleanup_cgroup_env; + + if (join_cgroup(CG_PATH)) + goto cleanup_cgroup; + + if (run_test(cgroup_fd)) + err = EXIT_FAILURE; + + printf("test_sockopt_sk: %s\n", + err == EXIT_SUCCESS ? "PASSED" : "FAILED"); + +cleanup_cgroup: + close(cgroup_fd); +cleanup_cgroup_env: + cleanup_cgroup_environment(); +cleanup_obj: + return err; +} -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH bpf-next v9 7/9] selftests/bpf: add sockopt test that exercises BPF_F_ALLOW_MULTI 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev ` (5 preceding siblings ...) 2019-06-27 20:38 ` [PATCH bpf-next v9 6/9] selftests/bpf: add sockopt test that exercises sk helpers Stanislav Fomichev @ 2019-06-27 20:38 ` Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 8/9] bpf: add sockopt documentation Stanislav Fomichev ` (2 subsequent siblings) 9 siblings, 0 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko, Martin Lau sockopt test that verifies chaining behavior. v9: * setsockopt chaining example v7: * rework the test to verify cgroup getsockopt chaining Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> --- tools/testing/selftests/bpf/.gitignore | 1 + tools/testing/selftests/bpf/Makefile | 4 +- .../selftests/bpf/progs/sockopt_multi.c | 71 ++++ .../selftests/bpf/test_sockopt_multi.c | 374 ++++++++++++++++++ 4 files changed, 449 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/progs/sockopt_multi.c create mode 100644 tools/testing/selftests/bpf/test_sockopt_multi.c diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore index 8ac076c311d4..a2f7f79c7908 100644 --- a/tools/testing/selftests/bpf/.gitignore +++ b/tools/testing/selftests/bpf/.gitignore @@ -41,3 +41,4 @@ test_btf_dump xdping test_sockopt test_sockopt_sk +test_sockopt_multi diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 57ae4e168ace..de1754a8f5fe 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -26,7 +26,8 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test test_sock test_btf test_sockmap get_cgroup_id_user test_socket_cookie \ test_cgroup_storage test_select_reuseport test_section_names \ test_netcnt test_tcpnotify_user test_sock_fields test_sysctl test_hashmap \ - test_btf_dump test_cgroup_attach xdping test_sockopt test_sockopt_sk + test_btf_dump test_cgroup_attach xdping test_sockopt test_sockopt_sk \ + test_sockopt_multi BPF_OBJ_FILES = $(patsubst %.c,%.o, $(notdir $(wildcard progs/*.c))) TEST_GEN_FILES = $(BPF_OBJ_FILES) @@ -105,6 +106,7 @@ $(OUTPUT)/test_sysctl: cgroup_helpers.c $(OUTPUT)/test_cgroup_attach: cgroup_helpers.c $(OUTPUT)/test_sockopt: cgroup_helpers.c $(OUTPUT)/test_sockopt_sk: cgroup_helpers.c +$(OUTPUT)/test_sockopt_multi: cgroup_helpers.c .PHONY: force diff --git a/tools/testing/selftests/bpf/progs/sockopt_multi.c b/tools/testing/selftests/bpf/progs/sockopt_multi.c new file mode 100644 index 000000000000..4afd2595c08e --- /dev/null +++ b/tools/testing/selftests/bpf/progs/sockopt_multi.c @@ -0,0 +1,71 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <netinet/in.h> +#include <linux/bpf.h> +#include "bpf_helpers.h" + +char _license[] SEC("license") = "GPL"; +__u32 _version SEC("version") = 1; + +SEC("cgroup/getsockopt/child") +int _getsockopt_child(struct bpf_sockopt *ctx) +{ + __u8 *optval_end = ctx->optval_end; + __u8 *optval = ctx->optval; + + if (ctx->level != SOL_IP || ctx->optname != IP_TOS) + return 1; + + if (optval + 1 > optval_end) + return 0; /* EPERM, bounds check */ + + if (optval[0] != 0x80) + return 0; /* EPERM, unexpected optval from the kernel */ + + ctx->retval = 0; /* Reset system call return value to zero */ + + optval[0] = 0x90; + ctx->optlen = 1; + + return 1; +} + +SEC("cgroup/getsockopt/parent") +int _getsockopt_parent(struct bpf_sockopt *ctx) +{ + __u8 *optval_end = ctx->optval_end; + __u8 *optval = ctx->optval; + + if (ctx->level != SOL_IP || ctx->optname != IP_TOS) + return 1; + + if (optval + 1 > optval_end) + return 0; /* EPERM, bounds check */ + + if (optval[0] != 0x90) + return 0; /* EPERM, unexpected optval from the kernel */ + + ctx->retval = 0; /* Reset system call return value to zero */ + + optval[0] = 0xA0; + ctx->optlen = 1; + + return 1; +} + +SEC("cgroup/setsockopt") +int _setsockopt(struct bpf_sockopt *ctx) +{ + __u8 *optval_end = ctx->optval_end; + __u8 *optval = ctx->optval; + + if (ctx->level != SOL_IP || ctx->optname != IP_TOS) + return 1; + + if (optval + 1 > optval_end) + return 0; /* EPERM, bounds check */ + + optval[0] += 0x10; + ctx->optlen = 1; + + return 1; +} diff --git a/tools/testing/selftests/bpf/test_sockopt_multi.c b/tools/testing/selftests/bpf/test_sockopt_multi.c new file mode 100644 index 000000000000..4be3441db867 --- /dev/null +++ b/tools/testing/selftests/bpf/test_sockopt_multi.c @@ -0,0 +1,374 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <error.h> +#include <errno.h> +#include <stdio.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <netinet/in.h> + +#include <linux/filter.h> +#include <bpf/bpf.h> +#include <bpf/libbpf.h> + +#include "bpf_rlimit.h" +#include "bpf_util.h" +#include "cgroup_helpers.h" + +static int prog_attach(struct bpf_object *obj, int cgroup_fd, const char *title) +{ + enum bpf_attach_type attach_type; + enum bpf_prog_type prog_type; + struct bpf_program *prog; + int err; + + err = libbpf_prog_type_by_name(title, &prog_type, &attach_type); + if (err) { + log_err("Failed to deduct types for %s BPF program", title); + return -1; + } + + prog = bpf_object__find_program_by_title(obj, title); + if (!prog) { + log_err("Failed to find %s BPF program", title); + return -1; + } + + err = bpf_prog_attach(bpf_program__fd(prog), cgroup_fd, + attach_type, BPF_F_ALLOW_MULTI); + if (err) { + log_err("Failed to attach %s BPF program", title); + return -1; + } + + return 0; +} + +static int prog_detach(struct bpf_object *obj, int cgroup_fd, const char *title) +{ + enum bpf_attach_type attach_type; + enum bpf_prog_type prog_type; + struct bpf_program *prog; + int err; + + err = libbpf_prog_type_by_name(title, &prog_type, &attach_type); + if (err) + return -1; + + prog = bpf_object__find_program_by_title(obj, title); + if (!prog) + return -1; + + err = bpf_prog_detach2(bpf_program__fd(prog), cgroup_fd, + attach_type); + if (err) + return -1; + + return 0; +} + +static int run_getsockopt_test(struct bpf_object *obj, int cg_parent, + int cg_child, int sock_fd) +{ + socklen_t optlen; + __u8 buf; + int err; + + /* Set IP_TOS to the expected value (0x80). */ + + buf = 0x80; + err = setsockopt(sock_fd, SOL_IP, IP_TOS, &buf, 1); + if (err < 0) { + log_err("Failed to call setsockopt(IP_TOS)"); + goto detach; + } + + buf = 0x00; + optlen = 1; + err = getsockopt(sock_fd, SOL_IP, IP_TOS, &buf, &optlen); + if (err) { + log_err("Failed to call getsockopt(IP_TOS)"); + goto detach; + } + + if (buf != 0x80) { + log_err("Unexpected getsockopt 0x%x != 0x80 without BPF", buf); + err = -1; + goto detach; + } + + /* Attach child program and make sure it returns new value: + * - kernel: -> 0x80 + * - child: 0x80 -> 0x90 + */ + + err = prog_attach(obj, cg_child, "cgroup/getsockopt/child"); + if (err) + goto detach; + + buf = 0x00; + optlen = 1; + err = getsockopt(sock_fd, SOL_IP, IP_TOS, &buf, &optlen); + if (err) { + log_err("Failed to call getsockopt(IP_TOS)"); + goto detach; + } + + if (buf != 0x90) { + log_err("Unexpected getsockopt 0x%x != 0x90", buf); + err = -1; + goto detach; + } + + /* Attach parent program and make sure it returns new value: + * - kernel: -> 0x80 + * - child: 0x80 -> 0x90 + * - parent: 0x90 -> 0xA0 + */ + + err = prog_attach(obj, cg_parent, "cgroup/getsockopt/parent"); + if (err) + goto detach; + + buf = 0x00; + optlen = 1; + err = getsockopt(sock_fd, SOL_IP, IP_TOS, &buf, &optlen); + if (err) { + log_err("Failed to call getsockopt(IP_TOS)"); + goto detach; + } + + if (buf != 0xA0) { + log_err("Unexpected getsockopt 0x%x != 0xA0", buf); + err = -1; + goto detach; + } + + /* Setting unexpected initial sockopt should return EPERM: + * - kernel: -> 0x40 + * - child: unexpected 0x40, EPERM + * - parent: unexpected 0x40, EPERM + */ + + buf = 0x40; + if (setsockopt(sock_fd, SOL_IP, IP_TOS, &buf, 1) < 0) { + log_err("Failed to call setsockopt(IP_TOS)"); + goto detach; + } + + buf = 0x00; + optlen = 1; + err = getsockopt(sock_fd, SOL_IP, IP_TOS, &buf, &optlen); + if (!err) { + log_err("Unexpected success from getsockopt(IP_TOS)"); + goto detach; + } + + /* Detach child program and make sure we still get EPERM: + * - kernel: -> 0x40 + * - parent: unexpected 0x40, EPERM + */ + + err = prog_detach(obj, cg_child, "cgroup/getsockopt/child"); + if (err) { + log_err("Failed to detach child program"); + goto detach; + } + + buf = 0x00; + optlen = 1; + err = getsockopt(sock_fd, SOL_IP, IP_TOS, &buf, &optlen); + if (!err) { + log_err("Unexpected success from getsockopt(IP_TOS)"); + goto detach; + } + + /* Set initial value to the one the parent program expects: + * - kernel: -> 0x90 + * - parent: 0x90 -> 0xA0 + */ + + buf = 0x90; + err = setsockopt(sock_fd, SOL_IP, IP_TOS, &buf, 1); + if (err < 0) { + log_err("Failed to call setsockopt(IP_TOS)"); + goto detach; + } + + buf = 0x00; + optlen = 1; + err = getsockopt(sock_fd, SOL_IP, IP_TOS, &buf, &optlen); + if (err) { + log_err("Failed to call getsockopt(IP_TOS)"); + goto detach; + } + + if (buf != 0xA0) { + log_err("Unexpected getsockopt 0x%x != 0xA0", buf); + err = -1; + goto detach; + } + +detach: + prog_detach(obj, cg_child, "cgroup/getsockopt/child"); + prog_detach(obj, cg_parent, "cgroup/getsockopt/parent"); + + return err; +} + +static int run_setsockopt_test(struct bpf_object *obj, int cg_parent, + int cg_child, int sock_fd) +{ + socklen_t optlen; + __u8 buf; + int err; + + /* Set IP_TOS to the expected value (0x80). */ + + buf = 0x80; + err = setsockopt(sock_fd, SOL_IP, IP_TOS, &buf, 1); + if (err < 0) { + log_err("Failed to call setsockopt(IP_TOS)"); + goto detach; + } + + buf = 0x00; + optlen = 1; + err = getsockopt(sock_fd, SOL_IP, IP_TOS, &buf, &optlen); + if (err) { + log_err("Failed to call getsockopt(IP_TOS)"); + goto detach; + } + + if (buf != 0x80) { + log_err("Unexpected getsockopt 0x%x != 0x80 without BPF", buf); + err = -1; + goto detach; + } + + /* Attach child program and make sure it adds 0x10. */ + + err = prog_attach(obj, cg_child, "cgroup/setsockopt"); + if (err) + goto detach; + + buf = 0x80; + err = setsockopt(sock_fd, SOL_IP, IP_TOS, &buf, 1); + if (err < 0) { + log_err("Failed to call setsockopt(IP_TOS)"); + goto detach; + } + + buf = 0x00; + optlen = 1; + err = getsockopt(sock_fd, SOL_IP, IP_TOS, &buf, &optlen); + if (err) { + log_err("Failed to call getsockopt(IP_TOS)"); + goto detach; + } + + if (buf != 0x80 + 0x10) { + log_err("Unexpected getsockopt 0x%x != 0x80 + 0x10", buf); + err = -1; + goto detach; + } + + /* Attach parent program and make sure it adds another 0x10. */ + + err = prog_attach(obj, cg_parent, "cgroup/setsockopt"); + if (err) + goto detach; + + buf = 0x80; + err = setsockopt(sock_fd, SOL_IP, IP_TOS, &buf, 1); + if (err < 0) { + log_err("Failed to call setsockopt(IP_TOS)"); + goto detach; + } + + buf = 0x00; + optlen = 1; + err = getsockopt(sock_fd, SOL_IP, IP_TOS, &buf, &optlen); + if (err) { + log_err("Failed to call getsockopt(IP_TOS)"); + goto detach; + } + + if (buf != 0x80 + 2 * 0x10) { + log_err("Unexpected getsockopt 0x%x != 0x80 + 2 * 0x10", buf); + err = -1; + goto detach; + } + +detach: + prog_detach(obj, cg_child, "cgroup/setsockopt"); + prog_detach(obj, cg_parent, "cgroup/setsockopt"); + + return err; +} + +int main(int argc, char **argv) +{ + struct bpf_prog_load_attr attr = { + .file = "./sockopt_multi.o", + }; + int cg_parent = -1, cg_child = -1; + struct bpf_object *obj = NULL; + int sock_fd = -1; + int err = -1; + int ignored; + + if (setup_cgroup_environment()) { + log_err("Failed to setup cgroup environment\n"); + goto out; + } + + cg_parent = create_and_get_cgroup("/parent"); + if (cg_parent < 0) { + log_err("Failed to create cgroup /parent\n"); + goto out; + } + + cg_child = create_and_get_cgroup("/parent/child"); + if (cg_child < 0) { + log_err("Failed to create cgroup /parent/child\n"); + goto out; + } + + if (join_cgroup("/parent/child")) { + log_err("Failed to join cgroup /parent/child\n"); + goto out; + } + + err = bpf_prog_load_xattr(&attr, &obj, &ignored); + if (err) { + log_err("Failed to load BPF object"); + goto out; + } + + sock_fd = socket(AF_INET, SOCK_STREAM, 0); + if (sock_fd < 0) { + log_err("Failed to create socket"); + goto out; + } + + if (run_getsockopt_test(obj, cg_parent, cg_child, sock_fd)) + err = -1; + printf("test_sockopt_multi: getsockopt %s\n", + err ? "FAILED" : "PASSED"); + + if (run_setsockopt_test(obj, cg_parent, cg_child, sock_fd)) + err = -1; + printf("test_sockopt_multi: setsockopt %s\n", + err ? "FAILED" : "PASSED"); + +out: + close(sock_fd); + bpf_object__close(obj); + close(cg_child); + close(cg_parent); + + printf("test_sockopt_multi: %s\n", err ? "FAILED" : "PASSED"); + return err ? EXIT_FAILURE : EXIT_SUCCESS; +} -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH bpf-next v9 8/9] bpf: add sockopt documentation 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev ` (6 preceding siblings ...) 2019-06-27 20:38 ` [PATCH bpf-next v9 7/9] selftests/bpf: add sockopt test that exercises BPF_F_ALLOW_MULTI Stanislav Fomichev @ 2019-06-27 20:38 ` Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 9/9] bpftool: support cgroup sockopt Stanislav Fomichev 2019-06-27 22:31 ` [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Alexei Starovoitov 9 siblings, 0 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko, Martin Lau Provide user documentation about sockopt prog type and cgroup hooks. v9: * add details about setsockopt context and inheritance v7: * add description for retval=0 and optlen=-1 v6: * describe cgroup chaining, add example v2: * use return code 2 for kernel bypass Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> --- Documentation/bpf/index.rst | 1 + Documentation/bpf/prog_cgroup_sockopt.rst | 93 +++++++++++++++++++++++ 2 files changed, 94 insertions(+) create mode 100644 Documentation/bpf/prog_cgroup_sockopt.rst diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst index d3fe4cac0c90..801a6ed3f2e5 100644 --- a/Documentation/bpf/index.rst +++ b/Documentation/bpf/index.rst @@ -42,6 +42,7 @@ Program types .. toctree:: :maxdepth: 1 + prog_cgroup_sockopt prog_cgroup_sysctl prog_flow_dissector diff --git a/Documentation/bpf/prog_cgroup_sockopt.rst b/Documentation/bpf/prog_cgroup_sockopt.rst new file mode 100644 index 000000000000..c47d974629ae --- /dev/null +++ b/Documentation/bpf/prog_cgroup_sockopt.rst @@ -0,0 +1,93 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================ +BPF_PROG_TYPE_CGROUP_SOCKOPT +============================ + +``BPF_PROG_TYPE_CGROUP_SOCKOPT`` program type can be attached to two +cgroup hooks: + +* ``BPF_CGROUP_GETSOCKOPT`` - called every time process executes ``getsockopt`` + system call. +* ``BPF_CGROUP_SETSOCKOPT`` - called every time process executes ``setsockopt`` + system call. + +The context (``struct bpf_sockopt``) has associated socket (``sk``) and +all input arguments: ``level``, ``optname``, ``optval`` and ``optlen``. + +BPF_CGROUP_SETSOCKOPT +===================== + +``BPF_CGROUP_SETSOCKOPT`` is triggered *before* the kernel handling of +sockopt and it has writable context: it can modify the supplied arguments +before passing them down to the kernel. This hook has access to the cgroup +and socket local storage. + +If BPF program sets ``optlen`` to -1, the control will be returned +back to the userspace after all other BPF programs in the cgroup +chain finish (i.e. kernel ``setsockopt`` handling will *not* be executed). + +Note, that ``optlen`` can not be increased beyond the user-supplied +value. It can only be decreased or set to -1. Any other value will +trigger ``EFAULT``. + +Return Type +----------- + +* ``0`` - reject the syscall, ``EPERM`` will be returned to the userspace. +* ``1`` - success, continue with next BPF program in the cgroup chain. + +BPF_CGROUP_GETSOCKOPT +===================== + +``BPF_CGROUP_GETSOCKOPT`` is triggered *after* the kernel handing of +sockopt. The BPF hook can observe ``optval``, ``optlen`` and ``retval`` +if it's interested in whatever kernel has returned. BPF hook can override +the values above, adjust ``optlen`` and reset ``retval`` to 0. If ``optlen`` +has been increased above initial ``getsockopt`` value (i.e. userspace +buffer is too small), ``EFAULT`` is returned. + +This hook has access to the cgroup and socket local storage. + +Note, that the only acceptable value to set to ``retval`` is 0 and the +original value that the kernel returned. Any other value will trigger +``EFAULT``. + +Return Type +----------- + +* ``0`` - reject the syscall, ``EPERM`` will be returned to the userspace. +* ``1`` - success: copy ``optval`` and ``optlen`` to userspace, return + ``retval`` from the syscall (note that this can be overwritten by + the BPF program from the parent cgroup). + +Cgroup Inheritance +================== + +Suppose, there is the following cgroup hierarchy where each cgroup +has ``BPF_CGROUP_GETSOCKOPT`` attached at each level with +``BPF_F_ALLOW_MULTI`` flag:: + + A (root, parent) + \ + B (child) + +When the application calls ``getsockopt`` syscall from the cgroup B, +the programs are executed from the bottom up: B, A. First program +(B) sees the result of kernel's ``getsockopt``. It can optionally +adjust ``optval``, ``optlen`` and reset ``retval`` to 0. After that +control will be passed to the second (A) program which will see the +same context as B including any potential modifications. + +Same for ``BPF_CGROUP_SETSOCKOPT``: if the program is attached to +A and B, the trigger order is B, then A. If B does any changes +to the input arguments (``level``, ``optname``, ``optval``, ``optlen``), +then the next program in the chain (A) will see those changes, +*not* the original input ``setsockopt`` arguments. The potentially +modified values will be then passed down to the kernel. + +Example +======= + +See ``tools/testing/selftests/bpf/progs/sockopt_sk.c`` for an example +of BPF program that handles socket options. -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH bpf-next v9 9/9] bpftool: support cgroup sockopt 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev ` (7 preceding siblings ...) 2019-06-27 20:38 ` [PATCH bpf-next v9 8/9] bpf: add sockopt documentation Stanislav Fomichev @ 2019-06-27 20:38 ` Stanislav Fomichev 2019-06-27 22:31 ` [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Alexei Starovoitov 9 siblings, 0 replies; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 20:38 UTC (permalink / raw) To: netdev, bpf Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko, Martin Lau, Jakub Kicinski Support sockopt prog type and cgroup hooks in the bpftool. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> --- tools/bpf/bpftool/Documentation/bpftool-cgroup.rst | 7 +++++-- tools/bpf/bpftool/Documentation/bpftool-prog.rst | 3 ++- tools/bpf/bpftool/bash-completion/bpftool | 9 ++++++--- tools/bpf/bpftool/cgroup.c | 5 ++++- tools/bpf/bpftool/main.h | 1 + tools/bpf/bpftool/prog.c | 3 ++- 6 files changed, 20 insertions(+), 8 deletions(-) diff --git a/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst b/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst index d80fdde79c22..585f270c2d25 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst @@ -29,7 +29,8 @@ CGROUP COMMANDS | *PROG* := { **id** *PROG_ID* | **pinned** *FILE* | **tag** *PROG_TAG* } | *ATTACH_TYPE* := { **ingress** | **egress** | **sock_create** | **sock_ops** | **device** | | **bind4** | **bind6** | **post_bind4** | **post_bind6** | **connect4** | **connect6** | -| **sendmsg4** | **sendmsg6** | **recvmsg4** | **recvmsg6** | **sysctl** } +| **sendmsg4** | **sendmsg6** | **recvmsg4** | **recvmsg6** | **sysctl** | +| **getsockopt** | **setsockopt** } | *ATTACH_FLAGS* := { **multi** | **override** } DESCRIPTION @@ -90,7 +91,9 @@ DESCRIPTION an unconnected udp4 socket (since 5.2); **recvmsg6** call to recvfrom(2), recvmsg(2), recvmmsg(2) for an unconnected udp6 socket (since 5.2); - **sysctl** sysctl access (since 5.2). + **sysctl** sysctl access (since 5.2); + **getsockopt** call to getsockopt (since 5.3); + **setsockopt** call to setsockopt (since 5.3). **bpftool cgroup detach** *CGROUP* *ATTACH_TYPE* *PROG* Detach *PROG* from the cgroup *CGROUP* and attach type diff --git a/tools/bpf/bpftool/Documentation/bpftool-prog.rst b/tools/bpf/bpftool/Documentation/bpftool-prog.rst index 55dd06517a3b..1df637f85f94 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-prog.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-prog.rst @@ -40,7 +40,8 @@ PROG COMMANDS | **lwt_seg6local** | **sockops** | **sk_skb** | **sk_msg** | **lirc_mode2** | | **cgroup/bind4** | **cgroup/bind6** | **cgroup/post_bind4** | **cgroup/post_bind6** | | **cgroup/connect4** | **cgroup/connect6** | **cgroup/sendmsg4** | **cgroup/sendmsg6** | -| **cgroup/recvmsg4** | **cgroup/recvmsg6** | **cgroup/sysctl** +| **cgroup/recvmsg4** | **cgroup/recvmsg6** | **cgroup/sysctl** | +| **cgroup/getsockopt** | **cgroup/setsockopt** | } | *ATTACH_TYPE* := { | **msg_verdict** | **stream_verdict** | **stream_parser** | **flow_dissector** diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool index a17e84c67498..ba37095e1f62 100644 --- a/tools/bpf/bpftool/bash-completion/bpftool +++ b/tools/bpf/bpftool/bash-completion/bpftool @@ -379,7 +379,8 @@ _bpftool() cgroup/sendmsg4 cgroup/sendmsg6 \ cgroup/recvmsg4 cgroup/recvmsg6 \ cgroup/post_bind4 cgroup/post_bind6 \ - cgroup/sysctl" -- \ + cgroup/sysctl cgroup/getsockopt \ + cgroup/setsockopt" -- \ "$cur" ) ) return 0 ;; @@ -689,7 +690,8 @@ _bpftool() attach|detach) local ATTACH_TYPES='ingress egress sock_create sock_ops \ device bind4 bind6 post_bind4 post_bind6 connect4 \ - connect6 sendmsg4 sendmsg6 recvmsg4 recvmsg6 sysctl' + connect6 sendmsg4 sendmsg6 recvmsg4 recvmsg6 sysctl \ + getsockopt setsockopt' local ATTACH_FLAGS='multi override' local PROG_TYPE='id pinned tag' case $prev in @@ -699,7 +701,8 @@ _bpftool() ;; ingress|egress|sock_create|sock_ops|device|bind4|bind6|\ post_bind4|post_bind6|connect4|connect6|sendmsg4|\ - sendmsg6|recvmsg4|recvmsg6|sysctl) + sendmsg6|recvmsg4|recvmsg6|sysctl|getsockopt|\ + setsockopt) COMPREPLY=( $( compgen -W "$PROG_TYPE" -- \ "$cur" ) ) return 0 diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c index 73ec8ea33fb4..390b89a224f1 100644 --- a/tools/bpf/bpftool/cgroup.c +++ b/tools/bpf/bpftool/cgroup.c @@ -26,7 +26,8 @@ " sock_ops | device | bind4 | bind6 |\n" \ " post_bind4 | post_bind6 | connect4 |\n" \ " connect6 | sendmsg4 | sendmsg6 |\n" \ - " recvmsg4 | recvmsg6 | sysctl }" + " recvmsg4 | recvmsg6 | sysctl |\n" \ + " getsockopt | setsockopt }" static const char * const attach_type_strings[] = { [BPF_CGROUP_INET_INGRESS] = "ingress", @@ -45,6 +46,8 @@ static const char * const attach_type_strings[] = { [BPF_CGROUP_SYSCTL] = "sysctl", [BPF_CGROUP_UDP4_RECVMSG] = "recvmsg4", [BPF_CGROUP_UDP6_RECVMSG] = "recvmsg6", + [BPF_CGROUP_GETSOCKOPT] = "getsockopt", + [BPF_CGROUP_SETSOCKOPT] = "setsockopt", [__MAX_BPF_ATTACH_TYPE] = NULL, }; diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h index 28a2a5857e14..9c5d9c80f71e 100644 --- a/tools/bpf/bpftool/main.h +++ b/tools/bpf/bpftool/main.h @@ -74,6 +74,7 @@ static const char * const prog_type_name[] = { [BPF_PROG_TYPE_SK_REUSEPORT] = "sk_reuseport", [BPF_PROG_TYPE_FLOW_DISSECTOR] = "flow_dissector", [BPF_PROG_TYPE_CGROUP_SYSCTL] = "cgroup_sysctl", + [BPF_PROG_TYPE_CGROUP_SOCKOPT] = "cgroup_sockopt", }; extern const char * const map_type_name[]; diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index f1a831f05010..9b0db5d14e31 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -1071,7 +1071,8 @@ static int do_help(int argc, char **argv) " cgroup/bind4 | cgroup/bind6 | cgroup/post_bind4 |\n" " cgroup/post_bind6 | cgroup/connect4 | cgroup/connect6 |\n" " cgroup/sendmsg4 | cgroup/sendmsg6 | cgroup/recvmsg4 |\n" - " cgroup/recvmsg6 }\n" + " cgroup/recvmsg6 | cgroup/getsockopt |\n" + " cgroup/setsockopt }\n" " ATTACH_TYPE := { msg_verdict | stream_verdict | stream_parser |\n" " flow_dissector }\n" " " HELP_SPEC_OPTIONS "\n" -- 2.22.0.410.gd8fdbe21b5-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev ` (8 preceding siblings ...) 2019-06-27 20:38 ` [PATCH bpf-next v9 9/9] bpftool: support cgroup sockopt Stanislav Fomichev @ 2019-06-27 22:31 ` Alexei Starovoitov 2019-06-27 22:43 ` Stanislav Fomichev 9 siblings, 1 reply; 13+ messages in thread From: Alexei Starovoitov @ 2019-06-27 22:31 UTC (permalink / raw) To: Stanislav Fomichev; +Cc: netdev, bpf, davem, ast, daniel On Thu, Jun 27, 2019 at 01:38:46PM -0700, Stanislav Fomichev wrote: > This series implements two new per-cgroup hooks: getsockopt and > setsockopt along with a new sockopt program type. The idea is pretty > similar to recently introduced cgroup sysctl hooks, but > implementation is simpler (no need to convert to/from strings). > > What this can be applied to: > * move business logic of what tos/priority/etc can be set by > containers (either pass or reject) > * handle existing options (or introduce new ones) differently by > propagating some information in cgroup/socket local storage > > Compared to a simple syscall/{g,s}etsockopt tracepoint, those > hooks are context aware. Meaning, they can access underlying socket > and use cgroup and socket local storage. > > v9: > * allow overwriting setsocktop arguments (Alexei Starovoitov) > (see individual changes for more changelog details) Applied. Thanks. There is a build warning though: test_sockopt_sk.c: In function ‘getsetsockopt’: test_sockopt_sk.c:115:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] if (*(__u32 *)buf != 0x55AA*2) { ^~ test_sockopt_sk.c:116:3: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] log_err("Unexpected getsockopt(SO_SNDBUF) 0x%x != 0x55AA*2", ^~~~~~~ Pls fix it in the follow up. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks 2019-06-27 22:31 ` [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Alexei Starovoitov @ 2019-06-27 22:43 ` Stanislav Fomichev 2019-06-27 22:46 ` Alexei Starovoitov 0 siblings, 1 reply; 13+ messages in thread From: Stanislav Fomichev @ 2019-06-27 22:43 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: Stanislav Fomichev, netdev, bpf, davem, ast, daniel On 06/27, Alexei Starovoitov wrote: > On Thu, Jun 27, 2019 at 01:38:46PM -0700, Stanislav Fomichev wrote: > > This series implements two new per-cgroup hooks: getsockopt and > > setsockopt along with a new sockopt program type. The idea is pretty > > similar to recently introduced cgroup sysctl hooks, but > > implementation is simpler (no need to convert to/from strings). > > > > What this can be applied to: > > * move business logic of what tos/priority/etc can be set by > > containers (either pass or reject) > > * handle existing options (or introduce new ones) differently by > > propagating some information in cgroup/socket local storage > > > > Compared to a simple syscall/{g,s}etsockopt tracepoint, those > > hooks are context aware. Meaning, they can access underlying socket > > and use cgroup and socket local storage. > > > > v9: > > * allow overwriting setsocktop arguments (Alexei Starovoitov) > > (see individual changes for more changelog details) > > Applied. Thanks. Great, thanks for all the reviews! > There is a build warning though: > test_sockopt_sk.c: In function ‘getsetsockopt’: > test_sockopt_sk.c:115:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] > if (*(__u32 *)buf != 0x55AA*2) { > ^~ > test_sockopt_sk.c:116:3: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] > log_err("Unexpected getsockopt(SO_SNDBUF) 0x%x != 0x55AA*2", > ^~~~~~~ > > Pls fix it in the follow up. Sure, but I can't reproduce it with gcc7 nor with clang9 :-/ Presumably, a switch to __get_unaligned_cpu32 should help, I'll try to play with compilers a bit before sending a fix. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks 2019-06-27 22:43 ` Stanislav Fomichev @ 2019-06-27 22:46 ` Alexei Starovoitov 0 siblings, 0 replies; 13+ messages in thread From: Alexei Starovoitov @ 2019-06-27 22:46 UTC (permalink / raw) To: Stanislav Fomichev Cc: Stanislav Fomichev, Network Development, bpf, David S. Miller, Alexei Starovoitov, Daniel Borkmann On Thu, Jun 27, 2019 at 3:43 PM Stanislav Fomichev <sdf@fomichev.me> wrote: > > > There is a build warning though: > > test_sockopt_sk.c: In function ‘getsetsockopt’: > > test_sockopt_sk.c:115:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] > > if (*(__u32 *)buf != 0x55AA*2) { > > ^~ > > test_sockopt_sk.c:116:3: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] > > log_err("Unexpected getsockopt(SO_SNDBUF) 0x%x != 0x55AA*2", > > ^~~~~~~ > > > > Pls fix it in the follow up. > Sure, but I can't reproduce it with gcc7 nor with clang9 :-/ I see it with gcc 6.3 ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2019-06-27 22:46 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-06-27 20:38 [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 1/9] bpf: implement " Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 2/9] bpf: sync bpf.h to tools/ Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 3/9] libbpf: support sockopt hooks Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 4/9] selftests/bpf: test sockopt section name Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 5/9] selftests/bpf: add sockopt test Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 6/9] selftests/bpf: add sockopt test that exercises sk helpers Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 7/9] selftests/bpf: add sockopt test that exercises BPF_F_ALLOW_MULTI Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 8/9] bpf: add sockopt documentation Stanislav Fomichev 2019-06-27 20:38 ` [PATCH bpf-next v9 9/9] bpftool: support cgroup sockopt Stanislav Fomichev 2019-06-27 22:31 ` [PATCH bpf-next v9 0/9] bpf: getsockopt and setsockopt hooks Alexei Starovoitov 2019-06-27 22:43 ` Stanislav Fomichev 2019-06-27 22:46 ` Alexei Starovoitov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).