* [PATCH bpf-next 1/5] bpf: Simplify __cgroup_bpf_attach
2019-12-11 2:33 [PATCH bpf-next 0/5] bpf: Support replacing cgroup-bpf program in MULTI mode Andrey Ignatov
@ 2019-12-11 2:33 ` Andrey Ignatov
2019-12-12 17:57 ` Martin Lau
2019-12-11 2:33 ` [PATCH bpf-next 2/5] bpf: Remove unused new_flags in hierarchy_allows_attach() Andrey Ignatov
` (3 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Andrey Ignatov @ 2019-12-11 2:33 UTC (permalink / raw)
To: bpf; +Cc: Andrey Ignatov, ast, daniel, kernel-team
__cgroup_bpf_attach has a lot of identical code to handle two scenarios:
BPF_F_ALLOW_MULTI is set and unset.
Simplify it by splitting the two main steps:
* First, the decision is made whether a new bpf_prog_list entry should
be allocated or existing entry should be reused for the new program.
This decision is saved in replace_pl pointer;
* Next, replace_pl pointer is used to handle both possible states of
BPF_F_ALLOW_MULTI flag (set / unset) instead of doing similar work for
them separately.
This split, in turn, allows further simplifications:
* The check for attaching the same program twice in BPF_F_ALLOW_MULTI mode
can be done before allocating cgroup storage, so that if a user tries to
attach the same program twice, no alloc/free happens (previously the storage
was allocated and then freed);
* pl_was_allocated becomes redundant so it's removed.
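
For readers following along, here is a minimal userspace sketch of the
two-step structure described above. It is not the kernel code: struct entry,
attach() and F_ALLOW_MULTI are hypothetical stand-ins for bpf_prog_list,
__cgroup_bpf_attach() and BPF_F_ALLOW_MULTI, programs are modelled as plain
ints, and cgroup storage, the flag-mismatch check and the rollback path are
left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define F_ALLOW_MULTI (1U << 1)	/* same bit as BPF_F_ALLOW_MULTI */

/* Hypothetical stand-in for struct bpf_prog_list. */
struct entry {
	int prog;		/* stand-in for struct bpf_prog * */
	struct entry *next;
};

/* Returns 0 on success or a negative errno value. */
static int attach(struct entry **head, int prog, unsigned int flags)
{
	struct entry *pl, *replace_pl = NULL, **tail;

	assert(head != NULL);

	/* Step 1: decide whether an existing entry is reused. */
	if (flags & F_ALLOW_MULTI) {
		for (pl = *head; pl; pl = pl->next)
			if (pl->prog == prog)
				return -EINVAL;	/* same prog twice */
	} else if (*head) {
		replace_pl = *head;	/* single-prog mode: reuse first entry */
	}

	/* Step 2: one code path for both states of the flag. */
	if (replace_pl) {
		pl = replace_pl;
	} else {
		pl = malloc(sizeof(*pl));
		if (!pl)
			return -ENOMEM;
		pl->next = NULL;
		for (tail = head; *tail; tail = &(*tail)->next)
			;	/* walk to the end, like list_add_tail() */
		*tail = pl;
	}
	pl->prog = prog;
	return 0;
}
```

In this model the duplicate check returns before anything is allocated, and a
non-NULL replace_pl plays the role that pl_was_allocated == false played
before.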
Signed-off-by: Andrey Ignatov <rdna@fb.com>
---
kernel/bpf/cgroup.c | 62 +++++++++++++++++----------------------------
1 file changed, 23 insertions(+), 39 deletions(-)
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 9f90d3c92bda..e8cbdd1be687 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -295,9 +295,8 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
struct bpf_prog *old_prog = NULL;
struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE],
*old_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {NULL};
+ struct bpf_prog_list *pl, *replace_pl = NULL;
enum bpf_cgroup_storage_type stype;
- struct bpf_prog_list *pl;
- bool pl_was_allocated;
int err;
if ((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI))
@@ -317,6 +316,16 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
if (prog_list_length(progs) >= BPF_CGROUP_MAX_PROGS)
return -E2BIG;
+ if (flags & BPF_F_ALLOW_MULTI) {
+ list_for_each_entry(pl, progs, node) {
+ if (pl->prog == prog)
+ /* disallow attaching the same prog twice */
+ return -EINVAL;
+ }
+ } else if (!list_empty(progs)) {
+ replace_pl = list_first_entry(progs, typeof(*pl), node);
+ }
+
for_each_cgroup_storage_type(stype) {
storage[stype] = bpf_cgroup_storage_alloc(prog, stype);
if (IS_ERR(storage[stype])) {
@@ -327,52 +336,27 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
}
}
- if (flags & BPF_F_ALLOW_MULTI) {
- list_for_each_entry(pl, progs, node) {
- if (pl->prog == prog) {
- /* disallow attaching the same prog twice */
- for_each_cgroup_storage_type(stype)
- bpf_cgroup_storage_free(storage[stype]);
- return -EINVAL;
- }
+ if (replace_pl) {
+ pl = replace_pl;
+ old_prog = pl->prog;
+ for_each_cgroup_storage_type(stype) {
+ old_storage[stype] = pl->storage[stype];
+ bpf_cgroup_storage_unlink(old_storage[stype]);
}
-
+ } else {
pl = kmalloc(sizeof(*pl), GFP_KERNEL);
if (!pl) {
for_each_cgroup_storage_type(stype)
bpf_cgroup_storage_free(storage[stype]);
return -ENOMEM;
}
-
- pl_was_allocated = true;
- pl->prog = prog;
- for_each_cgroup_storage_type(stype)
- pl->storage[stype] = storage[stype];
list_add_tail(&pl->node, progs);
- } else {
- if (list_empty(progs)) {
- pl = kmalloc(sizeof(*pl), GFP_KERNEL);
- if (!pl) {
- for_each_cgroup_storage_type(stype)
- bpf_cgroup_storage_free(storage[stype]);
- return -ENOMEM;
- }
- pl_was_allocated = true;
- list_add_tail(&pl->node, progs);
- } else {
- pl = list_first_entry(progs, typeof(*pl), node);
- old_prog = pl->prog;
- for_each_cgroup_storage_type(stype) {
- old_storage[stype] = pl->storage[stype];
- bpf_cgroup_storage_unlink(old_storage[stype]);
- }
- pl_was_allocated = false;
- }
- pl->prog = prog;
- for_each_cgroup_storage_type(stype)
- pl->storage[stype] = storage[stype];
}
+ pl->prog = prog;
+ for_each_cgroup_storage_type(stype)
+ pl->storage[stype] = storage[stype];
+
cgrp->bpf.flags[type] = flags;
err = update_effective_progs(cgrp, type);
@@ -401,7 +385,7 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
pl->storage[stype] = old_storage[stype];
bpf_cgroup_storage_link(old_storage[stype], cgrp, type);
}
- if (pl_was_allocated) {
+ if (!replace_pl) {
list_del(&pl->node);
kfree(pl);
}
--
2.17.1
* Re: [PATCH bpf-next 1/5] bpf: Simplify __cgroup_bpf_attach
2019-12-11 2:33 ` [PATCH bpf-next 1/5] bpf: Simplify __cgroup_bpf_attach Andrey Ignatov
@ 2019-12-12 17:57 ` Martin Lau
0 siblings, 0 replies; 13+ messages in thread
From: Martin Lau @ 2019-12-12 17:57 UTC (permalink / raw)
To: Andrey Ignatov; +Cc: bpf, ast, daniel, Kernel Team
On Tue, Dec 10, 2019 at 06:33:27PM -0800, Andrey Ignatov wrote:
> __cgroup_bpf_attach has a lot of identical code to handle two scenarios:
> BPF_F_ALLOW_MULTI is set and unset.
>
> Simplify it by splitting the two main steps:
>
> * First, the decision is made whether a new bpf_prog_list entry should
> be allocated or existing entry should be reused for the new program.
> This decision is saved in replace_pl pointer;
>
> * Next, replace_pl pointer is used to handle both possible states of
> BPF_F_ALLOW_MULTI flag (set / unset) instead of doing similar work for
> them separately.
>
> This split, in turn, allows further simplifications:
>
> * The check for attaching the same program twice in BPF_F_ALLOW_MULTI mode
> can be done before allocating cgroup storage, so that if a user tries to
> attach the same program twice, no alloc/free happens (previously the storage
> was allocated and then freed);
>
> * pl_was_allocated becomes redundant so it's removed.
Acked-by: Martin KaFai Lau <kafai@fb.com>
* [PATCH bpf-next 2/5] bpf: Remove unused new_flags in hierarchy_allows_attach()
2019-12-11 2:33 [PATCH bpf-next 0/5] bpf: Support replacing cgroup-bpf program in MULTI mode Andrey Ignatov
2019-12-11 2:33 ` [PATCH bpf-next 1/5] bpf: Simplify __cgroup_bpf_attach Andrey Ignatov
@ 2019-12-11 2:33 ` Andrey Ignatov
2019-12-12 17:57 ` Martin Lau
2019-12-11 2:33 ` [PATCH bpf-next 3/5] bpf: Support replacing cgroup-bpf program in MULTI mode Andrey Ignatov
` (2 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Andrey Ignatov @ 2019-12-11 2:33 UTC (permalink / raw)
To: bpf; +Cc: Andrey Ignatov, ast, daniel, kernel-team
new_flags is unused, remove it.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
---
kernel/bpf/cgroup.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index e8cbdd1be687..283efe3ce052 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -103,8 +103,7 @@ static u32 prog_list_length(struct list_head *head)
* if parent has overridable or multi-prog, allow attaching
*/
static bool hierarchy_allows_attach(struct cgroup *cgrp,
- enum bpf_attach_type type,
- u32 new_flags)
+ enum bpf_attach_type type)
{
struct cgroup *p;
@@ -303,7 +302,7 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
/* invalid combination */
return -EINVAL;
- if (!hierarchy_allows_attach(cgrp, type, flags))
+ if (!hierarchy_allows_attach(cgrp, type))
return -EPERM;
if (!list_empty(progs) && cgrp->bpf.flags[type] != flags)
--
2.17.1
* [PATCH bpf-next 3/5] bpf: Support replacing cgroup-bpf program in MULTI mode
2019-12-11 2:33 [PATCH bpf-next 0/5] bpf: Support replacing cgroup-bpf program in MULTI mode Andrey Ignatov
2019-12-11 2:33 ` [PATCH bpf-next 1/5] bpf: Simplify __cgroup_bpf_attach Andrey Ignatov
2019-12-11 2:33 ` [PATCH bpf-next 2/5] bpf: Remove unused new_flags in hierarchy_allows_attach() Andrey Ignatov
@ 2019-12-11 2:33 ` Andrey Ignatov
2019-12-12 18:18 ` Martin Lau
2019-12-11 2:33 ` [PATCH bpf-next 4/5] libbpf: Introduce bpf_prog_attach_xattr Andrey Ignatov
2019-12-11 2:33 ` [PATCH bpf-next 5/5] selftests/bpf: Cover BPF_F_REPLACE in test_cgroup_attach Andrey Ignatov
4 siblings, 1 reply; 13+ messages in thread
From: Andrey Ignatov @ 2019-12-11 2:33 UTC (permalink / raw)
To: bpf; +Cc: Andrey Ignatov, ast, daniel, kernel-team
The common use-case in production is to have multiple cgroup-bpf
programs per attach type that cover multiple use-cases. Such programs
are attached with BPF_F_ALLOW_MULTI and can be maintained by different
people.
Order of programs usually matters, for example imagine two egress
programs: the first one drops packets and the second one counts packets.
If they're swapped the result of counting program will be different.
It brings operational challenges with updating cgroup-bpf program(s)
attached with BPF_F_ALLOW_MULTI since there is no way to replace a
program:
* One way to update is to detach all programs first and then attach the
new version(s) again in the right order. This introduces an
interruption in the work a program is doing and may not be acceptable
(e.g. if it's egress firewall);
* Another way is to attach the new version of a program first and only then
detach the old version. This introduces a time interval when two
versions of the same program are working, which may not be acceptable if a
program is not idempotent. It also imposes additional burden on
program developers to make sure that two versions of their program can
co-exist.
Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH
command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI
flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag
and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to
replace.
That way user can replace any program among those attached with
BPF_F_ALLOW_MULTI flag without the problems described above.
Details of the new API:
* If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid
descriptor of BPF program, BPF_PROG_ATTACH will return corresponding
error (EINVAL or EBADF).
* If replace_bpf_fd has valid descriptor of BPF program but such a
program is not attached to specified cgroup, BPF_PROG_ATTACH will
return ENOENT.
BPF_F_REPLACE is introduced to make the user's intent clear, since
replace_bpf_fd alone can't be used for this (its default value, 0, is a
valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).
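
As an illustration of the semantics described above, here is a small
userspace model of replacing a program in MULTI mode. It is only a sketch:
struct prog_list, multi_attach() and MAX_PROGS are hypothetical names,
programs are modelled as ints, and the boolean "replace" stands for
BPF_F_REPLACE being set together with BPF_F_ALLOW_MULTI. The order of checks
follows the posted patch, where the length check comes first.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PROGS 8	/* hypothetical stand-in for BPF_CGROUP_MAX_PROGS */

/* Ordered list of programs attached to one cgroup for one attach type. */
struct prog_list {
	int progs[MAX_PROGS];
	size_t len;
};

/* Returns 0 on success or a negative errno value. */
static int multi_attach(struct prog_list *l, int prog, bool replace,
			int replace_prog)
{
	size_t i, replace_idx;

	assert(l != NULL);
	if (l->len >= MAX_PROGS)
		return -E2BIG;

	replace_idx = l->len;	/* l->len means "not found" */
	for (i = 0; i < l->len; i++) {
		if (l->progs[i] == prog)
			return -EINVAL;	/* same prog twice */
		if (replace && l->progs[i] == replace_prog)
			replace_idx = i;
	}
	if (replace && replace_idx == l->len)
		return -ENOENT;	/* prog to replace is not attached here */

	if (replace)
		l->progs[replace_idx] = prog;	/* position is preserved */
	else
		l->progs[l->len++] = prog;	/* appended at the end */
	return 0;
}
```

With a list of {10, 20, 30}, replacing 20 by 21 gives {10, 21, 30}: the
neighbours keep their positions and there is no moment when both versions,
or neither, are in the list.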
Signed-off-by: Andrey Ignatov <rdna@fb.com>
---
include/linux/bpf-cgroup.h | 4 +++-
include/uapi/linux/bpf.h | 10 ++++++++++
kernel/bpf/cgroup.c | 30 ++++++++++++++++++++++++++----
kernel/bpf/syscall.c | 4 ++--
kernel/cgroup/cgroup.c | 5 +++--
tools/include/uapi/linux/bpf.h | 10 ++++++++++
6 files changed, 54 insertions(+), 9 deletions(-)
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 169fd25f6bc2..18f6a6da7c3c 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -85,6 +85,7 @@ int cgroup_bpf_inherit(struct cgroup *cgrp);
void cgroup_bpf_offline(struct cgroup *cgrp);
int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
+ struct bpf_prog *replace_prog,
enum bpf_attach_type type, u32 flags);
int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
enum bpf_attach_type type);
@@ -93,7 +94,8 @@ int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
/* Wrapper for __cgroup_bpf_*() protected by cgroup_mutex */
int cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
- enum bpf_attach_type type, u32 flags);
+ struct bpf_prog *replace_prog, enum bpf_attach_type type,
+ u32 flags);
int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
enum bpf_attach_type type, u32 flags);
int cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index dbbcf0b02970..7df436da542d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -231,6 +231,11 @@ enum bpf_attach_type {
* When children program makes decision (like picking TCP CA or sock bind)
* parent program has a chance to override it.
*
+ * With BPF_F_ALLOW_MULTI a new program is added to the end of the list of
+ * programs for a cgroup. Though it's possible to replace an old program at
+ * any position by also specifying BPF_F_REPLACE flag and position itself in
+ * replace_bpf_fd attribute. Old program at this position will be released.
+ *
* A cgroup with MULTI or OVERRIDE flag allows any attach flags in sub-cgroups.
* A cgroup with NONE doesn't allow any programs in sub-cgroups.
* Ex1:
@@ -249,6 +254,7 @@ enum bpf_attach_type {
*/
#define BPF_F_ALLOW_OVERRIDE (1U << 0)
#define BPF_F_ALLOW_MULTI (1U << 1)
+#define BPF_F_REPLACE (1U << 2)
/* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the
* verifier will perform strict alignment checking as if the kernel
@@ -442,6 +448,10 @@ union bpf_attr {
__u32 attach_bpf_fd; /* eBPF program to attach */
__u32 attach_type;
__u32 attach_flags;
+ __u32 replace_bpf_fd; /* previously attached eBPF
+ * program to replace if
+ * BPF_F_REPLACE is used
+ */
};
struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 283efe3ce052..45346c79613a 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -282,14 +282,17 @@ static int update_effective_progs(struct cgroup *cgrp,
* propagate the change to descendants
* @cgrp: The cgroup which descendants to traverse
* @prog: A program to attach
+ * @replace_prog: Previously attached program to replace if BPF_F_REPLACE is set
* @type: Type of attach operation
* @flags: Option flags
*
* Must be called with cgroup_mutex held.
*/
int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
+ struct bpf_prog *replace_prog,
enum bpf_attach_type type, u32 flags)
{
+ u32 saved_flags = (flags & (BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI));
struct list_head *progs = &cgrp->bpf.progs[type];
struct bpf_prog *old_prog = NULL;
struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE],
@@ -298,14 +301,15 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
enum bpf_cgroup_storage_type stype;
int err;
- if ((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI))
+ if (((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI)) ||
+ ((flags & BPF_F_REPLACE) && !(flags & BPF_F_ALLOW_MULTI)))
/* invalid combination */
return -EINVAL;
if (!hierarchy_allows_attach(cgrp, type))
return -EPERM;
- if (!list_empty(progs) && cgrp->bpf.flags[type] != flags)
+ if (!list_empty(progs) && cgrp->bpf.flags[type] != saved_flags)
/* Disallow attaching non-overridable on top
* of existing overridable in this cgroup.
* Disallow attaching multi-prog if overridable or none
@@ -320,7 +324,12 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
if (pl->prog == prog)
/* disallow attaching the same prog twice */
return -EINVAL;
+ if (pl->prog == replace_prog)
+ replace_pl = pl;
}
+ if ((flags & BPF_F_REPLACE) && !replace_pl)
+ /* prog to replace not found for cgroup */
+ return -ENOENT;
} else if (!list_empty(progs)) {
replace_pl = list_first_entry(progs, typeof(*pl), node);
}
@@ -356,7 +365,7 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
for_each_cgroup_storage_type(stype)
pl->storage[stype] = storage[stype];
- cgrp->bpf.flags[type] = flags;
+ cgrp->bpf.flags[type] = saved_flags;
err = update_effective_progs(cgrp, type);
if (err)
@@ -522,6 +531,7 @@ int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
int cgroup_bpf_prog_attach(const union bpf_attr *attr,
enum bpf_prog_type ptype, struct bpf_prog *prog)
{
+ struct bpf_prog *replace_prog = NULL;
struct cgroup *cgrp;
int ret;
@@ -529,8 +539,20 @@ int cgroup_bpf_prog_attach(const union bpf_attr *attr,
if (IS_ERR(cgrp))
return PTR_ERR(cgrp);
- ret = cgroup_bpf_attach(cgrp, prog, attr->attach_type,
+ if ((attr->attach_flags & BPF_F_ALLOW_MULTI) &&
+ (attr->attach_flags & BPF_F_REPLACE)) {
+ replace_prog = bpf_prog_get_type(attr->replace_bpf_fd, ptype);
+ if (IS_ERR(replace_prog)) {
+ cgroup_put(cgrp);
+ return PTR_ERR(replace_prog);
+ }
+ }
+
+ ret = cgroup_bpf_attach(cgrp, prog, replace_prog, attr->attach_type,
attr->attach_flags);
+
+ if (replace_prog)
+ bpf_prog_put(replace_prog);
cgroup_put(cgrp);
return ret;
}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e3461ec59570..1e4abb618c5a 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2040,10 +2040,10 @@ static int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog,
}
}
-#define BPF_PROG_ATTACH_LAST_FIELD attach_flags
+#define BPF_PROG_ATTACH_LAST_FIELD replace_bpf_fd
#define BPF_F_ATTACH_MASK \
- (BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI)
+ (BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI | BPF_F_REPLACE)
static int bpf_prog_attach(const union bpf_attr *attr)
{
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 735af8f15f95..725365df066d 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6288,12 +6288,13 @@ void cgroup_sk_free(struct sock_cgroup_data *skcd)
#ifdef CONFIG_CGROUP_BPF
int cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
- enum bpf_attach_type type, u32 flags)
+ struct bpf_prog *replace_prog, enum bpf_attach_type type,
+ u32 flags)
{
int ret;
mutex_lock(&cgroup_mutex);
- ret = __cgroup_bpf_attach(cgrp, prog, type, flags);
+ ret = __cgroup_bpf_attach(cgrp, prog, replace_prog, type, flags);
mutex_unlock(&cgroup_mutex);
return ret;
}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index dbbcf0b02970..7df436da542d 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -231,6 +231,11 @@ enum bpf_attach_type {
* When children program makes decision (like picking TCP CA or sock bind)
* parent program has a chance to override it.
*
+ * With BPF_F_ALLOW_MULTI a new program is added to the end of the list of
+ * programs for a cgroup. Though it's possible to replace an old program at
+ * any position by also specifying BPF_F_REPLACE flag and position itself in
+ * replace_bpf_fd attribute. Old program at this position will be released.
+ *
* A cgroup with MULTI or OVERRIDE flag allows any attach flags in sub-cgroups.
* A cgroup with NONE doesn't allow any programs in sub-cgroups.
* Ex1:
@@ -249,6 +254,7 @@ enum bpf_attach_type {
*/
#define BPF_F_ALLOW_OVERRIDE (1U << 0)
#define BPF_F_ALLOW_MULTI (1U << 1)
+#define BPF_F_REPLACE (1U << 2)
/* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the
* verifier will perform strict alignment checking as if the kernel
@@ -442,6 +448,10 @@ union bpf_attr {
__u32 attach_bpf_fd; /* eBPF program to attach */
__u32 attach_type;
__u32 attach_flags;
+ __u32 replace_bpf_fd; /* previously attached eBPF
+ * program to replace if
+ * BPF_F_REPLACE is used
+ */
};
struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */
--
2.17.1
* Re: [PATCH bpf-next 3/5] bpf: Support replacing cgroup-bpf program in MULTI mode
2019-12-11 2:33 ` [PATCH bpf-next 3/5] bpf: Support replacing cgroup-bpf program in MULTI mode Andrey Ignatov
@ 2019-12-12 18:18 ` Martin Lau
2019-12-12 18:46 ` Andrey Ignatov
0 siblings, 1 reply; 13+ messages in thread
From: Martin Lau @ 2019-12-12 18:18 UTC (permalink / raw)
To: Andrey Ignatov; +Cc: bpf, ast, daniel, Kernel Team
On Tue, Dec 10, 2019 at 06:33:29PM -0800, Andrey Ignatov wrote:
> The common use-case in production is to have multiple cgroup-bpf
> programs per attach type that cover multiple use-cases. Such programs
> are attached with BPF_F_ALLOW_MULTI and can be maintained by different
> people.
>
> Order of programs usually matters, for example imagine two egress
> programs: the first one drops packets and the second one counts packets.
> If they're swapped the result of counting program will be different.
>
> It brings operational challenges with updating cgroup-bpf program(s)
> attached with BPF_F_ALLOW_MULTI since there is no way to replace a
> program:
>
> * One way to update is to detach all programs first and then attach the
> new version(s) again in the right order. This introduces an
> interruption in the work a program is doing and may not be acceptable
> (e.g. if it's egress firewall);
>
> * Another way is to attach the new version of a program first and only then
> detach the old version. This introduces a time interval when two
> versions of the same program are working, which may not be acceptable if a
> program is not idempotent. It also imposes additional burden on
> program developers to make sure that two versions of their program can
> co-exist.
>
> Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH
> command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI
> flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag
> and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to
> replace.
>
> That way user can replace any program among those attached with
> BPF_F_ALLOW_MULTI flag without the problems described above.
>
> Details of the new API:
>
> * If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid
> descriptor of BPF program, BPF_PROG_ATTACH will return corresponding
> error (EINVAL or EBADF).
>
> * If replace_bpf_fd has valid descriptor of BPF program but such a
> program is not attached to specified cgroup, BPF_PROG_ATTACH will
> return ENOENT.
>
> BPF_F_REPLACE is introduced to make the user's intent clear, since
> replace_bpf_fd alone can't be used for this (its default value, 0, is a
> valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
> future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).
Thanks for the detailed explanation.
[ ... ]
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 283efe3ce052..45346c79613a 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -282,14 +282,17 @@ static int update_effective_progs(struct cgroup *cgrp,
> * propagate the change to descendants
> * @cgrp: The cgroup which descendants to traverse
> * @prog: A program to attach
> + * @replace_prog: Previously attached program to replace if BPF_F_REPLACE is set
> * @type: Type of attach operation
> * @flags: Option flags
> *
> * Must be called with cgroup_mutex held.
> */
> int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
> + struct bpf_prog *replace_prog,
> enum bpf_attach_type type, u32 flags)
> {
> + u32 saved_flags = (flags & (BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI));
> struct list_head *progs = &cgrp->bpf.progs[type];
> struct bpf_prog *old_prog = NULL;
> struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE],
> @@ -298,14 +301,15 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
> enum bpf_cgroup_storage_type stype;
> int err;
>
> - if ((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI))
> + if (((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI)) ||
> + ((flags & BPF_F_REPLACE) && !(flags & BPF_F_ALLOW_MULTI)))
> /* invalid combination */
> return -EINVAL;
>
> if (!hierarchy_allows_attach(cgrp, type))
> return -EPERM;
>
> - if (!list_empty(progs) && cgrp->bpf.flags[type] != flags)
> + if (!list_empty(progs) && cgrp->bpf.flags[type] != saved_flags)
> /* Disallow attaching non-overridable on top
> * of existing overridable in this cgroup.
> * Disallow attaching multi-prog if overridable or none
> @@ -320,7 +324,12 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
> if (pl->prog == prog)
> /* disallow attaching the same prog twice */
> return -EINVAL;
> + if (pl->prog == replace_prog)
> + replace_pl = pl;
> }
> + if ((flags & BPF_F_REPLACE) && !replace_pl)
> + /* prog to replace not found for cgroup */
> + return -ENOENT;
> } else if (!list_empty(progs)) {
> replace_pl = list_first_entry(progs, typeof(*pl), node);
> }
> @@ -356,7 +365,7 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
> for_each_cgroup_storage_type(stype)
> pl->storage[stype] = storage[stype];
>
> - cgrp->bpf.flags[type] = flags;
> + cgrp->bpf.flags[type] = saved_flags;
>
> err = update_effective_progs(cgrp, type);
> if (err)
> @@ -522,6 +531,7 @@ int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> int cgroup_bpf_prog_attach(const union bpf_attr *attr,
> enum bpf_prog_type ptype, struct bpf_prog *prog)
> {
> + struct bpf_prog *replace_prog = NULL;
> struct cgroup *cgrp;
> int ret;
>
> @@ -529,8 +539,20 @@ int cgroup_bpf_prog_attach(const union bpf_attr *attr,
> if (IS_ERR(cgrp))
> return PTR_ERR(cgrp);
>
> - ret = cgroup_bpf_attach(cgrp, prog, attr->attach_type,
> + if ((attr->attach_flags & BPF_F_ALLOW_MULTI) &&
> + (attr->attach_flags & BPF_F_REPLACE)) {
The patch looks good. One optional nit for consideration,
Since it is testing BPF_F_REPLACE here already,
how about moving the
"((flags & BPF_F_REPLACE) && !(flags & BPF_F_ALLOW_MULTI))"
test from __cgroup_bpf_attach() to this function here?
Clear the BPF_F_REPLACE bit before passing to cgroup_bpf_attach().
Then the "saved_flags" logic in cgroup_bpf_attach() can go away.
cgroup_bpf_attach() can work on the "replace_prog" alone.
Acked-by: Martin KaFai Lau <kafai@fb.com>
> + replace_prog = bpf_prog_get_type(attr->replace_bpf_fd, ptype);
> + if (IS_ERR(replace_prog)) {
> + cgroup_put(cgrp);
> + return PTR_ERR(replace_prog);
> + }
> + }
> +
> + ret = cgroup_bpf_attach(cgrp, prog, replace_prog, attr->attach_type,
> attr->attach_flags);
> +
> + if (replace_prog)
> + bpf_prog_put(replace_prog);
> cgroup_put(cgrp);
> return ret;
> }
* Re: [PATCH bpf-next 3/5] bpf: Support replacing cgroup-bpf program in MULTI mode
2019-12-12 18:18 ` Martin Lau
@ 2019-12-12 18:46 ` Andrey Ignatov
0 siblings, 0 replies; 13+ messages in thread
From: Andrey Ignatov @ 2019-12-12 18:46 UTC (permalink / raw)
To: Martin Lau; +Cc: bpf, ast, daniel, Kernel Team
Martin Lau <kafai@fb.com> [Thu, 2019-12-12 10:18 -0800]:
> On Tue, Dec 10, 2019 at 06:33:29PM -0800, Andrey Ignatov wrote:
> > The common use-case in production is to have multiple cgroup-bpf
> > programs per attach type that cover multiple use-cases. Such programs
> > are attached with BPF_F_ALLOW_MULTI and can be maintained by different
> > people.
> >
> > Order of programs usually matters, for example imagine two egress
> > programs: the first one drops packets and the second one counts packets.
> > If they're swapped the result of counting program will be different.
> >
> > It brings operational challenges with updating cgroup-bpf program(s)
> > attached with BPF_F_ALLOW_MULTI since there is no way to replace a
> > program:
> >
> > * One way to update is to detach all programs first and then attach the
> > new version(s) again in the right order. This introduces an
> > interruption in the work a program is doing and may not be acceptable
> > (e.g. if it's egress firewall);
> >
> > * Another way is to attach the new version of a program first and only then
> > detach the old version. This introduces a time interval when two
> > versions of the same program are working, which may not be acceptable if a
> > program is not idempotent. It also imposes additional burden on
> > program developers to make sure that two versions of their program can
> > co-exist.
> >
> > Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH
> > command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI
> > flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag
> > and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to
> > replace.
> >
> > That way user can replace any program among those attached with
> > BPF_F_ALLOW_MULTI flag without the problems described above.
> >
> > Details of the new API:
> >
> > * If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid
> > descriptor of BPF program, BPF_PROG_ATTACH will return corresponding
> > error (EINVAL or EBADF).
> >
> > * If replace_bpf_fd has valid descriptor of BPF program but such a
> > program is not attached to specified cgroup, BPF_PROG_ATTACH will
> > return ENOENT.
> >
> > BPF_F_REPLACE is introduced to make the user's intent clear, since
> > replace_bpf_fd alone can't be used for this (its default value, 0, is a
> > valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
> > future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).
> Thanks for the detailed explanation.
>
> [ ... ]
>
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index 283efe3ce052..45346c79613a 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -282,14 +282,17 @@ static int update_effective_progs(struct cgroup *cgrp,
> > * propagate the change to descendants
> > * @cgrp: The cgroup which descendants to traverse
> > * @prog: A program to attach
> > + * @replace_prog: Previously attached program to replace if BPF_F_REPLACE is set
> > * @type: Type of attach operation
> > * @flags: Option flags
> > *
> > * Must be called with cgroup_mutex held.
> > */
> > int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
> > + struct bpf_prog *replace_prog,
> > enum bpf_attach_type type, u32 flags)
> > {
> > + u32 saved_flags = (flags & (BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI));
> > struct list_head *progs = &cgrp->bpf.progs[type];
> > struct bpf_prog *old_prog = NULL;
> > struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE],
> > @@ -298,14 +301,15 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
> > enum bpf_cgroup_storage_type stype;
> > int err;
> >
> > - if ((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI))
> > + if (((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI)) ||
> > + ((flags & BPF_F_REPLACE) && !(flags & BPF_F_ALLOW_MULTI)))
> > /* invalid combination */
> > return -EINVAL;
> >
> > if (!hierarchy_allows_attach(cgrp, type))
> > return -EPERM;
> >
> > - if (!list_empty(progs) && cgrp->bpf.flags[type] != flags)
> > + if (!list_empty(progs) && cgrp->bpf.flags[type] != saved_flags)
> > /* Disallow attaching non-overridable on top
> > * of existing overridable in this cgroup.
> > * Disallow attaching multi-prog if overridable or none
> > @@ -320,7 +324,12 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
> > if (pl->prog == prog)
> > /* disallow attaching the same prog twice */
> > return -EINVAL;
> > + if (pl->prog == replace_prog)
> > + replace_pl = pl;
> > }
> > + if ((flags & BPF_F_REPLACE) && !replace_pl)
> > + /* prog to replace not found for cgroup */
> > + return -ENOENT;
> > } else if (!list_empty(progs)) {
> > replace_pl = list_first_entry(progs, typeof(*pl), node);
> > }
> > @@ -356,7 +365,7 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
> > for_each_cgroup_storage_type(stype)
> > pl->storage[stype] = storage[stype];
> >
> > - cgrp->bpf.flags[type] = flags;
> > + cgrp->bpf.flags[type] = saved_flags;
> >
> > err = update_effective_progs(cgrp, type);
> > if (err)
> > @@ -522,6 +531,7 @@ int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > int cgroup_bpf_prog_attach(const union bpf_attr *attr,
> > enum bpf_prog_type ptype, struct bpf_prog *prog)
> > {
> > + struct bpf_prog *replace_prog = NULL;
> > struct cgroup *cgrp;
> > int ret;
> >
> > @@ -529,8 +539,20 @@ int cgroup_bpf_prog_attach(const union bpf_attr *attr,
> > if (IS_ERR(cgrp))
> > return PTR_ERR(cgrp);
> >
> > - ret = cgroup_bpf_attach(cgrp, prog, attr->attach_type,
> > + if ((attr->attach_flags & BPF_F_ALLOW_MULTI) &&
> > + (attr->attach_flags & BPF_F_REPLACE)) {
> The patch looks good. One optional nit for consideration,
>
> Since it is testing BPF_F_REPLACE here already,
> how about moving the
> "((flags & BPF_F_REPLACE) && !(flags & BPF_F_ALLOW_MULTI))"
> test from __cgroup_bpf_attach() to this function here?
> Clear the BPF_F_REPLACE bit before passing to cgroup_bpf_attach().
>
> Then the "saved_flags" logic in cgroup_bpf_attach() can go away.
> cgroup_bpf_attach() can work on the "replace_prog" alone.
>
> Acked-by: Martin KaFai Lau <kafai@fb.com>
Thank you for the review, Martin!
I considered doing exactly this and a few other options to split the
logic between __cgroup_bpf_attach() and cgroup_bpf_prog_attach() since
it's not super clear what belongs where, but decided to go with the
current approach.
A couple of reasons I split it this way:
1)
To keep all of the logic and decisions in __cgroup_bpf_attach() and use
cgroup_bpf_prog_attach() only to acquire the cgroup-bpf specific
resources that correspond to the user input, such as the cgroup and the
program to replace. Unfortunately, to acquire replace_prog I still need
to check flags to avoid unnecessary work in the most common case, when
BPF_F_REPLACE is not set, but IMO it's better to keep the logic that
verifies flag combinations in one place, __cgroup_bpf_attach().
2)
Also, I think saved_flags would be introduced sooner or later anyway if
new flags are added, since there is a clear separation between flags
that control program arrangement, like OVERRIDE and MULTI, which should
be remembered for the whole lifetime of the program, and one-time flags
such as REPLACE, which are needed only once to attach the program and
make no sense afterwards.
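To illustrate the separation in point 2, here is a minimal sketch, not the
patch itself. The flag macros are local stand-ins whose values are assumed
to match include/uapi/linux/bpf.h, and the helper name
persistent_attach_flags() is hypothetical:

```c
#include <stdint.h>

/* Local stand-ins so the sketch compiles on its own; the values are
 * assumed to match the UAPI definitions.
 */
#define BPF_F_ALLOW_OVERRIDE	(1U << 0)
#define BPF_F_ALLOW_MULTI	(1U << 1)
#define BPF_F_REPLACE		(1U << 2)

/* Hypothetical helper: keep only the flags that describe how programs
 * are arranged in the cgroup and must be remembered after attach.
 * One-time flags such as BPF_F_REPLACE are dropped.
 */
static inline uint32_t persistent_attach_flags(uint32_t flags)
{
	return flags & (BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI);
}
```

With this split, MULTI | REPLACE is stored as plain MULTI, so a later
attach with MULTI alone still matches the flags saved for the cgroup.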
> > + replace_prog = bpf_prog_get_type(attr->replace_bpf_fd, ptype);
> > + if (IS_ERR(replace_prog)) {
> > + cgroup_put(cgrp);
> > + return PTR_ERR(replace_prog);
> > + }
> > + }
> > +
> > + ret = cgroup_bpf_attach(cgrp, prog, replace_prog, attr->attach_type,
> > attr->attach_flags);
> > +
> > + if (replace_prog)
> > + bpf_prog_put(replace_prog);
> > cgroup_put(cgrp);
> > return ret;
> > }
--
Andrey Ignatov
* [PATCH bpf-next 4/5] libbpf: Introduce bpf_prog_attach_xattr
2019-12-11 2:33 [PATCH bpf-next 0/5] bpf: Support replacing cgroup-bpf program in MULTI mode Andrey Ignatov
` (2 preceding siblings ...)
2019-12-11 2:33 ` [PATCH bpf-next 3/5] bpf: Support replacing cgroup-bpf program in MULTI mode Andrey Ignatov
@ 2019-12-11 2:33 ` Andrey Ignatov
2019-12-12 8:08 ` Andrii Nakryiko
2019-12-11 2:33 ` [PATCH bpf-next 5/5] selftests/bpf: Cover BPF_F_REPLACE in test_cgroup_attach Andrey Ignatov
4 siblings, 1 reply; 13+ messages in thread
From: Andrey Ignatov @ 2019-12-11 2:33 UTC (permalink / raw)
To: bpf; +Cc: Andrey Ignatov, ast, daniel, kernel-team
Introduce a new bpf_prog_attach_xattr function that accepts an
extendable structure and supports passing a new attribute to the
BPF_PROG_ATTACH command: replace_prog_fd, the fd of a previously
attached cgroup-bpf program to replace when the recently introduced
BPF_F_REPLACE flag is used.
The new function is named to be consistent with other xattr-functions
(bpf_prog_test_run_xattr, bpf_create_map_xattr, bpf_load_program_xattr).
NOTE: DECLARE_LIBBPF_OPTS macro is not used here because it's available
in libbpf.h, and unavailable in bpf.h. Please let me know if the macro
should be shared in a common place and used here instead of declaring
struct bpf_prog_attach_attr directly.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
---
tools/lib/bpf/bpf.c | 22 ++++++++++++++++++----
tools/lib/bpf/bpf.h | 10 ++++++++++
tools/lib/bpf/libbpf.map | 5 +++++
3 files changed, 33 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 98596e15390f..5a2830fac227 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -466,14 +466,28 @@ int bpf_obj_get(const char *pathname)
int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type,
unsigned int flags)
+{
+ struct bpf_prog_attach_attr attach_attr;
+
+ memset(&attach_attr, 0, sizeof(attach_attr));
+ attach_attr.target_fd = target_fd;
+ attach_attr.prog_fd = prog_fd;
+ attach_attr.type = type;
+ attach_attr.flags = flags;
+
+ return bpf_prog_attach_xattr(&attach_attr);
+}
+
+int bpf_prog_attach_xattr(const struct bpf_prog_attach_attr *attach_attr)
{
union bpf_attr attr;
memset(&attr, 0, sizeof(attr));
- attr.target_fd = target_fd;
- attr.attach_bpf_fd = prog_fd;
- attr.attach_type = type;
- attr.attach_flags = flags;
+ attr.target_fd = attach_attr->target_fd;
+ attr.attach_bpf_fd = attach_attr->prog_fd;
+ attr.attach_type = attach_attr->type;
+ attr.attach_flags = attach_attr->flags;
+ attr.replace_bpf_fd = attach_attr->replace_prog_fd;
return sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr));
}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 3c791fa8e68e..4b7269d3bae7 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -128,8 +128,18 @@ LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);
LIBBPF_API int bpf_map_freeze(int fd);
LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);
LIBBPF_API int bpf_obj_get(const char *pathname);
+
+struct bpf_prog_attach_attr {
+ int target_fd;
+ int prog_fd;
+ enum bpf_attach_type type;
+ unsigned int flags;
+ int replace_prog_fd;
+};
+
LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
enum bpf_attach_type type, unsigned int flags);
+LIBBPF_API int bpf_prog_attach_xattr(const struct bpf_prog_attach_attr *attr);
LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
enum bpf_attach_type type);
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 8ddc2c40e482..42b065454031 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -208,3 +208,8 @@ LIBBPF_0.0.6 {
btf__find_by_name_kind;
libbpf_find_vmlinux_btf_id;
} LIBBPF_0.0.5;
+
+LIBBPF_0.0.7 {
+ global:
+ bpf_prog_attach_xattr;
+} LIBBPF_0.0.6;
--
2.17.1
* Re: [PATCH bpf-next 4/5] libbpf: Introduce bpf_prog_attach_xattr
2019-12-11 2:33 ` [PATCH bpf-next 4/5] libbpf: Introduce bpf_prog_attach_xattr Andrey Ignatov
@ 2019-12-12 8:08 ` Andrii Nakryiko
0 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2019-12-12 8:08 UTC (permalink / raw)
To: Andrey Ignatov; +Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team
On Tue, Dec 10, 2019 at 6:35 PM Andrey Ignatov <rdna@fb.com> wrote:
>
> Introduce a new bpf_prog_attach_xattr function that accepts an
> extendable structure and supports passing a new attribute to
> BPF_PROG_ATTACH command: replace_prog_fd that is fd of previously
> attached cgroup-bpf program to replace if recently introduced
> BPF_F_REPLACE flag is used.
>
> The new function is named to be consistent with other xattr-functions
> (bpf_prog_test_run_xattr, bpf_create_map_xattr, bpf_load_program_xattr).
>
> NOTE: DECLARE_LIBBPF_OPTS macro is not used here because it's available
> in libbpf.h, and unavailable in bpf.h. Please let me know if the macro
> should be shared in a common place and used here instead of declaring
> struct bpf_prog_attach_attr directly.
>
I think using opts is a better way forward. With the current approach,
the next time we need an extra field, we'd have to add yet another
function and/or symbol-version the existing one. BTW, with the
xxx_opts approach, we have tried to keep mandatory arguments, which
are always specified, as the first few arguments of a function, and
to put optional ones (flags and replace_prog_fd seem to be good
candidates) under opts. This differs from the xattr way, which is
why I'm pointing it out.
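For illustration, a minimal sketch of the opts-style pattern described
above: mandatory arguments come first, optional ones live in a
size-prefixed struct. All names here (attach_opts_sketch,
DECLARE_OPTS_SKETCH, OPTS_HAS, OPTS_GET, build_attach_request) are
hypothetical and are not the libbpf API; the sketch only fills a request
struct and makes no syscall.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opts struct. sz stays the first field so the callee can
 * tell which fields the caller was compiled with.
 */
struct attach_opts_sketch {
	size_t sz;
	unsigned int flags;
	int replace_prog_fd;
};

/* Declare an opts variable with sz filled in; fields not named in the
 * initializer list are zeroed by the designated initializer.
 */
#define DECLARE_OPTS_SKETCH(TYPE, NAME, ...)				\
	struct TYPE NAME = { .sz = sizeof(struct TYPE), __VA_ARGS__ }

/* True if opts is non-NULL and large enough to contain the field. */
#define OPTS_HAS(opts, field)						\
	((opts) && (opts)->sz >= offsetof(struct attach_opts_sketch, field) + \
				 sizeof((opts)->field))

/* Read a field, falling back to 0 when it is absent. */
#define OPTS_GET(opts, field) (OPTS_HAS(opts, field) ? (opts)->field : 0)

/* Stand-in for the attributes that would be passed to the syscall. */
struct attach_request_sketch {
	int target_fd;
	int prog_fd;
	int type;
	unsigned int flags;
	int replace_prog_fd;
};

/* Mandatory arguments are positional; everything optional is in opts,
 * which may be NULL.
 */
static void build_attach_request(int prog_fd, int target_fd, int type,
				 const struct attach_opts_sketch *opts,
				 struct attach_request_sketch *req)
{
	memset(req, 0, sizeof(*req));
	req->prog_fd = prog_fd;
	req->target_fd = target_fd;
	req->type = type;
	req->flags = OPTS_GET(opts, flags);
	req->replace_prog_fd = OPTS_GET(opts, replace_prog_fd);
}
```

A new optional field can then be appended to the struct without adding a
function or a symbol version: callers built against the older, shorter
struct pass a smaller sz, and the new field reads as 0 for them.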
> Signed-off-by: Andrey Ignatov <rdna@fb.com>
> ---
> tools/lib/bpf/bpf.c | 22 ++++++++++++++++++----
> tools/lib/bpf/bpf.h | 10 ++++++++++
> tools/lib/bpf/libbpf.map | 5 +++++
> 3 files changed, 33 insertions(+), 4 deletions(-)
>
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 98596e15390f..5a2830fac227 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -466,14 +466,28 @@ int bpf_obj_get(const char *pathname)
>
> int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type,
> unsigned int flags)
> +{
> + struct bpf_prog_attach_attr attach_attr;
> +
> + memset(&attach_attr, 0, sizeof(attach_attr));
> + attach_attr.target_fd = target_fd;
> + attach_attr.prog_fd = prog_fd;
> + attach_attr.type = type;
> + attach_attr.flags = flags;
> +
> + return bpf_prog_attach_xattr(&attach_attr);
> +}
> +
> +int bpf_prog_attach_xattr(const struct bpf_prog_attach_attr *attach_attr)
> {
> union bpf_attr attr;
>
> memset(&attr, 0, sizeof(attr));
> - attr.target_fd = target_fd;
> - attr.attach_bpf_fd = prog_fd;
> - attr.attach_type = type;
> - attr.attach_flags = flags;
> + attr.target_fd = attach_attr->target_fd;
> + attr.attach_bpf_fd = attach_attr->prog_fd;
> + attr.attach_type = attach_attr->type;
> + attr.attach_flags = attach_attr->flags;
> + attr.replace_bpf_fd = attach_attr->replace_prog_fd;
>
> return sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr));
> }
> diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
> index 3c791fa8e68e..4b7269d3bae7 100644
> --- a/tools/lib/bpf/bpf.h
> +++ b/tools/lib/bpf/bpf.h
> @@ -128,8 +128,18 @@ LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);
> LIBBPF_API int bpf_map_freeze(int fd);
> LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);
> LIBBPF_API int bpf_obj_get(const char *pathname);
> +
> +struct bpf_prog_attach_attr {
> + int target_fd;
> + int prog_fd;
> + enum bpf_attach_type type;
> + unsigned int flags;
> + int replace_prog_fd;
> +};
> +
> LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
> enum bpf_attach_type type, unsigned int flags);
> +LIBBPF_API int bpf_prog_attach_xattr(const struct bpf_prog_attach_attr *attr);
> LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
> LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
> enum bpf_attach_type type);
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index 8ddc2c40e482..42b065454031 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -208,3 +208,8 @@ LIBBPF_0.0.6 {
> btf__find_by_name_kind;
> libbpf_find_vmlinux_btf_id;
> } LIBBPF_0.0.5;
> +
> +LIBBPF_0.0.7 {
> + global:
> + bpf_prog_attach_xattr;
> +} LIBBPF_0.0.6;
> --
> 2.17.1
>
* [PATCH bpf-next 5/5] selftests/bpf: Cover BPF_F_REPLACE in test_cgroup_attach
2019-12-11 2:33 [PATCH bpf-next 0/5] bpf: Support replacing cgroup-bpf program in MULTI mode Andrey Ignatov
` (3 preceding siblings ...)
2019-12-11 2:33 ` [PATCH bpf-next 4/5] libbpf: Introduce bpf_prog_attach_xattr Andrey Ignatov
@ 2019-12-11 2:33 ` Andrey Ignatov
2019-12-12 18:26 ` Martin Lau
4 siblings, 1 reply; 13+ messages in thread
From: Andrey Ignatov @ 2019-12-11 2:33 UTC (permalink / raw)
To: bpf; +Cc: Andrey Ignatov, ast, daniel, kernel-team
Test replacement of a cgroup-bpf program attached with BPF_F_ALLOW_MULTI
and possible failure modes: invalid combination of flags, invalid
replace_bpf_fd, replacing a program that is not attached to the
specified cgroup.
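The failure modes above can be summarized as a small model of the errno a
caller should see. This is a sketch under assumptions, not kernel code: the
flag macros are local stand-ins assumed to match the UAPI values, the
function name expected_replace_errno() is hypothetical, and other checks
(hierarchy permissions, attaching the same program twice, mismatch with the
flags already saved for the cgroup) are deliberately left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch compiles on its own; the values are
 * assumed to match the UAPI definitions.
 */
#define BPF_F_ALLOW_OVERRIDE	(1U << 0)
#define BPF_F_ALLOW_MULTI	(1U << 1)
#define BPF_F_REPLACE		(1U << 2)

/* Hypothetical model: returns the positive errno expected for an attach
 * request, or 0 if none of the modelled checks rejects it.
 */
static int expected_replace_errno(uint32_t flags, bool replace_fd_valid,
				  bool replace_prog_attached)
{
	bool multi = flags & BPF_F_ALLOW_MULTI;
	bool replace = flags & BPF_F_REPLACE;

	/* The replace fd is only looked up when both flags are set. */
	if (multi && replace && !replace_fd_valid)
		return EBADF;
	/* OVERRIDE with MULTI, or REPLACE without MULTI, is invalid. */
	if (((flags & BPF_F_ALLOW_OVERRIDE) && multi) || (replace && !multi))
		return EINVAL;
	/* The program to replace must be attached to this cgroup. */
	if (replace && !replace_prog_attached)
		return ENOENT;
	return 0;
}
```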
Example of program replacing:
# gdb -q ./test_cgroup_attach
Reading symbols from /data/users/rdna/bin/test_cgroup_attach...done.
...
Breakpoint 1, test_multiprog () at test_cgroup_attach.c:442
442 test_cgroup_attach.c: No such file or directory.
(gdb)
[2]+ Stopped gdb -q ./test_cgroup_attach
# bpftool c s /mnt/cgroup2/cgroup-test-work-dir/cg1
ID AttachType AttachFlags Name
35 egress multi
36 egress multi
# fg gdb -q ./test_cgroup_attach
c
Continuing.
Detaching after fork from child process 361.
Breakpoint 2, test_multiprog () at test_cgroup_attach.c:453
453 in test_cgroup_attach.c
(gdb)
[2]+ Stopped gdb -q ./test_cgroup_attach
# bpftool c s /mnt/cgroup2/cgroup-test-work-dir/cg1
ID AttachType AttachFlags Name
41 egress multi
36 egress multi
Signed-off-by: Andrey Ignatov <rdna@fb.com>
---
.../selftests/bpf/test_cgroup_attach.c | 61 +++++++++++++++++--
1 file changed, 56 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_cgroup_attach.c b/tools/testing/selftests/bpf/test_cgroup_attach.c
index 7671909ee1cb..b9148d752207 100644
--- a/tools/testing/selftests/bpf/test_cgroup_attach.c
+++ b/tools/testing/selftests/bpf/test_cgroup_attach.c
@@ -250,7 +250,7 @@ static int prog_load_cnt(int verdict, int val)
BPF_LD_MAP_FD(BPF_REG_1, map_fd),
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
- BPF_MOV64_IMM(BPF_REG_1, val), /* r1 = 1 */
+ BPF_MOV64_IMM(BPF_REG_1, val), /* r1 = val */
BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */
BPF_LD_MAP_FD(BPF_REG_1, cgroup_storage_fd),
@@ -290,11 +290,12 @@ static int test_multiprog(void)
{
__u32 prog_ids[4], prog_cnt = 0, attach_flags, saved_prog_id;
int cg1 = 0, cg2 = 0, cg3 = 0, cg4 = 0, cg5 = 0, key = 0;
- int drop_prog, allow_prog[6] = {}, rc = 0;
+ int drop_prog, allow_prog[7] = {}, rc = 0;
+ struct bpf_prog_attach_attr attach_attr;
unsigned long long value;
int i = 0;
- for (i = 0; i < 6; i++) {
+ for (i = 0; i < ARRAY_SIZE(allow_prog); i++) {
allow_prog[i] = prog_load_cnt(1, 1 << i);
if (!allow_prog[i])
goto err;
@@ -400,6 +401,56 @@ static int test_multiprog(void)
assert(bpf_map_lookup_elem(map_fd, &key, &value) == 0);
assert(value == 1 + 2 + 8 + 16);
+ /* invalid input */
+
+ memset(&attach_attr, 0, sizeof(attach_attr));
+ attach_attr.target_fd = cg1;
+ attach_attr.prog_fd = allow_prog[6];
+ attach_attr.replace_prog_fd = allow_prog[0];
+ attach_attr.type = BPF_CGROUP_INET_EGRESS;
+ attach_attr.flags = BPF_F_ALLOW_OVERRIDE | BPF_F_REPLACE;
+
+ if (!bpf_prog_attach_xattr(&attach_attr)) {
+ log_err("Unexpected success with OVERRIDE | REPLACE");
+ goto err;
+ }
+ assert(errno == EINVAL);
+
+ attach_attr.flags = BPF_F_REPLACE;
+ if (!bpf_prog_attach_xattr(&attach_attr)) {
+ log_err("Unexpected success with REPLACE alone");
+ goto err;
+ }
+ assert(errno == EINVAL);
+ attach_attr.flags = BPF_F_ALLOW_MULTI | BPF_F_REPLACE;
+
+ attach_attr.replace_prog_fd = -1;
+ if (!bpf_prog_attach_xattr(&attach_attr)) {
+ log_err("Unexpected success with bad replace fd");
+ goto err;
+ }
+ assert(errno == EBADF);
+
+ /* replacing a program that is not attached to cgroup should fail */
+ attach_attr.replace_prog_fd = allow_prog[3];
+ if (!bpf_prog_attach_xattr(&attach_attr)) {
+ log_err("Unexpected success: replace not-attached prog on cg1");
+ goto err;
+ }
+ assert(errno == ENOENT);
+ attach_attr.replace_prog_fd = allow_prog[0];
+
+ /* replace 1st from the top program */
+ if (bpf_prog_attach_xattr(&attach_attr)) {
+ log_err("Replace prog1 with prog7 on cg1");
+ goto err;
+ }
+ value = 0;
+ assert(bpf_map_update_elem(map_fd, &key, &value, 0) == 0);
+ assert(system(PING_CMD) == 0);
+ assert(bpf_map_lookup_elem(map_fd, &key, &value) == 0);
+ assert(value == 64 + 2 + 8 + 16);
+
/* detach 3rd from bottom program and ping again */
errno = 0;
if (!bpf_prog_detach2(0, cg3, BPF_CGROUP_INET_EGRESS)) {
@@ -414,7 +465,7 @@ static int test_multiprog(void)
assert(bpf_map_update_elem(map_fd, &key, &value, 0) == 0);
assert(system(PING_CMD) == 0);
assert(bpf_map_lookup_elem(map_fd, &key, &value) == 0);
- assert(value == 1 + 2 + 16);
+ assert(value == 64 + 2 + 16);
/* detach 2nd from bottom program and ping again */
if (bpf_prog_detach2(-1, cg4, BPF_CGROUP_INET_EGRESS)) {
@@ -425,7 +476,7 @@ static int test_multiprog(void)
assert(bpf_map_update_elem(map_fd, &key, &value, 0) == 0);
assert(system(PING_CMD) == 0);
assert(bpf_map_lookup_elem(map_fd, &key, &value) == 0);
- assert(value == 1 + 2 + 4);
+ assert(value == 64 + 2 + 4);
prog_cnt = 4;
assert(bpf_prog_query(cg5, BPF_CGROUP_INET_EGRESS, BPF_F_QUERY_EFFECTIVE,
--
2.17.1
* Re: [PATCH bpf-next 5/5] selftests/bpf: Cover BPF_F_REPLACE in test_cgroup_attach
2019-12-11 2:33 ` [PATCH bpf-next 5/5] selftests/bpf: Cover BPF_F_REPLACE in test_cgroup_attach Andrey Ignatov
@ 2019-12-12 18:26 ` Martin Lau
2019-12-12 18:51 ` Andrey Ignatov
0 siblings, 1 reply; 13+ messages in thread
From: Martin Lau @ 2019-12-12 18:26 UTC (permalink / raw)
To: Andrey Ignatov; +Cc: bpf, ast, daniel, Kernel Team
On Tue, Dec 10, 2019 at 06:33:31PM -0800, Andrey Ignatov wrote:
> Test replacement of a cgroup-bpf program attached with BPF_F_ALLOW_MULTI
> and possible failure modes: invalid combination of flags, invalid
> replace_bpf_fd, replacing a program that is not attached to the
> specified cgroup.
>
> Example of program replacing:
>
> # gdb -q ./test_cgroup_attach
> Reading symbols from /data/users/rdna/bin/test_cgroup_attach...done.
> ...
> Breakpoint 1, test_multiprog () at test_cgroup_attach.c:442
> 442 test_cgroup_attach.c: No such file or directory.
> (gdb)
> [2]+ Stopped gdb -q ./test_cgroup_attach
> # bpftool c s /mnt/cgroup2/cgroup-test-work-dir/cg1
> ID AttachType AttachFlags Name
> 35 egress multi
> 36 egress multi
> # fg gdb -q ./test_cgroup_attach
> c
> Continuing.
> Detaching after fork from child process 361.
>
> Breakpoint 2, test_multiprog () at test_cgroup_attach.c:453
> 453 in test_cgroup_attach.c
> (gdb)
> [2]+ Stopped gdb -q ./test_cgroup_attach
> # bpftool c s /mnt/cgroup2/cgroup-test-work-dir/cg1
> ID AttachType AttachFlags Name
> 41 egress multi
> 36 egress multi
>
> Signed-off-by: Andrey Ignatov <rdna@fb.com>
> ---
> .../selftests/bpf/test_cgroup_attach.c | 61 +++++++++++++++++--
> 1 file changed, 56 insertions(+), 5 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/test_cgroup_attach.c b/tools/testing/selftests/bpf/test_cgroup_attach.c
> index 7671909ee1cb..b9148d752207 100644
> --- a/tools/testing/selftests/bpf/test_cgroup_attach.c
> +++ b/tools/testing/selftests/bpf/test_cgroup_attach.c
> @@ -250,7 +250,7 @@ static int prog_load_cnt(int verdict, int val)
> BPF_LD_MAP_FD(BPF_REG_1, map_fd),
> BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
> BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
> - BPF_MOV64_IMM(BPF_REG_1, val), /* r1 = 1 */
> + BPF_MOV64_IMM(BPF_REG_1, val), /* r1 = val */
> BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */
>
> BPF_LD_MAP_FD(BPF_REG_1, cgroup_storage_fd),
> @@ -290,11 +290,12 @@ static int test_multiprog(void)
> {
> __u32 prog_ids[4], prog_cnt = 0, attach_flags, saved_prog_id;
> int cg1 = 0, cg2 = 0, cg3 = 0, cg4 = 0, cg5 = 0, key = 0;
> - int drop_prog, allow_prog[6] = {}, rc = 0;
> + int drop_prog, allow_prog[7] = {}, rc = 0;
> + struct bpf_prog_attach_attr attach_attr;
> unsigned long long value;
> int i = 0;
>
> - for (i = 0; i < 6; i++) {
> + for (i = 0; i < ARRAY_SIZE(allow_prog); i++) {
> allow_prog[i] = prog_load_cnt(1, 1 << i);
> if (!allow_prog[i])
> goto err;
> @@ -400,6 +401,56 @@ static int test_multiprog(void)
> assert(bpf_map_lookup_elem(map_fd, &key, &value) == 0);
> assert(value == 1 + 2 + 8 + 16);
>
> + /* invalid input */
> +
> + memset(&attach_attr, 0, sizeof(attach_attr));
> + attach_attr.target_fd = cg1;
> + attach_attr.prog_fd = allow_prog[6];
> + attach_attr.replace_prog_fd = allow_prog[0];
> + attach_attr.type = BPF_CGROUP_INET_EGRESS;
> + attach_attr.flags = BPF_F_ALLOW_OVERRIDE | BPF_F_REPLACE;
> +
> + if (!bpf_prog_attach_xattr(&attach_attr)) {
> + log_err("Unexpected success with OVERRIDE | REPLACE");
> + goto err;
> + }
> + assert(errno == EINVAL);
> +
> + attach_attr.flags = BPF_F_REPLACE;
> + if (!bpf_prog_attach_xattr(&attach_attr)) {
> + log_err("Unexpected success with REPLACE alone");
> + goto err;
> + }
> + assert(errno == EINVAL);
> + attach_attr.flags = BPF_F_ALLOW_MULTI | BPF_F_REPLACE;
> +
> + attach_attr.replace_prog_fd = -1;
> + if (!bpf_prog_attach_xattr(&attach_attr)) {
The whole set LGTM. I expect this attach bit will change based on
the discussion in patch 4.
> + log_err("Unexpected success with bad replace fd");
> + goto err;
> + }
* Re: [PATCH bpf-next 5/5] selftests/bpf: Cover BPF_F_REPLACE in test_cgroup_attach
2019-12-12 18:26 ` Martin Lau
@ 2019-12-12 18:51 ` Andrey Ignatov
0 siblings, 0 replies; 13+ messages in thread
From: Andrey Ignatov @ 2019-12-12 18:51 UTC (permalink / raw)
To: Martin Lau; +Cc: bpf, ast, daniel, Andrii Nakryiko, Kernel Team
Martin Lau <kafai@fb.com> [Thu, 2019-12-12 10:26 -0800]:
> On Tue, Dec 10, 2019 at 06:33:31PM -0800, Andrey Ignatov wrote:
...
> > + attach_attr.replace_prog_fd = -1;
> > + if (!bpf_prog_attach_xattr(&attach_attr)) {
> The whole set LGTM. I expect this attach bit will change based on
> the discussion in patch 4.
Right, I'll change the libbpf part and this test according to the
feedback from Andrii and send v2.
Thank you!
--
Andrey Ignatov