All of lore.kernel.org
 help / color / mirror / Atom feed
From: Martin Lau <kafai@fb.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: "bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	David Miller <davem@davemloft.net>,
	Kernel Team <Kernel-team@fb.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [PATCH bpf-next v3 06/11] bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS
Date: Wed, 8 Jan 2020 01:53:26 +0000	[thread overview]
Message-ID: <20200108015323.oooj47pybplopzqr@kafai-mbp> (raw)
In-Reply-To: <ee7a3631-bb47-3d58-7ad2-431b9af40589@iogearbox.net>

On Tue, Jan 07, 2020 at 10:44:41PM +0100, Daniel Borkmann wrote:
> On 1/7/20 7:55 PM, Martin Lau wrote:
> > On Tue, Jan 07, 2020 at 05:00:37PM +0100, Daniel Borkmann wrote:
> > > On 12/31/19 7:20 AM, Martin KaFai Lau wrote:
> > > [...]
> [...]
> > > > +		err = arch_prepare_bpf_trampoline(image,
> > > > +						  st_map->image + PAGE_SIZE,
> > > > +						  &st_ops->func_models[i], 0,
> > > > +						  &prog, 1, NULL, 0, NULL);
> > > > +		if (err < 0)
> > > > +			goto reset_unlock;
> > > > +
> > > > +		*(void **)(kdata + moff) = image;
> > > > +		image += err;
> > > > +
> > > > +		/* put prog_id to udata */
> > > > +		*(unsigned long *)(udata + moff) = prog->aux->id;
> > udata (with all progs' id) will be returned during lookup_elem().
> > 
> > > > +	}
> > > > +
> > > > +	refcount_set(&kvalue->refcnt, 1);
> > > > +	bpf_map_inc(map);
> > > > +
> > > > +	err = st_ops->reg(kdata);
> > > > +	if (!err) {
> > > > +		/* Pair with smp_load_acquire() during lookup */
> > > > +		smp_store_release(&kvalue->state, BPF_STRUCT_OPS_STATE_INUSE);
> > > 
> > > Is there a reason using READ_ONCE/WRITE_ONCE pair is not enough?
> > The intention is to ensure lookup_elem() can see all the progs' id once
> > the state is set to BPF_STRUCT_OPS_STATE_INUSE.
> > 
> > Is READ_ONCE/WRITE_ONCE enough to do this?
> 
> True, given the above udata store, makes sense as-is.
> 
> > > > +		goto unlock;
> > > > +	}
> > > > +
> > > > +	/* Error during st_ops->reg() */
> > > > +	bpf_map_put(map);
> > > > +
> > > > +reset_unlock:
> > > > +	bpf_struct_ops_map_put_progs(st_map);
> > > > +	memset(uvalue, 0, map->value_size);
> > > > +	memset(kvalue, 0, map->value_size);
> > > > +
> > > > +unlock:
> > > > +	spin_unlock(&st_map->lock);
> > > > +	return err;
> > > > +}
> > > [...]
> > > > +static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
> > > > +{
> > > > +	const struct bpf_struct_ops *st_ops;
> > > > +	size_t map_total_size, st_map_size;
> > > > +	struct bpf_struct_ops_map *st_map;
> > > > +	const struct btf_type *t, *vt;
> > > > +	struct bpf_map_memory mem;
> > > > +	struct bpf_map *map;
> > > > +	int err;
> > > > +
> > > > +	if (!capable(CAP_SYS_ADMIN))
> > > > +		return ERR_PTR(-EPERM);
> > > > +
> > > > +	st_ops = bpf_struct_ops_find_value(attr->btf_vmlinux_value_type_id);
> > > > +	if (!st_ops)
> > > > +		return ERR_PTR(-ENOTSUPP);
> > > > +
> > > > +	vt = st_ops->value_type;
> > > > +	if (attr->value_size != vt->size)
> > > > +		return ERR_PTR(-EINVAL);
> > > > +
> > > > +	t = st_ops->type;
> > > > +
> > > > +	st_map_size = sizeof(*st_map) +
> > > > +		/* kvalue stores the
> > > > +		 * struct bpf_struct_ops_tcp_congestions_ops
> > > > +		 */
> > > > +		(vt->size - sizeof(struct bpf_struct_ops_value));
> > > > +	map_total_size = st_map_size +
> > > > +		/* uvalue */
> > > > +		sizeof(vt->size) +
> > > > +		/* struct bpf_progs **progs */
> > > > +		 btf_type_vlen(t) * sizeof(struct bpf_prog *);
> > > > +	err = bpf_map_charge_init(&mem, map_total_size);
> > > > +	if (err < 0)
> > > > +		return ERR_PTR(err);
> > > > +
> > > > +	st_map = bpf_map_area_alloc(st_map_size, NUMA_NO_NODE);
> > > > +	if (!st_map) {
> > > > +		bpf_map_charge_finish(&mem);
> > > > +		return ERR_PTR(-ENOMEM);
> > > > +	}
> > > > +	st_map->st_ops = st_ops;
> > > > +	map = &st_map->map;
> > > > +
> > > > +	st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE);
> > > > +	st_map->progs =
> > > > +		bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_prog *),
> > > > +				   NUMA_NO_NODE);
> > > > +	st_map->image = bpf_jit_alloc_exec(PAGE_SIZE);
> > > > +	if (!st_map->uvalue || !st_map->progs || !st_map->image) {
> > > > +		bpf_struct_ops_map_free(map);
> > > > +		bpf_map_charge_finish(&mem);
> > > > +		return ERR_PTR(-ENOMEM);
> > > > +	}
> > > > +
> > > > +	spin_lock_init(&st_map->lock);
> > > > +	set_vm_flush_reset_perms(st_map->image);
> > > > +	set_memory_x((long)st_map->image, 1);
> > > 
> > > Shouldn't this be using text poke as well once you write the image later on,
> > > otherwise we create yet another instance of W+X memory ... :/
> > Once image is written in update_elem(), it will never be changed.
> > I can set it to ro after it is written.
> 
> And we could also move the set_memory_x() to that point once image is written and
> marked read-only; mid term text poke interface to avoid all this.
I will respin.

Thanks for the review!  

> 
> Other than that nothing obvious stands out from reviewing patch 1-8, so no objections
> from my side.
> 
> > > > +	bpf_map_init_from_attr(map, attr);
> > > > +	bpf_map_charge_move(&map->memory, &mem);
> > > > +
> > > > +	return map;
> > > > +}
> > > > +
> > > > +const struct bpf_map_ops bpf_struct_ops_map_ops = {
> > > > +	.map_alloc_check = bpf_struct_ops_map_alloc_check,
> > > > +	.map_alloc = bpf_struct_ops_map_alloc,
> > > > +	.map_free = bpf_struct_ops_map_free,
> > > > +	.map_get_next_key = bpf_struct_ops_map_get_next_key,
> > > > +	.map_lookup_elem = bpf_struct_ops_map_lookup_elem,
> > > > +	.map_delete_elem = bpf_struct_ops_map_delete_elem,
> > > > +	.map_update_elem = bpf_struct_ops_map_update_elem,
> > > > +	.map_seq_show_elem = bpf_struct_ops_map_seq_show_elem,
> > > > +};
> > > [...]
> 

  reply	other threads:[~2020-01-08  1:53 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-31  6:20 [PATCH bpf-next v3 00/11] Introduce BPF STRUCT_OPS Martin KaFai Lau
2019-12-31  6:20 ` [PATCH bpf-next v3 01/11] bpf: Save PTR_TO_BTF_ID register state when spilling to stack Martin KaFai Lau
2019-12-31  6:20 ` [PATCH bpf-next v3 02/11] bpf: Avoid storing modifier to info->btf_id Martin KaFai Lau
2019-12-31  6:20 ` [PATCH bpf-next v3 03/11] bpf: Add enum support to btf_ctx_access() Martin KaFai Lau
2019-12-31  6:20 ` [PATCH bpf-next v3 04/11] bpf: Support bitfield read access in btf_struct_access Martin KaFai Lau
2020-01-02 18:04   ` Yonghong Song
2019-12-31  6:20 ` [PATCH bpf-next v3 05/11] bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS Martin KaFai Lau
2019-12-31  6:20 ` [PATCH bpf-next v3 06/11] bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS Martin KaFai Lau
2020-01-02 18:33   ` Yonghong Song
2020-01-07 16:00   ` Daniel Borkmann
2020-01-07 18:55     ` Martin Lau
2020-01-07 21:44       ` Daniel Borkmann
2020-01-08  1:53         ` Martin Lau [this message]
2020-01-07 20:50   ` Daniel Borkmann
2020-01-07 20:51     ` Daniel Borkmann
2020-01-08  0:21   ` Daniel Borkmann
2020-01-08  1:52     ` Martin Lau
2020-01-08 16:53       ` Daniel Borkmann
2020-01-08 18:41         ` Martin Lau
2019-12-31  6:20 ` [PATCH bpf-next v3 07/11] bpf: tcp: Support tcp_congestion_ops in bpf Martin KaFai Lau
2019-12-31  6:20 ` [PATCH bpf-next v3 08/11] bpf: Add BPF_FUNC_tcp_send_ack helper Martin KaFai Lau
2019-12-31  6:20 ` [PATCH bpf-next v3 09/11] bpf: Synch uapi bpf.h to tools/ Martin KaFai Lau
2019-12-31  6:20 ` [PATCH bpf-next v3 10/11] bpf: libbpf: Add STRUCT_OPS support Martin KaFai Lau
2019-12-31  6:21 ` [PATCH bpf-next v3 11/11] bpf: Add bpf_dctcp example Martin KaFai Lau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200108015323.oooj47pybplopzqr@kafai-mbp \
    --to=kafai@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.