All of lore.kernel.org
 help / color / mirror / Atom feed
From: KP Singh <kpsingh@chromium.org>
To: Martin KaFai Lau <kafai@fb.com>
Cc: KP Singh <kpsingh@chromium.org>,
	bpf@vger.kernel.org, linux-security-module@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Paul Turner <pjt@google.com>, Jann Horn <jannh@google.com>
Subject: Re: [PATCH bpf-next v2 1/4] bpf: Generalize bpf_sk_storage
Date: Mon, 29 Jun 2020 18:01:00 +0200	[thread overview]
Message-ID: <20200629160100.GA171259@google.com> (raw)
In-Reply-To: <20200619064332.fycpxuegmmkbfe54@kafai-mbp.dhcp.thefacebook.com>

Thanks for your feedback! Apologies it took some time for me
to incorporate this into another revision.

On 18-Jun 23:43, Martin KaFai Lau wrote:
> On Wed, Jun 17, 2020 at 10:29:38PM +0200, KP Singh wrote:
> > From: KP Singh <kpsingh@google.com>
> > 
> > Refactor the functionality in bpf_sk_storage.c so that concept of
> > storage linked to kernel objects can be extended to other objects like
> > inode, task_struct etc.
> > 
> > bpf_sk_storage is updated to be bpf_local_storage with a union that
> > contains a pointer to the owner object. The type of the
> > bpf_local_storage can be determined using the newly added
> > bpf_local_storage_type enum.
> > 
> > Each new local storage will still be a separate map and provide its own
> > set of helpers. This allows for future object specific extensions and
> > still share a lot of the underlying implementation.
> Thanks for taking up this effort to refactor sk_local_storage.
> 
> I took a quick look.  I have some comments and would like to explore
> some thoughts.
> 
> > --- a/net/core/bpf_sk_storage.c
> > +++ b/kernel/bpf/bpf_local_storage.c
> > @@ -1,19 +1,22 @@
> >  // SPDX-License-Identifier: GPL-2.0
> >  /* Copyright (c) 2019 Facebook  */
> > +#include "linux/bpf.h"
> > +#include "asm-generic/bug.h"
> > +#include "linux/err.h"
> "<" ">"
> 
> >  #include <linux/rculist.h>
> >  #include <linux/list.h>
> >  #include <linux/hash.h>
> >  #include <linux/types.h>
> >  #include <linux/spinlock.h>
> >  #include <linux/bpf.h>
> > -#include <net/bpf_sk_storage.h>
> > +#include <linux/bpf_local_storage.h>
> >  #include <net/sock.h>
> >  #include <uapi/linux/sock_diag.h>
> >  #include <uapi/linux/btf.h>
> >  
> >  static atomic_t cache_idx;
> inode local storage and sk local storage probably need a separate
> cache_idx.  An improvement on picking cache_idx has just been
> landed also.

I see, thanks! I rebased and I now see that cache_idx is now a:

  static u64 cache_idx_usage_counts[BPF_STORAGE_CACHE_SIZE];

which tracks the free cache slots rather than using a single atomic
cache_idx. I guess all types of local storage can share this now
right?

> 
> [ ... ]
> 
> > +struct bpf_local_storage {
> > +	struct bpf_local_storage_data __rcu *cache[BPF_STORAGE_CACHE_SIZE];
> >  	return NULL;

[...]

> >  }
> >  
> > -/* sk_storage->lock must be held and selem->sk_storage == sk_storage.
> > +static void __unlink_local_storage(struct bpf_local_storage *local_storage,
> > +				   bool uncharge_omem)
> Nit. indent is off.  There are a few more cases like this.

Thanks, will fix this. (note to self: don't trust the editor's
clang-format blindly).

> 
> > +{
> > +	struct sock *sk;
> > +
> > +	switch (local_storage->stype) {
> Does it need a new bpf_local_storage_type?  Is map_type as good?
> 
> Instead of adding any new member (e.g. stype) to
> "struct bpf_local_storage",  can the smap pointer be directly used
> here instead?
> 
> For example in __unlink_local_storage() here, it should
> have a hold to the selem which then has a hold to smap.

Good point, Updated to using the map->map_type.

> 
> > +	case BPF_LOCAL_STORAGE_SK:
> > +		sk = local_storage->sk;
> > +		if (uncharge_omem)
> > +			atomic_sub(sizeof(struct bpf_local_storage),
> > +				   &sk->sk_omem_alloc);
> > +
> > +		/* After this RCU_INIT, sk may be freed and cannot be used */
> > +		RCU_INIT_POINTER(sk->sk_bpf_storage, NULL);
> > +		local_storage->sk = NULL;
> > +		break;
> > +	}
> Another thought on the stype switch cases.
> 
> Instead of having multiple switches on stype in bpf_local_storage.c which may
> not be scalable soon if we are planning to support a few more kernel objects,
> have you considered putting them into its own "ops".  May be a few new
> ops can be added to bpf_map_ops to do local storage unlink/update/alloc...etc.

Good idea, I was able to refactor this with the following ops:

        /* Functions called by bpf_local_storage maps */
	void (*map_local_storage_unlink)(struct bpf_local_storage *local_storage,
                                         bool uncharge_omem);
	struct bpf_local_storage_elem *(*map_selem_alloc)(
		struct bpf_local_storage_map *smap, void *owner, void *value,
		bool charge_omem);
	struct bpf_local_storage_data *(*map_local_storage_update)(
		void  *owner, struct bpf_map *map, void *value, u64 flags);
	int (*map_local_storage_alloc)(void *owner,
				       struct bpf_local_storage_map *smap,
				       struct bpf_local_storage_elem *elem);

Let me know if you have any particular thoughts/suggestions about
this.

> 
> > +}
> > +
> > +/* local_storage->lock must be held and selem->local_storage == local_storage.
> >   * The caller must ensure selem->smap is still valid to be
> >   * dereferenced for its smap->elem_size and smap->cache_idx.
> > + *
> > + * uncharge_omem is only relevant when:

[...]

> > +	/* bpf_local_storage_map is currently limited to CAP_SYS_ADMIN as
> >  	 * the map_alloc_check() side also does.
> >  	 */
> >  	if (!bpf_capable())
> > @@ -1025,10 +1127,10 @@ bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs)
> >  }
> >  EXPORT_SYMBOL_GPL(bpf_sk_storage_diag_alloc);
> Would it be cleaner to leave bpf_sk specific function, map_ops, and func_proto
> in net/core/bpf_sk_storage.c?

Sure, I can also keep the sk_clone code their as well for now.

> 
> There is a test in map_tests/sk_storage_map.c, in case you may not notice.

I will try to make it generic as a part of this series. If it takes
too much time, I will send a separate patch for testing
inode_storage_map and till then we have some assurance with
test_local_storage in test_progs.

- KP

  reply	other threads:[~2020-06-29 19:37 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-17 20:29 [PATCH bpf-next v2 0/4] Generalizing bpf_local_storage KP Singh
2020-06-17 20:29 ` [PATCH bpf-next v2 1/4] bpf: Generalize bpf_sk_storage KP Singh
2020-06-19  6:43   ` Martin KaFai Lau
2020-06-29 16:01     ` KP Singh [this message]
2020-06-30 19:34       ` Martin KaFai Lau
2020-06-30 22:00         ` KP Singh
2020-07-06 18:56           ` Martin KaFai Lau
2020-06-17 20:29 ` [PATCH bpf-next v2 2/4] bpf: Implement bpf_local_storage for inodes KP Singh
2020-06-19  6:52   ` Martin KaFai Lau
2020-06-30 11:49     ` KP Singh
2020-06-22  9:40   ` Quentin Monnet
2020-06-17 20:29 ` [PATCH bpf-next v2 3/4] bpf: Allow local storage to be used from LSM programs KP Singh
2020-06-17 20:29 ` [PATCH bpf-next v2 4/4] bpf: Add selftests for local_storage KP Singh
2020-06-18 18:16   ` Andrii Nakryiko
2020-06-30 11:50     ` KP Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200629160100.GA171259@google.com \
    --to=kpsingh@chromium.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jannh@google.com \
    --cc=kafai@fb.com \
    --cc=linux-security-module@vger.kernel.org \
    --cc=pjt@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.