* [PATCH RFC bpf-next v1 0/8] Pinning bpf objects outside bpffs
@ 2022-01-06 21:50 Hao Luo
  2022-01-06 21:50 ` [PATCH RFC bpf-next v1 1/8] bpf: Support pinning in non-bpf file system Hao Luo
                   ` (9 more replies)
  0 siblings, 10 replies; 29+ messages in thread
From: Hao Luo @ 2022-01-06 21:50 UTC (permalink / raw)
  To: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh,
	Shakeel Butt, Joe Burton, Stanislav Fomichev, bpf, Hao Luo

Bpffs is a pseudo file system that persists bpf objects. Previously,
bpf objects could only be pinned in bpffs; this patchset extends pinning
to allow bpf objects to be pinned in (or exposed to) other file systems.

In particular, this patchset allows pinning bpf objects in kernfs. This
creates a new file entry in the kernfs file system and the created file
is able to reference the bpf object. By doing so, bpf can be used to
customize the file's operations, such as seq_show.

As a concrete use case for this feature, this patchset introduces a
simple new program type called 'bpf_view', which can be used to format
a seq file from a kernel object's state. By pinning a bpf_view program
into a cgroup directory, userspace can read the cgroup's state from a
file in a format defined by the bpf program.

Unlike bpffs, kernfs has no callback that runs when a kernfs node is
freed. This is a problem if we allow the kernfs node to hold an extra
reference to the bpf object, because there is no opportunity to
decrement the object's refcnt. Therefore the kernfs node created by
pinning does not hold a reference to the bpf object; instead, the
lifetime of the kernfs node depends on the lifetime of the bpf object.
Rather than "pinning in kernfs", this is "exposing to kernfs". We
require the bpf object to be pinned in bpffs before it can be exposed
in kernfs. When the object is unpinned from bpffs, its kernfs nodes are
removed automatically. In effect, this treats a pinned bpf object as a
persistent "device".

We rely on fsnotify to monitor the inode events in bpffs. A new function
bpf_watch_inode() is introduced. It allows registering a callback
function at inode destruction. For the kernfs case, a callback that
removes the kernfs node is registered at the destruction of bpffs inodes.
For other file systems such as sockfs, bpf_watch_inode() can monitor the
destruction of sockfs inodes and the created file entry can hold the bpf
object's reference. In this case, it is truly "pinning".
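
The callback-at-destruction idea described above can be illustrated with
a small userspace model. This is only a sketch, not the kernel
implementation: the patchset builds on fsnotify, and the signature of
bpf_watch_inode() is not shown in this message, so every name below
(model_watch_inode, model_destroy_inode, model_kernfs_node) is
hypothetical. It models the "exposing" case, in which the dependent
entry holds no reference and is removed when the watched inode goes
away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: none of these names come from the patchset. */

struct watch {
	void (*on_destroy)(void *data);	/* run once when the watched inode dies */
	void *data;
	struct watch *next;
};

struct model_inode {
	struct watch *watches;		/* singly linked list of callbacks */
};

/* Register a callback to run when @inode is destroyed. Returns 0 or -1. */
static int model_watch_inode(struct model_inode *inode,
			     void (*on_destroy)(void *data), void *data)
{
	struct watch *w;

	assert(inode && on_destroy);
	w = malloc(sizeof(*w));
	if (!w)
		return -1;
	w->on_destroy = on_destroy;
	w->data = data;
	w->next = inode->watches;
	inode->watches = w;
	return 0;
}

/* Destroy the inode: fire every callback exactly once, then free the watches. */
static void model_destroy_inode(struct model_inode *inode)
{
	struct watch *w = inode->watches;

	inode->watches = NULL;
	while (w) {
		struct watch *next = w->next;

		w->on_destroy(w->data);
		free(w);
		w = next;
	}
}

/* Stand-in for a kernfs entry that holds no reference on the bpf object. */
struct model_kernfs_node {
	int present;
};

/* Callback for the "exposing" case: drop the dependent entry. */
static void remove_kernfs_node(void *data)
{
	struct model_kernfs_node *kn = data;

	kn->present = 0;
}
```

A caller would register remove_kernfs_node() against the model inode
when the entry is created, and the entry disappears once
model_destroy_inode() runs. In the true "pinning" case the callback
would instead drop a reference that the created entry holds.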

File operations other than seq_show can also be implemented using bpf.
For example, bpf may be of help for .poll and .mmap in kernfs.

Patch organization:
 - patches 1/8 and 2/8 are preparations. 1/8 implements bpf_watch_inode();
   2/8 records the bpffs inode in the bpf object.
 - patches 3/8 and 4/8 implement generic logic for creating bpf-backed
   kernfs files.
 - patches 5/8 and 6/8 add a new program type for formatting output.
 - patch 7/8 implements the cgroup seq_show operation using bpf.
 - patch 8/8 adds a selftest.

Hao Luo (8):
  bpf: Support pinning in non-bpf file system.
  bpf: Record back pointer to the inode in bpffs
  bpf: Expose bpf object in kernfs
  bpf: Support removing kernfs entries
  bpf: Introduce a new program type bpf_view.
  libbpf: Support of bpf_view prog type.
  bpf: Add seq_show operation for bpf in cgroupfs
  selftests/bpf: Test exposing bpf objects in kernfs

 include/linux/bpf.h                           |   9 +-
 include/uapi/linux/bpf.h                      |   2 +
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/bpf_view.c                         | 190 ++++++++++++++
 kernel/bpf/bpf_view.h                         |  25 ++
 kernel/bpf/inode.c                            | 219 ++++++++++++++--
 kernel/bpf/inode.h                            |  54 ++++
 kernel/bpf/kernfs_node.c                      | 165 ++++++++++++
 kernel/bpf/syscall.c                          |   3 +
 kernel/bpf/verifier.c                         |   6 +
 kernel/trace/bpf_trace.c                      |  12 +-
 tools/include/uapi/linux/bpf.h                |   2 +
 tools/lib/bpf/libbpf.c                        |  21 ++
 .../selftests/bpf/prog_tests/pinning_kernfs.c | 245 ++++++++++++++++++
 .../selftests/bpf/progs/pinning_kernfs.c      |  72 +++++
 15 files changed, 995 insertions(+), 32 deletions(-)
 create mode 100644 kernel/bpf/bpf_view.c
 create mode 100644 kernel/bpf/bpf_view.h
 create mode 100644 kernel/bpf/inode.h
 create mode 100644 kernel/bpf/kernfs_node.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/pinning_kernfs.c
 create mode 100644 tools/testing/selftests/bpf/progs/pinning_kernfs.c

-- 
2.34.1.448.ga2b2bfdf31-goog


* Re: [PATCH RFC bpf-next v1 5/8] bpf: Introduce a new program type bpf_view.
@ 2022-01-10  7:06 kernel test robot
  0 siblings, 0 replies; 29+ messages in thread
From: kernel test robot @ 2022-01-10  7:06 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 15611 bytes --]

CC: llvm(a)lists.linux.dev
CC: kbuild-all(a)lists.01.org
In-Reply-To: <20220106215059.2308931-6-haoluo@google.com>
References: <20220106215059.2308931-6-haoluo@google.com>
TO: Hao Luo <haoluo@google.com>

Hi Hao,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Hao-Luo/Pinning-bpf-objects-outside-bpffs/20220107-055252
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
:::::: branch date: 3 days ago
:::::: commit date: 3 days ago
config: arm-randconfig-c002-20220107 (https://download.01.org/0day-ci/archive/20220110/202201101429.2XquVD3A-lkp(a)intel.com/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 32167bfe64a4c5dd4eb3f7a58e24f4cba76f5ac2)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm cross compiling tool for clang build
        # apt-get install binutils-arm-linux-gnueabi
        # https://github.com/0day-ci/linux/commit/f3a5b66e45ed0d7bdc610cce2e0b6a3c606dbb95
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Hao-Luo/Pinning-bpf-objects-outside-bpffs/20220107-055252
        git checkout f3a5b66e45ed0d7bdc610cce2e0b6a3c606dbb95
        # save the config file to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm clang-analyzer 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


clang-analyzer warnings: (new ones prefixed by >>)
   fs/ext4/mballoc.c:5817:3: note: Value stored to 'err' is never read
                   err = PTR_ERR(bitmap_bh);
                   ^     ~~~~~~~~~~~~~~~~~~
   Suppressed 2 warnings (2 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   2 warnings generated.
   fs/xfs/libxfs/xfs_attr.c:1243:2: warning: Value stored to 'error' is never read [clang-analyzer-deadcode.DeadStores]
           error = xfs_attr_node_removename(args, state);
           ^       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/xfs/libxfs/xfs_attr.c:1243:2: note: Value stored to 'error' is never read
           error = xfs_attr_node_removename(args, state);
           ^       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   3 warnings generated.
   fs/xfs/libxfs/xfs_attr_leaf.c:2243:29: warning: Value stored to 'drop_leaf' during its initialization is never read [clang-analyzer-deadcode.DeadStores]
           struct xfs_attr_leafblock *drop_leaf = drop_blk->bp->b_addr;
                                      ^~~~~~~~~   ~~~~~~~~~~~~~~~~~~~~
   fs/xfs/libxfs/xfs_attr_leaf.c:2243:29: note: Value stored to 'drop_leaf' during its initialization is never read
           struct xfs_attr_leafblock *drop_leaf = drop_blk->bp->b_addr;
                                      ^~~~~~~~~   ~~~~~~~~~~~~~~~~~~~~
   fs/xfs/libxfs/xfs_attr_leaf.c:2244:29: warning: Value stored to 'save_leaf' during its initialization is never read [clang-analyzer-deadcode.DeadStores]
           struct xfs_attr_leafblock *save_leaf = save_blk->bp->b_addr;
                                      ^~~~~~~~~   ~~~~~~~~~~~~~~~~~~~~
   fs/xfs/libxfs/xfs_attr_leaf.c:2244:29: note: Value stored to 'save_leaf' during its initialization is never read
           struct xfs_attr_leafblock *save_leaf = save_blk->bp->b_addr;
                                      ^~~~~~~~~   ~~~~~~~~~~~~~~~~~~~~
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   4 warnings generated.
   fs/xfs/xfs_reflink.c:1151:3: warning: Value stored to 'qdelta' is never read [clang-analyzer-deadcode.DeadStores]
                   qdelta += dmap->br_blockcount;
                   ^         ~~~~~~~~~~~~~~~~~~~
   fs/xfs/xfs_reflink.c:1151:3: note: Value stored to 'qdelta' is never read
                   qdelta += dmap->br_blockcount;
                   ^         ~~~~~~~~~~~~~~~~~~~
   fs/xfs/xfs_reflink.c:1326:2: warning: Value stored to 'ret' is never read [clang-analyzer-deadcode.DeadStores]
           ret = -EINVAL;
           ^     ~~~~~~~
   fs/xfs/xfs_reflink.c:1326:2: note: Value stored to 'ret' is never read
           ret = -EINVAL;
           ^     ~~~~~~~
   Suppressed 2 warnings (1 in non-user code, 1 with check filters).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   drivers/clk/clk-max9485.c:199:9: warning: Access to field 'out' results in a dereference of a null pointer (loaded from variable 'prev') [clang-analyzer-core.NullDereference]
           return prev->out;
                  ^~~~
   drivers/clk/clk-max9485.c:165:36: note: 'prev' initialized to a null pointer value
           const struct max9485_rate *curr, *prev = NULL;
                                             ^~~~
   drivers/clk/clk-max9485.c:167:29: note: Assuming field 'out' is equal to 0
           for (curr = max9485_rates; curr->out != 0; curr++) {
                                      ^~~~~~~~~~~~~~
   drivers/clk/clk-max9485.c:167:2: note: Loop condition is false. Execution continues on line 199
           for (curr = max9485_rates; curr->out != 0; curr++) {
           ^
   drivers/clk/clk-max9485.c:199:9: note: Access to field 'out' results in a dereference of a null pointer (loaded from variable 'prev')
           return prev->out;
                  ^~~~
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   2 warnings generated.
>> kernel/bpf/bpf_view.c:151:36: warning: Array subscript is undefined [clang-analyzer-core.uninitialized.ArraySubscript]
                   target->ctx_arg_info[i].btf_id = bpf_view_btf_ids[idx[i]];
                                                    ^
   kernel/bpf/bpf_view.c:174:2: note: Calling 'register_bpf_view_target'
           register_bpf_view_target(&cgroup_view_tinfo, cgroup_view_idx);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   kernel/bpf/bpf_view.c:150:14: note: Assuming 'i' is < field 'ctx_arg_info_size'
           for (i = 0; i < target->ctx_arg_info_size; ++i)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   kernel/bpf/bpf_view.c:150:2: note: Loop condition is true.  Entering loop body
           for (i = 0; i < target->ctx_arg_info_size; ++i)
           ^
   kernel/bpf/bpf_view.c:150:14: note: Assuming 'i' is < field 'ctx_arg_info_size'
           for (i = 0; i < target->ctx_arg_info_size; ++i)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   kernel/bpf/bpf_view.c:150:2: note: Loop condition is true.  Entering loop body
           for (i = 0; i < target->ctx_arg_info_size; ++i)
           ^
   kernel/bpf/bpf_view.c:150:45: note: The value 2 is assigned to 'i'
           for (i = 0; i < target->ctx_arg_info_size; ++i)
                                                      ^~~
   kernel/bpf/bpf_view.c:150:14: note: Assuming 'i' is < field 'ctx_arg_info_size'
           for (i = 0; i < target->ctx_arg_info_size; ++i)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   kernel/bpf/bpf_view.c:150:2: note: Loop condition is true.  Entering loop body
           for (i = 0; i < target->ctx_arg_info_size; ++i)
           ^
   kernel/bpf/bpf_view.c:151:36: note: Array subscript is undefined
                   target->ctx_arg_info[i].btf_id = bpf_view_btf_ids[idx[i]];
                                                    ^                ~~~~~~
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   1 warning generated.
   Suppressed 1 warnings (1 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   3 warnings generated.
   mm/slab.c:1645:2: warning: Assigned value is garbage or undefined [clang-analyzer-core.uninitialized.Assign]
           list_for_each_entry_safe(page, n, list, slab_list) {
           ^
   include/linux/list.h:718:7: note: expanded from macro 'list_for_each_entry_safe'
                   n = list_next_entry(pos, member);                       \
                       ^
   include/linux/list.h:557:2: note: expanded from macro 'list_next_entry'
           list_entry((pos)->member.next, typeof(*(pos)), member)
           ^
   include/linux/list.h:513:2: note: expanded from macro 'list_entry'
           container_of(ptr, type, member)
           ^
   include/linux/container_of.h:18:2: note: expanded from macro 'container_of'
           void *__mptr = (void *)(ptr);                                   \
           ^
   mm/slab.c:4135:6: note: Assuming 'count' is <= MAX_SLABINFO_WRITE
           if (count > MAX_SLABINFO_WRITE)
               ^~~~~~~~~~~~~~~~~~~~~~~~~~
   mm/slab.c:4135:2: note: Taking false branch
           if (count > MAX_SLABINFO_WRITE)
           ^
   mm/slab.c:4137:2: note: Taking false branch
           if (copy_from_user(&kbuf, buffer, count))
           ^
   mm/slab.c:4142:6: note: Assuming 'tmp' is non-null
           if (!tmp)
               ^~~~
   mm/slab.c:4142:2: note: Taking false branch
           if (!tmp)
           ^
   mm/slab.c:4146:6: note: Assuming the condition is false
           if (sscanf(tmp, " %d %d %d", &limit, &batchcount, &shared) != 3)
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   mm/slab.c:4146:2: note: Taking false branch
           if (sscanf(tmp, " %d %d %d", &limit, &batchcount, &shared) != 3)
           ^
   mm/slab.c:4152:2: note: Loop condition is true.  Entering loop body
           list_for_each_entry(cachep, &slab_caches, list) {
           ^
   include/linux/list.h:630:2: note: expanded from macro 'list_for_each_entry'
           for (pos = list_first_entry(head, typeof(*pos), member);        \
           ^
   mm/slab.c:4153:7: note: Assuming the condition is true
                   if (!strcmp(cachep->name, kbuf)) {
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~
   mm/slab.c:4153:3: note: Taking true branch
                   if (!strcmp(cachep->name, kbuf)) {
                   ^
   mm/slab.c:4154:8: note: Assuming 'limit' is >= 1
                           if (limit < 1 || batchcount < 1 ||
                               ^~~~~~~~~
   mm/slab.c:4154:8: note: Left side of '||' is false
   mm/slab.c:4154:21: note: Assuming 'batchcount' is >= 1
                           if (limit < 1 || batchcount < 1 ||
                                            ^~~~~~~~~~~~~~
   mm/slab.c:4154:8: note: Left side of '||' is false
                           if (limit < 1 || batchcount < 1 ||
                               ^
   mm/slab.c:4155:6: note: Assuming 'batchcount' is <= 'limit'
                                           batchcount > limit || shared < 0) {
                                           ^~~~~~~~~~~~~~~~~~
   mm/slab.c:4154:8: note: Left side of '||' is false

vim +151 kernel/bpf/bpf_view.c

f3a5b66e45ed0d Hao Luo 2022-01-06  144  
f3a5b66e45ed0d Hao Luo 2022-01-06  145  static void register_bpf_view_target(struct bpf_view_target_info *target,
f3a5b66e45ed0d Hao Luo 2022-01-06  146  				     int idx[BPF_VIEW_CTX_ARG_MAX])
f3a5b66e45ed0d Hao Luo 2022-01-06  147  {
f3a5b66e45ed0d Hao Luo 2022-01-06  148  	int i;
f3a5b66e45ed0d Hao Luo 2022-01-06  149  
f3a5b66e45ed0d Hao Luo 2022-01-06  150  	for (i = 0; i < target->ctx_arg_info_size; ++i)
f3a5b66e45ed0d Hao Luo 2022-01-06 @151  		target->ctx_arg_info[i].btf_id = bpf_view_btf_ids[idx[i]];
f3a5b66e45ed0d Hao Luo 2022-01-06  152  
f3a5b66e45ed0d Hao Luo 2022-01-06  153  	INIT_LIST_HEAD(&target->list);
f3a5b66e45ed0d Hao Luo 2022-01-06  154  	list_add(&target->list, &targets);
f3a5b66e45ed0d Hao Luo 2022-01-06  155  }
f3a5b66e45ed0d Hao Luo 2022-01-06  156  
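
The analyzer's path reaches i == 2 and flags idx[2] as undefined. One
plausible reading (an assumption, since the definition of
cgroup_view_idx is not quoted in this report) is that the caller
initialises fewer entries than ctx_arg_info_size allows the loop to
read. A minimal userspace sketch of one way to make the bound explicit
follows; the struct layout, CTX_ARG_MAX, and fill_btf_ids() are
hypothetical stand-ins, not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define CTX_ARG_MAX 3	/* hypothetical stand-in for BPF_VIEW_CTX_ARG_MAX */

struct arg_info {
	unsigned int btf_id;
};

struct target_info {
	struct arg_info ctx_arg_info[CTX_ARG_MAX];
	size_t ctx_arg_info_size;	/* number of entries the target claims */
};

/*
 * Copy ids into the target, reading idx[] only within the nr_idx entries
 * the caller initialised. Returns 0 on success, or -1 if the target
 * claims more arguments than the caller supplied or an index is out of
 * range. On failure the target may be partially filled.
 */
static int fill_btf_ids(struct target_info *target,
			const unsigned int *ids, size_t nr_ids,
			const int *idx, size_t nr_idx)
{
	size_t i;

	if (target->ctx_arg_info_size > CTX_ARG_MAX ||
	    target->ctx_arg_info_size > nr_idx)
		return -1;

	for (i = 0; i < target->ctx_arg_info_size; ++i) {
		if (idx[i] < 0 || (size_t)idx[i] >= nr_ids)
			return -1;
		target->ctx_arg_info[i].btf_id = ids[idx[i]];
	}
	return 0;
}
```

Passing the element count alongside the array lets both the function
and the analyzer see how many entries are valid, rather than relying on
the array-size annotation in the parameter list, which C does not
enforce.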

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org


end of thread, other threads:[~2022-01-12 19:20 UTC | newest]

Thread overview: 29+ messages
2022-01-06 21:50 [PATCH RFC bpf-next v1 0/8] Pinning bpf objects outside bpffs Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 1/8] bpf: Support pinning in non-bpf file system Hao Luo
2022-01-07  0:04   ` kernel test robot
2022-01-07  0:33   ` Yonghong Song
2022-01-08  0:41   ` kernel test robot
2022-01-08  0:41     ` kernel test robot
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 2/8] bpf: Record back pointer to the inode in bpffs Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 3/8] bpf: Expose bpf object in kernfs Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 4/8] bpf: Support removing kernfs entries Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 5/8] bpf: Introduce a new program type bpf_view Hao Luo
2022-01-07  0:35   ` Yonghong Song
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 6/8] libbpf: Support of bpf_view prog type Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 7/8] bpf: Add seq_show operation for bpf in cgroupfs Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 8/8] selftests/bpf: Test exposing bpf objects in kernfs Hao Luo
2022-01-06 23:02 ` [PATCH RFC bpf-next v1 0/8] Pinning bpf objects outside bpffs sdf
2022-01-07 18:59   ` Hao Luo
2022-01-07 19:25     ` sdf
2022-01-10 18:55       ` Hao Luo
2022-01-10 19:22         ` Stanislav Fomichev
2022-01-11  3:33         ` Alexei Starovoitov
2022-01-11 17:06           ` Stanislav Fomichev
2022-01-11 18:20           ` Hao Luo
2022-01-12 18:55             ` Song Liu
2022-01-12 19:19               ` Hao Luo
2022-01-07  0:30 ` Yonghong Song
2022-01-07 20:43   ` Hao Luo
2022-01-10 17:30     ` Yonghong Song
2022-01-10 18:56       ` Hao Luo
2022-01-10  7:06 [PATCH RFC bpf-next v1 5/8] bpf: Introduce a new program type bpf_view kernel test robot
