* Create inner maps dynamically from ebpf kernel prog program
From: rainkin @ 2021-06-21 13:12 UTC
  To: bpf

Hi,

My eBPF program is attached to kprobe/vfs_read. My use case is to store
information about each file (i.e., inode) of each process by using
map-in-map (e.g., the outer map is a hash map where the key is the pid and
the value is an inner map, where the key is the inode and the value is some
stateful information I want to store).
Thus I need to create a new inner map for each new incoming inode.

I know that local storage exists for tasks and inodes; however, my
kernel version (4.1x) is too old to support it.

I tried two methods:
1. Dynamically create a new inner map in the user-land program by
following this example:
https://github.com/torvalds/linux/blob/master/samples/bpf/test_map_in_map_user.c
Then insert the new inner map into the outer map.
The limitation of this method:
It requires the eBPF kernel program to send a message to the user-land
program to create the new inner map,
and the eBPF kernel program might access the map before the user-land
program finishes the job.

2. Thus, I prefer the second method: dynamically create inner maps in
the kernel eBPF program.
According to the discussion in the following thread, it seems that this
can be done by calling bpf_map_update_elem():
https://lore.kernel.org/bpf/878sdlpv92.fsf@toke.dk/T/#e9bac624324ffd3efb0c9f600426306e3a40ec7b5
> Creating a new map for map_in_map from bpf prog can be implemented.
> bpf_map_update_elem() is doing memory allocation for map elements. In such a case calling
> this helper on map_in_map can, in theory, create a new inner map and insert it into the outer map.

However, when I call this helper to create a new inner map, the verifier returns this error:
64: (bf) r2 = r10
65: (07) r2 += -144
66: (bf) r3 = r10
67: (07) r3 += -176
; bpf_map_update_elem(&outer, &ino, &new_inner, BPF_ANY);
68: (18) r1 = 0xffff8dfb7399e400
70: (b7) r4 = 0
71: (85) call bpf_map_update_elem#2
cannot pass map_type 13 into func bpf_map_update_elem#2

new_inner is a structure of inner hashmap.

Any suggestions?
Thanks,
Rainkin


* Re: Create inner maps dynamically from ebpf kernel prog program
From: Yonghong Song @ 2021-06-22  5:55 UTC
  To: rainkin, bpf



On 6/21/21 6:12 AM, rainkin wrote:
> Hi,
> 
> My ebpf program is attached to kprobe/vfs_read, my use case is to store
> information of each file (i.e., inode) of each process by using
> map-in-map (e.g., outer map is a hash map where key is pid, value is an
> inner map where key is inode, value is some stateful information I
> want to store).
> Thus I need to create a new inner map for a new coming inode.
> 
> I know there exists local storage for task/inode, however, limited to
> my kernel version (4.1x), those local storage cannot be used.
> 
> I tried two methods:
> 1. dynamically create a new inner in user-land ebpf program by
> following this tutorial:
> https://github.com/torvalds/linux/blob/master/samples/bpf/test_map_in_map_user.c
> Then insert the new inner map into the outer map.
> The limitation of this method:
> It requires ebpf kernel program send a message to user-land program to
> create a newly inner map.
> And ebpf kernel programs might access the map before user-land program
> finishes the job.
> 
> 2. Thus, i prefer the second method: dynamically create inner maps in
> the kernel ebpf program.
> According to the discussion in the following thread, it seems that it
> can be done by calling bpf_map_update_elem():
> https://lore.kernel.org/bpf/878sdlpv92.fsf@toke.dk/T/#e9bac624324ffd3efb0c9f600426306e3a40ec7b5
>> Creating a new map for map_in_map from bpf prog can be implemented.
>> bpf_map_update_elem() is doing memory allocation for map elements. In such a case calling
>> this helper on map_in_map can, in theory, create a new inner map and insert it into the outer map.
> 
> However, when I call method to create a new inner, it returns the error:
> 64: (bf) r2 = r10
> 65: (07) r2 += -144
> 66: (bf) r3 = r10
> 67: (07) r3 += -176
> ; bpf_map_update_elem(&outer, &ino, &new_inner, BPF_ANY);
> 68: (18) r1 = 0xffff8dfb7399e400
> 70: (b7) r4 = 0
> 71: (85) call bpf_map_update_elem#2
> cannot pass map_type 13 into func bpf_map_update_elem#2

This is expected given the current verifier implementation.
In the verifier's check_map_func_compatibility() function, we have

         case BPF_MAP_TYPE_ARRAY_OF_MAPS:
         case BPF_MAP_TYPE_HASH_OF_MAPS:
                 if (func_id != BPF_FUNC_map_lookup_elem)
                         goto error;
                 break;

For array/hash map-in-map, the only supported helper
is bpf_map_lookup_elem(). bpf_map_update_elem()
is not supported yet.
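
The rule quoted above can be modeled in a few lines of plain C. This is only a sketch for illustration: helper_allowed() is a hypothetical name, the enums are local copies of the values I believe the UAPI header uses (map type 13 is the one named in the error message), and the kernel function handles many more map types and helpers than the single branch shown here.

```c
#include <stdbool.h>

/* Local copies of a few UAPI enum values, listed here only for the
 * illustration; the real definitions live in include/uapi/linux/bpf.h. */
enum bpf_map_type {
    BPF_MAP_TYPE_HASH = 1,
    BPF_MAP_TYPE_ARRAY = 2,
    BPF_MAP_TYPE_ARRAY_OF_MAPS = 12,
    BPF_MAP_TYPE_HASH_OF_MAPS = 13,
};

enum bpf_func_id {
    BPF_FUNC_map_lookup_elem = 1,
    BPF_FUNC_map_update_elem = 2,
    BPF_FUNC_map_delete_elem = 3,
};

/* Hypothetical, simplified model of the quoted branch: for the two
 * map-in-map types, only the lookup helper passes the check. All other
 * map types are treated as unrestricted here, which is NOT what the
 * kernel does; it is an assumption that keeps the sketch short. */
static bool helper_allowed(enum bpf_map_type type, enum bpf_func_id func)
{
    switch (type) {
    case BPF_MAP_TYPE_ARRAY_OF_MAPS:
    case BPF_MAP_TYPE_HASH_OF_MAPS:
        return func == BPF_FUNC_map_lookup_elem;
    default:
        return true;
    }
}
```

With this model, helper_allowed(BPF_MAP_TYPE_HASH_OF_MAPS, BPF_FUNC_map_update_elem) is false, which corresponds to the "cannot pass map_type 13 into func bpf_map_update_elem#2" rejection in the log.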

For your method #1, the bpf helper bpf_send_signal() or
bpf_send_signal_thread() might help to send some info
to user space, but I think they are not available in
4.x kernels.

Maybe a single map with key (pid, inode) would work?
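
A composite key for such a flat map could look like the sketch below. The struct name, field names, and fill_key() are hypothetical and not taken from this thread. The layout assumes that hash maps hash and compare the raw key bytes, so the struct carries an explicit pad field instead of relying on implicit padding, and every byte is zeroed before the fields are set.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical composite key: one entry per (pid, inode) pair.
 * The explicit pad field keeps the layout free of implicit padding. */
struct file_key {
    uint64_t ino;  /* inode number */
    uint32_t pid;  /* process id */
    uint32_t pad;  /* always zero */
};

_Static_assert(sizeof(struct file_key) == 16, "unexpected key layout");

/* Build a key so that equal (pid, inode) pairs are byte-for-byte equal. */
static void fill_key(struct file_key *k, uint32_t pid, uint64_t ino)
{
    memset(k, 0, sizeof(*k));  /* clears pad along with the other fields */
    k->ino = ino;
    k->pid = pid;
}
```

The same struct would then serve as the key type of a single BPF_MAP_TYPE_HASH map, with the stateful information as the value.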

> 
> new_inner is a structure of inner hashmap.
> 
> Any suggestions?
> Thanks,
> Rainkin
> 


* Re: Create inner maps dynamically from ebpf kernel prog program
From: rainkin @ 2021-06-22  6:47 UTC
  To: Yonghong Song; +Cc: bpf

>
>
>
> On 6/21/21 6:12 AM, rainkin wrote:
> > Hi,
> >
> > My ebpf program is attached to kprobe/vfs_read, my use case is to store
> > information of each file (i.e., inode) of each process by using
> > map-in-map (e.g., outer map is a hash map where key is pid, value is an
> > inner map where key is inode, value is some stateful information I
> > want to store).
> > Thus I need to create a new inner map for a new coming inode.
> >
> > I know there exists local storage for task/inode, however, limited to
> > my kernel version (4.1x), those local storage cannot be used.
> >
> > I tried two methods:
> > 1. dynamically create a new inner in user-land ebpf program by
> > following this tutorial:
> > https://github.com/torvalds/linux/blob/master/samples/bpf/test_map_in_map_user.c
> > Then insert the new inner map into the outer map.
> > The limitation of this method:
> > It requires ebpf kernel program send a message to user-land program to
> > create a newly inner map.
> > And ebpf kernel programs might access the map before user-land program
> > finishes the job.
> >
> > 2. Thus, i prefer the second method: dynamically create inner maps in
> > the kernel ebpf program.
> > According to the discussion in the following thread, it seems that it
> > can be done by calling bpf_map_update_elem():
> > https://lore.kernel.org/bpf/878sdlpv92.fsf@toke.dk/T/#e9bac624324ffd3efb0c9f600426306e3a40ec7b5
> >> Creating a new map for map_in_map from bpf prog can be implemented.
> >> bpf_map_update_elem() is doing memory allocation for map elements. In such a case calling
> >> this helper on map_in_map can, in theory, create a new inner map and insert it into the outer map.
> >
> > However, when I call method to create a new inner, it returns the error:
> > 64: (bf) r2 = r10
> > 65: (07) r2 += -144
> > 66: (bf) r3 = r10
> > 67: (07) r3 += -176
> > ; bpf_map_update_elem(&outer, &ino, &new_inner, BPF_ANY);
> > 68: (18) r1 = 0xffff8dfb7399e400
> > 70: (b7) r4 = 0
> > 71: (85) call bpf_map_update_elem#2
> > cannot pass map_type 13 into func bpf_map_update_elem#2
>
> This is expected based on current verifier implementation.
> In verifier check_map_func_compatibility() function, we have
>
>          case BPF_MAP_TYPE_ARRAY_OF_MAPS:
>          case BPF_MAP_TYPE_HASH_OF_MAPS:
>                  if (func_id != BPF_FUNC_map_lookup_elem)
>                          goto error;
>                  break;
>
> For array/hash map-in-map, the only supported helper
> is bpf_map_lookup_elem(). bpf_map_update_elem()
> is not supported yet.

Thanks for your answer!
If I understand correctly, the conclusion is that (at least for now)
an *eBPF kernel program*
CAN only do lookups on an array/hash map-in-map, CANNOT
add/update/delete entries in an array/hash
map-in-map, and CANNOT create regular hash/array maps dynamically.


>
> For your method #1, the bpf helper bpf_send_signal() or
> bpf_send_signal_thread() might help to send some info
> to user space, but I think they are not available in
> 4.x kernels.
>
> Maybe a single map with key (pid, inode) may work?
>
> >
> > new_inner is a structure of inner hashmap.
> >
> > Any suggestions?
> > Thanks,
> > Rainkin
> >

A single map with key (pid, inode) is OK for the above scenario. However,
when I want to clean up all entries related to a certain pid when a
process exits,
a single map is NOT OK: I need to go through all the keys of the
single map and delete the keys related
to that pid.
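
The cost described here can be sketched with a small in-memory stand-in for the flat map. Everything in this block is hypothetical (struct entry, purge_pid(), count_used()); it is not BPF code and only models the shape of the work. Against a real map, a user-space cleanup would presumably walk the keys with libbpf's bpf_map_get_next_key() and remove matches with bpf_map_delete_elem(), which visits every key in the same way.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one slot of a flat (pid, inode) map. */
struct entry {
    bool     used;
    uint32_t pid;
    uint64_t ino;
    uint64_t state;  /* the per-file information being tracked */
};

/* Remove every entry that belongs to pid and return how many were removed.
 * Every slot is visited, so the cost grows with the size of the whole
 * table, not with the number of files the exiting process touched. */
static size_t purge_pid(struct entry *table, size_t nslots, uint32_t pid)
{
    size_t removed = 0;

    for (size_t i = 0; i < nslots; i++) {
        if (table[i].used && table[i].pid == pid) {
            table[i].used = false;
            removed++;
        }
    }
    return removed;
}

/* Count the slots that still hold an entry. */
static size_t count_used(const struct entry *table, size_t nslots)
{
    size_t n = 0;

    for (size_t i = 0; i < nslots; i++)
        if (table[i].used)
            n++;
    return n;
}
```

With map-in-map the same cleanup would be a single delete of the outer entry for the pid, which is why the flat layout is the more expensive one at process exit.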


* Re: Create inner maps dynamically from ebpf kernel prog program
From: Yonghong Song @ 2021-06-22 15:40 UTC
  To: rainkin; +Cc: bpf



On 6/21/21 11:47 PM, rainkin wrote:
>>
>>
>>
>> On 6/21/21 6:12 AM, rainkin wrote:
>>> Hi,
>>>
>>> My ebpf program is attached to kprobe/vfs_read, my use case is to store
>>> information of each file (i.e., inode) of each process by using
>>> map-in-map (e.g., outer map is a hash map where key is pid, value is an
>>> inner map where key is inode, value is some stateful information I
>>> want to store).
>>> Thus I need to create a new inner map for a new coming inode.
>>>
>>> I know there exists local storage for task/inode, however, limited to
>>> my kernel version (4.1x), those local storage cannot be used.
>>>
>>> I tried two methods:
>>> 1. dynamically create a new inner in user-land ebpf program by
>>> following this tutorial:
>>> https://github.com/torvalds/linux/blob/master/samples/bpf/test_map_in_map_user.c
>>> Then insert the new inner map into the outer map.
>>> The limitation of this method:
>>> It requires ebpf kernel program send a message to user-land program to
>>> create a newly inner map.
>>> And ebpf kernel programs might access the map before user-land program
>>> finishes the job.
>>>
>>> 2. Thus, i prefer the second method: dynamically create inner maps in
>>> the kernel ebpf program.
>>> According to the discussion in the following thread, it seems that it
>>> can be done by calling bpf_map_update_elem():
>>> https://lore.kernel.org/bpf/878sdlpv92.fsf@toke.dk/T/#e9bac624324ffd3efb0c9f600426306e3a40ec7b5
>>>> Creating a new map for map_in_map from bpf prog can be implemented.
>>>> bpf_map_update_elem() is doing memory allocation for map elements. In such a case calling
>>>> this helper on map_in_map can, in theory, create a new inner map and insert it into the outer map.
>>>
>>> However, when I call method to create a new inner, it returns the error:
>>> 64: (bf) r2 = r10
>>> 65: (07) r2 += -144
>>> 66: (bf) r3 = r10
>>> 67: (07) r3 += -176
>>> ; bpf_map_update_elem(&outer, &ino, &new_inner, BPF_ANY);
>>> 68: (18) r1 = 0xffff8dfb7399e400
>>> 70: (b7) r4 = 0
>>> 71: (85) call bpf_map_update_elem#2
>>> cannot pass map_type 13 into func bpf_map_update_elem#2
>>
>> This is expected based on current verifier implementation.
>> In verifier check_map_func_compatibility() function, we have
>>
>>           case BPF_MAP_TYPE_ARRAY_OF_MAPS:
>>           case BPF_MAP_TYPE_HASH_OF_MAPS:
>>                   if (func_id != BPF_FUNC_map_lookup_elem)
>>                           goto error;
>>                   break;
>>
>> For array/hash map-in-map, the only supported helper
>> is bpf_map_lookup_elem(). bpf_map_update_elem()
>> is not supported yet.
> 
> Thanks for your answer!
> If I understand correctly, the conclusion is that (at least for now)
> *ebpf kernel program*
> CAN only do lookup for array/hash map-in-map, and CANNOT do
> add/update/delete for array/hash
> map-in-map, and CANNOT create regular hash/array maps dynamically.

Right.

> 
> 
>>
>> For your method #1, the bpf helper bpf_send_signal() or
>> bpf_send_signal_thread() might help to send some info
>> to user space, but I think they are not available in
>> 4.x kernels.
>>
>> Maybe a single map with key (pid, inode) may work?
>>
>>>
>>> new_inner is a structure of inner hashmap.
>>>
>>> Any suggestions?
>>> Thanks,
>>> Rainkin
>>>
> 
> a single map with key (pid, inode) is ok for the above scenario, however,
> when I want to clean up all entries related to a certain pid when a
> process exits,
> a single map is NOT ok. I need to go through all the keys of the
> single map and delete keys related
> to the certain pid.

I understand this, and I totally agree that the cleanup is expensive.

In such cases, map_in_map is the best strategy.
Alexei recently added support for calling the bpf create_map/update_map
syscall commands from a bpf program ([1]). This requires a new program
type, though.

In your particular case, you are doing kprobe/vfs_read, which runs
in process context at the beginning of a syscall, so it is probably
safe to call the create/update_map syscalls (I did not look at the
kernel code thoroughly). But the verifier needs to ensure it is
indeed safe. There is some ongoing compiler annotation work ([2]),
which may help annotate such functions so that the verifier can do
its job effectively.

BTW, this is all future work. For now, especially if you are using
4.1x kernels, I guess (pid, inode) is probably your best shot.


[1] https://lore.kernel.org/bpf/20210514003623.28033-2-alexei.starovoitov@gmail.com/
[2] https://reviews.llvm.org/D103667
