bpf.vger.kernel.org archive mirror
* Shared Umem between processes
@ 2020-03-11 15:58 Gaul, Maximilian
  2020-03-12  7:55 ` Björn Töpel
  0 siblings, 1 reply; 3+ messages in thread
From: Gaul, Maximilian @ 2020-03-11 15:58 UTC (permalink / raw)
  To: bpf

Hello everyone,


I am not sure if this is the correct address for my question / problem, but I was referred to this list from the libbpf GitHub issue tracker, so this is my excuse.


Some background information first: my program is largely based on https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf


I am currently trying to build an application that lets me process multiple UDP multicast streams in parallel (each with up to several tens of thousands of packets per second).


My first solution was to steer each multicast stream to a separate RX-Queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as many user-space processes (each with a separate AF_XDP socket connected to one of the RX-Queues) as there are streams to process.


But because this solution is limited by the number of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.



As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?

I ran into the problem that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure what information the sub-process (the one which is using the umem from another process) needs, so I figured the simplest solution would be to just copy the whole umem struct.



So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process. This sub-process then reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags for shared umem accordingly.



After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). After that call, the sub-process returns early from `xsk_configure_socket`, because I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` afterwards:




static struct xsk_socket_info *xsk_configure_socket(struct config *cfg,
						    struct xsk_umem_info *umem)
{
	struct xsk_socket_config xsk_cfg;
	struct xsk_socket_info *xsk_info;
	uint32_t idx;
	uint32_t prog_id = 0;
	int i;
	int ret;

	xsk_info = calloc(1, sizeof(*xsk_info));
	if (!xsk_info)
		return NULL;

	xsk_info->umem = umem;
	xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
	xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
	xsk_cfg.libbpf_flags = 0;
	xsk_cfg.xdp_flags = cfg->xdp_flags;
	xsk_cfg.bind_flags = cfg->xsk_bind_flags;
	ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname,
				 cfg->xsk_if_queue, umem->umem,
				 &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
	if (ret) {
		fprintf(stderr, "FAIL 1\n");
		goto error_exit;
	}

	ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
	if (ret) {
		fprintf(stderr, "FAIL 2\n");
		goto error_exit;
	}

	/* Initialize umem frame allocation */
	for (i = 0; i < NUM_FRAMES; i++)
		xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;

	xsk_info->umem_frame_free = NUM_FRAMES;

	if (cfg->use_shrd_umem) {
		return xsk_info;
	}
        ...
}

Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:

However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.

from https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag

I didn't know that libbpf loads an XDP program. Why would it do that? I am using my own AF_XDP program, which filters for UDP packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the AF_XDP socket fd is not put into the kernel `xsks_map`, which basically means that I don't receive any packets.

As you have probably already noticed, I am overwhelmed by the concept of shared umem, and I have to say that there is no documentation about it besides the two sentences in https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail on a Linux mailing list from Nov. 2019 stating that this feature is now implemented.

Can you please help?

Best regards

Max


* Re: Shared Umem between processes
  2020-03-11 15:58 Shared Umem between processes Gaul, Maximilian
@ 2020-03-12  7:55 ` Björn Töpel
  2020-03-12  8:20   ` AW: " Gaul, Maximilian
  0 siblings, 1 reply; 3+ messages in thread
From: Björn Töpel @ 2020-03-12  7:55 UTC (permalink / raw)
  To: Gaul, Maximilian, Xdp; +Cc: bpf

On Wed, 11 Mar 2020 at 16:59, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
> Hello everyone,
>

Hi! I'm moving this to the XDP newbies list, which is a more suitable
place for this kind of discussion!

>
> I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.
>
>
> Just a few information at the start of this e-mail: My program is largely based on: https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf
>
>
> I am currently trying to build an application that enables me to process multiple udp-multicast streams at once in parallel (each with up to several ten-thousands of packets per second).
>
>
> My first solution was to steer each multicast-stream on a separate RX-Queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as much user-space processes (each with a separate AF-XDP socket connected to one of the RX-Queues) as there are streams to process.
>
>
> But because this solution is limited to the amount of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
>

Let's start with defining what shared-umem is: The idea is to share
the same umem, fill ring, and completion ring for multiple
sockets. The sockets sharing that umem/fr/cr are tied (bound) to one
hardware ring. It's a mechanism to load-balance a HW queue over
multiple sockets.

If I'm reading you correctly, you'd like a solution:

           hw_q0,
xsk_q0_0, xsk_q0_1, xsk_q0_2,

instead of:

hw_q0,    hw_q1,    hw_q2,
xsk_q0_0, xsk_q1_0, xsk_q2_0,

In the first case you'll need to mux the flows in the XDP program
using an XSKMAP.
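
The muxing decision can be sketched in plain C as follows. This is only a userspace model of what an XDP program might compute before handing the resulting index to `bpf_redirect_map()` on the XSKMAP. The names `pick_xsk_slot` and `struct flow_slot`, and the idea of keying on the destination multicast address and UDP port, are assumptions made for illustration; they are not taken from libbpf or the kernel samples. VLAN tags and IPv6 are deliberately ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN   14
#define ETHERTYPE_IP4 0x0800
#define IP_PROTO_UDP  17

/* Hypothetical table: one multicast flow per XSKMAP slot. */
struct flow_slot {
	uint32_t dst_ip;   /* IPv4 destination, host byte order */
	uint16_t dst_port; /* UDP destination port, host byte order */
};

/* Read big-endian (network order) integers from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] << 8 | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

/*
 * Return the slot index for an untagged Ethernet/IPv4/UDP frame,
 * or -1 if the frame is too short, not UDP, or matches no slot.
 */
static int pick_xsk_slot(const uint8_t *pkt, size_t len,
			 const struct flow_slot *slots, int nslots)
{
	const uint8_t *ip, *udp;
	size_t ihl;
	int i;

	if (len < ETH_HDR_LEN + 20)
		return -1;
	if (rd16(pkt + 12) != ETHERTYPE_IP4)
		return -1;

	ip = pkt + ETH_HDR_LEN;
	if ((ip[0] >> 4) != 4)
		return -1;
	ihl = (size_t)(ip[0] & 0x0f) * 4;      /* IPv4 header length */
	if (ihl < 20 || len < ETH_HDR_LEN + ihl + 8)
		return -1;
	if (ip[9] != IP_PROTO_UDP)
		return -1;

	udp = ip + ihl;
	for (i = 0; i < nslots; i++)
		if (slots[i].dst_ip == rd32(ip + 16) &&
		    slots[i].dst_port == rd16(udp + 2))
			return i;
	return -1;
}
```

In a real XDP program the same bounds checks would be done against `data` and `data_end`, and a frame that matches no slot would typically be passed on to the normal stack.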

Is this what you're trying to do?

>
>
> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
>

Yes, that is correct, and for a reason! :-) Note that if you'd like to
do a multi-*process* setup with shared umem, you either need a
control process that manages the fill/completion rings and
synchronizes between the processes, OR you re-mmap the fill/completion
rings from the socket owning the umem in multiple processes *and*
synchronize the access to them. Neither is pleasant.

Honestly, not a setup I'd recommend.

> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from another process) needs so I figured the simplest solution would be to just copy the whole umem struct.
>

Just for completeness, to set up shared umem:

1. create socket 0 and register the umem to this.
2. mmap the fr/cr using socket 0
3. create socket 1, 2, n and refer to socket 0 for the umem.
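
At the raw socket level, "refer to socket 0" in step 3 comes down to the bind address of the additional sockets. A minimal sketch, assuming the `struct sockaddr_xdp` layout from `<linux/if_xdp.h>`; the helper name `fill_shared_bind_addr` is made up for this example and libbpf normally does this for you inside `xsk_socket__create`:

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/if_xdp.h>

#ifndef AF_XDP
#define AF_XDP 44 /* fallback for older libc headers */
#endif

/*
 * Fill the bind address for a socket that reuses the umem registered
 * on another AF_XDP socket (umem_owner_fd, i.e. "socket 0").
 */
static void fill_shared_bind_addr(struct sockaddr_xdp *sxdp,
				  unsigned int ifindex,
				  unsigned int queue_id,
				  int umem_owner_fd)
{
	memset(sxdp, 0, sizeof(*sxdp));
	sxdp->sxdp_family = AF_XDP;
	sxdp->sxdp_ifindex = ifindex;
	sxdp->sxdp_queue_id = queue_id;
	sxdp->sxdp_flags = XDP_SHARED_UMEM;
	sxdp->sxdp_shared_umem_fd = (__u32)umem_owner_fd;
}
```

The new socket would be created with `socket(AF_XDP, SOCK_RAW, 0)`, get its own Rx/Tx rings, and then be bound with `bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp))`. The interface and queue id are expected to be the ones socket 0 is bound to.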

So, in a multiprocess solution step 3 would be done in separate
processes, and step 2 depending on your application. You'd need to
pass socket 0 to the other processes *and* share the umem memory from
the process where socket 0 was created. This is pretty much a threaded
solution, given all the shared state.
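
"Passing socket 0" means passing its file descriptor. Between unrelated processes this is usually done with an `SCM_RIGHTS` control message over an AF_UNIX socket; a child created with `fork()` simply inherits the descriptor. The sketch below shows the generic mechanism only. The helper names `send_fd` and `recv_fd` are assumptions, and nothing in it is specific to AF_XDP.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send one file descriptor over a connected AF_UNIX socket. */
static int send_fd(int chan, int fd)
{
	char byte = 'x'; /* at least one byte of payload is required */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align; /* forces correct alignment */
	} u;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; returns the new fd or -1 on error. */
static int recv_fd(int chan)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	int fd = -1;

	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(chan, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The descriptor alone is not enough: the receiving process also needs the umem buffer itself mapped, for example through a shared memory object created before the umem was registered.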

I advise against taking this path.

>
>
> So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process. This sub-process then reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags for shared umem accordingly.
>
>
>
> After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). But after that call I exit the function `xsk_socket__create` in the sub-process because I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that:
>
>
>
>
> static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) {
>
> struct xsk_socket_config xsk_cfg;
> struct xsk_socket_info *xsk_info;
> uint32_t idx;
> uint32_t prog_id = 0;
> int i;
> int ret;
>
> xsk_info = calloc(1, sizeof(*xsk_info));
> if (!xsk_info)
> return NULL;
>
> xsk_info->umem = umem;
> xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
> xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
> xsk_cfg.libbpf_flags = 0;
> xsk_cfg.xdp_flags = cfg->xdp_flags;
> xsk_cfg.bind_flags = cfg->xsk_bind_flags;
> ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue, umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
>
> if (ret) {
> fprintf(stderr, "FAIL 1\n");
> goto error_exit;
> }
>
> ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
> if (ret) {
> fprintf(stderr, "FAIL 2\n");
> goto error_exit;
> }
>
> /* Initialize umem frame allocation */
> for (i = 0; i < NUM_FRAMES; i++)
> xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;
>
> xsk_info->umem_frame_free = NUM_FRAMES;
>
> if(cfg->use_shrd_umem) {
> return xsk_info;
> }
>         ...
> }
>
> Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:
>
> However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
>
> from https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
>
> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket fd is not put into the kernel `xsks-map` which basically means that I don't receive any packets.
>
> As you probably already noticed, I am overstrained with the concept of Shared Umem and I have to say, there is no documentation about it besides the two sentences in https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail in a linux mailbox from Nov. 2019 stating that this feature is now implemented.
>
> Can you please help?
>

XDP sockets always use an XDP program; it's just that a default one is
provided if the user doesn't explicitly add one. Have a look at
tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to
explicitly have a program that muxes over the sockets. A naïve variant
can be found in samples/bpf/xdpsock_kern.c.
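
A naive mux can be as simple as a round-robin counter over the XSKMAP slots. The following plain C model shows only the index computation; `MAX_SOCKS` and `rr_next` are assumed names for this sketch, and in an XDP program the returned index would be used as the key for `bpf_redirect_map()`.

```c
#include <stdint.h>

#define MAX_SOCKS 4 /* assumed; must be a power of two for the mask */

/* Advance the round-robin counter and return the next slot index. */
static uint32_t rr_next(uint32_t *rr)
{
	*rr = (*rr + 1) & (MAX_SOCKS - 1);
	return *rr;
}
```

Round-robin spreads packets evenly but ignores which flow a packet belongs to, so it does not fit a setup where every socket should see exactly one multicast stream.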


Cheers,
Björn

> Best regards
>
> Max


* AW: Shared Umem between processes
  2020-03-12  7:55 ` Björn Töpel
@ 2020-03-12  8:20   ` Gaul, Maximilian
  0 siblings, 0 replies; 3+ messages in thread
From: Gaul, Maximilian @ 2020-03-12  8:20 UTC (permalink / raw)
  To: Björn Töpel, Xdp; +Cc: bpf

I don't know if this reply works but I will try.

On Thu, 12 Mar 2020 at 08:55, Björn Töpel <bjorn.topel@gmail.com> wrote:
>>
>> Hello everyone,
>>
>
> Hi! I'm moving this to the XDP newbies list, which is a more proper
> place for these kind of discussions!
>
Sure, no problem. Thank you.
>>
>> I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.
>>
>>
>> Just a few information at the start of this e-mail: My program is largely based on:  https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf
>>
>>
>> I am currently trying to build an application that enables me to process multiple udp-multicast streams at once in parallel (each with up to several ten-thousands of packets per second).
>>
>>
>> My first solution was to steer each multicast-stream on a separate RX-Queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as much user-space processes (each with a separate AF-XDP socket connected to one of the RX-Queues) as there are streams  to process.
>>
>>
>> But because this solution is limited to the amount of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
>>

> Let's start with defining what shared-umem is: The idea is to share
> the same umem, fill ring, and completion ring for multiple
> sockets. The sockets sharing that umem/fr/cr are tied (bound) to one
> hardware ring. It's a mechanism to load-balance a HW queue over
> multiple sockets.
> 
> If I'm reading you correctly, you'd like a solution:
> 
>            hw_q0,
> xsk_q0_0, xsk_q0_1, xsk_q0_2,
> 
> instead of:
> 
> hw_q0,    hw_q1,    hw_q2,
> xsk_q0_0, xsk_q1_0, xsk_q2_0,
>
> In the first case you'll need to mux the flows in the XDP program
> using an XSKMAP.
> 
> Is this what you're trying to do?
>
Yes, it is. But I had the problem that I couldn't create multiple sockets (no sharing, each with its own umem and rx/tx queues) tied to the same RX-Queue. Maybe I did something wrong. Is this possible at all?
>>
>>
>> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
>>
>
> Yes, that is correct, and for a reason! :-) Note that if you'd like to
> do a multi-*process* setup with shared umem, you: need to have a
> control process that manages the fill/completion rings, and
> synchronize between the processes, OR re-mmap the fill/completetion
> ring from the socket owning the umem in multiple processes *and*
> synchronize the access to them. Neither is pleasant.
> 
> Honestly, not a setup I'd recommend.
>
This indeed sounds very unpleasant. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets to the sockets via an XSKMAP). Is there something I have to watch out for? As I said, I wasn't able to create multiple sockets for the same RX-Queue.
>> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from another  process) needs so I figured the simplest solution would be to just copy the whole umem struct.
>>
>
> Just for completeness; To setup shared umem:
> 
> 1. create socket 0 and register the umem to this.
> 2. mmap the fr/cr using socket 0
> 3. create socket 1, 2, n and refer to socket 0 for the umem.
>
> So, in a multiprocess solution step 3 would be done in separate
> processes, and step 2 depending on your application. You'd need to
> pass socket 0 to the other processes *and* share the umem memory from
> the process where socket 0 was created. This is pretty much a threaded
> solution, given all the shared state.
>
> I advice not taking this path.
>
I am not entirely sure what you mean by *passing socket 0*. Is this just the fd of the socket? What about the `struct xsk_umem`? Do I need that? I guess so, because `xsk_socket__create()` has a `struct xsk_umem` parameter.
>>
>>
>> So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process. This sub-process then  reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags for shared  umem accordingly.
>>
>>
>>
>> After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). But after that call I exit the function `xsk_socket__create` in the sub-process because  I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that:
>>
>>
>>
>>
>> static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) {
>>
>> struct xsk_socket_config xsk_cfg;
>> struct xsk_socket_info *xsk_info;
>> uint32_t idx;
>> uint32_t prog_id = 0;
>> int i;
>> int ret;
>>
>> xsk_info = calloc(1, sizeof(*xsk_info));
>> if (!xsk_info)
>> return NULL;
>>
>> xsk_info->umem = umem;
>> xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
>> xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
>> xsk_cfg.libbpf_flags = 0;
>> xsk_cfg.xdp_flags = cfg->xdp_flags;
>> xsk_cfg.bind_flags = cfg->xsk_bind_flags;
>> ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue, umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
>>
>> if (ret) {
>> fprintf(stderr, "FAIL 1\n");
>> goto error_exit;
>> }
>>
>> ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
>> if (ret) {
>> fprintf(stderr, "FAIL 2\n");
>> goto error_exit;
>> }
>>
>> /* Initialize umem frame allocation */
>> for (i = 0; i < NUM_FRAMES; i++)
>> xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;
>>
>> xsk_info->umem_frame_free = NUM_FRAMES;
>>
>> if(cfg->use_shrd_umem) {
>> return xsk_info;
>> }
>>         ...
>> }
>>
>> Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:
>>
>> However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
>>
>> from  https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
>>
>> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket fd is  not put into the kernel `xsks-map` which basically means that I don't receive any packets.
>>
>> As you probably already noticed, I am overstrained with the concept of Shared Umem and I have to say, there is no documentation about it besides the two sentences in  https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail in a linux mailbox from Nov. 2019 stating that this feature is now implemented.
>>
>> Can you please help?
>>
>
> XDP sockets always use an XDP program, it just that a default one is
> provided if the use doesn't explicitly add one. Have a look at
> tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to
> explicitly have a program that muxes over the sockets. A naïve variant
> can be found in samples/bpf/xdpsock_kern.c
> 
> 
> Cheers,
> Björn
> 
>> Best regards
>>
>> Max
    

