* Shared Umem between processes
@ 2020-03-11 15:58 Gaul, Maximilian
  2020-03-12  7:55 ` Björn Töpel
  0 siblings, 1 reply; 3+ messages in thread

From: Gaul, Maximilian @ 2020-03-11 15:58 UTC (permalink / raw)
To: bpf

Hello everyone,

I am not sure if this is the correct address for my question / problem,
but I was forwarded to this e-mail address from the libbpf GitHub issue
section, so that is my excuse.

A few pieces of information to start: my program is largely based on
https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP
and I am using libbpf: https://github.com/libbpf/libbpf

I am currently trying to build an application that processes multiple
UDP multicast streams in parallel (each with up to several tens of
thousands of packets per second).

My first solution was to steer each multicast stream to a separate RX
queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as
many user-space processes (each with a separate AF_XDP socket bound to
one of the RX queues) as there are streams to process.

But because this solution is limited by the number of RX queues the NIC
has, and I wanted to build something hardware-independent, I looked
around a bit and found a feature called `XDP_SHARED_UMEM`.

As far as I understand (please correct me if I am wrong), at the moment
libbpf only supports sharing a umem between threads of a process, but
not between processes - right?

I ran into the problem that `struct xsk_umem` is hidden in `xsk.c`. This
prevents me from copying the contents of the original socket / umem into
shared memory. I am not sure what information the sub-process (the one
using the umem of another process) needs, so I figured the simplest
solution would be to copy the whole umem struct.

So I went with the "quick fix" of moving the definition of `struct
xsk_umem` into `xsk.h` and copying the umem information from the
original process into shared memory. This process then calls `fork()`,
spawning a sub-process. The sub-process reads the previously written
umem information from shared memory and passes it to
`xsk_configure_socket` (af_xdp_user.c), which eventually calls
`xsk_socket__create` in `xsk.c`. That function checks `umem->refcount`
and sets the flags for shared umem accordingly.

After returning from `xsk_socket__create` (we are still in
`xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is
called (I don't know if that is necessary). After that call I return
early from `xsk_configure_socket` in the sub-process, because I figured
it is probably bad to configure the umem a second time by calling
`xsk_ring_prod__reserve`:

static struct xsk_socket_info *xsk_configure_socket(struct config *cfg,
						    struct xsk_umem_info *umem)
{
	struct xsk_socket_config xsk_cfg;
	struct xsk_socket_info *xsk_info;
	uint32_t idx;
	uint32_t prog_id = 0;
	int i;
	int ret;

	xsk_info = calloc(1, sizeof(*xsk_info));
	if (!xsk_info)
		return NULL;

	xsk_info->umem = umem;
	xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
	xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
	xsk_cfg.libbpf_flags = 0;
	xsk_cfg.xdp_flags = cfg->xdp_flags;
	xsk_cfg.bind_flags = cfg->xsk_bind_flags;
	ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname,
				 cfg->xsk_if_queue, umem->umem,
				 &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
	if (ret) {
		fprintf(stderr, "FAIL 1\n");
		goto error_exit;
	}

	ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
	if (ret) {
		fprintf(stderr, "FAIL 2\n");
		goto error_exit;
	}

	/* Initialize umem frame allocation */
	for (i = 0; i < NUM_FRAMES; i++)
		xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;

	xsk_info->umem_frame_free = NUM_FRAMES;

	if (cfg->use_shrd_umem)
		return xsk_info;
	...
}

Somehow what I am doing doesn't work: my sub-process dies in
`xsk_configure_socket`. I am not able to debug it properly with GDB,
though.

Another point I don't understand is this statement:

    However, note that you need to supply the
    XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the
    xsk_socket__create calls and load your own XDP program as there is
    no built in one in libbpf that will route the traffic for you.

from https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag

I didn't know that libbpf loads an XDP program - why would it do that?
I am using my own AF_XDP program which filters for UDP packets. If I set
`xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in
`xsk_configure_socket`, the AF_XDP socket fd is not put into the kernel
`xsks_map`, which basically means that I don't receive any packets.

As you probably already noticed, I am struggling with the concept of
shared umem, and I have to say there is no documentation about it
besides the two sentences in
https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
and a mail on a Linux mailing list from Nov. 2019 stating that this
feature is now implemented.

Can you please help?

Best regards

Max

^ permalink raw reply	[flat|nested] 3+ messages in thread
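[Editor's note: the `umem_frame_addr` / `umem_frame_free` initialization in the listing above is the xdp-tutorial's free-frame bookkeeping: a per-socket stack of umem frame addresses. A self-contained model of that logic is sketched below; the constants and names are illustrative, not libbpf API. Note that with a shared umem, only one owner should hand out any given frame.]

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants matching the xdp-tutorial defaults. */
#define NUM_FRAMES 4096
#define FRAME_SIZE 2048

/* Stack of free umem frame addresses: addresses are popped when frames
 * are handed to the fill or TX ring, and pushed back on completion. */
struct umem_frames {
	uint64_t addr[NUM_FRAMES];
	uint32_t free;
};

static void umem_frames_init(struct umem_frames *uf)
{
	for (uint32_t i = 0; i < NUM_FRAMES; i++)
		uf->addr[i] = (uint64_t)i * FRAME_SIZE;
	uf->free = NUM_FRAMES;
}

static uint64_t umem_frame_alloc(struct umem_frames *uf)
{
	if (uf->free == 0)
		return UINT64_MAX; /* out of frames */
	return uf->addr[--uf->free];
}

static void umem_frame_release(struct umem_frames *uf, uint64_t addr)
{
	assert(uf->free < NUM_FRAMES);
	uf->addr[uf->free++] = addr;
}
```

This is exactly why a fork()-based copy of the umem state is fragile: two processes each running this allocator over the same frames would hand out the same addresses twice.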
* Re: Shared Umem between processes
  2020-03-11 15:58 Shared Umem between processes Gaul, Maximilian
@ 2020-03-12  7:55 ` Björn Töpel
  2020-03-12  8:20   ` AW: " Gaul, Maximilian
  0 siblings, 1 reply; 3+ messages in thread

From: Björn Töpel @ 2020-03-12  7:55 UTC (permalink / raw)
To: Gaul, Maximilian, Xdp; +Cc: bpf

On Wed, 11 Mar 2020 at 16:59, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
> Hello everyone,
>

Hi! I'm moving this to the XDP newbies list, which is a more proper place for this kind of discussion!

> I am currently trying to build an application that processes multiple UDP multicast streams in parallel (each with up to several tens of thousands of packets per second).
>
> My first solution was to steer each multicast stream to a separate RX queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as many user-space processes (each with a separate AF_XDP socket bound to one of the RX queues) as there are streams to process.
>
> But because this solution is limited by the number of RX queues the NIC has, and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
>

Let's start by defining what shared umem is: the idea is to share the same umem, fill ring, and completion ring between multiple sockets. The sockets sharing that umem/fr/cr are tied (bound) to one hardware ring. It's a mechanism to load-balance a HW queue over multiple sockets.

If I'm reading you correctly, you'd like a solution:

  hw_q0,
  xsk_q0_0, xsk_q0_1, xsk_q0_2,

instead of:

  hw_q0, hw_q1, hw_q2,
  xsk_q0_0, xsk_q1_0, xsk_q2_0,

In the first case you'll need to mux the flows in the XDP program using an XSKMAP.

Is this what you're trying to do?

> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports sharing a umem between threads of a process, but not between processes - right?
>

Yes, that is correct, and for a reason! :-) Note that if you'd like to do a multi-*process* setup with shared umem, you need to have a control process that manages the fill/completion rings and synchronizes between the processes, OR re-mmap the fill/completion rings from the socket owning the umem in multiple processes *and* synchronize the access to them. Neither is pleasant.

Honestly, not a setup I'd recommend.

> I ran into the problem that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the contents of the original socket / umem into shared memory. [...]
>

Just for completeness, to set up a shared umem:

  1. create socket 0 and register the umem to it
  2. mmap the fr/cr using socket 0
  3. create sockets 1, 2, ..., n, referring to socket 0 for the umem

So, in a multi-process solution, step 3 would be done in separate processes, and step 2 depends on your application. You'd need to pass socket 0 to the other processes *and* share the umem memory from the process where socket 0 was created. This is pretty much a threaded solution, given all the shared state.

I advise not taking this path.

> So I went with the "quick fix" of moving the definition of `struct xsk_umem` into `xsk.h` and copying the umem information from the original process into shared memory. This process then calls `fork()`, spawning a sub-process. [...]
>
> Somehow what I am doing doesn't work: my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB, though.
>
> Another point I don't understand is this statement:
>
>     However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
>
> from https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
>
> I didn't know that libbpf loads an XDP program - why would it do that? I am using my own AF_XDP program which filters for UDP packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the AF_XDP socket fd is not put into the kernel `xsks_map`, which basically means that I don't receive any packets.
>
> [...]
>
> Can you please help?
>

XDP sockets always use an XDP program; it's just that a default one is provided if the user doesn't explicitly add one. Have a look at tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to explicitly provide a program that muxes over the sockets. A naïve variant can be found in samples/bpf/xdpsock_kern.c.

Cheers,
Björn

> Best regards
>
> Max

^ permalink raw reply	[flat|nested] 3+ messages in thread
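[Editor's note: the muxing Björn describes happens inside the XDP program via `bpf_redirect_map()` on an XSKMAP; the per-packet selection logic itself is small. Below is a plain-C userspace model (not the BPF program) of two selection strategies: the naïve round-robin from samples/bpf/xdpsock_kern.c, and a flow-stable pick by UDP destination port so that each multicast stream always lands on the same socket. `MAX_SOCKS` and the function names are illustrative.]

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SOCKS 4 /* illustrative; power of two so the mask trick works */

/* Naive round-robin, as in samples/bpf/xdpsock_kern.c: each packet goes
 * to the next socket index in turn. In the real XDP program the result
 * is fed to bpf_redirect_map(&xsks_map, idx, 0). */
static uint32_t pick_round_robin(uint32_t *rr)
{
	*rr = (*rr + 1) & (MAX_SOCKS - 1);
	return *rr;
}

/* Flow-stable variant: derive the index from the UDP destination port,
 * so all packets of one multicast stream hit the same socket. */
static uint32_t pick_by_dport(uint16_t dport)
{
	return dport & (MAX_SOCKS - 1);
}
```

For the multicast-stream use case described in this thread, the flow-stable variant is the natural choice, since it keeps per-stream packet ordering within one consumer.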
* AW: Shared Umem between processes
  2020-03-12  7:55 ` Björn Töpel
@ 2020-03-12  8:20   ` Gaul, Maximilian
  0 siblings, 0 replies; 3+ messages in thread

From: Gaul, Maximilian @ 2020-03-12  8:20 UTC (permalink / raw)
To: Björn Töpel, Xdp; +Cc: bpf

I don't know if this reply works, but I will try.

On Thu, 12 Mar 2020 at 08:55, Björn Töpel <bjorn.topel@gmail.com> wrote:

> Hi! I'm moving this to the XDP newbies list, which is a more proper place for this kind of discussion!
>

Sure, no problem. Thank you.

> Let's start by defining what shared umem is: [...] It's a mechanism to load-balance a HW queue over multiple sockets.
>
> If I'm reading you correctly, you'd like a solution:
>
>   hw_q0,
>   xsk_q0_0, xsk_q0_1, xsk_q0_2,
>
> instead of:
>
>   hw_q0, hw_q1, hw_q2,
>   xsk_q0_0, xsk_q1_0, xsk_q2_0,
>
> In the first case you'll need to mux the flows in the XDP program using an XSKMAP.
>
> Is this what you're trying to do?
>

Yes, it is. But I had the problem that I couldn't create multiple sockets (no sharing, each with its own umem and RX/TX rings) tied to the same RX queue. Maybe I did something wrong - but is this possible?

> Yes, that is correct, and for a reason! :-) [...] Neither is pleasant.
>
> Honestly, not a setup I'd recommend.
>

This indeed sounds very unpleasant. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets over the sockets via an XSKMAP). Is there something I have to watch out for? As I said, I wasn't able to create multiple sockets for the same RX queue.

> Just for completeness, to set up a shared umem:
>
>   1. create socket 0 and register the umem to it
>   2. mmap the fr/cr using socket 0
>   3. create sockets 1, 2, ..., n, referring to socket 0 for the umem
>
> So, in a multi-process solution, step 3 would be done in separate processes, and step 2 depends on your application. You'd need to pass socket 0 to the other processes *and* share the umem memory from the process where socket 0 was created. This is pretty much a threaded solution, given all the shared state.
>
> I advise not taking this path.
>

I am not entirely sure what you mean by *passing socket 0* - is this just the fd of the socket? What about the `struct xsk_umem` - do I need that? I guess so, because `xsk_socket__create()` has a `struct xsk_umem` parameter.

> XDP sockets always use an XDP program; it's just that a default one is provided if the user doesn't explicitly add one. Have a look at tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to explicitly provide a program that muxes over the sockets. A naïve variant can be found in samples/bpf/xdpsock_kern.c.
>
> Cheers,
> Björn

^ permalink raw reply	[flat|nested] 3+ messages in thread
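[Editor's note: "passing socket 0" to another process does mean passing its file descriptor, and on Linux that is done over a Unix-domain socket with an SCM_RIGHTS control message (the umem packet buffer itself would additionally have to live in shared memory mapped by both processes). A self-contained sketch of the fd-passing half; the helper names are illustrative:]

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Pass an open file descriptor to another process over a Unix-domain
 * socket using an SCM_RIGHTS control message. The receiver gets its
 * own fd referring to the same open file description. */
static int send_fd(int sock, int fd)
{
	char dummy = 'x';
	struct iovec io = { .iov_base = &dummy, .iov_len = 1 };
	union { /* union guarantees cmsghdr alignment for the buffer */
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg = {
		.msg_iov = &io,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

static int recv_fd(int sock)
{
	char dummy;
	struct iovec io = { .iov_base = &dummy, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg = {
		.msg_iov = &io,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;
	int fd = -1;

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg && cmsg->cmsg_level == SOL_SOCKET &&
	    cmsg->cmsg_type == SCM_RIGHTS)
		memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

Even with the fd passed this way, the receiving process would still need the shared umem memory and synchronized access to the fill/completion rings, which is exactly the complexity Björn warns about above.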
end of thread, other threads:[~2020-03-12  8:20 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-11 15:58 Shared Umem between processes Gaul, Maximilian
2020-03-12  7:55 ` Björn Töpel
2020-03-12  8:20   ` AW: " Gaul, Maximilian