* AF_XDP sendto kick returning EPERM
From: Srivats P @ 2021-04-23 15:44 UTC
  To: Xdp

Hi,

I'm using sendto() to kick tx in my AF_XDP program after I submit
descriptors to the tx ring -

    ret = sendto(xsk_socket__fd(xsk_), NULL, 0, MSG_DONTWAIT, NULL, 0);

However, I'm receiving EPERM as the return value every time. AFAIK
this is not an expected return value. Since this is with i40e, I
checked i40e_xsk_wakeup() - but that also doesn't return EPERM. I am
running as root and I don't see any problems with creating the xsk,
configuring umem etc.

Also, no packets seem to go out either.

# uname -a
Linux Ostinato-1 5.11.15-1-default #1 SMP Fri Apr 16 16:47:34 UTC 2021 (64fb5bf) x86_64 x86_64 x86_64 GNU/Linux

I don't see the problem on another machine with i40e but older kernel 5.4 series

Any suggestions on what to look for or how to proceed?

Srivats
* Re: AF_XDP sendto kick returning EPERM
From: Magnus Karlsson @ 2021-04-27 7:28 UTC
  To: Srivats P; +Cc: Xdp

On Fri, Apr 23, 2021 at 5:44 PM Srivats P <pstavirs@gmail.com> wrote:
>
> Hi,
>
> I'm using sendto() to kick tx in my AF_XDP program after I submit
> descriptors to the tx ring -
>
> ret = sendto(xsk_socket__fd(xsk_), NULL, 0, MSG_DONTWAIT, NULL, 0);
>
> However, I'm receiving EPERM as the return value every time. AFAIK
> this is not an expected return value. Since this is with i40e, I
> checked i40e_xsk_wakeup() - but that also doesn't return EPERM. I am
> running as root and I don't see any problems with creating the xsk,
> configuring umem etc.
>
> Also, no packets seem to go out either.
>
> # uname -a
> Linux Ostinato-1 5.11.15-1-default #1 SMP Fri Apr 16 16:47:34 UTC 2021
> (64fb5bf) x86_64 x86_64 x86_64 GNU/Linux
>
> I don't see the problem on another machine with i40e but older kernel 5.4 series
>
> Any suggestions on what to look for or how to proceed?

Weird. Have not seen this before. What is your command line for
xdpsock? Is it unmodified?

Using bpftrace, we can get the call stack of xsk_sendmsg. Somewhere in
this stack there must be an EPERM. You can run the same command on
your system, but use ftrace to see what a sendto call hits. Then see
where the code terminates.

mkarlsso@kurt:~/src/dna-linux$ sudo bpftrace -e 'kprobe:xsk_sendmsg { @[kstack()] = count(); }'
Attaching 1 probe...
^C

@[
    xsk_sendmsg+1
    sock_sendmsg+94
    __sys_sendto+238
    __x64_sys_sendto+37
    do_syscall_64+51
    entry_SYSCALL_64_after_hwframe+68
]: 2244805
* Re: AF_XDP sendto kick returning EPERM
From: Srivats P @ 2021-04-29 15:47 UTC
  To: Magnus Karlsson; +Cc: Xdp

On Tue, Apr 27, 2021 at 12:58 PM Magnus Karlsson
<magnus.karlsson@gmail.com> wrote:
>
> On Fri, Apr 23, 2021 at 5:44 PM Srivats P <pstavirs@gmail.com> wrote:
> > [...]
> > Any suggestions on what to look for or how to proceed?
>
> Weird. Have not seen this before. What is your command line for
> xdpsock? Is it unmodified?

This is not xdpsock, but my own AF_XDP program.

> Using bpftrace, we can get the call stack of xsk_sendmsg. Somewhere in
> this stack there must be an EPERM. You can run the same command on
> your system, but use ftrace to see what a sendto call hits. Then see
> where the code terminates.
>
> mkarlsso@kurt:~/src/dna-linux$ sudo bpftrace -e 'kprobe:xsk_sendmsg {
> @[kstack()] = count(); }'
> Attaching 1 probe...
> ^C
>
> @[
>     xsk_sendmsg+1
>     sock_sendmsg+94
>     __sys_sendto+238
>     __x64_sys_sendto+37
>     do_syscall_64+51
>     entry_SYSCALL_64_after_hwframe+68
> ]: 2244805

Ostinato-1:~ # bpftrace -e 'kprobe:xsk_sendmsg { @[kstack()] = count(); }'
Attaching 1 probe...
^C

@[
    xsk_sendmsg+1
    sock_sendmsg+94
    __sys_sendto+238
    __x64_sys_sendto+37
    do_syscall_64+51
    entry_SYSCALL_64_after_hwframe+68
]: 1253307

Which doesn't seem to suggest any error - I've looked at the source
code for all these functions, but don't see any reference to EPERM.

Srivats
* Re: AF_XDP sendto kick returning EPERM
From: Magnus Karlsson @ 2021-05-03 8:24 UTC
  To: Srivats P; +Cc: Xdp

On Thu, Apr 29, 2021 at 5:47 PM Srivats P <pstavirs@gmail.com> wrote:
> [...]
> Ostinato-1:~ # bpftrace -e 'kprobe:xsk_sendmsg {
> @[kstack()] = count(); }'
> Attaching 1 probe...
> ^C
>
> @[
>     xsk_sendmsg+1
>     sock_sendmsg+94
>     __sys_sendto+238
>     __x64_sys_sendto+37
>     do_syscall_64+51
>     entry_SYSCALL_64_after_hwframe+68
> ]: 1253307
>
> Which doesn't seem to suggest any error - I've looked at the source
> code for all these functions, but don't see any reference to EPERM.

It must be in there somewhere :-). Could you please use ftrace
(through perf for example) and trace all functions that a sendto hits
in your case? Then we might see what it hits.

Are you running in SKB mode or in zero-copy mode? Guess it is
zero-copy from your mail, but just want to verify. Does Rx work as
expected?

Could you share your AF_XDP program?

/Magnus
* Re: AF_XDP sendto kick returning EPERM
From: Srivats P @ 2021-05-07 14:47 UTC
  To: Magnus Karlsson; +Cc: Xdp

On Mon, May 3, 2021 at 1:54 PM Magnus Karlsson
<magnus.karlsson@gmail.com> wrote:
> [...]
> It must be in there somewhere :-). Could you please use ftrace
> (through perf for example) and trace all functions that a sendto hits
> in your case? Then we might see what it hits.
>
> Are you running in SKB mode or in zero-copy mode? Guess it is
> zero-copy from your mail, but just want to verify. Does Rx work as
> expected?
>
> Could you share your AF_XDP program?

After some experimentation and a lot of head-scratching, I found part
of the problem last night. The sendto() was not returning EPERM (-1),
but ENXIO (-6) - I was mistakenly printing the return value of the
sendto() call (which always returns -1 in case of failure), instead of
errno (duh!).

Looking at the code, I see ENXIO is returned if the xsk is unbound.
I'm still investigating this and will post an update soon. The problem
is happening at a customer end and there's some delay and follow up
required to get the logs.

Srivats
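The mix-up described in this message is easy to fall into: on failure sendto() returns -1 whatever the cause, and -1 looks like -EPERM because EPERM has the value 1 on Linux. Below is a minimal sketch of a kick helper that reports errno instead. The name kick_tx and its negative-errno return convention are assumptions made for illustration; this is not code from the program discussed in the thread.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: kick the kernel after Tx descriptors have been queued.
 * In an AF_XDP program, fd would be the value returned by xsk_socket__fd().
 * Returns 0 on success, or a negative errno value on failure.
 */
static int kick_tx(int fd)
{
	ssize_t ret = sendto(fd, NULL, 0, MSG_DONTWAIT, NULL, 0);

	if (ret >= 0)
		return 0;

	/* ret is always -1 here; the real cause is only available in errno.
	 * Save it first, because fprintf() may overwrite errno. */
	int err = errno;

	fprintf(stderr, "sendto kick failed: ret=%zd errno=%d (%s)\n",
		ret, err, strerror(err));
	return -err;
}
```

Called on a descriptor that is not open, for example kick_tx(-1), the helper returns -EBADF rather than a bare -1, so the cause can no longer be mistaken for EPERM.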
* Re: AF_XDP sendto kick returning EPERM
From: Srivats P @ 2021-05-07 15:09 UTC
  To: Magnus Karlsson; +Cc: Xdp

Here's an update -

On Fri, May 7, 2021 at 8:17 PM Srivats P <pstavirs@gmail.com> wrote:
> [...]
> After some experimentation and a lot of head-scratching, I found part
> of the problem last night. The sendto() was not returning EPERM (-1),
> but ENXIO (-6) - I was mistakenly printing the return value of the
> sendto() call (which always returns -1 in case of failure), instead of
> errno (duh!).
>
> Looking at the code, I see ENXIO is returned if the xsk is unbound.
> I'm still investigating this and will post an update soon. The problem
> is happening at a customer end and there's some delay and follow up
> required to get the logs.

sendto() was returning ENXIO because the interface MTU was set to 9000
which I know is not supported with AF_XDP. But shouldn't
xsk_socket__create() fail in this case? Note the actual packet being
transmitted was 64 bytes.

Not sure if it has a role in the above sendto() failure, but before
xsk socket create, my call to bpf_set_link_xdp_fd() was failing
because of the MTU problem (the newly added error message for this
case was very helpful!). Once the MTU was reduced to 1500, both the
failure to attach the RX eBPF program to the interface and the TX
sendto() always returning ENXIO went away. This was on kernel
version 5.12.

Can someone tell me what is expected to happen for a Tx AF_XDP socket
in case of MTU > 4K?

I also found a second case of sendto() returning ENXIO. In this
scenario, I was removing my RX eBPF program by calling

    bpf_set_link_xdp_fd(ifIndex, -1, 0)

while AF_XDP transmit (and the associated sendto() wakeup) was still
going on. In this case, sendto() starts failing with ENETDOWN for some
time, followed by ENXIO subsequently. This case was on kernel version
5.4.0.

Does removing an XDP program cause the interface to go down (ENETDOWN)
leading to XDP socket unbind (ENXIO)? Should removing (or replacing)
an RX eBPF program affect AF_XDP TX?

Srivats
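Since the XDP program attach failed because of the MTU, one way to catch this early is to read the interface MTU before setting anything up. The sketch below uses the standard SIOCGIFMTU ioctl. The helper names are assumptions, and so is the idea of passing the limit in as a parameter: the usable maximum depends on the driver and kernel version, so it is deliberately not hard-coded.

```c
/* Needed for struct ifreq when compiling in strict ISO C mode. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <errno.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the MTU of the named interface,
 * or a negative errno value on failure.
 */
static int get_if_mtu(const char *ifname)
{
	struct ifreq ifr;
	int fd, err;

	if (strlen(ifname) >= IFNAMSIZ)
		return -ENAMETOOLONG;

	memset(&ifr, 0, sizeof(ifr));
	strcpy(ifr.ifr_name, ifname);	/* length checked above */

	/* Any socket can serve as a handle for interface ioctls. */
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -errno;

	if (ioctl(fd, SIOCGIFMTU, &ifr) < 0) {
		err = errno;
		close(fd);
		return -err;
	}

	close(fd);
	return ifr.ifr_mtu;
}

/*
 * Hypothetical pre-flight check. max_mtu is whatever limit applies to the
 * driver and kernel in use; it is an assumption supplied by the caller.
 * Returns 1 if the MTU is within the limit, 0 if it is not,
 * or a negative errno value if the MTU could not be read.
 */
static int mtu_ok_for_xdp(const char *ifname, int max_mtu)
{
	int mtu = get_if_mtu(ifname);

	if (mtu < 0)
		return mtu;
	return mtu <= max_mtu;
}
```

A caller could run mtu_ok_for_xdp() before attaching the XDP program and creating the xsk, and refuse to start (or lower the MTU) when it returns 0, instead of discovering the problem through a failing sendto() later.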
* Re: AF_XDP sendto kick returning EPERM
From: Maciej Fijalkowski @ 2021-05-09 15:41 UTC
  To: Srivats P; +Cc: Magnus Karlsson, Xdp

On Fri, May 07, 2021 at 08:39:04PM +0530, Srivats P wrote:
> Here's an update -
>
> On Fri, May 7, 2021 at 8:17 PM Srivats P <pstavirs@gmail.com> wrote:
> >
> > On Mon, May 3, 2021 at 1:54 PM Magnus Karlsson
> > <magnus.karlsson@gmail.com> wrote:
> > > [...]
> > > Could you share your AF_XDP program?

+1, that would help us probably :)

> > After some experimentation and a lot of head-scratching, I found part
> > of the problem last night. The sendto() was not returning EPERM (-1),
> > but ENXIO (-6) - I was mistakenly printing the return value of the
> > sendto() call (which always returns -1 in case of failure), instead of
> > errno (duh!).
> > [...]
>
> sendto() was returning ENXIO because the interface MTU was set to 9000
> which I know is not supported with AF_XDP. But shouldn't
> xsk_socket__create() fail in this case? Note the actual packet being
> transmitted was 64 bytes.

It depends. You said that you have your own AF_XDP app, so if you're
setting the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD flag then libbpf wouldn't
be loading the built-in AF_XDP eBPF prog on the interface, and that's
where the failure should happen.

> Not sure if it has a role in the above sendto() failure, but before
> xsk socket create, my call to bpf_set_link_xdp_fd() was failing
> because of the MTU problem (the newly added error message for this
> case was very helpful!). Once MTU was reduced to 1500 both the RX eBPF
> program link to the interface failure and the TX sendto() returning
> ENXIO always went away. Kernel version 5.12
>
> Can someone tell me what is expected to happen for a Tx AF_XDP socket
> in case of MTU > 4K?

See the last paragraph.

> I also found a second case of sendto() returning ENXIO. In this
> scenario, I was removing my RX eBPF program by calling
>
> bpf_set_link_xdp_fd(ifIndex, -1, 0)
>
> while AF_XDP transmit (and associated sendto() wakeup) was still going
> on. In this case, sendto starts failing with ENETDOWN for some time
> followed by ENXIO subsequently. This case was on Kernel version 5.4.0

I think that we addressed the ENETDOWN Tx issue with the following set:
https://lore.kernel.org/netdev/20200205045834.56795-1-maciej.fijalkowski@intel.com/

I see that it has been merged in 5.6. But it was related to being unable
to spawn multiple AF_XDP Tx-only instances. With what you're saying it
feels to me that you have multiple instances of your AF_XDP progs and you
terminate one of them? Previously, every instance would die due to the
fact that the underlying XDP prog would be unloaded from the interface,
but right now we have bpf_link support for AF_XDP which would handle that
properly. Note that it was developed for the built-in prog.

> Does removing a XDP program cause the interface to go down (ENETDOWN)
> leading to XDP socket unbind (ENXIO)? Should removing (or replacing)
> an RX eBPF program, affect AF_XDP TX?

Removing the XDP prog causes the interface to undergo a reset or some
other mechanism, as it needs to remove the XDP Tx resources and change
the Rx memory model. For Intel drivers, the AF_XDP Tx resources are
configured during the load of the Rx eBPF prog. We would have to develop
some mechanism that detaches the creation of XDP Tx resources from
loading the Rx eBPF prog. There have been discussions around feature
detection but I think it was about the opposite - don't configure Tx
rings if your prog will not be doing the XDP_TX action.
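The two errors discussed in this message call for different reactions from a sender: ENETDOWN shows up while the interface is down or being reset, whereas ENXIO means the socket is no longer bound. A minimal sketch of how a Tx loop might sort kick errors follows. The enum, the function name, and the decision to treat EAGAIN, EBUSY and ENOBUFS as retryable are all assumptions made for illustration, not behaviour prescribed by anyone in this thread.

```c
#include <errno.h>

/* Hypothetical set of reactions to the errno seen after a failed kick. */
enum kick_action {
	KICK_OK,	/* no error */
	KICK_RETRY,	/* transient condition: kick again later */
	KICK_LINK_DOWN,	/* interface down or resetting: back off; ENXIO may follow */
	KICK_REBIND,	/* socket unbound: tear down and recreate the xsk */
	KICK_FATAL	/* anything else: report and stop */
};

/* err is a positive errno value, or 0 if the kick succeeded. */
static enum kick_action classify_kick_error(int err)
{
	switch (err) {
	case 0:
		return KICK_OK;
	case EAGAIN:
	case EBUSY:
	case ENOBUFS:
		return KICK_RETRY;
	case ENETDOWN:
		return KICK_LINK_DOWN;
	case ENXIO:
		return KICK_REBIND;
	default:
		return KICK_FATAL;
	}
}
```

With this split, the sequence reported on 5.4.0 (ENETDOWN for a while, then ENXIO) would first pause transmission and then trigger a socket re-creation, rather than being logged as one generic send failure.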
* Re: AF_XDP sendto kick returning EPERM
From: Srivats P @ 2021-05-11 12:02 UTC
  To: Maciej Fijalkowski; +Cc: Magnus Karlsson, Xdp

On Sun, May 9, 2021 at 9:24 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Fri, May 07, 2021 at 08:39:04PM +0530, Srivats P wrote:
> > [...]
> > > > Could you share your AF_XDP program?
>
> +1, that would help us probably :)

The code is proprietary, but if required I can extract relevant bits
into a sample program or modify the sample xdpsock_user.c suitably.

> > sendto() was returning ENXIO because the interface MTU was set to 9000
> > which I know is not supported with AF_XDP. But shouldn't
> > xsk_socket__create() fail in this case? Note the actual packet being
> > transmitted was 64 bytes.
>
> It depends. You said that you have your own AF_XDP app, so if you're
> setting the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD flag then libbpf wouldn't
> be loading the built-in AF_XDP eBPF prog on interface and that's where the
> failure should happen.

I used AF_XDP for TX only with my own eBPF program for RX. For this
reason, I was using INHIBIT_PROG_LOAD while opening the xsk. That's
why I didn't see an error while creating the xsk.

> > Can someone tell me what is expected to happen for a Tx AF_XDP socket
> > in case of MTU > 4K?
>
> See the last paragraph.
>
> > I also found a second case of sendto() returning ENXIO. In this
> > scenario, I was removing my RX eBPF program by calling
> >
> > bpf_set_link_xdp_fd(ifIndex, -1, 0)
> >
> > while AF_XDP transmit (and associated sendto() wakeup) was still going
> > on. In this case, sendto starts failing with ENETDOWN for some time
> > followed by ENXIO subsequently. This case was on Kernel version 5.4.0
>
> I think that we addressed the ENETDOWN Tx issue with the following set:
> https://lore.kernel.org/netdev/20200205045834.56795-1-maciej.fijalkowski@intel.com/
>
> I see that it has been merged in 5.6. But it was related to being unable
> to spawn multiple AF_XDP Tx-only instances. With what you're saying it
> feels to me that you have multiple instances of your AF_XDP progs and you
> terminate one of them? Previously, every instance would die due to the
> fact that the underlying XDP prog would be unloaded from interface, but
> right now we have bpf_link support for AF_XDP which would handle that
> properly. Note that it was developed for the built-in prog.

I think my case is different. I have only one AF_XDP Tx-only instance,
but I'm not using the built-in AF_XDP eBPF program. So when I remove
my eBPF program the AF_XDP Tx also gets affected.

I solved my problem by cleaning up the AF_XDP Tx first before removing
my custom eBPF Rx program.

> > Does removing a XDP program cause the interface to go down (ENETDOWN)
> > leading to XDP socket unbind (ENXIO)? Should removing (or replacing)
> > an RX eBPF program, affect AF_XDP TX?
>
> Removing XDP prog causes the interface to undergo the reset or some other
> mechanism as it needs to remove the XDP Tx resources and change the Rx
> memory model. For Intel drivers, the AF_XDP Tx resources are configured
> during the load of Rx eBPF prog. We would have to develop some mechanism
> that detaches the creation of XDP Tx resources from loading Rx eBPF prog.
> There have been discussions around feature detection but I think it was
> about the opposite - don't configure Tx rings if your prog will not be
> doing XDP_TX action.

I guess I was sort of implicitly assuming that XDP Tx and Rx paths are
independent. Which is not the case. This is good to keep in mind while
coding.

I think it might be a worthwhile goal to allow the eBPF program to be
removed/replaced without affecting Tx - not sure how feasible that is
though.

Thanks for all the help!