* AF_XDP sendto kick returning EPERM
@ 2021-04-23 15:44 Srivats P
  2021-04-27  7:28 ` Magnus Karlsson
  0 siblings, 1 reply; 8+ messages in thread
From: Srivats P @ 2021-04-23 15:44 UTC (permalink / raw)
  To: Xdp

Hi,

I'm using sendto() to kick tx in my AF_XDP program after I submit
descriptors to the tx ring -

ret = sendto(xsk_socket__fd(xsk_), NULL, 0, MSG_DONTWAIT, NULL, 0);

However, I'm receiving EPERM as the return value every time. AFAIK
this is not an expected return value. Since this is with i40e, I
checked i40e_xsk_wakeup() - but that also doesn't return EPERM. I am
running as root and I don't see any problems with creating the xsk,
configuring umem etc.

Also, no packets seem to go out either.

# uname -a
Linux Ostinato-1 5.11.15-1-default #1 SMP Fri Apr 16 16:47:34 UTC 2021
(64fb5bf) x86_64 x86_64 x86_64 GNU/Linux

I don't see the problem on another machine with i40e but an older
kernel (5.4 series).

Any suggestions on what to look for or how to proceed?

Srivats


* Re: AF_XDP sendto kick returning EPERM
  2021-04-23 15:44 AF_XDP sendto kick returning EPERM Srivats P
@ 2021-04-27  7:28 ` Magnus Karlsson
  2021-04-29 15:47   ` Srivats P
  0 siblings, 1 reply; 8+ messages in thread
From: Magnus Karlsson @ 2021-04-27  7:28 UTC (permalink / raw)
  To: Srivats P; +Cc: Xdp

Weird. Have not seen this before. What is your command line for
xdpsock? Is it unmodified?

Using bpftrace, we can get the call stack of xsk_sendmsg. Somewhere in
this stack there must be an EPERM. You can run the same command on
your system, but use ftrace to see what a sendto call hits. Then see
where the code terminates.

mkarlsso@kurt:~/src/dna-linux$ sudo bpftrace -e 'kprobe:xsk_sendmsg {
@[kstack()] = count(); }'
Attaching 1 probe...
^C

@[
    xsk_sendmsg+1
    sock_sendmsg+94
    __sys_sendto+238
    __x64_sys_sendto+37
    do_syscall_64+51
    entry_SYSCALL_64_after_hwframe+68
]: 2244805



* Re: AF_XDP sendto kick returning EPERM
  2021-04-27  7:28 ` Magnus Karlsson
@ 2021-04-29 15:47   ` Srivats P
  2021-05-03  8:24     ` Magnus Karlsson
  0 siblings, 1 reply; 8+ messages in thread
From: Srivats P @ 2021-04-29 15:47 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: Xdp

On Tue, Apr 27, 2021 at 12:58 PM Magnus Karlsson
<magnus.karlsson@gmail.com> wrote:
> Weird. Have not seen this before. What is your command line for
> xdpsock? Is it unmodified?

This is not xdpsock, but my own AF_XDP program.

>
> Using bpftrace, we can get the call stack of xsk_sendmsg. Somewhere in
> this stack there must be an EPERM. You can run the same command on
> your system, but use ftrace to see what a sendto call hits. Then see
> where the code terminates.
>
> mkarlsso@kurt:~/src/dna-linux$ sudo bpftrace -e 'kprobe:xsk_sendmsg {
> @[kstack()] = count(); }'
> Attaching 1 probe...
> ^C
>
> @[
>     xsk_sendmsg+1
>     sock_sendmsg+94
>     __sys_sendto+238
>     __x64_sys_sendto+37
>     do_syscall_64+51
>     entry_SYSCALL_64_after_hwframe+68
> ]: 2244805

Ostinato-1:~ # bpftrace -e 'kprobe:xsk_sendmsg {
@[kstack()] = count(); }'
Attaching 1 probe...
^C

@[
    xsk_sendmsg+1
    sock_sendmsg+94
    __sys_sendto+238
    __x64_sys_sendto+37
    do_syscall_64+51
    entry_SYSCALL_64_after_hwframe+68
]: 1253307

Which doesn't seem to suggest any error - I've looked at the source
code for all these functions, but don't see any reference to EPERM.

Srivats


* Re: AF_XDP sendto kick returning EPERM
  2021-04-29 15:47   ` Srivats P
@ 2021-05-03  8:24     ` Magnus Karlsson
  2021-05-07 14:47       ` Srivats P
  0 siblings, 1 reply; 8+ messages in thread
From: Magnus Karlsson @ 2021-05-03  8:24 UTC (permalink / raw)
  To: Srivats P; +Cc: Xdp

On Thu, Apr 29, 2021 at 5:47 PM Srivats P <pstavirs@gmail.com> wrote:
> Which doesn't seem to suggest any error - I've looked at the source
> code for all these functions, but don't see any reference to EPERM.

It must be in there somewhere :-). Could you please use ftrace
(through perf for example) and trace all functions that a sendto hits
in your case? Then we might see what it hits.

Are you running in SKB mode or in zero-copy mode? Guess it is
zero-copy from your mail, but just want to verify. Does Rx work as
expected?

Could you share your AF_XDP program?

/Magnus


* Re: AF_XDP sendto kick returning EPERM
  2021-05-03  8:24     ` Magnus Karlsson
@ 2021-05-07 14:47       ` Srivats P
  2021-05-07 15:09         ` Srivats P
  0 siblings, 1 reply; 8+ messages in thread
From: Srivats P @ 2021-05-07 14:47 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: Xdp

On Mon, May 3, 2021 at 1:54 PM Magnus Karlsson
<magnus.karlsson@gmail.com> wrote:
> It must be in there somewhere :-). Could you please use ftrace
> (through perf for example) and trace all functions that a sendto hits
> in your case? Then we might see what it hits.
>
> Are you running in SKB mode or in zero-copy mode? Guess it is
> zero-copy from your mail, but just want to verify. Does Rx work as
> expected?
>
> Could you share your AF_XDP program?

After some experimentation and a lot of head-scratching, I found part
of the problem last night. The sendto() was not failing with EPERM
(-1 in kernel terms) but with ENXIO (-6): I was mistakenly printing
the return value of the sendto() call (which is always -1 on
failure) instead of errno (duh!).
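
To make the return-value/errno distinction concrete, here is a minimal
sketch of a kick helper. The name kick_tx() and the choice to return a
negative errno value are assumptions for illustration, not code from my
program; xsk_fd is assumed to be whatever xsk_socket__fd() returned.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Kick the kernel to process the tx ring of an AF_XDP socket.
 * Returns 0 on success or a negative errno value (e.g. -ENXIO) on failure.
 * Hypothetical helper: xsk_fd is assumed to come from xsk_socket__fd().
 */
static int kick_tx(int xsk_fd)
{
	ssize_t ret = sendto(xsk_fd, NULL, 0, MSG_DONTWAIT, NULL, 0);

	if (ret >= 0)
		return 0;

	/* ret is always -1 here; the actual reason is only in errno. */
	int err = errno; /* save it before any other library call */

	fprintf(stderr, "tx kick failed: ret=%zd errno=%d (%s)\n",
		ret, err, strerror(err));
	return -err;
}
```

Printing ret alone shows -1 for every failure, which is easy to misread
as -EPERM because EPERM happens to be errno 1.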

Looking at the code, I see ENXIO is returned if the xsk is unbound.
I'm still investigating this and will post an update soon. The problem
is happening at a customer's site, so getting the logs involves some
delay and follow-up.

Srivats


* Re: AF_XDP sendto kick returning EPERM
  2021-05-07 14:47       ` Srivats P
@ 2021-05-07 15:09         ` Srivats P
  2021-05-09 15:41           ` Maciej Fijalkowski
  0 siblings, 1 reply; 8+ messages in thread
From: Srivats P @ 2021-05-07 15:09 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: Xdp

Here's an update -

sendto() was returning ENXIO because the interface MTU was set to 9000
which I know is not supported with AF_XDP. But shouldn't
xsk_socket__create() fail in this case? Note the actual packet being
transmitted was 64 bytes.

Not sure if it has a role in the above sendto() failure, but before
creating the xsk socket, my call to bpf_set_link_xdp_fd() was failing
because of the MTU problem (the newly added error message for this
case was very helpful!). Once the MTU was reduced to 1500, both
failures went away: attaching the RX eBPF program to the interface,
and the TX sendto() that always returned ENXIO. This was on kernel
version 5.12.

Can someone tell me what is expected to happen for a Tx AF_XDP socket
in case of MTU > 4K?
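
Since the MTU problem only surfaced indirectly, a pre-flight check
before attaching the program or creating the socket may help. The
following is a minimal sketch: get_if_mtu() and mtu_ok_for_xdp() are
hypothetical helper names, and max_xdp_mtu is a limit the caller has to
supply, because the real limit depends on the driver and kernel and is
not established in this thread (only that 9000 failed and 1500 worked).

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return the MTU of ifname, or a negative errno value on failure. */
static int get_if_mtu(const char *ifname)
{
	struct ifreq ifr;
	int fd, err;

	if (strlen(ifname) >= IFNAMSIZ)
		return -ENAMETOOLONG;

	/* Any socket works as a handle for interface ioctls. */
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -errno;

	memset(&ifr, 0, sizeof(ifr));
	strcpy(ifr.ifr_name, ifname); /* length checked above */

	if (ioctl(fd, SIOCGIFMTU, &ifr) < 0) {
		err = errno;
		close(fd);
		return -err;
	}
	close(fd);
	return ifr.ifr_mtu;
}

/*
 * Returns 1 if the interface MTU is within max_xdp_mtu, 0 if it is
 * larger, or a negative errno value if the MTU could not be read.
 * max_xdp_mtu is an assumption supplied by the caller.
 */
static int mtu_ok_for_xdp(const char *ifname, int max_xdp_mtu)
{
	int mtu = get_if_mtu(ifname);

	if (mtu < 0)
		return mtu;
	return mtu <= max_xdp_mtu;
}
```

Calling mtu_ok_for_xdp(ifname, 1500) before bpf_set_link_xdp_fd() would
have flagged the MTU 9000 setup up front in my case.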

I also found a second case of sendto() returning ENXIO. In this
scenario, I was removing my RX eBPF program by calling

    bpf_set_link_xdp_fd(ifIndex, -1, 0)

while AF_XDP transmit (and the associated sendto() wakeup) was still
going on. In this case, sendto() first fails with ENETDOWN for some
time and then with ENXIO. This case was on kernel version 5.4.0.

Does removing an XDP program cause the interface to go down (ENETDOWN),
leading to an XDP socket unbind (ENXIO)? Should removing (or replacing)
an RX eBPF program affect AF_XDP TX?
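
For reference, the errno values seen so far could be sorted into
actions roughly as below. This is only a sketch of one possible policy:
the enum and classify_kick() are hypothetical names, ENXIO is mapped to
"socket unbound" based on my reading of the code above, and treating
EAGAIN, EBUSY, ENOBUFS and ENETDOWN as transient is an assumption
modelled on what the xdpsock sample appears to tolerate in its own kick
helper.

```c
#include <errno.h>

/* What the caller should do after a tx kick. Hypothetical policy. */
enum kick_action {
	KICK_OK,     /* kick accepted */
	KICK_RETRY,  /* assumed transient: retry the kick later */
	KICK_REBIND, /* socket is no longer bound: recreate it */
	KICK_FATAL,  /* anything else: report and stop */
};

/* err is the positive errno value saved right after sendto() failed,
 * or 0 if sendto() succeeded. */
static enum kick_action classify_kick(int err)
{
	switch (err) {
	case 0:
		return KICK_OK;
	case EAGAIN:
	case EBUSY:
	case ENOBUFS:
	case ENETDOWN: /* interface is (temporarily) down */
		return KICK_RETRY;
	case ENXIO:    /* xsk is unbound */
		return KICK_REBIND;
	default:
		return KICK_FATAL;
	}
}
```

With this split, the ENETDOWN phase would be retried, while the later
ENXIO would be reported as a socket that has to be set up again.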

Srivats


* Re: AF_XDP sendto kick returning EPERM
  2021-05-07 15:09         ` Srivats P
@ 2021-05-09 15:41           ` Maciej Fijalkowski
  2021-05-11 12:02             ` Srivats P
  0 siblings, 1 reply; 8+ messages in thread
From: Maciej Fijalkowski @ 2021-05-09 15:41 UTC (permalink / raw)
  To: Srivats P; +Cc: Magnus Karlsson, Xdp

On Fri, May 07, 2021 at 08:39:04PM +0530, Srivats P wrote:
> > > Could you share your AF_XDP program?

+1, that would help us probably :)

> sendto() was returning ENXIO because the interface MTU was set to 9000
> which I know is not supported with AF_XDP. But shouldn't
> xsk_socket__create() fail in this case? Note the actual packet being
> transmitted was 64 bytes.

It depends. You said that you have your own AF_XDP app, so if you're
setting the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD flag, then libbpf won't
load the built-in AF_XDP eBPF prog on the interface, and that load is
where the failure would happen.

> 
> Not sure if it has a role in the above sendto() failure, but before
> xsk socket create, my call to bpf_set_link_xdp_fd() was failing
> because of the MTU problem (the newly added error message for this
> case was very helpful!). Once MTU was reduced to 1500 both the RX eBPF
> program link to the interface failure and the TX sendto() returning
> ENXIO always went away. Kernel version 5.12
> 
> Can someone tell me what is expected to happen for a Tx AF_XDP socket
> in case of MTU > 4K?

See the last paragraph.

> 
> I also found a second case of sendto() returning ENXIO. In this
> scenario, I was removing my RX eBPF program by calling
> 
>     bpf_set_link_xdp_fd(ifIndex, -1, 0)
> 
> while AF_XDP transmit (and associated sendto() wakeup) was still going
> on. In this case, sendto starts failing with ENETDOWN for some time
> followed by ENXIO subsequently. This case was on Kernel version 5.4.0

I think that we addressed the ENETDOWN Tx issue with the following set:
https://lore.kernel.org/netdev/20200205045834.56795-1-maciej.fijalkowski@intel.com/

I see that it has been merged in 5.6. But it was related to being unable
to spawn multiple AF_XDP Tx-only instances. With what you're saying it
feels to me that you have multiple instances of your AF_XDP progs and you
terminate one of them? Previously, every instance would die due to the
fact that the underlying XDP prog would be unloaded from interface, but
right now we have bpf_link support for AF_XDP which would handle that
properly. Note that it was developed for the built-in prog.

> 
> Does removing an XDP program cause the interface to go down (ENETDOWN)
> leading to XDP socket unbind (ENXIO)? Should removing (or replacing)
> an RX eBPF program, affect AF_XDP TX?

Removing the XDP prog causes the interface to undergo a reset (or some
similar mechanism), as it needs to remove the XDP Tx resources and change
the Rx memory model. For Intel drivers, the AF_XDP Tx resources are
configured during the load of the Rx eBPF prog. We would have to develop
some mechanism that decouples the creation of XDP Tx resources from
loading the Rx eBPF prog.
There have been discussions around feature detection but I think it was
about the opposite - don't configure Tx rings if your prog will not be
doing XDP_TX action.


* Re: AF_XDP sendto kick returning EPERM
  2021-05-09 15:41           ` Maciej Fijalkowski
@ 2021-05-11 12:02             ` Srivats P
  0 siblings, 0 replies; 8+ messages in thread
From: Srivats P @ 2021-05-11 12:02 UTC (permalink / raw)
  To: Maciej Fijalkowski; +Cc: Magnus Karlsson, Xdp

On Sun, May 9, 2021 at 9:24 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
> > > > Could you share your AF_XDP program?
>
> +1, that would help us probably :)

The code is proprietary, but if required I can extract relevant bits
into a sample program or modify the sample xdpsock_user.c suitably.

> > sendto() was returning ENXIO because the interface MTU was set to 9000
> > which I know is not supported with AF_XDP. But shouldn't
> > xsk_socket__create() fail in this case? Note the actual packet being
> > transmitted was 64 bytes.
>
> It depends. You said that you have your own AF_XDP app, so if you're
> setting the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD flag then libbpf wouldn't
> be loading the built-in AF_XDP eBPF prog on interface and that's where the
> failure should happen.

I used AF_XDP for TX only with my own eBPF program for RX. For this
reason, I was using INHIBIT_PROG_LOAD while opening the xsk. That's
why I didn't see an error while creating the xsk.

> > I also found a second case of sendto() returning ENXIO. In this
> > scenario, I was removing my RX eBPF program by calling
> >
> >     bpf_set_link_xdp_fd(ifIndex, -1, 0)
> >
> > while AF_XDP transmit (and associated sendto() wakeup) was still going
> > on. In this case, sendto starts failing with ENETDOWN for some time
> > followed by ENXIO subsequently. This case was on Kernel version 5.4.0
>
> I think that we addressed the ENETDOWN Tx issue with the following set:
> https://lore.kernel.org/netdev/20200205045834.56795-1-maciej.fijalkowski@intel.com/
>
> I see that it has been merged in 5.6. But it was related to being unable
> to spawn multiple AF_XDP Tx-only instances. With what you're saying it
> feels to me that you have multiple instances of your AF_XDP progs and you
> terminate one of them? Previously, every instance would die due to the
> fact that the underlying XDP prog would be unloaded from interface, but
> right now we have bpf_link support for AF_XDP which would handle that
> properly. Note that it was developed for the built-in prog.

I think my case is different. I have only one AF_XDP Tx-only instance,
but I'm not using the built-in AF_XDP eBPF program. So when I remove
my eBPF program the AF_XDP Tx also gets affected. I solved my problem
by cleaning up the AF_XDP Tx first before removing my custom eBPF Rx
program.
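
The ordering I ended up with can be sketched as below. Everything here
is illustrative: struct xdp_teardown_ops and xdp_teardown() are
hypothetical names, and the hooks stand in for the real calls (for
example xsk_socket__delete() for the socket and
bpf_set_link_xdp_fd(ifIndex, -1, 0) for the Rx program). The demo hooks
only record the call order so the sequence can be checked without a
NIC.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical hooks; each returns 0 or a negative errno value. */
struct xdp_teardown_ops {
	int (*stop_tx)(void *ctx);        /* stop filling the tx ring and kicking */
	int (*delete_xsk)(void *ctx);     /* e.g. xsk_socket__delete() */
	int (*detach_rx_prog)(void *ctx); /* e.g. bpf_set_link_xdp_fd(ifIndex, -1, 0) */
};

/* Tear down the Tx side first, then remove the Rx eBPF program. */
static int xdp_teardown(const struct xdp_teardown_ops *ops, void *ctx)
{
	int ret;

	ret = ops->stop_tx(ctx);
	if (ret < 0)
		return ret;

	ret = ops->delete_xsk(ctx);
	if (ret < 0)
		return ret;

	/* Detach the Rx program last, so that any driver reset it
	 * triggers cannot hit a live Tx socket. */
	return ops->detach_rx_prog(ctx);
}

/* Demo hooks that only record the order in which they were called. */
struct order_log {
	char steps[4];
	size_t n;
};

static int record_step(void *ctx, char step)
{
	struct order_log *log = ctx;

	if (log->n >= sizeof(log->steps) - 1)
		return -ENOSPC;
	log->steps[log->n++] = step;
	return 0;
}

static int demo_stop_tx(void *ctx)        { return record_step(ctx, 'T'); }
static int demo_delete_xsk(void *ctx)     { return record_step(ctx, 'X'); }
static int demo_detach_rx_prog(void *ctx) { return record_step(ctx, 'D'); }
```

In a real program the three hooks would wrap the actual libbpf calls;
the point is only that the detach step comes last.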

>
> >
> > Does removing an XDP program cause the interface to go down (ENETDOWN)
> > leading to XDP socket unbind (ENXIO)? Should removing (or replacing)
> > an RX eBPF program, affect AF_XDP TX?
>
> Removing XDP prog causes the interface to undergo the reset or some other
> mechanism as it needs to remove the XDP Tx resources and change the Rx
> memory model. For Intel drivers, the AF_XDP Tx resources are configured
> during the load of Rx eBPF prog. We would have to develop some mechanism
> that detaches the creation of XDP Tx resources from loading Rx eBPF prog.
> There have been discussions around feature detection but I think it was
> about the opposite - don't configure Tx rings if your prog will not be
> doing XDP_TX action.

I guess I was implicitly assuming that the XDP Tx and Rx paths are
independent, which is not the case. This is good to keep in mind while
coding.

I think it might be a worthwhile goal to allow the eBPF program to be
removed/replaced without affecting Tx - not sure how feasible that is
though.

Thanks for all the help!

