xdp-newbies.vger.kernel.org archive mirror
* Running XSK on a tun device
From: Gilberto Bertin @ 2020-04-03  1:05 UTC
  To: xdp-newbies

I am trying to bind an XSK socket to a tun device, so that I can run some
automated tests on an XSK based server I'm working on. A tun device would in
fact allow me to have fine control over what packets I'm sending to and
receiving from the server (as opposed for example to an approach where the
server listens on a regular interface and tests interact with it over sockets).

The XSK logic of the server is largely based on the one presented in the
xdpsock_user sample in samples/bpf in the Linux kernel (the server is using the
XDP_USE_NEED_WAKEUP bind flag).

When I manually interact with it using a pair of veth devices and netcat,
everything works as expected: the server receives and then sends back packets
properly.

The troubles start when I try to bind it to a tun device as I am not able to move
any packet between the device and the server.

I then tried to reproduce the issue with a simplified setup based on the
xdpsock_user sample, and I got the same results (I tested different combinations
of options such as driver mode vs skb mode, poll vs non-poll mode, need-wakeup
vs no-need-wakeup, all with the same outcome).

Inspecting the behavior of the sample program more closely, I found that:

- packets are actually being received in the rx ring, as poll returns 1 each time
  something is written on the fd of the tun device
- the program gets stuck in rx_drop() [1], more precisely in:

	ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
	while (ret != rcvd) {
		if (ret < 0)
			exit_with_error(-ret);
		if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq))
			ret = poll(fds, num_socks, opt_timeout);
		ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
	}

  where xsk_ring_prod__reserve keeps returning 0.

I'm not sure why this is happening as most of the descriptors in the fill ring
should be available (especially since this exact same code works fine for other
devices like veth).

As I'm still getting acquainted with the codebase, it's not obvious to me where I
should start looking to understand the underlying cause of this issue,
so I'd really appreciate some help/pointers on this.

Cheers,
Gilberto

[1] https://github.com/torvalds/linux/blob/8ed47e140867a6e7d56170f325c8d4fdee6d6b66/samples/bpf/xdpsock_user.c#L873-L880


* Re: Running XSK on a tun device
From: Magnus Karlsson @ 2020-04-03 14:10 UTC
  To: Gilberto Bertin; +Cc: Xdp

On Fri, Apr 3, 2020 at 3:07 AM Gilberto Bertin <me@jibi.io> wrote:
>
> I am trying to bind an XSK socket to a tun device, so that I can run some
> automated tests on an XSK based server I'm working on. A tun device would in
> fact allow me to have fine control over what packets I'm sending to and
> receiving from the server (as opposed for example to an approach where the
> server listens on a regular interface and tests interact with it over sockets).
>
> The XSK logic of the server is largely based on the one presented in the
> xdpsock_user sample in samples/bpf in the Linux kernel (the server is using the
> XDP_USE_NEED_WAKEUP bind flag).
>
> When I manually interact with it using a pair of veth devices and netcat,
> everything works as expected: the server receives and then sends back packets
> properly.
>
> The troubles start when I try to bind it to a tun device as I am not able to move
> any packet between the device and the server.
>
> I then tried to reproduce the issue with a simplified setup based on the
> xdpsock_user sample, and I got the same results (I tested different combinations
> of options such as driver mode vs skb mode, poll vs non-poll mode, need-wakeup
> vs no-need-wakeup, all with the same outcome).
>
> Inspecting the behavior of the sample program more closely, I found that:
>
> - packets are actually being received in the rx ring, as poll returns 1 each time
>   something is written on the fd of the tun device
> - the program gets stuck in rx_drop() [1], more precisely in:
>
>         ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
>         while (ret != rcvd) {
>                 if (ret < 0)
>                         exit_with_error(-ret);
>                 if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq))
>                         ret = poll(fds, num_socks, opt_timeout);
>                 ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
>         }
>
>   where xsk_ring_prod__reserve keeps returning 0.

Which kernel version are you running? If my memory serves me right, in
versions prior to 5.6, the global state that signifies that there is
space available in the fill ring was updated in a lazy manner. If you
are not using the latest kernel, could you please try it? Maybe it could
give us some hints on what is going on.

Also have to say that the sample program is quite simplistic. If you
cannot reserve some entries in the fill ring at some point, you should
just go ahead and do something else (receive for example) and come
back later and try to do the same thing. It is not critical to always
be able to fill it again, even though it is good practice in a high
performance situation to keep it as full as possible to minimize the
risk of packet loss.
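
A minimal sketch of the "try once, otherwise come back later" pattern
described above. It uses a toy ring rather than the libbpf API: the
names toy_ring, toy_reserve, refill_state and try_refill are all
hypothetical, and the all-or-nothing reserve is an assumption modelled
on the behaviour reported in this thread (reserve returning 0 when the
request cannot be met in full).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a producer ring; NOT libbpf's struct xsk_ring_prod. */
struct toy_ring {
	uint32_t size; /* number of slots in the ring */
	uint32_t prod; /* free-running count of slots produced */
	uint32_t cons; /* free-running count of slots consumed */
};

/* All-or-nothing reserve (assumption): returns nb if nb slots are
 * free, otherwise returns 0 and leaves the ring untouched. */
static uint32_t toy_reserve(struct toy_ring *r, uint32_t nb)
{
	uint32_t free_slots = r->size - (r->prod - r->cons);

	if (free_slots < nb)
		return 0;
	r->prod += nb;
	return nb;
}

/* Frames the application still has to hand back to the fill ring. */
struct refill_state {
	uint32_t pending;
};

/* Try exactly once to hand back all pending frames; never spin.
 * Returns true if nothing is left pending after the call. */
static bool try_refill(struct toy_ring *fq, struct refill_state *st)
{
	if (st->pending == 0)
		return true;
	if (toy_reserve(fq, st->pending) != st->pending)
		return false; /* no room right now; retry on a later pass */
	st->pending = 0;
	return true;
}
```

In a receive loop, the caller would add the number of received frames
to st->pending, call try_refill() once per iteration, and carry on with
rx processing whether or not it succeeded.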

Note that there is no zero-copy support for TUN, but there is XDP
support, so copy mode and XDP_DRV should work. Also note that I have
never tried TUN with AF_XDP, so you may have stumbled upon something
new.

/Magnus

> I'm not sure why this is happening as most of the descriptors in the fill ring
> should be available (especially since this exact same code works fine for other
> devices like veth).
>
> As I'm still getting acquainted with the codebase, it's not obvious to me where I
> should start looking to understand the underlying cause of this issue,
> so I'd really appreciate some help/pointers on this.
>
> Cheers,
> Gilberto
>
> [1] https://github.com/torvalds/linux/blob/8ed47e140867a6e7d56170f325c8d4fdee6d6b66/samples/bpf/xdpsock_user.c#L873-L880


* Re: Running XSK on a tun device
From: Gilberto Bertin @ 2020-04-03 15:31 UTC
  To: Magnus Karlsson; +Cc: Xdp

On Fri, Apr 03, 2020 at 04:10:39PM +0200, Magnus Karlsson wrote:
> Which kernel version are you running? If my memory serves me right, in
> versions prior to 5.6, the global state that signifies that there is
> space available in the fill ring was updated in a lazy manner. If you
> are not using the latest kernel, could you please try it? Maybe it could
> give us some hints on what is going on.

I was using a 5.6.0 kernel. I just tested the latest 5.6.2 and I keep
experiencing the same behaviour.

> Also have to say that the sample program is quite simplistic. If you
> cannot reserve some entries in the fill ring at some point, you should
> just go ahead and do something else (receive for example) and come
> back later and try to do the same thing. It is not critical to always
> be able to fill it again, even though it is good practice in a high
> performance situation to keep it as full as possible to minimize the
> risk of packet loss.

Makes sense, thanks! I'll have a look at updating that logic.

I just tried another XSK example program (the one in the xdp-tutorial repo)
which does something slightly different than the xdpsock_user one: it uses
xsk_prod_nb_free() to (as I understand it) determine whether it's possible to reserve
and then submit descriptors in the fq ring [1]. With this logic I'm able to
receive packets successfully from the tun interface.
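
A rough sketch of the "ask how many slots are free, then reserve only
that many" idea, again on a toy ring and not the libbpf API. The names
toy_prod_ring, toy_nb_free and toy_refill are hypothetical, and the
cached-consumer layout is an assumption about how a single-producer
ring can avoid re-reading the shared consumer index on every call.

```c
#include <stdint.h>

/* Toy single-producer ring; NOT libbpf's struct xsk_ring_prod. */
struct toy_prod_ring {
	uint32_t size;        /* number of slots in the ring */
	uint32_t cached_prod; /* producer's private write index */
	uint32_t cached_cons; /* last seen consumer index, plus size */
	uint32_t consumer;    /* stands in for the index the kernel updates */
};

/* Number of free slots. The cached consumer index is only refreshed
 * when the cached view cannot satisfy a request for nb slots. */
static uint32_t toy_nb_free(struct toy_prod_ring *r, uint32_t nb)
{
	uint32_t free_entries = r->cached_cons - r->cached_prod;

	if (free_entries >= nb)
		return free_entries;

	/* Stale cache: pick up whatever the consumer has released. */
	r->cached_cons = r->consumer + r->size;
	return r->cached_cons - r->cached_prod;
}

/* Hand back up to `want` frames, capped at what currently fits.
 * Returns how many slots were actually taken (possibly 0). */
static uint32_t toy_refill(struct toy_prod_ring *r, uint32_t want)
{
	uint32_t n = toy_nb_free(r, want);

	if (n > want)
		n = want;
	r->cached_prod += n;
	return n;
}
```

Because the request is capped at the free count first, the caller never
waits for an all-or-nothing reserve to succeed; any frames that did not
fit can be offered again on the next pass.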

> Note that there is no zero-copy support for TUN, but there is XDP
> support, so copy mode and XDP_DRV should work.

Ack

> Also note that I have
> never tried TUN with AF_XDP, so you may have stumbled upon something
> new.

Makes sense. I'm not sure at this point if this should be considered a bug in the
tun driver or if other drivers may exhibit the same behaviour. In the latter case
I'd be happy to help update the sample if you think it's worth it
(once I understand a bit more how the whole thing works :P).

Next I'll check if the TX path works correctly with tun, and eventually follow
up with my findings.

Thanks,
Gilberto

[1] https://github.com/xdp-project/xdp-tutorial/blob/cc0f4fbaabff9c149e5981beb71f7b52f02a6391/advanced03-AF_XDP/af_xdp_user.c#L345-L368


* Re: Running XSK on a tun device
From: Magnus Karlsson @ 2020-04-03 15:35 UTC
  To: Gilberto Bertin; +Cc: Xdp

On Fri, Apr 3, 2020 at 5:31 PM Gilberto Bertin <me@jibi.io> wrote:
>
> On Fri, Apr 03, 2020 at 04:10:39PM +0200, Magnus Karlsson wrote:
> > Which kernel version are you running? If my memory serves me right, in
> > versions prior to 5.6, the global state that signifies that there is
> > space available in the fill ring was updated in a lazy manner. If you
> > are not using the latest kernel, could you please try it? Maybe it could
> > give us some hints on what is going on.
>
> I was using a 5.6.0 kernel. I just tested the latest 5.6.2 and I keep
> experiencing the same behaviour.
>
> > Also have to say that the sample program is quite simplistic. If you
> > cannot reserve some entries in the fill ring at some point, you should
> > just go ahead and do something else (receive for example) and come
> > back later and try to do the same thing. It is not critical to always
> > be able to fill it again, even though it is good practice in a high
> > performance situation to keep it as full as possible to minimize the
> > risk of packet loss.
>
> Makes sense, thanks! I'll have a look at updating that logic.
>
> I just tried another XSK example program (the one in the xdp-tutorial repo)
> which does something slightly different than the xdpsock_user one: it uses
> xsk_prod_nb_free() to (as I understand it) determine whether it's possible to reserve
> and then submit descriptors in the fq ring [1]. With this logic I'm able to
> receive packets successfully from the tun interface.

Good.

> > Note that there is no zero-copy support for TUN, but there is XDP
> > support, so copy mode and XDP_DRV should work.
>
> Ack
>
> > Also note that I have
> > never tried TUN with AF_XDP, so you may have stumbled upon something
> > new.
>
> Makes sense. I'm not sure at this point if this should be considered a bug in the
> tun driver or if other drivers may exhibit the same behaviour. In the latter case
> I'd be happy to help update the sample if you think it's worth it
> (once I understand a bit more how the whole thing works :P).

That is a good idea. Please do, so that the example gets better and
more relevant.

/Magnus

> Next I'll check if the TX path works correctly with tun, and eventually follow
> up with my findings.
>
> Thanks,
> Gilberto
>
> [1] https://github.com/xdp-project/xdp-tutorial/blob/cc0f4fbaabff9c149e5981beb71f7b52f02a6391/advanced03-AF_XDP/af_xdp_user.c#L345-L368
