xdp-newbies.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
@ 2019-08-07 12:49 Kal Cutter Conley
  2019-09-01 16:47 ` Kal Cutter Conley
  0 siblings, 1 reply; 11+ messages in thread
From: Kal Cutter Conley @ 2019-08-07 12:49 UTC (permalink / raw)
  To: xdp-newbies

Hello,
I am testing the mlx5e driver with AF_XDP. When I specify
XDP_ZEROCOPY, bind() always returns EINVAL. I observe the same problem
with the xdpsock sample:

sudo samples/bpf/xdpsock -r -i dcb1-port1 -z
samples/bpf/xdpsock_user.c:xsk_configure_socket:322: errno:
22/"Invalid argument"

Without XDP_ZEROCOPY, everything works as expected. Is this a known
issue/limitation? I expected this to be supported looking at the
code/commit history.

Thanks,
Kal

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2019-08-07 12:49 net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY Kal Cutter Conley
@ 2019-09-01 16:47 ` Kal Cutter Conley
  2019-09-02  9:08   ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 11+ messages in thread
From: Kal Cutter Conley @ 2019-09-01 16:47 UTC (permalink / raw)
  To: xdp-newbies

Hi,
I figured out the problem. Let me document the issue here for others
and hopefully start a discussion.

The mlx5 driver uses special queue ids for ZC. If N is the number of
configured queues, then for XDP_ZEROCOPY the queue ids start at N. So
queue ids [0..N) can only be used with XDP_COPY and queue ids [N..2N)
can only be used with XDP_ZEROCOPY.

sudo ethtool -L eth0 combined 16
sudo samples/bpf/xdpsock -r -i eth0 -c -q 0   # OK
sudo samples/bpf/xdpsock -r -i eth0 -z -q 0   # ERROR
sudo samples/bpf/xdpsock -r -i eth0 -c -q 16  # ERROR
sudo samples/bpf/xdpsock -r -i eth0 -z -q 16  # OK

Why was this done? To use zerocopy if available and fallback on copy
mode normally you would set sxdp_flags=0. However, here this is no
longer possible. To support this driver, you have to first try binding
with XDP_ZEROCOPY and the special queue id, then if that fails, you
have to try binding again with a normal queue id. Peculiarities like
this complicate the XDP user api. Maybe someone can explain the
benefits?

Kal


On Wed, Aug 7, 2019 at 2:49 PM Kal Cutter Conley <kal.conley@dectris.com> wrote:
>
> Hello,
> I am testing the mlx5e driver with AF_XDP. When I specify
> XDP_ZEROCOPY, bind() always returns EINVAL. I observe the same problem
> with the xdpsock sample:
>
> sudo samples/bpf/xdpsock -r -i dcb1-port1 -z
> samples/bpf/xdpsock_user.c:xsk_configure_socket:322: errno:
> 22/"Invalid argument"
>
> Without XDP_ZEROCOPY, everything works as expected. Is this a known
> issue/limitation? I expected this to be supported looking at the
> code/commit history.
>
> Thanks,
> Kal

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2019-09-01 16:47 ` Kal Cutter Conley
@ 2019-09-02  9:08   ` Jesper Dangaard Brouer
  2019-09-03 20:19     ` Saeed Mahameed
  0 siblings, 1 reply; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2019-09-02  9:08 UTC (permalink / raw)
  To: Kal Cutter Conley
  Cc: brouer, Karlsson, Magnus, Björn Töpel,
	Maxim Mikityanskiy, Jakub Kicinski,
	Toke Høiland-Jørgensen, Andy Gospodarek, xdp-newbies,
	Saeed Mahameed, Tariq Toukan, netdev

On Sun, 1 Sep 2019 18:47:15 +0200
Kal Cutter Conley <kal.conley@dectris.com> wrote:

> Hi,
> I figured out the problem. Let me document the issue here for others
> and hopefully start a discussion.
> 
> The mlx5 driver uses special queue ids for ZC. If N is the number of
> configured queues, then for XDP_ZEROCOPY the queue ids start at N. So
> queue ids [0..N) can only be used with XDP_COPY and queue ids [N..2N)
> can only be used with XDP_ZEROCOPY.

Thanks for the followup and explanation on how mlx5 AF_XDP queue
implementation is different from other vendors.


> sudo ethtool -L eth0 combined 16
> sudo samples/bpf/xdpsock -r -i eth0 -c -q 0   # OK
> sudo samples/bpf/xdpsock -r -i eth0 -z -q 0   # ERROR
> sudo samples/bpf/xdpsock -r -i eth0 -c -q 16  # ERROR
> sudo samples/bpf/xdpsock -r -i eth0 -z -q 16  # OK
> 
> Why was this done? To use zerocopy if available and fallback on copy
> mode normally you would set sxdp_flags=0. However, here this is no
> longer possible. To support this driver, you have to first try binding
> with XDP_ZEROCOPY and the special queue id, then if that fails, you
> have to try binding again with a normal queue id. Peculiarities like
> this complicate the XDP user api. Maybe someone can explain the
> benefits?

Thanks for complaining, it is actually valuable. It really illustrate
the kernel need to improve in this area, which is what our talk[1] at
LPC2019 (Sep 10) is about.

Title: "Making Networking Queues a First Class Citizen in the Kernel"
 [1] https://linuxplumbersconf.org/event/4/contributions/462/

As you can see, several vendors are actually involved. Kudos to Magnus
for taking initiative here!  It's unfortunately not solved "tomorrow",
as first we have to agree this is needed (facility to register queues),
then agree on API and get commitment from vendors, as this requires
drivers changes.  There is a long road ahead, but I think it will be
worthwhile in the end, as effective use of dedicated hardware queues
(both RX and TX) is key to performance.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



> On Wed, Aug 7, 2019 at 2:49 PM Kal Cutter Conley <kal.conley@dectris.com> wrote:
> >
> > Hello,
> > I am testing the mlx5e driver with AF_XDP. When I specify
> > XDP_ZEROCOPY, bind() always returns EINVAL. I observe the same problem
> > with the xdpsock sample:
> >
> > sudo samples/bpf/xdpsock -r -i dcb1-port1 -z
> > samples/bpf/xdpsock_user.c:xsk_configure_socket:322: errno:
> > 22/"Invalid argument"
> >
> > Without XDP_ZEROCOPY, everything works as expected. Is this a known
> > issue/limitation? I expected this to be supported looking at the
> > code/commit history.
> >
> > Thanks,
> > Kal  

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2019-09-02  9:08   ` Jesper Dangaard Brouer
@ 2019-09-03 20:19     ` Saeed Mahameed
  2020-06-14  8:55       ` Kal Cutter Conley
  0 siblings, 1 reply; 11+ messages in thread
From: Saeed Mahameed @ 2019-09-03 20:19 UTC (permalink / raw)
  To: kal.conley, brouer
  Cc: Maxim Mikityanskiy, magnus.karlsson, toke.hoiland-jorgensen,
	xdp-newbies, Tariq Toukan, gospo, jakub.kicinski, netdev,
	bjorn.topel

On Mon, 2019-09-02 at 11:08 +0200, Jesper Dangaard Brouer wrote:
> On Sun, 1 Sep 2019 18:47:15 +0200
> Kal Cutter Conley <kal.conley@dectris.com> wrote:
> 
> > Hi,
> > I figured out the problem. Let me document the issue here for
> > others
> > and hopefully start a discussion.
> > 
> > The mlx5 driver uses special queue ids for ZC. If N is the number
> > of
> > configured queues, then for XDP_ZEROCOPY the queue ids start at N.
> > So
> > queue ids [0..N) can only be used with XDP_COPY and queue ids
> > [N..2N)
> > can only be used with XDP_ZEROCOPY.
> 
> Thanks for the followup and explanation on how mlx5 AF_XDP queue
> implementation is different from other vendors.
> 
> 
> > sudo ethtool -L eth0 combined 16
> > sudo samples/bpf/xdpsock -r -i eth0 -c -q 0   # OK
> > sudo samples/bpf/xdpsock -r -i eth0 -z -q 0   # ERROR
> > sudo samples/bpf/xdpsock -r -i eth0 -c -q 16  # ERROR
> > sudo samples/bpf/xdpsock -r -i eth0 -z -q 16  # OK
> > 
> > Why was this done? To use zerocopy if available and fallback on
> > copy
> > mode normally you would set sxdp_flags=0. However, here this is no
> > longer possible. To support this driver, you have to first try
> > binding
> > with XDP_ZEROCOPY and the special queue id, then if that fails, you
> > have to try binding again with a normal queue id. Peculiarities
> > like
> > this complicate the XDP user api. Maybe someone can explain the
> > benefits?
> 

in mlx5 we like to keep full functional separation between different
queues. Unlike other implementations in mlx5 kernel standard rx rings
can still function while xsk queues are opened. from user perspective
this should be very simple and very usefull:

queues 0..(N-1): can't be used for XSK ZC since they are standard RX
queues managed by kernel  and driver
queues N..(2N-1): Are XSK user app managed queues, they can't be used
for anything else.

benefits:
- RSS is not interrupted, Ongoing traffic and Current RX queues keeps
going normally when XSK apps are activated/deactivated on the fly.
- Well-defined full logical separation between different types of RX
queue.

as Jesper explained we understand the confusion, and we will come up
with a solution the fits all vendors.

> Thanks for complaining, it is actually valuable. It really illustrate
> the kernel need to improve in this area, which is what our talk[1] at
> LPC2019 (Sep 10) is about.
> 
> Title: "Making Networking Queues a First Class Citizen in the Kernel"
>  [1] https://linuxplumbersconf.org/event/4/contributions/462/
> 
> As you can see, several vendors are actually involved. Kudos to
> Magnus
> for taking initiative here!  It's unfortunately not solved
> "tomorrow",
> as first we have to agree this is needed (facility to register
> queues),
> then agree on API and get commitment from vendors, as this requires
> drivers changes.  There is a long road ahead, but I think it will be
> worthwhile in the end, as effective use of dedicated hardware queues
> (both RX and TX) is key to performance.
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2019-09-03 20:19     ` Saeed Mahameed
@ 2020-06-14  8:55       ` Kal Cutter Conley
  2020-06-18 15:23         ` Jonathan Lemon
  0 siblings, 1 reply; 11+ messages in thread
From: Kal Cutter Conley @ 2020-06-14  8:55 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: brouer, Maxim Mikityanskiy, magnus.karlsson,
	toke.hoiland-jorgensen, xdp-newbies, Tariq Toukan, gospo,
	jakub.kicinski, netdev, bjorn.topel

Hi Saeed,
Thanks for explaining the reasoning behind the special mlx5 queue
numbering with XDP zerocopy.

We have a process using AF_XDP that also shares the network interface
with other processes on the system. ethtool rx flow classification
rules are used to route the traffic to the appropriate XSK queue
N..(2N-1). The issue is these queues are only valid as long they are
active (as far as I can tell). This means if my AF_XDP process dies
other processes no longer receive ingress traffic routed over queues
N..(2N-1) even though my XDP program is still loaded and would happily
always return XDP_PASS. Other drivers do not have this usability issue
because they use queues that are always valid. Is there a simple
workaround for this issue? It seems to me queues N..(2N-1) should
simply map to 0..(N-1) when they are not active?

Kal


On Tue, Sep 3, 2019 at 10:19 PM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> On Mon, 2019-09-02 at 11:08 +0200, Jesper Dangaard Brouer wrote:
> > On Sun, 1 Sep 2019 18:47:15 +0200
> > Kal Cutter Conley <kal.conley@dectris.com> wrote:
> >
> > > Hi,
> > > I figured out the problem. Let me document the issue here for
> > > others
> > > and hopefully start a discussion.
> > >
> > > The mlx5 driver uses special queue ids for ZC. If N is the number
> > > of
> > > configured queues, then for XDP_ZEROCOPY the queue ids start at N.
> > > So
> > > queue ids [0..N) can only be used with XDP_COPY and queue ids
> > > [N..2N)
> > > can only be used with XDP_ZEROCOPY.
> >
> > Thanks for the followup and explanation on how mlx5 AF_XDP queue
> > implementation is different from other vendors.
> >
> >
> > > sudo ethtool -L eth0 combined 16
> > > sudo samples/bpf/xdpsock -r -i eth0 -c -q 0   # OK
> > > sudo samples/bpf/xdpsock -r -i eth0 -z -q 0   # ERROR
> > > sudo samples/bpf/xdpsock -r -i eth0 -c -q 16  # ERROR
> > > sudo samples/bpf/xdpsock -r -i eth0 -z -q 16  # OK
> > >
> > > Why was this done? To use zerocopy if available and fallback on
> > > copy
> > > mode normally you would set sxdp_flags=0. However, here this is no
> > > longer possible. To support this driver, you have to first try
> > > binding
> > > with XDP_ZEROCOPY and the special queue id, then if that fails, you
> > > have to try binding again with a normal queue id. Peculiarities
> > > like
> > > this complicate the XDP user api. Maybe someone can explain the
> > > benefits?
> >
>
> in mlx5 we like to keep full functional separation between different
> queues. Unlike other implementations in mlx5 kernel standard rx rings
> can still function while xsk queues are opened. from user perspective
> this should be very simple and very usefull:
>
> queues 0..(N-1): can't be used for XSK ZC since they are standard RX
> queues managed by kernel  and driver
> queues N..(2N-1): Are XSK user app managed queues, they can't be used
> for anything else.
>
> benefits:
> - RSS is not interrupted, Ongoing traffic and Current RX queues keeps
> going normally when XSK apps are activated/deactivated on the fly.
> - Well-defined full logical separation between different types of RX
> queue.
>
> as Jesper explained we understand the confusion, and we will come up
> with a solution the fits all vendors.
>
> > Thanks for complaining, it is actually valuable. It really illustrate
> > the kernel need to improve in this area, which is what our talk[1] at
> > LPC2019 (Sep 10) is about.
> >
> > Title: "Making Networking Queues a First Class Citizen in the Kernel"
> >  [1] https://linuxplumbersconf.org/event/4/contributions/462/
> >
> > As you can see, several vendors are actually involved. Kudos to
> > Magnus
> > for taking initiative here!  It's unfortunately not solved
> > "tomorrow",
> > as first we have to agree this is needed (facility to register
> > queues),
> > then agree on API and get commitment from vendors, as this requires
> > drivers changes.  There is a long road ahead, but I think it will be
> > worthwhile in the end, as effective use of dedicated hardware queues
> > (both RX and TX) is key to performance.
> >

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2020-06-14  8:55       ` Kal Cutter Conley
@ 2020-06-18 15:23         ` Jonathan Lemon
  2020-06-18 17:31           ` Kal Cutter Conley
  2020-06-20 10:42           ` Kal Cutter Conley
  0 siblings, 2 replies; 11+ messages in thread
From: Jonathan Lemon @ 2020-06-18 15:23 UTC (permalink / raw)
  To: Kal Cutter Conley
  Cc: Saeed Mahameed, brouer, Maxim Mikityanskiy, magnus.karlsson,
	toke.hoiland-jorgensen, xdp-newbies, Tariq Toukan, gospo,
	jakub.kicinski, netdev, bjorn.topel

On Sun, Jun 14, 2020 at 10:55:30AM +0200, Kal Cutter Conley wrote:
> Hi Saeed,
> Thanks for explaining the reasoning behind the special mlx5 queue
> numbering with XDP zerocopy.
> 
> We have a process using AF_XDP that also shares the network interface
> with other processes on the system. ethtool rx flow classification
> rules are used to route the traffic to the appropriate XSK queue
> N..(2N-1). The issue is these queues are only valid as long they are
> active (as far as I can tell). This means if my AF_XDP process dies
> other processes no longer receive ingress traffic routed over queues
> N..(2N-1) even though my XDP program is still loaded and would happily
> always return XDP_PASS. Other drivers do not have this usability issue
> because they use queues that are always valid. Is there a simple
> workaround for this issue? It seems to me queues N..(2N-1) should
> simply map to 0..(N-1) when they are not active?

If your XDP program returns XDP_PASS, the packet should be delivered to
the xsk socket.  If the application isn't running, where would it go?

I do agree that the usability of this can be improved.  What if the flow
rules are inserted and removed along with queue creatioin/destruction?
-- 
Jonathan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2020-06-18 15:23         ` Jonathan Lemon
@ 2020-06-18 17:31           ` Kal Cutter Conley
  2020-06-20 10:42           ` Kal Cutter Conley
  1 sibling, 0 replies; 11+ messages in thread
From: Kal Cutter Conley @ 2020-06-18 17:31 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Saeed Mahameed, brouer, Maxim Mikityanskiy, magnus.karlsson,
	toke.hoiland-jorgensen, xdp-newbies, Tariq Toukan, gospo,
	jakub.kicinski, netdev, bjorn.topel

On Thu, Jun 18, 2020 at 5:23 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
> On Sun, Jun 14, 2020 at 10:55:30AM +0200, Kal Cutter Conley wrote:
> > Hi Saeed,
> > Thanks for explaining the reasoning behind the special mlx5 queue
> > numbering with XDP zerocopy.
> >
> > We have a process using AF_XDP that also shares the network interface
> > with other processes on the system. ethtool rx flow classification
> > rules are used to route the traffic to the appropriate XSK queue
> > N..(2N-1). The issue is these queues are only valid as long they are
> > active (as far as I can tell). This means if my AF_XDP process dies
> > other processes no longer receive ingress traffic routed over queues
> > N..(2N-1) even though my XDP program is still loaded and would happily
> > always return XDP_PASS. Other drivers do not have this usability issue
> > because they use queues that are always valid. Is there a simple
> > workaround for this issue? It seems to me queues N..(2N-1) should
> > simply map to 0..(N-1) when they are not active?
>
> If your XDP program returns XDP_PASS, the packet should be delivered to
> the xsk socket.  If the application isn't running, where would it go?

XDP_PASS means the packet is passed to the normal network stack for
processing. XDP_REDIRECT means the packet should be delivered to the
xsk socket.

>
> I do agree that the usability of this can be improved.  What if the flow
> rules are inserted and removed along with queue creatioin/destruction?

The problem is the mlx5 driver allows flow rules to be set on
N..(2N-1) at any time; even when no XDP program is loaded. Given this
fact, it would be totally weird if they just suddenly disappeared the
first time the queues go inactive. That's why I suggested that they
just always map to queues 0..(N-1) when they are not active. This way,
at least it's less surprising. What do people think?

> Jonathan

Kal

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2020-06-18 15:23         ` Jonathan Lemon
  2020-06-18 17:31           ` Kal Cutter Conley
@ 2020-06-20 10:42           ` Kal Cutter Conley
  2020-06-20 18:42             ` Jonathan Lemon
  1 sibling, 1 reply; 11+ messages in thread
From: Kal Cutter Conley @ 2020-06-20 10:42 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Saeed Mahameed, brouer, Maxim Mikityanskiy, magnus.karlsson,
	toke.hoiland-jorgensen, xdp-newbies, Tariq Toukan, gospo, netdev,
	bjorn.topel

On Thu, Jun 18, 2020 at 5:23 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
> On Sun, Jun 14, 2020 at 10:55:30AM +0200, Kal Cutter Conley wrote:
> > Hi Saeed,
> > Thanks for explaining the reasoning behind the special mlx5 queue
> > numbering with XDP zerocopy.
> >
> > We have a process using AF_XDP that also shares the network interface
> > with other processes on the system. ethtool rx flow classification
> > rules are used to route the traffic to the appropriate XSK queue
> > N..(2N-1). The issue is these queues are only valid as long they are
> > active (as far as I can tell). This means if my AF_XDP process dies
> > other processes no longer receive ingress traffic routed over queues
> > N..(2N-1) even though my XDP program is still loaded and would happily
> > always return XDP_PASS. Other drivers do not have this usability issue
> > because they use queues that are always valid. Is there a simple
> > workaround for this issue? It seems to me queues N..(2N-1) should
> > simply map to 0..(N-1) when they are not active?
>
> If your XDP program returns XDP_PASS, the packet should be delivered to
> the xsk socket.  If the application isn't running, where would it go?
>
> I do agree that the usability of this can be improved.  What if the flow
> rules are inserted and removed along with queue creatioin/destruction?

I think I misunderstood your suggestion here. Do you mean the rules
should be inserted / removed on the hardware level but still show in
ethtool even if they are not active in the hardware? In this case the
rules always occupy a "location" but just never apply if the
respective queues are not "enabled". I think this would be the best
possible solution.

> --
> Jonathan

Kal

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2020-06-20 10:42           ` Kal Cutter Conley
@ 2020-06-20 18:42             ` Jonathan Lemon
  2020-06-21 10:03               ` Kal Cutter Conley
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Lemon @ 2020-06-20 18:42 UTC (permalink / raw)
  To: Kal Cutter Conley
  Cc: Saeed Mahameed, brouer, Maxim Mikityanskiy, magnus.karlsson,
	toke.hoiland-jorgensen, xdp-newbies, Tariq Toukan, gospo, netdev,
	bjorn.topel

On Sat, Jun 20, 2020 at 12:42:36PM +0200, Kal Cutter Conley wrote:
> On Thu, Jun 18, 2020 at 5:23 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> >
> > On Sun, Jun 14, 2020 at 10:55:30AM +0200, Kal Cutter Conley wrote:
> > > Hi Saeed,
> > > Thanks for explaining the reasoning behind the special mlx5 queue
> > > numbering with XDP zerocopy.
> > >
> > > We have a process using AF_XDP that also shares the network interface
> > > with other processes on the system. ethtool rx flow classification
> > > rules are used to route the traffic to the appropriate XSK queue
> > > N..(2N-1). The issue is these queues are only valid as long they are
> > > active (as far as I can tell). This means if my AF_XDP process dies
> > > other processes no longer receive ingress traffic routed over queues
> > > N..(2N-1) even though my XDP program is still loaded and would happily
> > > always return XDP_PASS. Other drivers do not have this usability issue
> > > because they use queues that are always valid. Is there a simple
> > > workaround for this issue? It seems to me queues N..(2N-1) should
> > > simply map to 0..(N-1) when they are not active?
> >
> > If your XDP program returns XDP_PASS, the packet should be delivered to
> > the xsk socket.  If the application isn't running, where would it go?
> >
> > I do agree that the usability of this can be improved.  What if the flow
> > rules are inserted and removed along with queue creatioin/destruction?
> 
> I think I misunderstood your suggestion here. Do you mean the rules
> should be inserted / removed on the hardware level but still show in
> ethtool even if they are not active in the hardware? In this case the
> rules always occupy a "location" but just never apply if the
> respective queues are not "enabled". I think this would be the best
> possible solution.

No, that wasn't what I was suggesting.  I would think that having
ethtool return something that isn't true woulld be really confusing -
either the rules are enabled and active, or they should not be there.

I was thinking more along the lines of having the flow rules inserted
and removed when the queue is created/destroyed, so the steering rule is
a property of the queue itself rather than maintained externally through
ethtool.
-- 
Jonathan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2020-06-20 18:42             ` Jonathan Lemon
@ 2020-06-21 10:03               ` Kal Cutter Conley
  2020-06-21 19:20                 ` Jonathan Lemon
  0 siblings, 1 reply; 11+ messages in thread
From: Kal Cutter Conley @ 2020-06-21 10:03 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Saeed Mahameed, brouer, Maxim Mikityanskiy, magnus.karlsson,
	toke.hoiland-jorgensen, xdp-newbies, Tariq Toukan, gospo, netdev,
	bjorn.topel

On Sat, Jun 20, 2020 at 8:42 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
> On Sat, Jun 20, 2020 at 12:42:36PM +0200, Kal Cutter Conley wrote:
> > On Thu, Jun 18, 2020 at 5:23 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> > >
> > > On Sun, Jun 14, 2020 at 10:55:30AM +0200, Kal Cutter Conley wrote:
> > > > Hi Saeed,
> > > > Thanks for explaining the reasoning behind the special mlx5 queue
> > > > numbering with XDP zerocopy.
> > > >
> > > > We have a process using AF_XDP that also shares the network interface
> > > > with other processes on the system. ethtool rx flow classification
> > > > rules are used to route the traffic to the appropriate XSK queue
> > > > N..(2N-1). The issue is these queues are only valid as long they are
> > > > active (as far as I can tell). This means if my AF_XDP process dies
> > > > other processes no longer receive ingress traffic routed over queues
> > > > N..(2N-1) even though my XDP program is still loaded and would happily
> > > > always return XDP_PASS. Other drivers do not have this usability issue
> > > > because they use queues that are always valid. Is there a simple
> > > > workaround for this issue? It seems to me queues N..(2N-1) should
> > > > simply map to 0..(N-1) when they are not active?
> > >
> > > If your XDP program returns XDP_PASS, the packet should be delivered to
> > > the xsk socket.  If the application isn't running, where would it go?
> > >
> > > I do agree that the usability of this can be improved.  What if the flow
> > > rules are inserted and removed along with queue creatioin/destruction?
> >
> > I think I misunderstood your suggestion here. Do you mean the rules
> > should be inserted / removed on the hardware level but still show in
> > ethtool even if they are not active in the hardware? In this case the
> > rules always occupy a "location" but just never apply if the
> > respective queues are not "enabled". I think this would be the best
> > possible solution.
>
> No, that wasn't what I was suggesting.  I would think that having
> ethtool return something that isn't true woulld be really confusing -
> either the rules are enabled and active, or they should not be there.

I think how Mellanox handles XDP ZC queue numbering is confusing no
matter what (at least given the current ethtool interface). However,
in its current form, it is not only confusing, it is also problematic.

If they changed the behavior so that the rules no longer apply when
the respective queues are inactive, then at least it would be less
_problematic_.

Would it really be more confusing if they made this change? Consider
what ethtool currently shows. For example, if I have 8 RX channels
configured and a RX classification rule for (XSK) queue 15:

[root@localhost ~]# ethtool -n eth0
8 RX rings available
Total 1 rules

Filter: 0
        Rule Type: UDP over IPv4
        Src IP addr: 0.0.0.0 mask: 255.255.255.255
        Dest IP addr: 169.254.116.10 mask: 0.0.0.0
        TOS: 0x0 mask: 0xff
        Src port: 0 mask: 0xffff
        Dest port: 0 mask: 0xffff
        Action: Direct to queue 15

ethtool prints 8 available queues and at the same time filter 0
directs traffic to queue 15. So it's already apparent here that queue
15 is special (since it says only 8 are available).

>
> I was thinking more along the lines of having the flow rules inserted
> and removed when the queue is created/destroyed, so the steering rule is
> a property of the queue itself rather than maintained externally through
> ethtool.

I think presenting the flow rules as a property of the interface makes
more sense (as they are now). Since:
    (1) Flow rules affect all traffic for the interface.
    (2) Since flow rules are ordered (the first rule that matches is
used), a rule's "location" (priority) has to be global to the
interface anyway.
    (3) Flow rules can be used to discard traffic. In this case, there
is no queue to be a property of.
    (4) What if you wanted to support more complicated rules that
apply to multiple queues? E.g. Say all 10.0.0.0/8 traffic should use
queues 0-3 (which particular queue is used for a flow depends on
rxhash).

> --
> Jonathan

Kal

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY
  2020-06-21 10:03               ` Kal Cutter Conley
@ 2020-06-21 19:20                 ` Jonathan Lemon
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Lemon @ 2020-06-21 19:20 UTC (permalink / raw)
  To: Kal Cutter Conley
  Cc: Saeed Mahameed, brouer, Maxim Mikityanskiy, magnus.karlsson,
	toke.hoiland-jorgensen, xdp-newbies, Tariq Toukan, gospo, netdev,
	bjorn.topel

On Sun, Jun 21, 2020 at 12:03:14PM +0200, Kal Cutter Conley wrote:
> On Sat, Jun 20, 2020 at 8:42 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> >
> > On Sat, Jun 20, 2020 at 12:42:36PM +0200, Kal Cutter Conley wrote:
> > > On Thu, Jun 18, 2020 at 5:23 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> > > >
> > > > On Sun, Jun 14, 2020 at 10:55:30AM +0200, Kal Cutter Conley wrote:
> > > > > Hi Saeed,
> > > > > Thanks for explaining the reasoning behind the special mlx5 queue
> > > > > numbering with XDP zerocopy.
> > > > >
> > > > > We have a process using AF_XDP that also shares the network interface
> > > > > with other processes on the system. ethtool rx flow classification
> > > > > rules are used to route the traffic to the appropriate XSK queue
> > > > > N..(2N-1). The issue is these queues are only valid as long they are
> > > > > active (as far as I can tell). This means if my AF_XDP process dies
> > > > > other processes no longer receive ingress traffic routed over queues
> > > > > N..(2N-1) even though my XDP program is still loaded and would happily
> > > > > always return XDP_PASS. Other drivers do not have this usability issue
> > > > > because they use queues that are always valid. Is there a simple
> > > > > workaround for this issue? It seems to me queues N..(2N-1) should
> > > > > simply map to 0..(N-1) when they are not active?
> > > >
> > > > If your XDP program returns XDP_PASS, the packet should be delivered to
> > > > the xsk socket.  If the application isn't running, where would it go?
> > > >
> > > > I do agree that the usability of this can be improved.  What if the flow
> > > > rules are inserted and removed along with queue creatioin/destruction?
> > >
> > > I think I misunderstood your suggestion here. Do you mean the rules
> > > should be inserted / removed on the hardware level but still show in
> > > ethtool even if they are not active in the hardware? In this case the
> > > rules always occupy a "location" but just never apply if the
> > > respective queues are not "enabled". I think this would be the best
> > > possible solution.
> >
> > No, that wasn't what I was suggesting.  I would think that having
> > ethtool return something that isn't true woulld be really confusing -
> > either the rules are enabled and active, or they should not be there.
> 
> I think how Mellanox handles XDP ZC queue numbering is confusing no
> matter what (at least given the current ethtool interface). However,
> in its current form, it is not only confusing, it is also problematic.
> 
> If they changed the behavior so that the rules no longer apply when
> the respective queues are inactive, then at least it would be less
> _problematic_.
> 
> Would it really be more confusing if they made this change? Consider
> what ethtool currently shows. For example, if I have 8 RX channels
> configured and a RX classification rule for (XSK) queue 15:
> 
> [root@localhost ~]# ethtool -n eth0
> 8 RX rings available
> Total 1 rules
> 
> Filter: 0
>         Rule Type: UDP over IPv4
>         Src IP addr: 0.0.0.0 mask: 255.255.255.255
>         Dest IP addr: 169.254.116.10 mask: 0.0.0.0
>         TOS: 0x0 mask: 0xff
>         Src port: 0 mask: 0xffff
>         Dest port: 0 mask: 0xffff
>         Action: Direct to queue 15
> 
> ethtool prints 8 available queues and at the same time filter 0
> directs traffic to queue 15. So it's already apparent here that queue
> 15 is special (since it says only 8 are available).

True.  The issue is that ZC queues /are/ special, they are bound to an
application which provides the packet memory, and are not truly general
purpose queues for use by the system.


> > I was thinking more along the lines of having the flow rules inserted
> > and removed when the queue is created/destroyed, so the steering rule is
> > a property of the queue itself rather than maintained externally through
> > ethtool.
> 
> I think presenting the flow rules as a property of the interface makes
> more sense (as they are now). Since:
>     (1) Flow rules affect all traffic for the interface.

Queues are a property of the interface, in that adding or removing a queue
changes the interface behavior.  It would seem reasonable that these
queue changes would also change interface properties.


>     (2) Since flow rules are ordered (the first rule that matches is
> used), a rule's "location" (priority) has to be global to the
> interface anyway.

The ordering of flow rules is an issue, I don't have an answer for that.


>     (3) Flow rules can be used to discard traffic. In this case, there
> is no queue to be a property of.

I'm only advocating adding rules which are specific for the queue.


>     (4) What if you wanted to support more complicated rules that
> apply to multiple queues? E.g. Say all 10.0.0.0/8 traffic should use
> queues 0-3 (which particular queue is used for a flow depends on
> rxhash).

Today, this could be done with the 'context' parameter to -X and -N.
However, I don't think that -X accepts N..(2N-1) numbering, so only
flow_steering to a specific queue is available.


It might be nice to have:

  ethtool -X eth0 context new empty         <--- empty context
  ethtool -N eth0 flow-type ... context 1
  ethtool -X eth0 context 1 queue 15        <--- add member

Where the RSS context starts out empty (drop packets), and queues are
explicitly added to them, intead of starting with a default context.
This way flow rules don't change, just the RSS membership.  This does
change the flow_hash steering as queues are added/removed, which could
be an issue.  If the queue doesn't exist, then the packet is dropped.
-- 
Jonathan

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-06-21 19:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-07 12:49 net/mlx5e: bind() always returns EINVAL with XDP_ZEROCOPY Kal Cutter Conley
2019-09-01 16:47 ` Kal Cutter Conley
2019-09-02  9:08   ` Jesper Dangaard Brouer
2019-09-03 20:19     ` Saeed Mahameed
2020-06-14  8:55       ` Kal Cutter Conley
2020-06-18 15:23         ` Jonathan Lemon
2020-06-18 17:31           ` Kal Cutter Conley
2020-06-20 10:42           ` Kal Cutter Conley
2020-06-20 18:42             ` Jonathan Lemon
2020-06-21 10:03               ` Kal Cutter Conley
2020-06-21 19:20                 ` Jonathan Lemon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).