* Re: Questions on XDP
@ 2017-02-18 23:31 Alexei Starovoitov
  2017-02-18 23:48 ` John Fastabend
From: Alexei Starovoitov @ 2017-02-18 23:31 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Eric Dumazet, Jesper Dangaard Brouer, John Fastabend, Netdev,
	Tom Herbert, Alexei Starovoitov, John Fastabend, Daniel Borkmann,
	David Miller

On Sat, Feb 18, 2017 at 10:18 AM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
>
>> XDP_DROP does not require having one page per frame.
>
> Agreed.

Why do you think so?
XDP_DROP is targeting DDoS, where in the good case all traffic is
passed up and in the bad case most of the traffic is dropped, but the
good traffic still needs to be serviced by the layers after it: other
XDP programs and the stack.
Say ixgbe+XDP goes with 2K per packet. Very soon we will have a bunch
of half pages sitting in the stack while the other halves require
complex refcounting, making the actual DDoS mitigation ineffective and
forcing the NIC to drop packets because it runs out of buffers. Why
complicate things? The packet-per-page approach is simple and
effective.
virtio is different: there we don't have hardware that needs to have
buffers ready for DMA.

> Looking at the Mellanox way of doing it I am not entirely sure it is
> useful.  It looks good for benchmarks but that is about it.  Also I

It's the opposite. It already runs very nicely in production.
In real life it's always a combination of the XDP_DROP, XDP_TX and
XDP_PASS actions.
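
To make that concrete, here is a minimal, self-contained sketch of such
a program (not from this thread; the port numbers are invented, and it
assumes libbpf's bpf_helpers.h/bpf_endian.h):

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/udp.h>
#include <bpf/bpf_endian.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_ddos_filter(struct xdp_md *ctx)
{
        void *data = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;
        struct ethhdr *eth = data;
        struct iphdr *iph = (void *)(eth + 1);
        struct udphdr *udp = (void *)(iph + 1);  /* assumes no IP options */

        /* The verifier requires explicit bounds checks before any access. */
        if ((void *)(udp + 1) > data_end)
                return XDP_PASS;
        if (eth->h_proto != bpf_htons(ETH_P_IP) ||
            iph->protocol != IPPROTO_UDP)
                return XDP_PASS;

        if (udp->dest == bpf_htons(9999))
                return XDP_DROP;        /* "attack" port: drop in the driver */
        if (udp->dest == bpf_htons(7))
                return XDP_TX;          /* bounce back out the same port */
        return XDP_PASS;                /* everything else goes up the stack */
}

char _license[] SEC("license") = "GPL";

Loaded with something like ip link set dev <nic> xdp obj filter.o,
everything the program drops never leaves the driver.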
Sounds like ixgbe wants to do things differently because of
not-invented-here. That new approach may turn out to be good or bad,
but why risk it? The mlx4 approach works.
mlx5 has a few issues, though, because its page recycling was done too
simplistically. A generic page pool/recycler that all drivers will use
should solve that, I hope.
Is the proposal to have a generic split-page recycler? How is that
going to work?

> don't see it extending out to the point that we would be able to
> exchange packets between interfaces which really seems like it should
> be the ultimate goal for XDP_TX.

We don't have a use case for multi-port XDP_TX, but I'm not objecting
to doing it in general. I just don't see a need right now to
complicate the drivers to do so.

> It seems like eventually we want to be able to peel off the buffer and
> send it to something other than ourselves.  For example it seems like
> it might be useful at some point to use XDP to do traffic
> classification and have it route packets between multiple interfaces
> on a host and it wouldn't make sense to have all of them map every
> page as bidirectional because it starts becoming ridiculous if you
> have dozens of interfaces in a system.

A dozen interfaces? Like a single NIC with a dozen ports, or many NICs
with many ports in the same system? Are you trying to build a switch
out of x86? I don't think a multi-terabit x86 box is realistic.
Is this all because of the dpdk/6wind demos? I saw DPDK bragging that
it can saturate the PCIe bus. So? Why is that useful? Why would anyone
care to put a bunch of NICs into an x86 box and demonstrate that PCIe
bandwidth is now the limiting factor?

> Also as far as the one page per frame it occurs to me that you will
> have to eventually deal with things like frame replication.

... only in cases where one needs to demo a multi-port bridge with
lots of NICs in one x86 box. I don't see the practicality of such a
setup, and I think copying the full page every time XDP needs to
broadcast is preferable to atomic refcounting that would slow down the
main case. Broadcast is the slow path.

My strong belief is that XDP should not care about niche
architectures. It was never meant to be a solution for everyone and
for all use cases.
If XDP sucks on powerpc, so be it. CPUs with 64K pages are doomed. We
should not sacrifice performance on x86 because of ppc.
I think it was a mistake that ixgbe chose to do that in the past, when
mb()s were added because of powerpc and it took years to introduce
dma_mb() and bring performance back to good levels. Btw, the dma_mb()
work was awesome.
In XDP I don't want to make such trade-offs. Really only the x86 and
arm64 archs matter today. Everything else is best effort.
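
(For readers who didn't follow that history: the lighter barriers,
presumably dma_rmb()/dma_wmb(), only order CPU accesses against
coherent DMA memory, which is all a descriptor-ring poll needs, while
a full rmb() also has to order MMIO on some architectures. A generic
sketch with invented ring/descriptor names, not ixgbe's actual code:)

static int rx_poll(struct my_ring *ring, int budget)
{
        int done = 0;

        while (done < budget) {
                struct my_rx_desc *desc = &ring->desc[ring->next_to_clean];

                if (!(le32_to_cpu(desc->status) & MY_RXD_DD))
                        break;          /* descriptor not written back yet */

                /*
                 * Order the status check above against the reads of the
                 * rest of the descriptor and the packet data below.
                 * dma_rmb() only orders against coherent DMA memory, so
                 * it is much cheaper than a full rmb() on architectures
                 * where rmb() also has to order MMIO.
                 */
                dma_rmb();

                process_packet(ring, desc);
                ring->next_to_clean = (ring->next_to_clean + 1) & ring->mask;
                done++;
        }
        return done;
}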

* Re: Questions on XDP
@ 2017-02-18 23:59 Alexei Starovoitov
From: Alexei Starovoitov @ 2017-02-18 23:59 UTC (permalink / raw)
  To: John Fastabend
  Cc: Alexander Duyck, Eric Dumazet, Jesper Dangaard Brouer, Netdev,
	Tom Herbert, Alexei Starovoitov, John Fastabend, Daniel Borkmann,
	David Miller

On Sat, Feb 18, 2017 at 3:48 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
>
> We are running our vswitch in userspace now for many workloads;
> it would be nice to have these in the kernel if possible.
...
> Maybe Alex had something else in mind, but we have many virtual interfaces
> plus physical interfaces in the vswitch use case. Possibly thousands.

Virtual interfaces towards many VMs are certainly a good use case that
we need to address.
We'd still need to copy the packet from the memory of one VM into
another, right? So the per-packet allocation strategy for a virtual
interface can be anything.

Sounds like you already have patches that do that?
Excellent. Please share.

* Questions on XDP
@ 2017-02-16 20:41 Alexander Duyck
  2017-02-16 22:36 ` John Fastabend
From: Alexander Duyck @ 2017-02-16 20:41 UTC (permalink / raw)
  To: Netdev
  Cc: Tom Herbert, Alexei Starovoitov, John Fastabend, Jesper Dangaard Brouer

So I'm in the process of working on enabling XDP for the Intel NICs,
and I had a few questions, so I thought I would put them out here to
try to get everything sorted before I paint myself into a corner.

So my first question is: why does the documentation mention one frame
per page for XDP?  Is the intention at some point to try to support
page flipping into user space, or was it meant for use with an API
such as the AF_PACKET mmap interface?  If I am not mistaken, page
flipping has been tried in the past and failed, and as far as
AF_PACKET goes, my understanding is that the pages have to be mapped
beforehand, so it doesn't gain us anything without a hardware offload
to a pre-mapped queue.

Second, I was wondering about supporting jumbo frames and
scatter-gather.  Specifically, if I let XDP handle the first 2-3K of a
frame and then processed the remaining portion of the frame according
to the verdict returned for that first part, would that be good enough
to satisfy XDP, or do I actually have to support one linear buffer
always?
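
For illustration only, a rough sketch of that proposed semantic
(hypothetical driver code; everything other than struct xdp_buff,
bpf_prog_run_xdp() and the XDP_* actions is an invented name):

static void rx_multi_buffer_frame(struct my_ring *ring,
                                  struct my_frame *frame)
{
        struct xdp_buff xdp;
        u32 act;

        /* Run the XDP program over the first (linear) 2-3K only. */
        xdp.data = frame->first_buf;
        xdp.data_end = frame->first_buf + frame->first_len;

        act = bpf_prog_run_xdp(ring->xdp_prog, &xdp);

        /* ... and let that single verdict govern every fragment. */
        switch (act) {
        case XDP_DROP:
                recycle_frame(ring, frame);        /* drop all fragments */
                break;
        case XDP_TX:
                xmit_frame(ring, frame);           /* send whole frame back out */
                break;
        default:
                build_and_pass_skb(ring, frame);   /* XDP_PASS and friends */
                break;
        }
}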

Finally, I was looking at xdp_adjust_head.  From what I can tell, all
that is technically required to support it is allowing the head to be
adjusted either in or out.  I'm assuming there is some amount of
padding that is preferred.  With the setup I currently have I am
guaranteeing at least NET_SKB_PAD + NET_IP_ALIGN, but I have found
that there should be enough room for 192 bytes of headroom on an x86
system if I am using a 2K buffer.  I'm just wondering if that is
enough padding or if we need more for XDP.
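
For what it's worth, that headroom is what a program eats into when it
grows the packet at the front via bpf_xdp_adjust_head(). A minimal
sketch (assuming libbpf's bpf_helpers.h; the tunnel header is made up):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Made-up 8-byte header, purely for illustration. */
struct my_tunnel_hdr {
        __u32 session_id;
        __u32 cookie;
};

SEC("xdp")
int xdp_push_header(struct xdp_md *ctx)
{
        struct my_tunnel_hdr *th;
        void *data, *data_end;

        /* Grow the packet at the front; this only succeeds if the
         * driver left enough headroom in front of the frame. */
        if (bpf_xdp_adjust_head(ctx, -(int)sizeof(*th)))
                return XDP_DROP;

        /* Packet pointers must be reloaded after adjust_head. */
        data = (void *)(long)ctx->data;
        data_end = (void *)(long)ctx->data_end;
        th = data;
        if ((void *)(th + 1) > data_end)
                return XDP_DROP;

        th->session_id = 1;
        th->cookie = 0;
        return XDP_TX;
}

char _license[] SEC("license") = "GPL";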

Anyway, sorry for the stupid questions, but I haven't been paying
close attention to this and was mostly focused on the DMA bits needed
to support it, so now I am playing catch-up.

- Alex

