xdp-newbies.vger.kernel.org archive mirror
* Redirect packet back to host stack after AF_XDP?
@ 2022-12-14 20:49 Vincent Li
  2022-12-14 22:53 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 12+ messages in thread
From: Vincent Li @ 2022-12-14 20:49 UTC (permalink / raw)
  To: xdp-newbies

Hi,

If I have a user-space stack like mTCP running on top of AF_XDP as a
stateful TCP packet filter, dropping malicious packets such as TCP
SYN/RST/ACK floods or other TCP attacks, is it possible to redirect the
good TCP packets back to the Linux host stack after mTCP has filtered
them?

Thanks

Vincent

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Redirect packet back to host stack after AF_XDP?
  2022-12-14 20:49 Redirect packet back to host stack after AF_XDP? Vincent Li
@ 2022-12-14 22:53 ` Toke Høiland-Jørgensen
  2022-12-15  1:57   ` Vincent Li
  2022-12-17  2:52   ` Vincent Li
  0 siblings, 2 replies; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-12-14 22:53 UTC (permalink / raw)
  To: Vincent Li, xdp-newbies

Vincent Li <vincent.mc.li@gmail.com> writes:

> Hi,
>
> If I have an user space stack like mTCP works on top of AF_XDP as tcp
> stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
> or other tcp attack, and redirect good tcp packet back to linux host
> stack after mTCP filtering, is that possible?

Not really, no. You can inject it using regular userspace methods (say,
a TUN device), or using AF_XDP on a veth device. But in both cases the
packet will come in on a different interface, so it's not really
transparent. And performance is not great either.

In general, if you want to filter traffic before passing it on to the
kernel, the best bet is to implement your filtering in BPF and run it as
an XDP program.

-Toke



* Re: Redirect packet back to host stack after AF_XDP?
  2022-12-14 22:53 ` Toke Høiland-Jørgensen
@ 2022-12-15  1:57   ` Vincent Li
  2022-12-15 11:08     ` Toke Høiland-Jørgensen
  2022-12-17  2:52   ` Vincent Li
  1 sibling, 1 reply; 12+ messages in thread
From: Vincent Li @ 2022-12-15  1:57 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: xdp-newbies

On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Vincent Li <vincent.mc.li@gmail.com> writes:
>
> > Hi,
> >
> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
> > or other tcp attack, and redirect good tcp packet back to linux host
> > stack after mTCP filtering, is that possible?
>
> Not really, no. You can inject it using regular userspace methods (say,
> a TUN device), or using AF_XDP on a veth device. But in both cases the
> packet will come in on a different interface, so it's not really
> transparent. And performance is not great either.
>
I see

> In general, if you want to filter traffic before passing it on to the
> kernel, the best bet is to implement your filtering in BPF and run it as
> an XDP program.
>
I read about
https://eric-keller.github.io/papers/2020/HybridNetworkStack_ieee_nfvsdn2020_slides.pdf
and thought it would be a good idea to run mTCP on top of AF_XDP as an
anti-DDoS tool.

> -Toke
>


* Re: Redirect packet back to host stack after AF_XDP?
  2022-12-15  1:57   ` Vincent Li
@ 2022-12-15 11:08     ` Toke Høiland-Jørgensen
  2022-12-15 18:52       ` Vincent Li
  0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-12-15 11:08 UTC (permalink / raw)
  To: Vincent Li; +Cc: xdp-newbies

Vincent Li <vincent.mc.li@gmail.com> writes:

> On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Vincent Li <vincent.mc.li@gmail.com> writes:
>>
>> > Hi,
>> >
>> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
>> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
>> > or other tcp attack, and redirect good tcp packet back to linux host
>> > stack after mTCP filtering, is that possible?
>>
>> Not really, no. You can inject it using regular userspace methods (say,
>> a TUN device), or using AF_XDP on a veth device. But in both cases the
>> packet will come in on a different interface, so it's not really
>> transparent. And performance is not great either.
>>
> I see
>
>> In general, if you want to filter traffic before passing it on to the
>> kernel, the best bet is to implement your filtering in BPF and run it as
>> an XDP program.
>>
> I read about this
> https://eric-keller.github.io/papers/2020/HybridNetworkStack_ieee_nfvsdn2020_slides.pdf,
> thought that is good idea to run mTCP on top of AF_XDP as  anti DDOS
> tool

Right, that slide deck seems awfully hand-wavy about how they're getting
packets back into the kernel, though... I guess you could ask the author
how they're doing it? :)

-Toke



* Re: Redirect packet back to host stack after AF_XDP?
  2022-12-15 11:08     ` Toke Høiland-Jørgensen
@ 2022-12-15 18:52       ` Vincent Li
  0 siblings, 0 replies; 12+ messages in thread
From: Vincent Li @ 2022-12-15 18:52 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: xdp-newbies

On Thu, Dec 15, 2022 at 3:09 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Vincent Li <vincent.mc.li@gmail.com> writes:
>
> > On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Vincent Li <vincent.mc.li@gmail.com> writes:
> >>
> >> > Hi,
> >> >
> >> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
> >> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
> >> > or other tcp attack, and redirect good tcp packet back to linux host
> >> > stack after mTCP filtering, is that possible?
> >>
> >> Not really, no. You can inject it using regular userspace methods (say,
> >> a TUN device), or using AF_XDP on a veth device. But in both cases the
> >> packet will come in on a different interface, so it's not really
> >> transparent. And performance is not great either.
> >>
> > I see
> >
> >> In general, if you want to filter traffic before passing it on to the
> >> kernel, the best bet is to implement your filtering in BPF and run it as
> >> an XDP program.
> >>
> > I read about this
> > https://eric-keller.github.io/papers/2020/HybridNetworkStack_ieee_nfvsdn2020_slides.pdf,
> > thought that is good idea to run mTCP on top of AF_XDP as  anti DDOS
> > tool
>
> Right, that slide deck seems awfully hand-wavy about how they're getting
> packets back into the kernel, though... I guess you could ask the author
> how they're doing it? :)

I will try :), thanks again!

>
> -Toke
>


* Re: Redirect packet back to host stack after AF_XDP?
  2022-12-14 22:53 ` Toke Høiland-Jørgensen
  2022-12-15  1:57   ` Vincent Li
@ 2022-12-17  2:52   ` Vincent Li
  2023-01-02 11:33     ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 12+ messages in thread
From: Vincent Li @ 2022-12-17  2:52 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: xdp-newbies

On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Vincent Li <vincent.mc.li@gmail.com> writes:
>
> > Hi,
> >
> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
> > or other tcp attack, and redirect good tcp packet back to linux host
> > stack after mTCP filtering, is that possible?
>
> Not really, no. You can inject it using regular userspace methods (say,
> a TUN device), or using AF_XDP on a veth device. But in both cases the
> packet will come in on a different interface, so it's not really
> transparent. And performance is not great either.

I have thought about it more :) What about this scenario:

good TCP RST/ACK or bad flooding RST/ACK -> NIC1 -> mTCP+AF_XDP -> NIC2

NIC1 and NIC2 are on the same host; mTCP drops the flooding RST/ACK
packets and redirects the good TCP RST/ACK packets to NIC2. Is that
possible? Any performance impact?


>
> In general, if you want to filter traffic before passing it on to the
> kernel, the best bet is to implement your filtering in BPF and run it as
> an XDP program.

I am thinking of a scenario like the TCP RST/ACK flood DDoS attack
against NIC1 above. I can't simply drop every RST/ACK because there
could be legitimate RST/ACKs. In this case, since mTCP can validate
legitimate stateful TCP connections, it can drop the flooding RST/ACK
packets and redirect the good RST/ACKs to NIC2. I am not sure a BPF XDP
program attached to NIC1 is able to do stateful TCP packet filtering.
Does that make sense to you?

>
> -Toke
>


* Re: Redirect packet back to host stack after AF_XDP?
  2022-12-17  2:52   ` Vincent Li
@ 2023-01-02 11:33     ` Toke Høiland-Jørgensen
  2023-01-09 21:28       ` Vincent Li
  0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-01-02 11:33 UTC (permalink / raw)
  To: Vincent Li; +Cc: xdp-newbies

Vincent Li <vincent.mc.li@gmail.com> writes:

> On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Vincent Li <vincent.mc.li@gmail.com> writes:
>>
>> > Hi,
>> >
>> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
>> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
>> > or other tcp attack, and redirect good tcp packet back to linux host
>> > stack after mTCP filtering, is that possible?
>>
>> Not really, no. You can inject it using regular userspace methods (say,
>> a TUN device), or using AF_XDP on a veth device. But in both cases the
>> packet will come in on a different interface, so it's not really
>> transparent. And performance is not great either.
>
> I have thought about it more :) what about this scenario
>
>
> good tcp rst/ack or bad flooding rst/ack -> NIC1 -> mTCP+AF_XDP ->NIC2
>
> NIC1 and NIC2 on the same host, drop flooding rst/ack by mTCP,
> redirect good tcp rst/ack to NIC2, is that possible?

You can do this if NIC2 is a veth device: you inject packets into the
veth on the TX side, they come out on the other side and from the kernel
PoV it looks like all packets come in on the peer veth. You'll need to
redirect packets the other way as well.
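
Roughly like this (untested, interface names and the address are just
placeholders; the AF_XDP app's rx/tx plumbing is described in the
comments rather than shown):

```shell
# Create a veth pair: the AF_XDP app injects on veth-xdp, and from the
# kernel's PoV the filtered traffic arrives on veth-host.
ip link add veth-xdp type veth peer name veth-host
ip link set veth-xdp up
ip link set veth-host up

# Put the host-facing address on veth-host so the kernel stack answers
# there instead of on the physical NIC (placeholder address):
ip addr add 198.51.100.1/24 dev veth-host

# The app then forwards frames in both directions:
#   forward path: rx on NIC1 -> filter -> tx into veth-xdp
#   return path:  rx on veth-xdp -> tx out NIC1
```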

> any performance impact?

Yes, obviously :)

>> In general, if you want to filter traffic before passing it on to the
>> kernel, the best bet is to implement your filtering in BPF and run it as
>> an XDP program.
>
> I am thinking for scenario like tcp rst/ack flood DDOS attack to NIC1
> above, I can't simply drop every rst/ack because there could be
> legitimate rst/ack, in this case since mTCP can validate legitimate
> stateful tcp connection, drop flooding rst/ack packet, redirect good
> rst/ack to NIC2. I am not sure a BPF XDP program attached to NIC1 is
> able to do stateful TCP packet filtering, does that make sense to you?

It makes sense in the "it can probably be made to work" sense. Not in
the "why would anyone want to do this" sense. If you're trying to
protect against SYN flooding using XDP there are better solutions than
proxying things through a userspace TCP stack. See for instance Maxim's
synproxy patches:

https://lore.kernel.org/r/20220615134847.3753567-1-maximmi@nvidia.com

-Toke



* Re: Redirect packet back to host stack after AF_XDP?
  2023-01-02 11:33     ` Toke Høiland-Jørgensen
@ 2023-01-09 21:28       ` Vincent Li
  2023-01-10 15:23         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 12+ messages in thread
From: Vincent Li @ 2023-01-09 21:28 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: xdp-newbies

On Mon, Jan 2, 2023 at 3:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Vincent Li <vincent.mc.li@gmail.com> writes:
>
> > On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Vincent Li <vincent.mc.li@gmail.com> writes:
> >>
> >> > Hi,
> >> >
> >> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
> >> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
> >> > or other tcp attack, and redirect good tcp packet back to linux host
> >> > stack after mTCP filtering, is that possible?
> >>
> >> Not really, no. You can inject it using regular userspace methods (say,
> >> a TUN device), or using AF_XDP on a veth device. But in both cases the
> >> packet will come in on a different interface, so it's not really
> >> transparent. And performance is not great either.
> >
> > I have thought about it more :) what about this scenario
> >
> >
> > good tcp rst/ack or bad flooding rst/ack -> NIC1 -> mTCP+AF_XDP ->NIC2
> >
> > NIC1 and NIC2 on the same host, drop flooding rst/ack by mTCP,
> > redirect good tcp rst/ack to NIC2, is that possible?
>
> You can do this if NIC2 is a veth device: you inject packets into the
> veth on the TX side, they come out on the other side and from the kernel
> PoV it looks like all packets come in on the peer veth. You'll need to
> redirect packets the other way as well.
>
> > any performance impact?
>
> Yes, obviously :)
>
> >> In general, if you want to filter traffic before passing it on to the
> >> kernel, the best bet is to implement your filtering in BPF and run it as
> >> an XDP program.
> >
> > I am thinking for scenario like tcp rst/ack flood DDOS attack to NIC1
> > above, I can't simply drop every rst/ack because there could be
> > legitimate rst/ack, in this case since mTCP can validate legitimate
> > stateful tcp connection, drop flooding rst/ack packet, redirect good
> > rst/ack to NIC2. I am not sure a BPF XDP program attached to NIC1 is
> > able to do stateful TCP packet filtering, does that make sense to you?
>
> It makes sense in the "it can probably be made to work" sense. Not in
> the "why would anyone want to do this" sense. If you're trying to
> protect against SYN flooding using XDP there are better solutions than
> proxying things through a user space TCP stack. See for instance Maxim's
> synproxy patches:
>

SYN flooding is just one example. What I have in mind is a user-space
TCP/IP stack running on top of AF_XDP as a middlebox/proxy for packet
filtering or load balancing, the way F5 BIG-IP runs a user-space TCP/IP
stack on top of AF_XDP. I thought open-source mTCP + AF_XDP could serve
a similar middlebox use case. Is the performance of a user-space TCP/IP
stack + AF_XDP middlebox/proxy not going to be good?

> https://lore.kernel.org/r/20220615134847.3753567-1-maximmi@nvidia.com

Thanks. If I recall correctly, it requires iptables rules to be set up
to work with the synproxy.
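
If memory serves, it was the standard netfilter SYNPROXY recipe that
the XDP program accelerates, roughly like this (from memory, untested;
eth0 and port 80 are placeholders):

```shell
# Sysctls the SYNPROXY target relies on:
sysctl -w net.ipv4.tcp_syncookies=1
sysctl -w net.ipv4.tcp_timestamps=1
sysctl -w net.netfilter.nf_conntrack_tcp_loose=0

# Don't create conntrack entries for incoming SYNs:
iptables -t raw -A PREROUTING -i eth0 -p tcp --dport 80 --syn -j CT --notrack

# Let SYNPROXY validate the handshake via SYN cookies:
iptables -A INPUT -i eth0 -p tcp --dport 80 \
    -m state --state INVALID,UNTRACKED \
    -j SYNPROXY --sack-perm --timestamp --wscale 7 --mss 1460

# Drop whatever fails validation:
iptables -A INPUT -i eth0 -m state --state INVALID -j DROP
```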

>
> -Toke
>


* Re: Redirect packet back to host stack after AF_XDP?
  2023-01-09 21:28       ` Vincent Li
@ 2023-01-10 15:23         ` Toke Høiland-Jørgensen
  2023-01-10 16:54           ` Vincent Li
  0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-01-10 15:23 UTC (permalink / raw)
  To: Vincent Li; +Cc: xdp-newbies

Vincent Li <vincent.mc.li@gmail.com> writes:

> On Mon, Jan 2, 2023 at 3:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Vincent Li <vincent.mc.li@gmail.com> writes:
>>
>> > On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> Vincent Li <vincent.mc.li@gmail.com> writes:
>> >>
>> >> > Hi,
>> >> >
>> >> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
>> >> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
>> >> > or other tcp attack, and redirect good tcp packet back to linux host
>> >> > stack after mTCP filtering, is that possible?
>> >>
>> >> Not really, no. You can inject it using regular userspace methods (say,
>> >> a TUN device), or using AF_XDP on a veth device. But in both cases the
>> >> packet will come in on a different interface, so it's not really
>> >> transparent. And performance is not great either.
>> >
>> > I have thought about it more :) what about this scenario
>> >
>> >
>> > good tcp rst/ack or bad flooding rst/ack -> NIC1 -> mTCP+AF_XDP ->NIC2
>> >
>> > NIC1 and NIC2 on the same host, drop flooding rst/ack by mTCP,
>> > redirect good tcp rst/ack to NIC2, is that possible?
>>
>> You can do this if NIC2 is a veth device: you inject packets into the
>> veth on the TX side, they come out on the other side and from the kernel
>> PoV it looks like all packets come in on the peer veth. You'll need to
>> redirect packets the other way as well.
>>
>> > any performance impact?
>>
>> Yes, obviously :)
>>
>> >> In general, if you want to filter traffic before passing it on to the
>> >> kernel, the best bet is to implement your filtering in BPF and run it as
>> >> an XDP program.
>> >
>> > I am thinking for scenario like tcp rst/ack flood DDOS attack to NIC1
>> > above, I can't simply drop every rst/ack because there could be
>> > legitimate rst/ack, in this case since mTCP can validate legitimate
>> > stateful tcp connection, drop flooding rst/ack packet, redirect good
>> > rst/ack to NIC2. I am not sure a BPF XDP program attached to NIC1 is
>> > able to do stateful TCP packet filtering, does that make sense to you?
>>
>> It makes sense in the "it can probably be made to work" sense. Not in
>> the "why would anyone want to do this" sense. If you're trying to
>> protect against SYN flooding using XDP there are better solutions than
>> proxying things through a user space TCP stack. See for instance Maxim's
>> synproxy patches:
>>
>
> SYN flooding is just one of the example, what I have in mind is an
> user space TCP/IP stack runs on top of AF_XDP as middle box/proxy for
> packet filtering or load balancing, like F5 BIG-IP runs an user space
> TCP/IP stack on top of AF_XDP. I thought open source mTCP + AF_XDP
> could be a similar use case as middle box.  user space TCP/IP stack +
> AF_XDP as middle box/proxy,  the performance is not going to be good?

Well, you can certainly build a proxy using AF_XDP by intercepting all
the traffic and bridging it onto a veth device, say. I've certainly
heard of people doing that. It'll have some non-trivial overhead,
though; even if AF_XDP is fairly high performance, you're still making
all traffic take an extra hop through userspace, and you'll lose
features like hardware TSO, etc. Whether it can be done with "good"
performance depends on your use case, I guess (i.e., how do you define
"good performance"?).

I guess I don't really see the utility in having a user-space TCP stack
be a middlebox? If you're doing packet-level filtering, you could just
do that in regular XDP (and the same for load balancing, see e.g.,
Katran), and if you want to do application-level filtering (say, a WAF),
you could just use the kernel TCP stack?

>> https://lore.kernel.org/r/20220615134847.3753567-1-maximmi@nvidia.com
>
> thanks,  it appears it requires iptables rules setup to work with the
> synproxy if I recall correctly

Might be; not familiar with the details of that...

-Toke



* Re: Redirect packet back to host stack after AF_XDP?
  2023-01-10 15:23         ` Toke Høiland-Jørgensen
@ 2023-01-10 16:54           ` Vincent Li
  2023-01-10 23:27             ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 12+ messages in thread
From: Vincent Li @ 2023-01-10 16:54 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: xdp-newbies

On Tue, Jan 10, 2023 at 7:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Vincent Li <vincent.mc.li@gmail.com> writes:
>
> > On Mon, Jan 2, 2023 at 3:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Vincent Li <vincent.mc.li@gmail.com> writes:
> >>
> >> > On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >>
> >> >> Vincent Li <vincent.mc.li@gmail.com> writes:
> >> >>
> >> >> > Hi,
> >> >> >
> >> >> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
> >> >> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
> >> >> > or other tcp attack, and redirect good tcp packet back to linux host
> >> >> > stack after mTCP filtering, is that possible?
> >> >>
> >> >> Not really, no. You can inject it using regular userspace methods (say,
> >> >> a TUN device), or using AF_XDP on a veth device. But in both cases the
> >> >> packet will come in on a different interface, so it's not really
> >> >> transparent. And performance is not great either.
> >> >
> >> > I have thought about it more :) what about this scenario
> >> >
> >> >
> >> > good tcp rst/ack or bad flooding rst/ack -> NIC1 -> mTCP+AF_XDP ->NIC2
> >> >
> >> > NIC1 and NIC2 on the same host, drop flooding rst/ack by mTCP,
> >> > redirect good tcp rst/ack to NIC2, is that possible?
> >>
> >> You can do this if NIC2 is a veth device: you inject packets into the
> >> veth on the TX side, they come out on the other side and from the kernel
> >> PoV it looks like all packets come in on the peer veth. You'll need to
> >> redirect packets the other way as well.
> >>
> >> > any performance impact?
> >>
> >> Yes, obviously :)
> >>
> >> >> In general, if you want to filter traffic before passing it on to the
> >> >> kernel, the best bet is to implement your filtering in BPF and run it as
> >> >> an XDP program.
> >> >
> >> > I am thinking for scenario like tcp rst/ack flood DDOS attack to NIC1
> >> > above, I can't simply drop every rst/ack because there could be
> >> > legitimate rst/ack, in this case since mTCP can validate legitimate
> >> > stateful tcp connection, drop flooding rst/ack packet, redirect good
> >> > rst/ack to NIC2. I am not sure a BPF XDP program attached to NIC1 is
> >> > able to do stateful TCP packet filtering, does that make sense to you?
> >>
> >> It makes sense in the "it can probably be made to work" sense. Not in
> >> the "why would anyone want to do this" sense. If you're trying to
> >> protect against SYN flooding using XDP there are better solutions than
> >> proxying things through a user space TCP stack. See for instance Maxim's
> >> synproxy patches:
> >>
> >
> > SYN flooding is just one of the example, what I have in mind is an
> > user space TCP/IP stack runs on top of AF_XDP as middle box/proxy for
> > packet filtering or load balancing, like F5 BIG-IP runs an user space
> > TCP/IP stack on top of AF_XDP. I thought open source mTCP + AF_XDP
> > could be a similar use case as middle box.  user space TCP/IP stack +
> > AF_XDP as middle box/proxy,  the performance is not going to be good?
>
> Well, you can certainly build a proxy using AF_XDP by intercepting all
> the traffic and bridging it onto a veth device, say. I've certainly
> heard of people doing that. It'll have some non-trivial overhead,
> though; even if AF_XDP is fairly high performance, you're still making
> all traffic take an extra hop through userspace, and you'll lose
> features like hardware TSO, etc. Whether it can be done with "good"
> performance depends on your use case, I guess (i.e., how do you define
> "good performance"?).
>
> I guess I don't really see the utility in having a user-space TCP stack
> be a middlebox? If you're doing packet-level filtering, you could just
> do that in regular XDP (and the same for load balancing, see e.g.,
> Katran), and if you want to do application-level filtering (say, a WAF),
> you could just use the kernel TCP stack?
>

The reason I mention a user-space TCP stack is that user-space stacks
appear to perform better than the kernel TCP stack, and we see
user-space stack + DPDK used out there for high-speed packet processing
applications. Since XDP/AF_XDP seems to compete with DPDK, I thought:
why not a user-space stack + AF_XDP? :)

> >> https://lore.kernel.org/r/20220615134847.3753567-1-maximmi@nvidia.com
> >
> > thanks,  it appears it requires iptables rules setup to work with the
> > synproxy if I recall correctly
>
> Might be; not familiar with the details of that...
>
> -Toke
>


* Re: Redirect packet back to host stack after AF_XDP?
  2023-01-10 16:54           ` Vincent Li
@ 2023-01-10 23:27             ` Toke Høiland-Jørgensen
  2023-01-11  0:11               ` Vincent Li
  0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-01-10 23:27 UTC (permalink / raw)
  To: Vincent Li; +Cc: xdp-newbies

Vincent Li <vincent.mc.li@gmail.com> writes:

> On Tue, Jan 10, 2023 at 7:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Vincent Li <vincent.mc.li@gmail.com> writes:
>>
>> > On Mon, Jan 2, 2023 at 3:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> Vincent Li <vincent.mc.li@gmail.com> writes:
>> >>
>> >> > On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >> >>
>> >> >> Vincent Li <vincent.mc.li@gmail.com> writes:
>> >> >>
>> >> >> > Hi,
>> >> >> >
>> >> >> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
>> >> >> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
>> >> >> > or other tcp attack, and redirect good tcp packet back to linux host
>> >> >> > stack after mTCP filtering, is that possible?
>> >> >>
>> >> >> Not really, no. You can inject it using regular userspace methods (say,
>> >> >> a TUN device), or using AF_XDP on a veth device. But in both cases the
>> >> >> packet will come in on a different interface, so it's not really
>> >> >> transparent. And performance is not great either.
>> >> >
>> >> > I have thought about it more :) what about this scenario
>> >> >
>> >> >
>> >> > good tcp rst/ack or bad flooding rst/ack -> NIC1 -> mTCP+AF_XDP ->NIC2
>> >> >
>> >> > NIC1 and NIC2 on the same host, drop flooding rst/ack by mTCP,
>> >> > redirect good tcp rst/ack to NIC2, is that possible?
>> >>
>> >> You can do this if NIC2 is a veth device: you inject packets into the
>> >> veth on the TX side, they come out on the other side and from the kernel
>> >> PoV it looks like all packets come in on the peer veth. You'll need to
>> >> redirect packets the other way as well.
>> >>
>> >> > any performance impact?
>> >>
>> >> Yes, obviously :)
>> >>
>> >> >> In general, if you want to filter traffic before passing it on to the
>> >> >> kernel, the best bet is to implement your filtering in BPF and run it as
>> >> >> an XDP program.
>> >> >
>> >> > I am thinking for scenario like tcp rst/ack flood DDOS attack to NIC1
>> >> > above, I can't simply drop every rst/ack because there could be
>> >> > legitimate rst/ack, in this case since mTCP can validate legitimate
>> >> > stateful tcp connection, drop flooding rst/ack packet, redirect good
>> >> > rst/ack to NIC2. I am not sure a BPF XDP program attached to NIC1 is
>> >> > able to do stateful TCP packet filtering, does that make sense to you?
>> >>
>> >> It makes sense in the "it can probably be made to work" sense. Not in
>> >> the "why would anyone want to do this" sense. If you're trying to
>> >> protect against SYN flooding using XDP there are better solutions than
>> >> proxying things through a user space TCP stack. See for instance Maxim's
>> >> synproxy patches:
>> >>
>> >
>> > SYN flooding is just one of the example, what I have in mind is an
>> > user space TCP/IP stack runs on top of AF_XDP as middle box/proxy for
>> > packet filtering or load balancing, like F5 BIG-IP runs an user space
>> > TCP/IP stack on top of AF_XDP. I thought open source mTCP + AF_XDP
>> > could be a similar use case as middle box.  user space TCP/IP stack +
>> > AF_XDP as middle box/proxy,  the performance is not going to be good?
>>
>> Well, you can certainly build a proxy using AF_XDP by intercepting all
>> the traffic and bridging it onto a veth device, say. I've certainly
>> heard of people doing that. It'll have some non-trivial overhead,
>> though; even if AF_XDP is fairly high performance, you're still making
>> all traffic take an extra hop through userspace, and you'll lose
>> features like hardware TSO, etc. Whether it can be done with "good"
>> performance depends on your use case, I guess (i.e., how do you define
>> "good performance"?).
>>
>> I guess I don't really see the utility in having a user-space TCP stack
>> be a middlebox? If you're doing packet-level filtering, you could just
>> do that in regular XDP (and the same for load balancing, see e.g.,
>> Katran), and if you want to do application-level filtering (say, a WAF),
>> you could just use the kernel TCP stack?
>>
>
> the reason I mention user-space TCP stack is user space stack appears
> performs better than kernel TCP stack, and we see user-space stack +
> DPDK for high speed packet processing applications out there, since
> XDP/AF_XDP seems to be competing with DPDK, so I thought why not user
> space stack + AF_XDP :)

Well, there's a difference between running a user-level stack directly
in the end application, or using it as a middlebox. The latter just adds
overhead, and again, I really don't see why you'd want to do that?

Also, the mTCP web site cites tests against a 3.10 kernel, and the code
doesn't look like it's been touched for years. So I'd suggest running
some up-to-date tests against a modern kernel (and trying things like
io_uring if your concern is syscall overhead for small flows) before
drawing any conclusions about performance :)

That being said, it's certainly *possible* to do what you're suggesting;
there's even a PMD driver in DPDK for AF_XDP, so in that sense it's
pluggable. So, like, feel free to try it out? I'm just cautioning
against thinking it some kind of magic bullet; packet processing at high
speeds in software is *hard*, so the details matter a lot, and it's
really easy to throw away any performance gains by inefficiencies
elsewhere in the stack (which goes for both the kernel stack, XDP, and
AF_XDP).

-Toke



* Re: Redirect packet back to host stack after AF_XDP?
  2023-01-10 23:27             ` Toke Høiland-Jørgensen
@ 2023-01-11  0:11               ` Vincent Li
  0 siblings, 0 replies; 12+ messages in thread
From: Vincent Li @ 2023-01-11  0:11 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: xdp-newbies

On Tue, Jan 10, 2023 at 3:27 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Vincent Li <vincent.mc.li@gmail.com> writes:
>
> > On Tue, Jan 10, 2023 at 7:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Vincent Li <vincent.mc.li@gmail.com> writes:
> >>
> >> > On Mon, Jan 2, 2023 at 3:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >>
> >> >> Vincent Li <vincent.mc.li@gmail.com> writes:
> >> >>
> >> >> > On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >> >>
> >> >> >> Vincent Li <vincent.mc.li@gmail.com> writes:
> >> >> >>
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > If I have an user space stack like mTCP works on top of AF_XDP as tcp
> >> >> >> > stateful packet filter to drop tcp packet like tcp syn/rst/ack flood
> >> >> >> > or other tcp attack, and redirect good tcp packet back to linux host
> >> >> >> > stack after mTCP filtering, is that possible?
> >> >> >>
> >> >> >> Not really, no. You can inject it using regular userspace methods (say,
> >> >> >> a TUN device), or using AF_XDP on a veth device. But in both cases the
> >> >> >> packet will come in on a different interface, so it's not really
> >> >> >> transparent. And performance is not great either.
> >> >> >
> >> >> > I have thought about it more :) what about this scenario
> >> >> >
> >> >> >
> >> >> > good tcp rst/ack or bad flooding rst/ack -> NIC1 -> mTCP+AF_XDP ->NIC2
> >> >> >
> >> >> > NIC1 and NIC2 on the same host, drop flooding rst/ack by mTCP,
> >> >> > redirect good tcp rst/ack to NIC2, is that possible?
> >> >>
> >> >> You can do this if NIC2 is a veth device: you inject packets into the
> >> >> veth on the TX side, they come out on the other side and from the kernel
> >> >> PoV it looks like all packets come in on the peer veth. You'll need to
> >> >> redirect packets the other way as well.
> >> >>
> >> >> > any performance impact?
> >> >>
> >> >> Yes, obviously :)
> >> >>
> >> >> >> In general, if you want to filter traffic before passing it on to the
> >> >> >> kernel, the best bet is to implement your filtering in BPF and run it as
> >> >> >> an XDP program.
> >> >> >
> >> >> > I am thinking for scenario like tcp rst/ack flood DDOS attack to NIC1
> >> >> > above, I can't simply drop every rst/ack because there could be
> >> >> > legitimate rst/ack, in this case since mTCP can validate legitimate
> >> >> > stateful tcp connection, drop flooding rst/ack packet, redirect good
> >> >> > rst/ack to NIC2. I am not sure a BPF XDP program attached to NIC1 is
> >> >> > able to do stateful TCP packet filtering, does that make sense to you?
> >> >>
> >> >> It makes sense in the "it can probably be made to work" sense. Not in
> >> >> the "why would anyone want to do this" sense. If you're trying to
> >> >> protect against SYN flooding using XDP there are better solutions than
> >> >> proxying things through a user space TCP stack. See for instance Maxim's
> >> >> synproxy patches:
> >> >>
> >> >
> >> > SYN flooding is just one of the example, what I have in mind is an
> >> > user space TCP/IP stack runs on top of AF_XDP as middle box/proxy for
> >> > packet filtering or load balancing, like F5 BIG-IP runs an user space
> >> > TCP/IP stack on top of AF_XDP. I thought open source mTCP + AF_XDP
> >> > could be a similar use case as middle box.  user space TCP/IP stack +
> >> > AF_XDP as middle box/proxy,  the performance is not going to be good?
> >>
> >> Well, you can certainly build a proxy using AF_XDP by intercepting all
> >> the traffic and bridging it onto a veth device, say. I've certainly
> >> heard of people doing that. It'll have some non-trivial overhead,
> >> though; even if AF_XDP is fairly high performance, you're still making
> >> all traffic take an extra hop through userspace, and you'll lose
> >> features like hardware TSO, etc. Whether it can be done with "good"
> >> performance depends on your use case, I guess (i.e., how do you define
> >> "good performance"?).
> >>
> >> I guess I don't really see the utility in having a user-space TCP stack
> >> be a middlebox? If you're doing packet-level filtering, you could just
> >> do that in regular XDP (and the same for load balancing, see e.g.,
> >> Katran), and if you want to do application-level filtering (say, a WAF),
> >> you could just use the kernel TCP stack?
> >>
> >
> > the reason I mention user-space TCP stack is user space stack appears
> > performs better than kernel TCP stack, and we see user-space stack +
> > DPDK for high speed packet processing applications out there, since
> > XDP/AF_XDP seems to be competing with DPDK, so I thought why not user
> > space stack + AF_XDP :)
>
> Well, there's a difference between running a user-level stack directly
> in the end application, or using it as a middlebox. The latter just adds
> overhead, and again, I really don't see why you'd want to do that?

The middlebox indeed adds overhead in the packet path, but commercial
products like BIG-IP are middleboxes with a well-tuned user-space
stack that runs on DPDK successfully, and they could potentially run on
AF_XDP if AF_XDP performs as well as DPDK. I think there's no reason an
open source user-space stack can't do the same :).

Suricata also added AF_XDP support (no TX support though:
https://github.com/OISF/suricata/pull/8210/commits/2cf0b3616c668c0d3fda4de47a6b93630dcc366d).
Suricata can work as an inline middlebox in IPS mode, but that seems
not to be supported for AF_XDP so far. So I am thinking of a user-space
stack + AF_XDP that works as a stealth inline middlebox to do advanced
packet processing/filtering/proxying to protect application servers;
maybe AF_XDP is not designed for such use cases?

>
> Also, the mTCP web site cites tests against a 3.10 kernel, and the code
> doesn't look like it's been touched for years. So I'd suggest running
> some up-to-date tests against a modern kernel (and trying things like
> io_uring if your concern is syscall overhead for small flows) before
> drawing any conclusions about performance :)
>

I cloned https://github.com/mcabranches/mtcp, which added AF_XDP
support. I could run the example epserver app fine on Ubuntu 20.04
with the 5.4 kernel, and yes, it appears to perform worse, CPU-usage-wise,
than the kernel stack when under an hping3 RST flood. Example perf top
output under the hping3 RST flood:

Samples: 120K of event 'cycles', 4000 Hz, Event count (approx.): 6522491740 lost: 0/0 drop: 0/0

Overhead  Shared Object       Symbol
  21.42%  epserver            [.] ProcessPacket
  10.89%  epserver            [.] StreamHTSearch
   7.07%  [kernel]            [k] native_write_msr
   5.75%  epserver            [.] HashFlow
   5.69%  [kernel]            [k] native_read_msr
   3.62%  epserver            [.] xsk_ring_prod__fill_addr
   2.20%  [kernel]            [k] xsk_poll
   2.05%  epserver            [.] afxdp_get_rptr
   1.87%  [kernel]            [k] native_queued_spin_lock_slowpath
   1.57%  epserver            [.] ProcessIPv4Packet
   1.55%  [kernel]            [k] _raw_spin_lock_irqsave
   1.38%  [kernel]            [k] do_sys_poll
   1.34%  epserver            [.] TCPCalcChecksum
   1.22%  [kernel]            [k] vmxnet3_poll_rx_only
   1.21%  libpthread-2.31.so  [.] pthread_mutex_unlock
   1.21%  epserver            [.] xsk_alloc_umem_frame
   1.04%  [kernel]            [k] psi_task_change
   0.99%  [kernel]            [k] cpuacct_charge
   0.91%  epserver            [.] ProcessTCPPacket
   0.90%  [kernel]            [k] vmware_sched_clock
   0.80%  [kernel]            [k] poll_schedule_timeout.constprop.0
   0.72%  epserver            [.] MTCPRunThread
   0.69%  epserver            [.] xsk_prod_nb_free
   0.64%  epserver            [.] afxdp_recv_pkts
   0.64%  [kernel]            [k] read_tsc
   0.56%  [kernel]            [k] xsk_generic_rcv
   0.55%  [kernel]            [k] memcpy
   0.53%  [kernel]            [k] x86_pmu_disable_all
   0.51%  [kernel]            [k] native_apic_msr_eoi_write
   0.50%  [kernel]            [k] fput_many
   0.49%  [kernel]            [k] __perf_event_task_sched_in
   0.48%  [kernel]            [k] __fget_light
   0.48%  [kernel]            [k] __schedule
   0.47%  epserver            [.] xsk_cons_nb_avail
   0.46%  [kernel]            [k] memset
   0.41%  [kernel]            [k] irq_entries_start

If I run the same hping3 RST flood against an nginx web server running
on the Linux kernel stack, there is no visible CPU usage overhead.

> That being said, it's certainly *possible* to do what you're suggesting;
> there's even a PMD driver in DPDK for AF_XDP, so in that sense it's
> pluggable. So, like, feel free to try it out? I'm just cautioning
> against thinking it some kind of magic bullet; packet processing at high
> speeds in software is *hard*, so the details matter a lot, and it's
> really easy to throw away any performance gains by inefficiencies
> elsewhere in the stack (which goes for both the kernel stack, XDP, and
> AF_XDP).

I agree there is no magic bullet; I am just wondering whether there is
potential for AF_XDP in a middlebox scenario.
Thank you for your patience, feel free to drop the discussion :)

>
> -Toke
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-01-11  0:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-14 20:49 Redirect packet back to host stack after AF_XDP? Vincent Li
2022-12-14 22:53 ` Toke Høiland-Jørgensen
2022-12-15  1:57   ` Vincent Li
2022-12-15 11:08     ` Toke Høiland-Jørgensen
2022-12-15 18:52       ` Vincent Li
2022-12-17  2:52   ` Vincent Li
2023-01-02 11:33     ` Toke Høiland-Jørgensen
2023-01-09 21:28       ` Vincent Li
2023-01-10 15:23         ` Toke Høiland-Jørgensen
2023-01-10 16:54           ` Vincent Li
2023-01-10 23:27             ` Toke Høiland-Jørgensen
2023-01-11  0:11               ` Vincent Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).