From: Edward Cree <ecree@solarflare.com>
To: Alexander Petrovsky <askjuise@gmail.com>, xdp-newbies@vger.kernel.org
Subject: Re: IP fragmentation
Date: Mon, 20 Jul 2020 13:17:40 +0100
Message-ID: <6de242a4-263a-bbde-7af4-68532904e4b3@solarflare.com>
In-Reply-To: <CAH57y_Rxm9_eB5jyjJ2OryLd6HB6mXSG8s-MR3BWs-99PVNG0g@mail.gmail.com>
On 20/07/2020 09:15, Alexander Petrovsky wrote:
> But the main problem for us is fragmented IP packets. Some time ago
> I tried using AF_XDP for such packets: fast-pass them into user
> space, accumulate them, and then pass them back to the network. It
> was a PoC.
Not 100% sure this works because I haven't tried it, but as long as
packets aren't being re-ordered, you can do it without needing to
save the payload in a map.
All the map needs to store is (for each IPID being tracked) what host
this connection goes to.
If you receive a First Fragment (frag_off=0, MF=1), you look up the
tuple through the regular LB to pick a server, and record that host
in the map entry for the IPID.
For any other fragment, you look up the IPID in the map to get the
destination host, and if MF=0 you delete the map entry.
(If the IPID wasn't found, either drop or punt to userspace.)
Then TX/REDIRECT the packet to the appropriate host.
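
The per-fragment steps above can be sketched as a small userspace
model. This is Python rather than real XDP C, purely to show the map
logic, and like the scheme itself it is untested against real traffic.
The names steer_fragment, frag_map and pick_backend are made up for
illustration. Keying on the IPID alone follows the description above;
a real program would probably also want the source and destination
addresses in the key.

```python
from typing import Callable, Dict, Optional


def steer_fragment(
    frag_map: Dict[int, str],
    ipid: int,
    frag_off: int,
    more_frags: bool,
    pick_backend: Callable[[], str],
) -> Optional[str]:
    """Return the host to redirect to, or None to drop/punt to userspace.

    pick_backend stands in for the regular LB lookup on the L4 tuple
    (hypothetical callback, only usable when the L4 header is present).
    """
    if frag_off == 0:
        # Offset 0 carries the L4 header, so the normal LB lookup works.
        host = pick_backend()
        if more_frags:
            # First fragment of a chain: remember the choice for the rest.
            frag_map[ipid] = host
        return host

    # Any other fragment: no L4 header, rely on the recorded host.
    host = frag_map.get(ipid)
    if host is None:
        return None  # IPID not found: drop or punt
    if not more_frags:
        # Last fragment: the chain is finished (assuming no re-ordering).
        del frag_map[ipid]
    return host
```

The caller would then TX/REDIRECT the packet to the returned host.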
You might want to add some kind of simple ageing to this so that map
entries from interrupted/spurious fragment chains don't stick around
and build up over time.
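
One possible shape for that ageing, again as a userspace Python model
rather than XDP code: store a creation timestamp next to the host and
treat entries older than some limit as absent. The 30-second limit and
the names record_host, lookup_host and expire_old are assumptions for
illustration only.

```python
from typing import Dict, Optional, Tuple

MAX_AGE_NS = 30 * 1_000_000_000  # assumed timeout, tune for your traffic

# ipid -> (host, creation time in nanoseconds)
FragMap = Dict[int, Tuple[str, int]]


def record_host(frag_map: FragMap, ipid: int, host: str, now_ns: int) -> None:
    """Remember which host this IPID's fragments go to."""
    frag_map[ipid] = (host, now_ns)


def lookup_host(
    frag_map: FragMap, ipid: int, now_ns: int, max_age_ns: int = MAX_AGE_NS
) -> Optional[str]:
    """Return the recorded host, or None if missing or too old."""
    entry = frag_map.get(ipid)
    if entry is None:
        return None
    host, created_ns = entry
    if now_ns - created_ns > max_age_ns:
        # Stale entry from an interrupted chain: discard it lazily.
        del frag_map[ipid]
        return None
    return host


def expire_old(frag_map: FragMap, now_ns: int, max_age_ns: int = MAX_AGE_NS) -> int:
    """Periodic sweep: remove all stale entries, return how many went."""
    stale = [k for k, (_, t) in frag_map.items() if now_ns - t > max_age_ns]
    for k in stale:
        del frag_map[k]
    return len(stale)
```

Lazy expiry on lookup only cleans up an entry when its IPID shows up
again, hence the separate sweep. In an XDP program the timestamp could
come from bpf_ktime_get_ns(), and an LRU map type such as
BPF_MAP_TYPE_LRU_HASH might bound the map size instead of a sweep; both
are suggestions, not part of the scheme described above.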
The problem comes when packets are re-ordered. A 'middle' fragment
can arrive after the last (MF=0) fragment; technically this can be
handled by tracking the byte range seen for the IPID, and not
deleting the map entry until all bytes up to the end of the last
fragment (its frag_off plus its payload length) have been seen.
Worse, a fragment can arrive before the first fragment. If the
frag_off=0 fragment isn't the first one received, then this doesn't
work, because at the time you receive the other fragments you don't
know what L4 ports they belong to. But I don't know how common that
situation is, or whether having it take the slow path is acceptable.
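
A minimal sketch of the byte-range idea for late 'middle' fragments,
as a userspace Python model. To stay short it counts payload bytes
instead of tracking real ranges, which assumes fragments are never
duplicated and never overlap. It also assumes the map entry was
created by the first fragment and already counts that fragment's
payload. FragState and account_fragment are made-up names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FragState:
    host: str
    bytes_seen: int = 0                 # payload bytes seen so far
    total_bytes: Optional[int] = None   # known once the MF=0 fragment arrives


def account_fragment(
    frag_map: Dict[int, FragState],
    ipid: int,
    offset_bytes: int,
    payload_len: int,
    more_frags: bool,
) -> Optional[str]:
    """Handle a non-first fragment; return its host, or None to punt.

    offset_bytes is the fragment offset already converted to bytes
    (the header field counts 8-byte units).
    """
    state = frag_map.get(ipid)
    if state is None:
        # First fragment not seen yet: L4 ports unknown, take the slow path.
        return None
    state.bytes_seen += payload_len
    if not more_frags:
        # The last fragment tells us where the datagram ends.
        state.total_bytes = offset_bytes + payload_len
    if state.total_bytes is not None and state.bytes_seen >= state.total_bytes:
        # Everything up to the end of the last fragment has been seen.
        del frag_map[ipid]
    return state.host
```

This only covers fragments that arrive after the first one. A fragment
that arrives before the frag_off=0 fragment still returns None here.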
HTH,
-ed