xdp-newbies.vger.kernel.org archive mirror
From: Edward Cree <ecree@solarflare.com>
To: Alexander Petrovsky <askjuise@gmail.com>, xdp-newbies@vger.kernel.org
Subject: Re: IP fragmentation
Date: Mon, 20 Jul 2020 13:17:40 +0100
Message-ID: <6de242a4-263a-bbde-7af4-68532904e4b3@solarflare.com>
In-Reply-To: <CAH57y_Rxm9_eB5jyjJ2OryLd6HB6mXSG8s-MR3BWs-99PVNG0g@mail.gmail.com>

On 20/07/2020 09:15, Alexander Petrovsky wrote:
> But the main problem for us is fragmented IP packets. Some time
> ago I tried using AF_XDP for such packets: fast-pass them into
> user space, accumulate them, and then pass them back to the network.
> It was a PoC.
Not 100% sure this works because I haven't tried it, but as long as
 packets aren't being re-ordered, you can do it without needing to
 save the payload in a map.
All the map needs to store is (for each IPID being tracked) what host
 this connection goes to.
If you receive a First Fragment (frag_off=0, MF=1), you look up the
 tuple through the regular LB to pick a server, and record that host
 in the map entry for the IPID.
For any other fragment, you look up the IPID in the map to get the
 destination host, and if MF=0 you delete the map entry.
 (If the IPID wasn't found, either drop or punt to userspace.)
Then TX/REDIRECT the packet to the appropriate host.
You might want to add some kind of simple ageing to this so that map
 entries from interrupted/spurious fragment chains don't stick around
 and build up over time.
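
A minimal sketch of the scheme above, written as a plain userspace C
model rather than an actual XDP program. Everything named here
(steer_fragment, frag_key, FRAG_SLOTS, FRAG_MAX_AGE, HOST_NONE) is
hypothetical and only for illustration. It also assumes the key is
(saddr, daddr, proto, IPID) rather than the IPID alone, since two
senders can pick the same IPID, and that the caller supplies the
regular LB result and a tick counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FRAG_SLOTS   256   /* hypothetical table size */
#define FRAG_MAX_AGE 30    /* hypothetical ageing limit, in ticks */
#define HOST_NONE    (-1)  /* no entry: drop or punt to userspace */

/* Assumption: key on addresses and protocol too, not just the IPID. */
struct frag_key {
    uint32_t saddr, daddr;
    uint16_t ipid;
    uint8_t  proto;
};

struct frag_entry {
    struct frag_key key;
    int      host;       /* backend chosen for this fragment chain */
    uint64_t last_seen;  /* tick of the most recent fragment */
    bool     used;
};

static struct frag_entry frag_table[FRAG_SLOTS];

static bool key_eq(const struct frag_key *a, const struct frag_key *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->ipid == b->ipid && a->proto == b->proto;
}

/* Find a live entry for k, expiring stale entries along the way. */
static struct frag_entry *frag_find(const struct frag_key *k, uint64_t now)
{
    for (size_t i = 0; i < FRAG_SLOTS; i++) {
        struct frag_entry *e = &frag_table[i];
        if (e->used && now - e->last_seen > FRAG_MAX_AGE)
            e->used = false;  /* simple ageing */
        if (e->used && key_eq(&e->key, k))
            return e;
    }
    return NULL;
}

static struct frag_entry *frag_alloc(void)
{
    for (size_t i = 0; i < FRAG_SLOTS; i++)
        if (!frag_table[i].used)
            return &frag_table[i];
    return NULL;  /* table full */
}

/* Returns the host to TX/REDIRECT to, or HOST_NONE if unknown.
 * lb_host is the regular LB result; it is only used for a first
 * fragment, the only one that carries the L4 ports. */
int steer_fragment(const struct frag_key *k, uint16_t frag_off, bool mf,
                   int lb_host, uint64_t now)
{
    assert(frag_off != 0 || mf);  /* callers pass fragments only */

    struct frag_entry *e = frag_find(k, now);

    if (frag_off == 0) {          /* first fragment: record the host */
        if (!e)
            e = frag_alloc();
        if (e) {
            e->key = *k;
            e->host = lb_host;
            e->last_seen = now;
            e->used = true;
        }
        return lb_host;
    }

    if (!e)
        return HOST_NONE;         /* IPID not found */

    int host = e->host;
    if (mf)
        e->last_seen = now;       /* middle fragment: keep entry alive */
    else
        e->used = false;          /* last fragment: delete the entry */
    return host;
}
```

The linear scan is only there to keep the model self-contained. In a
real XDP program the table would be a BPF map, and an LRU hash map
(BPF_MAP_TYPE_LRU_HASH) might cover the ageing by evicting old entries,
though I haven't tried that here.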

The problem comes when fragments arrive out of order.  If a 'middle'
 fragment arrives after the last (MF=0) fragment, that can technically
 be handled by tracking the byte range seen for the IPID, and not
 deleting the map entry until all bytes up to the end of the last
 fragment (its frag_off plus its payload length) have been seen.
Worse is a fragment arriving before the first one: if the frag_off=0
 fragment isn't the first one received, then this doesn't work,
 because at the time you receive the earlier fragments you don't know
 what L4 ports they belong to.  But I don't know how common that
 situation is, or whether having it take the slow-path is acceptable.
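
A rough sketch of the "last fragment arrives early" bookkeeping, again
as plain C rather than BPF. Note that it counts payload bytes instead
of tracking the exact byte ranges seen, so it assumes fragments are
never duplicated or overlapping; that is a simplification, not the
full range tracking. The names frag_progress and frag_account are
hypothetical, frag_off is taken in bytes, and payload_len is assumed
to be the IP total length minus the IP header length.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-IPID progress, stored alongside the destination host. */
struct frag_progress {
    uint32_t bytes_seen;   /* payload bytes received so far */
    uint32_t total_bytes;  /* end of the last fragment; 0 until seen */
};

/* Account for one fragment. Returns true once every byte up to the
 * end of the last (MF=0) fragment has been seen, i.e. when the map
 * entry can be deleted. */
static bool frag_account(struct frag_progress *p, uint16_t frag_off,
                         uint16_t payload_len, bool mf)
{
    p->bytes_seen += payload_len;
    if (!mf)  /* the last fragment fixes the datagram's total size */
        p->total_bytes = (uint32_t)frag_off + payload_len;
    return p->total_bytes != 0 && p->bytes_seen >= p->total_bytes;
}
```

With this, the map entry would be deleted when frag_account() returns
true rather than as soon as MF=0 is seen. It does nothing for the
case where the frag_off=0 fragment is not the first one received.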

HTH,
-ed


Thread overview: 4+ messages
2020-07-20  8:15 IP fragmentation Alexander Petrovsky
2020-07-20 12:17 ` Edward Cree [this message]
2020-07-20 14:47   ` Alexander Petrovsky
2020-07-21 12:33     ` Edward Cree
