archive mirror
 help / color / mirror / Atom feed
From: "David S. Miller" <>
Subject: Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter
Date: Wed, 25 Sep 2002 22:40:01 -0700 (PDT)	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

   From: Rusty Russell <>
   Date: Thu, 26 Sep 2002 15:19:33 +1000
   3) If you want to keep counters in the subsystems (eg. iptables keeps packet
      and byte counters at the moment for every rule because it's easy). You
      need to keep this info in your route cache, then call the subsystems when
      you evict something from the cache.

The flow cache, routing table, whatever you want to call it, doesn't
care what your info looks like, that will be totally private.

Actually the idea is that skb tagging would not be needed, the
skb->dst would _BE_ the tag.  From it you could get at all the
info you needed.

Draw a picture in your head as you read this.

The private information could be a socket, a fast forwarding path, a
stack of IPSEC decapsulation operations, netfilter connection tracking
info, anything.

So skb->dst could point to something like:

	struct socket_dst {
		struct dst_entry dst;
		enum dst_type dst_type;
		struct sock *sk;

Here dst->input() would be tcp_ipv4_rcv().

It could just as easily be:

	struct nf_conntrack_dst {
		struct dst_entry dst;
		enum dst_type dst_type;
		struct nf_conntrack_info *ct;

And dst->input() would be nf_conntrack_input(), or actually something
more specific.


And when you have the "drop packet" firewall rule, the skb->dst then
points to nf_blackhole() (which perhaps logs the event if you need to
do that).

If you have things that must happen in a sequence to flow through
your path properly, that's where the "stackable" bit comes in.  You
do that one bit, skb->dst = dst_pop(skb->dst), then your caller
will pass the packet on to skb->dst->{output,input}().

Is it clearer now the kind of things you'll be able to do?

Stackable entries also allow encapsulation situations to propagate
precisely calculated PMTU information back to the transports.

Cached entries are created by demand, the level 2 lookup area is
where long term information sits.  The top level lookup engine
carries cached entries of whatever is active at a given moment,
that is why I refer to it as a flow cache sometimes.

All of this was concocted to do IPSEC in a reasonable way, but as a
nice side effect it ends up solving a ton of other problems too.  I
think some guy named Linus once told me that's the indication of a
clean design. :-)


Let me give another example of how netfilter could use this.  Consider
FTP nat stuff, you'd start with the "from TCP, any address/anyport, to
any address/ FTP port" DST.  This watches for new FTP connections, so
you can track them properly and do NAT or whatever else.  This
destination would drop anything other only than SYN packets.  It's
dst->input() would point to something like ftp_new_connection_input().

When you get a new connection, you create a destination which handles
the translation of specifically THAT ftp connection (ie. specific
instead of "any" addr/port pairs are used for the key).  Here,
dst->input() would point to ftp_existing_connection_input().

  reply	other threads:[~2002-09-26  5:40 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-09-25 22:41 [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter nf
2002-09-25 22:52 ` David S. Miller
2002-09-26  0:10   ` Rik van Riel
2002-09-26  0:25     ` David S. Miller
2002-09-26  0:38   ` nf
2002-09-26  0:37     ` David S. Miller
2002-09-26  1:44       ` nf
2002-09-26  3:30         ` David S. Miller
2002-09-26  5:19   ` Rusty Russell
2002-09-26  5:40     ` David S. Miller [this message]
2002-09-26 15:27       ` James Morris
2002-09-26 20:52         ` David S. Miller
2002-09-27  3:00           ` Michael Richardson
2002-09-27 14:12           ` jamal
2002-09-28  1:30             ` David S. Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).