* [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter @ 2002-09-25 22:41 nf 2002-09-25 22:52 ` David S. Miller 0 siblings, 1 reply; 15+ messages in thread From: nf @ 2002-09-25 22:41 UTC (permalink / raw) To: linux-kernel Hi, nf-hipac aims to become a drop-in replacement for the iptables packet filtering module. It implements a novel framework for packet classification which uses an advanced algorithm to reduce the number of memory lookups per packet. The module is ideal for environments where large rulesets and/or high bandwidth networks are involved. The algorithm code is designed in a way that it can be verified in userspace, so the algorithm code itself can be considered correct. We are not able to really verify the remaining files nfhp_mod.[ch] and the userspace tool (nf-hipac.[ch]), but they are tested in depth and shouldn't contain any critical bugs. We have the results of some basic performance tests available on our web page. The test compares the performance of the iptables filter table to the performance of nf-hipac. Results are pretty impressive :-) You can find the performance test results on our web page http://www.hipac.org The releases can be downloaded from http://sourceforge.net/projects/nf-hipac/ Features: - optimized for high performance packet classification with moderate memory usage - completely dynamic: data structure isn't rebuild from scratch when inserting or deleting rules, so fast updates are possible - userspace tool syntax is very similar to the iptables syntax - kernel does not need to be patched - compatible to iptables: you can use iptables and nf-hipac at the same time: for example you could use the connection tracking module from iptables and match the states with nf-hipac - match support for: + source/destination ip + in/out interface + protocol (udp, tcp, icmp) + source/destination ports (udp, tcp) + icmp type + tcp flags + ttl + state match (conntrack module must be loaded) - /proc/net/nf-hipac: + algorithm statistics available via # cat /proc/net/nf-hipac + allows to dynamically limit the maximum memory usage # echo > /proc/net/nf-hipac Enjoy, +-----------------------+----------------------+ | Michael Bellion | Thomas Heinz | | <mbellion@hipac.org> | <creatix@hipac.org> | +-----------------------+----------------------+ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-25 22:41 [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter nf @ 2002-09-25 22:52 ` David S. Miller 2002-09-26 0:10 ` Rik van Riel ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: David S. Miller @ 2002-09-25 22:52 UTC (permalink / raw) To: nf; +Cc: linux-kernel From: nf@hipac.org Date: Thu, 26 Sep 2002 00:41:56 +0200 We have the results of some basic performance tests available on our web page. The test compares the performance of the iptables filter table to the performance of nf-hipac. Results are pretty impressive :-) You missed the real trick, extending the routing tables to do packet filter rule lookup. That's where the real performance gains can be found, and how we intend to eventually make all firewalling/encapsulation/et al. work. Such a scheme can even obviate socket lookup if implemented properly. It'd basically be a flow cache, much like route lookups but with an expanded key set and the capability to stack routes. Such a flow cache could even be two level, with the top level being %100 cpu local on SMP (ie. no shared cache lines). We have to do the route lookup anyways, if it got you to the packet filtering tables (or ipsec encap information, or TCP socket, etc etc) as a side effect, lots of netfilter becomes superfluous because all it is doing is maintaining these lookup tables etc. which is what you are optimizing. Everyone looks to optimize these things on such a local level, and that's not where the real improvements live. Think globally, and the more powerful solutions are much more evident. Everything, from packet forwarding, to firewalling, to TCP socket packet receive, can be described with routes. It doesn't make sense for forwarding, TCP, netfilter, and encapsulation schemes to duplicate all of this table lookup logic and in fact it's entirely superfluous. This stackable routes idea being worked on, watch this space over the next couple of weeks :-) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-25 22:52 ` David S. Miller @ 2002-09-26 0:10 ` Rik van Riel 2002-09-26 0:25 ` David S. Miller 2002-09-26 0:38 ` nf 2002-09-26 5:19 ` Rusty Russell 2 siblings, 1 reply; 15+ messages in thread From: Rik van Riel @ 2002-09-26 0:10 UTC (permalink / raw) To: David S. Miller; +Cc: nf, linux-kernel On Wed, 25 Sep 2002, David S. Miller wrote: > Everything, from packet forwarding, to firewalling, to TCP socket > packet receive, can be described with routes. QoS stuff too ? Rik -- Bravely reimplemented by the knights who say "NIH". http://www.surriel.com/ http://distro.conectiva.com/ Spamtraps of the month: september@surriel.com trac@trac.org ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-26 0:10 ` Rik van Riel @ 2002-09-26 0:25 ` David S. Miller 0 siblings, 0 replies; 15+ messages in thread From: David S. Miller @ 2002-09-26 0:25 UTC (permalink / raw) To: riel; +Cc: nf, linux-kernel From: Rik van Riel <riel@conectiva.com.br> Date: Wed, 25 Sep 2002 21:10:02 -0300 (BRT) On Wed, 25 Sep 2002, David S. Miller wrote: > Everything, from packet forwarding, to firewalling, to TCP socket > packet receive, can be described with routes. QoS stuff too ? It is definitely possible. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-25 22:52 ` David S. Miller 2002-09-26 0:10 ` Rik van Riel @ 2002-09-26 0:38 ` nf 2002-09-26 0:37 ` David S. Miller 2002-09-26 5:19 ` Rusty Russell 2 siblings, 1 reply; 15+ messages in thread From: nf @ 2002-09-26 0:38 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel Hi, > You missed the real trick, extending the routing tables to > do packet filter rule lookup. That's where the real > performance gains can be found, ... Yes, that's certainly true. But we have to take step by step. We started our project in August 2001 and up to now almost all our work went into developing and optimizing the algorithm and not into an optimal integration into the linux kernel. We chose the netfilter integration as a first step, because it was easy and fast to do. It doesn't break anything in the kernel, no kernel patch is needed, it can be used together with other existing netfilter/iptables modules and switching from the iptables filter module to nf-hipac is really easy. We have now basically finished the work on the algorithm itself. We can now concentrate on porting the algorithm to other fields and on a better integration into the kernel. We designed the algorithm code in a way that allows to port it to other fields than packetfiltering without much work. We were aware from the beginning that combining several fields (e.g. routing and filtering) is THE way to go and it is no problem to support this with our algorithm. Our algorithm is already fast with a small number of rules, but what makes it really interesting is, that it is possible to use huge rulesets/routing tables/qos ... without much slowing down performance. In practical use people won't notice much of a difference between using 25 or 25000 rules. The nf-hipac team Michael Bellion, Thomas Heinz ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-26 0:38 ` nf @ 2002-09-26 0:37 ` David S. Miller 2002-09-26 1:44 ` nf 0 siblings, 1 reply; 15+ messages in thread From: David S. Miller @ 2002-09-26 0:37 UTC (permalink / raw) To: nf; +Cc: linux-kernel From: nf@hipac.org Date: Thu, 26 Sep 2002 02:38:06 +0200 > You missed the real trick, extending the routing tables to > do packet filter rule lookup. That's where the real > performance gains can be found, ... Yes, that's certainly true. But we have to take step by step. All the work you will spend to validate all the converted modules will be equal if not greater to doing the algorithm correctly. So why not do it correctly from the start? :-) Seriously, just sit tight with your work, and once the stackable route stuff is done, you can look into applying your algorithms to the new flow cache. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-26 0:37 ` David S. Miller @ 2002-09-26 1:44 ` nf 2002-09-26 3:30 ` David S. Miller 0 siblings, 1 reply; 15+ messages in thread From: nf @ 2002-09-26 1:44 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel > Seriously, just sit tight with your work, and once the stackable > route stuff is done, you can look into applying your algorithms > to the new flow cache. Sorry, we are a bit confused of the formulation "adding the algorithmus to the new flow cache" Why to the flow cache? What exaclty is the job of this flow cache? Does the job go beyond caching recently "lookup results"? What happens if the flow cache doesn't have a certain lookup result in the cache yet? We mean, how is the packet classification solved then? Is it right, that the code will then use a linear search algorithm and compare the packet with each rule sequentially until a rule is found that matches all relevant fields? Our algorithm does not implement some kind of cache. Our algorithm is actually a replacement for that linear search algorithm. Our algorithm implements an advanced approach to the packet classification problem itself. the nf-hipac team Michael Bellion, Thomas Heinz ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-26 1:44 ` nf @ 2002-09-26 3:30 ` David S. Miller 0 siblings, 0 replies; 15+ messages in thread From: David S. Miller @ 2002-09-26 3:30 UTC (permalink / raw) To: nf; +Cc: linux-kernel From: nf@hipac.org Date: Thu, 26 Sep 2002 03:44:26 +0200 Sorry, we are a bit confused of the formulation "adding the algorithmus to the new flow cache" Why to the flow cache? What exaclty is the job of this flow cache? Does the job go beyond caching recently "lookup results"? It is just a lookup function, keyed on whatever we'd like to take from the incoming packet. I mean that if you find a stronger hash/lookup architecture that is faster for this purpose, we can integrate into _whatever_ scheme we end up using. What happens if the flow cache doesn't have a certain lookup result in the cache yet? It goes to the level 2 lookup tables, which are slightly more complex yet are still reasonably fast for lookups. Is it right, that the code will then use a linear search algorithm and compare the packet with each rule sequentially until a rule is found that matches all relevant fields? No linear search, but because we'll be doing masked/prefix lookups on the various keys the lookup table will be different than the one at the top level which uses perfect comparison results. Just look at how the routing code works now, the final result will be architected not much differently. Franks a lot, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-25 22:52 ` David S. Miller 2002-09-26 0:10 ` Rik van Riel 2002-09-26 0:38 ` nf @ 2002-09-26 5:19 ` Rusty Russell 2002-09-26 5:40 ` David S. Miller 2 siblings, 1 reply; 15+ messages in thread From: Rusty Russell @ 2002-09-26 5:19 UTC (permalink / raw) To: David S. Miller; +Cc: nf, linux-kernel On Wed, 25 Sep 2002 15:52:46 -0700 (PDT) "David S. Miller" <davem@redhat.com> wrote: > We have to do the route lookup anyways, if it got you to the packet > filtering tables (or ipsec encap information, or TCP socket, etc etc) > as a side effect, lots of netfilter becomes superfluous because all it > is doing is maintaining these lookup tables etc. which is what you are > optimizing. Indeed. Note that netfilter infrastructure had this from the beginning, but it sits unused (skb->nf_cache), awaiting someone to get enthusiastic. There are three real issues: 1) You need a way to say "too hard, don't cache this". We have NFC_UNKNOWN (I looked at some packet field you don't have a bit for) and NFC_UNCACHABLE (give me every packet dammit!). 2) You need a sane "selective flush" mechanism for timeouts and rule changes (eg. connection tracking and nat). 3) If you want to keep counters in the subsystems (eg. iptables keeps packet and byte counters at the moment for every rule because it's easy). You need to keep this info in your route cache, then call the subsystems when you evict something from the cache. Cheers! Rusty. -- there are those who do and those who hang on and you don't see too many doers quoting their contemporaries. -- Larry McVoy ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-26 5:19 ` Rusty Russell @ 2002-09-26 5:40 ` David S. Miller 2002-09-26 15:27 ` James Morris 0 siblings, 1 reply; 15+ messages in thread From: David S. Miller @ 2002-09-26 5:40 UTC (permalink / raw) To: rusty; +Cc: nf, linux-kernel From: Rusty Russell <rusty@rustcorp.com.au> Date: Thu, 26 Sep 2002 15:19:33 +1000 3) If you want to keep counters in the subsystems (eg. iptables keeps packet and byte counters at the moment for every rule because it's easy). You need to keep this info in your route cache, then call the subsystems when you evict something from the cache. The flow cache, routing table, whatever you want to call it, doesn't care what your info looks like, that will be totally private. Actually the idea is that skb tagging would not be needed, the skb->dst would _BE_ the tag. From it you could get at all the info you needed. Draw a picture in your head as you read this. The private information could be a socket, a fast forwarding path, a stack of IPSEC decapsulation operations, netfilter connection tracking info, anything. So skb->dst could point to something like: struct socket_dst { struct dst_entry dst; enum dst_type dst_type; struct sock *sk; } Here dst->input() would be tcp_ipv4_rcv(). It could just as easily be: struct nf_conntrack_dst { struct dst_entry dst; enum dst_type dst_type; struct nf_conntrack_info *ct; } And dst->input() would be nf_conntrack_input(), or actually something more specific. See? And when you have the "drop packet" firewall rule, the skb->dst then points to nf_blackhole() (which perhaps logs the event if you need to do that). If you have things that must happen in a sequence to flow through your path properly, that's where the "stackable" bit comes in. You do that one bit, skb->dst = dst_pop(skb->dst), then your caller will pass the packet on to skb->dst->{output,input}(). Is it clearer now the kind of things you'll be able to do? Stackable entries also allow encapsulation situations to propagate precisely calculated PMTU information back to the transports. Cached entries are created by demand, the level 2 lookup area is where long term information sits. The top level lookup engine carries cached entries of whatever is active at a given moment, that is why I refer to it as a flow cache sometimes. All of this was concocted to do IPSEC in a reasonable way, but as a nice side effect it ends up solving a ton of other problems too. I think some guy named Linus once told me that's the indication of a clean design. :-) ... Let me give another example of how netfilter could use this. Consider FTP nat stuff, you'd start with the "from TCP, any address/anyport, to any address/ FTP port" DST. This watches for new FTP connections, so you can track them properly and do NAT or whatever else. This destination would drop anything other only than SYN packets. It's dst->input() would point to something like ftp_new_connection_input(). When you get a new connection, you create a destination which handles the translation of specifically THAT ftp connection (ie. specific instead of "any" addr/port pairs are used for the key). Here, dst->input() would point to ftp_existing_connection_input(). ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-26 5:40 ` David S. Miller @ 2002-09-26 15:27 ` James Morris 2002-09-26 20:52 ` David S. Miller 0 siblings, 1 reply; 15+ messages in thread From: James Morris @ 2002-09-26 15:27 UTC (permalink / raw) To: David S. Miller; +Cc: Rusty Russell, nf, linux-kernel, netdev On Wed, 25 Sep 2002, David S. Miller wrote: > If you have things that must happen in a sequence to flow through > your path properly, that's where the "stackable" bit comes in. You > do that one bit, skb->dst = dst_pop(skb->dst), then your caller > will pass the packet on to skb->dst->{output,input}(). > > Is it clearer now the kind of things you'll be able to do? > So, this could be used for generic network layer encapsulation, and be used for GRE tunnels, SIT etc. without the kinds of kludges currently in use? Sounds nice. - James -- James Morris <jmorris@intercode.com.au> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-26 15:27 ` James Morris @ 2002-09-26 20:52 ` David S. Miller 2002-09-27 3:00 ` Michael Richardson 2002-09-27 14:12 ` jamal 0 siblings, 2 replies; 15+ messages in thread From: David S. Miller @ 2002-09-26 20:52 UTC (permalink / raw) To: jmorris; +Cc: rusty, nf, linux-kernel, netdev From: James Morris <jmorris@intercode.com.au> Date: Fri, 27 Sep 2002 01:27:41 +1000 (EST) So, this could be used for generic network layer encapsulation, and be used for GRE tunnels, SIT etc. without the kinds of kludges currently in use? Sounds nice. Such IPIP tunnels have very real problems though, since only 64-bits of packet quoting are required in ICMP errors, it is often impossible to deal with PMTU requests properly, see "#ifndef I_WISH_WORLD_WERE_PERFECT" in net/ipv4/ip_gre.c ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-26 20:52 ` David S. Miller @ 2002-09-27 3:00 ` Michael Richardson 2002-09-27 14:12 ` jamal 1 sibling, 0 replies; 15+ messages in thread From: Michael Richardson @ 2002-09-27 3:00 UTC (permalink / raw) To: David S. Miller; +Cc: jmorris, rusty, nf, linux-kernel, netdev -----BEGIN PGP SIGNED MESSAGE----- >>>>> "David" == David S Miller <davem@redhat.com> writes: David> From: James Morris <jmorris@intercode.com.au> David> Date: Fri, 27 Sep 2002 01:27:41 +1000 (EST) David> So, this could be used for generic network layer encapsulation, and be David> used for GRE tunnels, SIT etc. without the kinds of kludges currently in David> use? Sounds nice. David> Such IPIP tunnels have very real problems though, since only 64-bits David> of packet quoting are required in ICMP errors, it is often impossible David> to deal with PMTU requests properly, see "#ifndef David> I_WISH_WORLD_WERE_PERFECT" in net/ipv4/ip_gre.c IPsec tunnels are even worse, because, not only is there not enough info returned, but, being paranoid, one should really not even trust it. ICMP Port not reachable for UDP port 500 are even more nasty, because sometimes they indicate a REAL problem :-) Eons ago, I proposed a way to deal with this problem, see: http://www.sandelman.ottawa.on.ca/SSW/ietf/draft-richardson-ipsec-pmtu-discovery-00.txt I think that now that Linux doesn't linearize skbuff's prior to passing them to protocol handlers, that I actually could get the fragment info from the skb chain. Excerpt from document: Gateway G2 upon receiving an ESP or AH packet that needs to be reassembled, MUST take note of the size largest fragment received. This value is compared to the previous largest fragment size. If this size has changed by more than 10%, or more than 2*MSL time (i.e. 2 minutes) has passed since the previous ICMP message, then an ICMP Datagram Too Big message is generated. The largest fragment size is initialized to 576 bytes. The ICMP datagram is addressed from gateway G2 to the originating node C, and gives a size that is based on the maximum fragment size (above), minus the IPsec overhead. The ICMP datagram is sent via the tunnel on which the IPsec packet was a member. I.e. the ICMP is encapsulated. A packet arriving at G1 with the DF bit set, does not cause the DF bit to be set on the encapsulating datagram. (proposal two changes the destination IP of the ICMP message) ] ON HUMILITY: to err is human. To moo, bovine. | firewalls [ ] Michael Richardson, Sandelman Software Works, Ottawa, ON |net architect[ ] mcr@sandelman.ottawa.on.ca http://www.sandelman.ottawa.on.ca/ |device driver[ ] panic("Just another Debian GNU/Linux using, kernel hacking, security guy"); [ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (GNU/Linux) Comment: Finger me for keys iQCVAwUBPZPJt4qHRg3pndX9AQGqYwP+JtDyOhQwV2kiFWqxs8H8QbU0OJmV/krd rNhapv6/qzxcqHHPWHiypxQLZ3uYHcNKfwdoQFhOgVxdZXivkOnvn9bjoKDIp+EL JKNNvSrHNZMJ9yQCqUnsI+2XknkR9bCOOK9DfTcJ9e2lSFlH7H1vSTnRo6GOJhVh arQUa22xoc8= =VAyj -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-26 20:52 ` David S. Miller 2002-09-27 3:00 ` Michael Richardson @ 2002-09-27 14:12 ` jamal 2002-09-28 1:30 ` David S. Miller 1 sibling, 1 reply; 15+ messages in thread From: jamal @ 2002-09-27 14:12 UTC (permalink / raw) To: David S. Miller; +Cc: jmorris, rusty, nf, linux-kernel, netdev Dave, now that i followed the thread on lk (slrn is great; thanks Jason); I am actually interested to find out how you are going to pull what you propose ;-> There are not that many things that will work well with a dst-cache like idea. I actually considered the stacking idea when i first was trying to prototype code that i posted. It is much harder to make use of in practise. At least this is my experience. If you look at the scheme i posted, youll see that the policy could be to direct the packets to a IPV4-forwarding block or totaly bypass it etc (i just didnt wanna jump into that step yet sicne it is quiet involved architecturaly) In any case we need to encourage people like the hipac authors to be putting out things (i only wish theyd incorporate it into the tc framework!); whatever changes made should consider that there is more than one way to do things and people will always come with better ways to do certain portions of the packet path. cheers, jamal On Thu, 26 Sep 2002, David S. Miller wrote: > From: James Morris <jmorris@intercode.com.au> > Date: Fri, 27 Sep 2002 01:27:41 +1000 (EST) > > So, this could be used for generic network layer encapsulation, and be > used for GRE tunnels, SIT etc. without the kinds of kludges currently in > use? Sounds nice. > > Such IPIP tunnels have very real problems though, since only 64-bits > of packet quoting are required in ICMP errors, it is often impossible > to deal with PMTU requests properly, see "#ifndef > I_WISH_WORLD_WERE_PERFECT" in net/ipv4/ip_gre.c > > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter 2002-09-27 14:12 ` jamal @ 2002-09-28 1:30 ` David S. Miller 0 siblings, 0 replies; 15+ messages in thread From: David S. Miller @ 2002-09-28 1:30 UTC (permalink / raw) To: hadi; +Cc: jmorris, rusty, nf, linux-kernel, netdev From: jamal <hadi@cyberus.ca> Date: Fri, 27 Sep 2002 10:12:26 -0400 (EDT) whatever changes made should consider that there is more than one way to do things and people will always come with better ways to do certain portions of the packet path. Of course. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2002-09-28 1:32 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2002-09-25 22:41 [ANNOUNCE] NF-HIPAC: High Performance Packet Classification for Netfilter nf 2002-09-25 22:52 ` David S. Miller 2002-09-26 0:10 ` Rik van Riel 2002-09-26 0:25 ` David S. Miller 2002-09-26 0:38 ` nf 2002-09-26 0:37 ` David S. Miller 2002-09-26 1:44 ` nf 2002-09-26 3:30 ` David S. Miller 2002-09-26 5:19 ` Rusty Russell 2002-09-26 5:40 ` David S. Miller 2002-09-26 15:27 ` James Morris 2002-09-26 20:52 ` David S. Miller 2002-09-27 3:00 ` Michael Richardson 2002-09-27 14:12 ` jamal 2002-09-28 1:30 ` David S. Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).