netfilter-devel.vger.kernel.org archive mirror
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Patrick McHardy <kaber@trash.net>
Cc: netfilter-devel@vger.kernel.org
Subject: Re: [PATCH] netfilter: xtables: add cluster match
Date: Wed, 18 Feb 2009 12:06:39 +0100
Message-ID: <499BEBBF.7080705@netfilter.org>
In-Reply-To: <499BDF5D.2010809@trash.net>

Patrick McHardy wrote:
> Pablo Neira Ayuso wrote:
>> Patrick McHardy wrote:
>>>> A possible solution (which, thinking it over, I don't like too much
>>>> yet) would be to convert this to a HASHMARK target that stores the
>>>> result of the hash in the skbuff mark, but the problem is that it
>>>> would require reserved space for hash marks, since they may clash
>>>> with other user-defined marks.
>>> That sounds a bit like a premature optimization. What I don't get
>>> is why you don't simply set cluster-total-nodes to one when two
>>> are down or remove the rule entirely.
>>
>> Indeed, but in practice existing failover daemons (at least the
>> free/open-source ones that I know of) don't show that "intelligent"
>> behaviour: they initially assign the resources to each node according
>> to the configuration file, and if one node fails, they assign the
>> corresponding resources to another sane node (i.e. the daemon runs a
>> script with the corresponding iptables rules).
>>
>> Re-adjusting the cluster-total-nodes and cluster-local-node options
>> (e.g. if one cluster node goes down and only two nodes remain alive,
>> changing the rule-set to cover only two nodes) indeed seems the
>> natural way to go, since the surviving cluster nodes would share the
>> workload that the failing node has left. However, as said, existing
>> failover daemons only select one new master to recover what a failing
>> node was doing, so only one of them runs the script that injects the
>> states into the kernel.
>>
>> Therefore, AFAICS, without the /proc interface I would need one
>> iptables rule per cluster-local-node handled, which still leaves a
>> possibly sub-optimal situation when one or several nodes fail.
> 
> OK, that explains why you want to handle it this way. I don't want
> to merge the proc file part though, so until the daemons get smarter,
> people will have to use multiple rules.

:(
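
For reference, the multiple-rules approach that people will have to use
looks something like this (a sketch, assuming the option names from this
patch and the usual mark-and-drop arrangement; a three-node cluster seen
from node 1):

  # normal operation: node 1 handles its own hash bucket
  iptables -A PREROUTING -t mangle -i eth1 -m cluster \
      --cluster-total-nodes 3 --cluster-local-node 1 \
      --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
  iptables -A PREROUTING -t mangle -i eth1 \
      -m mark ! --mark 0xffff -j DROP

  # failover script run by the daemon when node 3 dies:
  # node 1 additionally takes over node 3's hash bucket
  iptables -I PREROUTING -t mangle -i eth1 -m cluster \
      --cluster-total-nodes 3 --cluster-local-node 3 \
      --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff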

> BTW, I recently looked into TIPC; it's incredibly easy to use, since
> it deals with dead-node detection etc. internally and all you need
> to do is exchange a few messages. Might be quite easy to write a
> smarter failover daemon.

I see. I don't have any argument more convincing than "I would also need
time for that, but in the meantime, please allow this". Well, failover
daemons are delicate pieces of software: they have to be stable,
well-tested, bug-free, and give timely responses. TIPC is still
experimental, and I guess its dead-node detection is only layer 3/4,
based on heartbeats. Dead-node detection is a tricky issue: the fewer
layers you can perform checks at, the greater the chance of making
wrong decisions that may lead to inconsistent situations and tons of
problems. VRRP is the current standard, and this is one of its
limitations, and so on.
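
That said, to make the suggestion concrete: the "few messages" would be
a subscription to the TIPC topology service, roughly like this (an
untested sketch; the service type 4711 that every cluster node would
bind to is made up, and byte-order handling of the subscription fields
is glossed over):

  #include <stdio.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <linux/tipc.h>

  int main(void)
  {
          struct sockaddr_tipc topsrv;
          struct tipc_subscr subscr;
          struct tipc_event event;
          int sd = socket(AF_TIPC, SOCK_SEQPACKET, 0);

          /* connect to the node-local topology server */
          memset(&topsrv, 0, sizeof(topsrv));
          topsrv.family = AF_TIPC;
          topsrv.addrtype = TIPC_ADDR_NAME;
          topsrv.addr.name.name.type = TIPC_TOP_SRV;
          topsrv.addr.name.name.instance = TIPC_TOP_SRV;
          connect(sd, (struct sockaddr *)&topsrv, sizeof(topsrv));

          /* watch all instances of the (made-up) cluster service */
          memset(&subscr, 0, sizeof(subscr));
          subscr.seq.type = 4711;
          subscr.seq.lower = 0;
          subscr.seq.upper = ~0;
          subscr.timeout = TIPC_WAIT_FOREVER;
          subscr.filter = TIPC_SUB_SERVICE;
          send(sd, &subscr, sizeof(subscr), 0);

          /* TIPC reports when a node's service comes and goes, so
           * the daemon never has to do its own heartbeating */
          while (recv(sd, &event, sizeof(event), 0) == sizeof(event)) {
                  if (event.event == TIPC_WITHDRAWN)
                          printf("node %u went down\n", event.found_lower);
                  else if (event.event == TIPC_PUBLISHED)
                          printf("node %u came up\n", event.found_lower);
          }
          return 0;
  }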

Well, if you are not going to accept the /proc interface no matter what
arguments I make, I give up on this ;)

Anyway, this is probably a premature optimization (but is it worth it?).
Some numbers: in my testbed, I get ~1800 TCP connections per second less
with eight cluster rules than with one (no /proc interface):

24347 TCP connections per second with one rule.
22580 TCP connections per second with eight rules.

OK, I'll send you another patch without the /proc interface.

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers


Thread overview: 49+ messages
2009-02-14 19:29 [PATCH] netfilter: xtables: add cluster match Pablo Neira Ayuso
2009-02-14 20:28 ` Jan Engelhardt
2009-02-14 20:42   ` Pablo Neira Ayuso
2009-02-14 22:31     ` Jan Engelhardt
2009-02-14 22:32       ` Jan Engelhardt
2009-02-16 10:56 ` Patrick McHardy
2009-02-16 14:01   ` Pablo Neira Ayuso
2009-02-16 14:03     ` Patrick McHardy
2009-02-16 14:30       ` Pablo Neira Ayuso
2009-02-16 15:01         ` Patrick McHardy
2009-02-16 15:14         ` Pablo Neira Ayuso
2009-02-16 15:10           ` Patrick McHardy
2009-02-16 15:27             ` Pablo Neira Ayuso
2009-02-17 10:46             ` Pablo Neira Ayuso
2009-02-17 10:50               ` Patrick McHardy
2009-02-17 13:50                 ` Pablo Neira Ayuso
2009-02-17 19:45                   ` Vincent Bernat
2009-02-18 10:14                     ` Patrick McHardy
2009-02-18 10:13                   ` Patrick McHardy
2009-02-18 11:06                     ` Pablo Neira Ayuso [this message]
2009-02-18 11:14                       ` Patrick McHardy
2009-02-18 17:20                       ` Vincent Bernat
2009-02-18 17:25                         ` Patrick McHardy
2009-02-18 18:38                           ` Pablo Neira Ayuso
2009-02-16 17:17         ` Jan Engelhardt
2009-02-16 17:13     ` Jan Engelhardt
2009-02-16 17:16       ` Patrick McHardy
2009-02-16 17:22         ` Jan Engelhardt
2009-02-16  9:23 Pablo Neira Ayuso
2009-02-16  9:31 ` Pablo Neira Ayuso
2009-02-16 12:13   ` Jan Engelhardt
2009-02-16 12:17     ` Patrick McHardy
2009-02-16  9:32 Pablo Neira Ayuso
2009-02-19 23:14 Pablo Neira Ayuso
2009-02-20  9:24 ` Patrick McHardy
2009-02-20 13:15   ` Pablo Neira Ayuso
2009-02-20 13:48     ` Patrick McHardy
2009-02-20 16:52       ` Pablo Neira Ayuso
2009-02-20 20:50 Pablo Neira Ayuso
2009-02-20 20:56 ` Pablo Neira Ayuso
2009-02-23 10:13 Pablo Neira Ayuso
2009-02-24 13:46 ` Patrick McHardy
2009-02-24 14:05   ` Pablo Neira Ayuso
2009-02-24 14:06     ` Patrick McHardy
2009-02-24 23:13       ` Pablo Neira Ayuso
2009-02-25  5:52         ` Patrick McHardy
2009-02-25  9:42           ` Pablo Neira Ayuso
2009-02-25 10:20             ` Patrick McHardy
2009-03-16 16:11 ` Patrick McHardy
