All of lore.kernel.org
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org, nuclearcat@nuclearcat.com
Subject: Re: [PATCH nf] netfilter: conntrack: refine gc worker heuristics, redux
Date: Mon, 16 Jan 2017 18:33:49 +0100	[thread overview]
Message-ID: <20170116173349.GA12001@breakpoint.cc> (raw)
In-Reply-To: <1fa618d3-4c2a-72f1-d5f8-ddfadc6ad029@6wind.com>

Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
> Le 16/01/2017 à 11:56, Florian Westphal a écrit :
> > This further refines the changes made to conntrack gc_worker in
> > commit e0df8cae6c16 ("netfilter: conntrack: refine gc worker heuristics").
> > 
> > The main of this change is to reduce the scan interval when evictions
> > take place.
> > 
> > However, on the reporters' setup, there are 1-2 Million conntrack entries
> > in total, and roughly 8k new connections per second.
> > 
> > In this case we'll always evict at least one entry per gc cycle and scan
> > interval is always at 1 jiffy.
> > 
> > Given we scan ~10k entries per run its clearly wrong to reduce interval
> > based on number of evictions, it will only waste cpu cycles.
> > 
> > Thus only look at the ratio (scanned entries vs. evicted entries) to make
> > a decision on whether to reduce or not.
> > 
> > Given the fact that evictor is supposed to only kick in when system turns
> > idle after a busy period, we should pick a high ratio -- this makes it 50%.
> > 
> > Also, get rid of GC_MAX_EVICTS.  Instead of breaking loop and instant
> > resched, just don't bother checking this at all (the loop calls
> > cond_resched for every bucket anyway).
> > 
> > Removal of GC_MAX_EVICTS was suggested by Nicolas in the past; I
> > should have listened.
> I (still) agree with that part, maybe it could be done in another patch.

Done.

> > Fixes: b87a2f9199ea82eaadc ("netfilter: conntrack: add gc worker to remove timed-out entries")
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> With that patch, it's worse than before. I regularly have a delay > 30 seconds
> in my tests (this was infrequent before the patch).

:-/

We could lower the ratio of course, but under synflood with 3 second syn_recv timeout
I get a scan/evict ratio of ~15%, and we should not schedule worker all the time in that case.

> A solution, to fix the balance between cpu cycles consumption and short ct
> events delays may be to introduce a new sysctl which allows the user to choose
> between two predefined configs: short delay or long delay for ct events.
> Even if I don't like this, maybe it could be acceptable with a default value set
> to 'short delay'.

TBH in that case I'd prefer to just give up, and change worker to this:

        for (i = 0; i < nf_conntrack_htable_size; i++) {
         [..]
                hlist_nulls_for_each_entry_rcu(h, n, &ct_hash[i], hnnode) {
                        tmp = nf_ct_tuplehash_to_ctrack(h);

                        if (nf_ct_is_expired(tmp)) {
                                nf_ct_gc_expired(tmp);
                                continue;
                        }
                }
	[..]
             cond_resched_rcu_qs();
        }

        if (!gc_work->exiting)
                queue_delayed_work(system_long_wq, &gc_work->dwork, nf_conntrack_gc_interval);
}

Where nf_conntrack_gc_interval gets exported to userspace (in milliseconds),
with a default set e.g. to one minute.

Obvious downside: with low setting like 1 second we'll do frequent
rescan of entire table.


      reply	other threads:[~2017-01-16 17:34 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-16 10:56 [PATCH nf] netfilter: conntrack: refine gc worker heuristics, redux Florian Westphal
2017-01-16 16:42 ` Nicolas Dichtel
2017-01-16 17:33   ` Florian Westphal [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170116173349.GA12001@breakpoint.cc \
    --to=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=nicolas.dichtel@6wind.com \
    --cc=nuclearcat@nuclearcat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.