From: Daniel Borkmann <daniel@iogearbox.net>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
Patrick McHardy <kaber@trash.net>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
netfilter-devel@vger.kernel.org, davem@davemloft.net,
netdev@vger.kernel.org
Subject: Re: [PATCH 6/6] net: move qdisc ingress filtering on top of netfilter ingress hooks
Date: Fri, 01 May 2015 01:01:14 +0200 [thread overview]
Message-ID: <5542B43A.6020201@iogearbox.net> (raw)
In-Reply-To: <55427F81.4080807@iogearbox.net>
On 04/30/2015 09:16 PM, Daniel Borkmann wrote:
> On 04/30/2015 06:36 PM, Pablo Neira Ayuso wrote:
> ...
>> But where are the barriers? These unfounded performance claims are
>> simply absurd, qdisc ingress barely performs a bit better just because
>> it executes a bit less code and only in the single CPU scenario with
>> no rules at all.
>
> I think we're going in circles a bit. :( You are right in saying that
> currently, there's a central spinlock, which is worked on to get rid
> of, you've seen the patch on the list floating around already. Single
> CPU, artificial micro-benchmark, which were done show that you see on
> your machine ~613Kpps to ~545Kpps, others have seen it more amplified
> as 22.4Mpps to 18.0Mpps drop from __netif_receive_skb_core() up to an
> empty dummy u32_classify() rule, which has already been acknowledged
> that this gap needs to be improved. Lets call it unfounded then. I
> think we wouldn't even have this discussion if we wouldn't try brute
> forcing both worlds behind this single static key, or, have both
> invoked from within the same layer/list.
Ok, out of curiosity, I did the same as both of you: I'm using a pretty
standard Supermicro X10SLM-F/X10SLM-F, Xeon E3-1240 v3.
*** ingress + dummy u32, net-next:
w/o perf:
...
Result: OK: 5157948(c5157388+d559) usec, 100000000 (60byte,0frags)
19387551pps 9306Mb/sec (9306024480bps) errors: 100000000
perf record -C0 -ecycles:k ./pktgen.sh p1p1
...
Result: OK: 5182638(c5182057+d580) usec, 100000000 (60byte,0frags)
19295191pps 9261Mb/sec (9261691680bps) errors: 100000000
26.07% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core
14.39% kpktgend_0 [kernel.kallsyms] [k] kfree_skb
13.69% kpktgend_0 [cls_u32] [k] u32_classify
11.75% kpktgend_0 [kernel.kallsyms] [k] _raw_spin_lock
5.34% kpktgend_0 [sch_ingress] [k] ingress_enqueue
5.21% kpktgend_0 [kernel.kallsyms] [k] tc_classify_compat
4.93% kpktgend_0 [kernel.kallsyms] [k] skb_defer_rx_timestamp
3.41% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal
3.21% kpktgend_0 [pktgen] [k] pktgen_thread_worker
3.16% kpktgend_0 [kernel.kallsyms] [k] tc_classify
3.08% kpktgend_0 [kernel.kallsyms] [k] ip_rcv
2.05% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb
1.60% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk
1.15% kpktgend_0 [kernel.kallsyms] [k] classify
0.45% kpktgend_0 [kernel.kallsyms] [k] __local_bh_enable_ip
*** nf hook infra + ingress + dummy u32, net-next:
w/o perf:
...
Result: OK: 6555903(c6555744+d159) usec, 100000000 (60byte,0frags)
15253426pps 7321Mb/sec (7321644480bps) errors: 100000000
perf record -C0 -ecycles:k ./pktgen.sh p1p1
...
Result: OK: 6591291(c6591153+d138) usec, 100000000 (60byte,0frags)
15171532pps 7282Mb/sec (7282335360bps) errors: 100000000
25.94% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core
12.19% kpktgend_0 [kernel.kallsyms] [k] kfree_skb
11.00% kpktgend_0 [kernel.kallsyms] [k] _raw_spin_lock
10.58% kpktgend_0 [cls_u32] [k] u32_classify
5.34% kpktgend_0 [sch_ingress] [k] handle_ing
4.68% kpktgend_0 [kernel.kallsyms] [k] nf_iterate
4.33% kpktgend_0 [kernel.kallsyms] [k] tc_classify_compat
4.32% kpktgend_0 [sch_ingress] [k] ingress_enqueue
3.62% kpktgend_0 [kernel.kallsyms] [k] skb_defer_rx_timestamp
2.95% kpktgend_0 [kernel.kallsyms] [k] nf_hook_slow
2.75% kpktgend_0 [kernel.kallsyms] [k] ip_rcv
2.60% kpktgend_0 [kernel.kallsyms] [k] tc_classify
2.52% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal
2.50% kpktgend_0 [pktgen] [k] pktgen_thread_worker
1.77% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb
1.28% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk
0.94% kpktgend_0 [kernel.kallsyms] [k] classify
0.38% kpktgend_0 [kernel.kallsyms] [k] __local_bh_enable_ip
*** drop ingress spinlock (patch w/ bstats addition) + ingress +
dummy u32, net-next:
w/o perf:
...
Result: OK: 4789828(c4789353+d474) usec, 100000000 (60byte,0frags)
20877576pps 10021Mb/sec (10021236480bps) errors: 100000000
perf record -C0 -ecycles:k ./pktgen.sh p1p1
...
Result: OK: 4829276(c4828437+d839) usec, 100000000 (60byte,0frags)
20707036pps 9939Mb/sec (9939377280bps) errors: 100000000
33.11% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core
15.27% kpktgend_0 [kernel.kallsyms] [k] kfree_skb
14.60% kpktgend_0 [cls_u32] [k] u32_classify
6.06% kpktgend_0 [sch_ingress] [k] ingress_enqueue
5.55% kpktgend_0 [kernel.kallsyms] [k] tc_classify_compat
5.31% kpktgend_0 [kernel.kallsyms] [k] skb_defer_rx_timestamp
3.77% kpktgend_0 [pktgen] [k] pktgen_thread_worker
3.45% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal
3.33% kpktgend_0 [kernel.kallsyms] [k] tc_classify
3.33% kpktgend_0 [kernel.kallsyms] [k] ip_rcv
2.34% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb
1.78% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk
1.15% kpktgend_0 [kernel.kallsyms] [k] classify
0.48% kpktgend_0 [kernel.kallsyms] [k] __local_bh_enable_ip
That means, here, moving ingress behind nf hooks, I see a similar
slowdown in this micro-benchmark as Alexei of really worst case
of ~27%.
Now in real world that might probably just end up as a few percent
depending on the use case, but really, why should we go down that
path if we can just avoid that?
If you find a way, where both tc/nf hooks are triggered from within
the same list, then that would probably look better already. Or,
as a start, as mentioned, with a second static key for netfilter,
which can later on then still be reworked for a better integration,
although I agree with you that it's less clean and I see the point
of consolidating code.
If you want, I'm happy to provide numbers if you have a next set as
well, feel free to ping me.
Thanks,
Daniel
next prev parent reply other threads:[~2015-04-30 23:01 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-29 18:53 [PATCH 0/6 RFC] Netfilter ingress support (v2) Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 1/6] netfilter: cleanup struct nf_hook_ops indentation Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 2/6] netfilter: add hook list to nf_hook_state Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 3/6] netfilter: add nf_hook_list_active() Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 4/6] netfilter: move generic hook infrastructure into net/core/hooks.c Pablo Neira Ayuso
2015-04-29 23:59 ` Patrick McHardy
2015-04-29 18:53 ` [PATCH 5/6] net: add netfilter ingress hook Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 6/6] net: move qdisc ingress filtering on top of netfilter ingress hooks Pablo Neira Ayuso
2015-04-29 20:27 ` Daniel Borkmann
2015-04-29 23:32 ` Pablo Neira Ayuso
2015-04-30 0:10 ` Daniel Borkmann
2015-04-30 0:20 ` Daniel Borkmann
2015-04-30 0:30 ` Patrick McHardy
2015-04-30 0:41 ` Daniel Borkmann
2015-04-30 0:48 ` Patrick McHardy
2015-04-30 1:16 ` Alexei Starovoitov
2015-04-30 1:34 ` Patrick McHardy
2015-04-30 2:22 ` Jamal Hadi Salim
2015-04-30 3:11 ` Patrick McHardy
2015-04-30 11:55 ` Jamal Hadi Salim
2015-04-30 15:33 ` Pablo Neira Ayuso
2015-04-30 16:09 ` Daniel Borkmann
2015-04-30 16:36 ` Pablo Neira Ayuso
2015-04-30 19:16 ` Daniel Borkmann
2015-04-30 23:01 ` Daniel Borkmann [this message]
2015-05-01 1:15 ` Jamal Hadi Salim
2015-04-30 10:12 ` Pablo Neira Ayuso
2015-04-30 19:05 ` Alexei Starovoitov
2015-04-30 0:37 ` Patrick McHardy
2015-04-30 1:04 ` Daniel Borkmann
2015-04-30 1:43 ` Patrick McHardy
2015-04-30 2:35 ` Jamal Hadi Salim
2015-04-30 3:29 ` Patrick McHardy
2015-04-30 4:05 ` Patrick McHardy
2015-04-30 6:02 ` Alexei Starovoitov
2015-04-30 9:24 ` Daniel Borkmann
2015-04-30 10:28 ` Pablo Neira Ayuso
2015-04-29 23:36 ` Patrick McHardy
2015-04-30 0:00 ` Daniel Borkmann
2015-04-30 0:15 ` Patrick McHardy
2015-04-29 21:53 ` Cong Wang
2015-04-29 23:37 ` Patrick McHardy
2015-04-29 23:42 ` Pablo Neira Ayuso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5542B43A.6020201@iogearbox.net \
--to=daniel@iogearbox.net \
--cc=alexei.starovoitov@gmail.com \
--cc=davem@davemloft.net \
--cc=jhs@mojatatu.com \
--cc=kaber@trash.net \
--cc=netdev@vger.kernel.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).