From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Brenden Blanco <bblanco@plumgrid.com>
Cc: davem@davemloft.net, netdev@vger.kernel.org, tom@herbertland.com,
	alexei.starovoitov@gmail.com, Or Gerlitz <ogerlitz@mellanox.com>,
	daniel@iogearbox.net, john.fastabend@gmail.com,
	brouer@redhat.com
Subject: Re: [RFC PATCH 5/5] Add sample for adding simple drop program to link
Date: Wed, 6 Apr 2016 22:01:00 +0200
Message-ID: <20160406220100.0df04925@redhat.com>
In-Reply-To: <20160406214848.7568235b@redhat.com>

On Wed, 6 Apr 2016 21:48:48 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> I'm testing with this program and these patches, after getting past the
> challenge of compiling the samples/bpf files ;-)
> 
> 
> On Fri,  1 Apr 2016 18:21:58 -0700 Brenden Blanco <bblanco@plumgrid.com> wrote:
> 
> > Add a sample program that only drops packets at the
> > BPF_PROG_TYPE_PHYS_DEV hook of a link. With the drop-only program,
> > observed single core rate is ~14.6Mpps.  
> 
> On my i7-4790K CPU @ 4.00GHz I'm seeing 9.7Mpps (single flow/cpu).
> (generator: pktgen_sample03_burst_single_flow.sh)
> 
>  # ./netdrvx1 $(</sys/class/net/mlx4p1/ifindex)
>  sh: /sys/kernel/debug/tracing/kprobe_events: No such file or directory
>  Success: Loaded file ./netdrvx1_kern.o
>  proto 17:    9776320 drops/s
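
Side note: the kprobe_events warning above most likely just means
debugfs is not mounted; it does not affect the drop numbers.  If that
is the cause, the usual fix is:

 # mount -t debugfs none /sys/kernel/debug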
> 
> These numbers are quite impressive. For comparison: delivering the
> packets to a local socket that drops them gives 1.7Mpps, and dropping
> them with iptables in the "raw" table gives 3.7Mpps.
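
For reference, the iptables comparison was just a plain drop rule in
the raw table, something along these lines (the exact match isn't
shown here, so the UDP port below is only a placeholder):

 # iptables -t raw -I PREROUTING -p udp --dport 9 -j DROP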
> 
> If I generate multiple flows, via ./pktgen_sample05_flow_per_thread.sh,
> then I hit a strange 14.5Mpps limit (proto 17:   14505558 drops/s).
> The 4 RX CPUs no longer run at 100% in softirq; they have some cycles
> attributed to %idle. (I verified the generator is sending at 24Mpps.)
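
The softirq vs. idle split per RX CPU can be watched with e.g. mpstat
while the test runs:

 # mpstat -P ALL 1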
> 
> 
> > Other tests were run, for instance without the dropcnt increment or
> > without reading from the packet header, the packet rate was mostly
> > unchanged.  
> 
> If I change the program to not touch the packet data (don't call
> load_byte()), then performance increases to 14.6Mpps (single
> flow/cpu).  The RX CPU is then mostly idle, with mlx4_en_process_rx_cq()
> and the page alloc/free functions taking the remaining time.
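
For readers who don't have the patch handy, the test program is
roughly of this shape.  This is only a sketch, not the actual
samples/bpf/netdrvx1_kern.c from the RFC; the context struct, the
verdict constant and the map layout below are my assumptions:

/* Rough sketch of the drop program: read the IP protocol byte with
 * load_byte() -- the LD_ABS access that shows up as
 * sk_load_byte_positive_offset in the profiles -- count per protocol
 * in a per-cpu array map, and drop everything.
 */
#include <uapi/linux/bpf.h>
#include <uapi/linux/if_ether.h>
#include <uapi/linux/ip.h>
#include "bpf_helpers.h"

#define PHYS_DEV_DROP 0	/* placeholder name/value for the drop verdict */

struct bpf_map_def SEC("maps") dropcnt = {
	.type        = BPF_MAP_TYPE_PERCPU_ARRAY,
	.key_size    = sizeof(__u32),
	.value_size  = sizeof(long),
	.max_entries = 256,
};

SEC("prog")
int drop_prog(struct __sk_buff *ctx)	/* context type is an assumption */
{
	__u32 proto = load_byte(ctx, ETH_HLEN + offsetof(struct iphdr, protocol));
	long *cnt = bpf_map_lookup_elem(&dropcnt, &proto);

	if (cnt)
		*cnt += 1;	/* the "dropcnt increment" mentioned above */

	return PHYS_DEV_DROP;	/* drop everything */
}

char _license[] SEC("license") = "GPL";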
> 
> > $ perf record -a samples/bpf/netdrvx1 $(</sys/class/net/eth0/ifindex)
> > proto 17:   14597724 drops/s
> > 
> > ./pktgen_sample03_burst_single_flow.sh -i $DEV -d $IP -m $MAC -t 4
> > Running... ctrl^C to stop
> > Device: eth4@0
> > Result: OK: 6486875(c6485849+d1026) usec, 23689465 (60byte,0frags)
> >   3651906pps 1752Mb/sec (1752914880bps) errors: 0
> > Device: eth4@1
> > Result: OK: 6486874(c6485656+d1217) usec, 23689489 (60byte,0frags)
> >   3651911pps 1752Mb/sec (1752917280bps) errors: 0
> > Device: eth4@2
> > Result: OK: 6486851(c6485730+d1120) usec, 23687853 (60byte,0frags)
> >   3651672pps 1752Mb/sec (1752802560bps) errors: 0
> > Device: eth4@3
> > Result: OK: 6486879(c6485807+d1071) usec, 23688954 (60byte,0frags)
> >   3651825pps 1752Mb/sec (1752876000bps) errors: 0
> > 
> > perf report --no-children:
> >   18.36%  ksoftirqd/1    [mlx4_en]         [k] mlx4_en_process_rx_cq
> >   15.98%  swapper        [kernel.vmlinux]  [k] poll_idle
> >   12.71%  ksoftirqd/1    [mlx4_en]         [k] mlx4_en_alloc_frags
> >    6.87%  ksoftirqd/1    [mlx4_en]         [k] mlx4_en_free_frag
> >    4.20%  ksoftirqd/1    [kernel.vmlinux]  [k] get_page_from_freelist
> >    4.09%  swapper        [mlx4_en]         [k] mlx4_en_process_rx_cq
> >    3.32%  ksoftirqd/1    [kernel.vmlinux]  [k] sk_load_byte_positive_offset
> >    2.39%  ksoftirqd/1    [mdio]            [k] 0x00000000000074cd
> >    2.23%  swapper        [mlx4_en]         [k] mlx4_en_alloc_frags
> >    2.20%  ksoftirqd/1    [kernel.vmlinux]  [k] free_pages_prepare
> >    2.08%  ksoftirqd/1    [mlx4_en]         [k] mlx4_call_bpf
> >    1.57%  ksoftirqd/1    [kernel.vmlinux]  [k] percpu_array_map_lookup_elem
> >    1.35%  ksoftirqd/1    [mdio]            [k] 0x00000000000074fa
> >    1.09%  ksoftirqd/1    [kernel.vmlinux]  [k] free_one_page
> >    1.02%  ksoftirqd/1    [kernel.vmlinux]  [k] bpf_map_lookup_elem
> >    0.90%  ksoftirqd/1    [kernel.vmlinux]  [k] __alloc_pages_nodemask
> >    0.88%  swapper        [kernel.vmlinux]  [k] intel_idle
> >    0.82%  ksoftirqd/1    [mdio]            [k] 0x00000000000074be
> >    0.80%  swapper        [mlx4_en]         [k] mlx4_en_free_frag  
> 
> My picture (single flow/cpu) looks a little bit different:
> 
>  +   64.33%  ksoftirqd/7    [kernel.vmlinux]  [k] __bpf_prog_run
>  +    9.60%  ksoftirqd/7    [mlx4_en]         [k] mlx4_en_alloc_frags
>  +    7.71%  ksoftirqd/7    [mlx4_en]         [k] mlx4_en_process_rx_cq
>  +    5.47%  ksoftirqd/7    [mlx4_en]         [k] mlx4_en_free_frag
>  +    1.68%  ksoftirqd/7    [kernel.vmlinux]  [k] get_page_from_freelist
>  +    1.52%  ksoftirqd/7    [mlx4_en]         [k] mlx4_call_bpf
>  +    1.02%  ksoftirqd/7    [kernel.vmlinux]  [k] free_pages_prepare
>  +    0.72%  ksoftirqd/7    [mlx4_en]         [k] mlx4_alloc_pages.isra.20
>  +    0.70%  ksoftirqd/7    [kernel.vmlinux]  [k] __rcu_read_unlock
>  +    0.65%  ksoftirqd/7    [kernel.vmlinux]  [k] percpu_array_map_lookup_elem
> 
> My i7-4790K CPU does not have DDIO, so I assume the high cost in
> __bpf_prog_run comes from a cache-miss on the packet data.

Before someone else points out the obvious... I forgot to enable the JIT.
Enable it:

 # echo 1 > /proc/sys/net/core/bpf_jit_enable

Performance increased to: 10.8Mpps (proto 17:   10819446 drops/s)

 Samples: 51K of event 'cycles', Event count (approx.): 56775706510
   Overhead  Command      Shared Object     Symbol
 +   55.90%  ksoftirqd/7  [kernel.vmlinux]  [k] sk_load_byte_positive_offset
 +   10.71%  ksoftirqd/7  [mlx4_en]         [k] mlx4_en_alloc_frags
 +    8.26%  ksoftirqd/7  [mlx4_en]         [k] mlx4_en_process_rx_cq
 +    5.94%  ksoftirqd/7  [mlx4_en]         [k] mlx4_en_free_frag
 +    2.04%  ksoftirqd/7  [kernel.vmlinux]  [k] get_page_from_freelist
 +    2.03%  ksoftirqd/7  [kernel.vmlinux]  [k] percpu_array_map_lookup_elem
 +    1.42%  ksoftirqd/7  [mlx4_en]         [k] mlx4_call_bpf
 +    1.04%  ksoftirqd/7  [kernel.vmlinux]  [k] free_pages_prepare
 +    1.03%  ksoftirqd/7  [kernel.vmlinux]  [k] __rcu_read_unlock
 +    0.97%  ksoftirqd/7  [mlx4_en]         [k] mlx4_alloc_pages.isra.20
 +    0.95%  ksoftirqd/7  [devlink]         [k] 0x0000000000005f87
 +    0.58%  ksoftirqd/7  [devlink]         [k] 0x0000000000005f8f
 +    0.49%  ksoftirqd/7  [kernel.vmlinux]  [k] __free_pages_ok
 +    0.47%  ksoftirqd/7  [kernel.vmlinux]  [k] __rcu_read_lock
 +    0.46%  ksoftirqd/7  [kernel.vmlinux]  [k] free_one_page
 +    0.38%  ksoftirqd/7  [kernel.vmlinux]  [k] net_rx_action
 +    0.36%  ksoftirqd/7  [kernel.vmlinux]  [k] bpf_map_lookup_elem
 +    0.36%  ksoftirqd/7  [kernel.vmlinux]  [k] __mod_zone_page_state
 +    0.34%  ksoftirqd/7  [kernel.vmlinux]  [k] __alloc_pages_nodemask
 +    0.32%  ksoftirqd/7  [kernel.vmlinux]  [k] _raw_spin_lock
 +    0.31%  ksoftirqd/7  [devlink]         [k] 0x0000000000005f0a
 +    0.29%  ksoftirqd/7  [kernel.vmlinux]  [k] next_zones_zonelist

The time in sk_load_byte_positive_offset() is very likely dominated by
a cache-miss on the packet data.
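
One way to confirm that (not done here) would be to count cache misses
on the RX CPU while the test runs, e.g.:

 # perf stat -C 7 -e cache-misses,cache-references,cycles,instructions sleep 10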

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
