From: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>
To: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	Netdev <netdev@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	intel-wired-lan@lists.osuosl.org,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"Herbert, Tom" <tom.herbert@intel.com>
Subject: Re: FW: [PATCH bpf-next 2/4] xsk: allow AF_XDP sockets to receive packets directly from a queue
Date: Wed, 9 Oct 2019 09:53:21 -0700	[thread overview]
Message-ID: <2bc26acd-170d-634e-c066-71557b2b3e4f@intel.com> (raw)
In-Reply-To: <3ED8E928C4210A4289A677D2FEB48235140134CE@fmsmsx111.amr.corp.intel.com>


>> +
>> +u32 bpf_direct_xsk(const struct bpf_prog *prog, struct xdp_buff *xdp)
>> +{
>> +       struct xdp_sock *xsk;
>> +
>> +       xsk = xdp_get_xsk_from_qid(xdp->rxq->dev, xdp->rxq->queue_index);
>> +       if (xsk) {
>> +               struct bpf_redirect_info *ri =
>> +                       this_cpu_ptr(&bpf_redirect_info);
>> +
>> +               ri->xsk = xsk;
>> +               return XDP_REDIRECT;
>> +       }
>> +
>> +       return XDP_PASS;
>> +}
>> +EXPORT_SYMBOL(bpf_direct_xsk);
> 
> So you're saying there is a:
> """
> xdpsock rxdrop 1 core (both app and queue's irq pinned to the same core)
>     default : taskset -c 1 ./xdpsock -i enp66s0f0 -r -q 1
>     direct-xsk : taskset -c 1 ./xdpsock -i enp66s0f0 -r -q 1
> 6.1x improvement in drop rate
> """
> 
> 6.1x gain running above C code vs exactly equivalent BPF code?
> How is that possible?

It seems to be due to the overhead of __bpf_prog_run on older processors 
(Ivy Bridge). The overhead is smaller on newer processors, but even on 
Skylake I see around a 1.5x improvement.
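
For context, the "equivalent BPF code" in this comparison is the
per-queue XSKMAP lookup + redirect program that the xdpsock sample
attaches. Below is a minimal sketch of that program (an assumption on
my part: it mirrors the sample's logic, and the map name and size are
illustrative, not taken from the patch). When it runs interpreted via
___bpf_prog_run instead of JITed, the per-packet helper calls are what
dominate the default profile below.

/* Build with clang -O2 -target bpf; helper/SEC macros come from the
 * libbpf helpers header (bpf_helpers.h in this era). */
#include <linux/bpf.h>
#include "bpf_helpers.h"

struct bpf_map_def SEC("maps") xsks_map = {
	.type = BPF_MAP_TYPE_XSKMAP,
	.key_size = sizeof(int),
	.value_size = sizeof(int),
	.max_entries = 64,
};

SEC("xdp_sock")
int xdp_sock_prog(struct xdp_md *ctx)
{
	int index = ctx->rx_queue_index;

	/* Redirect to the AF_XDP socket bound to this rx queue, if
	 * one is present in the map; otherwise pass to the stack. */
	if (bpf_map_lookup_elem(&xsks_map, &index))
		return bpf_redirect_map(&xsks_map, index, 0);

	return XDP_PASS;
}

The bpf_direct_xsk() path above short-circuits exactly this per-packet
lookup/redirect sequence, which is why the interpreter entries vanish
from the second profile.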

perf report with default xdpsock
================================
Samples: 2K of event 'cycles:ppp', Event count (approx.): 8437658090
Overhead  Command          Shared Object     Symbol
   34.57%  xdpsock          xdpsock           [.] main
   17.19%  ksoftirqd/1      [kernel.vmlinux]  [k] ___bpf_prog_run
   13.12%  xdpsock          [kernel.vmlinux]  [k] ___bpf_prog_run
    4.09%  ksoftirqd/1      [kernel.vmlinux]  [k] __x86_indirect_thunk_rax
    3.08%  xdpsock          [kernel.vmlinux]  [k] nmi
    2.76%  ksoftirqd/1      [kernel.vmlinux]  [k] xsk_map_lookup_elem
    2.33%  xdpsock          [kernel.vmlinux]  [k] __x86_indirect_thunk_rax
    2.33%  ksoftirqd/1      [i40e]            [k] i40e_clean_rx_irq_zc
    2.16%  xdpsock          [kernel.vmlinux]  [k] bpf_map_lookup_elem
    1.82%  ksoftirqd/1      [kernel.vmlinux]  [k] xdp_do_redirect
    1.41%  ksoftirqd/1      [kernel.vmlinux]  [k] xsk_rcv
    1.39%  ksoftirqd/1      [kernel.vmlinux]  [k] update_curr
    1.09%  ksoftirqd/1      [kernel.vmlinux]  [k] bpf_xdp_redirect_map
    1.09%  xdpsock          [i40e]            [k] i40e_clean_rx_irq_zc
    1.08%  ksoftirqd/1      [kernel.vmlinux]  [k] __xsk_map_redirect
    1.07%  swapper          [kernel.vmlinux]  [k] xsk_umem_peek_addr
    1.05%  ksoftirqd/1      [kernel.vmlinux]  [k] xsk_umem_peek_addr
    0.89%  swapper          [kernel.vmlinux]  [k] __xsk_map_redirect
    0.87%  ksoftirqd/1      [kernel.vmlinux]  [k] __bpf_prog_run32
    0.87%  swapper          [kernel.vmlinux]  [k] intel_idle
    0.67%  xdpsock          [kernel.vmlinux]  [k] bpf_xdp_redirect_map
    0.57%  xdpsock          [kernel.vmlinux]  [k] xdp_do_redirect

perf report with direct xdpsock
===============================
Samples: 2K of event 'cycles:ppp', Event count (approx.): 17996091975
Overhead  Command          Shared Object     Symbol
   18.44%  xdpsock          [i40e]            [k] i40e_clean_rx_irq_zc
   15.14%  ksoftirqd/1      [i40e]            [k] i40e_clean_rx_irq_zc
    6.87%  xdpsock          [kernel.vmlinux]  [k] xsk_umem_peek_addr
    5.03%  ksoftirqd/1      [kernel.vmlinux]  [k] xdp_do_redirect
    4.21%  xdpsock          xdpsock           [.] main
    4.13%  ksoftirqd/1      [i40e]            [k] i40e_clean_programming_status
    3.71%  xdpsock          [kernel.vmlinux]  [k] xsk_rcv
    3.44%  ksoftirqd/1      [kernel.vmlinux]  [k] nmi
    3.41%  xdpsock          [kernel.vmlinux]  [k] nmi
    3.20%  ksoftirqd/1      [kernel.vmlinux]  [k] xsk_rcv
    2.45%  xdpsock          [kernel.vmlinux]  [k] xdp_get_xsk_from_qid
    2.35%  ksoftirqd/1      [kernel.vmlinux]  [k] xsk_umem_peek_addr
    2.33%  ksoftirqd/1      [kernel.vmlinux]  [k] net_rx_action
    2.16%  ksoftirqd/1      [kernel.vmlinux]  [k] xsk_umem_consume_tx
    2.10%  swapper          [kernel.vmlinux]  [k] __softirqentry_text_start
    2.06%  xdpsock          [kernel.vmlinux]  [k] native_irq_return_iret
    1.43%  xdpsock          [kernel.vmlinux]  [k] check_preempt_wakeup
    1.42%  xdpsock          [kernel.vmlinux]  [k] xsk_umem_consume_tx
    1.22%  xdpsock          [kernel.vmlinux]  [k] xdp_do_redirect
    1.21%  xdpsock          [kernel.vmlinux]  [k] dma_direct_sync_single_for_device
    1.16%  ksoftirqd/1      [kernel.vmlinux]  [k] irqtime_account_irq
    1.09%  xdpsock          [kernel.vmlinux]  [k] sock_def_readable
    0.99%  swapper          [kernel.vmlinux]  [k] intel_idle
    0.88%  xdpsock          [i40e]            [k] i40e_clean_programming_status
    0.74%  ksoftirqd/1      [kernel.vmlinux]  [k] xsk_umem_discard_addr
    0.71%  ksoftirqd/1      [kernel.vmlinux]  [k] __switch_to
    0.50%  ksoftirqd/1      [kernel.vmlinux]  [k] dma_direct_sync_single_for_device
