From: "Samudrala, Sridhar" <sridhar.samudrala@intel.com> To: alexei.starovoitov@gmail.com Cc: bjorn.topel@gmail.com, bjorn.topel@intel.com, bpf@vger.kernel.org, intel-wired-lan@lists.osuosl.org, jakub.kicinski@netronome.com, maciej.fijalkowski@intel.com, magnus.karlsson@intel.com, netdev@vger.kernel.org, sridhar.samudrala@intel.com, toke@redhat.com, tom.herbert@intel.com Subject: Re: [Intel-wired-lan] FW: [PATCH bpf-next 2/4] xsk: allow AF_XDP sockets to receive packets directly from a queue Date: Thu, 24 Oct 2019 11:12:55 -0700 [thread overview] Message-ID: <68d6e154-8646-7904-f081-10ec32115496@intel.com> (raw) In-Reply-To: <CAADnVQKwnMChzeGaC66A99cHn5szB4hPZaGXq8JAhd8sjrdGeA@mail.gmail.com> > > OK. Here is another data point that shows the perf report with the same test but CPU mitigations > > turned OFF. Here bpf_prog overhead goes down from almost (10.18 + 4.51)% to (3.23 + 1.44%). > > > > 21.40% ksoftirqd/28 [i40e] [k] i40e_clean_rx_irq_zc > > 14.13% xdpsock [i40e] [k] i40e_clean_rx_irq_zc > > 8.33% ksoftirqd/28 [kernel.vmlinux] [k] xsk_rcv > > 6.09% ksoftirqd/28 [kernel.vmlinux] [k] xdp_do_redirect > > 5.19% xdpsock xdpsock [.] main > > 3.48% ksoftirqd/28 [kernel.vmlinux] [k] bpf_xdp_redirect_map > > 3.23% ksoftirqd/28 bpf_prog_3c8251c7e0fef8db [k] bpf_prog_3c8251c7e0fef8db > > > > So a major component of the bpf_prog overhead seems to be due to the CPU vulnerability mitigations. > I feel that it's an incorrect conclusion because JIT is not doing > any retpolines (because there are no indirect calls in bpf). > There should be no difference in bpf_prog runtime with or without mitigations. > Also you're running root, so no spectre mitigations either. > This 3% seems like a lot for a function that does few loads that should > hit d-cache and one direct call. > Please investigate why you're seeing this 10% cpu cost when mitigations are on. > perf report/annotate is the best. > Also please double check that you're using the latest perf. > Since bpf performance analysis was greatly improved several versions ago. > I don't think old perf will be showing bogus numbers, but better to > run the latest. Here is perf annotate output for bpf_prog_ with and without mitigations turned ON Using the perf built from the bpf-next tree. 
perf version 5.3.g4071324a76c1

With mitigations ON
-------------------
Samples: 6K of event 'cycles', 4000 Hz, Event count (approx.): 5646512726
bpf_prog_3c8251c7e0fef8db  bpf_prog_3c8251c7e0fef8db [Percent: local period]
 45.05      push   %rbp
  0.02      mov    %rsp,%rbp
  0.03      sub    $0x8,%rsp
 22.09      push   %rbx
  7.66      push   %r13
  1.08      push   %r14
  1.85      push   %r15
  0.63      pushq  $0x0
  1.13      mov    0x28(%rdi),%rsi
  0.47      mov    0x8(%rsi),%esi
  3.47      mov    %esi,-0x4(%rbp)
  0.02      movabs $0xffff8ab414a83e00,%rdi
  0.90      mov    $0x2,%edx
  2.85      callq  *ffffffffd149fc5f
  1.55      and    $0x6,%rax
            test   %rax,%rax
  1.48      jne    72
            mov    %rbp,%rsi
            add    $0xfffffffffffffffc,%rsi
            movabs $0xffff8ab414a83e00,%rdi
            callq  *ffffffffd0e5fd5f
            mov    %rax,%rdi
            mov    $0x2,%eax
            test   %rdi,%rdi
            je     72
            mov    -0x4(%rbp),%esi
            movabs $0xffff8ab414a83e00,%rdi
            xor    %edx,%edx
            callq  *ffffffffd149fc5f
       72:  pop    %rbx
            pop    %r15
  1.90      pop    %r14
  1.93      pop    %r13
            pop    %rbx
  3.63      leaveq
  2.27      retq

With mitigations OFF
--------------------
Samples: 2K of event 'cycles', 4000 Hz, Event count (approx.): 1872116166
bpf_prog_3c8251c7e0fef8db  bpf_prog_3c8251c7e0fef8db [Percent: local period]
  0.15      push   %rbp
            mov    %rsp,%rbp
 13.79      sub    $0x8,%rsp
  0.30      push   %rbx
  0.15      push   %r13
  0.20      push   %r14
 14.50      push   %r15
  0.20      pushq  $0x0
            mov    0x28(%rdi),%rsi
  0.25      mov    0x8(%rsi),%esi
 14.37      mov    %esi,-0x4(%rbp)
  0.25      movabs $0xffff8ea2c673b800,%rdi
            mov    $0x2,%edx
 13.60      callq  *ffffffffe50c2f38
 14.33      and    $0x6,%rax
            test   %rax,%rax
            jne    72
            mov    %rbp,%rsi
            add    $0xfffffffffffffffc,%rsi
            movabs $0xffff8ea2c673b800,%rdi
            callq  *ffffffffe4a83038
            mov    %rax,%rdi
            mov    $0x2,%eax
            test   %rdi,%rdi
            je     72
            mov    -0x4(%rbp),%esi
            movabs $0xffff8ea2c673b800,%rdi
            xor    %edx,%edx
            callq  *ffffffffe50c2f38
       72:  pop    %rbx
            pop    %r15
 13.97      pop    %r14
  0.10      pop    %r13
            pop    %rbx
 13.71      leaveq
  0.15      retq

Do you see any issues with this data? With mitigations ON, the overhead
attributed to push %rbp and push %rbx seems to be pretty high.

Thanks
Sridhar