From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>
Cc: "Björn Töpel" <bjorn.topel@gmail.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Jakub Kicinski" <jakub.kicinski@netronome.com>,
	Netdev <netdev@vger.kernel.org>,
	intel-wired-lan <intel-wired-lan@lists.osuosl.org>,
	"Herbert, Tom" <tom.herbert@intel.com>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>
Subject: Re: [Intel-wired-lan] FW: [PATCH bpf-next 2/4] xsk: allow AF_XDP sockets to receive packets directly from a queue
Date: Mon, 21 Oct 2019 15:34:18 -0700	[thread overview]
Message-ID: <CAADnVQ+jiEO+jnFR-G=xG=zz7UOSBieZbc1NN=sSnAwvPaJjUQ@mail.gmail.com> (raw)
In-Reply-To: <7642a460-9ba3-d9f7-6cf8-aac45c7eef0d@intel.com>

On Mon, Oct 21, 2019 at 1:10 PM Samudrala, Sridhar
<sridhar.samudrala@intel.com> wrote:
>
> On 10/20/2019 10:12 AM, Björn Töpel wrote:
> > On Sun, 20 Oct 2019 at 12:15, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
> >>
> >>> On Fri, Oct 18, 2019 at 05:45:26PM -0700, Samudrala, Sridhar wrote:
> >>>> On 10/18/2019 5:14 PM, Alexei Starovoitov wrote:
> >>>>> On Fri, Oct 18, 2019 at 11:40:07AM -0700, Samudrala, Sridhar wrote:
> >>>>>>
> >>>>>> Perf report for "AF_XDP default rxdrop" with patched kernel - mitigations ON
> >>>>>> ==========================================================================
> >>>>>> Samples: 44K of event 'cycles', Event count (approx.): 38532389541
> >>>>>> Overhead  Command          Shared Object              Symbol
> >>>>>>     15.31%  ksoftirqd/28     [i40e]                     [k] i40e_clean_rx_irq_zc
> >>>>>>     10.50%  ksoftirqd/28     bpf_prog_80b55d8a76303785  [k] bpf_prog_80b55d8a76303785
> >>>>>>      9.48%  xdpsock          [i40e]                     [k] i40e_clean_rx_irq_zc
> >>>>>>      8.62%  xdpsock          xdpsock                    [.] main
> >>>>>>      7.11%  ksoftirqd/28     [kernel.vmlinux]           [k] xsk_rcv
> >>>>>>      5.81%  ksoftirqd/28     [kernel.vmlinux]           [k] xdp_do_redirect
> >>>>>>      4.46%  xdpsock          bpf_prog_80b55d8a76303785  [k] bpf_prog_80b55d8a76303785
> >>>>>>      3.83%  xdpsock          [kernel.vmlinux]           [k] xsk_rcv
> >>>>>
> >>>>> Why is everything duplicated?
> >>>>> Same code runs in different tasks?
> >>>>
> >>>> Yes. Looks like these functions run from both the app (xdpsock) context and the ksoftirqd context.
> >>>>
> >>>>>
> >>>>>>      2.81%  ksoftirqd/28     [kernel.vmlinux]           [k] bpf_xdp_redirect_map
> >>>>>>      2.78%  ksoftirqd/28     [kernel.vmlinux]           [k] xsk_map_lookup_elem
> >>>>>>      2.44%  xdpsock          [kernel.vmlinux]           [k] xdp_do_redirect
> >>>>>>      2.19%  ksoftirqd/28     [kernel.vmlinux]           [k] __xsk_map_redirect
> >>>>>>      1.62%  ksoftirqd/28     [kernel.vmlinux]           [k] xsk_umem_peek_addr
> >>>>>>      1.57%  xdpsock          [kernel.vmlinux]           [k] xsk_umem_peek_addr
> >>>>>>      1.32%  ksoftirqd/28     [kernel.vmlinux]           [k] dma_direct_sync_single_for_cpu
> >>>>>>      1.28%  xdpsock          [kernel.vmlinux]           [k] bpf_xdp_redirect_map
> >>>>>>      1.15%  xdpsock          [kernel.vmlinux]           [k] dma_direct_sync_single_for_device
> >>>>>>      1.12%  xdpsock          [kernel.vmlinux]           [k] xsk_map_lookup_elem
> >>>>>>      1.06%  xdpsock          [kernel.vmlinux]           [k] __xsk_map_redirect
> >>>>>>      0.94%  ksoftirqd/28     [kernel.vmlinux]           [k] dma_direct_sync_single_for_device
> >>>>>>      0.75%  ksoftirqd/28     [kernel.vmlinux]           [k] __x86_indirect_thunk_rax
> >>>>>>      0.66%  ksoftirqd/28     [i40e]                     [k] i40e_clean_programming_status
> >>>>>>      0.64%  ksoftirqd/28     [kernel.vmlinux]           [k] net_rx_action
> >>>>>>      0.64%  swapper          [kernel.vmlinux]           [k] intel_idle
> >>>>>>      0.62%  ksoftirqd/28     [i40e]                     [k] i40e_napi_poll
> >>>>>>      0.57%  xdpsock          [kernel.vmlinux]           [k] dma_direct_sync_single_for_cpu
> >>>>>>
> >>>>>> Perf report for "AF_XDP direct rxdrop" with patched kernel - mitigations ON
> >>>>>> ==========================================================================
> >>>>>> Samples: 46K of event 'cycles', Event count (approx.): 38387018585
> >>>>>> Overhead  Command          Shared Object             Symbol
> >>>>>>     21.94%  ksoftirqd/28     [i40e]                    [k] i40e_clean_rx_irq_zc
> >>>>>>     14.36%  xdpsock          xdpsock                   [.] main
> >>>>>>     11.53%  ksoftirqd/28     [kernel.vmlinux]          [k] xsk_rcv
> >>>>>>     11.32%  xdpsock          [i40e]                    [k] i40e_clean_rx_irq_zc
> >>>>>>      4.02%  xdpsock          [kernel.vmlinux]          [k] xsk_rcv
> >>>>>>      2.91%  ksoftirqd/28     [kernel.vmlinux]          [k] xdp_do_redirect
> >>>>>>      2.45%  ksoftirqd/28     [kernel.vmlinux]          [k] xsk_umem_peek_addr
> >>>>>>      2.19%  xdpsock          [kernel.vmlinux]          [k] xsk_umem_peek_addr
> >>>>>>      2.08%  ksoftirqd/28     [kernel.vmlinux]          [k] bpf_direct_xsk
> >>>>>>      2.07%  ksoftirqd/28     [kernel.vmlinux]          [k] dma_direct_sync_single_for_cpu
> >>>>>>      1.53%  ksoftirqd/28     [kernel.vmlinux]          [k] dma_direct_sync_single_for_device
> >>>>>>      1.39%  xdpsock          [kernel.vmlinux]          [k] dma_direct_sync_single_for_device
> >>>>>>      1.22%  ksoftirqd/28     [kernel.vmlinux]          [k] xdp_get_xsk_from_qid
> >>>>>>      1.12%  ksoftirqd/28     [i40e]                    [k] i40e_clean_programming_status
> >>>>>>      0.96%  ksoftirqd/28     [i40e]                    [k] i40e_napi_poll
> >>>>>>      0.95%  ksoftirqd/28     [kernel.vmlinux]          [k] net_rx_action
> >>>>>>      0.89%  xdpsock          [kernel.vmlinux]          [k] xdp_do_redirect
> >>>>>>      0.83%  swapper          [i40e]                    [k] i40e_clean_rx_irq_zc
> >>>>>>      0.70%  swapper          [kernel.vmlinux]          [k] intel_idle
> >>>>>>      0.66%  xdpsock          [kernel.vmlinux]          [k] dma_direct_sync_single_for_cpu
> >>>>>>      0.60%  xdpsock          [kernel.vmlinux]          [k] bpf_direct_xsk
> >>>>>>      0.50%  ksoftirqd/28     [kernel.vmlinux]          [k] xsk_umem_discard_addr
> >>>>>>
> >>>>>> Based on the perf reports comparing AF_XDP default and direct rxdrop, we can say that the
> >>>>>> AF_XDP direct rxdrop codepath avoids the overhead of going through these functions:
> >>>>>>   bpf_prog_xxx
> >>>>>>   bpf_xdp_redirect_map
> >>>>>>   xsk_map_lookup_elem
> >>>>>>   __xsk_map_redirect
> >>>>>> With AF_XDP direct, xsk_rcv() is called directly via bpf_direct_xsk() in xdp_do_redirect().
> >>>>>
> >>>>> I don't think you're identifying the overhead correctly.
> >>>>> xsk_map_lookup_elem is 1%,
> >>>>> but bpf_xdp_redirect_map() is supposed to call __xsk_map_lookup_elem(),
> >>>>> which is a different function:
> >>>>> ffffffff81493fe0 T __xsk_map_lookup_elem
> >>>>> ffffffff81492e80 t xsk_map_lookup_elem
> >>>>>
> >>>>> 10% for bpf_prog_80b55d8a76303785 is huge.
> >>>>> It's the actual code of the program _without_ any helpers.
> >>>>> What does the program actually look like?
> >>>>
> >>>> It is the XDP program that is loaded via xsk_load_xdp_prog() in tools/lib/bpf/xsk.c:
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/lib/bpf/xsk.c#n268
> >>>
> >>> I see. Looks like map_gen_lookup was never implemented for xskmap.
> >>> How about adding it first, the way array_map_gen_lookup() is implemented?
> >>> This will easily give a 2x perf gain.
> >>
> >> I guess we should implement this for devmaps as well now that we allow
> >> lookups into those.
> >>
> >> However, in this particular example, the lookup from BPF is not actually
> >> needed, since bpf_redirect_map() will return a configurable error value
> >> when the map lookup fails (for exactly this use case).
> >>
> >> So replacing:
> >>
> >> if (bpf_map_lookup_elem(&xsks_map, &index))
> >>      return bpf_redirect_map(&xsks_map, index, 0);
> >>
> >> with simply
> >>
> >> return bpf_redirect_map(&xsks_map, index, XDP_PASS);
> >>
> >> would save the call to xsk_map_lookup_elem().
> >>
> >
> > Thanks for the reminder! I just submitted a patch. Still, doing the
> > map_gen_lookup() for xsk/devmaps makes sense!
> >
>
> I tried Björn's patch that avoids the lookups in the BPF prog.
> https://lore.kernel.org/netdev/20191021105938.11820-1-bjorn.topel@gmail.com/
>
> With this patch I am also seeing around a 3-4% increase in xdpsock rxdrop performance, and
> the perf report looks like this:
>
> Samples: 44K of event 'cycles', Event count (approx.): 38749965204
> Overhead  Command          Shared Object              Symbol
>    16.06%  ksoftirqd/28     [i40e]                     [k] i40e_clean_rx_irq_zc
>    10.18%  ksoftirqd/28     bpf_prog_3c8251c7e0fef8db  [k] bpf_prog_3c8251c7e0fef8db
>    10.15%  xdpsock          [i40e]                     [k] i40e_clean_rx_irq_zc
>    10.06%  ksoftirqd/28     [kernel.vmlinux]           [k] xsk_rcv
>     7.45%  xdpsock          xdpsock                    [.] main
>     5.76%  ksoftirqd/28     [kernel.vmlinux]           [k] xdp_do_redirect
>     4.51%  xdpsock          bpf_prog_3c8251c7e0fef8db  [k] bpf_prog_3c8251c7e0fef8db
>     3.67%  xdpsock          [kernel.vmlinux]           [k] xsk_rcv
>     3.06%  ksoftirqd/28     [kernel.vmlinux]           [k] bpf_xdp_redirect_map
>     2.34%  ksoftirqd/28     [kernel.vmlinux]           [k] __xsk_map_redirect
>     2.33%  xdpsock          [kernel.vmlinux]           [k] xdp_do_redirect
>     1.69%  ksoftirqd/28     [kernel.vmlinux]           [k] xsk_umem_peek_addr
>     1.69%  xdpsock          [kernel.vmlinux]           [k] xsk_umem_peek_addr
>     1.42%  ksoftirqd/28     [kernel.vmlinux]           [k] dma_direct_sync_single_for_cpu
>     1.19%  xdpsock          [kernel.vmlinux]           [k] bpf_xdp_redirect_map
>     1.13%  xdpsock          [kernel.vmlinux]           [k] dma_direct_sync_single_for_device
>     0.95%  ksoftirqd/28     [kernel.vmlinux]           [k] dma_direct_sync_single_for_device
>     0.92%  swapper          [kernel.vmlinux]           [k] intel_idle
>     0.92%  xdpsock          [kernel.vmlinux]           [k] __xsk_map_redirect
>     0.80%  ksoftirqd/28     [kernel.vmlinux]           [k] __x86_indirect_thunk_rax
>     0.73%  ksoftirqd/28     [i40e]                     [k] i40e_clean_programming_status
>     0.71%  ksoftirqd/28     [kernel.vmlinux]           [k] __xsk_map_lookup_elem
>     0.63%  ksoftirqd/28     [kernel.vmlinux]           [k] net_rx_action
>     0.62%  ksoftirqd/28     [i40e]                     [k] i40e_napi_poll
>     0.58%  xdpsock          [kernel.vmlinux]           [k] dma_direct_sync_single_for_cpu
>
> So with this patch applied, the direct receive performance improvement comes down from 46% to 42%.
> I think it is still substantial enough to provide an option to allow direct receive for
> certain use cases. If it is OK, I can re-spin and submit the patches on top of the latest bpf-next.

I think it's too early to consider such a drastic approach.
The run-time performance of an XDP program should be the same as that of C code.
Something is fishy in these numbers: spending 10% of the CPU on a few loads
and a single call to bpf_xdp_redirect_map() is just not right.
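
For reference, the program that xsk_load_xdp_prog() installs corresponds
roughly to the C below (the loader emits raw BPF instructions, so this is
only an approximation; the map definition is included here for illustration
and is an assumption, since xsk.c creates the map through syscalls):

    #include <linux/bpf.h>
    #include "bpf_helpers.h"

    /* Hypothetical map definition for illustration only. */
    struct bpf_map_def SEC("maps") xsks_map = {
            .type        = BPF_MAP_TYPE_XSKMAP,
            .key_size    = sizeof(int),
            .value_size  = sizeof(int),
            .max_entries = 64,
    };

    SEC("xdp_sock")
    int xdp_sock_prog(struct xdp_md *ctx)
    {
            int index = ctx->rx_queue_index;

            /* A set entry means an AF_XDP socket is bound to this queue. */
            if (bpf_map_lookup_elem(&xsks_map, &index))
                    return bpf_redirect_map(&xsks_map, index, 0);

            return XDP_PASS;
    }

The non-helper part of that program is just a few loads and stores, which
is why 10% for bpf_prog_80b55d8a76303785 looks too high.

As for the map_gen_lookup() suggestion above, here is a rough sketch of the
shape such a callback could take for the xskmap, modeled on
array_map_gen_lookup(). It is not an actual patch; the struct xsk_map
layout (a trailing struct xdp_sock *xsk_map[] array) and the exact
instruction sequence are assumptions:

    static u32 xsk_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
    {
            const int ret = BPF_REG_0, mp = BPF_REG_1, index = BPF_REG_2;
            struct bpf_insn *insn = insn_buf;

            /* ret = *(u32 *)index; bail out if it is >= max_entries */
            *insn++ = BPF_LDX_MEM(BPF_W, ret, index, 0);
            *insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 5);
            /* ret = &xsk_map[index]; assumes the xsk_map[] member exists */
            *insn++ = BPF_ALU64_IMM(BPF_LSH, ret, ilog2(sizeof(struct xdp_sock *)));
            *insn++ = BPF_ALU64_IMM(BPF_ADD, mp, offsetof(struct xsk_map, xsk_map));
            *insn++ = BPF_ALU64_REG(BPF_ADD, ret, mp);
            /* load the stored xdp_sock pointer and skip the NULL case */
            *insn++ = BPF_LDX_MEM(BPF_SIZEOF(struct xdp_sock *), ret, ret, 0);
            *insn++ = BPF_JMP_IMM(BPF_JA, 0, 0, 1);
            /* out-of-bounds index: return NULL */
            *insn++ = BPF_MOV64_IMM(ret, 0);
            return insn - insn_buf;
    }

With such a callback, the verifier inlines these instructions at the
program's bpf_map_lookup_elem() call site, so the helper call disappears
from the lookup path entirely.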

