All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next] xsk, net: make sock_def_readable() have external linkage
@ 2020-01-20  9:29 Björn Töpel
  2020-01-21  0:47 ` John Fastabend
  0 siblings, 1 reply; 4+ messages in thread
From: Björn Töpel @ 2020-01-20  9:29 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, magnus.karlsson, jonathan.lemon

From: Björn Töpel <bjorn.topel@intel.com>

XDP sockets use the default implementation of struct sock's
sk_data_ready callback, which is sock_def_readable(). This function is
called in the XDP socket fast-path, and involves a retpoline. By
letting sock_def_readable() have external linkage, and being called
directly, the retpoline can be avoided.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/net/sock.h | 2 ++
 net/core/sock.c    | 2 +-
 net/xdp/xsk.c      | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 8dff68b4c316..0891c55f1e82 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2612,4 +2612,6 @@ static inline bool sk_dev_equal_l3scope(struct sock *sk, int dif)
 	return false;
 }
 
+void sock_def_readable(struct sock *sk);
+
 #endif	/* _SOCK_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index 8459ad579f73..a4c8fac781ff 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2786,7 +2786,7 @@ static void sock_def_error_report(struct sock *sk)
 	rcu_read_unlock();
 }
 
-static void sock_def_readable(struct sock *sk)
+void sock_def_readable(struct sock *sk)
 {
 	struct socket_wq *wq;
 
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 02ada7ab8c6e..df600487a68d 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -217,7 +217,7 @@ static int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 static void xsk_flush(struct xdp_sock *xs)
 {
 	xskq_prod_submit(xs->rx);
-	xs->sk.sk_data_ready(&xs->sk);
+	sock_def_readable(&xs->sk);
 }
 
 int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* RE: [PATCH bpf-next] xsk, net: make sock_def_readable() have external linkage
  2020-01-20  9:29 [PATCH bpf-next] xsk, net: make sock_def_readable() have external linkage Björn Töpel
@ 2020-01-21  0:47 ` John Fastabend
  2020-01-21 12:06   ` Björn Töpel
  0 siblings, 1 reply; 4+ messages in thread
From: John Fastabend @ 2020-01-21  0:47 UTC (permalink / raw)
  To: Björn Töpel, netdev, ast, daniel
  Cc: Björn Töpel, bpf, magnus.karlsson, jonathan.lemon

Björn Töpel wrote:
> From: Björn Töpel <bjorn.topel@intel.com>
> 
> XDP sockets use the default implementation of struct sock's
> sk_data_ready callback, which is sock_def_readable(). This function is
> called in the XDP socket fast-path, and involves a retpoline. By
> letting sock_def_readable() have external linkage, and being called
> directly, the retpoline can be avoided.
> 
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> ---
>  include/net/sock.h | 2 ++
>  net/core/sock.c    | 2 +-
>  net/xdp/xsk.c      | 2 +-
>  3 files changed, 4 insertions(+), 2 deletions(-)
> 

I think this is fine but curious were you able to measure the
difference with before/after pps or something?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH bpf-next] xsk, net: make sock_def_readable() have external linkage
  2020-01-21  0:47 ` John Fastabend
@ 2020-01-21 12:06   ` Björn Töpel
  2020-01-21 23:36     ` Daniel Borkmann
  0 siblings, 1 reply; 4+ messages in thread
From: Björn Töpel @ 2020-01-21 12:06 UTC (permalink / raw)
  To: John Fastabend
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, bpf, Karlsson, Magnus, Jonathan Lemon

On Tue, 21 Jan 2020 at 01:48, John Fastabend <john.fastabend@gmail.com> wrote:
>
> Björn Töpel wrote:
> > From: Björn Töpel <bjorn.topel@intel.com>
> >
> > XDP sockets use the default implementation of struct sock's
> > sk_data_ready callback, which is sock_def_readable(). This function is
> > called in the XDP socket fast-path, and involves a retpoline. By
> > letting sock_def_readable() have external linkage, and being called
> > directly, the retpoline can be avoided.
> >
> > Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> > ---
> >  include/net/sock.h | 2 ++
> >  net/core/sock.c    | 2 +-
> >  net/xdp/xsk.c      | 2 +-
> >  3 files changed, 4 insertions(+), 2 deletions(-)
> >
>
> I think this is fine but curious were you able to measure the
> difference with before/after pps or something?

Ugh, yeah, of course I've should have added that. Sorry for that! Here
goes; Benchmark is xdpsock rxdrop, NAPI running on core 20:

**Pre-patch: xdpsock rxdrop: 22.8 Mpps
 Performance counter stats for 'CPU(s) 20':

         10,000.58 msec cpu-clock                 #    1.000 CPUs
utilized
                12      context-switches          #    0.001 K/sec
                 1      cpu-migrations            #    0.000 K/sec
                 0      page-faults               #    0.000 K/sec
    29,931,407,416      cycles                    #    2.993 GHz
    82,538,852,331      instructions              #    2.76  insn per
cycle
    15,894,169,979      branches                  # 1589.324 M/sec
        30,916,486      branch-misses             #    0.19% of all
branches

      10.000636027 seconds time elapsed

**Post-patch: xdpsock rxdrop: 23.2 Mpps
         10,000.90 msec cpu-clock                 #    1.000 CPUs
utilized
                12      context-switches          #    0.001 K/sec
                 1      cpu-migrations            #    0.000 K/sec
                 0      page-faults               #    0.000 K/sec
    29,932,353,067      cycles                    #    2.993 GHz
    84,299,636,827      instructions              #    2.82  insn per
cycle
    16,228,795,437      branches                  # 1622.733 M/sec
        28,113,847      branch-misses             #    0.17% of all
branches

      10.000596454 seconds time elapsed

This could fall into the category of noise. :-) PPS and IPC is up a
bit. OTOH, maybe UDP can benefit from this as well?


Björn

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH bpf-next] xsk, net: make sock_def_readable() have external linkage
  2020-01-21 12:06   ` Björn Töpel
@ 2020-01-21 23:36     ` Daniel Borkmann
  0 siblings, 0 replies; 4+ messages in thread
From: Daniel Borkmann @ 2020-01-21 23:36 UTC (permalink / raw)
  To: Björn Töpel, John Fastabend
  Cc: Netdev, Alexei Starovoitov, Björn Töpel, bpf, Karlsson,
	Magnus, Jonathan Lemon

On 1/21/20 1:06 PM, Björn Töpel wrote:
> On Tue, 21 Jan 2020 at 01:48, John Fastabend <john.fastabend@gmail.com> wrote:
>> Björn Töpel wrote:
>>> From: Björn Töpel <bjorn.topel@intel.com>
>>>
>>> XDP sockets use the default implementation of struct sock's
>>> sk_data_ready callback, which is sock_def_readable(). This function is
>>> called in the XDP socket fast-path, and involves a retpoline. By
>>> letting sock_def_readable() have external linkage, and being called
>>> directly, the retpoline can be avoided.
>>>
>>> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
>>> ---
>>>   include/net/sock.h | 2 ++
>>>   net/core/sock.c    | 2 +-
>>>   net/xdp/xsk.c      | 2 +-
>>>   3 files changed, 4 insertions(+), 2 deletions(-)
>>>
>>
>> I think this is fine but curious were you able to measure the
>> difference with before/after pps or something?
> 
> Ugh, yeah, of course I've should have added that. Sorry for that! Here
> goes; Benchmark is xdpsock rxdrop, NAPI running on core 20:
> 
> **Pre-patch: xdpsock rxdrop: 22.8 Mpps
>   Performance counter stats for 'CPU(s) 20':
> 
>           10,000.58 msec cpu-clock                 #    1.000 CPUs
> utilized
>                  12      context-switches          #    0.001 K/sec
>                   1      cpu-migrations            #    0.000 K/sec
>                   0      page-faults               #    0.000 K/sec
>      29,931,407,416      cycles                    #    2.993 GHz
>      82,538,852,331      instructions              #    2.76  insn per
> cycle
>      15,894,169,979      branches                  # 1589.324 M/sec
>          30,916,486      branch-misses             #    0.19% of all
> branches
> 
>        10.000636027 seconds time elapsed
> 
> **Post-patch: xdpsock rxdrop: 23.2 Mpps
>           10,000.90 msec cpu-clock                 #    1.000 CPUs
> utilized
>                  12      context-switches          #    0.001 K/sec
>                   1      cpu-migrations            #    0.000 K/sec
>                   0      page-faults               #    0.000 K/sec
>      29,932,353,067      cycles                    #    2.993 GHz
>      84,299,636,827      instructions              #    2.82  insn per
> cycle
>      16,228,795,437      branches                  # 1622.733 M/sec
>          28,113,847      branch-misses             #    0.17% of all
> branches
> 
>        10.000596454 seconds time elapsed
> 
> This could fall into the category of noise. :-) PPS and IPC is up a
> bit. OTOH, maybe UDP can benefit from this as well?

Yeah, might be within noise range though getting rid of retpolines in XDP
[socket] fast-path is fine and patch is tiny, so applied, thanks!

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-01-21 23:36 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-20  9:29 [PATCH bpf-next] xsk, net: make sock_def_readable() have external linkage Björn Töpel
2020-01-21  0:47 ` John Fastabend
2020-01-21 12:06   ` Björn Töpel
2020-01-21 23:36     ` Daniel Borkmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.