From: Eric Dumazet <eric.dumazet@gmail.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	James Morris <jmorris@namei.org>,
	Trond Myklebust <trond.myklebust@primarydata.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Eric Dumazet <edumazet@google.com>,
	Tom Herbert <tom@herbertland.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH net-next] udp: do fwd memory scheduling on dequeue
Date: Mon, 31 Oct 2016 08:16:46 -0700	[thread overview]
Message-ID: <1477927006.7065.304.camel@edumazet-glaptop3.roam.corp.google.com> (raw)
In-Reply-To: <1477926132.6655.10.camel@redhat.com>

On Mon, 2016-10-31 at 16:02 +0100, Paolo Abeni wrote:

> 
> No problem at all with incremental patches ;-)
> 
> In our experiment, touching udp_memory_allocated is only part of the
> source of contention, with the biggest source of contention being the
> sk_rmem_alloc update - which happens on every dequeue.
> 
> We experimented with doing fwd alloc of the whole sk_rcvbuf; even in
> that scenario we hit significant contention if the sk_rmem_alloc update
> was done under the lock, while full sk_rcvbuf forward allocation with
> the sk_rmem_alloc update outside the spinlock gave performance very
> similar to our posted patch.


> I think that the next step (after the double-lock-on-dequeue removal)
> should be moving the sk_rmem_alloc update outside the lock: the changes
> needed to do that on top of the double-lock-on-dequeue removal are very
> small (they would add ~10 lines of code).
> 
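
For reference, here is a minimal sketch of what "sk_rmem_alloc update
outside the lock" could look like on the dequeue path. This is only an
illustration, not the posted patch: the helper name is made up, the
destructor handling is an assumption, and forward-alloc accounting is
omitted for brevity.

#include <linux/skbuff.h>
#include <net/sock.h>

/* Illustrative sketch: dequeue under the receive-queue spinlock, but do
 * the atomic sk_rmem_alloc uncharge only after the lock is released, so
 * the locked section stays short.
 */
static struct sk_buff *udp_dequeue_uncharge_sketch(struct sock *sk)
{
        struct sk_buff_head *q = &sk->sk_receive_queue;
        struct sk_buff *skb;
        unsigned int amt = 0;

        spin_lock_bh(&q->lock);
        skb = __skb_dequeue(q);
        if (skb) {
                amt = skb->truesize;
                /* assume the rmem charge is released below instead of by
                 * a per-skb destructor such as sock_rfree()
                 */
                skb->destructor = NULL;
        }
        spin_unlock_bh(&q->lock);

        if (skb)
                atomic_sub(amt, &sk->sk_rmem_alloc); /* outside the lock */
        return skb;
}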

During my load tests, one of the major factors was sk_drops being
incremented like crazy, dirtying a critical cache line and hurting
sk_filter reads, for example.

        /* --- cacheline 6 boundary (384 bytes) --- */
        struct {
                atomic_t           rmem_alloc;           /* 0x180   0x4 */
                int                len;                  /* 0x184   0x4 */
                struct sk_buff *   head;                 /* 0x188   0x8 */
                struct sk_buff *   tail;                 /* 0x190   0x8 */
        } sk_backlog;                                    /* 0x180  0x18 */
        int                        sk_forward_alloc;     /* 0x198   0x4 */
        __u32                      sk_txhash;            /* 0x19c   0x4 */
        unsigned int               sk_napi_id;           /* 0x1a0   0x4 */
        unsigned int               sk_ll_usec;           /* 0x1a4   0x4 */
        atomic_t                   sk_drops;             /* 0x1a8   0x4 */
        int                        sk_rcvbuf;            /* 0x1ac   0x4 */
        struct sk_filter *         sk_filter;            /* 0x1b0   0x8 */
        union {
                struct socket_wq * sk_wq;                /*         0x8 */
                struct socket_wq * sk_wq_raw;            /*         0x8 */
        };                                               /* 0x1b8   0x8 */

I was playing with moving this field elsewhere and with not reading it
when not necessary:

diff --git a/include/net/sock.h b/include/net/sock.h
index f13ac87a8015cb18c5d3fe5fdcf2d6a0592428f4..a901df591eb45e153517cdb8b409b61563d1a4e3 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2112,7 +2112,8 @@ struct sock_skb_cb {
 static inline void
 sock_skb_set_dropcount(const struct sock *sk, struct sk_buff *skb)
 {
-	SOCK_SKB_CB(skb)->dropcount = atomic_read(&sk->sk_drops);
+	SOCK_SKB_CB(skb)->dropcount = sock_flag(sk, SOCK_RXQ_OVFL) ?
+					atomic_read(&sk->sk_drops) : 0;
 }
 
 static inline void sk_drops_add(struct sock *sk, const struct sk_buff *skb)
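
For context, and a reason why the conditional read above should be
invisible to applications: as far as I know the stored dropcount is only
reported back to userspace as a SO_RXQ_OVFL control message, and that
cmsg is only generated for sockets that enabled the option. A small,
hypothetical userspace example of consuming it (port and buffer sizes
are made up):

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

#ifndef SO_RXQ_OVFL
#define SO_RXQ_OVFL 40                  /* from asm-generic/socket.h */
#endif

int main(void)
{
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        int one = 1;
        struct sockaddr_in addr = {
                .sin_family = AF_INET,
                .sin_port = htons(9999),        /* arbitrary test port */
        };
        char data[2048], cbuf[CMSG_SPACE(sizeof(uint32_t))];
        struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
        struct msghdr msg = {
                .msg_iov = &iov, .msg_iovlen = 1,
                .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
        };
        struct cmsghdr *cmsg;

        /* Opt in: ask the kernel to report accumulated drops per datagram */
        setsockopt(fd, SOL_SOCKET, SO_RXQ_OVFL, &one, sizeof(one));
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));

        if (recvmsg(fd, &msg, 0) < 0)
                return 1;

        for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
                if (cmsg->cmsg_level == SOL_SOCKET &&
                    cmsg->cmsg_type == SO_RXQ_OVFL) {
                        uint32_t drops;

                        memcpy(&drops, CMSG_DATA(cmsg), sizeof(drops));
                        printf("drops so far: %u\n", drops);
                }
        }
        return 0;
}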




Thread overview: 11+ messages
2016-10-28 13:20 [PATCH net-next] udp: do fwd memory scheduling on dequeue Paolo Abeni
2016-10-28 13:20 ` Paolo Abeni
2016-10-28 17:16 ` Eric Dumazet
2016-10-28 17:50   ` Eric Dumazet
     [not found]     ` <1477677030.7065.250.camel-XN9IlZ5yJG9HTL0Zs8A6p+yfmBU6pStAUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2016-10-29  8:17       ` Paolo Abeni
2016-10-29  8:17         ` Paolo Abeni
     [not found]         ` <1477729045.5306.11.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-29 12:43           ` Eric Dumazet
2016-10-29 12:43             ` Eric Dumazet
2016-10-31 15:02             ` Paolo Abeni
     [not found]               ` <1477926132.6655.10.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-31 15:16                 ` Eric Dumazet [this message]
2016-10-31 15:16                   ` Eric Dumazet
