From mboxrd@z Thu Jan  1 00:00:00 1970
From: Paolo Abeni <pabeni@redhat.com>
Subject: Re: [PATCH net-next 2/2] udp: implement and use per cpu rx skbs
 cache
Date: Sun, 22 Apr 2018 13:22:58 +0200
Message-ID: <1524396178.10317.18.camel@redhat.com>
References: <cover.1524045911.git.pabeni@redhat.com>
         <dacbc7a3626bb170629e02159ed2f90120f06382.1524045911.git.pabeni@redhat.com>
         <890db004-4dfe-7f77-61ee-1ac0d7d2a24c@gmail.com>
         <1524071712.2599.60.camel@redhat.com>
         <3270c995-4eea-b3e1-128c-82921d89eb79@gmail.com>
         <1524123637.3160.16.camel@redhat.com>
         <0e3abeb5-8081-f9ea-4de6-cc1a7edfc5a5@gmail.com>
         <20180420154836.3690a39e@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
        Tariq Toukan <tariqt@mellanox.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>,
        Eric Dumazet <eric.dumazet@gmail.com>,
        Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from mx3-rdu2.redhat.com ([66.187.233.73]:45108 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
        id S1750910AbeDVLXD (ORCPT <rfc822;netdev@vger.kernel.org>);
        Sun, 22 Apr 2018 07:23:03 -0400
In-Reply-To: <20180420154836.3690a39e@redhat.com>
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On Fri, 2018-04-20 at 15:48 +0200, Jesper Dangaard Brouer wrote:
> On Thu, 19 Apr 2018 06:47:10 -0700 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On 04/19/2018 12:40 AM, Paolo Abeni wrote:
> > > On Wed, 2018-04-18 at 12:21 -0700, Eric Dumazet wrote:  
> > > > On 04/18/2018 10:15 AM, Paolo Abeni wrote:
> 
> [...]
> > > 
> > > Any suggestions for better results are more than welcome!  
> > 
> > Yes, remote skb freeing. I mentioned this idea to Jesper and Tariq in
> > Seoul (netdev conference). Not tied to UDP, but a generic solution.
> 
> Yes, I remember.  I think... was it the idea, where you basically
> wanted to queue back SKBs to the CPU that allocated them, right?
> 
> Freeing an SKB on the same CPU that allocated it has multiple
> advantages. (1) the SLUB allocator can use a non-atomic
> "cpu-local" (double)cmpxchg. (2) the 4 cache-lines of the SKB that
> were cleared by memset stay local.  (3) the atomic SKB refcnt/users
> stays local.

By the time the skb is returned to the ingress CPU, isn't that skb most
probably out of that CPU's cache already?

> We just have to make sure that the mechanism for queuing SKBs back
> doesn't cost more than the operations we expect to save.  Bulk
> transfer is an obvious approach.  For storing SKBs until they are
> returned, we already have a fast mechanism: see napi_consume_skb
> calling _kfree_skb_defer, which uses SLUB/SLAB bulk free to amortize
> cost (1).
> 
> I guess the missing information is that we don't know which CPU the
> SKB was created on...
> 
> Where to store this CPU info?
> 
> (a) In struct sk_buff, in a cache-line that is already read on remote
> CPU in UDP code?
> 
> (b) In struct page? As the SLUB allocator hands out objects/SKBs on a
> per-page basis, we could have SLUB store a hint about the CPU the page
> was allocated on, and bet on returning to that CPU (it might be bad
> to read the struct-page cache-line).

Bulking would be doable only for connected sockets; elsewhere it would
be difficult to assemble a burst long enough to amortize the handshake
with the remote CPU (spinlock + IPI needed?).

For unconnected sockets, would it be good enough to send a whole skb
burst back to one of the (several) ingress CPUs? E.g. by peeking at the
CPU associated with the first skb in the burst, we would somewhat
balance the load between the ingress CPUs.
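
To make the "peek at the first skb" idea a bit more concrete, here is a
rough userspace sketch. It is only an illustration under stated
assumptions: fake_skb, return_stats, return_burst and NR_INGRESS_CPUS
are hypothetical names, not kernel APIs, and the plain counters stand in
for whatever per-CPU free list and remote handshake a real
implementation would need.

```c
#include <assert.h>
#include <stddef.h>

#define NR_INGRESS_CPUS 4	/* assumption: small fixed set of ingress CPUs */

/* Hypothetical stand-in for an skb: it only remembers its allocating CPU. */
struct fake_skb {
	int alloc_cpu;
};

/* Hypothetical bookkeeping, standing in for per-CPU free lists. */
struct return_stats {
	int returned[NR_INGRESS_CPUS];	/* buffers handed back to each CPU */
	int local[NR_INGRESS_CPUS];	/* of those, allocated on that same CPU */
};

/*
 * Hand the whole burst back to a single CPU, chosen by peeking at the
 * first buffer only. Buffers allocated elsewhere still go to that CPU,
 * so only part of the burst gets a truly "local" free.
 * Returns the chosen CPU.
 */
static int return_burst(const struct fake_skb *burst, size_t len,
			struct return_stats *st)
{
	int target;
	size_t i;

	assert(burst != NULL && st != NULL && len > 0);

	target = burst[0].alloc_cpu;
	assert(target >= 0 && target < NR_INGRESS_CPUS);

	for (i = 0; i < len; i++) {
		st->returned[target]++;
		if (burst[i].alloc_cpu == target)
			st->local[target]++;
	}
	return target;
}
```

With a burst whose buffers came from CPUs 2, 0, 2 and 1, everything goes
back to CPU 2 and half of the buffers are freed on the CPU that
allocated them. Over many bursts the target CPU follows whatever skb
happens to come first, which is where the rough load balancing would
come from.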

Cheers,

Paolo