From: Eric Dumazet
Subject: Re: [PATCH net-next 2/2] udp: implement and use per cpu rx skbs cache
Date: Wed, 18 Apr 2018 09:56:22 -0700
Message-ID: <890db004-4dfe-7f77-61ee-1ac0d7d2a24c@gmail.com>
To: Paolo Abeni, netdev@vger.kernel.org
Cc: "David S. Miller", Eric Dumazet

On 04/18/2018 03:22 AM, Paolo Abeni wrote:
> This changeset extends the idea behind commit c8c8b127091b ("udp:
> under rx pressure, try to condense skbs"), trading more BH cpu
> time and memory bandwidth to decrease the load on the user space
> receiver.
>
> At boot time we allocate a limited amount of skbs with small
> data buffers, storing them in per cpu arrays. Such skbs are never
> freed.
>
> At run time, under rx pressure, the BH tries to copy the current
> skb contents into the cache - if the current cache skb is available,
> and the ingress skb is small enough and without any head states.
>
> When using the cache skb, the ingress skb is dropped by the BH
> - while still hot in cache - and the cache skb is inserted into
> the rx queue, after increasing its usage count. Also, the cache
> array index is moved to the next entry.
>
> The receive side is unmodified: in udp_recvmsg() the skb usage
> count is decreased and the skb is _not_ freed - since the
> cache keeps the usage count > 0. Since skb->users is hot in the cache
> of the receiver at consume time - the receiver has just read skb->data,
> which lies in the same cacheline - the whole skb_consume_udp() becomes
> really cheap.
>
> UDP receive performance under flood improves as follows:
>
> NR RX queues   Kpps      Kpps      Delta (%)
>                Before    After
>
> 1              2252      2305      2
> 2              2151      2569      19
> 4              2033      2396      17
> 8              1969      2329      18
>
> Overall performance of the knotd DNS server under a real traffic
> flood improves as follows:
>
>                Kpps      Kpps      Delta (%)
>                Before    After
>
>                3777      3981      5

It might be time for the knotd DNS server to finally use SO_REUSEPORT
instead of adding this bloat to the kernel?

Sorry, a 5% improvement, when you can easily get a 300% improvement
with no kernel change, is not appealing to me :/
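
For reference, the SO_REUSEPORT setup being suggested is roughly the
following: each worker thread opens its own UDP socket, sets the option
before bind(), and binds to the same port, so the kernel hashes incoming
datagrams across the per-thread sockets instead of funneling everything
through one shared receive queue. This is only a minimal sketch; the
worker count, port 53, and the 512-byte buffer are arbitrary choices for
illustration, not anything from the patch or from knotd.

/* Minimal SO_REUSEPORT sketch: NUM_WORKERS threads, one UDP socket each,
 * all bound to the same port. Build with -pthread; binding port 53
 * needs the appropriate privileges.
 */
#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define NUM_WORKERS 4
#define DNS_PORT    53

static void *worker(void *arg)
{
	struct sockaddr_in addr;
	char buf[512];
	int one = 1;
	int fd;

	(void)arg;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0 || setsockopt(fd, SOL_SOCKET, SO_REUSEPORT,
				 &one, sizeof(one)) < 0) {
		perror("socket/SO_REUSEPORT");
		return NULL;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(DNS_PORT);

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("bind");
		close(fd);
		return NULL;
	}

	/* Each worker drains only its own socket; no contention on a
	 * single shared receive queue.
	 */
	for (;;) {
		ssize_t len = recv(fd, buf, sizeof(buf), 0);
		if (len > 0) {
			/* parse the query and send a reply here */
		}
	}
	return NULL;
}

int main(void)
{
	pthread_t tid[NUM_WORKERS];
	int i;

	for (i = 0; i < NUM_WORKERS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (i = 0; i < NUM_WORKERS; i++)
		pthread_join(tid[i], NULL);
	return 0;
}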