From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paolo Abeni
Subject: Re: [PATCH net-next 2/2] udp: implement and use per cpu rx skbs cache
Date: Thu, 19 Apr 2018 09:40:37 +0200
Message-ID: <1524123637.3160.16.camel@redhat.com>
References: <890db004-4dfe-7f77-61ee-1ac0d7d2a24c@gmail.com>
 <1524071712.2599.60.camel@redhat.com>
 <3270c995-4eea-b3e1-128c-82921d89eb79@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: "David S. Miller"
To: Eric Dumazet , netdev@vger.kernel.org
Return-path:
Received: from mx3-rdu2.redhat.com ([66.187.233.73]:57552 "EHLO
 mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org
 with ESMTP id S1750980AbeDSHkk (ORCPT );
 Thu, 19 Apr 2018 03:40:40 -0400
In-Reply-To: <3270c995-4eea-b3e1-128c-82921d89eb79@gmail.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

Hi,

On Wed, 2018-04-18 at 12:21 -0700, Eric Dumazet wrote:
>
> On 04/18/2018 10:15 AM, Paolo Abeni wrote:
> is not appealing to me :/
> >
> > Thank you for the feedback.
> > Sorry for not being clear about it, but knotd is using SO_REUSEPORT and
> > the above tests are leveraging it.
> >
> > That 5% is on top of that 300%.
>
> Then there is something wrong.
>
> Adding copies should not increase performance.

The skb and data are copied into the UDP skb cache only if the socket is
under memory pressure, and that happens if and only if the receiver is
slower than the BH/IP receive path.

The copy slows down the RX path, which was dropping packets anyway, and
makes udp_recvmsg() considerably faster, as consuming the skb becomes
almost a no-op.

AFAICS, this is similar to the strategy you used in:

commit c8c8b127091b758f5768f906bcdeeb88bc9951ca
Author: Eric Dumazet
Date:   Wed Dec 7 09:19:33 2016 -0800

    udp: under rx pressure, try to condense skbs

with the difference that with the UDP skb cache there is a hard limit on
the amount of memory the BH is allowed to copy.
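
To make the mechanism above concrete, here is a rough userspace model of
the idea. It is only a sketch, not the code from the patch: the names
(rx_cache, rx_cache_try_copy, rx_cache_consume) and the sizes are
invented for illustration, and a fixed slot count stands in for the hard
limit on what the BH may copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes: the hard limit on what the producer may copy. */
#define CACHE_SLOTS     4
#define CACHE_SLOT_SIZE 256

/* Hypothetical per-CPU cache, used as a FIFO ring of fixed-size slots. */
struct rx_cache {
	unsigned char data[CACHE_SLOTS][CACHE_SLOT_SIZE];
	size_t head;	/* oldest slot still owned by the receiver */
	size_t used;	/* number of slots currently in use */
};

/*
 * Producer side (models the BH). Returns the cached copy, or NULL when
 * the original packet should be handled as-is: either the receiver is
 * keeping up (no pressure), or the hard limit has been reached.
 */
static const unsigned char *rx_cache_try_copy(struct rx_cache *c,
					      const unsigned char *pkt,
					      size_t len, bool under_pressure)
{
	size_t slot;

	if (!under_pressure)
		return NULL;	/* fast receiver: never pay for a copy */
	if (c->used == CACHE_SLOTS || len > CACHE_SLOT_SIZE)
		return NULL;	/* hard limit: stop copying */

	slot = (c->head + c->used) % CACHE_SLOTS;
	memcpy(c->data[slot], pkt, len);
	c->used++;
	assert(c->used <= CACHE_SLOTS);
	return c->data[slot];
}

/* Consumer side (models the receiver): releasing a slot is trivial. */
static void rx_cache_consume(struct rx_cache *c)
{
	if (c->used == 0)
		return;
	c->head = (c->head + 1) % CACHE_SLOTS;
	c->used--;
}
```

The point of the model is the asymmetry: the extra work (the memcpy) is
paid by the producer, and only when the consumer is already the
bottleneck, while the consumer's release path shrinks to a couple of
index updates.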

> If it does, there is certainly another way, reaching 10% instead of 5%

I benchmarked against a DNS server to verify that we get measurable
benefits in a real-life scenario. The measured performance gain for the
RX path with reasonable configurations is ~20%.

Any suggestions for better results are more than welcome!

Cheers,

Paolo