From mboxrd@z Thu Jan 1 00:00:00 1970
From: Aaro Koskinen
Subject: Re: Improving OCTEON II 10G Ethernet performance
Date: Fri, 26 Aug 2016 00:18:52 +0300
Message-ID: <20160825211852.GG12169@raspberrypi.musicnaut.iki.fi>
References: <57BF21C7.5070709@caviumnetworks.com> <20160825182210.GE12169@raspberrypi.musicnaut.iki.fi> <57BF5101.6080909@caviumnetworks.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: driverdev-devel , linux-mips , Aaro Koskinen , netdev
To: David Daney
Content-Disposition: inline
In-Reply-To: <57BF5101.6080909@caviumnetworks.com>
Errors-To: driverdev-devel-bounces@linuxdriverproject.org
Sender: "devel"
List-Id: netdev.vger.kernel.org

Hi,

On Thu, Aug 25, 2016 at 01:11:45PM -0700, David Daney wrote:
> On 08/25/2016 11:22 AM, Aaro Koskinen wrote:
> >On Thu, Aug 25, 2016 at 09:50:15AM -0700, David Daney wrote:
> >>Ideally we would configure the packet classifiers on the RX side to create
> >>multiple RX queues based on a hash of the TCP 5-tuple, and handle each queue
> >>with a single NAPI instance. That should result in better performance while
> >>maintaining packet ordering.
> >
> >Would this need anything else than reprogramming CVMX_PIP_PRT_TAGX, and
> >eliminating the global pow_receive_group and creating multiple NAPI instances
> >and registering IRQ handlers?
>
> That is essentially how it works. Set the tag generation parameters, and
> use the low order bits of the tag to select which POW/SSO group is assigned.
> The SSO group corresponds to an "rx queue"

OK, I will try to experiment with this. Even though my home routers are
2-core only I could still create more queues and verify that the traffic
gets distributed by checking the counters...

> >In the Yocto tree, the CVMX_PIP_PRT_TAGX register values are actually
> >documented:
> >
> >http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-contrib/tree/arch/mips/include/asm/octeon/cvmx-pip-defs.h?h=apaliwal/octeon#n3737
>
> Wow, I didn't realize that documentation was made public.

Also, the D-Link and Ubiquiti GPL source offerings for their products
usually include documentation for the register fields. Only in the
mainline kernel are they missing.

A.
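For readers following the thread, here is a rough software model of the
scheme discussed above: a tag is derived from the 5-tuple, the low-order
bits of the tag pick a group ("rx queue"), and per-group counters show how
flows get distributed. It is only a sketch under stated assumptions. On
real hardware the tag is generated by the PIP according to the
CVMX_PIP_PRT_TAGX settings, not by software; the FNV-1a hash is a stand-in
for whatever the hardware computes, and NUM_RX_GROUPS, struct flow_5tuple,
model_tag(), tag_to_group() and count_flows() are hypothetical names that
do not exist in the octeon driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical number of receive groups; must be a power of two. */
#define NUM_RX_GROUPS 4u

/* The 5-tuple that identifies one flow (illustrative layout only). */
struct flow_5tuple {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t protocol;
};

/* One FNV-1a pass over a byte range. Stand-in for the hardware hash. */
static uint32_t fnv1a_step(uint32_t hash, const void *data, size_t len)
{
	const uint8_t *p = data;
	size_t i;

	for (i = 0; i < len; i++) {
		hash ^= p[i];
		hash *= 16777619u;
	}
	return hash;
}

/*
 * Software model of tag generation. Fields are hashed one by one so
 * that struct padding bytes never influence the result.
 */
static uint32_t model_tag(const struct flow_5tuple *f)
{
	uint32_t h = 2166136261u;

	h = fnv1a_step(h, &f->src_ip, sizeof(f->src_ip));
	h = fnv1a_step(h, &f->dst_ip, sizeof(f->dst_ip));
	h = fnv1a_step(h, &f->src_port, sizeof(f->src_port));
	h = fnv1a_step(h, &f->dst_port, sizeof(f->dst_port));
	h = fnv1a_step(h, &f->protocol, sizeof(f->protocol));
	return h;
}

/* The low-order bits of the tag select the group. */
static unsigned int tag_to_group(uint32_t tag)
{
	return tag & (NUM_RX_GROUPS - 1u);
}

/* Clear the counters, then count how many flows land in each group. */
static void count_flows(const struct flow_5tuple *flows, size_t n,
			unsigned long counters[NUM_RX_GROUPS])
{
	size_t i;

	for (i = 0; i < NUM_RX_GROUPS; i++)
		counters[i] = 0;
	for (i = 0; i < n; i++)
		counters[tag_to_group(model_tag(&flows[i]))]++;
}
```

Because the group depends only on the 5-tuple, every packet of a given
flow maps to the same group, which is what keeps per-flow ordering when
each group is served by a single NAPI instance. Calling count_flows() on a
set of flows that differ only in source port gives a quick picture of how
evenly such a hash spreads them; the actual distribution on hardware
depends on the real tag algorithm.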