From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Miller
Subject: Re: [PATCH net-next] virtio_net: force_napi_tx module param.
Date: Sun, 29 Jul 2018 09:00:27 -0700 (PDT)
Message-ID: <20180729.090027.1373538625446665385.davem@davemloft.net>
References: <20180723231119.142904-1-caleb.raitto@gmail.com>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: mst@redhat.com, jasowang@redhat.com, netdev@vger.kernel.org, caraitto@google.com
To: caleb.raitto@gmail.com
Return-path:
Received: from shards.monkeyblade.net ([23.128.96.9]:41152 "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1726438AbeG2RbW (ORCPT );
	Sun, 29 Jul 2018 13:31:22 -0400
In-Reply-To: <20180723231119.142904-1-caleb.raitto@gmail.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

From: Caleb Raitto
Date: Mon, 23 Jul 2018 16:11:19 -0700

> From: Caleb Raitto
>
> The driver disables tx napi if it's not certain that completions will
> be processed affine with tx service.
>
> Its heuristic doesn't account for some scenarios where it is, such as
> when the queue pair count matches the core but not hyperthread count.
>
> Allow userspace to override the heuristic. This is an alternative
> solution to that in the linked patch. That added more logic in the
> kernel for these cases, but the agreement was that this was better left
> to user control.
>
> Do not expand the existing napi_tx variable to a ternary value,
> because doing so can break user applications that expect
> boolean ('Y'/'N') instead of integer output. Add a new param instead.
>
> Link: https://patchwork.ozlabs.org/patch/725249/
> Acked-by: Willem de Bruijn
> Acked-by: Jon Olson
> Signed-off-by: Caleb Raitto

So I looked into the history surrounding these issues.

First of all, it always ends up turning out crummy when drivers start
to set affinities themselves. The worst possible case is to do it
_conditionally_, and that is exactly what virtio_net is doing.

From the user's perspective, this provides a really bad experience.

So if I have a 32-queue device and there are 32 cpus, you'll do all
the affinity settings, stopping Irqbalanced from doing anything,
right?

So if I add one more cpu, you'll say "oops, no idea what to do in
this situation" and not touch the affinities at all?

That makes no sense at all.

If the driver is going to set affinities at all, OWN that decision
and set it all the time to something reasonable.

Or accept that you shouldn't be touching this stuff in the first place
and leave the affinities alone.

Right now we're kinda in a situation where the driver has been setting
affinities in the ncpus==nqueues case for some time, so we can't stop
doing it.

Which means we have to set them in all cases to make the user
experience sane again.

I looked at the linked-to patch again:

	https://patchwork.ozlabs.org/patch/725249/

And I think the strategy should be made more generic, to get rid of
the hyperthreading assumptions. I also agree that the "assign to
first N cpus" logic doesn't make much sense either.

Just distribute across the available cpus evenly, and be done with it.
If you have 64 cpus and 32 queues, this assigns queues to every other
cpu.

Then we don't need this weird new module parameter.
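
To make the "distribute evenly" suggestion concrete, here is a minimal userspace sketch of the arithmetic in Python. It is not kernel code and not taken from any patch; the function name spread_queues and the floor-based spacing rule are assumptions chosen for illustration. It only shows one way a queue index could be mapped onto a list of available cpus so that every queue always gets an affinity, whatever the cpu count.

```python
def spread_queues(num_queues, cpus):
    """Map each queue index to one cpu id, spacing the picks evenly.

    Hypothetical helper for illustration only: `cpus` is the list of
    available cpu ids, and the result maps queue index -> cpu id.
    """
    if num_queues <= 0 or not cpus:
        return {}
    ncpus = len(cpus)
    # Queue q takes the cpu at position floor(q * ncpus / num_queues).
    # With more queues than cpus, several queues share one cpu.
    return {q: cpus[(q * ncpus) // num_queues] for q in range(num_queues)}


if __name__ == "__main__":
    # 32 queues on 64 cpus: the queues land on every other cpu.
    print(spread_queues(32, list(range(64))))
    # 32 queues on 33 cpus: each queue still gets a cpu of its own.
    print(spread_queues(32, list(range(33))))
```

With a rule of this shape, the ncpus==nqueues case falls out as the identity mapping, and adding one more cpu no longer changes whether affinities are set at all.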