From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Miller
Subject: Re: Poor TCP performance with XPS enabled after scrubbing skb
Date: Fri, 25 May 2018 16:29:21 -0400 (EDT)
Message-ID: <20180525.162921.398304898376507234.davem@davemloft.net>
References: <20180515193128.GA11901@plex.lan> <20180524191729.GA3770@plex.lan>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: eric.dumazet@gmail.com, netdev@vger.kernel.org, pabeni@redhat.com
To: fbl@sysclose.org
Return-path:
Received: from shards.monkeyblade.net ([184.105.139.130]:37386 "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S968311AbeEYU3X (ORCPT ); Fri, 25 May 2018 16:29:23 -0400
In-Reply-To: <20180524191729.GA3770@plex.lan>
Sender: netdev-owner@vger.kernel.org
List-ID:

From: Flavio Leitner
Date: Thu, 24 May 2018 16:17:29 -0300

> veth originally called skb_orphan() on veth_xmit() most probably
> because there was no TX completion. Then the code got generalized to
> dev_forward_skb() and later on moved to skb_scrub_packet().
>
> The issue is that we call skb_scrub_packet() on TX and RX paths and
> that is done while crossing netns. It doesn't look correct to keep
> the ->sk because I suspect that iptables/selinux/bpf, or some code
> path that I am probably missing could expose/use the wrong ->sk, for
> example.
>
> However, netdev_pick_tx() can't store the queue mapping without ->sk.
>
> The hack in the first email relies on the headers (skb_tx_hash) to
> always select the same TX queue, which solves the original problem
> but not the TCP small queues you mentioned.

Right, we can't allow a socket reference to escape over a netns
crossing. However, that is where we get the queue mapping state.

We might need to put the sk based decision into the skb somehow in
order to satisfy these two incompatible requirements.
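
To make the last point concrete, here is a rough userspace sketch of one possible reading of "put the sk based decision into the skb". It is not kernel code and not a proposed patch. Every name in it (toy_sock, toy_skb, queue_hint, toy_scrub, toy_pick_tx) is hypothetical, and it assumes that a queue index recorded before the crossing is still meaningful to the device that transmits afterwards, which may well not hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_QUEUE (-1)

/* Hypothetical stand-in for struct sock: remembers the TX queue of a flow. */
struct toy_sock {
	int tx_queue_mapping;		/* NO_QUEUE until a queue was picked */
};

/* Hypothetical stand-in for struct sk_buff. */
struct toy_skb {
	struct toy_sock *sk;		/* must not survive a netns crossing */
	int queue_hint;			/* assumed new field: copy of the sk decision */
	uint32_t flow_hash;		/* header hash, used only as a fallback */
};

/* Crossing netns: save the sk based decision in the packet, then drop sk. */
static void toy_scrub(struct toy_skb *skb)
{
	if (skb->sk && skb->sk->tx_queue_mapping != NO_QUEUE)
		skb->queue_hint = skb->sk->tx_queue_mapping;
	skb->sk = NULL;
}

/* Queue selection: socket state first, then the hint, then the hash. */
static int toy_pick_tx(const struct toy_skb *skb, int num_queues)
{
	assert(num_queues > 0);

	if (skb->sk && skb->sk->tx_queue_mapping != NO_QUEUE &&
	    skb->sk->tx_queue_mapping < num_queues)
		return skb->sk->tx_queue_mapping;

	if (skb->queue_hint != NO_QUEUE && skb->queue_hint < num_queues)
		return skb->queue_hint;

	return (int)(skb->flow_hash % (uint32_t)num_queues);
}
```

In this toy model a packet whose socket was mapped to a queue keeps that queue after toy_scrub() even though ->sk is gone, while a packet that never had a socket falls back to the header hash. It only addresses the queue selection side; it says nothing about TCP small queues.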