From mboxrd@z Thu Jan 1 00:00:00 1970 From: Koichiro Den Subject: Re: [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi Date: Wed, 23 Aug 2017 23:28:24 +0900 Message-ID: <1503498504.8694.26.camel@klaipeden.com> References: <20170819063854.27010-1-den@klaipeden.com> <5352c98a-fa48-fcf9-c062-9986a317a1b0@redhat.com> <64d451ae-9944-e978-5a05-54bb1a62aaad@redhat.com> <20170822204015-mutt-send-email-mst@kernel.org> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8BIT Cc: Willem de Bruijn , virtualization@lists.linux-foundation.org, Network Development To: "Michael S. Tsirkin" , Jason Wang Return-path: Received: from sender-of-o52.zoho.com ([135.84.80.217]:21333 "EHLO sender-of-o52.zoho.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754138AbdHWO23 (ORCPT ); Wed, 23 Aug 2017 10:28:29 -0400 In-Reply-To: <20170822204015-mutt-send-email-mst@kernel.org> Sender: netdev-owner@vger.kernel.org List-ID: On Tue, 2017-08-22 at 20:55 +0300, Michael S. Tsirkin wrote: > On Tue, Aug 22, 2017 at 10:50:41AM +0800, Jason Wang wrote: > > > Perhaps the descriptor pool should also be > > > revised to allow out of order completions. Then there is no need to > > > copy zerocopy packets whenever they may experience delay. > > > > Yes, but as replied in the referenced thread, windows driver may treat out > > of order completion as a bug. > > That would be a windows driver bug then, but I don't think it makes this > assumption. What the referenced thread > (https://patchwork.kernel.org/patch/3787671/) is saying is that host > must use any buffers made available on a tx vq within a reasonable > timeframe otherwise windows guests panic. > > Ideally we would detect that a packet is actually experiencing delay and > trigger the copy at that point e.g. by calling skb_linearize. But it > isn't easy to track these packets though and even harder to do a data > copy without races. > > Which reminds me that skb_linearize in net core seems to be > fundamentally racy - I suspect that if skb is cloned, and someone is > trying to use the shared frags while another thread calls skb_linearize, > we get some use after free bugs which likely mostly go undetected > because the corrupted packets mostly go on wire and get dropped > by checksum code. > Please let me make sure if I understand it correctly: * always do copy with skb_orphan_frags_rx as Willem mentioned in the earlier post, before the xmit_skb as opposed to my original patch, is safe but too costly so cannot be adopted. * as a generic solution, if we were to somehow overcome the safety issue, track the delay and do copy if some threshold is reached could be an answer, but it's hard for now. * so things like the current vhost-net implementation of deciding whether or not to do zerocopy beforehand referring the zerocopy tx error ratio is a point of practical compromise. Thanks.