From: Willem de Bruijn
Subject: Re: [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi
Date: Wed, 23 Aug 2017 11:20:45 -0400
To: Koichiro Den
Cc: "Michael S. Tsirkin", Jason Wang, virtualization@lists.linux-foundation.org, Network Development
In-Reply-To: <1503498504.8694.26.camel@klaipeden.com>
References: <20170819063854.27010-1-den@klaipeden.com> <5352c98a-fa48-fcf9-c062-9986a317a1b0@redhat.com> <64d451ae-9944-e978-5a05-54bb1a62aaad@redhat.com> <20170822204015-mutt-send-email-mst@kernel.org> <1503498504.8694.26.camel@klaipeden.com>

> Please let me make sure if I understand it correctly:
> * always do copy with skb_orphan_frags_rx as Willem mentioned in the earlier
>   post, before the xmit_skb as opposed to my original patch, is safe but too
>   costly so cannot be adopted.

One more point about msg_zerocopy in the guest. This does add new
allocation limits on optmem and on the locked pages rlimit. Hitting
these should be extremely rare: the TCP small queues limit normally
throttles well before this. Virtio-net is an exception, because it
breaks the TSQ signal by calling skb_orphan before transmission (see
the first snippet at the end of this mail), so hitting these limits is
more likely here. But even in this edge case the sendmsg call does not
block; it fails with -ENOBUFS. The caller can then send without
zerocopy to make forward progress and trigger free_old_xmit_skbs from
start_xmit (second snippet below).

> * as a generic solution, if we were to somehow overcome the safety issue, track
>   the delay and do copy if some threshold is reached could be an answer, but
>   it's hard for now.
> * so things like the current vhost-net implementation of deciding whether or
>   not to do zerocopy beforehand referring the zerocopy tx error ratio is a
>   point of practical compromise.

The fragility of this mechanism (third snippet below) is another
argument for switching to tx napi as the default.

Is there any more data about the Windows guest issues when completions
are not queued within a reasonable timeframe? What is this timescale,
and do we really need to work around it? That is the only thing keeping
us from removing the head-of-line blocking in vhost-net zerocopy.
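
For reference, the skb_orphan call that breaks the TSQ signal sits on
the non-napi transmit path of the guest driver. Quoting from memory of
drivers/net/virtio_net.c (current net-next), the fragment in start_xmit
looks roughly like this:

/* Don't wait up for transmitted skbs to be freed. */
if (!use_napi) {
	skb_orphan(skb);
	nf_reset(skb);
}

Orphaning detaches the skb from its socket, so the socket wmem
accounting, and with it the TSQ throttle, no longer sees the bytes
still sitting in the tx ring.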
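
The -ENOBUFS fallback mentioned above is the standard msg_zerocopy
usage pattern. A minimal userspace sketch, assuming a TCP socket with
SO_ZEROCOPY already enabled (send_may_zerocopy is an illustrative name,
and reaping completion notifications from the error queue is elided):

#include <errno.h>
#include <sys/socket.h>

#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000
#endif

static ssize_t send_may_zerocopy(int fd, const void *buf, size_t len)
{
	ssize_t ret = send(fd, buf, len, MSG_ZEROCOPY);

	/* Out of optmem or locked pages: retry as a plain copying
	 * send so the queue keeps draining and pending zerocopy
	 * completions can be reaped.
	 */
	if (ret == -1 && errno == ENOBUFS)
		ret = send(fd, buf, len, 0);

	return ret;
}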
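
And for completeness, the vhost-net selection heuristic under
discussion, roughly as it reads in drivers/vhost/net.c (quoting from
memory, field names may be off): zerocopy is attempted only while the
error count stays at or below 1/64th of transmitted packets, and never
during a tx flush:

static bool vhost_net_tx_select_zcopy(struct vhost_net *net)
{
	/* TX flush waits for outstanding DMAs to be done.
	 * Don't start new DMAs.
	 */
	return !net->tx_flush &&
	       net->tx_packets / 64 >= net->tx_zcopy_err;
}

Errors accumulated in the recent window disable zerocopy for the
packets that follow, which is part of the fragility referred to above.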