From: Wei Wang <wei.w.wang@intel.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Jason Wang" <jasowang@redhat.com>,
	"Stefan Hajnoczi" <stefanha@gmail.com>,
	"Marc-André Lureau" <marcandre.lureau@gmail.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"virtio-dev@lists.oasis-open.org"
	<virtio-dev@lists.oasis-open.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"Jan Scheurich" <jan.scheurich@ericsson.com>
Subject: Re: [Qemu-devel] [virtio-dev] RE: virtio-net: configurable TX queue size
Date: Thu, 11 May 2017 13:09:41 +0800	[thread overview]
Message-ID: <5913F215.3070503@intel.com> (raw)
In-Reply-To: <20170510230621-mutt-send-email-mst@kernel.org>

On 05/11/2017 04:07 AM, Michael S. Tsirkin wrote:
> On Wed, May 10, 2017 at 05:52:23PM +0800, Wei Wang wrote:
>> On 05/07/2017 12:39 PM, Wang, Wei W wrote:
>>> On 05/06/2017 04:37 AM, Michael S. Tsirkin wrote:
>>>> On Fri, May 05, 2017 at 10:27:13AM +0800, Jason Wang wrote:
>>>>> On 05/04/2017 18:58, Wang, Wei W wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I want to re-open a discussion from a long time ago:
>>>>>> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
>>>>>> and discuss the possibility of changing the hardcoded (256) TX
>>>>>> queue size to be configurable between 256 and 1024.
>>>>> Yes, I think we probably need this.
>>>>>
>>>>>> The reason for this request is that severe packet drops in the
>>>>>> TX direction were observed with the existing hardcoded queue
>>>>>> size of 256, which causes performance problems for packet-drop-
>>>>>> sensitive guest applications that cannot use indirect descriptor
>>>>>> tables. The issue goes away with a 1K queue size.
>>>>> Do we need even more? What if we find 1K is not sufficient in the
>>>>> future? Modern NICs have queue sizes up to ~8192.
>>>>>
>>>>>> The concern mentioned in the previous discussion (please check the
>>>>>> link
>>>>>> above) is that the number of chained descriptors would exceed
>>>>>> UIO_MAXIOV (1024) supported by Linux.
>>>>> We could try to address this limitation, but we would probably
>>>>> need a new feature bit to allow more than UIO_MAXIOV sgs.
>>>> I'd say we should split the queue size and the sg size.
>>>>
>> I'm still doing some investigation into this. One question (or issue)
>> I found in the implementation is that the virtio-net device changes
>> the message layout when the vnet_hdr needs an endianness swap
>> (i.e. when virtio_needs_swap() is true). This change adds one more
>> iov to the iov[]-s passed from the driver.
>>
>> To be more precise, the message from the driver can be in one of
>> the following two layouts:
>> Layout1:
>> iov[0]: vnet_hdr + data
>>
>> Layout2:
>> iov[0]: vnet_hdr
>> iov[1]: data
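>>
>> Concretely (hdr, data, data_len, and hdr_plus_data below are
>> illustrative placeholders, not names from the code), the two
>> layouts correspond to:
>>
>>     /* Layout1: one element carries both the header and the data */
>>     iov[0].iov_base = hdr_plus_data;
>>     iov[0].iov_len  = n->guest_hdr_len + data_len;
>>
>>     /* Layout2: header and data in separate elements */
>>     iov[0].iov_base = hdr;   iov[0].iov_len = n->guest_hdr_len;
>>     iov[1].iov_base = data;  iov[1].iov_len = data_len;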
>>
>> If the driver passes the message in Layout1, the following code in
>> the device changes it from Layout1 to Layout2:
>>
>> if (n->needs_vnet_hdr_swap) {
>>     /* Byte-swap the header fields in a local copy (mhdr). */
>>     virtio_net_hdr_swap(vdev, (void *) &mhdr);
>>     /* Prepend the swapped header as a new first iov element... */
>>     sg2[0].iov_base = &mhdr;
>>     sg2[0].iov_len = n->guest_hdr_len;
>>     /* ...and copy the data iovs behind it, skipping the original
>>      * header bytes at the front of out_sg. */
>>     out_num = iov_copy(&sg2[1], ARRAY_SIZE(sg2) - 1,
>>                        out_sg, out_num,
>>                        n->guest_hdr_len, -1);
>>     if (out_num == VIRTQUEUE_MAX_SIZE) {
>>         goto drop;
>>     }
>>     out_num += 1;
>>     out_sg = sg2;
>> }
>>
>> sg2[0] is the extra element, which potentially causes the off-by-one
>> issue: once the header is prepended, a message that already uses the
>> maximum number of iovs ends up one iov over the limit. I didn't find
>> any other path that can cause the issue.
>>
>> Could we keep the original layout by just copying the swapped
>> "mhdr" back to the original out_sg[0].iov_base?
>>
>> Best,
>> Wei
> We can't, because that data should be read-only to the host.
>

OK. I just posted a patch to solve this issue. Please take a look
there. Thanks.
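
For reference, the direction is roughly the following (a minimal
sketch, assuming the fix is simply to give the device's local sg2[]
array one extra slot, so prepending the swapped header can never
overflow it even when the driver uses all VIRTQUEUE_MAX_SIZE
descriptors):

    /* One extra slot for the byte-swapped vnet header that the
     * device prepends in front of the driver's data iovs. */
    struct iovec sg2[VIRTQUEUE_MAX_SIZE + 1];

    ...
    out_num = iov_copy(&sg2[1], ARRAY_SIZE(sg2) - 1,
                       out_sg, out_num,
                       n->guest_hdr_len, -1);
    /* ARRAY_SIZE(sg2) - 1 == VIRTQUEUE_MAX_SIZE, so even a maximal
     * message fits after the header element is added. */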

Best,
Wei

Thread overview: 14+ messages
2017-05-04 10:58 [Qemu-devel] virtio-net: configurable TX queue size Wang, Wei W
2017-05-05  2:27 ` Jason Wang
2017-05-05  5:53   ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-05-05  9:20     ` Jason Wang
2017-05-05 22:08       ` Michael S. Tsirkin
2017-05-07 12:02       ` [Qemu-devel] [virtio-dev] " Yan Vugenfirer
2017-05-08  1:23         ` Wei Wang
2017-05-05 20:36   ` [Qemu-devel] " Michael S. Tsirkin
2017-05-07  4:39     ` Wang, Wei W
2017-05-10  9:00       ` Jason Wang
2017-05-10  9:59         ` Wei Wang
2017-05-10  9:52       ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-05-10 20:07         ` Michael S. Tsirkin
2017-05-11  5:09           ` Wei Wang [this message]
