* [Qemu-devel] virtio-net: configurable TX queue size
@ 2017-05-04 10:58 Wang, Wei W
  2017-05-05  2:27 ` Jason Wang
  0 siblings, 1 reply; 14+ messages in thread
From: Wang, Wei W @ 2017-05-04 10:58 UTC (permalink / raw)
  To: Michael S. Tsirkin, jasowang, Stefan Hajnoczi,
	Marc-André Lureau, pbonzini, virtio-dev, qemu-devel
  Cc: Jan Scheurich

Hi,

I want to re-open a discussion that was left off a long time ago:
https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
and discuss the possibility of changing the hardcoded TX queue size (256)
to be configurable between 256 and 1024.
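
Concretely, the idea is to expose the TX queue size as a device property
of virtio-net and validate it at realize time. Below is a rough, untested
sketch; the property name, the range macros and the net_conf field are
placeholders rather than a final interface:

#define VIRTIO_NET_TX_QUEUE_MIN_SIZE 256
#define VIRTIO_NET_TX_QUEUE_MAX_SIZE 1024

static Property virtio_net_properties[] = {
    /* ... existing properties ... */
    DEFINE_PROP_UINT16("tx_queue_size", VirtIONet,
                       net_conf.tx_queue_size,
                       VIRTIO_NET_TX_QUEUE_MIN_SIZE),
    DEFINE_PROP_END_OF_LIST(),
};

/* In virtio_net_device_realize(), before the TX virtqueues are created:
 * reject out-of-range or non-power-of-two sizes. */
if (n->net_conf.tx_queue_size < VIRTIO_NET_TX_QUEUE_MIN_SIZE ||
    n->net_conf.tx_queue_size > VIRTIO_NET_TX_QUEUE_MAX_SIZE ||
    !is_power_of_2(n->net_conf.tx_queue_size)) {
    error_setg(errp, "Invalid tx_queue_size (= %d), must be a power of 2 "
               "between %d and %d", n->net_conf.tx_queue_size,
               VIRTIO_NET_TX_QUEUE_MIN_SIZE, VIRTIO_NET_TX_QUEUE_MAX_SIZE);
    virtio_cleanup(vdev);
    return;
}

With something along these lines, the size could then be chosen with e.g.
"-device virtio-net-pci,tx_queue_size=1024".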

The reason for this request is that severe packet drops in the TX
direction were observed with the existing hardcoded queue size of 256,
which causes performance problems for packet-drop-sensitive guest
applications that cannot use indirect descriptor tables. The issue goes
away with a 1K queue size.

The concern raised in the previous discussion (please see the link
above) is that the number of chained descriptors could exceed
UIO_MAXIOV (1024), the limit supported by Linux.

From the code, I think the number of chained descriptors is limited to
MAX_SKB_FRAGS + 2 (~18), which is much less than UIO_MAXIOV.
Please point out if I missed anything. Thanks.

Best,
Wei


* Re: [Qemu-devel] virtio-net: configurable TX queue size
  2017-05-04 10:58 [Qemu-devel] virtio-net: configurable TX queue size Wang, Wei W
@ 2017-05-05  2:27 ` Jason Wang
  2017-05-05  5:53   ` [Qemu-devel] [virtio-dev] " Wei Wang
  2017-05-05 20:36   ` [Qemu-devel] " Michael S. Tsirkin
  0 siblings, 2 replies; 14+ messages in thread
From: Jason Wang @ 2017-05-05  2:27 UTC (permalink / raw)
  To: Wang, Wei W, Michael S. Tsirkin, Stefan Hajnoczi,
	Marc-André Lureau, pbonzini, virtio-dev, qemu-devel
  Cc: Jan Scheurich



On 2017年05月04日 18:58, Wang, Wei W wrote:
> Hi,
>
> I want to re-open the discussion left long time ago:
> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
> , and discuss the possibility of changing the hardcoded (256) TX  queue
> size to be configurable between 256 and 1024.

Yes, I think we probably need this.

>
> The reason to propose this request is that a severe issue of packet drops in
> TX direction was observed with the existing hardcoded 256 queue size,
> which causes performance issues for packet drop sensitive guest
> applications that cannot use indirect descriptor tables. The issue goes away
> with 1K queue size.

Do we need even more? What if we find that 1K is not sufficient in the
future? Modern NICs have queue sizes up to ~8192.

>
> The concern mentioned in the previous discussion (please check the link
> above) is that the number of chained descriptors would exceed
> UIO_MAXIOV (1024) supported by the Linux.

We could try to address this limitation, but we would probably need a
new feature bit to allow more than UIO_MAXIOV sgs.

>
>  From the code,  I think the number of the chained descriptors is limited to
> MAX_SKB_FRAGS + 2 (~18), which is much less than UIO_MAXIOV.

This is the limit on the number of page frags per skb, not the iov limit.

Thanks

> Please point out if I missed anything. Thanks.
>
> Best,
> Wei
>
>


* Re: [Qemu-devel] [virtio-dev] Re: virtio-net: configurable TX queue size
  2017-05-05  2:27 ` Jason Wang
@ 2017-05-05  5:53   ` Wei Wang
  2017-05-05  9:20     ` Jason Wang
  2017-05-05 20:36   ` [Qemu-devel] " Michael S. Tsirkin
  1 sibling, 1 reply; 14+ messages in thread
From: Wei Wang @ 2017-05-05  5:53 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin, Stefan Hajnoczi,
	Marc-André Lureau, pbonzini, virtio-dev, qemu-devel
  Cc: Jan Scheurich

On 05/05/2017 10:27 AM, Jason Wang wrote:
>
>
> On 2017年05月04日 18:58, Wang, Wei W wrote:
>> Hi,
>>
>> I want to re-open the discussion left long time ago:
>> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
>> , and discuss the possibility of changing the hardcoded (256) TX  queue
>> size to be configurable between 256 and 1024.
>
> Yes, I think we probably need this.

That's great, thanks.

>
>>
>> The reason to propose this request is that a severe issue of packet 
>> drops in
>> TX direction was observed with the existing hardcoded 256 queue size,
>> which causes performance issues for packet drop sensitive guest
>> applications that cannot use indirect descriptor tables. The issue 
>> goes away
>> with 1K queue size.
>
> Do we need even more, what if we find 1K is even not sufficient in the 
> future? Modern nics has size up to ~8192.

Yes. We could probably also set the RX queue size to 8192 (currently
it's 1K) as well.

>
>>
>> The concern mentioned in the previous discussion (please check the link
>> above) is that the number of chained descriptors would exceed
>> UIO_MAXIOV (1024) supported by the Linux.
>
> We could try to address this limitation but probably need a new 
> feature bit to allow more than UIO_MAXIOV sgs.

I think we should first discuss (below) whether this would actually be an issue.

>
>>
>>  From the code,  I think the number of the chained descriptors is 
>> limited to
>> MAX_SKB_FRAGS + 2 (~18), which is much less than UIO_MAXIOV.
>
> This is the limitation of #page frags for skb, not the iov limitation.

I think the page frags are filled into the same number of descriptors
in the virtio-net driver (e.g. 10 descriptors for 10 page frags), and on
the other side the virtio-net backend uses the same number of iovs for
those descriptors.

Since the number of page frags is limited to 18, there shouldn't be more
than 18 iovs passed to writev, right?
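
To make the mapping concrete, the driver-side TX path looks roughly like
the simplified sketch below (not the actual xmit_skb() code; send_queue
and its sg[] array stand in for the driver's real per-queue state):

/* One sg entry for the vnet header, one for the linear part and one per
 * page frag, so at most MAX_SKB_FRAGS + 2 entries get chained, and the
 * backend later sees one iov per descriptor. */
static int sketch_xmit_skb(struct send_queue *sq, struct sk_buff *skb,
                           struct virtio_net_hdr_mrg_rxbuf *hdr)
{
    int num_sg;

    sg_init_table(sq->sg, MAX_SKB_FRAGS + 2);
    sg_set_buf(&sq->sg[0], hdr, sizeof(*hdr));   /* descriptor 0: header */

    /* descriptors 1..N: linear data plus one entry per page frag */
    num_sg = skb_to_sgvec(skb, &sq->sg[1], 0, skb->len);
    if (num_sg < 0)
        return num_sg;

    /* each sg entry becomes one chained descriptor in the TX virtqueue */
    return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg + 1, skb, GFP_ATOMIC);
}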

Best,
Wei


* Re: [Qemu-devel] [virtio-dev] Re: virtio-net: configurable TX queue size
  2017-05-05  5:53   ` [Qemu-devel] [virtio-dev] " Wei Wang
@ 2017-05-05  9:20     ` Jason Wang
  2017-05-05 22:08       ` Michael S. Tsirkin
  2017-05-07 12:02       ` [Qemu-devel] [virtio-dev] " Yan Vugenfirer
  0 siblings, 2 replies; 14+ messages in thread
From: Jason Wang @ 2017-05-05  9:20 UTC (permalink / raw)
  To: Wei Wang, Michael S. Tsirkin, Stefan Hajnoczi,
	Marc-André Lureau, pbonzini, virtio-dev, qemu-devel
  Cc: Jan Scheurich



On 2017年05月05日 13:53, Wei Wang wrote:
> On 05/05/2017 10:27 AM, Jason Wang wrote:
>>
>>
>> On 2017年05月04日 18:58, Wang, Wei W wrote:
>>> Hi,
>>>
>>> I want to re-open the discussion left long time ago:
>>> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
>>> , and discuss the possibility of changing the hardcoded (256) TX  queue
>>> size to be configurable between 256 and 1024.
>>
>> Yes, I think we probably need this.
>
> That's great, thanks.
>
>>
>>>
>>> The reason to propose this request is that a severe issue of packet 
>>> drops in
>>> TX direction was observed with the existing hardcoded 256 queue size,
>>> which causes performance issues for packet drop sensitive guest
>>> applications that cannot use indirect descriptor tables. The issue 
>>> goes away
>>> with 1K queue size.
>>
>> Do we need even more, what if we find 1K is even not sufficient in 
>> the future? Modern nics has size up to ~8192.
>
> Yes. Probably, we can also set the RX queue size to 8192 (currently 
> it's 1K) as well.
>
>>
>>>
>>> The concern mentioned in the previous discussion (please check the link
>>> above) is that the number of chained descriptors would exceed
>>> UIO_MAXIOV (1024) supported by the Linux.
>>
>> We could try to address this limitation but probably need a new 
>> feature bit to allow more than UIO_MAXIOV sgs.
>
> I think we should first discuss whether it would be an issue below.
>
>>
>>>
>>>  From the code,  I think the number of the chained descriptors is 
>>> limited to
>>> MAX_SKB_FRAGS + 2 (~18), which is much less than UIO_MAXIOV.
>>
>> This is the limitation of #page frags for skb, not the iov limitation.
>
> I think the number of page frags are filled into the same number of 
> descriptors
> in the virtio-net driver (e.g. use 10 descriptors for 10 page frags). 
> On the other
> side, the virtio-net backend uses the same number of iov for the 
> descriptors.
>
> Since the number of page frags is limited to 18, I think there 
> wouldn't be more
> than 18 iovs to be passed to writev, right?

It looks like that's not the case; see skb_copy_datagram_from_iter().

Thanks

>
> Best,
> Wei
>


* Re: [Qemu-devel] virtio-net: configurable TX queue size
  2017-05-05  2:27 ` Jason Wang
  2017-05-05  5:53   ` [Qemu-devel] [virtio-dev] " Wei Wang
@ 2017-05-05 20:36   ` Michael S. Tsirkin
  2017-05-07  4:39     ` Wang, Wei W
  1 sibling, 1 reply; 14+ messages in thread
From: Michael S. Tsirkin @ 2017-05-05 20:36 UTC (permalink / raw)
  To: Jason Wang
  Cc: Wang, Wei W, Stefan Hajnoczi, Marc-André Lureau, pbonzini,
	virtio-dev, qemu-devel, Jan Scheurich

On Fri, May 05, 2017 at 10:27:13AM +0800, Jason Wang wrote:
> 
> 
> On 2017年05月04日 18:58, Wang, Wei W wrote:
> > Hi,
> > 
> > I want to re-open the discussion left long time ago:
> > https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
> > , and discuss the possibility of changing the hardcoded (256) TX  queue
> > size to be configurable between 256 and 1024.
> 
> Yes, I think we probably need this.
> 
> > 
> > The reason to propose this request is that a severe issue of packet drops in
> > TX direction was observed with the existing hardcoded 256 queue size,
> > which causes performance issues for packet drop sensitive guest
> > applications that cannot use indirect descriptor tables. The issue goes away
> > with 1K queue size.
> 
> Do we need even more, what if we find 1K is even not sufficient in the
> future? Modern nics has size up to ~8192.
> 
> > 
> > The concern mentioned in the previous discussion (please check the link
> > above) is that the number of chained descriptors would exceed
> > UIO_MAXIOV (1024) supported by the Linux.
> 
> We could try to address this limitation but probably need a new feature bit
> to allow more than UIO_MAXIOV sgs.

I'd say we should split the queue size and the sg size.

> > 
> >  From the code,  I think the number of the chained descriptors is limited to
> > MAX_SKB_FRAGS + 2 (~18), which is much less than UIO_MAXIOV.
> 
> This is the limitation of #page frags for skb, not the iov limitation.
> 
> Thanks
> 
> > Please point out if I missed anything. Thanks.
> > 
> > Best,
> > Wei
> > 
> > 


* Re: [Qemu-devel] [virtio-dev] Re: virtio-net: configurable TX queue size
  2017-05-05  9:20     ` Jason Wang
@ 2017-05-05 22:08       ` Michael S. Tsirkin
  2017-05-07 12:02       ` [Qemu-devel] [virtio-dev] " Yan Vugenfirer
  1 sibling, 0 replies; 14+ messages in thread
From: Michael S. Tsirkin @ 2017-05-05 22:08 UTC (permalink / raw)
  To: Jason Wang
  Cc: Wei Wang, Stefan Hajnoczi, Marc-André Lureau, pbonzini,
	virtio-dev, qemu-devel, Jan Scheurich

On Fri, May 05, 2017 at 05:20:07PM +0800, Jason Wang wrote:
> 
> 
> On 2017年05月05日 13:53, Wei Wang wrote:
> > On 05/05/2017 10:27 AM, Jason Wang wrote:
> > > 
> > > 
> > > On 2017年05月04日 18:58, Wang, Wei W wrote:
> > > > Hi,
> > > > 
> > > > I want to re-open the discussion left long time ago:
> > > > https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
> > > > , and discuss the possibility of changing the hardcoded (256) TX  queue
> > > > size to be configurable between 256 and 1024.
> > > 
> > > Yes, I think we probably need this.
> > 
> > That's great, thanks.
> > 
> > > 
> > > > 
> > > > The reason to propose this request is that a severe issue of
> > > > packet drops in
> > > > TX direction was observed with the existing hardcoded 256 queue size,
> > > > which causes performance issues for packet drop sensitive guest
> > > > applications that cannot use indirect descriptor tables. The
> > > > issue goes away
> > > > with 1K queue size.
> > > 
> > > Do we need even more, what if we find 1K is even not sufficient in
> > > the future? Modern nics has size up to ~8192.
> > 
> > Yes. Probably, we can also set the RX queue size to 8192 (currently it's
> > 1K) as well.
> > 
> > > 
> > > > 
> > > > The concern mentioned in the previous discussion (please check the link
> > > > above) is that the number of chained descriptors would exceed
> > > > UIO_MAXIOV (1024) supported by the Linux.
> > > 
> > > We could try to address this limitation but probably need a new
> > > feature bit to allow more than UIO_MAXIOV sgs.
> > 
> > I think we should first discuss whether it would be an issue below.
> > 
> > > 
> > > > 
> > > >  From the code,  I think the number of the chained descriptors
> > > > is limited to
> > > > MAX_SKB_FRAGS + 2 (~18), which is much less than UIO_MAXIOV.
> > > 
> > > This is the limitation of #page frags for skb, not the iov limitation.
> > 
> > I think the number of page frags are filled into the same number of
> > descriptors
> > in the virtio-net driver (e.g. use 10 descriptors for 10 page frags). On
> > the other
> > side, the virtio-net backend uses the same number of iov for the
> > descriptors.
> > 
> > Since the number of page frags is limited to 18, I think there wouldn't
> > be more
> > than 18 iovs to be passed to writev, right?
> 
> Looks not, see skb_copy_datagram_from_iter().
> 
> Thanks

Besides, guests don't all use Linux drivers. Some use e.g. DPDK PMD
ones. It's best not to make assumptions beyond the spec; if you need to
make an assumption, it's best to add a flag to specify it.

> > 
> > Best,
> > Wei
> > 


* Re: [Qemu-devel] virtio-net: configurable TX queue size
  2017-05-05 20:36   ` [Qemu-devel] " Michael S. Tsirkin
@ 2017-05-07  4:39     ` Wang, Wei W
  2017-05-10  9:00       ` Jason Wang
  2017-05-10  9:52       ` [Qemu-devel] [virtio-dev] " Wei Wang
  0 siblings, 2 replies; 14+ messages in thread
From: Wang, Wei W @ 2017-05-07  4:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: Stefan Hajnoczi, Marc-André Lureau, pbonzini, virtio-dev,
	qemu-devel, Jan Scheurich

On 05/06/2017 04:37 AM, Michael S. Tsirkin wrote:
> On Fri, May 05, 2017 at 10:27:13AM +0800, Jason Wang wrote:
> >
> >
> > On 2017年05月04日 18:58, Wang, Wei W wrote:
> > > Hi,
> > >
> > > I want to re-open the discussion left long time ago:
> > > https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
> > > , and discuss the possibility of changing the hardcoded (256) TX
> > > queue size to be configurable between 256 and 1024.
> >
> > Yes, I think we probably need this.
> >
> > >
> > > The reason to propose this request is that a severe issue of packet
> > > drops in TX direction was observed with the existing hardcoded 256
> > > queue size, which causes performance issues for packet drop
> > > sensitive guest applications that cannot use indirect descriptor
> > > tables. The issue goes away with 1K queue size.
> >
> > Do we need even more, what if we find 1K is even not sufficient in the
> > future? Modern nics has size up to ~8192.
> >
> > >
> > > The concern mentioned in the previous discussion (please check the
> > > link
> > > above) is that the number of chained descriptors would exceed
> > > UIO_MAXIOV (1024) supported by the Linux.
> >
> > We could try to address this limitation but probably need a new
> > feature bit to allow more than UIO_MAXIOV sgs.
> 
> I'd say we should split the queue size and the sg size.
> 

I think we can just split the iov array in the virtio-net backend, that
is, split a large iov[] into multiple chunks of at most 1024 entries and
pass each chunk to writev.
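
Roughly, something like the untested sketch below. It assumes the tap
backend can accept one packet written across several writev() calls,
which is exactly the point that still needs to be confirmed; short
writes are also ignored for brevity:

#include <limits.h>
#include <sys/uio.h>

/* Write a long iov[] in chunks of at most IOV_MAX entries. */
static ssize_t writev_chunked(int fd, struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    while (iovcnt > 0) {
        int n = iovcnt > IOV_MAX ? IOV_MAX : iovcnt;
        ssize_t r = writev(fd, iov, n);

        if (r < 0) {
            return total ? total : -1;   /* errno is set by writev() */
        }
        total += r;
        iov += n;
        iovcnt -= n;
    }
    return total;
}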

Best,
Wei




* Re: [Qemu-devel] [virtio-dev] virtio-net: configurable TX queue size
  2017-05-05  9:20     ` Jason Wang
  2017-05-05 22:08       ` Michael S. Tsirkin
@ 2017-05-07 12:02       ` Yan Vugenfirer
  2017-05-08  1:23         ` Wei Wang
  1 sibling, 1 reply; 14+ messages in thread
From: Yan Vugenfirer @ 2017-05-07 12:02 UTC (permalink / raw)
  To: Jason Wang
  Cc: Wei Wang, Michael S. Tsirkin, Stefan Hajnoczi,
	Marc-André Lureau, pbonzini, virtio-dev, qemu-devel,
	Jan Scheurich


> On 5 May 2017, at 12:20, Jason Wang <jasowang@redhat.com> wrote:
> 
> 
> 
> On 2017年05月05日 13:53, Wei Wang wrote:
>> On 05/05/2017 10:27 AM, Jason Wang wrote:
>>> 
>>> 
>>> On 2017年05月04日 18:58, Wang, Wei W wrote:
>>>> Hi,
>>>> 
>>>> I want to re-open the discussion left long time ago:
>>>> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
>>>> , and discuss the possibility of changing the hardcoded (256) TX  queue
>>>> size to be configurable between 256 and 1024.
>>> 
>>> Yes, I think we probably need this.
>> 
>> That's great, thanks.
>> 
>>> 
>>>> 
>>>> The reason to propose this request is that a severe issue of packet drops in
>>>> TX direction was observed with the existing hardcoded 256 queue size,
>>>> which causes performance issues for packet drop sensitive guest
>>>> applications that cannot use indirect descriptor tables. The issue goes away
>>>> with 1K queue size.
>>> 
>>> Do we need even more, what if we find 1K is even not sufficient in the future? Modern nics has size up to ~8192.
>> 
>> Yes. Probably, we can also set the RX queue size to 8192 (currently it's 1K) as well.
>> 
>>> 
>>>> 
>>>> The concern mentioned in the previous discussion (please check the link
>>>> above) is that the number of chained descriptors would exceed
>>>> UIO_MAXIOV (1024) supported by the Linux.
>>> 
>>> We could try to address this limitation but probably need a new feature bit to allow more than UIO_MAXIOV sgs.
>> 
>> I think we should first discuss whether it would be an issue below.
>> 
>>> 
>>>> 
>>>> From the code,  I think the number of the chained descriptors is limited to
>>>> MAX_SKB_FRAGS + 2 (~18), which is much less than UIO_MAXIOV.
>>> 
>>> This is the limitation of #page frags for skb, not the iov limitation.
>> 
>> I think the number of page frags are filled into the same number of descriptors
>> in the virtio-net driver (e.g. use 10 descriptors for 10 page frags). On the other
>> side, the virtio-net backend uses the same number of iov for the descriptors.
>> 
>> Since the number of page frags is limited to 18, I think there wouldn't be more
>> than 18 iovs to be passed to writev, right?
> 
This assumption does not hold for Windows. We have seen cases (and, strangely enough, with small packets) where Windows returns a scatter-gather list with 32 descriptors or more.

Best regards,
Yan.

> Looks not, see skb_copy_datagram_from_iter().

> 
> Thanks
> 
>> 
>> Best,
>> Wei
>> 


* Re: [Qemu-devel] [virtio-dev] virtio-net: configurable TX queue size
  2017-05-07 12:02       ` [Qemu-devel] [virtio-dev] " Yan Vugenfirer
@ 2017-05-08  1:23         ` Wei Wang
  0 siblings, 0 replies; 14+ messages in thread
From: Wei Wang @ 2017-05-08  1:23 UTC (permalink / raw)
  To: Yan Vugenfirer, Jason Wang
  Cc: virtio-dev, Michael S. Tsirkin, Stefan Hajnoczi, qemu-devel,
	Jan Scheurich, Marc-André Lureau, pbonzini

On 05/07/2017 08:02 PM, Yan Vugenfirer wrote:
>> On 5 May 2017, at 12:20, Jason Wang <jasowang@redhat.com> wrote:
>>
>>
>>
>> On 2017年05月05日 13:53, Wei Wang wrote:
>>> On 05/05/2017 10:27 AM, Jason Wang wrote:
>>>>
>>>> On 2017年05月04日 18:58, Wang, Wei W wrote:
>>>>> Hi,
>>>>>
>>>>> I want to re-open the discussion left long time ago:
>>>>> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
>>>>> , and discuss the possibility of changing the hardcoded (256) TX  queue
>>>>> size to be configurable between 256 and 1024.
>>>> Yes, I think we probably need this.
>>> That's great, thanks.
>>>
>>>>> The reason to propose this request is that a severe issue of packet drops in
>>>>> TX direction was observed with the existing hardcoded 256 queue size,
>>>>> which causes performance issues for packet drop sensitive guest
>>>>> applications that cannot use indirect descriptor tables. The issue goes away
>>>>> with 1K queue size.
>>>> Do we need even more, what if we find 1K is even not sufficient in the future? Modern nics has size up to ~8192.
>>> Yes. Probably, we can also set the RX queue size to 8192 (currently it's 1K) as well.
>>>
>>>>> The concern mentioned in the previous discussion (please check the link
>>>>> above) is that the number of chained descriptors would exceed
>>>>> UIO_MAXIOV (1024) supported by the Linux.
>>>> We could try to address this limitation but probably need a new feature bit to allow more than UIO_MAXIOV sgs.
>>> I think we should first discuss whether it would be an issue below.
>>>
>>>>>  From the code,  I think the number of the chained descriptors is limited to
>>>>> MAX_SKB_FRAGS + 2 (~18), which is much less than UIO_MAXIOV.
>>>> This is the limitation of #page frags for skb, not the iov limitation.
>>> I think the number of page frags are filled into the same number of descriptors
>>> in the virtio-net driver (e.g. use 10 descriptors for 10 page frags). On the other
>>> side, the virtio-net backend uses the same number of iov for the descriptors.
>>>
>>> Since the number of page frags is limited to 18, I think there wouldn't be more
>>> than 18 iovs to be passed to writev, right?
> This limitation assumption is incorrect for Windows. We saw cases (and strangely enough with small packets) when Windows returns scatter gather list with 32 descriptors or more.
>
OK, thanks for sharing. Michael has also pointed out the non-Linux
driver cases. We're looking into ways to split the large iov.

Best,
Wei


* Re: [Qemu-devel] virtio-net: configurable TX queue size
  2017-05-07  4:39     ` Wang, Wei W
@ 2017-05-10  9:00       ` Jason Wang
  2017-05-10  9:59         ` Wei Wang
  2017-05-10  9:52       ` [Qemu-devel] [virtio-dev] " Wei Wang
  1 sibling, 1 reply; 14+ messages in thread
From: Jason Wang @ 2017-05-10  9:00 UTC (permalink / raw)
  To: Wang, Wei W, Michael S. Tsirkin
  Cc: virtio-dev, Stefan Hajnoczi, qemu-devel, Jan Scheurich,
	Marc-André Lureau, pbonzini



On 2017年05月07日 12:39, Wang, Wei W wrote:
> On 05/06/2017 04:37 AM, Michael S. Tsirkin wrote:
>> On Fri, May 05, 2017 at 10:27:13AM +0800, Jason Wang wrote:
>>>
>>> On 2017年05月04日 18:58, Wang, Wei W wrote:
>>>> Hi,
>>>>
>>>> I want to re-open the discussion left long time ago:
>>>> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
>>>> , and discuss the possibility of changing the hardcoded (256) TX
>>>> queue size to be configurable between 256 and 1024.
>>> Yes, I think we probably need this.
>>>
>>>> The reason to propose this request is that a severe issue of packet
>>>> drops in TX direction was observed with the existing hardcoded 256
>>>> queue size, which causes performance issues for packet drop
>>>> sensitive guest applications that cannot use indirect descriptor
>>>> tables. The issue goes away with 1K queue size.
>>> Do we need even more, what if we find 1K is even not sufficient in the
>>> future? Modern nics has size up to ~8192.
>>>
>>>> The concern mentioned in the previous discussion (please check the
>>>> link
>>>> above) is that the number of chained descriptors would exceed
>>>> UIO_MAXIOV (1024) supported by the Linux.
>>> We could try to address this limitation but probably need a new
>>> feature bit to allow more than UIO_MAXIOV sgs.
>> I'd say we should split the queue size and the sg size.
>>
> I think we can just split the iov size in the virtio-net backend,
> that is, split the large iov[] into multiple iov[1024] to send to writev.
>
> Best,
> Wei

Maybe, but let's first clarify the limitation and the new capability in
the spec. Want to send a patch?

Thanks


* Re: [Qemu-devel] [virtio-dev] RE: virtio-net: configurable TX queue size
  2017-05-07  4:39     ` Wang, Wei W
  2017-05-10  9:00       ` Jason Wang
@ 2017-05-10  9:52       ` Wei Wang
  2017-05-10 20:07         ` Michael S. Tsirkin
  1 sibling, 1 reply; 14+ messages in thread
From: Wei Wang @ 2017-05-10  9:52 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: Stefan Hajnoczi, Marc-André Lureau, pbonzini, virtio-dev,
	qemu-devel, Jan Scheurich

On 05/07/2017 12:39 PM, Wang, Wei W wrote:
> On 05/06/2017 04:37 AM, Michael S. Tsirkin wrote:
>> On Fri, May 05, 2017 at 10:27:13AM +0800, Jason Wang wrote:
>>>
>>> On 2017年05月04日 18:58, Wang, Wei W wrote:
>>>> Hi,
>>>>
>>>> I want to re-open the discussion left long time ago:
>>>> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
>>>> , and discuss the possibility of changing the hardcoded (256) TX
>>>> queue size to be configurable between 256 and 1024.
>>> Yes, I think we probably need this.
>>>
>>>> The reason to propose this request is that a severe issue of packet
>>>> drops in TX direction was observed with the existing hardcoded 256
>>>> queue size, which causes performance issues for packet drop
>>>> sensitive guest applications that cannot use indirect descriptor
>>>> tables. The issue goes away with 1K queue size.
>>> Do we need even more, what if we find 1K is even not sufficient in the
>>> future? Modern nics has size up to ~8192.
>>>
>>>> The concern mentioned in the previous discussion (please check the
>>>> link
>>>> above) is that the number of chained descriptors would exceed
>>>> UIO_MAXIOV (1024) supported by the Linux.
>>> We could try to address this limitation but probably need a new
>>> feature bit to allow more than UIO_MAXIOV sgs.
>> I'd say we should split the queue size and the sg size.
>>

I'm still investigating this. One question (or issue) I found in the
implementation is that the virtio-net device changes the message layout
when the vnet_hdr needs an endianness swap (i.e. when
virtio_needs_swap() is true). This change adds one more iov to the
iov[] array passed from the driver.

To be more precise, the message from the driver could be in one of the
two following layouts:
Layout1:
iov[0]: vnet_hdr + data

Layout2:
iov[0]: vnet_hdr
iov[1]: data

If the driver passes the message in Layout1, the following code in the
device changes it from Layout1 to Layout2:

if (n->needs_vnet_hdr_swap) {
    /* swap the header fields in a local copy (mhdr) ... */
    virtio_net_hdr_swap(vdev, (void *) &mhdr);
    /* ... and make it the first element of a new iov array */
    sg2[0].iov_base = &mhdr;
    sg2[0].iov_len = n->guest_hdr_len;
    /* sg2[1..] then reference the guest data after the header */
    out_num = iov_copy(&sg2[1], ARRAY_SIZE(sg2) - 1,
                       out_sg, out_num,
                       n->guest_hdr_len, -1);
    if (out_num == VIRTQUEUE_MAX_SIZE) {
        goto drop;
    }
    out_num += 1;           /* account for the extra sg2[0] element */
    out_sg = sg2;
}

sg2[0] is the extra element, which potentially causes the off-by-one
issue. I didn't find any other possibility that could cause it.

Could we keep the original layout by simply copying the swapped
"mhdr" back into the original out_sg[0].iov_base?

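What I mean is roughly the sketch below (leaving aside, for now, the
question of whether the device may write into the guest's buffers at
all):

/* Rough idea only: swap a local copy of the header and write it back
 * over the first guest_hdr_len bytes of the guest's own iovs, so no
 * extra sg2[0] element is needed and the iov count stays unchanged. */
if (n->needs_vnet_hdr_swap) {
    virtio_net_hdr_swap(vdev, (void *) &mhdr);
    iov_from_buf(out_sg, out_num, 0, &mhdr, n->guest_hdr_len);
}
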
Best,
Wei


* Re: [Qemu-devel] virtio-net: configurable TX queue size
  2017-05-10  9:00       ` Jason Wang
@ 2017-05-10  9:59         ` Wei Wang
  0 siblings, 0 replies; 14+ messages in thread
From: Wei Wang @ 2017-05-10  9:59 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: virtio-dev, Stefan Hajnoczi, qemu-devel, Jan Scheurich,
	Marc-André Lureau, pbonzini

On 05/10/2017 05:00 PM, Jason Wang wrote:
>
>
> On 2017年05月07日 12:39, Wang, Wei W wrote:
>> On 05/06/2017 04:37 AM, Michael S. Tsirkin wrote:
>>> On Fri, May 05, 2017 at 10:27:13AM +0800, Jason Wang wrote:
>>>>
>>>> On 2017年05月04日 18:58, Wang, Wei W wrote:
>>>>> Hi,
>>>>>
>>>>> I want to re-open the discussion left long time ago:
>>>>> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
>>>>> , and discuss the possibility of changing the hardcoded (256) TX
>>>>> queue size to be configurable between 256 and 1024.
>>>> Yes, I think we probably need this.
>>>>
>>>>> The reason to propose this request is that a severe issue of packet
>>>>> drops in TX direction was observed with the existing hardcoded 256
>>>>> queue size, which causes performance issues for packet drop
>>>>> sensitive guest applications that cannot use indirect descriptor
>>>>> tables. The issue goes away with 1K queue size.
>>>> Do we need even more, what if we find 1K is even not sufficient in the
>>>> future? Modern nics has size up to ~8192.
>>>>
>>>>> The concern mentioned in the previous discussion (please check the
>>>>> link
>>>>> above) is that the number of chained descriptors would exceed
>>>>> UIO_MAXIOV (1024) supported by the Linux.
>>>> We could try to address this limitation but probably need a new
>>>> feature bit to allow more than UIO_MAXIOV sgs.
>>> I'd say we should split the queue size and the sg size.
>>>
>> I think we can just split the iov size in the virtio-net backend,
>> that is, split the large iov[] into multiple iov[1024] to send to 
>> writev.
>>
>> Best,
>> Wei
>
> Maybe, but let's first clarify the limitation and new ability in the 
> spec. Want to send a patch?
>
Hi Jason,

I just posted a new discussion in another email. Please have a look and
check whether the issue can be solved within the existing implementation.

Best,
Wei


* Re: [Qemu-devel] [virtio-dev] RE: virtio-net: configurable TX queue size
  2017-05-10  9:52       ` [Qemu-devel] [virtio-dev] " Wei Wang
@ 2017-05-10 20:07         ` Michael S. Tsirkin
  2017-05-11  5:09           ` Wei Wang
  0 siblings, 1 reply; 14+ messages in thread
From: Michael S. Tsirkin @ 2017-05-10 20:07 UTC (permalink / raw)
  To: Wei Wang
  Cc: Jason Wang, Stefan Hajnoczi, Marc-André Lureau, pbonzini,
	virtio-dev, qemu-devel, Jan Scheurich

On Wed, May 10, 2017 at 05:52:23PM +0800, Wei Wang wrote:
> On 05/07/2017 12:39 PM, Wang, Wei W wrote:
> > On 05/06/2017 04:37 AM, Michael S. Tsirkin wrote:
> > > On Fri, May 05, 2017 at 10:27:13AM +0800, Jason Wang wrote:
> > > > 
> > > > On 2017年05月04日 18:58, Wang, Wei W wrote:
> > > > > Hi,
> > > > > 
> > > > > I want to re-open the discussion left long time ago:
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
> > > > > , and discuss the possibility of changing the hardcoded (256) TX
> > > > > queue size to be configurable between 256 and 1024.
> > > > Yes, I think we probably need this.
> > > > 
> > > > > The reason to propose this request is that a severe issue of packet
> > > > > drops in TX direction was observed with the existing hardcoded 256
> > > > > queue size, which causes performance issues for packet drop
> > > > > sensitive guest applications that cannot use indirect descriptor
> > > > > tables. The issue goes away with 1K queue size.
> > > > Do we need even more, what if we find 1K is even not sufficient in the
> > > > future? Modern nics has size up to ~8192.
> > > > 
> > > > > The concern mentioned in the previous discussion (please check the
> > > > > link
> > > > > above) is that the number of chained descriptors would exceed
> > > > > UIO_MAXIOV (1024) supported by the Linux.
> > > > We could try to address this limitation but probably need a new
> > > > feature bit to allow more than UIO_MAXIOV sgs.
> > > I'd say we should split the queue size and the sg size.
> > > 
> 
> I'm still doing some investigation about this, one question (or issue) I
> found from the implementation is that the virtio-net device changes
> the message layout when the vnet_hdr needs an endianness swap
> (i.e. virtio_needs_swap()). This change adds one more iov to the
> iov[]-s passed from the driver.
> 
> To be more precise, the message from the driver could be in one
> of the two following layout:
> Layout1:
> iov[0]: vnet_hdr + data
> 
> Layout2:
> iov[0]: vnet_hdr
> iov[1]: data
> 
> If the driver passes the message in Layout1, and the following code
> from the device changes the message from Layout1 to Layout2:
> 
> if (n->needs_vnet_hdr_swap) {
>                 virtio_net_hdr_swap(vdev, (void *) &mhdr);
>                 sg2[0].iov_base = &mhdr;
>                 sg2[0].iov_len = n->guest_hdr_len;
>                 out_num = iov_copy(&sg2[1], ARRAY_SIZE(sg2) - 1,
>                                    out_sg, out_num,
>                                    n->guest_hdr_len, -1);
>                 if (out_num == VIRTQUEUE_MAX_SIZE) {
>                     goto drop;
>                 }
>                 out_num += 1;
>                 out_sg = sg2;
>             }
> 
> sg2[0] is the extra one, which potentially causes the off-by-one
> issue. I didn't find other possibilities that can cause the issue.
> 
> Could we keep the original layout by just copying the swapped
> "mhdr" back to original out_sg[0].iov_base?
> 
> Best,
> Wei

We can't, because that data should be read-only to the host.



* Re: [Qemu-devel] [virtio-dev] RE: virtio-net: configurable TX queue size
  2017-05-10 20:07         ` Michael S. Tsirkin
@ 2017-05-11  5:09           ` Wei Wang
  0 siblings, 0 replies; 14+ messages in thread
From: Wei Wang @ 2017-05-11  5:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Stefan Hajnoczi, Marc-André Lureau, pbonzini,
	virtio-dev, qemu-devel, Jan Scheurich

On 05/11/2017 04:07 AM, Michael S. Tsirkin wrote:
> On Wed, May 10, 2017 at 05:52:23PM +0800, Wei Wang wrote:
>> On 05/07/2017 12:39 PM, Wang, Wei W wrote:
>>> On 05/06/2017 04:37 AM, Michael S. Tsirkin wrote:
>>>> On Fri, May 05, 2017 at 10:27:13AM +0800, Jason Wang wrote:
>>>>> On 2017年05月04日 18:58, Wang, Wei W wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I want to re-open the discussion left long time ago:
>>>>>> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
>>>>>> , and discuss the possibility of changing the hardcoded (256) TX
>>>>>> queue size to be configurable between 256 and 1024.
>>>>> Yes, I think we probably need this.
>>>>>
>>>>>> The reason to propose this request is that a severe issue of packet
>>>>>> drops in TX direction was observed with the existing hardcoded 256
>>>>>> queue size, which causes performance issues for packet drop
>>>>>> sensitive guest applications that cannot use indirect descriptor
>>>>>> tables. The issue goes away with 1K queue size.
>>>>> Do we need even more, what if we find 1K is even not sufficient in the
>>>>> future? Modern nics has size up to ~8192.
>>>>>
>>>>>> The concern mentioned in the previous discussion (please check the
>>>>>> link
>>>>>> above) is that the number of chained descriptors would exceed
>>>>>> UIO_MAXIOV (1024) supported by the Linux.
>>>>> We could try to address this limitation but probably need a new
>>>>> feature bit to allow more than UIO_MAXIOV sgs.
>>>> I'd say we should split the queue size and the sg size.
>>>>
>> I'm still doing some investigation about this, one question (or issue) I
>> found from the implementation is that the virtio-net device changes
>> the message layout when the vnet_hdr needs an endianness swap
>> (i.e. virtio_needs_swap()). This change adds one more iov to the
>> iov[]-s passed from the driver.
>>
>> To be more precise, the message from the driver could be in one
>> of the two following layout:
>> Layout1:
>> iov[0]: vnet_hdr + data
>>
>> Layout2:
>> iov[0]: vnet_hdr
>> iov[1]: data
>>
>> If the driver passes the message in Layout1, and the following code
>> from the device changes the message from Layout1 to Layout2:
>>
>> if (n->needs_vnet_hdr_swap) {
>>                  virtio_net_hdr_swap(vdev, (void *) &mhdr);
>>                  sg2[0].iov_base = &mhdr;
>>                  sg2[0].iov_len = n->guest_hdr_len;
>>                  out_num = iov_copy(&sg2[1], ARRAY_SIZE(sg2) - 1,
>>                                     out_sg, out_num,
>>                                     n->guest_hdr_len, -1);
>>                  if (out_num == VIRTQUEUE_MAX_SIZE) {
>>                      goto drop;
>>                  }
>>                  out_num += 1;
>>                  out_sg = sg2;
>>              }
>>
>> sg2[0] is the extra one, which potentially causes the off-by-one
>> issue. I didn't find other possibilities that can cause the issue.
>>
>> Could we keep the original layout by just copying the swapped
>> "mhdr" back to original out_sg[0].iov_base?
>>
>> Best,
>> Wei
> We can't because that data should be read-only by host.
>

OK. I just posted a patch to solve this issue. Please take a look at it
there. Thanks.

Best,
Wei


Thread overview: 14+ messages
2017-05-04 10:58 [Qemu-devel] virtio-net: configurable TX queue size Wang, Wei W
2017-05-05  2:27 ` Jason Wang
2017-05-05  5:53   ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-05-05  9:20     ` Jason Wang
2017-05-05 22:08       ` Michael S. Tsirkin
2017-05-07 12:02       ` [Qemu-devel] [virtio-dev] " Yan Vugenfirer
2017-05-08  1:23         ` Wei Wang
2017-05-05 20:36   ` [Qemu-devel] " Michael S. Tsirkin
2017-05-07  4:39     ` Wang, Wei W
2017-05-10  9:00       ` Jason Wang
2017-05-10  9:59         ` Wei Wang
2017-05-10  9:52       ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-05-10 20:07         ` Michael S. Tsirkin
2017-05-11  5:09           ` Wei Wang
