From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jason Wang
Subject: Re: [PATCH 0/5] VSOCK: support mergeable rx buffer in vhost-vsock
Date: Tue, 6 Nov 2018 11:32:18 +0800
Message-ID: <229559d5-1787-09b1-6c26-57b535f20006@redhat.com>
References: <5BDFF49C.3040603@huawei.com> <5BE0F9C9.2080003@huawei.com> <5BE107B5.2050900@huawei.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Cc: netdev@vger.kernel.org, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
To: jiangyiwen, stefanha@redhat.com
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:37400 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727481AbeKFMza (ORCPT ); Tue, 6 Nov 2018 07:55:30 -0500
In-Reply-To: <5BE107B5.2050900@huawei.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

On 2018/11/6 11:17 AM, jiangyiwen wrote:
> On 2018/11/6 10:41, Jason Wang wrote:
>> On 2018/11/6 10:17 AM, jiangyiwen wrote:
>>> On 2018/11/5 17:21, Jason Wang wrote:
>>>> On 2018/11/5 3:43 PM, jiangyiwen wrote:
>>>>> Now vsock only supports sending/receiving small packets, so it can't achieve
>>>>> high performance. As previously discussed with Jason Wang, I revisited the
>>>>> idea of mergeable rx buffers from vhost-net and implemented mergeable
>>>>> rx buffers in vhost-vsock. This allows a big packet to be scattered
>>>>> into different buffers and improves performance significantly.
>>>>>
>>>>> I wrote a tool to test vhost-vsock performance, mainly sending big
>>>>> packets (64K) in both directions, Guest->Host and Host->Guest.
>>>>> The results are as follows:
>>>>>
>>>>> Before performance:
>>>>>               Single socket    Multiple sockets (Max Bandwidth)
>>>>> Guest->Host   ~400MB/s         ~480MB/s
>>>>> Host->Guest   ~1450MB/s        ~1600MB/s
>>>>>
>>>>> After performance:
>>>>>               Single socket    Multiple sockets (Max Bandwidth)
>>>>> Guest->Host   ~1700MB/s        ~2900MB/s
>>>>> Host->Guest   ~1700MB/s        ~2900MB/s
>>>>>
>>>>> From the test results, the performance is improved significantly, and guest
>>>>> memory will not be wasted.
>>>> Hi:
>>>>
>>>> Thanks for the patches and the numbers are really impressive.
>>>>
>>>> But instead of duplicating code between sock and net, I was considering using virtio-net as a transport for vsock. Then we may have all the existing features like batching, mergeable rx buffers and multiqueue. Want to consider this idea? Thoughts?
>>>>
>>>>
>>> Hi Jason,
>>>
>>> I am not very familiar with virtio-net, so I am afraid I can't give too
>>> much effective advice. I have several questions:
>>>
>>> 1. If we use virtio-net as a transport, the guest should see a virtio-net
>>>    device instead of a virtio-vsock device, right? Is vsock only a
>>>    transport between socket and net_device? Users should still use
>>>    the AF_VSOCK type to create a socket, right?
>>
>> Well, there are many choices. What you need is just to keep the socket API and hide the implementation. For example, you can keep the vsock device in the guest and switch to using vhost-net in the host. We probably need a new feature bit or header to let vhost know we are passing a vsock packet. And vhost-net could forward the packet to the vsock core on the host.
>>
>>
>>> 2. I want to know if this idea has already started, and how is
>>>    the current progress?
>>
>> Not yet started. Just want to hear from the community. If this sounds good, do you have interest in implementing this?
>>
>>
>>> 3. And what is Stefan's idea?
>>
>> Talked with Stefan a little about this during KVM Forum. I think he tends to agree with this idea. Anyway, let's wait for his reply.
>>
>>
>> Thanks
>>
>>
> Hi Jason,
>
> Thanks for your reply. What you want is to avoid duplicate code and still
> use the existing features of virtio-net.

Yes, technically we can use the virtio-net driver in the guest as well, but we could do it step by step.

> Yes, if this sounds good and most people can accept this idea, I am very
> happy to implement this.

Cool, thanks.

>
> In addition, I hope you can review these patches before the new idea is
> implemented, after all the performance can be improved. :-)

Ok.

So the patch actually did three things:

- mergeable buffer implementation
- increase the default rx buffer size
- add used and signal guest in a batch

It would be helpful if you can measure the performance improvement independently. This can give reviewers a better understanding of how much each part helped.

Thanks

>
> Thanks,
> Yiwen.
>
>>> Thanks,
>>> Yiwen.
>>>
>> .
>>
>
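
For readers unfamiliar with the first item in the list above, here is a minimal user-space sketch of the mergeable rx buffer idea: one large packet is scattered across several fixed-size receive buffers, and a header records how many buffers it spans. This is not code from the patch series or from vhost. `RX_BUF_SIZE`, `struct pkt_hdr` and `scatter_packet` are hypothetical names, and the 4 KiB buffer size is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RX_BUF_SIZE 4096u   /* assumed size of each posted rx buffer */

/* Hypothetical header: tells the receiver how many buffers were merged. */
struct pkt_hdr {
    uint32_t len;           /* total payload length in bytes */
    uint16_t num_buffers;   /* number of rx buffers the payload spans */
};

/*
 * Copy `len` bytes of payload into consecutive rx buffers of RX_BUF_SIZE
 * bytes each and fill in the header. Returns the number of buffers used,
 * or 0 if the payload is empty or does not fit in the `nbufs` buffers
 * that are available.
 */
static uint16_t scatter_packet(const uint8_t *payload, size_t len,
                               uint8_t bufs[][RX_BUF_SIZE], size_t nbufs,
                               struct pkt_hdr *hdr)
{
    assert(hdr != NULL);

    /* Round up: a partially filled last buffer still counts as one. */
    size_t needed = (len + RX_BUF_SIZE - 1) / RX_BUF_SIZE;

    if (needed == 0 || needed > nbufs ||
        needed > UINT16_MAX || len > UINT32_MAX)
        return 0;

    size_t off = 0;
    for (size_t i = 0; i < needed; i++) {
        size_t left = len - off;
        size_t chunk = left < RX_BUF_SIZE ? left : RX_BUF_SIZE;

        memcpy(bufs[i], payload + off, chunk);
        off += chunk;
    }

    hdr->len = (uint32_t)len;
    hdr->num_buffers = (uint16_t)needed;
    return hdr->num_buffers;
}
```

Under these assumptions, a 64K payload like the one used in the test tool spans 16 buffers of 4 KiB, while a small packet occupies a single buffer, so the receiver does not need to post one large buffer per packet.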