From: Maxime Coquelin
Subject: Re: [RFC PATCH] net/virtio: Align Virtio-net header on cache line in receive path
Date: Mon, 6 Mar 2017 15:11:28 +0100
Message-ID: <3fa785c1-7a7b-b13e-4bd0-ee34ed5985fe@redhat.com>
In-Reply-To: <20170306084649.GH18844@yliu-dev.sh.intel.com>
References: <20170221173243.20779-1-maxime.coquelin@redhat.com>
 <20170222013734.GJ18844@yliu-dev.sh.intel.com>
 <024ad979-8b54-ac33-54b4-5f8753b74d75@redhat.com>
 <20170223054954.GU18844@yliu-dev.sh.intel.com>
 <349f9a71-7407-e45a-4687-a54fe7e778c8@redhat.com>
 <20170306084649.GH18844@yliu-dev.sh.intel.com>
To: Yuanhan Liu
Cc: cunming.liang@intel.com, jianfeng.tan@intel.com, dev@dpdk.org,
 "Wang, Zhihong", "Yao, Lei A"
List-Id: DPDK patches and discussions

On 03/06/2017 09:46 AM, Yuanhan Liu wrote:
> On Wed, Mar 01, 2017 at 08:36:24AM +0100, Maxime Coquelin wrote:
>> On 02/23/2017 06:49 AM, Yuanhan Liu wrote:
>>> On Wed, Feb 22, 2017 at 10:36:36AM +0100, Maxime Coquelin wrote:
>>>> On 02/22/2017 02:37 AM, Yuanhan Liu wrote:
>>>>> On Tue, Feb 21, 2017 at 06:32:43PM +0100, Maxime Coquelin wrote:
>>>>>> This patch aligns the Virtio-net header on a cache-line boundary
>>>>>> to optimize cache utilization, as it puts the Virtio-net header
>>>>>> (which is always accessed) on the same cache line as the packet
>>>>>> header.
>>>>>>
>>>>>> For example, with an application that forwards packets at the L2
>>>>>> level, a single cache line will be accessed with this patch,
>>>>>> instead of two before.
>>>>>
>>>>> I'm assuming you were testing pkt size <= (64 - hdr_size)?
>>>>
>>>> No, I tested with 64-byte packets only.
>>>
>>> Oh, my bad, I overlooked it. While you were saying "a single cache
>>> line", I was thinking of putting the virtio net hdr and the "whole"
>>> packet data in a single cache line, which is not possible for a
>>> pkt size of 64B.
>>>
>>>> I ran some more tests this morning with different packet sizes,
>>>> and also with changing the mbuf size on the guest side to get
>>>> multi-buffer packets:
>>>>
>>>> +-------+--------+--------+-------------------------+
>>>> | Txpkt | Rxmbuf | v17.02 | v17.02 + vnet hdr align |
>>>> +-------+--------+--------+-------------------------+
>>>> |    64 |   2048 |  11.05 |                   11.78 |
>>>> |   128 |   2048 |  10.66 |                   11.48 |
>>>> |   256 |   2048 |  10.47 |                   11.21 |
>>>> |   512 |   2048 |  10.22 |                   10.88 |
>>>> |  1024 |   2048 |   7.65 |                    7.84 |
>>>> |  1500 |   2048 |   6.25 |                    6.45 |
>>>> |  2000 |   2048 |   5.31 |                    5.43 |
>>>> |  2048 |   2048 |   5.32 |                    4.25 |
>>>> |  1500 |    512 |   3.89 |                    3.98 |
>>>> |  2048 |    512 |   1.96 |                    2.02 |
>>>> +-------+--------+--------+-------------------------+
>>>
>>> Could you share more info, say, is it a PVP test? Is mergeable on?
>>> What's the fwd mode?
>>
>> No, this is not a PVP benchmark; I have neither another server nor
>> a packet generator connected back-to-back to my Haswell machine.
>>
>> This is a simple micro-benchmark: vhost PMD in txonly mode, Virtio
>> PMD in rxonly mode. In this configuration, mergeable is ON and no
>> offloads are disabled on the QEMU command line.
>
> Okay, I see. So the boost, as you have stated, comes from reducing
> two cache-line accesses to one. Before that, vhost writes 2 cache
> lines, while the virtio PMD reads 2 cache lines: one for reading the
> header, another one for reading the ether header, for updating xstats
> (there is no other ether access in the fwd mode you tested).
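
Right. To make the layout change concrete, here is a minimal
standalone sketch of the before/after offsets. This is illustrative
only, not the patch code: it assumes the default 128B headroom
(RTE_PKTMBUF_HEADROOM), the 12B mergeable virtio-net header, a 64B
cache line, and a cache-line-aligned buffer start (0x1000 is just a
made-up address):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE   64
#define HEADROOM     128   /* default RTE_PKTMBUF_HEADROOM */
#define VNET_HDR_LEN 12    /* sizeof(struct virtio_net_hdr_mrg_rxbuf) */

int main(void)
{
    uintptr_t buf = 0x1000;   /* mbuf data buffer, assumed line-aligned */

    /* v17.02 layout: packet data on the headroom boundary, header
     * just before it, so header and Ethernet header sit on two
     * different cache lines. */
    uintptr_t data = buf + HEADROOM;        /* 0x1080, line-aligned  */
    uintptr_t hdr  = data - VNET_HDR_LEN;   /* 0x1074, previous line */
    assert(hdr / CACHE_LINE != data / CACHE_LINE);

    /* aligned layout: header on the line boundary, packet data right
     * after it, so the header plus the first 52B of the packet share
     * a single cache line. */
    uintptr_t ahdr  = buf + HEADROOM;       /* 0x1080, line-aligned */
    uintptr_t adata = ahdr + VNET_HDR_LEN;  /* 0x108c, same line    */
    assert(ahdr / CACHE_LINE == adata / CACHE_LINE);

    printf("default: hdr in line %zu, pkt in line %zu\n",
           (size_t)(hdr / CACHE_LINE), (size_t)(data / CACHE_LINE));
    printf("aligned: hdr in line %zu, pkt in line %zu\n",
           (size_t)(ahdr / CACHE_LINE), (size_t)(adata / CACHE_LINE));
    return 0;
}

With the default layout the header at offset 116 and the packet at
offset 128 land on two different lines; with the aligned layout the
header starts on the boundary and the first 52 bytes of the packet
share its line, which is enough to cover the Ethernet header read
for xstats mentioned above.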

>> That's why I would be interested in more testing on recent hardware
>> with a PVP benchmark. Is this something that could be run in the
>> Intel lab?
>
> I think Yao Lei could help on that? But as stated, I think it may
> break the performance for big packets. And I also won't expect a big
> boost even for 64B in a PVP test, judging that it's only a 6% boost
> in micro-benchmarking.

That would be great.
Note that on SandyBridge, on which I see a perf drop with the
micro-benchmark, I get a 4% gain on the PVP benchmark.
So on recent hardware that shows a gain on the micro-benchmark, I'm
curious about the gain with the PVP benchmark.

Cheers,
Maxime