From mboxrd@z Thu Jan 1 00:00:00 1970
From: Maxime Coquelin
Subject: Re: [PATCH v6 0/6] vhost: optimize enqueue
Date: Wed, 21 Sep 2016 06:39:50 +0200
Message-ID: <48715e6b-5f7e-0781-f794-07c5dd29e15a@redhat.com>
References: <1474336817-22683-1-git-send-email-zhihong.wang@intel.com>
 <20160921022656.GA23158@yliu-dev.sh.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Zhihong Wang, dev@dpdk.org, thomas.monjalon@6wind.com
To: Yuanhan Liu
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by dpdk.org (Postfix) with ESMTP id 804AC29D1;
 Wed, 21 Sep 2016 06:39:54 +0200 (CEST)
In-Reply-To: <20160921022656.GA23158@yliu-dev.sh.intel.com>
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Hi Yuanhan,

On 09/21/2016 04:26 AM, Yuanhan Liu wrote:
> Hi Maxime,
>
> Do you have more comments about this set? If no, I think I could merge
> it shortly.

No more comments, this is good to me.

Feel free to add:
Reviewed-by: Maxime Coquelin

Thanks,
Maxime

> Thanks.
>
> --yliu
>
> On Mon, Sep 19, 2016 at 10:00:11PM -0400, Zhihong Wang wrote:
>> This patch set optimizes the vhost enqueue function.
>>
>> It implements the vhost logic from scratch into a single function designed
>> for high performance and good maintainability, and improves CPU efficiency
>> significantly by optimizing cache access, which means:
>>
>> * Higher maximum throughput can be achieved for fast frontends like DPDK
>> virtio pmd.
>>
>> * Better scalability can be achieved that each vhost core can support
>> more connections because it takes less cycles to handle each single
>> frontend.
>>
>> This patch set contains:
>>
>> 1. A Windows VM compatibility fix for vhost enqueue in 16.07 release.
>>
>> 2. A baseline patch to rewrite the vhost logic.
>>
>> 3. A series of optimization patches added upon the baseline.
>>
>> The main optimization techniques are:
>>
>> 1. Reorder code to reduce CPU pipeline stall cycles.
>>
>> 2. Batch update the used ring for better efficiency.
>>
>> 3. Prefetch descriptor to hide cache latency.
>>
>> 4. Remove useless volatile attribute to allow compiler optimization.
>>
>> Code reordering and batch used ring update bring most of the performance
>> improvements.
>>
>> In the existing code there're 2 callbacks for vhost enqueue:
>>
>> * virtio_dev_merge_rx for mrg_rxbuf turned on cases.
>>
>> * virtio_dev_rx for mrg_rxbuf turned off cases.
>>
>> The performance of the existing code is not optimal, especially when the
>> mrg_rxbuf feature turned on. Besides, having 2 callback paths increases
>> maintenance efforts.
>>
>> Also, there's a compatibility issue in the existing code which causes
>> Windows VM to hang when the mrg_rxbuf feature turned on.
>>
>> ---
>> Changes in v6:
>>
>> 1. Merge duplicated code.
>>
>> 2. Introduce a function for used ring write.
>>
>> 3. Add necessary comments.
>>
>> ---
>> Changes in v5:
>>
>> 1. Rebase to dpdk-next-virtio master.
>>
>> 2. Rename variables to keep consistent in naming style.
>>
>> 3. Small changes like return value adjustment and vertical alignment.
>>
>> 4. Add details in commit log.
>>
>> ---
>> Changes in v4:
>>
>> 1. Fix a Windows VM compatibility issue.
>>
>> 2. Free shadow used ring in the right place.
>>
>> 3. Add failure check for shadow used ring malloc.
>>
>> 4. Refactor the code for clearer logic.
>>
>> 5. Add PRINT_PACKET for debugging.
>>
>> ---
>> Changes in v3:
>>
>> 1. Remove unnecessary memset which causes frontend stall on SNB & IVB.
>>
>> 2. Rename variables to follow naming convention.
>>
>> 3. Rewrite enqueue and delete the obsolete in the same patch.
>>
>> ---
>> Changes in v2:
>>
>> 1. Split the big function into several small ones.
>>
>> 2. Use multiple patches to explain each optimization.
>>
>> 3. Add comments.
>>
>> Zhihong Wang (6):
>>   vhost: fix windows vm hang
>>   vhost: rewrite enqueue
>>   vhost: remove useless volatile
>>   vhost: add desc prefetch
>>   vhost: batch update used ring
>>   vhost: optimize cache access
>>
>>  lib/librte_vhost/vhost.c      |  20 +-
>>  lib/librte_vhost/vhost.h      |   6 +-
>>  lib/librte_vhost/vhost_user.c |  31 ++-
>>  lib/librte_vhost/virtio_net.c | 541 ++++++++++++++----------------------------
>>  4 files changed, 225 insertions(+), 373 deletions(-)
>>
>> --
>> 2.7.4
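
For readers unfamiliar with the "batch update the used ring" technique listed in the cover letter, the idea can be sketched roughly as below. This is only an illustration, not the code from the patch set: the struct layouts, the names (used_elem, used_ring, shadow, shadow_add, shadow_flush) and the ring size are all hypothetical, and the C11 release fence stands in for whatever barrier the real code uses. Completed entries are staged in a private shadow array, copied to the shared ring in at most two memcpy calls, and the index seen by the frontend is published once per batch instead of once per packet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u  /* hypothetical size; must be a power of two */

/* Simplified stand-ins for the shared used ring (not the real layout). */
struct used_elem { uint32_t id; uint32_t len; };
struct used_ring {
    uint16_t idx;                      /* free-running index read by the frontend */
    struct used_elem ring[RING_SIZE];
};

/* Private staging area, never touched by the frontend. */
struct shadow {
    struct used_elem elems[RING_SIZE];
    uint16_t count;
};

/* Stage one completed descriptor chain; returns -1 if the shadow is full. */
static int shadow_add(struct shadow *s, uint32_t id, uint32_t len)
{
    if (s->count >= RING_SIZE)
        return -1;
    s->elems[s->count].id = id;
    s->elems[s->count].len = len;
    s->count++;
    return 0;
}

/* Copy all staged entries to the shared ring, then publish idx once. */
static void shadow_flush(struct shadow *s, struct used_ring *u)
{
    assert(s->count <= RING_SIZE);

    uint16_t start = (uint16_t)(u->idx & (RING_SIZE - 1));
    uint16_t first = s->count;

    /* Split the copy in two if the batch wraps past the end of the ring. */
    if (start + first > RING_SIZE)
        first = (uint16_t)(RING_SIZE - start);

    memcpy(&u->ring[start], &s->elems[0], first * sizeof(s->elems[0]));
    memcpy(&u->ring[0], &s->elems[first],
           (size_t)(s->count - first) * sizeof(s->elems[0]));

    /* Make the entries visible before the index that announces them. */
    atomic_thread_fence(memory_order_release);
    u->idx = (uint16_t)(u->idx + s->count);
    s->count = 0;
}
```

A caller would invoke shadow_add() once per enqueued packet and shadow_flush() once at the end of the burst, so the cache line holding the shared index is written once per burst rather than once per packet.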