From: Yuanhan Liu
Subject: Re: [PATCH v3 0/5] vhost: optimize enqueue
Date: Wed, 12 Oct 2016 10:53:07 +0800
Message-ID: <20161012025307.GG16751@yliu-dev.sh.intel.com>
To: "Wang, Zhihong", Jianbo Liu
Cc: Maxime Coquelin, "dev@dpdk.org", Thomas Monjalon

On Thu, Sep 22, 2016 at 01:47:45PM +0800, Jianbo Liu wrote:
> On 22 September 2016 at 10:29, Yuanhan Liu wrote:
> > On Wed, Sep 21, 2016 at 08:54:11PM +0800, Jianbo Liu wrote:
> >> >> > My setup consists of one host running a guest.
> >> >> > The guest generates as much 64bytes packets as possible using
> >> >>
> >> >> Have you tested with other packet sizes?
> >> >> My testing shows that performance is dropping when packet size is
> >> >> more than 256.
> >> >
> >> > Hi Jianbo,
> >> >
> >> > Thanks for reporting this.
> >> >
> >> > 1. Are you running the vector frontend with mrg_rxbuf=off?
> >
> Yes, my testing is with mrg_rxbuf=off, but not with the vector frontend PMD.
>
> >> > 2. Could you please specify what CPU you're running? Is it Haswell
> >> >    or Ivy Bridge?
> >
> It's an ARM server.
>
> >> > 3. What percentage of drop are you seeing?
>
> The testing result:
>
>   size (bytes)    improvement (%)
>     64              3.92
>    128             11.51
>    256             24.16
>    512            -13.79
>   1024            -22.51
>   1500            -12.22
>
> A correction: performance drops when the packet size is larger than 512.

I have thought about this twice. Unfortunately, I think I have to NACK
this series.

Merging the two code paths into one is really good: as you stated, it
improves maintainability. But that holds only if we see no performance
regression on either path after the refactoring. Unfortunately, that's
not the case here: it hurts the performance of one code path (non-mrg
Rx). That makes me think we should not do the code path merge at all.
I think that also aligns with what you said before (internally): we
could do the merge if it gives comparable performance before and after.

Besides that, I don't quite like the approach you took in patch 2
(rewrite enqueue): you made a lot of changes in one patch. That means
if something goes wrong, it is hard to narrow down which change
introduced the regression. Sadly, that's exactly what happened here:
weeks have passed, and I see no progress. That's why we like the rule
of "one patch does only one thing, an atomic thing".

So I will apply the first patch (it's a bug-fix patch) and ask you to
refactor the rest, without the code path merge. I think we could still
have a maintainable code base if we introduce more common helper
functions that can be used on both Rx paths, or even on the Tx path
(such as update_used_ring, or shadow_used_ring); see the rough sketch
at the end of this mail.

It's a bit late for big changes in v16.11. I think you could just
apply patch 6 (vhost: optimize cache access) to the old mrg-Rx code
path, if that also helps the performance there?
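To make the helper idea a bit more concrete, here is a rough sketch of
what a shared update_used_ring() could look like. This is only my
illustration: the struct layout below just follows the virtio spec's
used-ring format, and the names and fields are assumptions, not actual
patch content.

#include <stdint.h>

/* Minimal used-ring layout, per the virtio spec. */
struct vring_used_elem {
	uint32_t id;	/* head index of the used descriptor chain */
	uint32_t len;	/* total bytes written to that chain */
};

struct vring_used {
	uint16_t flags;
	uint16_t idx;
	struct vring_used_elem ring[];
};

struct vhost_virtqueue {
	struct vring_used *used;
	uint16_t size;		/* ring size (assumed power of two) */
	uint16_t last_used_idx;
};

/*
 * A helper both the mrg and non-mrg Rx paths could share: fill in
 * one used-ring entry. A shadow used ring takes this a step further:
 * buffer such entries in a local array, then copy them to the
 * guest-visible ring and bump used->idx (behind a write barrier)
 * only once per burst, so the hot loop touches guest-shared memory
 * less often.
 */
static inline void
update_used_ring(struct vhost_virtqueue *vq, uint32_t desc_idx,
		 uint32_t len)
{
	uint16_t used_idx = vq->last_used_idx++ & (vq->size - 1);

	vq->used->ring[used_idx].id  = desc_idx;
	vq->used->ring[used_idx].len = len;
}

With small shared pieces like that, each refactoring step can stay in
its own patch, which keeps any regression easy to bisect.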
Let us handle the rest, such as the shadow used ring, in the next
release.

Thanks.

	--yliu