From: Yuanhan Liu
Subject: Re: [PATCH v3 0/5] vhost: optimize enqueue
Date: Wed, 12 Oct 2016 10:53:07 +0800
Message-ID: <20161012025307.GG16751@yliu-dev.sh.intel.com>
To: "Wang, Zhihong", Jianbo Liu
Cc: Maxime Coquelin, "dev@dpdk.org", Thomas Monjalon

On Thu, Sep 22, 2016 at 01:47:45PM +0800, Jianbo Liu wrote:
> On 22 September 2016 at 10:29, Yuanhan Liu wrote:
> > On Wed, Sep 21, 2016 at 08:54:11PM +0800, Jianbo Liu wrote:
> >> >> > My setup consists of one host running a guest.
> >> >> > The guest generates as much 64bytes packets as possible using
> >> >>
> >> >> Have you tested with other packet sizes?
> >> >> My testing shows that performance is dropping when packet size is
> >> >> more than 256.
> >> >
> >> > Hi Jianbo,
> >> >
> >> > Thanks for reporting this.
> >> >
> >> > 1. Are you running the vector frontend with mrg_rxbuf=off?
> >
> Yes, my testing is with mrg_rxbuf=off, but not with the vector frontend PMD.
>
> >> > 2. Could you please specify what CPU you're running? Is it Haswell
> >> >    or Ivy Bridge?
> >
> It's an ARM server.
>
> >> > 3. What percentage of drop are you seeing?
>
> The testing result:
>
>   size (bytes)    improvement (%)
>     64              3.92
>    128             11.51
>    256             24.16
>    512            -13.79
>   1024            -22.51
>   1500            -12.22
>
> A correction: performance drops when the packet size is larger than 512.

I have thought about this twice. Unfortunately, I think I have to NACK
this series.

Merging the two code paths into one is really good: as you stated, it
improves maintainability. But that holds only if we see no performance
regression on either path after the refactoring. Unfortunately, that's
not the case here: it hurts the performance of one code path (non-mrg
Rx). That makes me think we should not do the code path merge at all.
I think that also aligns with what you said before (internally): we
could do the merge if it gives comparable performance before and after.

Besides that, I don't quite like the approach you took in patch 2
(rewrite enqueue): you made a lot of changes in one patch. That means
if something goes wrong, it is hard to narrow down which change
introduced the regression. Sadly, that's exactly what happened here:
weeks have passed, and I see no progress. That's why we like the rule
of "one patch does only one thing, an atomic thing".

So I will apply the first patch (it's a bug-fix patch) and ask you to
refactor the rest, without the code path merge. I think we could still
have a maintainable code base if we introduce more common helper
functions that can be used on both Rx paths, or even on the Tx path
(such as update_used_ring, or shadow_used_ring); see the rough sketch
at the end of this mail.

It's a bit late for big changes in v16.11. I think you could just
apply patch 6 (vhost: optimize cache access) to the old mrg-Rx code
path, if that also helps the performance there?
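To make the helper idea a bit more concrete, here is a rough sketch of
what a shared update_used_ring() could look like. This is only my
illustration: the struct layout below just follows the virtio spec's
used-ring format, and the names and fields are assumptions, not actual
patch content.

#include <stdint.h>

/* Minimal used-ring layout, per the virtio spec. */
struct vring_used_elem {
	uint32_t id;	/* head index of the used descriptor chain */
	uint32_t len;	/* total bytes written to that chain */
};

struct vring_used {
	uint16_t flags;
	uint16_t idx;
	struct vring_used_elem ring[];
};

struct vhost_virtqueue {
	struct vring_used *used;
	uint16_t size;		/* ring size (assumed power of two) */
	uint16_t last_used_idx;
};

/*
 * A helper both the mrg and non-mrg Rx paths could share: fill in
 * one used-ring entry. A shadow used ring takes this a step further:
 * buffer such entries in a local array, then copy them to the
 * guest-visible ring and bump used->idx (behind a write barrier)
 * only once per burst, so the hot loop touches guest-shared memory
 * less often.
 */
static inline void
update_used_ring(struct vhost_virtqueue *vq, uint32_t desc_idx,
		 uint32_t len)
{
	uint16_t used_idx = vq->last_used_idx++ & (vq->size - 1);

	vq->used->ring[used_idx].id  = desc_idx;
	vq->used->ring[used_idx].len = len;
}

With small shared pieces like that, each refactoring step can stay in
its own patch, which keeps any regression easy to bisect.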
Let us handle the rest, such as the shadow used ring, in the next
release.

Thanks.

	--yliu