From mboxrd@z Thu Jan  1 00:00:00 1970
From: Thomas Monjalon <thomas.monjalon@6wind.com>
Subject: Re: [PATCH v3 0/5] vhost: optimize enqueue
Date: Fri, 23 Sep 2016 15:41:08 +0200
Message-ID: <1536480.IYe8r5XoNN@xps13>
References: <1471319402-112998-1-git-send-email-zhihong.wang@intel.com>
 <8F6C2BD409508844A0EFC19955BE09414E7B6204@SHSMSX103.ccr.corp.intel.com>
 <CAP4Qi3_DxAnvs0jX1P=G_PiLnRRbP5Wty-eU-OPE_81RGCAuTA@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7Bit
Cc: dev@dpdk.org, "Wang, Zhihong" <zhihong.wang@intel.com>,
 Yuanhan Liu <yuanhan.liu@linux.intel.com>,
 Maxime Coquelin <maxime.coquelin@redhat.com>
To: Jianbo Liu <jianbo.liu@linaro.org>
Return-path: <dev-bounces@dpdk.org>
Received: from mail-wm0-f41.google.com (mail-wm0-f41.google.com [74.125.82.41])
 by dpdk.org (Postfix) with ESMTP id C713068F5
 for <dev@dpdk.org>; Fri, 23 Sep 2016 15:41:11 +0200 (CEST)
Received: by mail-wm0-f41.google.com with SMTP id 197so5876352wmk.1
 for <dev@dpdk.org>; Fri, 23 Sep 2016 06:41:11 -0700 (PDT)
In-Reply-To: <CAP4Qi3_DxAnvs0jX1P=G_PiLnRRbP5Wty-eU-OPE_81RGCAuTA@mail.gmail.com>
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

2016-09-23 18:41, Jianbo Liu:
> On 23 September 2016 at 10:56, Wang, Zhihong <zhihong.wang@intel.com> wrote:
> .....
> > This is expected because the 2nd patch is just a baseline and all optimization
> > patches are organized in the rest of this patch set.
> >
> > I think you can do bottleneck analysis on ARM to see what's slowing down the
> > perf, there might be some micro-arch complications there, most likely in
> > memcpy.
> >
> > Do you use glibc's memcpy? I suggest hand-crafting your own.
> >
> > Could you publish the mrg_rxbuf=on data also? Since it's more widely used
> > in terms of spec integrity.
> >
> I don't think it will be helpful for you, considering the differences
> between x86 and arm.
> So please move on with this patchset...

Jianbo,
I don't understand.
You said that the 2nd patch is a regression:
-       volatile uint16_t       last_used_idx;
+       uint16_t                last_used_idx;

And the overall series leads to a performance regression
for packets > 512 B, right?
But we don't know whether you have tested v6 or not.

Zhihong mentioned possible improvements in rte_memcpy.
On ARM64, rte_memcpy uses the libc memcpy.

Now you seem to be giving up.
Does that mean you accept having a regression in the 16.11 release?
Are you working on rte_memcpy?