From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jianbo Liu <jianbo.liu@linaro.org>
Subject: Re: [PATCH v3 0/5] vhost: optimize enqueue
Date: Mon, 26 Sep 2016 13:12:46 +0800
Message-ID: <CAP4Qi39FfeA_B1-5D-5wzXK_S2wJXGUHp9S+w2Z_BtoedAfrfQ@mail.gmail.com>
References: <1471319402-112998-1-git-send-email-zhihong.wang@intel.com>
 <8F6C2BD409508844A0EFC19955BE09414E7B6204@SHSMSX103.ccr.corp.intel.com>
 <CAP4Qi3_DxAnvs0jX1P=G_PiLnRRbP5Wty-eU-OPE_81RGCAuTA@mail.gmail.com>
 <1536480.IYe8r5XoNN@xps13>
 <8F6C2BD409508844A0EFC19955BE09414E7B6EA6@SHSMSX103.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: Thomas Monjalon <thomas.monjalon@6wind.com>, "dev@dpdk.org" <dev@dpdk.org>,
 Yuanhan Liu <yuanhan.liu@linux.intel.com>,
 Maxime Coquelin <maxime.coquelin@redhat.com>
To: "Wang, Zhihong" <zhihong.wang@intel.com>
Return-path: <dev-bounces@dpdk.org>
Received: from mail-yb0-f179.google.com (mail-yb0-f179.google.com
 [209.85.213.179]) by dpdk.org (Postfix) with ESMTP id CB29329CF
 for <dev@dpdk.org>; Mon, 26 Sep 2016 07:12:47 +0200 (CEST)
Received: by mail-yb0-f179.google.com with SMTP id i83so7047285ybi.3
 for <dev@dpdk.org>; Sun, 25 Sep 2016 22:12:47 -0700 (PDT)
In-Reply-To: <8F6C2BD409508844A0EFC19955BE09414E7B6EA6@SHSMSX103.ccr.corp.intel.com>

On 25 September 2016 at 13:41, Wang, Zhihong <zhihong.wang@intel.com> wrote:
>
>
>> -----Original Message-----
>> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
>> Sent: Friday, September 23, 2016 9:41 PM
>> To: Jianbo Liu <jianbo.liu@linaro.org>
>> Cc: dev@dpdk.org; Wang, Zhihong <zhihong.wang@intel.com>; Yuanhan Liu
>> <yuanhan.liu@linux.intel.com>; Maxime Coquelin
>> <maxime.coquelin@redhat.com>
....
> This patch does help on ARM for small packets such as 64B ones,
> which actually shows the similarity between x86 and ARM in terms
> of the caching optimization in this patch.
>
> My estimation is based on:
>
>  1. The last patch is for mrg_rxbuf=on, and since you said it helps
>     perf, we can ignore it for now when we discuss mrg_rxbuf=off
>
>  2. Vhost enqueue perf =
>     Ring overhead + Virtio header overhead + Data memcpy overhead
>
>  3. This patch helps small packets traffic, which means it helps
>     ring + virtio header operations
>
>  4. So, when you say perf drops when the packet size is larger than
>     512B, this is most likely caused by memcpy on ARM not working
>     well with this patch
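
To make the model in points 2 to 4 concrete, here is a minimal sketch in C.
It assumes the ring and virtio header costs are fixed per packet and that
the memcpy cost grows linearly with packet length. The struct, the function
names and any constants fed into it are hypothetical and for illustration
only; they are not measurements from x86 or ARM and not code from the patch.

```c
#include <assert.h>

/* Hypothetical per-packet cost model, in arbitrary cycle units.
 * Field names are illustrative assumptions, not taken from the vhost code. */
struct enqueue_cost {
    double ring;          /* fixed ring (descriptor/avail/used) overhead */
    double hdr;           /* fixed virtio header overhead */
    double copy_per_byte; /* assumed linear memcpy cost per payload byte */
};

/* Point 2: enqueue cost = ring overhead + header overhead + data memcpy. */
static double enqueue_cycles(const struct enqueue_cost *c, unsigned pkt_len)
{
    assert(c != 0);
    return c->ring + c->hdr + c->copy_per_byte * (double)pkt_len;
}

/* Fraction of the per-packet cost spent in memcpy. Under this model it is
 * small for short packets and approaches 1 as the packet length grows. */
static double copy_share(const struct enqueue_cost *c, unsigned pkt_len)
{
    return c->copy_per_byte * (double)pkt_len / enqueue_cycles(c, pkt_len);
}
```

With made-up constants such as ring = 60, hdr = 20 and copy_per_byte = 0.25,
the fixed part dominates at 64B and the copy dominates at 1024B, which is the
shape of the argument above: a gain for small packets points at ring + header
work, while a drop for large packets points at the copy.
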
>
> I'm not saying glibc's memcpy is not good enough; it's just that
> this is a rather special use case. And since we see that specialized
> memcpy + this patch gives significantly better performance than
> other combinations on x86, we suggest hand-crafting a specialized
> memcpy for it.
>
> Of course on ARM this is still just my speculation, and we need to
> either prove it or find the actual root cause.
>
> It would be **REALLY HELPFUL** if you could test this patch on
> ARM for the mrg_rxbuf=on case to see whether it is in fact helpful
> on ARM at all, since mrg_rxbuf=on is the more widely used case.
>
Actually it's worse than mrg_rxbuf=off.