From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stephen Hemminger
Subject: Re: [ [PATCH v2] 01/13] virtio: Introduce config RTE_VIRTIO_INC_VECTOR
Date: Fri, 18 Dec 2015 09:33:42 -0800
Message-ID: <20151218093342.78fc5f72@xeon-e3>
References: <1450098032-21198-1-git-send-email-sshukla@mvista.com>
 <1450098032-21198-2-git-send-email-sshukla@mvista.com>
 <20151217152435.3c733ac1@xeon-e3>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: "dev@dpdk.org"
To: "Xie, Huawei"
Received: from mail-pf0-f171.google.com (mail-pf0-f171.google.com [209.85.192.171])
 by dpdk.org (Postfix) with ESMTP id B9FC75A86; Fri, 18 Dec 2015 18:33:35 +0100 (CET)
Received: by mail-pf0-f171.google.com with SMTP id u7so5560574pfb.1;
 Fri, 18 Dec 2015 09:33:35 -0800 (PST)
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On Fri, 18 Dec 2015 09:52:29 +0000
"Xie, Huawei" wrote:

> > low level SSE bit twiddling.
> Hi Stephen:
> We only did SSE twiddling to RX, which almost doubles the performance
> comparing to normal path in virtio/vhost performance test case. Indirect
> and any layout feature enabling are mostly for TX. We also did some
> optimization for single segment and non-offload case in TX, without
> using SSE, which also gives ~60% performance improvement, in Qian's
> result. My optimization is mostly for single segment and non-offload
> case, which i calls simple rx/tx.
> I plan to add virtio/vhost performance benchmark so that we could easily
> measure the performance difference for each patch.
>
> Indirect and any layout features are useful for multiple segment
> transmitted packet mbufs. I had acked your patch at the first time, and
> thought it is applied. I don't understand why you say it is ignored by
> Intel.

Sorry, did not mean to blame Intel, ... more that why didn't it get in 2.2?
It turns out any layout/indirect helps all transmits, not just multi-segment
ones, because each packet can then take a single tx descriptor rather than
multiple.
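
To make the descriptor arithmetic concrete, here is a rough sketch in C. It
is a simplified model, not code from the virtio PMD: the function name and
its parameters are made up for illustration, and it assumes that with any
layout the virtio-net header always fits in the headroom of the first
segment.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not a DPDK API): number of TX ring descriptors one
 * packet consumes in a simplified model of a virtio-net transmit queue.
 *
 * nb_segs    - number of mbuf segments in the packet (at least 1)
 * any_layout - assumed: header may share a buffer with the packet data
 * indirect   - assumed: the whole chain may sit in an indirect table
 */
static unsigned int
tx_slots_needed(unsigned int nb_segs, bool any_layout, bool indirect)
{
	assert(nb_segs >= 1);

	/* Indirect: one ring slot points at a side table that holds the
	 * header and every segment. */
	if (indirect)
		return 1;

	/* Any layout: the header is pushed in front of the first segment,
	 * so only the data segments need descriptors. */
	if (any_layout)
		return nb_segs;

	/* Neither feature: the header needs a descriptor of its own. */
	return nb_segs + 1;
}
```

In this model a single-segment packet costs two descriptors with neither
feature and one descriptor with either of them, which is why the gain is not
limited to multi-segment mbufs.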