From mboxrd@z Thu Jan  1 00:00:00 1970
From: =?UTF-8?Q?Mattias_R=c3=b6nnblom?= <mattias.ronnblom@ericsson.com>
Subject: Re: [PATCH 1/2] ring: synchronize the load and store of
 the tail
Date: Tue, 6 Nov 2018 12:03:25 +0100
Message-ID: <ff686db3-82e4-c3ae-0a61-1200da97faf6@ericsson.com>
References: <1537172244-64874-2-git-send-email-gavin.hu@arm.com>
 <1874944.OrACW1nkDZ@xps> <20181027150024.GA2294@jerin>
 <17713879.gC9jYcxDUo@xps>
 <HE1PR0701MB239463577089E46983B2371CE1C80@HE1PR0701MB2394.eurprd07.prod.outlook.com>
 <AM6PR08MB3672A936960FCEACC700AB1498CA0@AM6PR08MB3672.eurprd08.prod.outlook.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "Gavin Hu (Arm Technology China)" <Gavin.Hu@arm.com>,
 "dev@dpdk.org" <dev@dpdk.org>, "stable@dpdk.org" <stable@dpdk.org>,
 Ola Liljedahl <Ola.Liljedahl@arm.com>,
 "olivier.matz@6wind.com" <olivier.matz@6wind.com>,
 "chaozhu@linux.vnet.ibm.com" <chaozhu@linux.vnet.ibm.com>,
 "bruce.richardson@intel.com" <bruce.richardson@intel.com>,
 "konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
 nd <nd@arm.com>
To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
 Thomas Monjalon <thomas@monjalon.net>,
 Jerin Jacob <jerin.jacob@caviumnetworks.com>
Return-path: <dev-bounces@dpdk.org>
In-Reply-To: <AM6PR08MB3672A936960FCEACC700AB1498CA0@AM6PR08MB3672.eurprd08.prod.outlook.com>
Content-Language: en-US
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

On 2018-11-05 22:51, Honnappa Nagarahalli wrote:
>> I've also run an out-of-tree DSW throughput benchmark, and I've found that
>> going from non-C11 to C11 gives a 4% slowdown. After this patch, the
>> slowdown is only 2.8%.
> This is interesting. The general understanding seems to be that C11 atomics should not add any instructions on x86, but we still see some drop in performance. Can this be attributed to the compiler not being allowed to reorder?
> 

I was too lazy to disassemble the generated code, so I don't know.

I would suggest non-C11 mode stays as the default on x86_64.
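
For anyone who wants to look at the disassembly, the two styles of tail load
under discussion can be reduced to roughly the sketch below. This is not the
rte_ring code: struct demo_ring, demo_compiler_barrier() and the two load
functions are hypothetical names made up for illustration, and the compiler
barrier only approximates what the non-C11 path does on x86_64. It assumes
GCC or Clang, since it relies on the __atomic builtins and GNU inline asm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a ring's tail index; not the DPDK struct. */
struct demo_ring {
	volatile uint32_t tail;
};

/* Compiler-only barrier: emits no instruction, but stops the compiler
 * from moving memory accesses across it (assumed to approximate the
 * read barrier used by the non-C11 path on x86_64). */
#define demo_compiler_barrier() __asm__ volatile("" ::: "memory")

/* Non-C11 style: plain volatile load followed by a compiler barrier. */
static uint32_t load_tail_legacy(struct demo_ring *r)
{
	uint32_t t = r->tail;

	demo_compiler_barrier();
	return t;
}

/* C11 style: load-acquire through the GCC/Clang builtin. */
static uint32_t load_tail_c11(struct demo_ring *r)
{
	return __atomic_load_n(&r->tail, __ATOMIC_ACQUIRE);
}

/* Both variants must observe the same value; only code generation
 * around the load may differ. */
static void demo_self_check(void)
{
	struct demo_ring r = { .tail = 0 };

	__atomic_store_n(&r.tail, 42, __ATOMIC_RELEASE);
	assert(load_tail_legacy(&r) == 42);
	assert(load_tail_c11(&r) == 42);
}
```

Building this with "gcc -O2 -S" and comparing the two functions, and more
importantly the code the compiler places around their inlined call sites,
would show whether the difference comes from restricted reordering. I have
not done that comparison, so the sketch is only a starting point.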