From: "Phil Yang (Arm Technology China)" <Phil.Yang@arm.com>
To: "Hunt, David" <david.hunt@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"thomas@monjalon.net" <thomas@monjalon.net>
Cc: "reshma.pattan@intel.com" <reshma.pattan@intel.com>,
	"Gavin Hu (Arm Technology China)" <Gavin.Hu@arm.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	nd <nd@arm.com>, nd <nd@arm.com>
Subject: Re: [dpdk-dev] [PATCH v4 2/3] test/distributor: replace sync builtins with atomic builtins
Date: Thu, 11 Apr 2019 11:31:40 +0000
Message-ID: <DB7PR08MB3385B398972C39D3A99C58A5E92F0@DB7PR08MB3385.eurprd08.prod.outlook.com>
In-Reply-To: <85718290-9c5a-01f5-21ab-0f1000e4dfde@intel.com>

> -----Original Message-----
> From: Hunt, David <david.hunt@intel.com>
> Sent: Wednesday, April 10, 2019 10:06 PM
> To: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>; dev@dpdk.org;
> thomas@monjalon.net
> Cc: reshma.pattan@intel.com; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> Subject: Re: [PATCH v4 2/3] test/distributor: replace sync builtins with atomic
> builtins
> 
> Hi Phil,
> 
> On 8/4/2019 4:02 AM, Phil Yang wrote:
> > The '__sync' built-in functions are deprecated; the '__atomic'
> > built-ins should be used instead. The sync built-in functions are full
> > barriers, while the atomic built-in functions offer less restrictive
> > one-way barriers, which helps performance.
> >
> > Here is the example test result on TX2:
> > sudo ./arm64-armv8a-linuxapp-gcc/app/test -l 112-139 \
> > -n 4 --socket-mem=1024,1024 -- -i
> > RTE>>distributor_perf_autotest
> >
> > *** distributor_perf_autotest without this patch ***
> > ==== Cache line switch test ===
> > Time for 33554432 iterations = 1519202730 ticks
> > Ticks per iteration = 45
> >
> > *** distributor_perf_autotest with this patch ***
> > ==== Cache line switch test ===
> > Time for 33554432 iterations = 1251715496 ticks
> > Ticks per iteration = 37
> >
> > Fewer ticks are needed for the cache line switch test, which is a
> > performance improvement of about 17%.
> 
> 
Hi, Dave

Thanks for your input.

> I'm seeing about an 8% performance degradation on my platform for the

I tested this patch on our x86 server (E5-2640 v3 @ 2.60GHz) over several rounds, but I did not find any performance degradation. Please check the test results below.
$ sudo  ./x86_64-native-linuxapp-gcc/app/test -l 8-15 -n 4 --socket-mem=1024,1024 -- -i
RTE>>distributor_perf_autotest

####  without this patch ####
==== Cache line switch test ===
Time for 33554432 iterations = 12379399910 ticks
Ticks per iteration = 368

=== Performance test of distributor (single mode) ===
Time per burst:  5815
Time per packet: 90

=== Performance test of distributor (burst mode) ===
Time per burst:  3487
Time per packet: 54

####  with this patch ####
==== Cache line switch test ===
Time for 33554432 iterations = 12388791845 ticks
Ticks per iteration = 369

=== Performance test of distributor (single mode) ===
Time per burst:  5796
Time per packet: 90

=== Performance test of distributor (burst mode) ===
Time per burst:  3477
Time per packet: 54

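From my test, there was a slight performance improvement on x86 in the single mode and burst mode tests (about 0.3% in time per burst, which you can also treat as measurement noise), and the cache line switch test was essentially unchanged (368 vs. 369 ticks per iteration).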
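
To put the x86 numbers above in proportion, the relative change of each measurement can be worked out with a short C sketch. The helper names (rel_change_pct, report, print_x86_summary) are made up for illustration and are not part of the DPDK test code; the input values are the ones from the two runs above.

```c
#include <stdio.h>

/* Relative change in percent; a negative value means fewer ticks (faster). */
static double rel_change_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print one comparison line for a named measurement. */
static void report(const char *name, double before, double after)
{
	printf("%-32s %+.2f%%\n", name, rel_change_pct(before, after));
}

/* Values copied from the x86 runs without and with the patch. */
static void print_x86_summary(void)
{
	report("cache line switch (ticks)", 12379399910.0, 12388791845.0);
	report("single mode (time per burst)", 5815.0, 5796.0);
	report("burst mode (time per burst)", 3487.0, 3477.0);
}
```

Calling print_x86_summary() from a main() gives roughly +0.08%, -0.33% and -0.29%, so all three x86 deltas are well below one percent.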

> cache line switch test with the patch, however the single mode and burst
> mode tests are showing no difference, which are the more important tests.
> What kind of differences are you seeing in the single/burst mode tests?

Actually, I found no difference in the single mode and burst mode tests on aarch64 either. I think this means the code touched by this patch is not a hotspot for the performance of those two modes.

Like the __sync_xxx builtins, the __atomic_xxx builtins are atomic operations, but they take an explicit memory order, so they can avoid the full memory barrier that the __sync_xxx builtins always imply. So I think the change should benefit all platforms.
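
As a minimal sketch of the difference between the two builtin families (not taken from the patch itself; the counter and the function names are hypothetical), the same increment can be written both ways with the GCC builtins:

```c
#include <stdint.h>

/* Hypothetical shared counter, not a name from the DPDK test code. */
static uint64_t counter;

/* Legacy builtin: atomic add that always acts as a full memory barrier. */
static uint64_t bump_sync(void)
{
	return __sync_fetch_and_add(&counter, 1);
}

/*
 * Newer builtin: the caller chooses the memory order. __ATOMIC_RELAXED asks
 * for atomicity only; __ATOMIC_ACQUIRE or __ATOMIC_RELEASE would request a
 * one-way barrier instead.
 */
static uint64_t bump_atomic_relaxed(void)
{
	return __atomic_fetch_add(&counter, 1, __ATOMIC_RELAXED);
}

/* Atomic read of the counter, again without ordering constraints. */
static uint64_t read_counter(void)
{
	return __atomic_load_n(&counter, __ATOMIC_RELAXED);
}
```

Both increment helpers return the value the counter held before the add. On x86 the two forms typically compile to the same lock-prefixed instruction, which would be consistent with the near-identical x86 numbers above, while on aarch64 the relaxed form lets the compiler drop the barrier.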

Thanks,
Phil
> 
> Rgds,
> Dave.
> 
> 
> ---snip---
> 
> 

