All of lore.kernel.org
 help / color / mirror / Atom feed
From: Phil Yang <Phil.Yang@arm.com>
To: Alexander Kozyrev <akozyrev@mellanox.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	Matan Azrad <matan@mellanox.com>,
	Shahaf Shuler <shahafs@mellanox.com>,
	Slava Ovsiienko <viacheslavo@mellanox.com>
Cc: "drc@linux.vnet.ibm.com" <drc@linux.vnet.ibm.com>,
	nd <nd@arm.com>, "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>,
	nd <nd@arm.com>
Subject: Re: [dpdk-dev] [PATCH v3] net/mlx5: relaxed ordering for	multi-packet	RQ buffer refcnt
Date: Wed, 22 Jul 2020 12:06:13 +0000	[thread overview]
Message-ID: <VE1PR08MB4640B45B79BB6699FF931FECE9790@VE1PR08MB4640.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <AM0PR05MB4561A1C975F70E4E49576C73A2780@AM0PR05MB4561.eurprd05.prod.outlook.com>

Alexander Kozyrev <akozyrev@mellanox.com> writes:

<snip>
> >
> > > > > Subject: RE: [dpdk-dev] [PATCH v3] net/mlx5: relaxed ordering for
> > > > > multi- packet RQ buffer refcnt
> > > > >
> > > > > Hi Phil Yang, we noticed that this patch gives us 10% of
> > > > > performance degradation on ARM.
> > > > > x86 seems to be unaffected though. Do you know what may be the
> > > > > reason of this behavior?
> > > >
> > > > Hi Alexander,
> > > >
> > > > Thanks for your feedback.
> > > > This patch removed some expensive memory barriers on aarch64, it
> > > > should get better performance.
> > > > I am not sure the root cause of this degradation now, I will start
> > > > the investigation. We can profiling this issue together.
> > > > Could you share your test case(including your testbed configuration)
> > > > with
> > > us?
<...>
> > >
> > > I'm surprised too, Phil, but looks like it is actually making things
> > > worse. I used Connect-X 6DX on aarch64:
> > > Linux dragon71-bf 5.4.31-mlnx.15.ge938819 #1 SMP PREEMPT Thu Jul 2
> > > 17:01:15 IDT 2020 aarch64 aarch64 aarch64 GNU/Linux Traffic generator
> > > sends 60 bytes packets and DUT executes the following command:
> > > arm64-bluefield-linuxapp-gcc/build/app/test-pmd/testpmd -n 4  -w
> > > 0000:03:00.1,mprq_en=1,rxqs_min_mprq=1  -w
> > > 0000:03:00.0,mprq_en=1,rxqs_min_mprq=1 -c 0xe  -- --burst=64 --
> > > mbcache=512 -i  --nb-cores=1  --rxq=1 --txq=1 --txd=256 --rxd=256
> > > --auto- start --rss-udp Without a patch I'm getting 3.2mpps, and only
> > > 2.9mpps when the patch is applied.
> > You are running on A72 cores, is that correct?
> 
> Correct, cat /proc/cpuinfo
> processor       : 0
> BogoMIPS        : 312.50
> Features        : fp asimd evtstrm crc32 cpuid
> CPU implementer : 0x41
> CPU architecture: 8
> CPU variant     : 0x0
> CPU part        : 0xd08
> CPU revision    : 3

Thanks a lot for your input, Alex.
With your test command line, I remeasured this patch on two different aarch64 machines and both got some performance improvement.

SOC#1. On Thunderx2 (with LSE support), I see 7.6% performance improvement on throughput. 
NIC: ConnectX-6 / driver: mlx5_core version: 5.0-1.0.0.0 / firmware-version: 20.27.1016 (MT_0000000224)

SOC#2. On N1SDP (I disabled LSE to generate A72 likewise instructions), I also see slightly (about 1%~2%) performance improvement on throughput.
NIC: ConnectX-5 / driver: mlx5_core / version: 5.0-2.1.8 / firmware-version: 16.27.2008 (MT_0000000090)

Without LSE (i.e. A72 and SOC#2 case.) it uses the 'Exclusive' mechanism to achieve atomicity.
For example, it generates below instructions for __atomic_add_fetch.
__atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_ACQUIRE);
   70118:       f94037e3        ldr     x3, [sp, #104]
   7011c:       91002060        add     x0, x3, #0x8
   70120:       485ffc02        ldaxrh  w2, [x0]
   70124:       11000442        add     w2, w2, #0x1
   70128:       48057c02        stxrh   w5, w2, [x0]
   7012c:       35ffffa5        cbnz    w5, 70120 <mlx5_rx_burst_mprq+0xb48>

In general, I think this patch will not lead to a sharp decline in performance. 
Maybe you can try other testbeds?

Thanks,
Phil





  reply	other threads:[~2020-07-22 12:06 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-13 12:38 [dpdk-dev] [PATCH RFC v1 0/6] barrier fix and optimization for mlx5 on aarch64 Gavin Hu
2020-02-13 12:38 ` [dpdk-dev] [PATCH RFC v1 1/6] net/mlx5: relax the barrier for UAR write Gavin Hu
2020-02-13 12:38 ` [dpdk-dev] [PATCH RFC v1 2/6] net/mlx5: use cio barrier before the BF WQE Gavin Hu
2020-02-13 12:38 ` [dpdk-dev] [PATCH RFC v1 3/6] net/mlx5: add missing barrier Gavin Hu
2020-02-13 12:38 ` [dpdk-dev] [PATCH RFC v1 4/6] net/mlx5: add descriptive comment for a barrier Gavin Hu
2020-02-13 12:38 ` [dpdk-dev] [PATCH RFC v1 5/6] net/mlx5: non-cacheable mapping defaulted for aarch64 Gavin Hu
2020-02-13 12:38 ` [dpdk-dev] [PATCH RFC v1 6/6] net/mlx5: relaxed ordering for multi-packet RQ buffer refcnt Gavin Hu
2020-04-10 16:41 ` [dpdk-dev] [PATCH RFC v2 0/7] introduce new barrier class and use it for mlx5 PMD Gavin Hu
2020-04-10 17:20   ` Andrew Rybchenko
2020-04-11  3:46     ` Gavin Hu
2020-04-13  9:51       ` Andrew Rybchenko
2020-04-13 16:46         ` Gavin Hu
2020-05-11 18:06   ` [dpdk-dev] [RFC] eal: adjust barriers for IO on Armv8-a Honnappa Nagarahalli
2020-05-12  6:18     ` Ruifeng Wang
2020-05-12  6:42       ` Jerin Jacob
2020-05-12  8:02         ` Ruifeng Wang
2020-05-12  8:28           ` Jerin Jacob
2020-05-12 21:44           ` Honnappa Nagarahalli
2020-05-13 14:49             ` Jerin Jacob
2020-05-14  1:02               ` Honnappa Nagarahalli
2020-06-27 19:12   ` [dpdk-dev] [PATCH v2] " Honnappa Nagarahalli
2020-06-27 19:25     ` Honnappa Nagarahalli
2020-06-30  5:13       ` Jerin Jacob
2020-07-03 18:57   ` [dpdk-dev] [PATCH v3 1/3] " Honnappa Nagarahalli
2020-07-03 18:57     ` [dpdk-dev] [PATCH v3 2/3] doc: update armv8-a IO barrier changes Honnappa Nagarahalli
2020-07-05  0:57       ` Jerin Jacob
2020-07-03 18:57     ` [dpdk-dev] [PATCH v3 3/3] doc: update deprecation of CIO barrier APIs Honnappa Nagarahalli
2020-07-05  0:57       ` Jerin Jacob
2020-07-07 20:19       ` Ajit Khaparde
2020-07-08 11:05       ` Ananyev, Konstantin
2020-07-06 23:43   ` [dpdk-dev] [PATCH v4 1/3] eal: adjust barriers for IO on Armv8-a Honnappa Nagarahalli
2020-07-06 23:43     ` [dpdk-dev] [PATCH v4 2/3] doc: update armv8-a IO barrier changes Honnappa Nagarahalli
2020-07-07  8:36       ` David Marchand
2020-07-07 18:37         ` Honnappa Nagarahalli
2020-07-06 23:43     ` [dpdk-dev] [PATCH v4 3/3] doc: update deprecation of CIO barrier APIs Honnappa Nagarahalli
2020-07-07  8:39       ` David Marchand
2020-07-07 20:14       ` David Christensen
2020-07-08 11:49       ` David Marchand
2020-04-10 16:41 ` [dpdk-dev] [PATCH RFC v2 1/7] eal: introduce new class of barriers for DMA use cases Gavin Hu
2020-04-10 16:41 ` [dpdk-dev] [PATCH RFC v2 2/7] net/mlx5: dmb for immediate doorbell ring on aarch64 Gavin Hu
2020-04-10 16:41 ` [dpdk-dev] [PATCH RFC v2 3/7] net/mlx5: relax barrier to order UAR writes " Gavin Hu
2020-04-10 16:41 ` [dpdk-dev] [PATCH RFC v2 4/7] net/mlx5: relax barrier for aarch64 Gavin Hu
2020-04-10 16:41 ` [dpdk-dev] [PATCH RFC v2 5/7] net/mlx5: add descriptive comment for a barrier Gavin Hu
2020-04-10 16:41 ` [dpdk-dev] [PATCH RFC v2 6/7] net/mlx5: relax ordering for multi-packet RQ buffer refcnt Gavin Hu
2020-06-23  8:26   ` [dpdk-dev] [PATCH v3] net/mlx5: relaxed " Phil Yang
2020-07-13  3:02     ` Phil Yang
2020-07-20 23:21       ` Alexander Kozyrev
2020-07-21  1:55         ` Phil Yang
2020-07-21  3:58           ` Alexander Kozyrev
2020-07-21  4:03             ` Honnappa Nagarahalli
2020-07-21  4:11               ` Alexander Kozyrev
2020-07-22 12:06                 ` Phil Yang [this message]
2020-07-23  4:47         ` Honnappa Nagarahalli
2020-07-23  6:11           ` Phil Yang
2020-07-23 16:53             ` Alexander Kozyrev
2020-07-27 14:52               ` Phil Yang
2020-08-06  2:43                 ` Alexander Kozyrev
2020-08-11  5:20                   ` Honnappa Nagarahalli
2020-09-02 21:52                     ` Alexander Kozyrev
2020-09-03  2:55                       ` Phil Yang
2020-09-09 13:29                         ` Alexander Kozyrev
2020-09-10  1:34                           ` Honnappa Nagarahalli
2020-09-03  2:53     ` [dpdk-dev] [PATCH v4] " Phil Yang
2020-09-10  1:30       ` Honnappa Nagarahalli
2020-09-10  1:36         ` Alexander Kozyrev
2020-09-29 15:22           ` Phil Yang
2020-09-30 12:44             ` Slava Ovsiienko
2020-09-30 12:52               ` Raslan Darawsheh
2020-09-30 13:57       ` Raslan Darawsheh
2020-04-10 16:41 ` [dpdk-dev] [PATCH RFC v2 7/7] doc: clarify one configuration in mlx5 guide Gavin Hu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VE1PR08MB4640B45B79BB6699FF931FECE9790@VE1PR08MB4640.eurprd08.prod.outlook.com \
    --to=phil.yang@arm.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=akozyrev@mellanox.com \
    --cc=dev@dpdk.org \
    --cc=drc@linux.vnet.ibm.com \
    --cc=matan@mellanox.com \
    --cc=nd@arm.com \
    --cc=shahafs@mellanox.com \
    --cc=viacheslavo@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.