From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"honnappa.nagarahalli@arm.com" <honnappa.nagarahalli@arm.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	"drc@linux.vnet.ibm.com" <drc@linux.vnet.ibm.com>
Subject: Re: [dpdk-dev] [RFC] ring: make ring implementation non-inlined
Date: Sat, 21 Mar 2020 01:03:35 +0000
Message-ID: <SN6PR11MB25583FEF506336C09C934DBA9AF20@SN6PR11MB2558.namprd11.prod.outlook.com>
In-Reply-To: <20200320105435.47681954@hermes.lan>



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Friday, March 20, 2020 5:55 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; olivier.matz@6wind.com; honnappa.nagarahalli@arm.com; jerinj@marvell.com; drc@linux.vnet.ibm.com
> Subject: Re: [RFC] ring: make ring implementation non-inlined
> 
> On Fri, 20 Mar 2020 16:41:38 +0000
> Konstantin Ananyev <konstantin.ananyev@intel.com> wrote:
> 
> > As was discussed here:
> > http://mails.dpdk.org/archives/dev/2020-February/158586.html
> > this RFC aims to hide the ring internals in a .c file and make
> > all ring functions non-inlined. In theory that might help to
> > maintain ABI stability in the future.
> > This is just a POC to measure the impact of the proposed idea;
> > a proper implementation would definitely need some extra effort.
> > On an IA box (SKX), ring_perf_autotest shows ~20-30 extra cycles
> > per enqueue+dequeue pair. On more realistic code, I suspect
> > the impact might be a bit higher.
> > For MP/MC bulk transfers the degradation seems quite small,
> > though for SP/SC and/or small transfers it is more than noticeable
> > (see exact numbers below).
> > From my perspective we'd probably better keep it inlined for now
> > to avoid any unanticipated performance degradation, though I'm
> > interested to see perf results and opinions from other
> > interested parties.
> >
> > Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> > ring_perf_autotest (without patch/with patch)
> >
> > ### Testing single element enq/deq ###
> > legacy APIs: SP/SC: single: 8.75/43.23
> > legacy APIs: MP/MC: single: 56.18/80.44
> >
> > ### Testing burst enq/deq ###
> > legacy APIs: SP/SC: burst (size: 8): 37.36/53.37
> > legacy APIs: SP/SC: burst (size: 32): 93.97/117.30
> > legacy APIs: MP/MC: burst (size: 8): 78.23/91.45
> > legacy APIs: MP/MC: burst (size: 32): 131.59/152.49
> >
> > ### Testing bulk enq/deq ###
> > legacy APIs: SP/SC: bulk (size: 8): 37.29/54.48
> > legacy APIs: SP/SC: bulk (size: 32): 92.68/113.01
> > legacy APIs: MP/MC: bulk (size: 8): 78.40/93.50
> > legacy APIs: MP/MC: bulk (size: 32): 131.49/154.25
> >
> > ### Testing empty bulk deq ###
> > legacy APIs: SP/SC: bulk (size: 8): 4.00/16.86
> > legacy APIs: MP/MC: bulk (size: 8): 7.01/15.55
> >
> > ### Testing using two hyperthreads ###
> > legacy APIs: SP/SC: bulk (size: 8): 10.64/17.56
> > legacy APIs: MP/MC: bulk (size: 8): 15.30/16.69
> > legacy APIs: SP/SC: bulk (size: 32): 5.84/7.09
> > legacy APIs: MP/MC: bulk (size: 32): 6.34/7.54
> >
> > ### Testing using two physical cores ###
> > legacy APIs: SP/SC: bulk (size: 8): 24.34/42.40
> > legacy APIs: MP/MC: bulk (size: 8): 70.34/71.82
> > legacy APIs: SP/SC: bulk (size: 32): 12.67/14.68
> > legacy APIs: MP/MC: bulk (size: 32): 22.41/17.93
> >
> > ### Testing single element enq/deq ###
> > elem APIs: element size 16B: SP/SC: single: 10.65/41.96
> > elem APIs: element size 16B: MP/MC: single: 44.33/81.36
> >
> > ### Testing burst enq/deq ###
> > elem APIs: element size 16B: SP/SC: burst (size: 8): 39.20/58.52
> > elem APIs: element size 16B: SP/SC: burst (size: 32): 123.19/142.79
> > elem APIs: element size 16B: MP/MC: burst (size: 8): 80.72/101.36
> > elem APIs: element size 16B: MP/MC: burst (size: 32): 169.21/185.38
> >
> > ### Testing bulk enq/deq ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 41.64/58.46
> > elem APIs: element size 16B: SP/SC: bulk (size: 32): 122.74/142.52
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 80.60/103.14
> > elem APIs: element size 16B: MP/MC: bulk (size: 32): 169.39/186.67
> >
> > ### Testing empty bulk deq ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 5.01/17.17
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 6.01/14.80
> >
> > ### Testing using two hyperthreads ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.02/17.18
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 16.81/21.14
> > elem APIs: element size 16B: SP/SC: bulk (size: 32): 7.87/9.01
> > elem APIs: element size 16B: MP/MC: bulk (size: 32): 8.22/10.57
> >
> > ### Testing using two physical cores ###
> > elem APIs: element size 16B: SP/SC: bulk (size: 8): 27.00/51.94
> > elem APIs: element size 16B: MP/MC: bulk (size: 8): 78.24/74.48
> > elem APIs: element size 16B: SP/SC: bulk (size: 32): 15.41/16.14
> > elem APIs: element size 16B: MP/MC: bulk (size: 32): 18.72/21.64
> >
> > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 
> What is the impact with LTO? I suspect the compiler might have a
> chance to get the speed back with LTO.

It might, but LTO is not enabled by default,
so I don't see much point in digging any further here.
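
For readers who did not follow the earlier thread: the change being
measured above essentially moves the ring enqueue/dequeue fast path
from static inline functions in rte_ring.h into ordinary functions in
a .c file, so that the ring structure can become opaque to
applications. The toy single-producer ring below is only a minimal
sketch of that before/after shape; the names, the layout and the
omitted memory barriers are illustrative assumptions, not the actual
DPDK code.

/* Toy SP ring, illustration only -- not the DPDK implementation. */
#include <stdint.h>

struct toy_ring {
	uint32_t size;              /* ring capacity, power of two   */
	uint32_t mask;              /* size - 1                      */
	volatile uint32_t head;     /* producer index                */
	volatile uint32_t tail;     /* consumer index                */
	void *objs[];               /* storage for object pointers   */
};

/* "Before": the layout above is visible to the application and the
 * enqueue is inlined at every call site, so the compiler can fold the
 * index arithmetic into the caller -- but any change to the struct
 * breaks the ABI. */
static inline int
toy_ring_sp_enqueue_inline(struct toy_ring *r, void *obj)
{
	uint32_t head = r->head;

	if (head - r->tail >= r->size)
		return -1;                 /* ring full */
	r->objs[head & r->mask] = obj;
	r->head = head + 1;                /* real code needs a release barrier */
	return 0;
}

/* "After": the application sees only a declaration and pays a plain
 * function call; the structure layout and the enqueue logic live in a
 * .c file and can change without breaking the ABI. The cost of the
 * call and the lost inlining is roughly the extra ~20-30 cycles per
 * enqueue+dequeue pair quoted above. */
int toy_ring_sp_enqueue(struct toy_ring *r, void *obj);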

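To make the LTO question concrete: with the non-inlined layout the
fast path lives in its own object file, and only link-time
optimization can put it back into the caller. The sketch below
continues the toy example above; the -flto flag mentioned in the
comments is the usual GCC/Clang switch, and whether a full LTO build
of DPDK plus the test application would actually recover the lost
cycles is exactly what was left unmeasured here.

/* toy_ring.c -- the "after" implementation from the sketch above.
 * Compiled normally, every enqueue from the application is an
 * out-of-line call into this file. If both this file and the
 * application are compiled and linked with -flto, the link-time
 * optimizer may inline the call back into the caller, which is the
 * effect being asked about. A regular shared-library build still
 * pays the call, since the application cannot see across the .so
 * boundary. */
#include <stdint.h>

struct toy_ring {
	uint32_t size;
	uint32_t mask;
	volatile uint32_t head;
	volatile uint32_t tail;
	void *objs[];
};

int
toy_ring_sp_enqueue(struct toy_ring *r, void *obj)
{
	uint32_t head = r->head;

	if (head - r->tail >= r->size)
		return -1;                 /* ring full */
	r->objs[head & r->mask] = obj;
	r->head = head + 1;                /* release barrier omitted, sketch only */
	return 0;
}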