All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
To: Bruce Richardson <bruce.richardson@intel.com>,
	Stephen Hemminger <stephen@networkplumber.org>
Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	Tyler Retzlaff <roretzla@linux.microsoft.com>,
	"dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Subject: Re: [RFC] rte_ring: don't use always inline
Date: Fri, 6 May 2022 18:48:34 +0100	[thread overview]
Message-ID: <1d8bf669-9ada-5ab1-6eb4-bd2a0771c544@yandex.ru> (raw)
In-Reply-To: <YnVPOjP35iNg+cMU@bricha3-MOBL.ger.corp.intel.com>

06/05/2022 17:39, Bruce Richardson пишет:
> On Fri, May 06, 2022 at 09:33:41AM -0700, Stephen Hemminger wrote:
>> On Fri, 6 May 2022 16:28:41 +0100
>> Bruce Richardson <bruce.richardson@intel.com> wrote:
>>
>>> On Fri, May 06, 2022 at 03:12:32PM +0000, Honnappa Nagarahalli wrote:
>>>> <snip>
>>>>>
>>>>> On Thu, May 05, 2022 at 10:59:32PM +0000, Honnappa Nagarahalli wrote:
>>>>>> Thanks Stephen. Do you see any performance difference with this change?
>>>>>
>>>>> as a matter of due diligence i think a comparison should be made just to be
>>>>> confident nothing is regressing.
>>>>>
>>>>> i support this change in principal since it is generally accepted best practice to
>>>>> not force inlining since it can remove more valuable optimizations that the
>>>>> compiler may make that the human can't see.
>>>>> the optimizations may vary depending on compiler implementation.
>>>>>
>>>>> force inlining should be used as a targeted measure rather than blanket on
>>>>> every function and when in use probably needs to be periodically reviewed and
>>>>> potentially removed as the code / compiler evolves.
>>>>>
>>>>> also one other consideration is the impact of a particular compiler's force
>>>>> inlining intrinsic/builtin is that it may permit inlining of functions when not
>>>>> declared in a header. i.e. a function from one library may be able to be inlined
>>>>> to another binary as a link time optimization. although everything here is in a
>>>>> header so it's a bit moot.
>>>>>
>>>>> i'd like to see this change go in if possible.
>>>> Like Stephen mentions below, I am sure we will have a for and against discussion here.
>>>> As a DPDK community we have put performance front and center, I would prefer to go down that route first.
>>>>   
>>>
>>> I ran some initial numbers with this patch, and the very quick summary of
>>> what I've seen so far:
>>>
>>> * Unit tests show no major differences, and while it depends on what
>>>    specific number you are interested in, most seem within margin of error.
>>> * Within unit tests, the one number I mostly look at when considering
>>>    inlining is the "empty poll" cost, since I believe we should look to keep
>>>    that as close to zero as possible. In the past I've seen that number jump
>>>    from 3 cycles to 12 cycles due to missed inlining. In this case, it seem
>>>    fine.
>>> * Ran a quick test with the eventdev_pipeline example app using SW eventdev,
>>>    as a test of an actual app which is fairly ring-heavy [used 8 workers
>>>    with 1000 cycles per packet hop]. (Thanks to Harry vH for this suggestion
>>>    of a workload)
>>>    * GCC 8 build - no difference observed
>>>    * GCC 11 build - approx 2% perf reduction observed

Just to note that apart from ring_perf_autotest, there also exist 
ring_stress_autotest which trying to do some stress-testing in hopefully
more realistic usage scenarios.
Might be worth to consider when benchmarking.

>>>
>>> As I said, these are just some quick rough numbers, and I'll try and get
>>> some more numbers on a couple of different platforms, see if the small
>>> reduction seen is consistent or not. I may also test a few differnet
>>> combinations/options in the eventdev test.  It would be good if others also
>>> tested on a few platforms available to them.
>>>
>>> /Bruce
>>
>> I wonder if a mixed approach might help where some key bits were marked
>> as more important to inline? Or setting compiler flags in build infra?
> 
> Yep, could be a number of options.


  reply	other threads:[~2022-05-06 17:49 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-05 22:45 [RFC] rte_ring: don't use always inline Stephen Hemminger
2022-05-05 22:59 ` Honnappa Nagarahalli
2022-05-05 23:10   ` Stephen Hemminger
2022-05-05 23:16     ` Stephen Hemminger
2022-05-06  1:37     ` Honnappa Nagarahalli
2022-05-06  7:24   ` Tyler Retzlaff
2022-05-06 15:12     ` Honnappa Nagarahalli
2022-05-06 15:28       ` Bruce Richardson
2022-05-06 16:33         ` Stephen Hemminger
2022-05-06 16:39           ` Bruce Richardson
2022-05-06 17:48             ` Konstantin Ananyev [this message]
2022-05-06 15:41     ` Stephen Hemminger
2022-05-06 16:38       ` Bruce Richardson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1d8bf669-9ada-5ab1-6eb4-bd2a0771c544@yandex.ru \
    --to=konstantin.v.ananyev@yandex.ru \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=nd@arm.com \
    --cc=roretzla@linux.microsoft.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.