All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <amc96@srcf.net>
To: Jan Beulich <jbeulich@suse.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>, "Wei Liu" <wl@xen.org>,
	Xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 5/5] x86/ioapic: Drop function pointers from __ioapic_{read,write}_entry()
Date: Thu, 18 Nov 2021 00:32:14 +0000	[thread overview]
Message-ID: <6935bdd8-6b4a-80f6-d134-768dc0d37abe@srcf.net> (raw)
In-Reply-To: <e220b6f2-3cb9-e7b0-6b74-4b266e4e7fb6@suse.com>

On 12/11/2021 10:43, Jan Beulich wrote:
> On 11.11.2021 18:57, Andrew Cooper wrote:
>> Function pointers are expensive, and the raw parameter is a constant from all
>> callers, meaning that it predicts very well with local branch history.
> The code change is fine, but I'm having trouble with "all" here: Both
> functions aren't even static, so while callers in io_apic.c may
> benefit (perhaps with the exception of ioapic_{read,write}_entry(),
> depending on whether the compiler views inlining them as warranted),
> I'm in no way convinced this extends to the callers in VT-d code.
>
> Further ISTR clang being quite a bit less aggressive about inlining,
> so the effects might not be quite as good there even for the call
> sites in io_apic.c.
>
> Can you clarify this for me please?

The way the compiler lays out the code is unrelated to why this form is 
an improvement.

Branch history is a function of "the $N most recently taken branches".  
This is because "how you got here" is typically relevant to "where you 
should go next".

Trivial schemes maintain a shift register of taken / not-taken results.  
Less trivial schemes maintain a rolling hash of (src addr, dst addr) 
tuples of all taken branches (direct and indirect).  In both cases, the 
instantaneous branch history is an input into the final prediction, and 
is commonly used to select which saturating counter (or bank of 
counters) is used.

Consider something like

while ( cond )
{
     memcpy(dst1, src1, 64);
     memcpy(dst2, src2, 7);
}

Here, the conditional jump inside memcpy() coping with the tail of the 
copy flips result 50% of the time, which is fiendish to predict for.

However, because the branch history differs (by memcpy()'s return 
address which was accumulated by the call instruction), the predictor 
can actually use two different taken/not-taken counters for the two 
different "instances" if the tail jump.  After a few iterations to warm 
up, the predictor will get every jump perfect despite the fact that 
memcpy() is a library call and the branches would otherwise alias.


Bringing it back to the code in question.  The "raw" parameter is an 
explicit true or false at the top of all call paths leading into these 
functions.  Therefore, an individual branch history has a high 
correlation with said true or false, irrespective of the absolute code 
layout.  As a consequence, the correct result of the prediction is 
highly correlated with the branch history, and it will predict 
perfectly[1] after a few times the path has been used.

~Andrew

[1] Obviously, it's not actually perfect outside of a synthetic 
example.  Aliasing in the predictor is a necessary property of keeping 
the logic small enough to provide an answer fast, but the less 
accidental aliasing there is, the faster the CPU performance in 
benchmarks, so incentives are in our favour here.


  reply	other threads:[~2021-11-18  0:32 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-11 17:57 [PATCH 0/5] xen: various function pointer cleanups Andrew Cooper
2021-11-11 17:57 ` [PATCH 1/5] xen/domain: Remove function pointers from domain pause helpers Andrew Cooper
2021-11-12  9:36   ` Julien Grall
2021-11-18  1:47     ` Andrew Cooper
2021-11-18  9:28       ` Julien Grall
2021-11-12  9:57   ` Jan Beulich
2021-11-17 23:31     ` Andrew Cooper
2021-11-15 10:13   ` Bertrand Marquis
2021-11-15 10:20     ` Jan Beulich
2021-11-15 10:23       ` Bertrand Marquis
2021-11-15 10:55         ` Jan Beulich
2021-11-15 11:23           ` Bertrand Marquis
2021-11-15 14:11             ` Julien Grall
2021-11-15 14:45               ` Bertrand Marquis
2021-11-16  0:41           ` Stefano Stabellini
2021-11-16  7:15             ` Jan Beulich
2021-11-11 17:57 ` [PATCH 2/5] xen/domain: Improve pirq handling Andrew Cooper
2021-11-12 10:16   ` Jan Beulich
2021-11-11 17:57 ` [PATCH 3/5] xen/sort: Switch to an extern inline implementation Andrew Cooper
2021-11-11 18:15   ` Julien Grall
2021-11-16  0:36     ` Stefano Stabellini
2021-11-16  0:41       ` Andrew Cooper
2021-12-17 15:56         ` Andrew Cooper
2021-12-17 16:15           ` Julien Grall
2021-11-12  9:39   ` Julien Grall
2021-11-12 10:25   ` Jan Beulich
2021-11-11 17:57 ` [PATCH 4/5] xen/wait: Remove indirect jump Andrew Cooper
2021-11-12 10:35   ` Jan Beulich
2021-11-11 17:57 ` [PATCH 5/5] x86/ioapic: Drop function pointers from __ioapic_{read,write}_entry() Andrew Cooper
2021-11-12 10:43   ` Jan Beulich
2021-11-18  0:32     ` Andrew Cooper [this message]
2021-11-18  9:06       ` Jan Beulich
2021-11-18  9:07         ` Jan Beulich
2021-11-18 17:33           ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6935bdd8-6b4a-80f6-d134-768dc0d37abe@srcf.net \
    --to=amc96@srcf.net \
    --cc=andrew.cooper3@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=roger.pau@citrix.com \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.