From: Ingo Molnar <mingo@kernel.org>
To: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
the arch/x86 maintainers <x86@kernel.org>,
Dan Williams <dan.j.williams@intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Andi Kleen <ak@linux.intel.com>,
Andrew Lutomirski <luto@kernel.org>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Brian Gerst <brgerst@gmail.com>,
Denys Vlasenko <dvlasenk@redhat.com>,
Borislav Petkov <bp@alien8.de>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [RFC v2 PATCH 6/7] x86/entry: get rid of ALLOC_PT_GPREGS_ON_STACK and SAVE_AND_CLEAR_REGS
Date: Thu, 8 Feb 2018 10:47:10 +0100 [thread overview]
Message-ID: <20180208094710.qnjixhm6hybebdv7@gmail.com> (raw)
In-Reply-To: <20180207212915.GA4633@isilmar-4.linta.de>
* Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> On Wed, Feb 07, 2018 at 12:44:41PM -0800, Linus Torvalds wrote:
> > On Wed, Feb 7, 2018 at 12:15 PM, Dominik Brodowski
> > <linux@dominikbrodowski.net> wrote:
> > >
> > > Note: The testb $3, CS(%rsp) instruction in idtentry() does not need
> > > modification. Previously %rsp was manually decreased by 15*8; with
> > > this patch, %rsp is decreased by 15 pushq instructions. Moreover,
> > > error_entry did and does the exact same test (with offset=8) after
> > > the registers have been moved/pushed and cleared.
> >
> > So this has the problem that now those save/clear instructions will
> > all be done in that idtentry macro.
> >
> > So now that code will be duplicated for all the users of that macro.
> >
> > The old code did the saving in the common error_entry and
> > paranoid_entry routines, in order to be able to share all the code,
> > and making the duplicated stub functions generated by the idtentry
> > macro smaller.
> >
> > Now, admittedly the new push sequence is much smaller than the old
> > movq sequence, so the duplication doesn't hurt as much, but it's still
> > likely quite noticeable.
> >
> > So this removes lines of asm code, but it adds a lot of instructions
> > to the end result thanks to the macro, I think.
>
> Indeed, that is the case (see below). However, if we want to switch to
> PUSH instructions and do this in a routine which is call'ed and which
> ret'urns, %rsp needs to be moved around even more often than the old
> ALLOC_PT_GPREGS_ON_STACK macro did (which you wanted to get rid of,
> IIUYC). Or do I miss something?
>
> text data bss dec hex filename
> 19500 0 0 19500 4c2c arch/x86/entry/entry_64.o-orig
> 19510 0 0 19510 4c36 arch/x86/entry/entry_64.o-3_of_7
> 21105 0 0 21105 5271 arch/x86/entry/entry_64.o-5_of_7
> 24307 0 0 24307 5ef3 arch/x86/entry/entry_64.o-7_of_7
So while this shows a +~5K increase in text size, these numbers also _very_
significantly over-represent the extra footprint. The assumptions that resulted in
us compressing the IRQ entry code have changed very significantly with the new x86
IRQ allocation code we introduced in the last year:
- IRQ vectors are usually populated in tightly clustered groups.
With our new vector allocator code the typical per CPU allocation percentage on
x86 systems is ~3 device vectors and ~10 fixed vectors out of ~220 vectors -
i.e. a very low ~6% utilization (!). This means that the _real_ text footprint
increase is probably closer to 0.1K of text...
This is a very typical picture on an average desktop system:
/sys/kernel/debug/irq/domains/VECTORS:
Online bitmaps: 40
Global available: 8007
Global reserved: 24
Total allocated: 73
System: 42: 0-19,32,50,128,237-255
| CPU | avl | man | act | vectors
0 199 0 3 33-34,48
1 199 0 3 33-35
2 199 0 3 33-35
3 199 0 3 33-35
4 199 0 3 33-35
5 199 0 3 33-35
6 199 0 3 33-35
7 199 0 3 33-35
8 199 0 3 33-35
9 199 0 3 33-35
10 201 0 1 33
11 201 0 1 33
12 201 0 1 33
13 201 0 1 33
14 201 0 1 33
15 201 0 1 33
16 201 0 1 33
17 201 0 1 33
18 201 0 1 33
19 201 0 1 33
20 199 0 3 33-35
21 199 0 3 33-35
22 199 0 3 33-35
23 199 0 3 33-35
24 199 0 3 33-35
25 199 0 3 33-35
26 199 0 3 33-35
27 199 0 3 33-35
28 199 0 3 33-35
29 199 0 3 33-35
30 201 0 1 33
31 201 0 1 33
32 201 0 1 33
33 202 0 0
34 202 0 0
35 202 0 0
36 202 0 0
37 202 0 0
38 202 0 0
39 202 0 0
I.e. the average number of (device) IRQ vectors per CPU is around 2, with the
max being 3.
The days where we allocated a lot of vectors on every CPU and the compression
of the IRQ entry code text mattered are over.
- Another issue is that only a small minority of vectors is frequent enough to
actually matter to cache utilization in practice: 3-4 key IPIs and 1-2 device
IRQs at most - and those vectors tend to be tightly clustered as well into about
two groups, and are probably already on 2-3 cache lines in practice.
For the common case of 'cache cold' IRQs it's the depth of the call chain and
the fragmentation of the resulting I$ that should be the main performance limit
- not the overall size of it.
- The CPU side cost of IRQ delivery is still very expensive even in the best, most
cached case, as in 'over a thousand cycles'. So much stuff is done that maybe
contemporary x86 IRQ entry microcode already prefetches the IDT entry and its
expected call target address.
So for those reasons I'm really tempted by the all around simplification offered
by this series:
2 files changed, 62 insertions(+), 160 deletions(-)
Basically in this specific case I'd like to turn the argument around,
use this simpler design and put the burden of proof on the patches that
want to _complicate it_ beyond this simple, straightforward model: show us
the numbers that it truly makes a difference: it's not that hard to
spray a CPU with IRQs by using IPIs, and the cache miss rate and the
overall perf counter stats should tell a clear story about whether it
makes sense to cache-compress (and complicate) the IRQ entry sequences.
Thanks,
Ingo
next prev parent reply other threads:[~2018-02-08 9:47 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-02-07 20:15 [RFC v2 PATCH 0/7] x86/entry: simplify and unify SAVE/POP_REGS Dominik Brodowski
2018-02-07 20:15 ` [RFC v2 PATCH 1/7] x86/entry: merge SAVE_C_REGS and SAVE_EXTRA_REGS, remove unused extensions Dominik Brodowski
2018-02-07 20:15 ` [RFC v2 PATCH 2/7] x86/entry: merge POP_C_REGS and POP_EXTRA_REGS Dominik Brodowski
2018-02-07 20:15 ` [RFC v2 PATCH 3/7] x86/entry: interleave XOR register clearing with PUSH instructions Dominik Brodowski
2018-02-07 20:15 ` [RFC v2 PATCH 4/7] x86/entry: introduce PUSH_AND_CLEAN_REGS Dominik Brodowski
2018-02-07 20:15 ` [RFC v2 PATCH 5/7] x86/entry: use PUSH_AND_CLEAN_REGS in more cases Dominik Brodowski
2018-02-07 20:15 ` [RFC v2 PATCH 6/7] x86/entry: get rid of ALLOC_PT_GPREGS_ON_STACK and SAVE_AND_CLEAR_REGS Dominik Brodowski
2018-02-07 20:44 ` Linus Torvalds
2018-02-07 21:29 ` Dominik Brodowski
2018-02-07 21:58 ` Linus Torvalds
2018-02-08 7:20 ` Dominik Brodowski
2018-02-08 9:47 ` Ingo Molnar [this message]
2018-02-08 17:39 ` Linus Torvalds
2018-02-08 18:35 ` Josh Poimboeuf
2018-02-07 20:15 ` [RFC v2 PATCH 7/7] x86/entry: indent PUSH_AND_CLEAR_REGS and POP_REGS properly Dominik Brodowski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180208094710.qnjixhm6hybebdv7@gmail.com \
--to=mingo@kernel.org \
--cc=a.p.zijlstra@chello.nl \
--cc=ak@linux.intel.com \
--cc=bp@alien8.de \
--cc=brgerst@gmail.com \
--cc=dan.j.williams@intel.com \
--cc=dvlasenk@redhat.com \
--cc=jpoimboe@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@dominikbrodowski.net \
--cc=luto@kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.