From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Scott Wood <scottwood@freescale.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 2/7] powerpc/mm: 64-bit 4k: use a PMD-based virtual page table
Date: Tue, 24 May 2011 06:51:01 +1000
Message-ID: <1306183861.7481.208.camel@pasglop>
In-Reply-To: <20110523135433.557e2d63@schlenkerla.am.freescale.net>

On Mon, 2011-05-23 at 13:54 -0500, Scott Wood wrote:
> On Sat, 21 May 2011 08:15:36 +1000
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > On Fri, 2011-05-20 at 15:57 -0500, Scott Wood wrote:
> > 
> > > I see a 2% cost going from virtual pmd to full 4-level walk in the
> > > benchmark mentioned above (some type of sort), and just under 3% in
> > > page-stride lat_mem_rd from lmbench.
> > > 
> > > OTOH, the virtual pmd approach still leaves the possibility of taking a
> > > bunch of virtual page table misses if non-localized accesses happen over a
> > > very large chunk of address space (tens of GiB), and we'd have one fewer
> > > type of TLB miss to worry about complexity-wise with a straight table walk.
> > > 
> > > Let me know what you'd prefer.
> > 
> > I'm tempted to kill the virtual linear feature altogether; it didn't
> > buy us that much. Have you looked at whether you can snatch back some
> > of those cycles with hand tuning of the level walker?
> 
> That's after trying a bit of that (pulled the pgd load up before
> normal_tlb_miss, and some other reordering).  Not sure how much more can be
> squeezed out of it with such techniques, at least with e5500.
> 
> Hmm, in the normal miss case we know we're in the first EXTLB level,
> right?  So we could cut out a load/mfspr by subtracting EXTLB from r12
> to get the PACA (that load's latency is pretty well buried, but maybe we
> could replace it with loading pgd, replacing it later if it's a kernel
> region).  Maybe move pgd to the first EXTLB, so it's in the same cache line
> as the state save data. The PACA cacheline containing pgd is probably
> pretty hot in normal kernel code, but not so much in a long stretch of
> userspace plus TLB misses (other than for pgd itself).
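
To make the "subtract EXTLB from r12" idea concrete, here is a minimal C
sketch of the pointer arithmetic. The struct layout, field names and sizes
(toy_paca, EX_TLB_WORDS, EX_TLB_LEVELS) are hypothetical stand-ins, not the
real PACA definition; the only thing taken from the text above is that the
frame pointer is known to be the first EXTLB level in the normal miss case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define EX_TLB_WORDS  12
#define EX_TLB_LEVELS 3

/* Toy stand-in for the per-CPU area; NOT the real struct paca_struct. */
struct toy_paca {
    uint64_t other_state[8];
    void *pgd;
    uint64_t extlb[EX_TLB_LEVELS][EX_TLB_WORDS]; /* nested save frames */
};

/*
 * If the frame pointer is known to be level 0 of extlb, the owning
 * structure is a constant offset away, so no extra load or SPR read
 * is needed to find it.
 */
static struct toy_paca *paca_from_first_extlb(uint64_t *frame)
{
    return (struct toy_paca *)((char *)frame -
                               offsetof(struct toy_paca, extlb));
}
```

The shortcut only holds for the first level: handing it the frame of a
nested miss (extlb[1] and up) yields the wrong base address, which is why
it is limited to the normal miss path.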

Is your linear mapping bolted? If it is, you may be able to cut out most
of the save/restore stuff (SRR0,1, ...), since with a normal walk you
won't take nested misses.
 
> > Would it work/help to have a simple cache of the last pmd & address and
> > compare just that?
> 
> Maybe.
> 
> It would still slow down the case where you miss that cache -- not by as
> much as a virtual page table miss (and it wouldn't compete for TLB entries
> with actual user pages), but it would happen more often, since you'd only be
> able to cache one pmd.
>
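
For reference, the one-entry cache being discussed can be modelled in plain
C roughly as below. This is a userspace toy under stated assumptions, not
the handler itself: the geometry (4K pages, 9 index bits per level), the
names (table_t, walk_cache, map_page, lookup) and the load counter are all
made up for illustration, and a real version would also need to invalidate
the cache on context switch and when page tables are freed.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed toy geometry: 4K pages, four levels, 9 index bits each. */
#define PAGE_SHIFT    12
#define LEVEL_BITS    9
#define LEVEL_ENTRIES (1u << LEVEL_BITS)
#define PMD_SHIFT     (PAGE_SHIFT + LEVEL_BITS) /* one PTE page spans 2 MiB */

typedef struct table { void *slot[LEVEL_ENTRIES]; } table_t;

/* One-entry cache: the last PTE page found and the address range it covers. */
struct walk_cache {
    uint64_t tag;       /* ea >> PMD_SHIFT of the cached entry */
    table_t *pte_page;
    int valid;
};

static unsigned long loads; /* table loads performed, for comparison */

/* Index into a table; level 0 = PTE page, 1 = PMD, 2 = PUD, 3 = PGD. */
static unsigned idx(uint64_t ea, int level)
{
    return (ea >> (PAGE_SHIFT + level * LEVEL_BITS)) & (LEVEL_ENTRIES - 1);
}

/* Install pte for ea, allocating intermediate tables as needed. */
static int map_page(table_t *pgd, uint64_t ea, void *pte)
{
    table_t *t = pgd;
    for (int level = 3; level >= 1; level--) {
        unsigned i = idx(ea, level);
        if (!t->slot[i]) {
            t->slot[i] = calloc(1, sizeof(table_t));
            if (!t->slot[i])
                return -1;
        }
        t = t->slot[i];
    }
    t->slot[idx(ea, 0)] = pte;
    return 0;
}

/* Look up ea: one compare on a cache hit, a full walk on a cache miss. */
static void *lookup(table_t *pgd, struct walk_cache *c, uint64_t ea)
{
    table_t *t;

    if (c->valid && c->tag == (ea >> PMD_SHIFT)) {
        t = c->pte_page;               /* hit: skip the upper three loads */
    } else {
        t = pgd;
        for (int level = 3; level >= 1 && t; level--) {
            loads++;
            t = t->slot[idx(ea, level)];
        }
        if (!t)
            return NULL;               /* unmapped: leave the cache alone */
        c->tag = ea >> PMD_SHIFT;
        c->pte_page = t;
        c->valid = 1;
    }
    loads++;                           /* the PTE load itself */
    return t->slot[idx(ea, 0)];
}
```

In this model a hit costs a compare plus the PTE load, while a miss costs
the compare on top of the full walk, which is the slowdown described above
for accesses that keep moving between 2 MiB regions.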
> > Maybe in a SPRG or a known cache-hot location like
> > the PACA, in a line that we already load anyway?
> 
> A cache access is faster than a SPRG access on our chips (plus we
> don't have many to spare, especially if we want to avoid swapping SPRG4-7 on
> guest entry/exit in KVM), so I'd favor putting it in the PACA.
> 
> I'll try this stuff out and see what helps.

Cool,

Cheers,
Ben.
