From: Peter Zijlstra <peterz@infradead.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>,
	Will Deacon <will@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@redhat.com, eranian@google.com, ak@linux.intel.com,
	dave.hansen@intel.com, kirill.shutemov@linux.intel.com,
	benh@kernel.crashing.org, paulus@samba.org,
	David Miller <davem@davemloft.net>,
	vbabka@suse.cz
Subject: Re: [PATCH V9 1/4] perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE
Date: Wed, 11 Nov 2020 21:00:00 +0100
Message-ID: <20201111200000.GL2628@hirez.programming.kicks-ass.net>
In-Reply-To: <20201111182620.GW17076@casper.infradead.org>

On Wed, Nov 11, 2020 at 06:26:20PM +0000, Matthew Wilcox wrote:
> On Wed, Nov 11, 2020 at 06:22:53PM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 11, 2020 at 04:38:48PM +0000, Matthew Wilcox wrote:
> > > 	if (pud_leaf(pud))
> > > 		return PUD_SIZE;
> > 
> > But that doesn't handle non-pagetable aligned hugetlb sizes. Granted,
> > that's unlikely at the PUD level, but why be inconsistent..
> > 
> > So we really want:
> > 
> > 	if (p*d_leaf(p*d)) {
> > 		if (!'special') {
> > 			page = p*d_page(p*d);
> > 			if (PageHuge(page))
> > 				return page_size(compound_head(page));
> > 		}
> > 		return P*D_SIZE;
> > 	}
> 
> Still doesn't work because pages can be mapped at funny offsets.

Wait, what?! Is there hardware that has unaligned TLB page-sizes?

Can you start a 64K page at an 8K offset? I don't think I've ever seen
that. Still, even with that, how would the above go wrong there? It would
find the compound page covering @addr, PageHuge() (and possibly some
additional arch-specific condition) returns true, and we get the compound
size, which gives the hardware page size used.
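
To make the alignment assumption behind that question concrete, here is a
minimal userspace sketch. It assumes hardware page sizes are powers of two
and naturally aligned; the function names are hypothetical and are not
kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if size is a non-zero power of two. */
bool is_pow2(uint64_t size)
{
	return size && (size & (size - 1)) == 0;
}

/* A naturally aligned mapping starts on a multiple of its own size. */
bool naturally_aligned(uint64_t start, uint64_t size)
{
	assert(is_pow2(size));
	return (start & (size - 1)) == 0;
}

/* Base of the naturally aligned, size-byte page that covers addr. */
uint64_t covering_base(uint64_t addr, uint64_t size)
{
	assert(is_pow2(size));
	return addr & ~(size - 1);
}
```

Under that assumption, naturally_aligned(0x2000, 0x10000) is false: a 64K
page cannot start at an 8K offset.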

> What we really want is for a weak definition of
> 
> unsigned long tlb_size(struct mm_struct *mm, unsigned long addr)
> {
> 	if (p*d_leaf(p*d))
> 		return p*d_size(p*d);
> }
> 
> then ARM can look at its special bit in the page table to determine
> whether this is a singleton or part of a brace of pages.

That's basically what we provide, but really the only thing that's
missing from this generic page walker is the ability to detect whether a
!PageHuge compound page is actually still a hardware page.
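
As a rough userspace model of the p*d_leaf() pseudocode quoted earlier, and
of the gap just described, something like the following. The struct and
function are toy stand-ins invented for illustration, not kernel types or
helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a leaf page-table entry; not a kernel type. */
struct toy_leaf {
	bool special;              /* no struct page behind the mapping */
	bool page_huge;            /* models PageHuge(page) */
	unsigned long compound_sz; /* models page_size(compound_head(page)) */
};

/*
 * Mirrors the quoted pseudocode: for a leaf entry, prefer the hugetlb
 * compound size; otherwise fall back to the size of the level.
 */
unsigned long toy_leaf_size(const struct toy_leaf *e, unsigned long level_size)
{
	assert(e != NULL);
	if (!e->special && e->page_huge)
		return e->compound_sz;
	return level_size;
}
```

A !PageHuge compound page (page_huge false, compound_sz larger than the
base page) falls through to level_size here. That is exactly the case the
walker cannot currently tell apart from a real hardware page.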

> > Now, when you add !PMD THP sizes (presumably for architectures that have
> > 'funny' sizes, otherwise what's the point), then you get to add '||
> 
> This is the problem with all the huge page support in Linux today.
> It's written by people who work for hardware companies who think only
> about exploiting the hardware features they sell.  You all ignore the
> very real software overheads of trying to manage millions of pages.
> I see a 6% reduction in kernel overhead when running kernbench using
> THPs that may go as large as 256kB.  On x86.  Intel x86, at that.

That's a really nice improvement. However, this code doesn't care about
that. Please make it possible to distinguish between THP on hardware
pages vs software pages.
