[02/25] mm: Introduce mm_fault_accounting()
diff mbox series

Message ID 20200615221607.7764-3-peterx@redhat.com
State New
Headers show
Series
  • mm: Page fault accounting cleanups
Related show

Commit Message

Peter Xu June 15, 2020, 10:15 p.m. UTC
Provide this helper for doing memory page fault accounting across archs.  It
can be defined unconditionally because perf_sw_event() is always defined, and
perf_sw_event() will be a no-op if !CONFIG_PERF_EVENTS.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/mm.h |  2 ++
 mm/memory.c        | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+)

Comments

Linus Torvalds June 15, 2020, 10:32 p.m. UTC | #1
On Mon, Jun 15, 2020 at 3:16 PM Peter Xu <peterx@redhat.com> wrote:
>
> Provide this helper for doing memory page fault accounting across archs.  It
> can be defined unconditionally because perf_sw_event() is always defined, and
> perf_sw_event() will be a no-op if !CONFIG_PERF_EVENTS.

Well, the downside is that now it forces a separate I$ miss and all
those extra arguments because it's a out-of-line function and the
compiler won't see that they all go away.

Yeah, maybe some day maybe we'll have LTO and these kinds of things
will not matter. And maybe they already don't. But it seems kind of
sad to basically force non-optimal code generation from this series.

Why would you export the symbol, btw? Page fault handling is never a module.

              Linus
Peter Xu June 15, 2020, 11:19 p.m. UTC | #2
On Mon, Jun 15, 2020 at 03:32:40PM -0700, Linus Torvalds wrote:
> On Mon, Jun 15, 2020 at 3:16 PM Peter Xu <peterx@redhat.com> wrote:
> >
> > Provide this helper for doing memory page fault accounting across archs.  It
> > can be defined unconditionally because perf_sw_event() is always defined, and
> > perf_sw_event() will be a no-op if !CONFIG_PERF_EVENTS.
> 
> Well, the downside is that now it forces a separate I$ miss and all
> those extra arguments because it's a out-of-line function and the
> compiler won't see that they all go away.
> 
> Yeah, maybe some day maybe we'll have LTO and these kinds of things
> will not matter. And maybe they already don't. But it seems kind of
> sad to basically force non-optimal code generation from this series.

I tried to make it static inline firstly in linux/mm.h, however it'll need to
have linux/mm.h include linux/perf_event.h which seems to have created a loop
dependency of headers.  I verified current code will at least generate inlined
functions too for x86 (no mm_fault_accounting() in "objdump -t vmlinux") with
gcc10.

Another alternative is to make it a macro, it's just that I feel the function
definition is a bit cleaner.  Any further suggestions welcomed too.

> 
> Why would you export the symbol, btw? Page fault handling is never a module.

I followed handle_mm_fault() which is exported too, since potentially
mm_fault_accounting() should always be called in the same context of
handle_mm_fault().  Or do you prefer me to drop it?

Thanks,
Andrew Morton June 16, 2020, 7 p.m. UTC | #3
On Mon, 15 Jun 2020 19:19:17 -0400 Peter Xu <peterx@redhat.com> wrote:

> On Mon, Jun 15, 2020 at 03:32:40PM -0700, Linus Torvalds wrote:
> > On Mon, Jun 15, 2020 at 3:16 PM Peter Xu <peterx@redhat.com> wrote:
> > >
> > > Provide this helper for doing memory page fault accounting across archs.  It
> > > can be defined unconditionally because perf_sw_event() is always defined, and
> > > perf_sw_event() will be a no-op if !CONFIG_PERF_EVENTS.
> > 
> > Well, the downside is that now it forces a separate I$ miss and all
> > those extra arguments because it's a out-of-line function and the
> > compiler won't see that they all go away.
> > 
> > Yeah, maybe some day maybe we'll have LTO and these kinds of things
> > will not matter. And maybe they already don't. But it seems kind of
> > sad to basically force non-optimal code generation from this series.
> 
> I tried to make it static inline firstly in linux/mm.h, however it'll need to
> have linux/mm.h include linux/perf_event.h which seems to have created a loop
> dependency of headers.  I verified current code will at least generate inlined
> functions too for x86 (no mm_fault_accounting() in "objdump -t vmlinux") with
> gcc10.
> 
> Another alternative is to make it a macro, it's just that I feel the function
> definition is a bit cleaner.  Any further suggestions welcomed too.

Could create a new header file mm_fault.h which includes mm.h and
perf_event.h.  A later cleanup could move other fault-related things
into that header and add the appropriate inclusions into files which
use these things.

btw, I think mm_account_fault() might be a better name for this function.

And some (kerneldoc) documentation would be nice.  Although this
function is pretty self-evident.

> > 
> > Why would you export the symbol, btw? Page fault handling is never a module.
> 
> I followed handle_mm_fault() which is exported too, since potentially
> mm_fault_accounting() should always be called in the same context of
> handle_mm_fault().  Or do you prefer me to drop it?

Let's not add an unneeded export.  If someone for some reason needs it
later, it can be added then.
Peter Xu June 17, 2020, 4:26 p.m. UTC | #4
On Tue, Jun 16, 2020 at 12:00:37PM -0700, Andrew Morton wrote:
> On Mon, 15 Jun 2020 19:19:17 -0400 Peter Xu <peterx@redhat.com> wrote:
> 
> > On Mon, Jun 15, 2020 at 03:32:40PM -0700, Linus Torvalds wrote:
> > > On Mon, Jun 15, 2020 at 3:16 PM Peter Xu <peterx@redhat.com> wrote:
> > > >
> > > > Provide this helper for doing memory page fault accounting across archs.  It
> > > > can be defined unconditionally because perf_sw_event() is always defined, and
> > > > perf_sw_event() will be a no-op if !CONFIG_PERF_EVENTS.
> > > 
> > > Well, the downside is that now it forces a separate I$ miss and all
> > > those extra arguments because it's a out-of-line function and the
> > > compiler won't see that they all go away.
> > > 
> > > Yeah, maybe some day maybe we'll have LTO and these kinds of things
> > > will not matter. And maybe they already don't. But it seems kind of
> > > sad to basically force non-optimal code generation from this series.
> > 
> > I tried to make it static inline firstly in linux/mm.h, however it'll need to
> > have linux/mm.h include linux/perf_event.h which seems to have created a loop
> > dependency of headers.  I verified current code will at least generate inlined
> > functions too for x86 (no mm_fault_accounting() in "objdump -t vmlinux") with
> > gcc10.
> > 
> > Another alternative is to make it a macro, it's just that I feel the function
> > definition is a bit cleaner.  Any further suggestions welcomed too.
> 
> Could create a new header file mm_fault.h which includes mm.h and
> perf_event.h.  A later cleanup could move other fault-related things
> into that header and add the appropriate inclusions into files which
> use these things.
> 
> btw, I think mm_account_fault() might be a better name for this function.
> 
> And some (kerneldoc) documentation would be nice.  Although this
> function is pretty self-evident.
> 
> > > 
> > > Why would you export the symbol, btw? Page fault handling is never a module.
> > 
> > I followed handle_mm_fault() which is exported too, since potentially
> > mm_fault_accounting() should always be called in the same context of
> > handle_mm_fault().  Or do you prefer me to drop it?
> 
> Let's not add an unneeded export.  If someone for some reason needs it
> later, it can be added then.

I plan to take the approach that Linus suggested, probably with
mm_account_fault() declared as static inline in memory.c.  I'll remember to add
some kerneldoc too.

Thanks!

Patch
diff mbox series

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f3fe7371855c..b96db64125be 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1649,6 +1649,8 @@  void truncate_pagecache_range(struct inode *inode, loff_t offset, loff_t end);
 int truncate_inode_page(struct address_space *mapping, struct page *page);
 int generic_error_remove_page(struct address_space *mapping, struct page *page);
 int invalidate_inode_page(struct page *page);
+void mm_fault_accounting(struct task_struct *task, struct pt_regs *regs,
+			 unsigned long address, bool major);
 
 #ifdef CONFIG_MMU
 extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
diff --git a/mm/memory.c b/mm/memory.c
index f703fe8c8346..c4f8319f0ed8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -71,6 +71,8 @@ 
 #include <linux/dax.h>
 #include <linux/oom.h>
 #include <linux/numa.h>
+#include <linux/perf_event.h>
+#include <linux/ptrace.h>
 
 #include <trace/events/kmem.h>
 
@@ -4397,6 +4399,22 @@  vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 }
 EXPORT_SYMBOL_GPL(handle_mm_fault);
 
+void mm_fault_accounting(struct task_struct *task, struct pt_regs *regs,
+			 unsigned long address, bool major)
+{
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+	if (major) {
+		task->maj_flt++;
+		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
+			      regs, address);
+	} else {
+		task->min_flt++;
+		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
+			      regs, address);
+	}
+}
+EXPORT_SYMBOL_GPL(mm_fault_accounting);
+
 #ifndef __PAGETABLE_P4D_FOLDED
 /*
  * Allocate p4d page table.