linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 0/10] Replace _PAGE_NUMA with PAGE_NONE protections v5
@ 2015-01-05 10:54 Mel Gorman
  2015-01-05 10:54 ` [PATCH 01/10] mm: numa: Do not dereference pmd outside of the lock during NUMA hinting fault Mel Gorman
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Mel Gorman @ 2015-01-05 10:54 UTC
  To: Andrew Morton
  Cc: Rik van Riel, Hugh Dickins, Linux Kernel, Linux-MM, Ingo Molnar,
	Aneesh Kumar, Sasha Levin, LinuxPPC-dev, Kirill Shutemov,
	Mel Gorman

Changelog since V4
o Rebase to 3.19-rc2						(mel)

Changelog since V3
o Minor comment update						(benh)
o Add acked-bys

Changelog since V2
o Rename *_protnone_numa to _protnone and extend docs		(linus)
o Rebase to mmotm-20141119 for pre-merge testing		(mel)
o Convert WARN_ON to VM_WARN_ON					(aneesh)

Changelog since V1
o ppc64 paranoia checks and clarifications			(aneesh)
o Fix trinity regression (hopefully)
o Reduce unnecessary TLB flushes				(mel)

Automatic NUMA balancing depends on protecting PTEs to trap a fault and
gather reference locality information. Very broadly speaking it marks PTEs
as not present and uses another bit to distinguish between NUMA hinting
faults and other types of faults. This approach is not universally loved:
it ultimately resulted in the available swap space shrinking and has had
a number of problems with Xen support. This series is very heavily based
on patches
from Linus and Aneesh to replace the existing PTE/PMD NUMA helper functions
with normal change protections that should be less problematic. This was
tested on a few different workloads that showed automatic NUMA balancing
was still active with mostly comparable results.
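
To illustrate the direction (a sketch in the spirit of the x86 helpers
added by patch 02, not a verbatim quote of the series): a PTE protected
for NUMA hinting now carries the PROT_NONE protection bits, so detection
is a check for the protnone encoding rather than a dedicated _PAGE_NUMA
bit:

	/* Illustrative x86-style helper; see the patch for the real one */
	static inline int pte_protnone(pte_t pte)
	{
		/*
		 * _PAGE_PROTNONE set with _PAGE_PRESENT clear traps a
		 * fault while the entry remains a valid mapping as far
		 * as the kernel is concerned.
		 */
		return (pte_flags(pte) & (_PAGE_PROTNONE | _PAGE_PRESENT))
			== _PAGE_PROTNONE;
	}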

specjbb single JVM: There was negligible performance difference in the
	benchmark itself for short runs. However, system activity is
	higher and interrupts are much higher over time -- possibly TLB
	flushes. Migrations are also higher. Overall, this is more overhead
	but considering the problems faced with the old approach I think
	we just have to suck it up and find another way of reducing the
	overhead.

specjbb multi JVM: Negligible performance difference to the actual benchmark
	but like the single JVM case, the system overhead is noticeably
	higher.  Again, interrupts are a major factor.

autonumabench: This was all over the place and about all that can be
	reasonably concluded is that it's different but not necessarily
	better or worse.

autonumabench
                                          3.19.0-rc2            3.19.0-rc2
                                             vanilla         protnone-v5r1
Time System-NUMA01                  268.99 (  0.00%)     1350.70 (-402.14%)
Time System-NUMA01_THEADLOCAL       110.14 (  0.00%)       50.68 ( 53.99%)
Time System-NUMA02                   20.14 (  0.00%)       31.12 (-54.52%)
Time System-NUMA02_SMT                7.40 (  0.00%)        6.57 ( 11.22%)
Time Elapsed-NUMA01                 687.57 (  0.00%)      528.51 ( 23.13%)
Time Elapsed-NUMA01_THEADLOCAL      540.29 (  0.00%)      554.36 ( -2.60%)
Time Elapsed-NUMA02                  84.98 (  0.00%)       78.87 (  7.19%)
Time Elapsed-NUMA02_SMT              77.32 (  0.00%)       87.07 (-12.61%)

System CPU usage of NUMA01 is worse but it's an adverse workload on this
machine so I'm reluctant to conclude that it's a problem that matters.
Overall time to complete the benchmark is comparable.

          3.19.0-rc2  3.19.0-rc2
             vanilla protnone-v5r1
User        58100.89    48351.17
System        407.74     1439.22
Elapsed      1411.44     1250.55


NUMA alloc hit                 5398081     5536696
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local               5398073     5536668
NUMA base PTE updates        622722221   442576477
NUMA huge PMD updates          1215268      863690
NUMA page range updates     1244939437   884785757
NUMA hint faults               1696858     1221541
NUMA hint local faults         1046842      791219
NUMA hint local percent             61          64
NUMA pages migrated            6044430    59291698

The NUMA pages migrated figure looks terrible but when I looked at a
graph of the activity over time I saw that the massive spike in
migration activity was during NUMA01. This correlates with high system
CPU usage and could simply be down to bad luck, but any modifications
that affect that workload would be
related to scan rates and migrations, not the protection mechanism. For
all other workloads, migration activity was comparable.

Overall, headline performance figures are comparable but the overhead
is higher, mostly in interrupts. To some extent, higher overhead from
this approach was anticipated but not to this degree. It's going to be
necessary to reduce this again with a separate series in the future. It's
still worth going ahead with this series though as it's likely to avoid
constant headaches with Xen and is probably easier to maintain.

 arch/powerpc/include/asm/pgtable.h    |  54 ++----------
 arch/powerpc/include/asm/pte-common.h |   5 --
 arch/powerpc/include/asm/pte-hash64.h |   6 --
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   2 +-
 arch/powerpc/mm/copro_fault.c         |   8 +-
 arch/powerpc/mm/fault.c               |  25 ++----
 arch/powerpc/mm/pgtable.c             |  11 ++-
 arch/powerpc/mm/pgtable_64.c          |   3 +-
 arch/x86/include/asm/pgtable.h        |  46 +++++-----
 arch/x86/include/asm/pgtable_64.h     |   5 --
 arch/x86/include/asm/pgtable_types.h  |  41 +--------
 arch/x86/mm/gup.c                     |   4 +-
 include/asm-generic/pgtable.h         | 153 ++--------------------------------
 include/linux/migrate.h               |   4 -
 include/linux/swapops.h               |   2 +-
 include/uapi/linux/mempolicy.h        |   2 +-
 mm/gup.c                              |  10 +--
 mm/huge_memory.c                      |  50 ++++++-----
 mm/memory.c                           |  18 ++--
 mm/mempolicy.c                        |   2 +-
 mm/migrate.c                          |   8 +-
 mm/mprotect.c                         |  48 +++++------
 mm/pgtable-generic.c                  |   2 -
 23 files changed, 135 insertions(+), 374 deletions(-)

-- 
2.1.2

* [PATCH 0/10] Replace _PAGE_NUMA with PAGE_NONE protections v4
@ 2014-12-04 11:24 Mel Gorman
  2014-12-04 11:24 ` [PATCH 04/10] ppc64: Add paranoid warnings for unexpected DSISR_PROTFAULT Mel Gorman
  0 siblings, 1 reply; 14+ messages in thread
From: Mel Gorman @ 2014-12-04 11:24 UTC
  To: Andrew Morton
  Cc: Rik van Riel, LinuxPPC-dev, Hugh Dickins, Linux Kernel, Linux-MM,
	Ingo Molnar, Paul Mackerras, Aneesh Kumar, Sasha Levin,
	Dave Jones, Linus Torvalds, Kirill Shutemov, Mel Gorman

There are no functional changes here and I kept the mmotm-20141119 baseline
as that is what got tested but it rebases cleanly to current mmotm. The
series makes architectural changes but splitting this on a per-arch basis
would cause bisect-related brain damage. I'm hoping this can go through
Andrew without conflict. It's been tested by myself (standard tests),
Aneesh (ppc64) and Sasha (trinity) so there is some degree of confidence
that it's ok.

Changelog since V3
o Minor comment update						(benh)
o Add acked-bys

Changelog since V2
o Rename *_protnone_numa to _protnone and extend docs		(linus)
o Rebase to mmotm-20141119 for pre-merge testing		(mel)
o Convert WARN_ON to VM_WARN_ON					(aneesh)

Changelog since V1
o ppc64 paranoia checks and clarifications			(aneesh)
o Fix trinity regression (hopefully)
o Reduce unnecessary TLB flushes				(mel)

Automatic NUMA balancing depends on being able to protect PTEs to trap a
fault and gather reference locality information. Very broadly speaking it
would mark PTEs as not present and use another bit to distinguish between
NUMA hinting faults and other types of faults. It was universally loved
by everybody and caused no problems whatsoever. That last sentence might
be a lie.

This series is very heavily based on patches from Linus and Aneesh to
replace the existing PTE/PMD NUMA helper functions with normal change
protections. I did alter and add parts of it but I consider them relatively
minor contributions. At their suggestion, acked-bys are in there but I've
no problem converting them to Signed-off-by if requested.
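
For a picture of where this lands, the NUMA hinting scanner ends up
applying PAGE_NONE through the regular protection-change path. A sketch
of change_prot_numa() after the series (treat the exact signature and
location as illustrative):

	unsigned long change_prot_numa(struct vm_area_struct *vma,
			unsigned long addr, unsigned long end)
	{
		int nr_updated;

		/* PAGE_NONE with prot_numa set marks for hinting faults */
		nr_updated = change_protection(vma, addr, end, PAGE_NONE, 0, 1);
		if (nr_updated)
			count_vm_numa_events(NUMA_PTE_UPDATES, nr_updated);

		return nr_updated;
	}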

As noted above, Aneesh has covered the ppc64 testing. I tested trinity
under kvm-tool, which passed, and ran a few other basic tests. At the
time of writing, only the short-lived tests have completed
but testing of V2 indicated that long-term testing had no surprises. In
most cases I'm leaving out detail as it's not that interesting.

specjbb single JVM: There was negligible performance difference in the
	benchmark itself for short runs. However, system activity is
	higher and interrupts are much higher over time -- possibly TLB
	flushes. Migrations are also higher. Overall, this is more overhead
	but considering the problems faced with the old approach I think
	we just have to suck it up and find another way of reducing the
	overhead.

specjbb multi JVM: Negligible performance difference to the actual benchmark
	but like the single JVM case, the system overhead is noticeably
	higher.  Again, interrupts are a major factor.

autonumabench: This was all over the place and about all that can be
	reasonably concluded is that it's different but not necessarily
	better or worse.

autonumabench
                                     3.18.0-rc5            3.18.0-rc5
                                 mmotm-20141119         protnone-v3r3
User    NUMA01               32380.24 (  0.00%)    21642.92 ( 33.16%)
User    NUMA01_THEADLOCAL    22481.02 (  0.00%)    22283.22 (  0.88%)
User    NUMA02                3137.00 (  0.00%)     3116.54 (  0.65%)
User    NUMA02_SMT            1614.03 (  0.00%)     1543.53 (  4.37%)
System  NUMA01                 322.97 (  0.00%)     1465.89 (-353.88%)
System  NUMA01_THEADLOCAL       91.87 (  0.00%)       49.32 ( 46.32%)
System  NUMA02                  37.83 (  0.00%)       14.61 ( 61.38%)
System  NUMA02_SMT               7.36 (  0.00%)        7.45 ( -1.22%)
Elapsed NUMA01                 716.63 (  0.00%)      599.29 ( 16.37%)
Elapsed NUMA01_THEADLOCAL      553.98 (  0.00%)      539.94 (  2.53%)
Elapsed NUMA02                  83.85 (  0.00%)       83.04 (  0.97%)
Elapsed NUMA02_SMT              86.57 (  0.00%)       79.15 (  8.57%)
CPU     NUMA01                4563.00 (  0.00%)     3855.00 ( 15.52%)
CPU     NUMA01_THEADLOCAL     4074.00 (  0.00%)     4136.00 ( -1.52%)
CPU     NUMA02                3785.00 (  0.00%)     3770.00 (  0.40%)
CPU     NUMA02_SMT            1872.00 (  0.00%)     1959.00 ( -4.65%)

System CPU usage of NUMA01 is worse but it's an adverse workload on this
machine so I'm reluctant to conclude that it's a problem that matters. On
the other workloads that are sensible on this machine, system CPU usage
is great.  Overall time to complete the benchmark is comparable.

          3.18.0-rc5  3.18.0-rc5
      mmotm-20141119 protnone-v3r3
User        59612.50    48586.44
System        460.22     1537.45
Elapsed      1442.20     1304.29

NUMA alloc hit                 5075182     5743353
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local               5075174     5743339
NUMA base PTE updates        637061448   443106883
NUMA huge PMD updates          1243434      864747
NUMA page range updates     1273699656   885857347
NUMA hint faults               1658116     1214277
NUMA hint local faults          959487      754113
NUMA hint local percent             57          62
NUMA pages migrated            5467056    61676398

The NUMA pages migrated figure looks terrible but when I looked at a
graph of the activity over time I saw that the massive spike in
migration activity was during NUMA01. This correlates with high system
CPU usage and could simply be down to bad luck, but any modifications
that affect that workload would be
related to scan rates and migrations, not the protection mechanism. For
all other workloads, migration activity was comparable.

Overall, headline performance figures are comparable but the overhead
is higher, mostly in interrupts. To some extent, higher overhead from
this approach was anticipated but not to this degree. It's going to be
necessary to reduce this again with a separate series in the future. It's
still worth going ahead with this series though as it's likely to avoid
constant headaches with Xen and is probably easier to maintain.

 arch/powerpc/include/asm/pgtable.h    |  54 ++----------
 arch/powerpc/include/asm/pte-common.h |   5 --
 arch/powerpc/include/asm/pte-hash64.h |   6 --
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   2 +-
 arch/powerpc/mm/copro_fault.c         |   8 +-
 arch/powerpc/mm/fault.c               |  25 ++----
 arch/powerpc/mm/pgtable.c             |  11 ++-
 arch/powerpc/mm/pgtable_64.c          |   3 +-
 arch/x86/include/asm/pgtable.h        |  46 +++++-----
 arch/x86/include/asm/pgtable_64.h     |   5 --
 arch/x86/include/asm/pgtable_types.h  |  41 +--------
 arch/x86/mm/gup.c                     |   4 +-
 include/asm-generic/pgtable.h         | 153 ++--------------------------------
 include/linux/migrate.h               |   4 -
 include/linux/swapops.h               |   2 +-
 include/uapi/linux/mempolicy.h        |   2 +-
 mm/gup.c                              |  10 +--
 mm/huge_memory.c                      |  50 ++++++-----
 mm/memory.c                           |  18 ++--
 mm/mempolicy.c                        |   2 +-
 mm/migrate.c                          |   8 +-
 mm/mprotect.c                         |  48 +++++------
 mm/pgtable-generic.c                  |   2 -
 23 files changed, 135 insertions(+), 374 deletions(-)

-- 
2.1.2

* [PATCH 0/10] Replace _PAGE_NUMA with PAGE_NONE protections v3
@ 2014-11-21 13:57 Mel Gorman
  2014-11-21 13:57 ` [PATCH 04/10] ppc64: Add paranoid warnings for unexpected DSISR_PROTFAULT Mel Gorman
  0 siblings, 1 reply; 14+ messages in thread
From: Mel Gorman @ 2014-11-21 13:57 UTC
  To: Linux Kernel, Linux-MM, LinuxPPC-dev
  Cc: Rik van Riel, Hugh Dickins, Ingo Molnar, Paul Mackerras,
	Aneesh Kumar, Sasha Levin, Dave Jones, Linus Torvalds,
	Kirill Shutemov, Mel Gorman

The main change here is to rebase on mmotm-20141119 as the series had
significant conflicts that were non-obvious to resolve. The main blockers
for merging are independent testing from Sasha (trinity), independent
testing from Aneesh (ppc64 support) and acks from Ben and Paul on the
powerpc patches.

Changelog since V2
o Rename *_protnone_numa to _protnone and extend docs		(linus)
o Rebase to mmotm-20141119 for pre-merge testing		(mel)
o Convert WARN_ON to VM_WARN_ON					(aneesh)

Changelog since V1
o ppc64 paranoia checks and clarifications			(aneesh)
o Fix trinity regression (hopefully)
o Reduce unnecessary TLB flushes				(mel)

Automatic NUMA balancing depends on being able to protect PTEs to trap a
fault and gather reference locality information. Very broadly speaking it
would mark PTEs as not present and use another bit to distinguish between
NUMA hinting faults and other types of faults. It was universally loved
by everybody and caused no problems whatsoever. That last sentence might
be a lie.

This series is very heavily based on patches from Linus and Aneesh to
replace the existing PTE/PMD NUMA helper functions with normal change
protections. I did alter and add parts of it but I consider them relatively
minor contributions. At their suggestion, acked-bys are in there but I've
no problem converting them to Signed-off-by if requested.
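
On the fault side, the dispatch keys off the protnone encoding instead
of the old pte_numa() check. A sketch of the relevant handle_pte_fault()
hunk (illustrative; exact signatures as per the patches):

	if (pte_protnone(entry))
		return do_numa_page(mm, vma, address, entry, pte, pmd);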

AFAIK, this has received no testing on ppc64 and I'm depending on Aneesh for
that. I tested trinity under kvm-tool, which passed, and ran a few other basic
tests. At the time of writing, only the short-lived tests have completed
but testing of V2 indicated that long-term testing had no surprises. In
most cases I'm leaving out detail as it's not that interesting.

specjbb single JVM: There was negligible performance difference in the
	benchmark itself for short runs. However, system activity is
	higher and interrupts are much higher over time -- possibly TLB
	flushes. Migrations are also higher. Overall, this is more overhead
	but considering the problems faced with the old approach I think
	we just have to suck it up and find another way of reducing the
	overhead.

specjbb multi JVM: Negligible performance difference to the actual benchmark
	but like the single JVM case, the system overhead is noticeably
	higher.  Again, interrupts are a major factor.

autonumabench: This was all over the place and about all that can be
	reasonably concluded is that it's different but not necessarily
	better or worse.

autonumabench
                                     3.18.0-rc5            3.18.0-rc5
                                 mmotm-20141119         protnone-v3r3
User    NUMA01               32380.24 (  0.00%)    21642.92 ( 33.16%)
User    NUMA01_THEADLOCAL    22481.02 (  0.00%)    22283.22 (  0.88%)
User    NUMA02                3137.00 (  0.00%)     3116.54 (  0.65%)
User    NUMA02_SMT            1614.03 (  0.00%)     1543.53 (  4.37%)
System  NUMA01                 322.97 (  0.00%)     1465.89 (-353.88%)
System  NUMA01_THEADLOCAL       91.87 (  0.00%)       49.32 ( 46.32%)
System  NUMA02                  37.83 (  0.00%)       14.61 ( 61.38%)
System  NUMA02_SMT               7.36 (  0.00%)        7.45 ( -1.22%)
Elapsed NUMA01                 716.63 (  0.00%)      599.29 ( 16.37%)
Elapsed NUMA01_THEADLOCAL      553.98 (  0.00%)      539.94 (  2.53%)
Elapsed NUMA02                  83.85 (  0.00%)       83.04 (  0.97%)
Elapsed NUMA02_SMT              86.57 (  0.00%)       79.15 (  8.57%)
CPU     NUMA01                4563.00 (  0.00%)     3855.00 ( 15.52%)
CPU     NUMA01_THEADLOCAL     4074.00 (  0.00%)     4136.00 ( -1.52%)
CPU     NUMA02                3785.00 (  0.00%)     3770.00 (  0.40%)
CPU     NUMA02_SMT            1872.00 (  0.00%)     1959.00 ( -4.65%)

System CPU usage of NUMA01 is worse but it's an adverse workload on this
machine so I'm reluctant to conclude that it's a problem that matters. On
the other workloads that are sensible on this machine, system CPU usage
is great.  Overall time to complete the benchmark is comparable.

          3.18.0-rc5  3.18.0-rc5
      mmotm-20141119 protnone-v3r3
User        59612.50    48586.44
System        460.22     1537.45
Elapsed      1442.20     1304.29

NUMA alloc hit                 5075182     5743353
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local               5075174     5743339
NUMA base PTE updates        637061448   443106883
NUMA huge PMD updates          1243434      864747
NUMA page range updates     1273699656   885857347
NUMA hint faults               1658116     1214277
NUMA hint local faults          959487      754113
NUMA hint local percent             57          62
NUMA pages migrated            5467056    61676398

The NUMA pages migrated figure looks terrible but when I looked at a
graph of the activity over time I saw that the massive spike in
migration activity was during NUMA01. This correlates with high system
CPU usage and could simply be down to bad luck, but any modifications
that affect that workload would be
related to scan rates and migrations, not the protection mechanism. For
all other workloads, migration activity was comparable.

Overall, headline performance figures are comparable but the overhead
is higher, mostly in interrupts. To some extent, higher overhead from
this approach was anticipated but not to this degree. It's going to be
necessary to reduce this again with a separate series in the future. It's
still worth going ahead with this series though as it's likely to avoid
constant headaches with Xen and is probably easier to maintain.

 arch/powerpc/include/asm/pgtable.h    |  53 ++----------
 arch/powerpc/include/asm/pte-common.h |   5 --
 arch/powerpc/include/asm/pte-hash64.h |   6 --
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   2 +-
 arch/powerpc/mm/copro_fault.c         |   8 +-
 arch/powerpc/mm/fault.c               |  25 ++----
 arch/powerpc/mm/pgtable.c             |  11 ++-
 arch/powerpc/mm/pgtable_64.c          |   3 +-
 arch/x86/include/asm/pgtable.h        |  46 +++++-----
 arch/x86/include/asm/pgtable_64.h     |   5 --
 arch/x86/include/asm/pgtable_types.h  |  41 +--------
 arch/x86/mm/gup.c                     |   4 +-
 include/asm-generic/pgtable.h         | 153 ++--------------------------------
 include/linux/migrate.h               |   4 -
 include/linux/swapops.h               |   2 +-
 include/uapi/linux/mempolicy.h        |   2 +-
 mm/gup.c                              |  10 +--
 mm/huge_memory.c                      |  50 ++++++-----
 mm/memory.c                           |  18 ++--
 mm/mempolicy.c                        |   2 +-
 mm/migrate.c                          |   8 +-
 mm/mprotect.c                         |  48 +++++------
 mm/pgtable-generic.c                  |   2 -
 23 files changed, 134 insertions(+), 374 deletions(-)

-- 
2.1.2

* [PATCH 0/10] Replace _PAGE_NUMA with PAGE_NONE protections v2
@ 2014-11-20 10:19 Mel Gorman
  2014-11-20 10:19 ` [PATCH 04/10] ppc64: Add paranoid warnings for unexpected DSISR_PROTFAULT Mel Gorman
  0 siblings, 1 reply; 14+ messages in thread
From: Mel Gorman @ 2014-11-20 10:19 UTC
  To: Linux Kernel
  Cc: Rik van Riel, Linus Torvalds, Hugh Dickins, Linux-MM,
	Ingo Molnar, Paul Mackerras, Aneesh Kumar, Sasha Levin,
	Dave Jones, LinuxPPC-dev, Kirill Shutemov, Mel Gorman

V1 failed very quickly while running under kvm-tools and a second report
indicated that it happens on bare metal as well. This version survived
an overnight run of trinity running under kvm-tools here but verification
from Sasha would be appreciated.

Changelog since V1
o ppc64 paranoia checks and clarifications			(aneesh)
o Fix trinity regression (hopefully)
o Reduce unnecessary TLB flushes				(mel)

Automatic NUMA balancing depends on being able to protect PTEs to trap a
fault and gather reference locality information. Very broadly speaking it
would mark PTEs as not present and use another bit to distinguish between
NUMA hinting faults and other types of faults. It was universally loved
by everybody and caused no problems whatsoever. That last sentence might
be a lie.
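
On the "reduce unnecessary TLB flushes" changelog item: an entry that is
already inaccessible cannot be cached in the TLB, so the hinting scanner
can skip PTEs that are already protected without paying for a flush.
A sketch of the check in change_pte_range(), using this version's
*_protnone_numa naming (illustrative):

	if (prot_numa) {
		/*
		 * Already marked for a hinting fault; rewriting the
		 * entry would only buy another TLB flush.
		 */
		if (pte_protnone_numa(oldpte))
			continue;
	}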

This series is very heavily based on patches from Linus and Aneesh to
replace the existing PTE/PMD NUMA helper functions with normal change
protections. I did alter and add parts of it but I consider them relatively
minor contributions. At their suggestion, acked-bys are in there but I've
no problem converting them to Signed-off-by if requested.

AFAIK, this has received no testing on ppc64 and I'm depending on Aneesh for
that. I tested trinity under kvm-tool, which passed, and ran a few other basic
tests. In most cases I'm leaving out detail as it's not that interesting.

specjbb single JVM: There was negligible performance difference in the
	benchmark itself for short and long runs. However, system activity
	is higher and interrupts are much higher over time -- possibly
	TLB flushes. Migrations are also higher. Overall, this is more
	overhead but considering the problems faced with the old approach
	I think we just have to suck it up and find another way of reducing
	the overhead.

specjbb multi JVM: Negligible performance difference to the actual benchmark
	but like the single JVM case, the system overhead is noticeably
	higher.  Again, interrupts are a major factor.

autonumabench: This was all over the place and about all that can be
	reasonably concluded is that it's different but not necessarily
	better or worse.

autonumabench
                                     3.18.0-rc4            3.18.0-rc4
                                        vanilla         protnone-v2r5
User    NUMA01               32806.01 (  0.00%)    20250.67 ( 38.27%)
User    NUMA01_THEADLOCAL    23910.28 (  0.00%)    22734.37 (  4.92%)
User    NUMA02                3176.85 (  0.00%)     3082.68 (  2.96%)
User    NUMA02_SMT            1600.06 (  0.00%)     1547.08 (  3.31%)
System  NUMA01                 719.07 (  0.00%)     1344.39 (-86.96%)
System  NUMA01_THEADLOCAL      916.26 (  0.00%)      180.90 ( 80.26%)
System  NUMA02                  20.92 (  0.00%)       17.34 ( 17.11%)
System  NUMA02_SMT               8.76 (  0.00%)        7.24 ( 17.35%)
Elapsed NUMA01                 728.27 (  0.00%)      519.28 ( 28.70%)
Elapsed NUMA01_THEADLOCAL      589.15 (  0.00%)      554.73 (  5.84%)
Elapsed NUMA02                  81.20 (  0.00%)       81.72 ( -0.64%)
Elapsed NUMA02_SMT              80.49 (  0.00%)       79.58 (  1.13%)
CPU     NUMA01                4603.00 (  0.00%)     4158.00 (  9.67%)
CPU     NUMA01_THEADLOCAL     4213.00 (  0.00%)     4130.00 (  1.97%)
CPU     NUMA02                3937.00 (  0.00%)     3793.00 (  3.66%)
CPU     NUMA02_SMT            1998.00 (  0.00%)     1952.00 (  2.30%)


System CPU usage of NUMA01 is worse but it's an adverse workload on this
machine so I'm reluctant to conclude that it's a problem that matters. On
the other workloads that are sensible on this machine, system CPU usage
is great.  Overall time to complete the benchmark is comparable.

          3.18.0-rc4  3.18.0-rc4
             vanilla protnone-v2r5
User        61493.38    47615.01
System       1665.17     1550.07
Elapsed      1480.79     1236.74

NUMA alloc hit                 4739774     5328362
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local               4664980     5328351
NUMA base PTE updates        556489407   444119981
NUMA huge PMD updates          1086000      866680
NUMA page range updates     1112521407   887860141
NUMA hint faults               1538964     1242142
NUMA hint local faults          835871      814313
NUMA hint local percent             54          65
NUMA pages migrated            7329212    59883854

The NUMA pages migrated figure looks terrible but when I looked at a
graph of the activity over time I saw that the massive spike in
migration activity was during NUMA01. This correlates with high system
CPU usage and could simply be down to bad luck, but any modifications
that affect that workload would be
related to scan rates and migrations, not the protection mechanism. For
all other workloads, migration activity was comparable.

Overall, headline performance figures are comparable but the overhead
is higher, mostly in interrupts. To some extent, higher overhead from
this approach was anticipated but not to this degree. It's going to be
necessary to reduce this again with a separate series in the future. It's
still worth going ahead with this series though as it's likely to avoid
constant headaches with Xen and is probably easier to maintain.

 arch/powerpc/include/asm/pgtable.h    |  53 ++----------
 arch/powerpc/include/asm/pte-common.h |   5 --
 arch/powerpc/include/asm/pte-hash64.h |   6 --
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   2 +-
 arch/powerpc/mm/copro_fault.c         |   8 +-
 arch/powerpc/mm/fault.c               |  25 ++----
 arch/powerpc/mm/gup.c                 |   4 +-
 arch/powerpc/mm/pgtable.c             |   8 +-
 arch/powerpc/mm/pgtable_64.c          |   3 +-
 arch/x86/include/asm/pgtable.h        |  46 +++++-----
 arch/x86/include/asm/pgtable_64.h     |   5 --
 arch/x86/include/asm/pgtable_types.h  |  41 +--------
 arch/x86/mm/gup.c                     |   4 +-
 include/asm-generic/pgtable.h         | 152 ++--------------------------------
 include/linux/migrate.h               |   4 -
 include/linux/swapops.h               |   2 +-
 include/uapi/linux/mempolicy.h        |   2 +-
 mm/gup.c                              |   8 +-
 mm/huge_memory.c                      |  50 ++++++-----
 mm/memory.c                           |  18 ++--
 mm/mempolicy.c                        |   2 +-
 mm/migrate.c                          |   8 +-
 mm/mprotect.c                         |  48 +++++------
 mm/pgtable-generic.c                  |   2 -
 24 files changed, 131 insertions(+), 375 deletions(-)

-- 
2.1.2



Thread overview: 14+ messages
2015-01-05 10:54 [PATCH 0/10] Replace _PAGE_NUMA with PAGE_NONE protections v5 Mel Gorman
2015-01-05 10:54 ` [PATCH 01/10] mm: numa: Do not dereference pmd outside of the lock during NUMA hinting fault Mel Gorman
2015-01-05 10:54 ` [PATCH 02/10] mm: Add p[te|md] protnone helpers for use by NUMA balancing Mel Gorman
2015-01-05 10:54 ` [PATCH 03/10] mm: Convert p[te|md]_numa users to p[te|md]_protnone_numa Mel Gorman
2015-01-05 10:54 ` [PATCH 04/10] ppc64: Add paranoid warnings for unexpected DSISR_PROTFAULT Mel Gorman
2015-01-05 10:54 ` [PATCH 05/10] mm: Convert p[te|md]_mknonnuma and remaining page table manipulations Mel Gorman
2015-01-05 10:54 ` [PATCH 06/10] mm: Remove remaining references to NUMA hinting bits and helpers Mel Gorman
2015-01-05 10:54 ` [PATCH 07/10] mm: numa: Do not trap faults on the huge zero page Mel Gorman
2015-01-05 10:54 ` [PATCH 08/10] x86: mm: Restore original pte_special check Mel Gorman
2015-01-05 10:54 ` [PATCH 09/10] mm: numa: Add paranoid check around pte_protnone_numa Mel Gorman
2015-01-05 10:54 ` [PATCH 10/10] mm: numa: Avoid unnecessary TLB flushes when setting NUMA hinting entries Mel Gorman
2014-12-04 11:24 [PATCH 0/10] Replace _PAGE_NUMA with PAGE_NONE protections v4 Mel Gorman
2014-12-04 11:24 ` [PATCH 04/10] ppc64: Add paranoid warnings for unexpected DSISR_PROTFAULT Mel Gorman
2014-11-21 13:57 [PATCH 0/10] Replace _PAGE_NUMA with PAGE_NONE protections v3 Mel Gorman
2014-11-21 13:57 ` [PATCH 04/10] ppc64: Add paranoid warnings for unexpected DSISR_PROTFAULT Mel Gorman
2014-11-20 10:19 [PATCH 0/10] Replace _PAGE_NUMA with PAGE_NONE protections v2 Mel Gorman
2014-11-20 10:19 ` [PATCH 04/10] ppc64: Add paranoid warnings for unexpected DSISR_PROTFAULT Mel Gorman
