* [PATCH 0/2] hugetlbfs: support split page table lock
From: Naoya Horiguchi @ 2013-05-28 19:52 UTC
  To: linux-mm
  Cc: Andrew Morton, Mel Gorman, Andi Kleen, Michal Hocko,
	KOSAKI Motohiro, Rik van Riel, linux-kernel

Hi,

In the previous discussion [1] of the "extend hugepage migration" patches,
Michal and Kosaki-san commented on the patch "migrate: add
migrate_entry_wait_huge()": the issue should be solved in an arch-independent
manner, and the USE_SPLIT_PTLOCKS=y case should be handled as well. This patch
series does that.

I confirmed that the patched kernel shows no regressions in the libhugetlbfs
functional tests.

[1]: http://thread.gmane.org/gmane.linux.kernel.mm/96665/focus=96661

Thanks,
Naoya Horiguchi

* [PATCH 0/2 v2] split page table lock for hugepage
From: Naoya Horiguchi @ 2013-08-30 17:18 UTC
  To: linux-mm
  Cc: Andrew Morton, Mel Gorman, Andi Kleen, Michal Hocko,
	KOSAKI Motohiro, Rik van Riel, Andrea Arcangeli, kirill.shutemov,
	Aneesh Kumar K.V, Alex Thorlton, linux-kernel

Hi,

I revised the split page table lock patch (v1 is [1]), and got some numbers
to confirm the performance improvement.

This patchset simply replaces every lock/unlock of mm->page_table_lock in
hugepage context with page->ptl when USE_SPLIT_PTLOCKS is true. That breaks
the single mm-wide lock into multiple locks, each covering a small
(pmd/pte-sized) address range, so we can expect better performance when many
threads access the virtual memory of one process simultaneously.
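
As a rough userspace analogy of the locking change described above (not the
kernel code from this patchset), the sketch below picks either one lock for
the whole address space or one lock per address range. Everything in it is
hypothetical and for illustration only: struct toy_mm, toy_mm_init(),
toy_lockptr(), the 2 MiB range size, and the use of pthread mutexes in place
of kernel spinlocks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define RANGE_SHIFT 21   /* assumption: 2 MiB ranges, one lock per range */
#define NR_RANGES   8    /* assumption: tiny toy address space */

/* Hypothetical stand-in for mm_struct; not a kernel type. */
struct toy_mm {
	pthread_mutex_t page_table_lock;       /* single mm-wide lock */
	pthread_mutex_t range_lock[NR_RANGES]; /* stands in for page->ptl */
	int use_split_locks;                   /* stands in for USE_SPLIT_PTLOCKS */
};

static void toy_mm_init(struct toy_mm *mm, int use_split_locks)
{
	pthread_mutex_init(&mm->page_table_lock, NULL);
	for (size_t i = 0; i < NR_RANGES; i++)
		pthread_mutex_init(&mm->range_lock[i], NULL);
	mm->use_split_locks = use_split_locks;
}

/*
 * Return the lock that protects the page table entry mapping addr.
 * With split locks, addresses in different ranges get different locks,
 * so threads working on different ranges no longer contend.
 */
static pthread_mutex_t *toy_lockptr(struct toy_mm *mm, unsigned long addr)
{
	size_t idx = addr >> RANGE_SHIFT;

	assert(idx < NR_RANGES);
	if (mm->use_split_locks)
		return &mm->range_lock[idx];
	return &mm->page_table_lock;
}
```

A caller would take the returned lock with pthread_mutex_lock() around its
update. With use_split_locks set, two threads touching addresses 2 MiB apart
get different mutexes; without it, they serialize on page_table_lock.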

Here are my test results [2]. I measured the time (in seconds) taken to run a
specific workload under various conditions, so smaller numbers mean better
performance. The workload is:
  1) allocate N hugepages/thps and touch each of them once,
  2) create T threads with pthread_create(), and
  3) have each thread access all of the pages sequentially 10 times.

           |             hugetlb             |               thp               |
 N      T  |   v3.11-rc3    |    patched     |   v3.11-rc3    |    patched     |
100    100 |  0.13 (+-0.04) |  0.07 (+-0.01) |  0.10 (+-0.01) |  0.08 (+-0.03) |
100   3000 | 11.67 (+-0.47) |  6.54 (+-0.38) | 11.21 (+-0.28) |  6.44 (+-0.26) |
6000   100 |  2.87 (+-0.07) |  2.79 (+-0.06) |  3.21 (+-0.06) |  3.10 (+-0.06) |
6000  3000 | 18.76 (+-0.50) | 13.68 (+-0.35) | 19.44 (+-0.78) | 14.03 (+-0.43) |

  * Each number is the mean (with standard deviation) over 20 runs.

These results show that the patched kernel performs better for both hugetlb
and thp, so the patches work as intended. The gain grows with T (for example,
hugetlb at N=100, T=3000 drops from 11.67 to 6.54 seconds). Interestingly, a
more detailed analysis shows that most of the improvement comes from step 2).
I saw a small improvement for step 3), but no visible improvement for step 1).
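
The three workload steps can be sketched in C roughly as follows. This is not
the test program from [2]; it is a minimal illustration that assumes ordinary
4 KiB anonymous pages instead of hugepages/thps so it runs without hugepage
setup, and it omits the timing. The names struct workload, worker() and
run_workload() are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

#define PAGE_SZ 4096UL   /* assumption: plain 4 KiB pages, not hugepages */

struct workload {
	unsigned char *buf;  /* start of the mapped region */
	size_t npages;       /* N in the table */
	int passes;          /* 10 in the original workload */
};

/* Step 3: each thread reads one byte of every page, passes times. */
static void *worker(void *arg)
{
	struct workload *w = arg;
	uintptr_t sum = 0;

	for (int p = 0; p < w->passes; p++)
		for (size_t i = 0; i < w->npages; i++)
			sum += w->buf[i * PAGE_SZ];
	return (void *)sum;
}

/* Returns 0 if every thread saw every page on every pass, -1 otherwise. */
static int run_workload(size_t npages, int nthreads, int passes)
{
	struct workload w = { .npages = npages, .passes = passes };
	pthread_t *tids;
	int created = 0, ok = 1;

	assert(npages > 0 && nthreads > 0 && passes > 0);

	/* Step 1: allocate the pages and touch each of them once. */
	w.buf = mmap(NULL, npages * PAGE_SZ, PROT_READ | PROT_WRITE,
		     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (w.buf == MAP_FAILED)
		return -1;
	for (size_t i = 0; i < npages; i++)
		w.buf[i * PAGE_SZ] = 1;

	tids = calloc((size_t)nthreads, sizeof(*tids));
	if (!tids) {
		munmap(w.buf, npages * PAGE_SZ);
		return -1;
	}

	/* Step 2: create T threads with pthread_create(). */
	for (; created < nthreads; created++) {
		if (pthread_create(&tids[created], NULL, worker, &w) != 0) {
			ok = 0;
			break;
		}
	}

	/* Wait for the threads and check that each read every page. */
	for (int t = 0; t < created; t++) {
		void *ret;

		pthread_join(tids[t], &ret);
		if ((uintptr_t)ret != npages * (size_t)passes)
			ok = 0;
	}

	free(tids);
	munmap(w.buf, npages * PAGE_SZ);
	return ok ? 0 : -1;
}
```

For instance, run_workload(100, 100, 10) mirrors the smallest row of the table
in shape only. Measuring the hugetlb or thp cases would additionally require a
hugepage-backed mapping and timing around each step.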

[1] http://thread.gmane.org/gmane.linux.kernel.mm/100856/focus=100858
[2] https://github.com/Naoya-Horiguchi/test_split_page_table_lock_for_hugepage

Naoya Horiguchi (2):
      hugetlbfs: support split page table lock
      thp: support split page table lock

 arch/powerpc/mm/hugetlbpage.c |   6 +-
 arch/powerpc/mm/pgtable_64.c  |   8 +-
 arch/s390/mm/pgtable.c        |   4 +-
 arch/sparc/mm/tlb.c           |   4 +-
 arch/tile/mm/hugetlbpage.c    |   6 +-
 fs/proc/task_mmu.c            |  17 +++--
 include/linux/huge_mm.h       |  11 +--
 include/linux/hugetlb.h       |  20 +++++
 include/linux/mm.h            |   3 +
 mm/huge_memory.c              | 170 +++++++++++++++++++++++++-----------------
 mm/hugetlb.c                  |  92 ++++++++++++++---------
 mm/memcontrol.c               |  14 ++--
 mm/memory.c                   |  15 ++--
 mm/mempolicy.c                |   5 +-
 mm/migrate.c                  |  12 +--
 mm/mprotect.c                 |   5 +-
 mm/pgtable-generic.c          |  10 +--
 mm/rmap.c                     |  13 ++--
 18 files changed, 251 insertions(+), 164 deletions(-)

Thanks,
Naoya Horiguchi

* [PATCH 0/2 v3] split page table lock for hugepage
From: Naoya Horiguchi @ 2013-09-05 21:27 UTC
  To: linux-mm
  Cc: Andrew Morton, Mel Gorman, Andi Kleen, Michal Hocko,
	KOSAKI Motohiro, Rik van Riel, Andrea Arcangeli, kirill.shutemov,
	Aneesh Kumar K.V, Alex Thorlton, linux-kernel

I revised the split ptl patchset with small fixes.
See also the previous post [1] for the motivation and the numbers.

Any comments and reviews are welcome.

[1] http://thread.gmane.org/gmane.linux.kernel.mm/106292/

Thanks,
Naoya Horiguchi
---
Summary:

Naoya Horiguchi (2):
      hugetlbfs: support split page table lock
      thp: support split page table lock

 arch/powerpc/mm/pgtable_64.c |   8 +-
 arch/s390/mm/pgtable.c       |   4 +-
 arch/sparc/mm/tlb.c          |   4 +-
 fs/proc/task_mmu.c           |  17 +++--
 include/linux/huge_mm.h      |  11 +--
 include/linux/hugetlb.h      |  20 +++++
 include/linux/mm.h           |   3 +
 include/linux/mm_types.h     |   2 +
 mm/huge_memory.c             | 171 ++++++++++++++++++++++++++-----------------
 mm/hugetlb.c                 |  92 ++++++++++++++---------
 mm/memcontrol.c              |  14 ++--
 mm/memory.c                  |  15 ++--
 mm/mempolicy.c               |   5 +-
 mm/migrate.c                 |  12 +--
 mm/mprotect.c                |   5 +-
 mm/pgtable-generic.c         |  10 +--
 mm/rmap.c                    |  13 ++--
 17 files changed, 246 insertions(+), 160 deletions(-)


Thread overview: 42+ messages
2013-05-28 19:52 [PATCH 0/2] hugetlbfs: support split page table lock Naoya Horiguchi
2013-05-28 19:52 ` Naoya Horiguchi
2013-05-28 19:52 ` [PATCH 1/2] " Naoya Horiguchi
2013-05-28 19:52   ` Naoya Horiguchi
2013-05-29  1:09   ` Wanpeng Li
2013-06-03 13:19   ` Michal Hocko
2013-06-03 13:19     ` Michal Hocko
2013-06-03 14:34     ` Naoya Horiguchi
2013-06-03 14:34       ` Naoya Horiguchi
2013-06-03 15:42       ` Michal Hocko
2013-06-03 15:42         ` Michal Hocko
2013-05-28 19:52 ` [PATCH v3 2/2] migrate: add migrate_entry_wait_huge() Naoya Horiguchi
2013-05-28 19:52   ` Naoya Horiguchi
2013-05-29  1:11   ` Wanpeng Li
2013-05-31 19:30   ` Andrew Morton
2013-05-31 19:30     ` Andrew Morton
2013-05-31 19:46     ` Naoya Horiguchi
2013-05-31 19:46       ` Naoya Horiguchi
2013-06-03 13:26   ` Michal Hocko
2013-06-03 13:26     ` Michal Hocko
2013-06-03 14:34     ` Naoya Horiguchi
2013-06-03 14:34       ` Naoya Horiguchi
2013-06-04 16:44     ` [PATCH v4] " Naoya Horiguchi
2013-06-04 16:44       ` Naoya Horiguchi
2013-08-30 17:18 [PATCH 0/2 v2] split page table lock for hugepage Naoya Horiguchi
2013-08-30 17:18 ` [PATCH 1/2] hugetlbfs: support split page table lock Naoya Horiguchi
2013-08-30 17:18   ` Naoya Horiguchi
2013-09-04  7:13   ` Aneesh Kumar K.V
2013-09-04  7:13     ` Aneesh Kumar K.V
2013-09-04 16:32     ` Naoya Horiguchi
2013-09-04 16:32       ` Naoya Horiguchi
2013-09-05  9:18       ` Aneesh Kumar K.V
2013-09-05  9:18         ` Aneesh Kumar K.V
2013-09-05 15:23         ` Naoya Horiguchi
2013-09-05 15:23           ` Naoya Horiguchi
2013-09-05 21:27 [PATCH 0/2 v3] split page table lock for hugepage Naoya Horiguchi
2013-09-05 21:27 ` [PATCH 1/2] hugetlbfs: support split page table lock Naoya Horiguchi
2013-09-05 21:27   ` Naoya Horiguchi
2013-09-08 16:53   ` Aneesh Kumar K.V
2013-09-08 16:53     ` Aneesh Kumar K.V
2013-09-09 16:26     ` Naoya Horiguchi
2013-09-09 16:26       ` Naoya Horiguchi
