From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Michal Hocko <mhocko@suse.cz>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Hugh Dickins <hughd@google.com>,
	Davidlohr Bueso <davidlohr.bueso@hp.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Joonsoo Kim <js1304@gmail.com>,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Hillf Danton <dhillf@gmail.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: [PATCH v2 00/20] mm, hugetlb: remove a hugetlb_instantiation_mutex
Date: Fri,  9 Aug 2013 18:26:18 +0900
Message-ID: <1376040398-11212-1-git-send-email-iamjoonsoo.kim@lge.com>

Without the hugetlb_instantiation_mutex, if parallel faults occur, we can
fail to allocate a hugepage, because many threads dequeue a hugepage
to handle a fault on the same address. This makes the reserved pool run
short just for a little while, and this causes a faulting thread to get
a SIGBUS signal even though there are enough hugepages.
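
To make the transient shortage concrete, here is a minimal userspace
sketch. It is not kernel code: every toy_* name is hypothetical, and the
two "threads" are just a fixed interleaving written out sequentially.
With one free page and two faults on the same address, the racy
interleaving fails the second fault, while the serialized one does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a hugepage pool and one faulting address. */
struct toy_pool { int free_pages; };
struct toy_mapping { bool page_present; };

enum toy_result { TOY_OK, TOY_SIGBUS };

/* Take a page from the pool if one is left. */
static bool toy_dequeue(struct toy_pool *pool)
{
	assert(pool->free_pages >= 0);
	if (pool->free_pages == 0)
		return false;
	pool->free_pages--;
	return true;
}

/* Install the page, or give it back if the address is already mapped. */
static void toy_install(struct toy_pool *pool, struct toy_mapping *map)
{
	if (map->page_present)
		pool->free_pages++;	/* lost the race, return the page */
	else
		map->page_present = true;
}

/* One complete fault, run without interruption. */
static enum toy_result toy_fault(struct toy_pool *pool, struct toy_mapping *map)
{
	if (map->page_present)
		return TOY_OK;
	if (!toy_dequeue(pool))
		return TOY_SIGBUS;
	toy_install(pool, map);
	return TOY_OK;
}

/* Interleaving: A dequeues, B dequeues, A installs. Returns B's result. */
static enum toy_result toy_racy_second_fault(struct toy_pool *pool,
					     struct toy_mapping *map)
{
	bool a_has_page = toy_dequeue(pool);
	bool b_has_page = toy_dequeue(pool);	/* may see an empty pool */

	if (a_has_page)
		toy_install(pool, map);
	if (!b_has_page)
		return TOY_SIGBUS;
	toy_install(pool, map);
	return TOY_OK;
}

/* Same two faults, one after the other. Returns B's result. */
static enum toy_result toy_serialized_second_fault(struct toy_pool *pool,
						   struct toy_mapping *map)
{
	(void)toy_fault(pool, map);	/* thread A */
	return toy_fault(pool, map);	/* thread B finds the page mapped */
}
```

In the racy case B fails although A's page ends up mapped at the very
address B wanted, so B never needed a page of its own.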

To solve this problem, we already have a nice solution, that is,
the hugetlb_instantiation_mutex. It blocks other threads from diving
into the fault handler. This solves the problem cleanly, but it
introduces performance degradation, because it serializes all fault
handling.
    
Now, I try to remove the hugetlb_instantiation_mutex to get rid of
the performance problem reported by Davidlohr Bueso [1].

This patchset consists of roughly 4 parts.

Part 1. (1-6) Random fixes and clean-ups. Enhancing error handling.
	
	These can be merged into mainline separately.

Part 2. (7-9) Protect region tracking via its own spinlock, instead of
	the hugetlb_instantiation_mutex.
	
	Breaking the dependency on the hugetlb_instantiation_mutex for
	region tracking is also needed by other approaches such as
	'table mutexes', so these can be merged into mainline separately.
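
The idea of Part 2 can be sketched in userspace as follows. This is only
an illustration under assumptions: the toy_* names are hypothetical, a
C11 atomic flag stands in for the spinlock, malloc() stands in for a
sleepable kmalloc(), and regions are not merged as the real region
tracking would do. The point is that the map carries its own lock and
that the allocation happens before the lock is taken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for one tracked region, half-open [from, to). */
struct toy_region {
	long from, to;
	struct toy_region *next;
};

/* Hypothetical stand-in for a reservation map with its own lock. */
struct toy_resv_map {
	atomic_bool locked;		/* toy spinlock */
	struct toy_region *regions;
};

static void toy_resv_init(struct toy_resv_map *map)
{
	atomic_init(&map->locked, false);
	map->regions = NULL;
}

static void toy_lock(struct toy_resv_map *map)
{
	/* Spin until we are the one who flips the flag from false to true. */
	while (atomic_exchange_explicit(&map->locked, true,
					memory_order_acquire))
		;
}

static void toy_unlock(struct toy_resv_map *map)
{
	atomic_store_explicit(&map->locked, false, memory_order_release);
}

/* Record [from, to). Callers are assumed to pass disjoint ranges. */
static int toy_region_add(struct toy_resv_map *map, long from, long to)
{
	/* Allocate first: this may sleep, so it must not run under the lock. */
	struct toy_region *rg = malloc(sizeof(*rg));

	if (!rg)
		return -1;
	assert(from < to);
	rg->from = from;
	rg->to = to;

	toy_lock(map);
	rg->next = map->regions;
	map->regions = rg;
	toy_unlock(map);
	return 0;
}

/* Count the pages of [from, to) covered by recorded regions. */
static long toy_region_count(struct toy_resv_map *map, long from, long to)
{
	long pages = 0;

	toy_lock(map);
	for (struct toy_region *rg = map->regions; rg; rg = rg->next) {
		long lo = rg->from > from ? rg->from : from;
		long hi = rg->to < to ? rg->to : to;

		if (lo < hi)
			pages += hi - lo;
	}
	toy_unlock(map);
	return pages;
}
```

Because the lock lives in the map, callers no longer rely on a global
mutex to keep the region list consistent.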

Part 3. (10-13) Clean-up.
	
	IMO, these make the code really simple, so they are worth merging
	into mainline separately, regardless of the success of my approach.

Part 4. (14-20) Remove a hugetlb_instantiation_mutex.
	
	Most patches are just clean-ups of the error handling path.
	In patch 19, a retry approach is implemented: if a faulting thread
	fails to allocate a hugepage, it keeps running the fault handler
	until there is no concurrent thread holding a hugepage. This causes
	threads that want to get the last hugepage to be serialized, so
	threads don't get a SIGBUS if enough hugepages exist.
	In patch 20, the hugetlb_instantiation_mutex is removed.
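
The retry rule described for patch 19 can be sketched as a small
decision function. This is a sequential model under assumptions, not the
patch itself: the names and the in_flight counter are hypothetical, and
in a real implementation both counters would have to be updated under
whatever lock protects the pool.

```c
#include <assert.h>

enum retry_outcome { RETRY_GOT_PAGE, RETRY_AGAIN, RETRY_SIGBUS };

/* Hypothetical pool state for the sketch. */
struct retry_pool {
	int free_pages;	/* pages left in the pool */
	int in_flight;	/* pages dequeued but not yet mapped or returned */
};

/* Try to take a page; on failure decide between retry and SIGBUS. */
static enum retry_outcome retry_try_alloc(struct retry_pool *p)
{
	if (p->free_pages > 0) {
		p->free_pages--;
		p->in_flight++;
		return RETRY_GOT_PAGE;
	}
	/* A concurrent holder may still give its page back: try again. */
	if (p->in_flight > 0)
		return RETRY_AGAIN;
	/* Nobody holds a page, so the shortage is real. */
	return RETRY_SIGBUS;
}

/* The holder found the address already mapped and returns its page. */
static void retry_return_page(struct retry_pool *p)
{
	assert(p->in_flight > 0);
	p->in_flight--;
	p->free_pages++;
}

/* The holder mapped its page, so it is no longer in flight. */
static void retry_commit_page(struct retry_pool *p)
{
	assert(p->in_flight > 0);
	p->in_flight--;
}
```

A caller would loop on RETRY_AGAIN by re-running the fault, and only
RETRY_SIGBUS would be reported as a failure.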

These patches are based on my previous patchset [2], which is now in mmotm.
In my compile testing, [2] and this patchset apply cleanly to
v3.11-rc4, but I ran my runtime tests of this patchset on top of v3.10 :)

With these applied, I cleanly passed the libhugetlbfs test suite, which
has allocation-instantiation race test cases.

If there is something I should consider, please let me know!
Thanks.

* Changes in v2
- Re-order patches to clarify their relationship
- Do sleepable object allocation (kmalloc) without holding a spinlock
	(Pointed out by Hillf)
- Remove vma_has_reserves rather than vma_needs_reservation.
	(Suggested by Aneesh and Naoya)
- Change the way of returning a hugepage back to the reserved pool
	(Suggested by Naoya)


[1] http://lwn.net/Articles/558863/ 
	"[PATCH] mm/hugetlb: per-vma instantiation mutexes"
[2] https://lkml.org/lkml/2013/7/22/96
	"[PATCH v2 00/10] mm, hugetlb: clean-up and possible bug fix"

Joonsoo Kim (20):
  mm, hugetlb: protect reserved pages when soft offlining a hugepage
  mm, hugetlb: change variable name reservations to resv
  mm, hugetlb: fix subpool accounting handling
  mm, hugetlb: remove useless check about mapping type
  mm, hugetlb: grab a page_table_lock after page_cache_release
  mm, hugetlb: return a reserved page to a reserved pool if failed
  mm, hugetlb: unify region structure handling
  mm, hugetlb: region manipulation functions take resv_map rather
    list_head
  mm, hugetlb: protect region tracking via newly introduced resv_map
    lock
  mm, hugetlb: remove resv_map_put()
  mm, hugetlb: make vma_resv_map() works for all mapping type
  mm, hugetlb: remove vma_has_reserves()
  mm, hugetlb: mm, hugetlb: unify chg and avoid_reserve to use_reserve
  mm, hugetlb: call vma_needs_reservation before entering
    alloc_huge_page()
  mm, hugetlb: remove a check for return value of alloc_huge_page()
  mm, hugetlb: move down outside_reserve check
  mm, hugetlb: move up anon_vma_prepare()
  mm, hugetlb: clean-up error handling in hugetlb_cow()
  mm, hugetlb: retry if failed to allocate and there is concurrent user
  mm, hugetlb: remove a hugetlb_instantiation_mutex

 fs/hugetlbfs/inode.c    |   16 +-
 include/linux/hugetlb.h |   11 ++
 mm/hugetlb.c            |  419 +++++++++++++++++++++++++----------------------
 3 files changed, 250 insertions(+), 196 deletions(-)

-- 
1.7.9.5




