From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@suse.cz>, "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>, Hugh Dickins <hughd@google.com>, Davidlohr Bueso <davidlohr.bueso@hp.com>, David Gibson <david@gibson.dropbear.id.au>, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim <js1304@gmail.com>, Wanpeng Li <liwanp@linux.vnet.ibm.com>, Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>, Hillf Danton <dhillf@gmail.com>, Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: [PATCH v2 00/20] mm, hugetlb: remove a hugetlb_instantiation_mutex
Date: Fri, 9 Aug 2013 18:26:18 +0900
Message-ID: <1376040398-11212-1-git-send-email-iamjoonsoo.kim@lge.com>

Without a hugetlb_instantiation_mutex, if parallel faults occur, we can
fail to allocate a hugepage, because many threads dequeue a hugepage to
handle a fault on the same address. This causes a shortage in the reserved
pool for a short while, and the faulting thread gets a SIGBUS signal even
though there are enough hugepages.

To solve this problem, we already have a nice solution, namely the
hugetlb_instantiation_mutex. It blocks other threads from entering the
fault handler. This solves the problem cleanly, but it introduces a
performance degradation, because it serializes all fault handling.

Now, I try to remove the hugetlb_instantiation_mutex to get rid of the
performance problem reported by Davidlohr Bueso [1].

This patchset consists of roughly 4 parts.

Part 1. (1-6) Random fixes and clean-ups. Enhanced error handling.

	These can be merged into mainline separately.

Part 2. (7-9) Protect region tracking via its own spinlock, instead of
	the hugetlb_instantiation_mutex.

	Breaking the dependency on the hugetlb_instantiation_mutex for
	region tracking is also needed by other approaches such as
	'table mutexes', so these can be merged into mainline separately.

Part 3. (10-13) Clean-ups.

	IMO, these make the code really simple, so they are worth merging
	into mainline separately, regardless of the success of my approach.

Part 4. (14-20) Remove the hugetlb_instantiation_mutex.

	Most of the patches just clean up the error handling path.

	In patch 19, a retry approach is implemented: if a faulting thread
	fails to allocate a hugepage, it keeps re-running the fault handler
	until no concurrent thread is holding a hugepage. This serializes
	the threads that want the last hugepage, so threads don't get a
	SIGBUS when enough hugepages exist.

	In patch 20, the hugetlb_instantiation_mutex is removed.

These patches are based on my previous patchset [2], which is now in mmotm.
In my compile testing, [2] and this patchset apply cleanly to v3.11-rc4,
but I ran my tests of this patchset on top of v3.10 :)

With these applied, the libhugetlbfs test suite, which includes
allocation-instantiation race test cases, passes cleanly.

If there is something I should consider, please let me know!
Thanks.

* Changes in v2
- Re-order patches to make their relationship clear
- Do sleepable object allocation (kmalloc) without holding a spinlock
  (Pointed out by Hillf)
- Remove vma_has_reserves instead of vma_needs_reservation
  (Suggested by Aneesh and Naoya)
- Change the way of returning a hugepage back to the reserved pool
  (Suggested by Naoya)

[1] http://lwn.net/Articles/558863/
	"[PATCH] mm/hugetlb: per-vma instantiation mutexes"
[2] https://lkml.org/lkml/2013/7/22/96
	"[PATCH v2 00/10] mm, hugetlb: clean-up and possible bug fix"

Joonsoo Kim (20):
  mm, hugetlb: protect reserved pages when soft offlining a hugepage
  mm, hugetlb: change variable name reservations to resv
  mm, hugetlb: fix subpool accounting handling
  mm, hugetlb: remove useless check about mapping type
  mm, hugetlb: grab a page_table_lock after page_cache_release
  mm, hugetlb: return a reserved page to a reserved pool if failed
  mm, hugetlb: unify region structure handling
  mm, hugetlb: region manipulation functions take resv_map rather list_head
  mm, hugetlb: protect region tracking via newly introduced resv_map lock
  mm, hugetlb: remove resv_map_put()
  mm, hugetlb: make vma_resv_map() works for all mapping type
  mm, hugetlb: remove vma_has_reserves()
  mm, hugetlb: mm, hugetlb: unify chg and avoid_reserve to use_reserve
  mm, hugetlb: call vma_needs_reservation before entering alloc_huge_page()
  mm, hugetlb: remove a check for return value of alloc_huge_page()
  mm, hugetlb: move down outside_reserve check
  mm, hugetlb: move up anon_vma_prepare()
  mm, hugetlb: clean-up error handling in hugetlb_cow()
  mm, hugetlb: retry if failed to allocate and there is concurrent user
  mm, hugetlb: remove a hugetlb_instantiation_mutex

 fs/hugetlbfs/inode.c    |  16 +-
 include/linux/hugetlb.h |  11 ++
 mm/hugetlb.c            | 419 +++++++++++++++++++++++++----------------------
 3 files changed, 250 insertions(+), 196 deletions(-)

-- 
1.7.9.5
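
As an illustration of the race and the retry idea described for patch 19, here is a small userspace toy model. It is only a sketch and not the kernel code from the patchset: the names struct hpool, try_alloc(), finish_fault() and simulate_race() are hypothetical, the "threads" are interleaved by hand so the run is deterministic, and the in_flight counter is an assumed stand-in for "a concurrent thread is holding a hugepage". Threads A and B fault on the same address X while thread C faults on address Y, with two hugepages in the pool.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not kernel symbols. */
enum alloc_result { ALLOC_OK, ALLOC_RETRY, ALLOC_SIGBUS };

struct hpool {
	int free_pages;	/* hugepages that can still be dequeued */
	int in_flight;	/* pages dequeued but not yet mapped or returned */
};

/* Try to dequeue one hugepage for a faulting thread. */
static enum alloc_result try_alloc(struct hpool *p, bool retry_enabled)
{
	if (p->free_pages > 0) {
		p->free_pages--;
		p->in_flight++;
		return ALLOC_OK;
	}
	/* Pool is empty: retry only if a concurrent holder might give a page back. */
	if (retry_enabled && p->in_flight > 0)
		return ALLOC_RETRY;
	return ALLOC_SIGBUS;
}

/* A thread holding a page either maps it or, if it lost the race, returns it. */
static void finish_fault(struct hpool *p, bool *mapped)
{
	assert(p->in_flight > 0);
	p->in_flight--;
	if (*mapped)
		p->free_pages++;	/* address already populated: give the page back */
	else
		*mapped = true;
}

/* Returns how many threads would get SIGBUS in the hand-interleaved scenario. */
static int simulate_race(bool retry_enabled)
{
	struct hpool pool = { .free_pages = 2, .in_flight = 0 };
	bool x_mapped = false, y_mapped = false;
	int sigbus = 0;

	enum alloc_result a = try_alloc(&pool, retry_enabled);	/* A, address X */
	enum alloc_result b = try_alloc(&pool, retry_enabled);	/* B, address X */
	enum alloc_result c = try_alloc(&pool, retry_enabled);	/* C, address Y */

	assert(a == ALLOC_OK && b == ALLOC_OK);
	finish_fault(&pool, &x_mapped);	/* A maps X */
	finish_fault(&pool, &x_mapped);	/* B lost the race and returns its page */

	if (c == ALLOC_RETRY)
		c = try_alloc(&pool, retry_enabled);	/* C re-runs the fault */
	if (c == ALLOC_OK)
		finish_fault(&pool, &y_mapped);
	else
		sigbus++;
	return sigbus;
}
```

In this model, simulate_race(false) counts one spurious SIGBUS for C even though two hugepages are enough for the two addresses, while simulate_race(true) lets C retry and pick up the page that B returns.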