From: Luiz Capitulino <lcapitulino@redhat.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, mtosatti@redhat.com, aarcange@redhat.com,
    mgorman@suse.de, akpm@linux-foundation.org, andi@firstfloor.org,
    davidlohr@hp.com, rientjes@google.com, isimatu.yasuaki@jp.fujitsu.com,
    yinghai@kernel.org, riel@redhat.com, n-horiguchi@ah.jp.nec.com,
    kirill@shutemov.name
Subject: [PATCH v3 0/5] hugetlb: add support gigantic page allocation at runtime
Date: Thu, 10 Apr 2014 13:58:40 -0400
Message-ID: <1397152725-20990-1-git-send-email-lcapitulino@redhat.com>

[Full introduction right after the changelog]

Changelog
---------

v3

- Dropped unnecessary WARN_ON() call [Kirill]
- Always check if the pfn range lies within a zone [Yasuaki]
- Renamed some function arguments for consistency

v2

- Rewrote allocation loop to avoid scanning useless PFNs [Yasuaki]
- Dropped incomplete multi-arch support [Naoya]
- Added patch to drop __init from prep_compound_gigantic_page()
- Restricted the feature to x86_64 (more details in patch 5/5)
- Added review-bys plus minor changelog changes

Introduction
------------

The HugeTLB subsystem uses the buddy allocator to allocate hugepages at
runtime. This means that hugepage allocation at runtime is limited to
MAX_ORDER order. For archs supporting gigantic pages (that is, pages whose
order is greater than or equal to MAX_ORDER), this in turn means that those
pages can't be allocated at runtime.

HugeTLB supports gigantic page allocation at boottime, via the boot
allocator. To this end the kernel provides the command-line options
hugepagesz= and hugepages=, which can be used to instruct the kernel to
allocate N gigantic pages during boot.

For example, x86_64 supports 2M and 1G hugepages, but only 2M hugepages can
be allocated and freed at runtime. If one wants to allocate 1G gigantic
pages, this has to be done at boot via the hugepagesz= and hugepages=
command-line options.
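To make the MAX_ORDER limit concrete, here is a small sketch of the
arithmetic. It assumes 4 KiB base pages and MAX_ORDER = 11, which were the
usual x86_64 values at the time but are not stated in this cover letter.
The function names page_order and fits_buddy are made up for illustration.

```shell
# Assumptions (not from the cover letter): 4 KiB base pages, MAX_ORDER = 11.
PAGE_SIZE=4096
MAX_ORDER=11

# page_order SIZE_BYTES: print log2(SIZE_BYTES / PAGE_SIZE), that is, the
# allocation order of a page of that size.
page_order() {
    pages=$(( $1 / PAGE_SIZE ))
    order=0
    while [ "$pages" -gt 1 ]; do
        pages=$(( pages / 2 ))
        order=$(( order + 1 ))
    done
    echo "$order"
}

# fits_buddy SIZE_BYTES: succeed if the order is below MAX_ORDER, which is
# the largest request the buddy allocator can serve under the assumptions.
fits_buddy() {
    [ "$(page_order "$1")" -lt "$MAX_ORDER" ]
}
```

Under these assumptions a 2M hugepage is order 9 and fits, while a 1G
gigantic page is order 18 and does not, which is why it has to come from
somewhere other than the buddy allocator.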
Now, gigantic page allocation at boottime has two serious problems:

 1. Boottime allocation is not NUMA aware. On a NUMA machine the kernel
    evenly distributes boottime allocated hugepages among nodes.

    For example, suppose you have a four-node NUMA machine and want
    to allocate four 1G gigantic pages at boottime. The kernel will
    allocate one gigantic page per node.

    On the other hand, we do have users who want to be able to specify
    which NUMA node gigantic pages should be allocated from, so that
    they can place virtual machines on a specific NUMA node.

 2. Gigantic pages allocated at boottime can't be freed.

At this point it's important to observe that regular hugepages allocated
at runtime don't have those problems. This is because the HugeTLB
interface for runtime allocation in sysfs supports NUMA, and runtime
allocated pages can be freed just fine via the buddy allocator.

This series adds support for allocating gigantic pages at runtime. It does
so by allocating gigantic pages via CMA instead of the buddy allocator.
Releasing gigantic pages is also supported via CMA. As this series builds
on top of the existing HugeTLB interface, allocating and releasing gigantic
pages works just like it does for regular sized hugepages. This also means
that NUMA support just works.

For example, to allocate two 1G gigantic pages on node 1, one can do:

 # echo 2 > \
   /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages

And, to release all gigantic pages on the same node:

 # echo 0 > \
   /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages

Please refer to patch 5/5 for full technical details.

Finally, please note that this series is a follow-up to a previous series
that tried to extend the set of command-line options to be NUMA aware:

 http://marc.info/?l=linux-mm&m=139593335312191&w=2

During the discussion of that series it was agreed that having runtime
allocation support for gigantic pages was a better solution.
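The two sysfs commands in the examples differ only in the count written, so
they can be wrapped in a pair of helpers. This is only a sketch: the
function names nr_hugepages_path and set_node_hugepages are hypothetical,
the path layout is copied from the examples, and writing the file requires
root.

```shell
# nr_hugepages_path NODE SIZE_KB: print the per-node sysfs file for the
# given NUMA node and hugepage size in kB (layout taken from the examples).
nr_hugepages_path() {
    printf '/sys/devices/system/node/node%d/hugepages/hugepages-%dkB/nr_hugepages\n' \
        "$1" "$2"
}

# set_node_hugepages NODE SIZE_KB COUNT: write COUNT to that file, then
# print the value read back. Hypothetical helper; needs root privileges.
set_node_hugepages() {
    path=$(nr_hugepages_path "$1" "$2")
    if [ ! -w "$path" ]; then
        echo "cannot write $path" >&2
        return 1
    fi
    echo "$3" > "$path" || return 1
    cat "$path"
}
```

With these, "set_node_hugepages 1 1048576 2" corresponds to the allocation
example and "set_node_hugepages 1 1048576 0" to the release example. The
value is read back because the kernel may end up with fewer pages than
requested when it cannot find enough contiguous memory.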
Luiz Capitulino (5):
  hugetlb: prep_compound_gigantic_page(): drop __init marker
  hugetlb: add hstate_is_gigantic()
  hugetlb: update_and_free_page(): don't clear PG_reserved bit
  hugetlb: move helpers up in the file
  hugetlb: add support for gigantic page allocation at runtime

 include/linux/hugetlb.h |   5 +
 mm/hugetlb.c            | 336 ++++++++++++++++++++++++++++++++++--------------
 2 files changed, 245 insertions(+), 96 deletions(-)

-- 
1.8.1.4