* Re: [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page
2013-10-28 22:16 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
@ 2013-10-28 22:11 ` Kirill A. Shutemov
2013-10-30 15:31 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Naoya Horiguchi
2013-11-05 18:47 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
2013-11-06 13:46 ` Hillf Danton
1 sibling, 2 replies; 10+ messages in thread
From: Kirill A. Shutemov @ 2013-10-28 22:11 UTC (permalink / raw)
To: Dave Hansen; +Cc: linux-kernel, linux-mm, dave.jiang, Mel Gorman, akpm, dhillf
On Mon, Oct 28, 2013 at 03:16:20PM -0700, Dave Hansen wrote:
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> Right now, the migration code in migrate_page_copy() uses
> copy_huge_page() for hugetlbfs and thp pages:
>
> if (PageHuge(page) || PageTransHuge(page))
> copy_huge_page(newpage, page);
>
> So, yay for code reuse. But:
>
> void copy_huge_page(struct page *dst, struct page *src)
> {
> struct hstate *h = page_hstate(src);
>
> and a non-hugetlbfs page has no page_hstate(). This
> works 99% of the time because page_hstate() determines
> the hstate from the page order alone. Since the page
> order of a THP page matches the default hugetlbfs page
> order, it works.
>
> But, if you change the default huge page size on the
> boot command-line (say default_hugepagesz=1G), then
> we might not even *have* a 2MB hstate so page_hstate()
> returns null and copy_huge_page() oopses pretty fast
> since copy_huge_page() dereferences the hstate:
>
> void copy_huge_page(struct page *dst, struct page *src)
> {
> struct hstate *h = page_hstate(src);
> if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
> ...
>
> This patch creates a copy_high_order_page() which can
> be used on THP pages.
We already have copy_user_huge_page() and copy_user_gigantic_page() in
generic code (mm/memory.c). I think copy_gigantic_page() and
copy_huge_page() should be moved there too.
BTW, I think pages_per_huge_page in copy_user_huge_page() is redundant:
compound_order(page) should be enough, right?
--
Kirill A. Shutemov
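Kirill's point can be sketched with a small userspace model (hypothetical code, not the kernel API; `copy_order_pages` and `MODEL_PAGE_SIZE` are invented names): the copy loop only needs the number of base pages, and for a compound page that is just 1 << compound_order(page), so passing pages_per_huge_page(h) separately adds nothing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096UL

/* Userspace model: copy a compound "page" of 2^order base pages,
 * given only the order.  The page count is derived as 1 << order,
 * which is all that pages_per_huge_page(h) would have told us. */
static void copy_order_pages(unsigned char *dst, const unsigned char *src,
			     unsigned int order)
{
	unsigned long nr = 1UL << order;	/* == pages_per_huge_page(h) */
	unsigned long i;

	for (i = 0; i < nr; i++)
		memcpy(dst + i * MODEL_PAGE_SIZE,
		       src + i * MODEL_PAGE_SIZE, MODEL_PAGE_SIZE);
}
```

With 2MB huge pages on x86 (order 9) this copies 512 base pages; the order alone determines the count.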
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2] mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pages
@ 2013-10-28 22:16 Dave Hansen
2013-10-28 22:16 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
2013-10-30 15:10 ` [PATCH 1/2] mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pages Naoya Horiguchi
0 siblings, 2 replies; 10+ messages in thread
From: Dave Hansen @ 2013-10-28 22:16 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, dave.jiang, Mel Gorman, akpm, dhillf, Dave Hansen
Dave Jiang reported that he was seeing oopses when running
NUMA systems and default_hugepagesz=1G. I traced the issue down
to migrate_page_copy() trying to use the same code for hugetlb
pages and transparent hugepages. It should not have been trying
to pass thp pages in there.
So, add some VM_BUG_ON()s for the next hapless VM developer that
tries the same thing.
---
linux.git-davehans/include/linux/hugetlb.h | 1 +
linux.git-davehans/mm/hugetlb.c | 1 +
2 files changed, 2 insertions(+)
diff -puN include/linux/hugetlb.h~bug-not-hugetlbfs-in-copy_huge_page include/linux/hugetlb.h
--- linux.git/include/linux/hugetlb.h~bug-not-hugetlbfs-in-copy_huge_page 2013-10-28 15:06:12.888828815 -0700
+++ linux.git-davehans/include/linux/hugetlb.h 2013-10-28 15:06:12.893829038 -0700
@@ -355,6 +355,7 @@ static inline pte_t arch_make_huge_pte(p
static inline struct hstate *page_hstate(struct page *page)
{
+ VM_BUG_ON(!PageHuge(page));
return size_to_hstate(PAGE_SIZE << compound_order(page));
}
diff -puN mm/hugetlb.c~bug-not-hugetlbfs-in-copy_huge_page mm/hugetlb.c
--- linux.git/mm/hugetlb.c~bug-not-hugetlbfs-in-copy_huge_page 2013-10-28 15:06:12.890828904 -0700
+++ linux.git-davehans/mm/hugetlb.c 2013-10-28 15:06:12.894829082 -0700
@@ -498,6 +498,7 @@ void copy_huge_page(struct page *dst, st
int i;
struct hstate *h = page_hstate(src);
+ VM_BUG_ON(!h);
if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
copy_gigantic_page(dst, src);
return;
_
* [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page
2013-10-28 22:16 [PATCH 1/2] mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pages Dave Hansen
@ 2013-10-28 22:16 ` Dave Hansen
2013-10-28 22:11 ` Kirill A. Shutemov
2013-11-06 13:46 ` Hillf Danton
2013-10-30 15:10 ` [PATCH 1/2] mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pages Naoya Horiguchi
1 sibling, 2 replies; 10+ messages in thread
From: Dave Hansen @ 2013-10-28 22:16 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, dave.jiang, Mel Gorman, akpm, dhillf, Dave Hansen
From: Dave Hansen <dave.hansen@linux.intel.com>
Right now, the migration code in migrate_page_copy() uses
copy_huge_page() for hugetlbfs and thp pages:
if (PageHuge(page) || PageTransHuge(page))
copy_huge_page(newpage, page);
So, yay for code reuse. But:
void copy_huge_page(struct page *dst, struct page *src)
{
struct hstate *h = page_hstate(src);
and a non-hugetlbfs page has no page_hstate(). This
works 99% of the time because page_hstate() determines
the hstate from the page order alone. Since the page
order of a THP page matches the default hugetlbfs page
order, it works.
But, if you change the default huge page size on the
boot command-line (say default_hugepagesz=1G), then
we might not even *have* a 2MB hstate so page_hstate()
returns null and copy_huge_page() oopses pretty fast
since copy_huge_page() dereferences the hstate:
void copy_huge_page(struct page *dst, struct page *src)
{
struct hstate *h = page_hstate(src);
if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
...
This patch creates a copy_high_order_page() which can
be used on THP pages.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---
linux.git-davehans/include/linux/huge_mm.h | 16 ++++++++++++++++
linux.git-davehans/mm/huge_memory.c | 13 +++++++++++++
linux.git-davehans/mm/migrate.c | 4 +++-
3 files changed, 32 insertions(+), 1 deletion(-)
diff -puN include/linux/huge_mm.h~copy-huge-separate-from-copy-transhuge include/linux/huge_mm.h
--- linux.git/include/linux/huge_mm.h~copy-huge-separate-from-copy-transhuge 2013-10-28 15:10:28.294220490 -0700
+++ linux.git-davehans/include/linux/huge_mm.h 2013-10-28 15:10:28.301220803 -0700
@@ -177,6 +177,10 @@ static inline struct page *compound_tran
extern int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long addr, pmd_t pmd, pmd_t *pmdp);
+extern void copy_high_order_page(struct page *newpage,
+ struct page *oldpage,
+ int order);
+
#else /* CONFIG_TRANSPARENT_HUGEPAGE */
#define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
#define HPAGE_PMD_MASK ({ BUILD_BUG(); 0; })
@@ -227,6 +231,18 @@ static inline int do_huge_pmd_numa_page(
return 0;
}
+/*
+ * The non-stub version of this code is probably usable
+ * generically but its only user is thp at the moment,
+ * so enforce that with a BUG()
+ */
+static inline void copy_high_order_page(struct page *newpage,
+ struct page *oldpage,
+ int order)
+{
+ BUG();
+}
+
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
#endif /* _LINUX_HUGE_MM_H */
diff -puN mm/huge_memory.c~copy-huge-separate-from-copy-transhuge mm/huge_memory.c
--- linux.git/mm/huge_memory.c~copy-huge-separate-from-copy-transhuge 2013-10-28 15:10:28.296220580 -0700
+++ linux.git-davehans/mm/huge_memory.c 2013-10-28 15:10:28.302220848 -0700
@@ -2789,3 +2789,16 @@ void __vma_adjust_trans_huge(struct vm_a
split_huge_page_address(next->vm_mm, nstart);
}
}
+
+void copy_high_order_page(struct page *newpage,
+ struct page *oldpage,
+ int order)
+{
+ int i;
+
+ might_sleep();
+ for (i = 0; i < (1<<order); i++) {
+ cond_resched();
+ copy_highpage(newpage + i, oldpage + i);
+ }
+}
diff -puN mm/migrate.c~copy-huge-separate-from-copy-transhuge mm/migrate.c
--- linux.git/mm/migrate.c~copy-huge-separate-from-copy-transhuge 2013-10-28 15:10:28.298220669 -0700
+++ linux.git-davehans/mm/migrate.c 2013-10-28 15:10:28.303220893 -0700
@@ -443,8 +443,10 @@ int migrate_huge_page_move_mapping(struc
*/
void migrate_page_copy(struct page *newpage, struct page *page)
{
- if (PageHuge(page) || PageTransHuge(page))
+ if (PageHuge(page))
copy_huge_page(newpage, page);
+ else if(PageTransHuge(page))
+ copy_high_order_page(newpage, page, HPAGE_PMD_ORDER);
else
copy_highpage(newpage, page);
_
* Re: [PATCH 1/2] mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pages
2013-10-28 22:16 [PATCH 1/2] mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pages Dave Hansen
2013-10-28 22:16 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
@ 2013-10-30 15:10 ` Naoya Horiguchi
1 sibling, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2013-10-30 15:10 UTC (permalink / raw)
To: Dave Hansen; +Cc: linux-kernel, linux-mm, dave.jiang, Mel Gorman, akpm, dhillf
On Mon, Oct 28, 2013 at 03:16:18PM -0700, Dave Hansen wrote:
>
> Dave Jiang reported that he was seeing oopses when running
> NUMA systems and default_hugepagesz=1G. I traced the issue down
> to migrate_page_copy() trying to use the same code for hugetlb
> pages and transparent hugepages. It should not have been trying
> to pass thp pages in there.
>
> So, add some VM_BUG_ON()s for the next hapless VM developer that
Looks good to me, thanks.
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>
> ---
>
> linux.git-davehans/include/linux/hugetlb.h | 1 +
> linux.git-davehans/mm/hugetlb.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff -puN include/linux/hugetlb.h~bug-not-hugetlbfs-in-copy_huge_page include/linux/hugetlb.h
> --- linux.git/include/linux/hugetlb.h~bug-not-hugetlbfs-in-copy_huge_page 2013-10-28 15:06:12.888828815 -0700
> +++ linux.git-davehans/include/linux/hugetlb.h 2013-10-28 15:06:12.893829038 -0700
> @@ -355,6 +355,7 @@ static inline pte_t arch_make_huge_pte(p
>
> static inline struct hstate *page_hstate(struct page *page)
> {
> + VM_BUG_ON(!PageHuge(page));
> return size_to_hstate(PAGE_SIZE << compound_order(page));
> }
>
> diff -puN mm/hugetlb.c~bug-not-hugetlbfs-in-copy_huge_page mm/hugetlb.c
> --- linux.git/mm/hugetlb.c~bug-not-hugetlbfs-in-copy_huge_page 2013-10-28 15:06:12.890828904 -0700
> +++ linux.git-davehans/mm/hugetlb.c 2013-10-28 15:06:12.894829082 -0700
> @@ -498,6 +498,7 @@ void copy_huge_page(struct page *dst, st
> int i;
> struct hstate *h = page_hstate(src);
>
> + VM_BUG_ON(!h);
> if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
> copy_gigantic_page(dst, src);
> return;
> _
>
* Re: [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page
2013-10-28 22:11 ` Kirill A. Shutemov
@ 2013-10-30 15:31 ` Naoya Horiguchi
2013-11-05 18:47 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
1 sibling, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2013-10-30 15:31 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Dave Hansen, linux-kernel, linux-mm, dave.jiang, Mel Gorman,
akpm, dhillf
On Tue, Oct 29, 2013 at 12:11:26AM +0200, Kirill A. Shutemov wrote:
> On Mon, Oct 28, 2013 at 03:16:20PM -0700, Dave Hansen wrote:
> >
> > From: Dave Hansen <dave.hansen@linux.intel.com>
> >
> > Right now, the migration code in migrate_page_copy() uses
> > copy_huge_page() for hugetlbfs and thp pages:
> >
> > if (PageHuge(page) || PageTransHuge(page))
> > copy_huge_page(newpage, page);
> >
> > So, yay for code reuse. But:
> >
> > void copy_huge_page(struct page *dst, struct page *src)
> > {
> > struct hstate *h = page_hstate(src);
> >
> > and a non-hugetlbfs page has no page_hstate(). This
> > works 99% of the time because page_hstate() determines
> > the hstate from the page order alone. Since the page
> > order of a THP page matches the default hugetlbfs page
> > order, it works.
> >
> > But, if you change the default huge page size on the
> > boot command-line (say default_hugepagesz=1G), then
> > we might not even *have* a 2MB hstate so page_hstate()
> > returns null and copy_huge_page() oopses pretty fast
> > since copy_huge_page() dereferences the hstate:
> >
> > void copy_huge_page(struct page *dst, struct page *src)
> > {
> > struct hstate *h = page_hstate(src);
> > if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
> > ...
> >
> > This patch creates a copy_high_order_page() which can
> > be used on THP pages.
>
> We already have copy_user_huge_page() and copy_user_gigantic_page() in
> generic code (mm/memory.c). I think copy_gigantic_page() and
> copy_huge_page() should be moved there too.
I agree with this.
> BTW, I think pages_per_huge_page in copy_user_huge_page() is redundant:
> compound_order(page) should be enough, right?
I guess the thp code is very performance-sensitive, so the developers chose
to pass it as an argument instead of calculating compound_order() on each call.
I think the performance gain is small (maybe invisible),
but the idea sounds fine to me.
Thanks,
Naoya
* Re: [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page
2013-10-28 22:11 ` Kirill A. Shutemov
2013-10-30 15:31 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Naoya Horiguchi
@ 2013-11-05 18:47 ` Dave Hansen
1 sibling, 0 replies; 10+ messages in thread
From: Dave Hansen @ 2013-11-05 18:47 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: linux-kernel, linux-mm, dave.jiang, Mel Gorman, akpm, dhillf
On 10/28/2013 03:11 PM, Kirill A. Shutemov wrote:
> On Mon, Oct 28, 2013 at 03:16:20PM -0700, Dave Hansen wrote:
>> void copy_huge_page(struct page *dst, struct page *src)
>> {
>> struct hstate *h = page_hstate(src);
>> if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
>> ...
>>
>> This patch creates a copy_high_order_page() which can
>> be used on THP pages.
>
> We already have copy_user_huge_page() and copy_user_gigantic_page() in
> generic code (mm/memory.c). I think copy_gigantic_page() and
> copy_huge_page() should be moved there too.
That would be fine I guess... in another patch. :)
> BTW, I think pages_per_huge_page in copy_user_huge_page() is redundant:
> compound_order(page) should be enough, right?
The way it is now, the compiler can optimize for the !HUGETLBFS case.
Also, pages_per_huge_page() works for gigantic pages. compound_order()
wouldn't work for those.
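Dave's point can be illustrated with a toy userspace model (hypothetical sketch; `hstate_model` and `pages_per_huge_page_model` are invented names, not the kernel API): pages_per_huge_page() reads the order out of the hstate, so it keeps working even when that order exceeds what the buddy allocator manages, e.g. a 1GB gigantic page with 4KB base pages is order 18.

```c
#include <assert.h>

/* Toy model: an hstate records the huge page order, and
 * pages_per_huge_page() is simply 1 << order.  Nothing here breaks
 * when the order belongs to a gigantic page (beyond MAX_ORDER). */
struct hstate_model {
	unsigned int order;
};

static unsigned long pages_per_huge_page_model(const struct hstate_model *h)
{
	return 1UL << h->order;
}
```

For a 2MB hstate (order 9) this gives 512 base pages, and for a 1GB hstate (order 18) it gives 262144.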
* Re: [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page
2013-10-28 22:16 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
2013-10-28 22:11 ` Kirill A. Shutemov
@ 2013-11-06 13:46 ` Hillf Danton
2013-11-06 16:34 ` Dave Hansen
1 sibling, 1 reply; 10+ messages in thread
From: Hillf Danton @ 2013-11-06 13:46 UTC (permalink / raw)
To: Dave Hansen; +Cc: LKML, Linux-MM, dave.jiang, Mel Gorman, Andrew Morton
On Tue, Oct 29, 2013 at 6:16 AM, Dave Hansen <dave@sr71.net> wrote:
> +
> +void copy_high_order_page(struct page *newpage,
> + struct page *oldpage,
> + int order)
> +{
> + int i;
> +
> + might_sleep();
> + for (i = 0; i < (1<<order); i++) {
> + cond_resched();
> + copy_highpage(newpage + i, oldpage + i);
> + }
> +}
Can we make no use of might_sleep here with cond_resched in loop?
* Re: [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page
2013-11-06 13:46 ` Hillf Danton
@ 2013-11-06 16:34 ` Dave Hansen
0 siblings, 0 replies; 10+ messages in thread
From: Dave Hansen @ 2013-11-06 16:34 UTC (permalink / raw)
To: Hillf Danton; +Cc: LKML, Linux-MM, dave.jiang, Mel Gorman, Andrew Morton
On 11/06/2013 05:46 AM, Hillf Danton wrote:
> On Tue, Oct 29, 2013 at 6:16 AM, Dave Hansen <dave@sr71.net> wrote:
>> +
>> +void copy_high_order_page(struct page *newpage,
>> + struct page *oldpage,
>> + int order)
>> +{
>> + int i;
>> +
>> + might_sleep();
>> + for (i = 0; i < (1<<order); i++) {
>> + cond_resched();
>> + copy_highpage(newpage + i, oldpage + i);
>> + }
>> +}
>
> Can we make no use of might_sleep here with cond_resched in loop?
I'm not sure what you're saying.
Are you pointing out that cond_resched() actually calls might_sleep() so
the might_sleep() is redundant?
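For reference, cond_resched() in kernels of this era expands to a __might_sleep() check followed by _cond_resched(), which is what makes the explicit might_sleep() redundant here (and is what v2 of the patch later relied on). A toy userspace model of that relationship (the counter and macro bodies are invented for illustration):

```c
#include <assert.h>

static int might_sleep_checks;	/* counts debug checks performed */

/* Toy model: might_sleep() performs the sleeping-while-atomic debug
 * check; cond_resched() performs that same check before offering to
 * reschedule, so a loop calling cond_resched() each iteration already
 * gets the check and needs no separate might_sleep(). */
#define model_might_sleep()  do { might_sleep_checks++; } while (0)
#define model_cond_resched() do { model_might_sleep(); } while (0)

static void copy_loop(int nr_pages)
{
	int i;

	for (i = 0; i < nr_pages; i++)
		model_cond_resched();	/* then copy_highpage(dst+i, src+i) */
}
```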
* Re: [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page
2013-11-14 23:34 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
@ 2013-11-15 10:26 ` Mel Gorman
0 siblings, 0 replies; 10+ messages in thread
From: Mel Gorman @ 2013-11-15 10:26 UTC (permalink / raw)
To: Dave Hansen
Cc: linux-kernel, linux-mm, dave.jiang, akpm, dhillf, Naoya Horiguchi
On Thu, Nov 14, 2013 at 03:34:00PM -0800, Dave Hansen wrote:
>
> Changes from v1:
> * removed explicit might_sleep() in favor of the one that we
> get from the cond_resched();
>
> --
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> Right now, the migration code in migrate_page_copy() uses
> copy_huge_page() for hugetlbfs and thp pages:
>
> if (PageHuge(page) || PageTransHuge(page))
> copy_huge_page(newpage, page);
>
> So, yay for code reuse. But:
>
> void copy_huge_page(struct page *dst, struct page *src)
> {
> struct hstate *h = page_hstate(src);
>
> and a non-hugetlbfs page has no page_hstate(). This
> works 99% of the time because page_hstate() determines
> the hstate from the page order alone. Since the page
> order of a THP page matches the default hugetlbfs page
> order, it works.
>
> But, if you change the default huge page size on the
> boot command-line (say default_hugepagesz=1G), then
> we might not even *have* a 2MB hstate so page_hstate()
> returns null and copy_huge_page() oopses pretty fast
> since copy_huge_page() dereferences the hstate:
>
> void copy_huge_page(struct page *dst, struct page *src)
> {
> struct hstate *h = page_hstate(src);
> if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
> ...
>
> This patch creates a copy_high_order_page() which can
> be used on THP pages.
>
> I believe the bug was introduced in b32967ff101:
> Author: Mel Gorman <mgorman@suse.de>
> Date: Mon Nov 19 12:35:47 2012 +0000
> mm: numa: Add THP migration for the NUMA working set scanning fault case.
>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
It is a mild pity that there are variants of this like copy_user_huge_page
for COW. They could be collapsed, but the resulting API would not be pretty.
A rename of copy_huge_page to copy_hugetlbfs_page is justified to avoid
a repeat mistake. Alternatively, there seems to be little reason to add
hugetlbfs- and thp-specific APIs when you could just do something like this
(untested):
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0b7656e..784313a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -476,40 +476,6 @@ static int vma_has_reserves(struct vm_area_struct *vma, long chg)
return 0;
}
-static void copy_gigantic_page(struct page *dst, struct page *src)
-{
- int i;
- struct hstate *h = page_hstate(src);
- struct page *dst_base = dst;
- struct page *src_base = src;
-
- for (i = 0; i < pages_per_huge_page(h); ) {
- cond_resched();
- copy_highpage(dst, src);
-
- i++;
- dst = mem_map_next(dst, dst_base, i);
- src = mem_map_next(src, src_base, i);
- }
-}
-
-void copy_huge_page(struct page *dst, struct page *src)
-{
- int i;
- struct hstate *h = page_hstate(src);
-
- if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
- copy_gigantic_page(dst, src);
- return;
- }
-
- might_sleep();
- for (i = 0; i < pages_per_huge_page(h); i++) {
- cond_resched();
- copy_highpage(dst + i, src + i);
- }
-}
-
static void enqueue_huge_page(struct hstate *h, struct page *page)
{
int nid = page_to_nid(page);
diff --git a/mm/migrate.c b/mm/migrate.c
index 9167b22..843b96d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -440,6 +440,49 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
return MIGRATEPAGE_SUCCESS;
}
+static void copy_gigantic_page(struct page *dst, struct page *src,
+ int nr_pages)
+{
+ int i;
+ struct page *dst_base = dst;
+ struct page *src_base = src;
+
+ for (i = 0; i < nr_pages; ) {
+ cond_resched();
+ copy_highpage(dst, src);
+
+ i++;
+ dst = mem_map_next(dst, dst_base, i);
+ src = mem_map_next(src, src_base, i);
+ }
+}
+
+void copy_huge_page(struct page *dst, struct page *src)
+{
+ int i;
+ int nr_pages;
+
+ if (PageHuge(src)) {
+ /* hugetlbfs page */
+ struct hstate *h = page_hstate(src);
+ nr_pages = pages_per_huge_page(h);
+
+ if (unlikely(nr_pages > MAX_ORDER_NR_PAGES)) {
+ copy_gigantic_page(dst, src, nr_pages);
+ return;
+ }
+ } else {
+ /* thp page */
+ BUG_ON(!PageTransHuge(src));
+ nr_pages = HPAGE_PMD_NR;
+ }
+
+ for (i = 0; i < nr_pages; i++) {
+ cond_resched();
+ copy_highpage(dst + i, src + i);
+ }
+}
+
/*
* Copy the page to its new location
*/
* [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page
2013-11-14 23:33 [PATCH 0/2] v2: fix hugetlb vs. anon-thp copy page Dave Hansen
@ 2013-11-14 23:34 ` Dave Hansen
2013-11-15 10:26 ` Mel Gorman
0 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2013-11-14 23:34 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, dave.jiang, Mel Gorman, akpm, dhillf, Naoya Horiguchi,
Dave Hansen
Changes from v1:
* removed explicit might_sleep() in favor of the one that we
get from the cond_resched();
--
From: Dave Hansen <dave.hansen@linux.intel.com>
Right now, the migration code in migrate_page_copy() uses
copy_huge_page() for hugetlbfs and thp pages:
if (PageHuge(page) || PageTransHuge(page))
copy_huge_page(newpage, page);
So, yay for code reuse. But:
void copy_huge_page(struct page *dst, struct page *src)
{
struct hstate *h = page_hstate(src);
and a non-hugetlbfs page has no page_hstate(). This
works 99% of the time because page_hstate() determines
the hstate from the page order alone. Since the page
order of a THP page matches the default hugetlbfs page
order, it works.
But, if you change the default huge page size on the
boot command-line (say default_hugepagesz=1G), then
we might not even *have* a 2MB hstate so page_hstate()
returns null and copy_huge_page() oopses pretty fast
since copy_huge_page() dereferences the hstate:
void copy_huge_page(struct page *dst, struct page *src)
{
struct hstate *h = page_hstate(src);
if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
...
This patch creates a copy_high_order_page() which can
be used on THP pages.
I believe the bug was introduced in b32967ff101:
Author: Mel Gorman <mgorman@suse.de>
Date: Mon Nov 19 12:35:47 2012 +0000
mm: numa: Add THP migration for the NUMA working set scanning fault case.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---
linux.git-davehans/include/linux/huge_mm.h | 16 ++++++++++++++++
linux.git-davehans/mm/huge_memory.c | 12 ++++++++++++
linux.git-davehans/mm/migrate.c | 6 ++++--
3 files changed, 32 insertions(+), 2 deletions(-)
diff -puN include/linux/huge_mm.h~copy-huge-separate-from-copy-transhuge include/linux/huge_mm.h
--- linux.git/include/linux/huge_mm.h~copy-huge-separate-from-copy-transhuge 2013-11-14 15:09:38.869188202 -0800
+++ linux.git-davehans/include/linux/huge_mm.h 2013-11-14 15:09:38.873188379 -0800
@@ -177,6 +177,10 @@ static inline struct page *compound_tran
extern int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long addr, pmd_t pmd, pmd_t *pmdp);
+extern void copy_high_order_page(struct page *newpage,
+ struct page *oldpage,
+ int order);
+
#else /* CONFIG_TRANSPARENT_HUGEPAGE */
#define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
#define HPAGE_PMD_MASK ({ BUILD_BUG(); 0; })
@@ -227,6 +231,18 @@ static inline int do_huge_pmd_numa_page(
return 0;
}
+/*
+ * The non-stub version of this code is probably usable
+ * generically but its only user is thp at the moment,
+ * so enforce that with a BUG()
+ */
+static inline void copy_high_order_page(struct page *newpage,
+ struct page *oldpage,
+ int order)
+{
+ BUG();
+}
+
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
#endif /* _LINUX_HUGE_MM_H */
diff -puN mm/huge_memory.c~copy-huge-separate-from-copy-transhuge mm/huge_memory.c
--- linux.git/mm/huge_memory.c~copy-huge-separate-from-copy-transhuge 2013-11-14 15:09:38.870188245 -0800
+++ linux.git-davehans/mm/huge_memory.c 2013-11-14 15:09:38.874188424 -0800
@@ -2890,3 +2890,15 @@ void __vma_adjust_trans_huge(struct vm_a
split_huge_page_address(next->vm_mm, nstart);
}
}
+
+void copy_high_order_page(struct page *newpage,
+ struct page *oldpage,
+ int order)
+{
+ int i;
+
+ for (i = 0; i < (1<<order); i++) {
+ cond_resched();
+ copy_highpage(newpage + i, oldpage + i);
+ }
+}
diff -puN mm/migrate.c~copy-huge-separate-from-copy-transhuge mm/migrate.c
--- linux.git/mm/migrate.c~copy-huge-separate-from-copy-transhuge 2013-11-14 15:09:38.871188288 -0800
+++ linux.git-davehans/mm/migrate.c 2013-11-14 15:09:38.874188424 -0800
@@ -447,8 +447,10 @@ void migrate_page_copy(struct page *newp
{
int cpupid;
- if (PageHuge(page) || PageTransHuge(page))
- copy_huge_page(newpage, page);
+ if (PageHuge(page))
+ copy_huge_page(newpage, page);
+ else if(PageTransHuge(page))
+ copy_high_order_page(newpage, page, HPAGE_PMD_ORDER);
else
copy_highpage(newpage, page);
_
end of thread, other threads:[~2013-11-15 10:26 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
2013-10-28 22:16 [PATCH 1/2] mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pages Dave Hansen
2013-10-28 22:16 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
2013-10-28 22:11 ` Kirill A. Shutemov
2013-10-30 15:31 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Naoya Horiguchi
2013-11-05 18:47 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
2013-11-06 13:46 ` Hillf Danton
2013-11-06 16:34 ` Dave Hansen
2013-10-30 15:10 ` [PATCH 1/2] mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pages Naoya Horiguchi
2013-11-14 23:33 [PATCH 0/2] v2: fix hugetlb vs. anon-thp copy page Dave Hansen
2013-11-14 23:34 ` [PATCH 2/2] mm: thp: give transparent hugepage code a separate copy_page Dave Hansen
2013-11-15 10:26 ` Mel Gorman