linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* vma_needs_copy always true for VM_HUGETLB ?
@ 2022-05-18 22:36 Mike Kravetz
  2022-05-19  1:30 ` Hugh Dickins
  0 siblings, 1 reply; 3+ messages in thread
From: Mike Kravetz @ 2022-05-18 22:36 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Michal Hocko, Oscar Salvador, David Hildenbrand, Naoya Horiguchi,
	Hugh Dickins, Peter Xu

For most non-anonymous vmas, we do not copy page tables at fork time, but
rather lazily populate the tables after fork via faults.  The routine
vma_needs_copy() is used to make this decision. For VM_HUGETLB vmas, it always
returns true.

Anyone know/remember why?  The code was added more than 15 years ago and
my search for why hugetlb vmas were excluded came up empty.

I do not see a reason why VM_HUGETLB is in this list.  Initial testing did
not reveal any problems when I removed the VM_HUGETLB check.

FYI - I am looking at the performance of fork and exec (unmap) of processes
with very large hugetlb mappings.  Skipping the copy at fork time would
certainly speed things up.  Of course, there could some users who would
notice if hugetlb page tables are not copied at fork time.  However, this
is the behavior for 'normal' mappings.  I am inclined to make hugetlb be
'more normal'.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: vma_needs_copy always true for VM_HUGETLB ?
  2022-05-18 22:36 vma_needs_copy always true for VM_HUGETLB ? Mike Kravetz
@ 2022-05-19  1:30 ` Hugh Dickins
  2022-05-19  3:36   ` Mike Kravetz
  0 siblings, 1 reply; 3+ messages in thread
From: Hugh Dickins @ 2022-05-19  1:30 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, Michal Hocko, Oscar Salvador,
	David Hildenbrand, Naoya Horiguchi, Hugh Dickins, Peter Xu,
	Nick Piggin, Andi Kleen

On Wed, 18 May 2022, Mike Kravetz wrote:

> For most non-anonymous vmas, we do not copy page tables at fork time, but
> rather lazily populate the tables after fork via faults.  The routine
> vma_needs_copy() is used to make this decision. For VM_HUGETLB vmas, it always
> returns true.

"vma_needs_copy()" is *very* recent coinage, not reached Linus yet.

> 
> Anyone know/remember why?  The code was added more than 15 years ago and
> my search for why hugetlb vmas were excluded came up empty.
> 
> I do not see a reason why VM_HUGETLB is in this list.  Initial testing did
> not reveal any problems when I removed the VM_HUGETLB check.
> 
> FYI - I am looking at the performance of fork and exec (unmap) of processes
> with very large hugetlb mappings.  Skipping the copy at fork time would
> certainly speed things up.  Of course, there could some users who would
> notice if hugetlb page tables are not copied at fork time.  However, this
> is the behavior for 'normal' mappings.  I am inclined to make hugetlb be
> 'more normal'.

Good question, not obvious to me either: but I've found the answer.

The commit was of course Nick's d992895ba2b2 ("[PATCH] Lazy page table
copies in fork()") in 2.6.14; but it doesn't explain why VM_HUGETLB is
there in the test, and goes on to be copied.

I haven't re-read through the whole mail thread which led to that
commit, but I think you'll find the crucial observation comes from
Andi in https://lore.kernel.org/lkml/200508251756.07849.ak@suse.de/#t

"Actually I disabled it for hugetlbfs (... !is_huge...vma). The reason 
is that lazy faulting for huge pages is still not in mainline."

and indeed, look at the 2.6.13 or 2.6.14 mm/hugetlb.c and you find
/*
 * We cannot handle pagefaults against hugetlb pages at all.  They cause
 * handle_mm_fault() to try to instantiate regular-sized pages in the
 * hugegpage VMA.  do_page_fault() is supposed to trap this, so BUG is we get
 * this far.
 */
static struct page *hugetlb_nopage(struct vm_area_struct *vma,
				unsigned long address, int *unused)
{
	BUG();
	return NULL;
}

Oh, and that pretty much still exists to this day, to cover that path
to a fault; but 2.6.16 implemented hugetlb_no_page(), which is what
then actually got used to satisfy a hugetlb fault.

So the reason for fork copying VM_HUGETLB appears to have gone away
in 2.6.16.

(I haven't a clue on private hugetlb mappings and reservations and
whether anon_vma means the same on hugetlb, but you know all that.)

Hugh

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: vma_needs_copy always true for VM_HUGETLB ?
  2022-05-19  1:30 ` Hugh Dickins
@ 2022-05-19  3:36   ` Mike Kravetz
  0 siblings, 0 replies; 3+ messages in thread
From: Mike Kravetz @ 2022-05-19  3:36 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: linux-mm, linux-kernel, Michal Hocko, Oscar Salvador,
	David Hildenbrand, Naoya Horiguchi, Peter Xu, Nick Piggin,
	Andi Kleen

On 5/18/22 18:30, Hugh Dickins wrote:
> On Wed, 18 May 2022, Mike Kravetz wrote:
> 
>> For most non-anonymous vmas, we do not copy page tables at fork time, but
>> rather lazily populate the tables after fork via faults.  The routine
>> vma_needs_copy() is used to make this decision. For VM_HUGETLB vmas, it always
>> returns true.
> 
> "vma_needs_copy()" is *very* recent coinage, not reached Linus yet.
> 
>>
>> Anyone know/remember why?  The code was added more than 15 years ago and
>> my search for why hugetlb vmas were excluded came up empty.
>>
>> I do not see a reason why VM_HUGETLB is in this list.  Initial testing did
>> not reveal any problems when I removed the VM_HUGETLB check.
>>
>> FYI - I am looking at the performance of fork and exec (unmap) of processes
>> with very large hugetlb mappings.  Skipping the copy at fork time would
>> certainly speed things up.  Of course, there could some users who would
>> notice if hugetlb page tables are not copied at fork time.  However, this
>> is the behavior for 'normal' mappings.  I am inclined to make hugetlb be
>> 'more normal'.
> 
> Good question, not obvious to me either: but I've found the answer.

Thank you Hugh!  You went above and beyond as usual.

> The commit was of course Nick's d992895ba2b2 ("[PATCH] Lazy page table
> copies in fork()") in 2.6.14; but it doesn't explain why VM_HUGETLB is
> there in the test, and goes on to be copied.
> 
> I haven't re-read through the whole mail thread which led to that
> commit, but I think you'll find the crucial observation comes from
> Andi in https://lore.kernel.org/lkml/200508251756.07849.ak@suse.de/#t

Sorry, that I did not find the entire thread.  There were only a couple
pieces on linux-mm and that is the only place I looked.

> 
> "Actually I disabled it for hugetlbfs (... !is_huge...vma). The reason 
> is that lazy faulting for huge pages is still not in mainline."
> 
> and indeed, look at the 2.6.13 or 2.6.14 mm/hugetlb.c and you find
> /*
>  * We cannot handle pagefaults against hugetlb pages at all.  They cause
>  * handle_mm_fault() to try to instantiate regular-sized pages in the
>  * hugegpage VMA.  do_page_fault() is supposed to trap this, so BUG is we get
>  * this far.
>  */
> static struct page *hugetlb_nopage(struct vm_area_struct *vma,
> 				unsigned long address, int *unused)
> {
> 	BUG();
> 	return NULL;
> }
> 
> Oh, and that pretty much still exists to this day, to cover that path
> to a fault; but 2.6.16 implemented hugetlb_no_page(), which is what
> then actually got used to satisfy a hugetlb fault.
> 
> So the reason for fork copying VM_HUGETLB appears to have gone away
> in 2.6.16.

Yes, that is the likely reason.  Functionality was not originally
supported, and when it was added this 'optimization' was not enabled.

> (I haven't a clue on private hugetlb mappings and reservations and
> whether anon_vma means the same on hugetlb, but you know all that.)

Yes, I believe anon_vma means the same on hugetlb for this purpose.
Although, I do need to look closer just to make sure there are not
hidden surprises.

Thanks again,
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-05-19  3:37 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-18 22:36 vma_needs_copy always true for VM_HUGETLB ? Mike Kravetz
2022-05-19  1:30 ` Hugh Dickins
2022-05-19  3:36   ` Mike Kravetz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).