linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch
@ 2015-10-29 22:33 Mike Kravetz
  2015-10-30  3:32 ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Kravetz @ 2015-10-29 22:33 UTC (permalink / raw)
  To: linux-mm, linux-kernel, Andrew Morton
  Cc: Dave Hansen, Naoya Horiguchi, Hugh Dickins, Davidlohr Bueso,
	Mike Kravetz

This patch is a combination of:
[PATCH v2 4/4] mm/hugetlb: Unmap pages to remove if page fault raced
	with hole punch  and,
[PATCH] mm/hugetlb: i_mmap_lock_write before unmapping in
	remove_inode_hugepages
This patch can replace the entire series:
[PATCH v2 0/4] hugetlbfs fallocate hole punch race with page faults
	and
[PATCH] mm/hugetlb: i_mmap_lock_write before unmapping in
	remove_inode_hugepages
It is being provided in an effort to possibly make tree management easier.

Page faults can race with fallocate hole punch.  If a page fault happens
between the unmap and remove operations, the page is not removed and
remains within the hole.  This is not the desired behavior.

If this race is detected and a page is mapped, the remove operation
(remove_inode_hugepages) will unmap the page before removing.  The unmap
within remove_inode_hugepages occurs with the hugetlb_fault_mutex held
so that no other faults can occur until the page is removed.

The (unmodified) routine hugetlb_vmdelete_list was moved ahead of
remove_inode_hugepages to satisfy the new reference.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 fs/hugetlbfs/inode.c | 125 ++++++++++++++++++++++++++-------------------------
 1 file changed, 65 insertions(+), 60 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 316adb9..8b8e5e8 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -324,11 +324,44 @@ static void remove_huge_page(struct page *page)
 	delete_from_page_cache(page);
 }
 
+static inline void
+hugetlb_vmdelete_list(struct rb_root *root, pgoff_t start, pgoff_t end)
+{
+	struct vm_area_struct *vma;
+
+	/*
+	 * end == 0 indicates that the entire range after
+	 * start should be unmapped.
+	 */
+	vma_interval_tree_foreach(vma, root, start, end ? end : ULONG_MAX) {
+		unsigned long v_offset;
+
+		/*
+		 * Can the expression below overflow on 32-bit arches?
+		 * No, because the interval tree returns us only those vmas
+		 * which overlap the truncated area starting at pgoff,
+		 * and no vma on a 32-bit arch can span beyond the 4GB.
+		 */
+		if (vma->vm_pgoff < start)
+			v_offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
+		else
+			v_offset = 0;
+
+		if (end) {
+			end = ((end - start) << PAGE_SHIFT) +
+			       vma->vm_start + v_offset;
+			if (end > vma->vm_end)
+				end = vma->vm_end;
+		} else
+			end = vma->vm_end;
+
+		unmap_hugepage_range(vma, vma->vm_start + v_offset, end, NULL);
+	}
+}
 
 /*
  * remove_inode_hugepages handles two distinct cases: truncation and hole
  * punch.  There are subtle differences in operation for each case.
-
  * truncation is indicated by end of range being LLONG_MAX
  *	In this case, we first scan the range and release found pages.
  *	After releasing pages, hugetlb_unreserve_pages cleans up region/reserv
@@ -381,12 +414,27 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 		for (i = 0; i < pagevec_count(&pvec); ++i) {
 			struct page *page = pvec.pages[i];
 			u32 hash;
+			bool rsv_on_error;
 
 			hash = hugetlb_fault_mutex_hash(h, current->mm,
 							&pseudo_vma,
 							mapping, next, 0);
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
+			/*
+			 * If page is mapped, it was faulted in after being
+			 * unmapped in caller.  Unmap (again) now after taking
+			 * the fault mutex.  The mutex will prevent faults
+			 * until we finish removing the page.
+			 */
+			if (page_mapped(page)) {
+				i_mmap_lock_write(mapping);
+				hugetlb_vmdelete_list(&mapping->i_mmap,
+					next * pages_per_huge_page(h),
+					(next + 1) * pages_per_huge_page(h));
+				i_mmap_unlock_write(mapping);
+			}
+
 			lock_page(page);
 			if (page->index >= end) {
 				unlock_page(page);
@@ -396,31 +444,23 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 			}
 
 			/*
-			 * If page is mapped, it was faulted in after being
-			 * unmapped.  Do nothing in this race case.  In the
-			 * normal case page is not mapped.
+			 * We must free the huge page and remove from page
+			 * cache (remove_huge_page) BEFORE removing the
+			 * region/reserve map (hugetlb_unreserve_pages).
+			 * In rare out of memory conditions, removal of the
+			 * region/reserve map could fail.  Before free'ing
+			 * the page, note PagePrivate which is used in case
+			 * of error.
 			 */
-			if (!page_mapped(page)) {
-				bool rsv_on_error = !PagePrivate(page);
-				/*
-				 * We must free the huge page and remove
-				 * from page cache (remove_huge_page) BEFORE
-				 * removing the region/reserve map
-				 * (hugetlb_unreserve_pages).  In rare out
-				 * of memory conditions, removal of the
-				 * region/reserve map could fail.  Before
-				 * free'ing the page, note PagePrivate which
-				 * is used in case of error.
-				 */
-				remove_huge_page(page);
-				freed++;
-				if (!truncate_op) {
-					if (unlikely(hugetlb_unreserve_pages(
-							inode, next,
-							next + 1, 1)))
-						hugetlb_fix_reserve_counts(
-							inode, rsv_on_error);
-				}
+			rsv_on_error = !PagePrivate(page);
+			remove_huge_page(page);
+			freed++;
+			if (!truncate_op) {
+				if (unlikely(hugetlb_unreserve_pages(inode,
+								next, next + 1,
+								1)))
+					hugetlb_fix_reserve_counts(inode,
+								rsv_on_error);
 			}
 
 			if (page->index > next)
@@ -450,41 +490,6 @@ static void hugetlbfs_evict_inode(struct inode *inode)
 	clear_inode(inode);
 }
 
-static inline void
-hugetlb_vmdelete_list(struct rb_root *root, pgoff_t start, pgoff_t end)
-{
-	struct vm_area_struct *vma;
-
-	/*
-	 * end == 0 indicates that the entire range after
-	 * start should be unmapped.
-	 */
-	vma_interval_tree_foreach(vma, root, start, end ? end : ULONG_MAX) {
-		unsigned long v_offset;
-
-		/*
-		 * Can the expression below overflow on 32-bit arches?
-		 * No, because the interval tree returns us only those vmas
-		 * which overlap the truncated area starting at pgoff,
-		 * and no vma on a 32-bit arch can span beyond the 4GB.
-		 */
-		if (vma->vm_pgoff < start)
-			v_offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
-		else
-			v_offset = 0;
-
-		if (end) {
-			end = ((end - start) << PAGE_SHIFT) +
-			       vma->vm_start + v_offset;
-			if (end > vma->vm_end)
-				end = vma->vm_end;
-		} else
-			end = vma->vm_end;
-
-		unmap_hugepage_range(vma, vma->vm_start + v_offset, end, NULL);
-	}
-}
-
 static int hugetlb_vmtruncate(struct inode *inode, loff_t offset)
 {
 	pgoff_t pgoff;
-- 
2.4.3



* Re: [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch
  2015-10-29 22:33 [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch Mike Kravetz
@ 2015-10-30  3:32 ` Hugh Dickins
  2015-10-30 16:45   ` Mike Kravetz
  0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2015-10-30  3:32 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, Andrew Morton, Dave Hansen,
	Naoya Horiguchi, Hugh Dickins, Davidlohr Bueso

On Thu, 29 Oct 2015, Mike Kravetz wrote:

> This patch is a combination of:
> [PATCH v2 4/4] mm/hugetlb: Unmap pages to remove if page fault raced
> 	with hole punch  and,
> [PATCH] mm/hugetlb: i_mmap_lock_write before unmapping in
> 	remove_inode_hugepages
> This patch can replace the entire series:
> [PATCH v2 0/4] hugetlbfs fallocate hole punch race with page faults
> 	and
> [PATCH] mm/hugetlb: i_mmap_lock_write before unmapping in
> 	remove_inode_hugepages
> It is being provided in an effort to possibly make tree management easier.
> 
> Page faults can race with fallocate hole punch.  If a page fault happens
> between the unmap and remove operations, the page is not removed and
> remains within the hole.  This is not the desired behavior.
> 
> If this race is detected and a page is mapped, the remove operation
> (remove_inode_hugepages) will unmap the page before removing.  The unmap
> within remove_inode_hugepages occurs with the hugetlb_fault_mutex held
> so that no other faults can occur until the page is removed.
> 
> The (unmodified) routine hugetlb_vmdelete_list was moved ahead of
> remove_inode_hugepages to satisfy the new reference.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Sorry, I came here to give this a quick Ack, but find I cannot:
you're adding to the remove_inode_hugepages() loop (heading towards
4.3 final), but its use of "next" looks wrong to me already.

Doesn't "next" need to be assigned from page->index much earlier?
If there's a hole in the file (which there very well might be, since
you've just implemented holepunch!), doesn't it do the wrong thing?

And the loop itself is a bit weird, though that probably doesn't
matter very much: I said before, seeing the "while (next < end)",
that it's a straightforward scan from start to end, and sometimes
it would work that way; but buried inside is "next = start; continue;"
from a contrasting "pincer" loop (which goes back to squeeze every
page out of the range, lest faults raced truncation or holepunch).
I know the originals in truncate.c or shmem.c are quite tricky,
but this being different again would take time to validate.

No cond_resched() either.

Hugh

> ---
>  fs/hugetlbfs/inode.c | 125 ++++++++++++++++++++++++++-------------------------
>  1 file changed, 65 insertions(+), 60 deletions(-)
> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 316adb9..8b8e5e8 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -324,11 +324,44 @@ static void remove_huge_page(struct page *page)
>  	delete_from_page_cache(page);
>  }
>  
> +static inline void
> +hugetlb_vmdelete_list(struct rb_root *root, pgoff_t start, pgoff_t end)
> +{
> +	struct vm_area_struct *vma;
> +
> +	/*
> +	 * end == 0 indicates that the entire range after
> +	 * start should be unmapped.
> +	 */
> +	vma_interval_tree_foreach(vma, root, start, end ? end : ULONG_MAX) {
> +		unsigned long v_offset;
> +
> +		/*
> +		 * Can the expression below overflow on 32-bit arches?
> +		 * No, because the interval tree returns us only those vmas
> +		 * which overlap the truncated area starting at pgoff,
> +		 * and no vma on a 32-bit arch can span beyond the 4GB.
> +		 */
> +		if (vma->vm_pgoff < start)
> +			v_offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
> +		else
> +			v_offset = 0;
> +
> +		if (end) {
> +			end = ((end - start) << PAGE_SHIFT) +
> +			       vma->vm_start + v_offset;
> +			if (end > vma->vm_end)
> +				end = vma->vm_end;
> +		} else
> +			end = vma->vm_end;
> +
> +		unmap_hugepage_range(vma, vma->vm_start + v_offset, end, NULL);
> +	}
> +}
>  
>  /*
>   * remove_inode_hugepages handles two distinct cases: truncation and hole
>   * punch.  There are subtle differences in operation for each case.
> -
>   * truncation is indicated by end of range being LLONG_MAX
>   *	In this case, we first scan the range and release found pages.
>   *	After releasing pages, hugetlb_unreserve_pages cleans up region/reserv
> @@ -381,12 +414,27 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
>  		for (i = 0; i < pagevec_count(&pvec); ++i) {
>  			struct page *page = pvec.pages[i];
>  			u32 hash;
> +			bool rsv_on_error;
>  
>  			hash = hugetlb_fault_mutex_hash(h, current->mm,
>  							&pseudo_vma,
>  							mapping, next, 0);
>  			mutex_lock(&hugetlb_fault_mutex_table[hash]);
>  
> +			/*
> +			 * If page is mapped, it was faulted in after being
> +			 * unmapped in caller.  Unmap (again) now after taking
> +			 * the fault mutex.  The mutex will prevent faults
> +			 * until we finish removing the page.
> +			 */
> +			if (page_mapped(page)) {
> +				i_mmap_lock_write(mapping);
> +				hugetlb_vmdelete_list(&mapping->i_mmap,
> +					next * pages_per_huge_page(h),
> +					(next + 1) * pages_per_huge_page(h));
> +				i_mmap_unlock_write(mapping);
> +			}
> +
>  			lock_page(page);
>  			if (page->index >= end) {
>  				unlock_page(page);
> @@ -396,31 +444,23 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
>  			}
>  
>  			/*
> -			 * If page is mapped, it was faulted in after being
> -			 * unmapped.  Do nothing in this race case.  In the
> -			 * normal case page is not mapped.
> +			 * We must free the huge page and remove from page
> +			 * cache (remove_huge_page) BEFORE removing the
> +			 * region/reserve map (hugetlb_unreserve_pages).
> +			 * In rare out of memory conditions, removal of the
> +			 * region/reserve map could fail.  Before free'ing
> +			 * the page, note PagePrivate which is used in case
> +			 * of error.
>  			 */
> -			if (!page_mapped(page)) {
> -				bool rsv_on_error = !PagePrivate(page);
> -				/*
> -				 * We must free the huge page and remove
> -				 * from page cache (remove_huge_page) BEFORE
> -				 * removing the region/reserve map
> -				 * (hugetlb_unreserve_pages).  In rare out
> -				 * of memory conditions, removal of the
> -				 * region/reserve map could fail.  Before
> -				 * free'ing the page, note PagePrivate which
> -				 * is used in case of error.
> -				 */
> -				remove_huge_page(page);
> -				freed++;
> -				if (!truncate_op) {
> -					if (unlikely(hugetlb_unreserve_pages(
> -							inode, next,
> -							next + 1, 1)))
> -						hugetlb_fix_reserve_counts(
> -							inode, rsv_on_error);
> -				}
> +			rsv_on_error = !PagePrivate(page);
> +			remove_huge_page(page);
> +			freed++;
> +			if (!truncate_op) {
> +				if (unlikely(hugetlb_unreserve_pages(inode,
> +								next, next + 1,
> +								1)))
> +					hugetlb_fix_reserve_counts(inode,
> +								rsv_on_error);
>  			}
>  
>  			if (page->index > next)
> @@ -450,41 +490,6 @@ static void hugetlbfs_evict_inode(struct inode *inode)
>  	clear_inode(inode);
>  }
>  
> -static inline void
> -hugetlb_vmdelete_list(struct rb_root *root, pgoff_t start, pgoff_t end)
> -{
> -	struct vm_area_struct *vma;
> -
> -	/*
> -	 * end == 0 indicates that the entire range after
> -	 * start should be unmapped.
> -	 */
> -	vma_interval_tree_foreach(vma, root, start, end ? end : ULONG_MAX) {
> -		unsigned long v_offset;
> -
> -		/*
> -		 * Can the expression below overflow on 32-bit arches?
> -		 * No, because the interval tree returns us only those vmas
> -		 * which overlap the truncated area starting at pgoff,
> -		 * and no vma on a 32-bit arch can span beyond the 4GB.
> -		 */
> -		if (vma->vm_pgoff < start)
> -			v_offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
> -		else
> -			v_offset = 0;
> -
> -		if (end) {
> -			end = ((end - start) << PAGE_SHIFT) +
> -			       vma->vm_start + v_offset;
> -			if (end > vma->vm_end)
> -				end = vma->vm_end;
> -		} else
> -			end = vma->vm_end;
> -
> -		unmap_hugepage_range(vma, vma->vm_start + v_offset, end, NULL);
> -	}
> -}
> -
>  static int hugetlb_vmtruncate(struct inode *inode, loff_t offset)
>  {
>  	pgoff_t pgoff;
> -- 
> 2.4.3
> 
> 


* Re: [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch
  2015-10-30  3:32 ` Hugh Dickins
@ 2015-10-30 16:45   ` Mike Kravetz
  2015-10-30 20:56     ` Mike Kravetz
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Kravetz @ 2015-10-30 16:45 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: linux-mm, linux-kernel, Andrew Morton, Dave Hansen,
	Naoya Horiguchi, Davidlohr Bueso

On 10/29/2015 08:32 PM, Hugh Dickins wrote:
> On Thu, 29 Oct 2015, Mike Kravetz wrote:
> 
>> This patch is a combination of:
>> [PATCH v2 4/4] mm/hugetlb: Unmap pages to remove if page fault raced
>> 	with hole punch  and,
>> [PATCH] mm/hugetlb: i_mmap_lock_write before unmapping in
>> 	remove_inode_hugepages
>> This patch can replace the entire series:
>> [PATCH v2 0/4] hugetlbfs fallocate hole punch race with page faults
>> 	and
>> [PATCH] mm/hugetlb: i_mmap_lock_write before unmapping in
>> 	remove_inode_hugepages
>> It is being provided in an effort to possibly make tree management easier.
>>
>> Page faults can race with fallocate hole punch.  If a page fault happens
>> between the unmap and remove operations, the page is not removed and
>> remains within the hole.  This is not the desired behavior.
>>
>> If this race is detected and a page is mapped, the remove operation
>> (remove_inode_hugepages) will unmap the page before removing.  The unmap
>> within remove_inode_hugepages occurs with the hugetlb_fault_mutex held
>> so that no other faults can occur until the page is removed.
>>
>> The (unmodified) routine hugetlb_vmdelete_list was moved ahead of
>> remove_inode_hugepages to satisfy the new reference.
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> Sorry, I came here to give this a quick Ack, but find I cannot:
> you're adding to the remove_inode_hugepages() loop (heading towards
> 4.3 final), but its use of "next" looks wrong to me already.

You are correct, the (current) code is wrong.

The hugetlbfs fallocate code started with shmem as an example.  Some
of the complexities of that code are not needed in hugetlbfs.  However,
some remnants were left.

I'll create a patch to fix the existing code, then when that is acceptable
refactor this patch.

> 
> Doesn't "next" need to be assigned from page->index much earlier?
> If there's a hole in the file (which there very well might be, since
> you've just implemented holepunch!), doesn't it do the wrong thing?

Yes, I think it will.

> 
> And the loop itself is a bit weird, though that probably doesn't
> matter very much: I said before, seeing the "while (next < end)",
> that it's a straightforward scan from start to end, and sometimes
> it would work that way; but buried inside is "next = start; continue;"

Correct, that next = start should not be there.

Thanks
-- 
Mike Kravetz

> from a contrasting "pincer" loop (which goes back to squeeze every
> page out of the range, lest faults raced truncation or holepunch).
> I know the originals in truncate.c or shmem.c are quite tricky,
> but this being different again would take time to validate.
> 
> No cond_resched() either.
> 
> Hugh
> 
>> ---
>>  fs/hugetlbfs/inode.c | 125 ++++++++++++++++++++++++++-------------------------
>>  1 file changed, 65 insertions(+), 60 deletions(-)
>>
>> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
>> index 316adb9..8b8e5e8 100644
>> --- a/fs/hugetlbfs/inode.c
>> +++ b/fs/hugetlbfs/inode.c
>> @@ -324,11 +324,44 @@ static void remove_huge_page(struct page *page)
>>  	delete_from_page_cache(page);
>>  }
>>  
>> +static inline void
>> +hugetlb_vmdelete_list(struct rb_root *root, pgoff_t start, pgoff_t end)
>> +{
>> +	struct vm_area_struct *vma;
>> +
>> +	/*
>> +	 * end == 0 indicates that the entire range after
>> +	 * start should be unmapped.
>> +	 */
>> +	vma_interval_tree_foreach(vma, root, start, end ? end : ULONG_MAX) {
>> +		unsigned long v_offset;
>> +
>> +		/*
>> +		 * Can the expression below overflow on 32-bit arches?
>> +		 * No, because the interval tree returns us only those vmas
>> +		 * which overlap the truncated area starting at pgoff,
>> +		 * and no vma on a 32-bit arch can span beyond the 4GB.
>> +		 */
>> +		if (vma->vm_pgoff < start)
>> +			v_offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
>> +		else
>> +			v_offset = 0;
>> +
>> +		if (end) {
>> +			end = ((end - start) << PAGE_SHIFT) +
>> +			       vma->vm_start + v_offset;
>> +			if (end > vma->vm_end)
>> +				end = vma->vm_end;
>> +		} else
>> +			end = vma->vm_end;
>> +
>> +		unmap_hugepage_range(vma, vma->vm_start + v_offset, end, NULL);
>> +	}
>> +}
>>  
>>  /*
>>   * remove_inode_hugepages handles two distinct cases: truncation and hole
>>   * punch.  There are subtle differences in operation for each case.
>> -
>>   * truncation is indicated by end of range being LLONG_MAX
>>   *	In this case, we first scan the range and release found pages.
>>   *	After releasing pages, hugetlb_unreserve_pages cleans up region/reserv
>> @@ -381,12 +414,27 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
>>  		for (i = 0; i < pagevec_count(&pvec); ++i) {
>>  			struct page *page = pvec.pages[i];
>>  			u32 hash;
>> +			bool rsv_on_error;
>>  
>>  			hash = hugetlb_fault_mutex_hash(h, current->mm,
>>  							&pseudo_vma,
>>  							mapping, next, 0);
>>  			mutex_lock(&hugetlb_fault_mutex_table[hash]);
>>  
>> +			/*
>> +			 * If page is mapped, it was faulted in after being
>> +			 * unmapped in caller.  Unmap (again) now after taking
>> +			 * the fault mutex.  The mutex will prevent faults
>> +			 * until we finish removing the page.
>> +			 */
>> +			if (page_mapped(page)) {
>> +				i_mmap_lock_write(mapping);
>> +				hugetlb_vmdelete_list(&mapping->i_mmap,
>> +					next * pages_per_huge_page(h),
>> +					(next + 1) * pages_per_huge_page(h));
>> +				i_mmap_unlock_write(mapping);
>> +			}
>> +
>>  			lock_page(page);
>>  			if (page->index >= end) {
>>  				unlock_page(page);
>> @@ -396,31 +444,23 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
>>  			}
>>  
>>  			/*
>> -			 * If page is mapped, it was faulted in after being
>> -			 * unmapped.  Do nothing in this race case.  In the
>> -			 * normal case page is not mapped.
>> +			 * We must free the huge page and remove from page
>> +			 * cache (remove_huge_page) BEFORE removing the
>> +			 * region/reserve map (hugetlb_unreserve_pages).
>> +			 * In rare out of memory conditions, removal of the
>> +			 * region/reserve map could fail.  Before free'ing
>> +			 * the page, note PagePrivate which is used in case
>> +			 * of error.
>>  			 */
>> -			if (!page_mapped(page)) {
>> -				bool rsv_on_error = !PagePrivate(page);
>> -				/*
>> -				 * We must free the huge page and remove
>> -				 * from page cache (remove_huge_page) BEFORE
>> -				 * removing the region/reserve map
>> -				 * (hugetlb_unreserve_pages).  In rare out
>> -				 * of memory conditions, removal of the
>> -				 * region/reserve map could fail.  Before
>> -				 * free'ing the page, note PagePrivate which
>> -				 * is used in case of error.
>> -				 */
>> -				remove_huge_page(page);
>> -				freed++;
>> -				if (!truncate_op) {
>> -					if (unlikely(hugetlb_unreserve_pages(
>> -							inode, next,
>> -							next + 1, 1)))
>> -						hugetlb_fix_reserve_counts(
>> -							inode, rsv_on_error);
>> -				}
>> +			rsv_on_error = !PagePrivate(page);
>> +			remove_huge_page(page);
>> +			freed++;
>> +			if (!truncate_op) {
>> +				if (unlikely(hugetlb_unreserve_pages(inode,
>> +								next, next + 1,
>> +								1)))
>> +					hugetlb_fix_reserve_counts(inode,
>> +								rsv_on_error);
>>  			}
>>  
>>  			if (page->index > next)
>> @@ -450,41 +490,6 @@ static void hugetlbfs_evict_inode(struct inode *inode)
>>  	clear_inode(inode);
>>  }
>>  
>> -static inline void
>> -hugetlb_vmdelete_list(struct rb_root *root, pgoff_t start, pgoff_t end)
>> -{
>> -	struct vm_area_struct *vma;
>> -
>> -	/*
>> -	 * end == 0 indicates that the entire range after
>> -	 * start should be unmapped.
>> -	 */
>> -	vma_interval_tree_foreach(vma, root, start, end ? end : ULONG_MAX) {
>> -		unsigned long v_offset;
>> -
>> -		/*
>> -		 * Can the expression below overflow on 32-bit arches?
>> -		 * No, because the interval tree returns us only those vmas
>> -		 * which overlap the truncated area starting at pgoff,
>> -		 * and no vma on a 32-bit arch can span beyond the 4GB.
>> -		 */
>> -		if (vma->vm_pgoff < start)
>> -			v_offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
>> -		else
>> -			v_offset = 0;
>> -
>> -		if (end) {
>> -			end = ((end - start) << PAGE_SHIFT) +
>> -			       vma->vm_start + v_offset;
>> -			if (end > vma->vm_end)
>> -				end = vma->vm_end;
>> -		} else
>> -			end = vma->vm_end;
>> -
>> -		unmap_hugepage_range(vma, vma->vm_start + v_offset, end, NULL);
>> -	}
>> -}
>> -
>>  static int hugetlb_vmtruncate(struct inode *inode, loff_t offset)
>>  {
>>  	pgoff_t pgoff;
>> -- 
>> 2.4.3
>>
>>


* Re: [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch
  2015-10-30 16:45   ` Mike Kravetz
@ 2015-10-30 20:56     ` Mike Kravetz
  2015-11-09  7:42       ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Kravetz @ 2015-10-30 20:56 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: linux-mm, linux-kernel, Andrew Morton, Dave Hansen,
	Naoya Horiguchi, Davidlohr Bueso

On 10/30/2015 09:45 AM, Mike Kravetz wrote:
> On 10/29/2015 08:32 PM, Hugh Dickins wrote:
>> On Thu, 29 Oct 2015, Mike Kravetz wrote:
>>
>>> This patch is a combination of:
>>> [PATCH v2 4/4] mm/hugetlb: Unmap pages to remove if page fault raced
>>> 	with hole punch  and,
>>> [PATCH] mm/hugetlb: i_mmap_lock_write before unmapping in
>>> 	remove_inode_hugepages
>>> This patch can replace the entire series:
>>> [PATCH v2 0/4] hugetlbfs fallocate hole punch race with page faults
>>> 	and
>>> [PATCH] mm/hugetlb: i_mmap_lock_write before unmapping in
>>> 	remove_inode_hugepages
>>> It is being provided in an effort to possibly make tree management easier.
>>>
>>> Page faults can race with fallocate hole punch.  If a page fault happens
>>> between the unmap and remove operations, the page is not removed and
>>> remains within the hole.  This is not the desired behavior.
>>>
>>> If this race is detected and a page is mapped, the remove operation
>>> (remove_inode_hugepages) will unmap the page before removing.  The unmap
>>> within remove_inode_hugepages occurs with the hugetlb_fault_mutex held
>>> so that no other faults can occur until the page is removed.
>>>
>>> The (unmodified) routine hugetlb_vmdelete_list was moved ahead of
>>> remove_inode_hugepages to satisfy the new reference.
>>>
>>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>>
>> Sorry, I came here to give this a quick Ack, but find I cannot:
>> you're adding to the remove_inode_hugepages() loop (heading towards
>> 4.3 final), but its use of "next" looks wrong to me already.
> 
> You are correct, the (current) code is wrong.
> 
> The hugetlbfs fallocate code started with shmem as an example.  Some
> of the complexities of that code are not needed in hugetlbfs.  However,
> some remnants were left.
> 
> I'll create a patch to fix the existing code, then when that is acceptable
> refactor this patch.
> 
>>
>> Doesn't "next" need to be assigned from page->index much earlier?
>> If there's a hole in the file (which there very well might be, since
>> you've just implemented holepunch!), doesn't it do the wrong thing?
> 
> Yes, I think it will.
> 
>>
>> And the loop itself is a bit weird, though that probably doesn't
>> matter very much: I said before, seeing the "while (next < end)",
>> that it's a straightforward scan from start to end, and sometimes
>> it would work that way; but buried inside is "next = start; continue;"
> 
> Correct, that next = start should not be there.

The 'next = start' code is actually from the original truncate_hugepages
routine.  This functionality was combined with that needed for hole punch
to create remove_inode_hugepages().

The following code was in truncate_hugepages:

	next = start;
	while (1) {
		if (!pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
			if (next == start)
				break;
			next = start;
			continue;
		}


So, in the truncate case pages starting at 'start' are deleted until
pagevec_lookup fails.  Then, we call pagevec_lookup() again.  If no
pages are found we are done.  Else, we repeat the whole process.

Does anyone recall the reason for going back and looking for pages at
indexes already deleted?  Git doesn't help as that was part of the initial
commit.  My thought is that truncate can race with page faults.  The
truncate code sets inode offset before unmapping and deleting pages.
So, faults after the new offset is set should fail.  But, I suppose a
fault could race with setting offset and deleting of pages.  Does this
sound right?  Or, is there some other reason I am missing?

I would like to continue having remove_inode_hugepages handle both the
truncate and hole punch cases.  So, I want to make sure the code correctly
handles both cases.

-- 
Mike Kravetz


* Re: [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch
  2015-10-30 20:56     ` Mike Kravetz
@ 2015-11-09  7:42       ` Hugh Dickins
  2015-11-09 22:55         ` Mike Kravetz
  0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2015-11-09  7:42 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Hugh Dickins, linux-mm, linux-kernel, Andrew Morton, Dave Hansen,
	Naoya Horiguchi, Davidlohr Bueso

On Fri, 30 Oct 2015, Mike Kravetz wrote:
> 
> The 'next = start' code is actually from the original truncate_hugepages
> routine.  This functionality was combined with that needed for hole punch
> to create remove_inode_hugepages().
> 
> The following code was in truncate_hugepages:
> 
> 	next = start;
> 	while (1) {
> 		if (!pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
> 			if (next == start)
> 				break;
> 			next = start;
> 			continue;
> 		}
> 
> 
> So, in the truncate case pages starting at 'start' are deleted until
> pagevec_lookup fails.  Then, we call pagevec_lookup() again.  If no
> pages are found we are done.  Else, we repeat the whole process.
> 
> Does anyone recall the reason for going back and looking for pages at
> index'es already deleted?  Git doesn't help as that was part of initial
> commit.  My thought is that truncate can race with page faults.  The
> truncate code sets inode offset before unmapping and deleting pages.
> So, faults after the new offset is set should fail.  But, I suppose a
> fault could race with setting offset and deleting of pages.  Does this
> sound right?  Or, is there some other reason I am missing?

I believe your thinking is correct.  But remember that
truncate_inode_pages_range() is shared by almost all filesystems,
and different filesystems have different internal locking conventions,
and different propensities to such a race: it's trying to cover for
all of them.

Typically, writing is well serialized (by i_mutex) against truncation,
but faulting (like reading) sails through without enough of a lock.
We resort to i_size checks to avoid the worst of it, but there's often
a corner or two in which those checks are not quite good enough -
it's easy to check i_size at the beginning, but it needs to be checked
again at the end too, and what's been done undone - can be awkward.
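
The shape of that pattern, very roughly (pseudocode only: the helper
names below are made up for illustration, they are not any particular
filesystem's):

	if (offset >= i_size_read(inode))	/* cheap check up front */
		return VM_FAULT_SIGBUS;
	page = prepare_new_page();		/* allocate, zero, charge, ... (made-up helper) */
	lock_against_truncation();		/* made-up helper */
	if (offset >= i_size_read(inode)) {	/* ... recheck at the end ... */
		unlock_against_truncation();
		undo_page_preparation(page);	/* ... and undo what was done */
		return VM_FAULT_SIGBUS;
	}
	install_new_page(page);			/* made-up helper */
	unlock_against_truncation();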

I hope that in the case of hugetlbfs, since you already have the
additional fault_mutex to handle races between faults and punching,
it should be possible to get away without that "pincer" restarting.

Hugh

> 
> I would like to continue having remove_inode_hugepages handle both the
> truncate and hole punch cases.  So, I want to make sure the code correctly
> handles both cases.
> 
> -- 
> Mike Kravetz


* Re: [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch
  2015-11-09  7:42       ` Hugh Dickins
@ 2015-11-09 22:55         ` Mike Kravetz
  2015-11-10 22:41           ` Mike Kravetz
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Kravetz @ 2015-11-09 22:55 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: linux-mm, linux-kernel, Andrew Morton, Dave Hansen,
	Naoya Horiguchi, Davidlohr Bueso

On 11/08/2015 11:42 PM, Hugh Dickins wrote:
> On Fri, 30 Oct 2015, Mike Kravetz wrote:
>>
>> The 'next = start' code is actually from the original truncate_hugepages
>> routine.  This functionality was combined with that needed for hole punch
>> to create remove_inode_hugepages().
>>
>> The following code was in truncate_hugepages:
>>
>> 	next = start;
>> 	while (1) {
>> 		if (!pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
>> 			if (next == start)
>> 				break;
>> 			next = start;
>> 			continue;
>> 		}
>>
>>
>> So, in the truncate case pages starting at 'start' are deleted until
>> pagevec_lookup fails.  Then, we call pagevec_lookup() again.  If no
>> pages are found we are done.  Else, we repeat the whole process.
>>
>> Does anyone recall the reason for going back and looking for pages at
>> index'es already deleted?  Git doesn't help as that was part of initial
>> commit.  My thought is that truncate can race with page faults.  The
>> truncate code sets inode offset before unmapping and deleting pages.
>> So, faults after the new offset is set should fail.  But, I suppose a
>> fault could race with setting offset and deleting of pages.  Does this
>> sound right?  Or, is there some other reason I am missing?
> 
> I believe your thinking is correct.  But remember that
> truncate_inode_pages_range() is shared by almost all filesystems,
> and different filesystems have different internal locking conventions,
> and different propensities to such a race: it's trying to cover for
> all of them.
> 
> Typically, writing is well serialized (by i_mutex) against truncation,
> but faulting (like reading) sails through without enough of a lock.
> We resort to i_size checks to avoid the worst of it, but there's often
> a corner or two in which those checks are not quite good enough -
> it's easy to check i_size at the beginning, but it needs to be checked
> again at the end too, and what's been done undone - can be awkward.

Well, it looks like the hugetlb_no_page() routine is checking i_size both
before and after.  It appears to be doing the right thing to handle the
race, but I need to stare at the code some more to make sure.

Because of the way the truncate code went back and did an extra lookup
when done with the range, I assumed it was covering some race.  However,
that may not be the case.

> 
> I hope that in the case of hugetlbfs, since you already have the
> additional fault_mutex to handle races between faults and punching,
> it should be possible to get away without that "pincer" restarting.

Yes, it looks like this may work as a straight loop over the range of
pages.  I just need to study the code some more to make sure I am not
missing something.
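
For illustration, such a straight scan could look something like this
(a rough sketch only, not a final patch: declarations are as in the
current remove_inode_hugepages, and the fault mutex plus the per-page
locking/removal in the middle are elided):

	next = start;
	while (next < end) {
		if (!pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE))
			break;			/* no more pages in the mapping */
		for (i = 0; i < pagevec_count(&pvec); ++i) {
			struct page *page = pvec.pages[i];

			next = page->index;	/* honor holes in the file */
			if (next >= end)
				break;
			/* fault mutex, lock_page, remove_huge_page, ... */
			next++;			/* resume the scan after this page */
		}
		pagevec_release(&pvec);
		cond_resched();
	}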

-- 
Mike Kravetz

> 
> Hugh
> 
>>
>> I would like to continue having remove_inode_hugepages handle both the
>> truncate and hole punch cases.  So, I want to make sure the code correctly
>> handles both cases.
>>
>> -- 
>> Mike Kravetz


* Re: [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch
  2015-11-09 22:55         ` Mike Kravetz
@ 2015-11-10 22:41           ` Mike Kravetz
  2015-11-14  0:36             ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Kravetz @ 2015-11-10 22:41 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: linux-mm, linux-kernel, Andrew Morton, Dave Hansen,
	Naoya Horiguchi, Davidlohr Bueso

On 11/09/2015 02:55 PM, Mike Kravetz wrote:
> On 11/08/2015 11:42 PM, Hugh Dickins wrote:
>> On Fri, 30 Oct 2015, Mike Kravetz wrote:
>>>
>>> The 'next = start' code is actually from the original truncate_hugepages
>>> routine.  This functionality was combined with that needed for hole punch
>>> to create remove_inode_hugepages().
>>>
>>> The following code was in truncate_hugepages:
>>>
>>> 	next = start;
>>> 	while (1) {
>>> 		if (!pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
>>> 			if (next == start)
>>> 				break;
>>> 			next = start;
>>> 			continue;
>>> 		}
>>>
>>>
>>> So, in the truncate case pages starting at 'start' are deleted until
>>> pagevec_lookup fails.  Then, we call pagevec_lookup() again.  If no
>>> pages are found we are done.  Else, we repeat the whole process.
>>>
>>> Does anyone recall the reason for going back and looking for pages at
>>> index'es already deleted?  Git doesn't help as that was part of initial
>>> commit.  My thought is that truncate can race with page faults.  The
>>> truncate code sets inode offset before unmapping and deleting pages.
>>> So, faults after the new offset is set should fail.  But, I suppose a
>>> fault could race with setting offset and deleting of pages.  Does this
>>> sound right?  Or, is there some other reason I am missing?
>>
>> I believe your thinking is correct.  But remember that
>> truncate_inode_pages_range() is shared by almost all filesystems,
>> and different filesystems have different internal locking conventions,
>> and different propensities to such a race: it's trying to cover for
>> all of them.
>>
>> Typically, writing is well serialized (by i_mutex) against truncation,
>> but faulting (like reading) sails through without enough of a lock.
>> We resort to i_size checks to avoid the worst of it, but there's often
>> a corner or two in which those checks are not quite good enough -
>> it's easy to check i_size at the beginning, but it needs to be checked
>> again at the end too, and what's been done undone - can be awkward.
> 
> Well, it looks like the hugetlb_no_page() routine is checking i_size both
> before and after.  It appears to be doing the right thing to handle the
> race, but I need to stare at the code some more to make sure.
> 
> Because of the way the truncate code went back and did an extra lookup
> when done with the range, I assumed it was covering some race.  However,
> that may not be the case.
> 
>>
>> I hope that in the case of hugetlbfs, since you already have the
>> additional fault_mutex to handle races between faults and punching,
>> it should be possible to get away without that "pincer" restarting.
> 
> Yes, it looks like this may work as a straight loop over the range of
> pages.  I just need to study the code some more to make sure I am not
> missing something.

I have convinced myself that hugetlb_no_page is coded such that page
faults can not race with truncate.  hugetlb_no_page handles the case
where there is no PTE for a faulted in address.  The general flow in
hugetlb_no_page for the no page found case is:
- check index against i_size, end if beyond
- allocate huge page
- take page table lock for huge page
- check index against i_size again; if beyond, free the page and return
- add huge page to page table
- unlock page table lock for huge page

The flow for the truncate operation in hugetlb_vmtruncate is:
- set i_size
- take inode/mapping write lock
- hugetlb_vmdelete_list() which removes page table entries.  The page
  table lock will be taken for each huge page in the range
- release inode/mapping write lock
- remove_inode_hugepages() to actually remove pages

The truncate/page fault race we are concerned with is if a page is faulted
in after hugetlb_vmtruncate sets i_size and unmaps the page, but before
actually removing the page.  Obviously, any entry into hugetlb_no_page
after i_size is set will check the value and not allow the fault.  In
addition, if the value of i_size is set before the second check in
hugetlb_no_page, it will do the right thing.  Therefore, the only place to
race is after the second i_size check in hugetlb_no_page.

Note that the second check for i_size is with the page table lock for
the huge page held.  It is not possible for hugetlb_vmtruncate to unmap
the huge page before the page fault completes, as it must acquire the page
table lock.  This is the same as a fault happening before the truncate
operation starts and is handled correctly by hugetlb_vmtruncate.

Another way to look at this is by asking the question: is it possible to
fault on a page in the truncate range after it is unmapped by
hugetlb_vmtruncate/hugetlb_vmdelete_list?  To unmap a page,
hugetlb_vmtruncate will:
- set i_size
- take page table lock for huge page
- unmap page
- release page table lock for page

In order to fault in the page, it must take the same page table lock and
check i_size.  I do not know of any way for the faulting code to get an
old value for i_size.

Please let me know if my reasoning is incorrect.  I will code up a new
(simpler) version of remove_inode_hugepages with the assumption that
truncate can not race with page faults.

Also, I wrote a fairly simple test to have truncate race with page faults.
It was quite easy to hit the second check in hugetlb_no_page where it
notices index is beyond i_size and backs out of the fault.  Even after
adding delays in strategic locations of the fault and truncate code, I
could not cause a race as observed by remove_inode_hugepages.
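
A minimal sketch of the sort of test involved (not the exact program;
the hugetlbfs mount point, mapping size and the absence of any injected
delays are placeholders/simplifications):

/*
 * One thread repeatedly faults the last huge page while the main thread
 * truncates the file down and back up.  Faults that lose the race get
 * SIGBUS, which the faulting thread catches and retries.
 */
#include <fcntl.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

#define SZ	(4UL * 2 * 1024 * 1024)		/* four 2MB huge pages */

static char *map;
static sigjmp_buf fault_env;			/* single fault thread */

static void sigbus(int sig)
{
	siglongjmp(fault_env, 1);		/* fault lost the race */
}

static void *faulter(void *arg)
{
	for (;;) {
		if (!sigsetjmp(fault_env, 1))
			map[SZ - 1] = 1;	/* fault the last page back in */
	}
	return NULL;
}

int main(void)
{
	struct sigaction sa = { .sa_handler = sigbus };
	pthread_t t;
	int fd;

	sigaction(SIGBUS, &sa, NULL);
	fd = open("/mnt/huge/testfile", O_CREAT | O_RDWR, 0600);
	if (fd < 0 || ftruncate(fd, SZ))
		return 1;
	map = mmap(NULL, SZ, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return 1;
	if (pthread_create(&t, NULL, faulter, NULL))
		return 1;

	for (;;) {
		ftruncate(fd, 0);		/* unmap and remove all pages */
		ftruncate(fd, SZ);		/* extend so the next fault can win */
	}
	return 0;
}
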
-- 
Mike Kravetz


* Re: [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch
  2015-11-10 22:41           ` Mike Kravetz
@ 2015-11-14  0:36             ` Hugh Dickins
  0 siblings, 0 replies; 8+ messages in thread
From: Hugh Dickins @ 2015-11-14  0:36 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Hugh Dickins, linux-mm, linux-kernel, Andrew Morton, Dave Hansen,
	Naoya Horiguchi, Davidlohr Bueso

On Tue, 10 Nov 2015, Mike Kravetz wrote:
> On 11/09/2015 02:55 PM, Mike Kravetz wrote:
> > On 11/08/2015 11:42 PM, Hugh Dickins wrote:
> >> On Fri, 30 Oct 2015, Mike Kravetz wrote:
> >>>
> >>> The 'next = start' code is actually from the original truncate_hugepages
> >>> routine.  This functionality was combined with that needed for hole punch
> >>> to create remove_inode_hugepages().
> >>>
> >>> The following code was in truncate_hugepages:
> >>>
> >>> 	next = start;
> >>> 	while (1) {
> >>> 		if (!pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
> >>> 			if (next == start)
> >>> 				break;
> >>> 			next = start;
> >>> 			continue;
> >>> 		}
> >>>
> >>>
> >>> So, in the truncate case pages starting at 'start' are deleted until
> >>> pagevec_lookup fails.  Then, we call pagevec_lookup() again.  If no
> >>> pages are found we are done.  Else, we repeat the whole process.
> >>>
> >>> Does anyone recall the reason for going back and looking for pages at
> >>> index'es already deleted?  Git doesn't help as that was part of initial
> >>> commit.  My thought is that truncate can race with page faults.  The
> >>> truncate code sets inode offset before unmapping and deleting pages.
> >>> So, faults after the new offset is set should fail.  But, I suppose a
> >>> fault could race with setting offset and deleting of pages.  Does this
> >>> sound right?  Or, is there some other reason I am missing?
> >>
> >> I believe your thinking is correct.  But remember that
> >> truncate_inode_pages_range() is shared by almost all filesystems,
> >> and different filesystems have different internal locking conventions,
> >> and different propensities to such a race: it's trying to cover for
> >> all of them.
> >>
> >> Typically, writing is well serialized (by i_mutex) against truncation,
> >> but faulting (like reading) sails through without enough of a lock.
> >> We resort to i_size checks to avoid the worst of it, but there's often
> >> a corner or two in which those checks are not quite good enough -
> >> it's easy to check i_size at the beginning, but it needs to be checked
> >> again at the end too, and what's been done undone - can be awkward.
> > 
> > Well, it looks like the hugetlb_no_page() routine is checking i_size both
> > before and after.  It appears to be doing the right thing to handle the
> > race, but I need to stare at the code some more to make sure.
> > 
> > Because of the way the truncate code went back and did an extra lookup
> > when done with the range, I assumed it was covering some race.  However,
> > that may not be the case.
> > 
> >>
> >> I hope that in the case of hugetlbfs, since you already have the
> >> additional fault_mutex to handle races between faults and punching,
> >> it should be possible to get away without that "pincer" restarting.
> > 
> > Yes, it looks like this may work as a straight loop over the range of
> > pages.  I just need to study the code some more to make sure I am not
> > missing something.
> 
> I have convinced myself that hugetlb_no_page is coded such that page
> faults can not race with truncate.  hugetlb_no_page handles the case
> where there is no PTE for a faulted in address.  The general flow in
> hugetlb_no_page for the no page found case is:
> - check index against i_size, end if beyond
> - allocate huge page
> - take page table lock for huge page
> - check index against i_size again,  if beyond free page and return
> - add huge page to page table
> - unlock page table lock for huge page
> 
> The flow for the truncate operation in hugetlb_vmtruncate is:
> - set i_size
> - take inode/mapping write lock
> - hugetlb_vmdelete_list() which removes page table entries.  The page
>   table lock will be taken for each huge page in the range
> - release inode/mapping write lock
> - remove_inode_hugepages() to actually remove pages
> 
> The truncate/page fault race we are concerned with is if a page is faulted
> in after hugetlb_vmtruncate sets i_size and unmaps the page, but before
> actually removing the page.  Obviously, any entry into hugetlb_no_page
> after i_size is set will check the value and not allow the fault.  In
> addition, if the value of i_size is set before the second check in
> hugetlb_no_page, it will do the right thing.  Therefore, the only place to
> race is after the second i_size check in hugetlb_no_page.
> 
> Note that the second check for i_size is with the page table lock for
> the huge page held.  It is not possible for hugetlb_vmtruncate to unmap
> the huge page before the page fault completes, as it must acquire the page
> table lock.  This is the same as a fault happening before the truncate
> operation starts and is handled correctly by hugetlb_vmtruncate.
> 
> Another way to look at this is by asking the question, Is it possible to
> fault on a page in the truncate range after it is unmapped by
> hugetlb_vmtruncate/hugetlb_vmdelete_list?  To unmap a page,
> hugetlb_vmtruncate will:
> - set i_size
> - take page table lock for huge page
> - unmap page
> - release page table lock for page
> 
> In order to fault in the page, it must take the same page table lock and
> check i_size.  I do not know of any way for the faulting code to get an
> old value for i_size.
> 
> Please let me know if my reasoning is incorrect.  I will code up a new
> (simpler) version of remove_inode_hugepages with the assumption that
> truncate can not race with page faults.
> 
> Also, I wrote a fairly simple test to have truncate race with page faults.
> It was quite easy to hit the second check in hugetlb_no_page where it
> notices index is beyond i_size and backs out of the fault.  Even after
> adding delays in strategic locations of the fault and truncate code, I
> could not cause a race as observed by remove_inode_hugepages.

Thank you for working it out and writing it down, Mike: I agree with you.

Easy for someone like me to come along and "optimize" something
("ooh, looks like no pte there so let's not take the page table lock"),
but in fact (perhaps) break it.  But that's a criticism of me, not the
code: we couldn't write anything if that were an argument against it!

Ack to your v3 patch coming up now.

Hugh


end of thread, other threads:[~2015-11-14  0:36 UTC | newest]

Thread overview: 8+ messages
2015-10-29 22:33 [PATCH] mm/hugetlb: Unmap pages if page fault raced with hole punch Mike Kravetz
2015-10-30  3:32 ` Hugh Dickins
2015-10-30 16:45   ` Mike Kravetz
2015-10-30 20:56     ` Mike Kravetz
2015-11-09  7:42       ` Hugh Dickins
2015-11-09 22:55         ` Mike Kravetz
2015-11-10 22:41           ` Mike Kravetz
2015-11-14  0:36             ` Hugh Dickins
