All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Joao Martins <joao.m.martins@oracle.com>, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 2/2] mm/hugetlb: refactor subpage recording
Date: Tue, 26 Jan 2021 10:08:25 -0800	[thread overview]
Message-ID: <3d34111f-8365-ab95-af11-aaf4825204be@oracle.com> (raw)
In-Reply-To: <20210125205744.10203-3-joao.m.martins@oracle.com>

On 1/25/21 12:57 PM, Joao Martins wrote:
> For a given hugepage backing a VA, there's a rather ineficient
> loop which is solely responsible for storing subpages in the passed
> pages/vmas array. For each subpage we check whether it's within
> range or size of @pages and keep incrementing @pfn_offset and a couple
> other variables per subpage iteration.
> 
> Simplify this logic and minimize ops per iteration to just
> store the output page/vma. Instead of incrementing number of @refs
> iteratively, we do it through a precalculation of @refs and having
> only a tight loop for storing pinned subpages/vmas.
> 
> pinning consequently improves considerably, bringing us close to
> {pin,get}_user_pages_fast:
> 
>  - 16G with 1G huge page size
>  gup_test -f /mnt/huge/file -m 16384 -r 10 -L -S -n 512 -w
> 
>  PIN_LONGTERM_BENCHMARK: ~11k us -> ~4400 us
>  PIN_FAST_BENCHMARK: ~3700 us
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> ---
>  mm/hugetlb.c | 49 ++++++++++++++++++++++++++++---------------------
>  1 file changed, 28 insertions(+), 21 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 016addc8e413..1f7a95bc7c87 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4789,6 +4789,20 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>  	goto out;
>  }
>  
> +static void record_subpages_vmas(struct page *page, struct vm_area_struct *vma,
> +				 int refs, struct page **pages,
> +				 struct vm_area_struct **vmas)
> +{
> +	int nr;
> +
> +	for (nr = 0; nr < refs; nr++) {
> +		if (likely(pages))
> +			pages[nr] = page++;
> +		if (vmas)
> +			vmas[nr] = vma;
> +	}
> +}
> +
>  long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  			 struct page **pages, struct vm_area_struct **vmas,
>  			 unsigned long *position, unsigned long *nr_pages,
> @@ -4918,28 +4932,16 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  			continue;
>  		}
>  
> -		refs = 0;
> +		refs = min3(pages_per_huge_page(h) - pfn_offset,
> +			    (vma->vm_end - vaddr) >> PAGE_SHIFT, remainder);
>  
> -same_page:
> -		if (pages)
> -			pages[i] = mem_map_offset(page, pfn_offset);
> +		if (pages || vmas)
> +			record_subpages_vmas(mem_map_offset(page, pfn_offset),

The assumption made here is that mem_map is contiguous for the range of
pages in the hugetlb page.  I do not believe you can make this assumption
for (gigantic) hugetlb pages which are > MAX_ORDER_NR_PAGES.  For example,

/*
 * Gigantic pages are so large that we do not guarantee that page++ pointer
 * arithmetic will work across the entire page.  We need something more
 * specialized.
 */
static void __copy_gigantic_page(struct page *dst, struct page *src,
                                int nr_pages)


-- 
Mike Kravetz

> +					     vma, refs,
> +					     likely(pages) ? pages + i : NULL,
> +					     vmas ? vmas + i : NULL);
>  
> -		if (vmas)
> -			vmas[i] = vma;
> -
> -		vaddr += PAGE_SIZE;
> -		++pfn_offset;
> -		--remainder;
> -		++i;
> -		refs++;
> -		if (vaddr < vma->vm_end && remainder &&
> -				pfn_offset < pages_per_huge_page(h)) {
> -			/*
> -			 * We use pfn_offset to avoid touching the pageframes
> -			 * of this compound page.
> -			 */
> -			goto same_page;
> -		} else if (pages) {
> +		if (pages) {
>  			/*
>  			 * try_grab_compound_head() should always succeed here,
>  			 * because: a) we hold the ptl lock, and b) we've just
> @@ -4950,7 +4952,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  			 * any way. So this page must be available at this
>  			 * point, unless the page refcount overflowed:
>  			 */
> -			if (WARN_ON_ONCE(!try_grab_compound_head(pages[i-1],
> +			if (WARN_ON_ONCE(!try_grab_compound_head(pages[i],
>  								 refs,
>  								 flags))) {
>  				spin_unlock(ptl);
> @@ -4959,6 +4961,11 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  				break;
>  			}
>  		}
> +
> +		vaddr += (refs << PAGE_SHIFT);
> +		remainder -= refs;
> +		i += refs;
> +
>  		spin_unlock(ptl);
>  	}
>  	*nr_pages = remainder;
> 

  reply	other threads:[~2021-01-26 18:19 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25 20:57 [PATCH 0/2] mm/hugetlb: follow_hugetlb_page() improvements Joao Martins
2021-01-25 20:57 ` [PATCH 1/2] mm/hugetlb: grab head page refcount once per group of subpages Joao Martins
2021-01-26  2:14   ` Mike Kravetz
2021-01-25 20:57 ` [PATCH 2/2] mm/hugetlb: refactor subpage recording Joao Martins
2021-01-26 18:08   ` Mike Kravetz [this message]
2021-01-26 19:21     ` Joao Martins
2021-01-26 19:24       ` Joao Martins
2021-01-26 21:21       ` Mike Kravetz
2021-01-26 23:20         ` Joao Martins
2021-01-27  0:07         ` Jason Gunthorpe
2021-01-27  1:58           ` Mike Kravetz
2021-01-27  2:10             ` Jason Gunthorpe
2021-02-13 21:12               ` Mike Kravetz
2021-01-27  2:24           ` Matthew Wilcox
2021-01-27  2:50             ` Zi Yan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3d34111f-8365-ab95-af11-aaf4825204be@oracle.com \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=joao.m.martins@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.