From: Andy Whitcroft <apw@canonical.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Hugh Dickins <hugh@veritas.com>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	Greg KH <gregkh@suse.de>,
	Maksim Yevmenkin <maksim.yevmenkin@gmail.com>,
	Nick Piggin <npiggin@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	will@crowder-design.com, Rik van Riel <riel@redhat.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	wli@movementarian.org
Subject: Re: [PATCH] Do not account for the address space used by hugetlbfs using VM_ACCOUNT V2 (Was Linus 2.6.29-rc4)
Date: Wed, 11 Feb 2009 16:43:59 +0000
Message-ID: <20090211164359.GI25898@shadowen.org>
In-Reply-To: <20090211163416.GA2733@csn.ul.ie>

Acked-by: Andy Whitcroft <apw@canonical.com>

> Subject: [PATCH] Do not account for hugetlbfs quota at mmap() time if mapping [SHM|MAP]_NORESERVE
> 
> Commit 5a6fe125950676015f5108fb71b2a67441755003 brought hugetlbfs more in line
> with the core VM by obeying VM_NORESERVE and not reserving hugepages for both
> shared and private mappings when [SHM|MAP]_NORESERVE are specified. However,
> it is still taking filesystem quota unconditionally.
> 
> At fault time, if there are no reserves, an attempt is made to allocate
> the page and account for filesystem quota. If either fails, the fault
> fails. The impact is that quota is getting accounted for twice. This patch
> partially reverts 5a6fe125950676015f5108fb71b2a67441755003.  To help
> prevent this mistake happening again, it improves the documentation of
> hugetlb_reserve_pages().
> 
> Reported-by: Andy Whitcroft <apw@canonical.com>
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 2074642..107da3d 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2272,10 +2272,18 @@ int hugetlb_reserve_pages(struct inode *inode,
>  					struct vm_area_struct *vma,
>  					int acctflag)
>  {
> -	long ret = 0, chg;
> +	long ret, chg;
>  	struct hstate *h = hstate_inode(inode);
>  
>  	/*
> +	 * Only apply hugepage reservation if asked. At fault time, an
> +	 * attempt will be made for VM_NORESERVE to allocate a page
> +	 * and filesystem quota without using reserves
> +	 */
> +	if (acctflag & VM_NORESERVE)
> +		return 0;
> +
> +	/*
>  	 * Shared mappings base their reservation on the number of pages that
>  	 * are already allocated on behalf of the file. Private mappings need
>  	 * to reserve the full area even if read-only as mprotect() may be
> @@ -2283,42 +2291,47 @@ int hugetlb_reserve_pages(struct inode *inode,
>  	 */
>  	if (!vma || vma->vm_flags & VM_SHARED)
>  		chg = region_chg(&inode->i_mapping->private_list, from, to);
> -	else
> +	else {
> +		struct resv_map *resv_map = resv_map_alloc();
> +		if (!resv_map)
> +			return -ENOMEM;
> +
>  		chg = to - from;
>  
> +		set_vma_resv_map(vma, resv_map);
> +		set_vma_resv_flags(vma, HPAGE_RESV_OWNER);
> +	}
> +
>  	if (chg < 0)
>  		return chg;
>  
> +	/* There must be enough filesystem quota for the mapping */
>  	if (hugetlb_get_quota(inode->i_mapping, chg))
>  		return -ENOSPC;
>  
>  	/*
> -	 * Only apply hugepage reservation if asked. We still have to
> -	 * take the filesystem quota because it is an upper limit
> -	 * defined for the mount and not necessarily memory as a whole
> +	 * Check enough hugepages are available for the reservation.
> +	 * Hand back the quota if there are not
>  	 */
> -	if (acctflag & VM_NORESERVE) {
> -		reset_vma_resv_huge_pages(vma);
> -		return 0;
> -	}
> -
>  	ret = hugetlb_acct_memory(h, chg);
>  	if (ret < 0) {
>  		hugetlb_put_quota(inode->i_mapping, chg);
>  		return ret;
>  	}
> +
> +	/*
> +	 * Account for the reservations made. Shared mappings record regions
> +	 * that have reservations as they are shared by multiple VMAs.
> +	 * When the last VMA disappears, the region map says how much
> +	 * the reservation was and the page cache tells how much of
> +	 * the reservation was consumed. Private mappings are per-VMA and
> +	 * only the consumed reservations are tracked. When the VMA
> +	 * disappears, the original reservation is the VMA size and the
> +	 * consumed reservations are stored in the map. Hence, nothing
> +	 * else has to be done for private mappings here
> +	 */
>  	if (!vma || vma->vm_flags & VM_SHARED)
>  		region_add(&inode->i_mapping->private_list, from, to);
> -	else {
> -		struct resv_map *resv_map = resv_map_alloc();
> -
> -		if (!resv_map)
> -			return -ENOMEM;
> -
> -		set_vma_resv_map(vma, resv_map);
> -		set_vma_resv_flags(vma, HPAGE_RESV_OWNER);
> -	}
> -
>  	return 0;
>  }

-apw
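
For readers following the thread, the double accounting described in the
changelog can be illustrated with a small userspace model. This is only a
sketch under stated assumptions: every name in it (toy_mount, toy_get_quota,
toy_reserve, toy_fault_noreserve, toy_quota_used) is hypothetical and none
of it is kernel code. It assumes one unit of quota per huge page, and that a
NORESERVE mapping takes one page of quota on each fault, as the changelog
describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_mount {
	long quota_free;	/* huge pages still allowed by the mount quota */
};

/* Take 'pages' from the mount quota. Returns 0 on success, -1 if short. */
static int toy_get_quota(struct toy_mount *m, long pages)
{
	if (m->quota_free < pages)
		return -1;
	m->quota_free -= pages;
	return 0;
}

/*
 * mmap()-time step. With early_return set, a NORESERVE mapping returns
 * before any quota is taken (the behaviour the patch introduces). Without
 * it, quota is taken unconditionally (the behaviour being fixed).
 */
static int toy_reserve(struct toy_mount *m, long pages,
		       bool noreserve, bool early_return)
{
	if (noreserve && early_return)
		return 0;
	if (toy_get_quota(m, pages))
		return -1;
	return 0;	/* hugepage pool reservation is not modelled */
}

/* Fault-time step for a mapping without reserves: one page of quota. */
static int toy_fault_noreserve(struct toy_mount *m)
{
	return toy_get_quota(m, 1);
}

/*
 * Map 'pages' with NORESERVE, fault all of them in, and return how much
 * quota was consumed, or -1 if the mapping or any fault failed.
 */
static long toy_quota_used(long quota, long pages, bool early_return)
{
	struct toy_mount m = { .quota_free = quota };
	long i;

	if (toy_reserve(&m, pages, true, early_return))
		return -1;
	for (i = 0; i < pages; i++)
		if (toy_fault_noreserve(&m))
			return -1;
	return quota - m.quota_free;
}
```

In this model, calling toy_quota_used(16, 4, false) charges quota at both
mmap() and fault time, while toy_quota_used(16, 4, true) charges it only at
fault time. With a quota exactly the size of the mapping, the unpatched
variant fails on the first fault.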

