From: Mel Gorman <mel@csn.ul.ie>
To: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	Nishanth Aravamudan <nacc@us.ibm.com>,
	Adam Litke <agl@us.ibm.com>, Andy Whitcroft <apw@canonical.com>,
	eric.whitney@hp.com
Subject: Re: [PATCH 1/5] Free huge pages round robin to balance across nodes
Date: Wed, 17 Jun 2009 14:18:33 +0100	[thread overview]
Message-ID: <20090617131833.GG28529@csn.ul.ie> (raw)
In-Reply-To: <20090616135236.25248.93692.sendpatchset@lts-notebook>

On Tue, Jun 16, 2009 at 09:52:36AM -0400, Lee Schermerhorn wrote:
> [PATCH 1/5] Free huge pages round robin to balance across nodes
> 
> Against:  17may09 mmotm
> 
> Currently, although increasing nr_hugepages will [attempt to]
> distribute the new huge pages across all nodes in the system,
> reducing nr_hugepages will free or surplus all free pages
> from nodes in node id order.  This patch frees huge pages
> from nodes in round-robin fashion in an attempt to keep
> [persistent] hugepage allocations balanced across the nodes.
> 
> The new function free_pool_huge_page() is modeled on, and
> performs roughly the inverse of, alloc_fresh_huge_page().
> It replaces dequeue_huge_page(), which now has no callers
> and can be removed.
> 
> Helper function hstate_next_to_free_node() uses the new hstate
> member next_to_free_nid to distribute "frees" across all
> nodes with huge pages.
> 
> I placed this patch first in the series because I think it
> [or something similar] should be applied independently of the
> rest of the series.
> 

Agreed. Reading through, I can't see any problems with the patch, and it
does make the freeing symmetric with the allocation.
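
As an aside for anyone reading along, the next_node()/first_node() wrap
matters when the online node map is sparse. Here is a minimal standalone
sketch of the same round-robin-with-wrap pattern, using a plain array in
place of the nodemask helpers (all names below are made up purely for
illustration, nothing here is from the patch itself):

#include <stdio.h>

/*
 * Stand-in for the next_node()/first_node() round robin over a sparse
 * set of online nodes, e.g. {0, 2, 3}: step to the node after the
 * current one and wrap back to the first online node off the end.
 */
static int next_rr_node(int cur, const int *online, int nr_online)
{
	int i;

	for (i = 0; i < nr_online; i++)
		if (online[i] == cur)
			return online[(i + 1) % nr_online];
	return online[0];	/* current node not online: restart */
}

int main(void)
{
	const int online[] = { 0, 2, 3 };
	int nid = 0, i;

	/* successive calls starting from node 0 print: 2 3 0 2 3 0 */
	for (i = 0; i < 6; i++) {
		nid = next_rr_node(nid, online, 3);
		printf("%d ", nid);
	}
	printf("\n");
	return 0;
}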

For clarity though, would it be worth renaming hugetlb_next_nid to
next_to_alloc_nid so that there is a clear relationship between the
round-robin allocation and freeing of pages among online nodes?
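
Purely to illustrate the naming I have in mind, the struct and an
allocation-side counterpart helper might then read something like the
following (the field and helper names are only the suggestion, not what
the patch currently uses, and the struct is trimmed to the relevant
fields):

/* Defines one hugetlb page size (other fields omitted) */
struct hstate {
	int next_to_alloc_nid;		/* was hugetlb_next_nid */
	int next_to_free_nid;
	unsigned int order;
	unsigned long mask;
	unsigned long max_huge_pages;
};

/* allocation-side twin of hstate_next_to_free_node() */
static int hstate_next_to_alloc_node(struct hstate *h)
{
	int next_nid;

	next_nid = next_node(h->next_to_alloc_nid, node_online_map);
	if (next_nid == MAX_NUMNODES)
		next_nid = first_node(node_online_map);
	h->next_to_alloc_nid = next_nid;
	return next_nid;
}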

> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
> 
>  include/linux/hugetlb.h |    1 
>  mm/hugetlb.c            |   68 +++++++++++++++++++++++++++++++++---------------
>  2 files changed, 48 insertions(+), 21 deletions(-)
> 
> Index: linux-2.6.30-rc8-mmotm-090603-1633/include/linux/hugetlb.h
> ===================================================================
> --- linux-2.6.30-rc8-mmotm-090603-1633.orig/include/linux/hugetlb.h	2009-06-04 12:59:29.000000000 -0400
> +++ linux-2.6.30-rc8-mmotm-090603-1633/include/linux/hugetlb.h	2009-06-04 12:59:31.000000000 -0400
> @@ -184,6 +184,7 @@ unsigned long hugetlb_get_unmapped_area(
>  /* Defines one hugetlb page size */
>  struct hstate {
>  	int hugetlb_next_nid;
> +	int next_to_free_nid;
>  	unsigned int order;
>  	unsigned long mask;
>  	unsigned long max_huge_pages;
> Index: linux-2.6.30-rc8-mmotm-090603-1633/mm/hugetlb.c
> ===================================================================
> --- linux-2.6.30-rc8-mmotm-090603-1633.orig/mm/hugetlb.c	2009-06-04 12:59:29.000000000 -0400
> +++ linux-2.6.30-rc8-mmotm-090603-1633/mm/hugetlb.c	2009-06-04 12:59:31.000000000 -0400
> @@ -455,24 +455,6 @@ static void enqueue_huge_page(struct hst
>  	h->free_huge_pages_node[nid]++;
>  }
>  
> -static struct page *dequeue_huge_page(struct hstate *h)
> -{
> -	int nid;
> -	struct page *page = NULL;
> -
> -	for (nid = 0; nid < MAX_NUMNODES; ++nid) {
> -		if (!list_empty(&h->hugepage_freelists[nid])) {
> -			page = list_entry(h->hugepage_freelists[nid].next,
> -					  struct page, lru);
> -			list_del(&page->lru);
> -			h->free_huge_pages--;
> -			h->free_huge_pages_node[nid]--;
> -			break;
> -		}
> -	}
> -	return page;
> -}
> -
>  static struct page *dequeue_huge_page_vma(struct hstate *h,
>  				struct vm_area_struct *vma,
>  				unsigned long address, int avoid_reserve)
> @@ -683,6 +665,52 @@ static int alloc_fresh_huge_page(struct 
>  	return ret;
>  }
>  
> +/*
> + * helper for free_pool_huge_page() - find next node
> + * from which to free a huge page
> + */
> +static int hstate_next_to_free_node(struct hstate *h)
> +{
> +	int next_nid;
> +	next_nid = next_node(h->next_to_free_nid, node_online_map);
> +	if (next_nid == MAX_NUMNODES)
> +		next_nid = first_node(node_online_map);
> +	h->next_to_free_nid = next_nid;
> +	return next_nid;
> +}
> +
> +/*
> + * Free huge page from pool from next node to free.
> + * Attempt to keep persistent huge pages more or less
> + * balanced over allowed nodes.
> + * Called with hugetlb_lock locked.
> + */
> +static int free_pool_huge_page(struct hstate *h)
> +{
> +	int start_nid;
> +	int nid;
> +	int ret = 0;
> +
> +	start_nid = h->next_to_free_nid;
> +	nid = h->next_to_free_nid;
> +
> +	do {
> +		if (!list_empty(&h->hugepage_freelists[nid])) {
> +			struct page *page =
> +				list_entry(h->hugepage_freelists[nid].next,
> +					  struct page, lru);
> +			list_del(&page->lru);
> +			h->free_huge_pages--;
> +			h->free_huge_pages_node[nid]--;
> +			update_and_free_page(h, page);
> +			ret = 1;
> +		}
> +		nid = hstate_next_to_free_node(h);
> +	} while (!ret && nid != start_nid);
> +
> +	return ret;
> +}
> +
>  static struct page *alloc_buddy_huge_page(struct hstate *h,
>  			struct vm_area_struct *vma, unsigned long address)
>  {
> @@ -1226,10 +1254,8 @@ static unsigned long set_max_huge_pages(
>  	min_count = max(count, min_count);
>  	try_to_free_low(h, min_count);
>  	while (min_count < persistent_huge_pages(h)) {
> -		struct page *page = dequeue_huge_page(h);
> -		if (!page)
> +		if (!free_pool_huge_page(h))
>  			break;
> -		update_and_free_page(h, page);
>  	}
>  	while (count < persistent_huge_pages(h)) {
>  		if (!adjust_pool_surplus(h, 1))
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
