From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Hillf Danton <hillf.zj@alibaba-inc.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Rui Teng <rui.teng@linux.vnet.ibm.com>
Subject: Re: [PATCH v3] mm/hugetlb: fix memory offline with hugepage size > memory block size
Date: Fri, 23 Sep 2016 12:36:22 +0200
Message-ID: <20160923123622.00289d21@thinkpad>
In-Reply-To: <57E41EF6.1010903@linux.intel.com>

On Thu, 22 Sep 2016 11:12:06 -0700
Dave Hansen <dave.hansen@linux.intel.com> wrote:

> On 09/22/2016 09:29 AM, Gerald Schaefer wrote:
> >  static void dissolve_free_huge_page(struct page *page)
> >  {
> > +	struct page *head = compound_head(page);
> > +	struct hstate *h = page_hstate(head);
> > +	int nid = page_to_nid(head);
> > +
> >  	spin_lock(&hugetlb_lock);
> > -	if (PageHuge(page) && !page_count(page)) {
> > -		struct hstate *h = page_hstate(page);
> > -		int nid = page_to_nid(page);
> > -		list_del(&page->lru);
> > -		h->free_huge_pages--;
> > -		h->free_huge_pages_node[nid]--;
> > -		h->max_huge_pages--;
> > -		update_and_free_page(h, page);
> > -	}
> > +	list_del(&head->lru);
> > +	h->free_huge_pages--;
> > +	h->free_huge_pages_node[nid]--;
> > +	h->max_huge_pages--;
> > +	update_and_free_page(h, head);
> >  	spin_unlock(&hugetlb_lock);
> >  }
> 
> Do you need to revalidate anything once you acquire the lock?  Can this,
> for instance, race with another thread doing vm.nr_hugepages=0?  Or a
> thread faulting in and allocating the large page that's being dissolved?
> 

Yes, good point. I was relying on the range being isolated, but that only
seems to be checked in dequeue_huge_page_node(), as introduced with the
original commit. So this would only protect against anyone allocating the
hugepage at this point. This is also somewhat expected, since we already
are beyond the "point of no return" in offline_pages().

vm.nr_hugepages=0 seems to be an issue though, as set_max_huge_pages()
does not care about isolation, so I guess we could have a race here
and double-free the hugepage. Revalidating at least PageHuge() after
taking the lock should protect against that. I'm not sure about
page_count(), but I think I'll just check both, which gives the same
behaviour as before.
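
As a rough illustration of that revalidation, here is a userspace model
of the pattern. It is a sketch only, not the kernel code and not the v4
patch: model_page, model_hstate and model_dissolve_free_huge_page() are
hypothetical stand-ins, with a pthread mutex playing the role of
hugetlb_lock, and two plain fields standing in for PageHuge() and
page_count().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page. */
struct model_page {
	bool is_huge;	/* models PageHuge() */
	int refcount;	/* models page_count() */
};

/* Hypothetical userspace stand-in for struct hstate. */
struct model_hstate {
	pthread_mutex_t lock;	/* models hugetlb_lock */
	unsigned long free_huge_pages;
	unsigned long max_huge_pages;
};

/*
 * Dissolve a free huge page, rechecking its state under the lock.
 * Returns true only for the caller that actually dissolved it, so a
 * second caller (for example a racing pool shrink) becomes a no-op
 * instead of a double free.
 */
bool model_dissolve_free_huge_page(struct model_hstate *h,
				   struct model_page *page)
{
	bool dissolved = false;

	pthread_mutex_lock(&h->lock);
	/* Revalidate: someone else may have freed the page already. */
	if (page->is_huge && page->refcount == 0) {
		assert(h->free_huge_pages > 0 && h->max_huge_pages > 0);
		h->free_huge_pages--;
		h->max_huge_pages--;
		page->is_huge = false;	/* models update_and_free_page() */
		dissolved = true;
	}
	pthread_mutex_unlock(&h->lock);

	return dissolved;
}
```

Calling the function twice on the same page only adjusts the counters
once. Without the check inside the locked region, the second call would
decrement them again and free the page a second time.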

Will send v4, after thinking a bit more on the page reservation point
brought up by Mike.
