All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
To: Rui Teng <rui.teng@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Hillf Danton <hillf.zj@alibaba-inc.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>
Subject: Re: [PATCH v2 1/1] mm/hugetlb: fix memory offline with hugepage size > memory block size
Date: Fri, 23 Sep 2016 13:03:48 +0200	[thread overview]
Message-ID: <20160923130348.14c4b2b5@thinkpad> (raw)
In-Reply-To: <4ef25b67-13bc-57bd-f322-04310e6d6a00@linux.vnet.ibm.com>

On Fri, 23 Sep 2016 14:40:33 +0800
Rui Teng <rui.teng@linux.vnet.ibm.com> wrote:

> On 9/22/16 5:51 PM, Michal Hocko wrote:
> > On Wed 21-09-16 14:35:34, Gerald Schaefer wrote:
> >> dissolve_free_huge_pages() will either run into the VM_BUG_ON() or a
> >> list corruption and addressing exception when trying to set a memory
> >> block offline that is part (but not the first part) of a hugetlb page
> >> with a size > memory block size.
> >>
> >> When no other smaller hugetlb page sizes are present, the VM_BUG_ON()
> >> will trigger directly. In the other case we will run into an addressing
> >> exception later, because dissolve_free_huge_page() will not work on the
> >> head page of the compound hugetlb page which will result in a NULL
> >> hstate from page_hstate().
> >>
> >> To fix this, first remove the VM_BUG_ON() because it is wrong, and then
> >> use the compound head page in dissolve_free_huge_page().
> >
> > OK so dissolve_free_huge_page will work also on tail pages now which
> > makes some sense. I would appreciate also few words why do we want to
> > sacrifice something as precious as gigantic page rather than fail the
> > page block offline. Dave pointed out dim offline usecase for example.
> >
> >> Also change locking in dissolve_free_huge_page(), so that it only takes
> >> the lock when actually removing a hugepage.
> >
> > From a quick look it seems this has been broken since introduced by
> > c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle
> > hugepage"). Do we want to have this backported to stable? In any way
> > Fixes: SHA1 would be really nice.
> >
> 
> If the huge page hot-plug function was introduced by c8721bbbdd36, and
> it has already indicated that the gigantic page is not supported:
> 
> 	"As for larger hugepages (1GB for x86_64), it's not easy to do
> 	hotremove over them because it's larger than memory block.  So
> 	we now simply leave it to fail as it is."
> 
> Is it possible that the gigantic page hot-plugin has never been
> supported?

Offlining blocks with gigantic pages only fails when they are in-use,
I guess that was meant by the description. Maybe it was also meant to
fail in any case, but that was not was the patch did.

With free gigantic pages, it looks like it only ever worked when
offlining the first block of a gigantic page. And as long as you only
have gigantic pages, the VM_BUG_ON() would actually have triggered on
every block that is not gigantic-page-aligned, even if the block is not
part of any gigantic page at all.

Given the age of the patch it is a little bit surprising that it never
struck anyone, and that we now have found it on two architectures at
once :-)

> 
> I made another patch for this problem, and also tried to apply the
> first version of this patch on my system too. But they only postpone
> the error happened. The HugePages_Free will be changed from 2 to 1, if I 
> offline a huge page. I think it does not have a correct roll back.
> 
> # cat /proc/meminfo | grep -i huge
> AnonHugePages:         0 kB
> HugePages_Total:       2
> HugePages_Free:        1
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:   16777216 kB

HugePages_Free is supposed to be reduced when offlining a block, but
then HugePages_Total should also be reduced, so that is strange. On my
system both were reduced. Does this happen with any version of my patch?

What do you mean with postpone the error? Can you reproduce the BUG_ON
or the addressing exception with my patch?

> 
> I will make more test on it, but can any one confirm that this function 
> has been implemented and tested before?

WARNING: multiple messages have this Message-ID (diff)
From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
To: Rui Teng <rui.teng@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Hillf Danton <hillf.zj@alibaba-inc.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>
Subject: Re: [PATCH v2 1/1] mm/hugetlb: fix memory offline with hugepage size > memory block size
Date: Fri, 23 Sep 2016 13:03:48 +0200	[thread overview]
Message-ID: <20160923130348.14c4b2b5@thinkpad> (raw)
In-Reply-To: <4ef25b67-13bc-57bd-f322-04310e6d6a00@linux.vnet.ibm.com>

On Fri, 23 Sep 2016 14:40:33 +0800
Rui Teng <rui.teng@linux.vnet.ibm.com> wrote:

> On 9/22/16 5:51 PM, Michal Hocko wrote:
> > On Wed 21-09-16 14:35:34, Gerald Schaefer wrote:
> >> dissolve_free_huge_pages() will either run into the VM_BUG_ON() or a
> >> list corruption and addressing exception when trying to set a memory
> >> block offline that is part (but not the first part) of a hugetlb page
> >> with a size > memory block size.
> >>
> >> When no other smaller hugetlb page sizes are present, the VM_BUG_ON()
> >> will trigger directly. In the other case we will run into an addressing
> >> exception later, because dissolve_free_huge_page() will not work on the
> >> head page of the compound hugetlb page which will result in a NULL
> >> hstate from page_hstate().
> >>
> >> To fix this, first remove the VM_BUG_ON() because it is wrong, and then
> >> use the compound head page in dissolve_free_huge_page().
> >
> > OK so dissolve_free_huge_page will work also on tail pages now which
> > makes some sense. I would appreciate also few words why do we want to
> > sacrifice something as precious as gigantic page rather than fail the
> > page block offline. Dave pointed out dim offline usecase for example.
> >
> >> Also change locking in dissolve_free_huge_page(), so that it only takes
> >> the lock when actually removing a hugepage.
> >
> > From a quick look it seems this has been broken since introduced by
> > c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle
> > hugepage"). Do we want to have this backported to stable? In any way
> > Fixes: SHA1 would be really nice.
> >
> 
> If the huge page hot-plug function was introduced by c8721bbbdd36, and
> it has already indicated that the gigantic page is not supported:
> 
> 	"As for larger hugepages (1GB for x86_64), it's not easy to do
> 	hotremove over them because it's larger than memory block.  So
> 	we now simply leave it to fail as it is."
> 
> Is it possible that the gigantic page hot-plugin has never been
> supported?

Offlining blocks with gigantic pages only fails when they are in-use,
I guess that was meant by the description. Maybe it was also meant to
fail in any case, but that was not was the patch did.

With free gigantic pages, it looks like it only ever worked when
offlining the first block of a gigantic page. And as long as you only
have gigantic pages, the VM_BUG_ON() would actually have triggered on
every block that is not gigantic-page-aligned, even if the block is not
part of any gigantic page at all.

Given the age of the patch it is a little bit surprising that it never
struck anyone, and that we now have found it on two architectures at
once :-)

> 
> I made another patch for this problem, and also tried to apply the
> first version of this patch on my system too. But they only postpone
> the error happened. The HugePages_Free will be changed from 2 to 1, if I 
> offline a huge page. I think it does not have a correct roll back.
> 
> # cat /proc/meminfo | grep -i huge
> AnonHugePages:         0 kB
> HugePages_Total:       2
> HugePages_Free:        1
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:   16777216 kB

HugePages_Free is supposed to be reduced when offlining a block, but
then HugePages_Total should also be reduced, so that is strange. On my
system both were reduced. Does this happen with any version of my patch?

What do you mean with postpone the error? Can you reproduce the BUG_ON
or the addressing exception with my patch?

> 
> I will make more test on it, but can any one confirm that this function 
> has been implemented and tested before?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-09-23 11:04 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-20 15:53 [PATCH 0/1] memory offline issues with hugepage size > memory block size Gerald Schaefer
2016-09-20 15:53 ` Gerald Schaefer
2016-09-20 15:53 ` [PATCH 1/1] mm/hugetlb: fix memory offline " Gerald Schaefer
2016-09-20 15:53   ` Gerald Schaefer
2016-09-21  6:29   ` Hillf Danton
2016-09-21  6:29     ` Hillf Danton
2016-09-21 12:35     ` [PATCH v2 " Gerald Schaefer
2016-09-21 12:35       ` Gerald Schaefer
2016-09-21 13:17       ` Rui Teng
2016-09-21 13:17         ` Rui Teng
2016-09-21 15:13         ` Gerald Schaefer
2016-09-21 15:13           ` Gerald Schaefer
2016-09-22  7:58       ` Hillf Danton
2016-09-22  7:58         ` Hillf Danton
2016-09-22  9:51       ` Michal Hocko
2016-09-22  9:51         ` Michal Hocko
2016-09-22 13:45         ` Gerald Schaefer
2016-09-22 13:45           ` Gerald Schaefer
2016-09-22 16:29           ` [PATCH v3] " Gerald Schaefer
2016-09-22 16:29             ` Gerald Schaefer
2016-09-22 18:12             ` Dave Hansen
2016-09-22 18:12               ` Dave Hansen
2016-09-22 19:13               ` Mike Kravetz
2016-09-22 19:13                 ` Mike Kravetz
2016-09-23 10:36               ` Gerald Schaefer
2016-09-23 10:36                 ` Gerald Schaefer
2016-09-23  6:40         ` [PATCH v2 1/1] " Rui Teng
2016-09-23  6:40           ` Rui Teng
2016-09-23 11:03           ` Gerald Schaefer [this message]
2016-09-23 11:03             ` Gerald Schaefer
2016-09-26  2:49             ` Rui Teng
2016-09-26  2:49               ` Rui Teng
2016-09-20 17:37 ` [PATCH 0/1] memory offline issues " Mike Kravetz
2016-09-20 17:37   ` Mike Kravetz
2016-09-20 17:45   ` Dave Hansen
2016-09-20 17:45     ` Dave Hansen
2016-09-21  9:49     ` Vlastimil Babka
2016-09-21  9:49       ` Vlastimil Babka
2016-09-21 10:34     ` Gerald Schaefer
2016-09-21 10:34       ` Gerald Schaefer
2016-09-21 10:30   ` Gerald Schaefer
2016-09-21 10:30     ` Gerald Schaefer
2016-09-21 18:20   ` Michal Hocko
2016-09-21 18:20     ` Michal Hocko
2016-09-21 18:27     ` Dave Hansen
2016-09-21 18:27       ` Dave Hansen
2016-09-21 19:22       ` Michal Hocko
2016-09-21 19:22         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160923130348.14c4b2b5@thinkpad \
    --to=gerald.schaefer@de.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=hillf.zj@alibaba-inc.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=rui.teng@linux.vnet.ibm.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.