* [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system
From: Jan Stancek @ 2016-10-13 12:19 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: mike.kravetz, hillf.zj, dave.hansen, kirill.shutemov, mhocko,
	n-horiguchi, aneesh.kumar, iamjoonsoo.kim

Hi,

I'm running into ENOMEM failures with the libhugetlbfs testsuite [1] on
a POWER8 LPAR system running 4.8 or the latest git [2]. Repeated runs of
this suite trigger multiple OOMs that eventually kill the entire system;
it usually takes 3-5 runs:

 * Total System Memory......:  18024 MB
 * Shared Mem Max Mapping...:    320 MB
 * System Huge Page Size....:     16 MB
 * Available Huge Pages.....:     20
 * Total size of Huge Pages.:    320 MB
 * Remaining System Memory..:  17704 MB
 * Huge Page User Group.....:  hugepages (1001)

I see this only on ppc (BE/LE); x86_64 seems unaffected and ran the
tests successfully for ~12 hours.

Bisection identified the following commit as the culprit:
  commit 67961f9db8c477026ea20ce05761bde6f8bf85b0
  Author: Mike Kravetz <mike.kravetz@oracle.com>
  Date:   Wed Jun 8 15:33:42 2016 -0700
    mm/hugetlb: fix huge page reserve accounting for private mappings


The following patch (made with my limited insight), applied to the
latest git [2], fixes the problem for me:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ec49d9e..7261583 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1876,7 +1876,7 @@ static long __vma_reservation_common(struct hstate *h,
                 * return value of this routine is the opposite of the
                 * value returned from reserve map manipulation routines above.
                 */
-               if (ret)
+               if (ret >= 0)
                        return 0;
                else
                        return 1;
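
To make the effect of the one-line change explicit, here is a minimal
standalone sketch (not kernel code; both helper names are made up for
illustration) of how the old and new conditions map ret to the value
returned. It looks at the changed branch in isolation; which values of
ret can actually reach this branch depends on the surrounding code that
is not shown in the hunk.

```c
/* Illustrative only: not kernel code. Both helper names are hypothetical. */

/* Changed branch before the patch: any nonzero ret yields 0. */
static long resv_result_before(long ret)
{
	if (ret)
		return 0;
	else
		return 1;
}

/* Changed branch with the patch applied: any non-negative ret yields 0. */
static long resv_result_after(long ret)
{
	if (ret >= 0)
		return 0;
	else
		return 1;
}
```

Taken in isolation, the two versions agree for ret > 0 (both return 0)
and differ for ret == 0 (1 before, 0 after) and for ret < 0 (0 before,
1 after).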

Regards,
Jan

[1] https://github.com/libhugetlbfs/libhugetlbfs
[2] v4.8-14230-gb67be92


