From: David Hildenbrand <david@redhat.com>
To: Mike Kravetz <mike.kravetz@oracle.com>,
Mina Almasry <almasrymina@google.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Michal Privoznik <mprivozn@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Michal Hocko <mhocko@kernel.org>,
Muchun Song <songmuchun@bytedance.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Tejun Heo <tj@kernel.org>
Subject: Re: cgroup and FALLOC_FL_PUNCH_HOLE: WARNING: CPU: 13 PID: 2438 at mm/page_counter.c:57 page_counter_uncharge+0x4b/0x5
Date: Tue, 20 Oct 2020 15:38:52 +0200
Message-ID: <b24380ad-b87c-a3a1-d25e-ee30c10ed0d2@redhat.com>
In-Reply-To: <c78634ee-0d6f-c98c-3c2a-8cb500c0ae47@oracle.com>
On 16.10.20 01:14, Mike Kravetz wrote:
> On 10/14/20 11:31 AM, Mike Kravetz wrote:
>> On 10/14/20 11:18 AM, David Hildenbrand wrote:
>>
>> FWIW - I ran libhugetlbfs tests which do a bunch of hole punching
>> with (and without) hugetlb controller enabled and did not see this issue.
>>
>
> I took a closer look after running just the fallocate_stress test
> in libhugetlbfs. Here are the cgroup counter values:
>
> hugetlb.2MB.failcnt 0
> hugetlb.2MB.limit_in_bytes 9223372036854771712
> hugetlb.2MB.max_usage_in_bytes 209715200
> hugetlb.2MB.rsvd.failcnt 0
> hugetlb.2MB.rsvd.limit_in_bytes 9223372036854771712
> hugetlb.2MB.rsvd.max_usage_in_bytes 601882624
> hugetlb.2MB.rsvd.usage_in_bytes 392167424
> hugetlb.2MB.usage_in_bytes 0
>
> We did not hit the WARN_ON_ONCE(), but the 'rsvd.usage_in_bytes' value
> is not correct in that it should be zero. No huge page reservations
> remain after the test.
>
> HugePages_Total: 1024
> HugePages_Free: 1024
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
> Hugetlb: 2097152 kB
>
> To try and better understand the reservation cgroup controller, I added
> a few printks to the code. While running fallocate_stress with the
> printks, I can consistently hit the WARN_ON_ONCE() due to the counter
> going negative. Here are the cgroup counter values after such a run:
>
> hugetlb.2MB.failcnt 0
> hugetlb.2MB.limit_in_bytes 9223372036854771712
> hugetlb.2MB.max_usage_in_bytes 209715200
> hugetlb.2MB.rsvd.failcnt 3
> hugetlb.2MB.rsvd.limit_in_bytes 9223372036854771712
> hugetlb.2MB.rsvd.max_usage_in_bytes 251658240
> hugetlb.2MB.rsvd.usage_in_bytes 18446744073487253504
> hugetlb.2MB.usage_in_bytes 0
>
> Again, no reserved pages after the test.
>
> HugePages_Total: 1024
> HugePages_Free: 1024
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
> Hugetlb: 2097152 kB
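
The two 'rsvd.usage_in_bytes' values quoted above are easier to compare
when converted to huge pages. The following is only a sketch: it assumes
the second value is a negative count shown as an unsigned 64-bit number
(consistent with "the counter going negative" above). The helper names are
made up for illustration and are not kernel or libhugetlbfs interfaces.

```python
HUGE_PAGE_SIZE = 2 * 1024 * 1024  # bytes, from "Hugepagesize: 2048 kB"


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - (1 << 64) if value >= (1 << 63) else value


def to_huge_pages(usage_in_bytes: int) -> int:
    """Convert a usage_in_bytes reading into a (possibly negative) page count."""
    return as_signed64(usage_in_bytes) // HUGE_PAGE_SIZE


# Values copied from the two runs quoted above.
leaked = to_huge_pages(392167424)                # run without printks
underflowed = to_huge_pages(18446744073487253504)  # run with printks

print(leaked, underflowed)
```

Under that assumption, the first run leaves 187 huge pages charged that
were never uncharged, and the second run ends 106 huge pages below zero.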
>
> I have some basic hugetlb hole punch functionality tests. Running
> these on the kernel with the added printks does not cause any issues.
> To reproduce, I need to run the fallocate_stress test, which
> causes hole punch to race with page faults. My best guess at this
> time is that some of the error/race detection reservation back-out
> code is not properly dealing with cgroup accounting.
>
> I'll take a look at this as well.
>
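
To make the suspected failure mode concrete, here is a toy model of a
usage counter that warns once when it drops below zero. It is purely
illustrative: ToyPageCounter and the sequence of calls are hypothetical,
and it only shows how one extra uncharge on a back-out path would produce
the symptom, not where the real imbalance is in the hugetlb code.

```python
class ToyPageCounter:
    """Hypothetical stand-in for a charge/uncharge usage counter."""

    def __init__(self) -> None:
        self.usage = 0
        self.warned = False

    def charge(self, nr_pages: int) -> None:
        self.usage += nr_pages

    def uncharge(self, nr_pages: int) -> None:
        self.usage -= nr_pages
        # Warn only once, like a WARN_ON_ONCE-style check.
        if self.usage < 0 and not self.warned:
            self.warned = True
            print("WARNING: more uncharges than charges")


counter = ToyPageCounter()
counter.charge(1)    # reservation is charged once
counter.uncharge(1)  # assumed back-out path after losing a race
counter.uncharge(1)  # assumed normal release path uncharges again
print(counter.usage, counter.warned)
```

An imbalance in the other direction (a back-out path that skips its
uncharge) would instead leave usage stuck above zero, as in the first run.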
I'm bisecting the warning right now. Looks like it was introduced in v5.7.
--
Thanks,
David / dhildenb