All of lore.kernel.org
 help / color / mirror / Atom feed
From: Guillaume Morin <guillaume@morinfr.org>
To: almasrymina@google.com, mike.kravetz@oracle.com,
	cgroups@vger.kernel.org, guillaume@morinfr.org,
	linux-mm@kvack.org
Subject: Re: [BUG] potential hugetlb css refcounting issues
Date: Sat, 28 Aug 2021 00:58:43 +0200	[thread overview]
Message-ID: <20210827225841.GA30891@bender.morinfr.org> (raw)
In-Reply-To: <8a4f2fbc-76e8-b67b-f110-30beff2228f5@oracle-com>

Hello Mike,

I really appreciate the quick reply

Mike Kravets wrote: 
> There have been other hugetlb cgroup fixes since 5.10.  I do not believe
> they are related to the underflow issue you have seen.  Just FYI.

Yes, I am aware. Actually I did my best to look at all recent changes
not backported to 5.10 and couldn't find anything related. I tried to
cherry-pick a couple of fixes in case but the bug did not go away.

> However, when a vma is split both resulting vmas would be 'owners' of
> private mapping reserves without incrementing the refcount which would
> lead to the underflow you describe.

Indeed and I do know that programs running on my reproducer machines do
split vmas.

>> 2. After 08cf9faf75580, __free_huge_page() decrements the css
>> refcount for _each_ page unconditionally by calling
>> hugetlb_cgroup_uncharge_page_rsvd().  But a per-page reference count
>> is only taken *per page* outside the reserve case in
>> alloc_huge_page() (i.e hugetlb_cgroup_charge_cgroup_rsvd() is called
>> only if deferred_reserve is true).  In the reserve case, there is
>> only one css reference linked to the resv map (taken in
>> hugetlb_reserve_pages()).  This also leads to an underflow of the
>> counter.  A similar scheme to HPageRestoreReserve can be used to
>> track which pages were allocated in the deferred_reserve case and
>> call hugetlb_cgroup_uncharge_page_rsvd() only for these during
>> freeing.

> I am not sure about the above analysis.  It is true that
> hugetlb_cgroup_uncharge_page_rsvd is called unconditionally in
> free_huge_page.  However, IIUC hugetlb_cgroup_uncharge_page_rsvd will
> only decrement the css refcount if there is a non-NULL hugetlb_cgroup
> pointer in the page.  And, the pointer in the page would only be set
> in the 'deferred_reserve' path of alloc_huge_page.  Unless I am
> missing something, they seem to balance.

Now that you explain, I am pretty sure that you're right and I was
wrong.

I'll confirm that I can't reproduce without my change for 2.

Thank you,

Guillaume.

-- 
Guillaume Morin <guillaume@morinfr.org>


WARNING: multiple messages have this Message-ID (diff)
From: Guillaume Morin <guillaume-/FyPzM6KSZdAfugRpC6u6w@public.gmane.org>
To: almasrymina-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
	mike.kravetz-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	guillaume-/FyPzM6KSZdAfugRpC6u6w@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org
Subject: Re: [BUG] potential hugetlb css refcounting issues
Date: Sat, 28 Aug 2021 00:58:43 +0200	[thread overview]
Message-ID: <20210827225841.GA30891@bender.morinfr.org> (raw)
In-Reply-To: <8a4f2fbc-76e8-b67b-f110-30beff2228f5@oracle-com>

Hello Mike,

I really appreciate the quick reply

Mike Kravets wrote: 
> There have been other hugetlb cgroup fixes since 5.10.  I do not believe
> they are related to the underflow issue you have seen.  Just FYI.

Yes, I am aware. Actually I did my best to look at all recent changes
not backported to 5.10 and couldn't find anything related. I tried to
cherry-pick a couple of fixes in case but the bug did not go away.

> However, when a vma is split both resulting vmas would be 'owners' of
> private mapping reserves without incrementing the refcount which would
> lead to the underflow you describe.

Indeed and I do know that programs running on my reproducer machines do
split vmas.

>> 2. After 08cf9faf75580, __free_huge_page() decrements the css
>> refcount for _each_ page unconditionally by calling
>> hugetlb_cgroup_uncharge_page_rsvd().  But a per-page reference count
>> is only taken *per page* outside the reserve case in
>> alloc_huge_page() (i.e hugetlb_cgroup_charge_cgroup_rsvd() is called
>> only if deferred_reserve is true).  In the reserve case, there is
>> only one css reference linked to the resv map (taken in
>> hugetlb_reserve_pages()).  This also leads to an underflow of the
>> counter.  A similar scheme to HPageRestoreReserve can be used to
>> track which pages were allocated in the deferred_reserve case and
>> call hugetlb_cgroup_uncharge_page_rsvd() only for these during
>> freeing.

> I am not sure about the above analysis.  It is true that
> hugetlb_cgroup_uncharge_page_rsvd is called unconditionally in
> free_huge_page.  However, IIUC hugetlb_cgroup_uncharge_page_rsvd will
> only decrement the css refcount if there is a non-NULL hugetlb_cgroup
> pointer in the page.  And, the pointer in the page would only be set
> in the 'deferred_reserve' path of alloc_huge_page.  Unless I am
> missing something, they seem to balance.

Now that you explain, I am pretty sure that you're right and I was
wrong.

I'll confirm that I can't reproduce without my change for 2.

Thank you,

Guillaume.

-- 
Guillaume Morin <guillaume-/FyPzM6KSZdAfugRpC6u6w@public.gmane.org>

       reply	other threads:[~2021-08-27 22:58 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <8a4f2fbc-76e8-b67b-f110-30beff2228f5@oracle-com>
2021-08-27 22:58 ` Guillaume Morin [this message]
2021-08-27 22:58   ` [BUG] potential hugetlb css refcounting issues Guillaume Morin
2021-08-28 19:37   ` Guillaume Morin
2021-08-28 19:37     ` Guillaume Morin
2021-08-27 15:11 Guillaume Morin
2021-08-27 22:22 ` Mike Kravetz
2021-08-27 22:22   ` Mike Kravetz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210827225841.GA30891@bender.morinfr.org \
    --to=guillaume@morinfr.org \
    --cc=almasrymina@google.com \
    --cc=cgroups@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.