From: Miaohe Lin <linmiaohe@huawei.com>
To: Mike Kravetz <mike.kravetz@oracle.com>, <akpm@linux-foundation.org>
Cc: <almasrymina@google.com>, <rientjes@google.com>,
	<linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC] hugetlb_cgroup: fix unbalanced css_put for shared mappings
Date: Wed, 10 Feb 2021 10:11:14 +0800
Message-ID: <ce1f6b65-54e9-11a5-3378-ff2bb96559ec@huawei.com>
In-Reply-To: <14ed5468-156f-6ac4-ba1e-283a88bab917@oracle.com>

Hi:
On 2021/2/10 2:56, Mike Kravetz wrote:
> On 2/8/21 7:27 PM, Miaohe Lin wrote:
>> On 2021/2/9 3:52, Mike Kravetz wrote:
>>> On 1/23/21 1:31 AM, Miaohe Lin wrote:
>>>> The current implementation of hugetlb_cgroup for shared mappings can
>>>> behave inconsistently. Consider the following two scenarios:
>>>>
>>>> 1. Assume initial css reference count of hugetlb_cgroup is 1:
>>>>   1.1 Call hugetlb_reserve_pages with from = 1, to = 2. So css reference
>>>> count is 2 associated with 1 file_region.
>>>>   1.2 Call hugetlb_reserve_pages with from = 2, to = 3. So css reference
>>>> count is 3 associated with 2 file_region.
>>>>   1.3 coalesce_file_region will coalesce these two file_regions into one.
>>>> So css reference count is 3 associated with 1 file_region now.
>>>>
>>>> 2. Assume initial css reference count of hugetlb_cgroup is 1 again:
>>>>   2.1 Call hugetlb_reserve_pages with from = 1, to = 3. So css reference
>>>> count is 2 associated with 1 file_region.
>>>>
>>>> Therefore, a single file_region may hold one or more css references.
>>>> This inconsistency can lead to unbalanced css_put() calls. If we do
>>>> css_put one by one (e.g. the hole punch case), scenario 2 would put one
>>>> reference too many. If we do css_put all together (e.g. the truncate
>>>> case), scenario 1 would leak one css reference.
>>>
>>> Sorry for the delay in replying.  This is tricky code and I needed some quiet
>>> time to study it.
>>>
>>
>> That's fine. I was also trying to find more buggy cases.
>>
>>> I agree that the issue described exists.  Can you describe what a user would
>>> see in the above imbalance scenarios?  What happens if we do one too many
>>> css_put calls?  What happens if we leak the reference and do not do the
>>> required number of css_puts?
>>>
>>
>> The imbalanced css_get/css_put would leave a non-zero reference count when we
>> try to destroy the hugetlb cgroup. The hugetlb cgroup directory is removed
>> __but__ the associated resources are not freed. In a really busy workload,
>> this might eventually result in OOM or in being unable to create a new
>> hugetlb cgroup.
>>
>>> The code changes look correct.
>>>
>>> I just wish this code was not so complicated.  I think the private mapping
>>> case could be simplified to only take a single css_ref per reserve map.
>>
>> Could you explain this more?
>> It seems one reserve map already takes a single css_ref, and a hugepage
>> outside the reservation would take a single css_ref too.
> 
> Let me preface this by saying that my cgroup knowledge is limited.
> For private mappings, all reservations will be associated with the same cgroup.
> This is because only one process can access the mapping.  Since there is only
> one process, we only need to hold one css reference.  Individual counters can
> be incremented as needed without increasing the css reference count.  We
> take a reference when the reserve map is created and drop the reference when
> it is deleted.
> 

I see. Many thanks for the detailed explanation. This could be a point to optimize later.

> This does not work for shared mappings as you can have multiple processes in
> multiple cgroups taking reservations on the same file.  This is why you need
> per-reservation reference accounting in this case.

Thanks again. :)

Thread overview: 5+ messages
2021-01-23  9:31 [PATCH RFC] hugetlb_cgroup: fix unbalanced css_put for shared mappings Miaohe Lin
2021-02-08 19:52 ` Mike Kravetz
2021-02-09  3:27   ` Miaohe Lin
2021-02-09 18:56     ` Mike Kravetz
2021-02-10  2:11       ` Miaohe Lin [this message]