Re: [PATCH v2] mm: memory: move mem_cgroup_charge() into alloc_anon_folio()

From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>, <ryan.roberts@arm.com>,
	Matthew Wilcox <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>
Subject: Re: [PATCH v2] mm: memory: move mem_cgroup_charge() into alloc_anon_folio()
Date: Fri, 19 Jan 2024 20:59:22 +0800
Message-ID: <14ae628d-a9ef-42f3-9201-e90c5c88c133@huawei.com>
In-Reply-To: <ZaosK59cRa27K9zW@tiehlicka>

On 2024/1/19 16:00, Michal Hocko wrote:
> On Fri 19-01-24 10:05:15, Kefeng Wang wrote:
>>
>>
>> On 2024/1/18 23:59, Michal Hocko wrote:
>>> On Wed 17-01-24 18:39:54, Kefeng Wang wrote:
>>>> mem_cgroup_charge() uses the GFP flags in a fairly sophisticated way.
>>>> In addition to checking gfpflags_allow_blocking(), it pays attention
>>>> to __GFP_NORETRY and __GFP_RETRY_MAYFAIL to ensure that processes within
>>>> this memcg do not exceed their quotas. Using the same GFP flags ensures
>>>> that we handle large anonymous folios correctly, including falling back
>>>> to smaller orders when there is plenty of memory available in the system
>>>> but this memcg is close to its limits.
>>>
>>> The changelog is not really clear about the actual problem you are
>>> trying to fix. Is this a pure consistency fix, or have you actually
>>> seen any misbehavior? From the patch I suspect you are interested in
>>> THPs much more than regular order-0 pages, because those are
>>> GFP_KERNEL-like when it comes to charging. THPs have a variety of
>>> options on how aggressively the allocation should try. From that
>>> perspective NORETRY and RETRY_MAYFAIL are not all that interesting,
>>> because costly allocations (which THPs are) already imply MAYFAIL
>>> and NORETRY.
>>
>> I haven't hit an actual issue; this was found by code inspection.
>>
>> mTHP was introduced by Ryan (19eaf44954df "mm: thp: support allocation
>> of anonymous multi-size THP"), so alloc_anon_folio() now has a check
>> for mTHP similar to the one for PMD THP: it tries to allocate a
>> large-order folio below PMD_ORDER and falls back to an order-0 folio if
>> that fails. Meanwhile, it gets its GFP flags from vma_thp_gfp_mask()
>> according to user configuration, as PMD THP allocation does, so:
>>
>> 1) the memory charge failure check should be moved into the fallback
>> logic, so that we still allocate the largest folio order we can even
>> when the memcg's memory usage is close to its limit.
>>
>> 2) the same GFP flags should be used for allocation and for the memcg
>> charge. First, this is consistent with PMD THP. In addition, given the
>> GFP flags returned by vma_thp_gfp_mask(), GFP_TRANSHUGE_LIGHT makes us
>> skip direct reclaim, and __GFP_NORETRY makes us skip mem_cgroup_oom(),
>> so charging a large-order folio won't kill any process.
> 
> OK, makes sense. Please turn that into the changelog.

Sure.
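
To illustrate point 1) above, here is a rough userspace sketch of the
fallback idea, not the kernel code itself. All sk_* names, the callback
type and the toy budget struct are made up for illustration. It only
shows that a failure in either the allocation step or the charge step
moves on to the next smaller enabled order, with order-0 as the last
resort.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callbacks standing in for folio allocation and memcg charge. */
typedef bool (*sk_step_fn)(int order, void *ctx);

/*
 * Walk the enabled orders from max_order down to 1. If either the
 * allocation or the charge fails, try the next smaller enabled order.
 * Order 0 is always tried last. Returns the order obtained, or -1.
 */
int sk_alloc_anon_order(unsigned long orders, int max_order,
			sk_step_fn alloc, sk_step_fn charge, void *ctx)
{
	int order;

	assert(max_order >= 0 && max_order < (int)(8 * sizeof(orders)));

	for (order = max_order; order > 0; order--) {
		if (!(orders & (1ul << order)))
			continue;	/* order not enabled */
		if (!alloc(order, ctx))
			continue;	/* allocation failed, go smaller */
		if (charge(order, ctx))
			return order;
		/* charge failed: real code would drop the folio here */
	}

	if (alloc(0, ctx) && charge(0, ctx))
		return 0;
	return -1;
}

/* Toy model (assumption): plenty of system memory, little memcg room. */
struct sk_budget {
	unsigned long free_pages;	/* pages the system can hand out */
	unsigned long memcg_room;	/* pages left below the memcg limit */
};

bool sk_toy_alloc(int order, void *ctx)
{
	struct sk_budget *b = ctx;

	return (1ul << order) <= b->free_pages;
}

bool sk_toy_charge(int order, void *ctx)
{
	struct sk_budget *b = ctx;

	return (1ul << order) <= b->memcg_room;
}
```

With orders 4 and 2 enabled and only 5 pages of memcg room, the sketch
skips order 4 at the charge step and settles on order 2, which is the
behaviour point 1) is after.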

> 
>>> GFP_TRANSHUGE_LIGHT is more interesting though because those do not dive
>>> into the direct reclaim at all. With the current code they will reclaim
>>> charges to free up the space for the allocated THP page and that defeats
>>> the light mode. I have a vague recollection of preparing a patch to
>>
>> We are interested in GFP_TRANSHUGE_LIGHT and __GFP_NORETRY, as
>> mentioned above.
> 
> if mTHP can be smaller than COSTLY_ORDER then you are correct and
> NORETRY makes a difference. Please mention that in the changelog as
> well.
> 

For the memory cgroup charge, __GFP_NORETRY is checked so that we skip
mem_cgroup_oom() directly. The __GFP_NORETRY check in try_charge_memcg()
does not depend on the folio order or on COSTLY_ORDER, so I think NORETRY
should always make a difference for all large-order folios.