On 1 Mar 2021, at 20:59, Roman Gushchin wrote:

> On Wed, Feb 24, 2021 at 05:35:36PM -0500, Zi Yan wrote:
>> From: Zi Yan
>>
>> Hi all,
>>
>> I have rebased my 1GB PUD THP support patches on v5.11-mmotm-2021-02-18-18-29
>> and the code is available at
>> https://github.com/x-y-z/linux-1gb-thp/tree/1gb_thp_v5.11-mmotm-2021-02-18-18-29
>> if you want to give it a try. The actual 49 patches are not sent out with this
>> cover letter. :)
>>
>> Instead of asking for code review, I would like to discuss the concerns raised
>> on previous RFCs. I think there are two major ones:
>>
>> 1. 1GB page allocation. The current implementation allocates 1GB pages from
>>    CMA regions that are reserved at boot time, like hugetlbfs does. The
>>    concern with using CMA is that an educated guess is needed to avoid
>>    depleting kernel memory in case the CMA regions are set too large.
>>    Recently David Rientjes proposed using process_madvise() for hugepage
>>    collapse, which is an alternative [1] but might not work for 1GB pages,
>>    since there is no way of _allocating_ a 1GB page into which to collapse
>>    pages. I proposed a similar approach at LSF/MM 2019, generating physically
>>    contiguous memory after pages are allocated [2], which is usable for 1GB
>>    THPs. That approach does in-place huge page promotion and thus does not
>>    require page allocation.
>
> Well, I don't think there is an alternative to CMA right now. Once memory has
> been almost filled at least once, any subsequent activity leading to
> substantial slab allocations (e.g. running git gc) will fragment the memory,
> so that there is virtually no chance of finding a contiguous GB.
>
> It's possible in theory to reduce fragmentation at the 1GB scale by grouping
> non-movable pageblocks, but that seems like a separate project.

My experiments showed that finding contiguous GBs is possible, but I agree that
CMA is more reliable and that 1GB-scale defragmentation should be a separate
project.

--
Best Regards,
Yan Zi
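
For readers unfamiliar with the process_madvise() interface referenced above, a
minimal userspace sketch of its calling convention follows. The collapse advice
value is hypothetical: as of v5.11 process_madvise() only accepts MADV_COLD and
MADV_PAGEOUT, so this illustrates only how such a collapse request could be
issued, not an interface this patchset or David's proposal actually provides.

/*
 * Minimal sketch (not part of the patchset): issuing a hugepage-collapse
 * hint on another process's mapping via process_madvise().
 *
 * MADV_COLLAPSE_RANGE is a hypothetical advice value used purely for
 * illustration; as of v5.11 process_madvise() only accepts MADV_COLD
 * and MADV_PAGEOUT.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif
#ifndef __NR_process_madvise
#define __NR_process_madvise 440
#endif

#define MADV_COLLAPSE_RANGE 25	/* hypothetical, for illustration only */

int main(int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: %s <pid> <hex addr>\n", argv[0]);
		return 1;
	}

	pid_t pid = (pid_t)atol(argv[1]);
	struct iovec iov = {
		/* One 1GB-aligned, 1GB-sized range to collapse. */
		.iov_base = (void *)strtoull(argv[2], NULL, 16),
		.iov_len  = 1ULL << 30,
	};

	int pidfd = (int)syscall(__NR_pidfd_open, pid, 0);
	if (pidfd < 0) {
		perror("pidfd_open");
		return 1;
	}

	ssize_t ret = syscall(__NR_process_madvise, pidfd, &iov, 1,
			      MADV_COLLAPSE_RANGE, 0);
	if (ret < 0)
		perror("process_madvise");

	close(pidfd);
	return ret < 0;
}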