On 8 Sep 2020, at 10:27, Matthew Wilcox wrote:

> On Tue, Sep 08, 2020 at 10:05:11AM -0400, Zi Yan wrote:
>> On 8 Sep 2020, at 7:57, David Hildenbrand wrote:
>>> I have concerns if we would silently use 1~GB THPs in most scenarios
>>> where we would have used 2~MB THP. I'd appreciate a trigger to
>>> explicitly enable that - MADV_HUGEPAGE is not sufficient because some
>>> applications relying on that assume that the THP size will be 2~MB
>>> (especially, if you want sparse, large VMAs).
>>
>> This patchset is not intended to silently use 1GB THP in place of 2MB THP.
>> First of all, there is a knob /sys/kernel/mm/transparent_hugepage/enable_1GB
>> to enable 1GB THP explicitly. Also, 1GB THP is allocated from a reserved CMA
>> region (although I had alloc_contig_pages as a fallback, which can be removed
>> in next version), so users need to add hugepage_cma=nG kernel parameter to
>> enable 1GB THP allocation. If a finer control is necessary, we can add
>> a new MADV_HUGEPAGE_1GB for 1GB THP.
>
> I think we do need that flag. Machines don't run a single workload
> (arguably with VMs, we're getting closer to going back to the single
> workload per machine, but that's a different matter). So if there's
> one app that wants 2MB pages and one that wants 1GB pages, we need to
> be able to distinguish them.
>
> I could also see there being an app which benefits from 1GB for
> one mapping and prefers 2GB for a different mapping, so I think the
> per-mapping madvise flag is best.
>
> I'm a little wary of encoding the size of an x86 PUD in the Linux API
> though. Probably best to follow the example set in
> include/uapi/asm-generic/hugetlb_encode.h, but I don't love it. I
> don't have a better suggestion though.

Using hugetlb_encode.h makes sense to me. I will add it in the next version.
Thanks for the suggestion.

--
Best Regards,
Yan Zi
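
For reference, a minimal sketch of the encoding scheme used by include/uapi/asm-generic/hugetlb_encode.h, which stores log2 of the page size in a 6-bit field starting at bit 26. The shift and mask values match that header; the helper names thp_size_encode() and thp_size_decode() are hypothetical and only illustrate the arithmetic. How such a field would be combined with a madvise advice value is not settled in this thread, so nothing here should be read as the patchset's actual interface.

```c
#include <stdint.h>

/* Values as defined in include/uapi/asm-generic/hugetlb_encode.h. */
#define HUGETLB_FLAG_ENCODE_SHIFT 26
#define HUGETLB_FLAG_ENCODE_MASK  0x3f

/* Hypothetical helper: pack log2(page size) into the 6-bit size field. */
static inline uint32_t thp_size_encode(unsigned int size_log2)
{
	return ((uint32_t)size_log2 & HUGETLB_FLAG_ENCODE_MASK)
		<< HUGETLB_FLAG_ENCODE_SHIFT;
}

/*
 * Hypothetical helper: recover the page size in bytes from the size field.
 * Returns 0 when no size is encoded (field is zero).
 */
static inline uint64_t thp_size_decode(uint32_t flags)
{
	unsigned int size_log2 =
		(flags >> HUGETLB_FLAG_ENCODE_SHIFT) & HUGETLB_FLAG_ENCODE_MASK;

	return size_log2 ? (uint64_t)1 << size_log2 : 0;
}
```

With this scheme a 2MB page is encoded as 21 and a 1GB page as 30 in the size field, so the API carries a generic size exponent instead of naming an x86 PUD.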