Shaohua Li writes:

> On Fri, Sep 23, 2016 at 10:32:39AM +0800, Huang, Ying wrote:
>> Rik van Riel writes:
>>
>> > On Thu, 2016-09-22 at 15:56 -0700, Shaohua Li wrote:
>> >> On Wed, Sep 07, 2016 at 09:45:59AM -0700, Huang, Ying wrote:
>> >> >
>> >> > - It will help the memory fragmentation, especially when the THP is
>> >> >   heavily used by the applications.  The 2M continuous pages will be
>> >> >   free up after THP swapping out.
>> >>
>> >> So this is impossible without THP swapin. While 2M swapout makes a lot of
>> >> sense, I doubt 2M swapin is really useful. What kind of application is
>> >> 'optimized' to do sequential memory access?
>> >
>> > I suspect a lot of this will depend on the ratio of storage
>> > speed to CPU & RAM speed.
>> >
>> > When swapping to a spinning disk, it makes sense to avoid
>> > extra memory use on swapin, and work in 4kB blocks.
>>
>> For spinning disk, the THP swap optimization will be turned off in
>> current implementation. Because huge swap cluster allocation based on
>> swap cluster management, which is available only for non-rotating block
>> devices (blk_queue_nonrot()).
>
> For 2m swapin, as long as one byte is changed in the 2m, next time we must do
> 2m swapout. There is huge waste of memory and IO bandwidth and increases
> unnecessary memory pressure. 2M IO will very easily saturate a very fast SSD
> and makes IO the bottleneck. Not sure about NVRAM though.

One solution is to make 2M swapin configurable, maybe via a sysfs file
in /sys/kernel/mm/transparent_hugepage/, so that we can turn it on only
for really fast storage devices, such as NVRAM, etc.

Best Regards,
Huang, Ying