On 3 Jun 2021, at 10:22, Mel Gorman wrote:

> The per-cpu page allocator (PCP) only stores order-0 pages. This means
> that all THP and "cheap" high-order allocations including SLUB contends
> on the zone->lock. This patch extends the PCP allocator to store THP and
> "cheap" high-order pages. Note that struct per_cpu_pages increases in
> size to 256 bytes (4 cache lines) on x86-64.
>
> Note that this is not necessarily a universal performance win because of
> how it is implemented. High-order pages can cause pcp->high to be exceeded
> prematurely for lower-orders so for example, a large number of THP pages
> being freed could release order-0 pages from the PCP lists. Hence, much
> depends on the allocation/free pattern as observed by a single CPU to
> determine if caching helps or hurts a particular workload.
>
> That said, basic performance testing passed. The following is a netperf
> UDP_STREAM test which hits the relevant patches as some of the network
> allocations are high-order.
>
> netperf-udp
>                            5.13.0-rc2               5.13.0-rc2
>                      mm-pcpburst-v3r4     mm-pcphighorder-v1r7
> Hmean send-64         261.46 (   0.00%)      266.30 *   1.85%*
> Hmean send-128        516.35 (   0.00%)      536.78 *   3.96%*
> Hmean send-256       1014.13 (   0.00%)     1034.63 *   2.02%*
> Hmean send-1024      3907.65 (   0.00%)     4046.11 *   3.54%*
> Hmean send-2048      7492.93 (   0.00%)     7754.85 *   3.50%*
> Hmean send-3312     11410.04 (   0.00%)    11772.32 *   3.18%*
> Hmean send-4096     13521.95 (   0.00%)    13912.34 *   2.89%*
> Hmean send-8192     21660.50 (   0.00%)    22730.72 *   4.94%*
> Hmean send-16384    31902.32 (   0.00%)    32637.50 *   2.30%*
>
> From a functional point of view, a patch like this is necessary to
> make bulk allocation of high-order pages work with similar performance
> to order-0 bulk allocations. The bulk allocator is not updated in this
> series as it would have to be determined by bulk allocation users how
> they want to track the order of pages allocated with the bulk allocator.
>
> Signed-off-by: Mel Gorman
> Acked-by: Vlastimil Babka
> ---
>  include/linux/mmzone.h |  20 +++++-
>  mm/internal.h          |   2 +-
>  mm/page_alloc.c        | 159 +++++++++++++++++++++++++++++------------
>  mm/swap.c              |   2 +-
>  4 files changed, 135 insertions(+), 48 deletions(-)
>

Hi Mel,

I am not able to boot my QEMU VM with v5.13-rc5-mmotm-2021-06-07-18-33.
git bisect points to this patch. The VM got stuck at “Booting from ROM…”.

My kernel config is attached and my qemu command is:

qemu-system-x86_64 -kernel ~/repos/linux-1gb-thp/arch/x86/boot/bzImage \
    -drive file=~/qemu-image/vm.qcow2,if=virtio \
    -append "nokaslr root=/dev/vda1 rw console=ttyS0 " \
    -pidfile vm.pid \
    -netdev user,id=mynet0,hostfwd=tcp::11022-:22 \
    -device virtio-net-pci,netdev=mynet0 \
    -m 16g -smp 6 -cpu host -enable-kvm -nographic \
    -machine hmat=on -object memory-backend-ram,size=8g,id=m0 \
    -object memory-backend-ram,size=8g,id=m1 \
    -numa node,memdev=m0,nodeid=0 -numa node,memdev=m1,nodeid=1

The attached config has THP disabled. The VM cannot boot with THP
enabled, either.

--
Best Regards,
Yan, Zi