linux-mm.kvack.org archive mirror
* Performance regressions in "boot_time" tests in Linux 5.8 Kernel
@ 2020-10-09 13:15 Rahul Gopakumar
From: Rahul Gopakumar @ 2020-10-09 13:15 UTC
  To: bhe, linux-mm, linux-kernel
  Cc: akpm, natechancellor, ndesaulniers, clang-built-linux, rostedt,
	Rajender M, Yiu Cho Lau, Peter Jonasson, Venkatesh Rajaram

As part of VMware's performance regression testing for upstream Linux
kernel releases, we identified a boot time increase when comparing the
Linux 5.8 kernel against the Linux 5.7 kernel. The increase in boot time
is noticeable on VMs with a **large amount of memory**.
 
In our test cases, it is noticeable with 1TB of memory or more, whereas
we saw no major difference in test cases with less than 1TB.
 
On bisecting between 5.7 and 5.8, we found the following commit from
“Baoquan He” to be the cause of the boot time increase in the
large-memory VM test cases.
 
-------------------------------------
 
commit 73a6e474cb376921a311786652782155eac2fdf0
Author: Baoquan He <bhe@redhat.com>
Date: Wed Jun 3 15:57:55 2020 -0700
 
mm: memmap_init: iterate over memblock regions rather that check each PFN
 
When called during boot the memmap_init_zone() function checks if each PFN
is valid and actually belongs to the node being initialized using
early_pfn_valid() and early_pfn_in_nid().
 
Each such check may cost up to O(log(n)) where n is the number of memory
banks, so for large amount of memory overall time spent in early_pfn*()
becomes substantial.
 
-------------------------------------
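
To illustrate the trade-off the commit message describes, here is a toy model in Python. It is only a sketch and not the kernel code: `BANKS`, `pfn_valid`, `init_per_pfn` and `init_per_region` are hypothetical names, and the bank layout is made up. It contrasts a per-PFN lookup (a binary search over banks for every PFN) with a walk over the banks themselves.

```python
from bisect import bisect_right

# Hypothetical memory banks as half-open (start_pfn, end_pfn) ranges, sorted by start.
BANKS = [(0, 4), (8, 12), (16, 20)]
STARTS = [start for start, _ in BANKS]


def pfn_valid(pfn):
    """Binary search over the banks: O(log n) work for a single PFN."""
    i = bisect_right(STARTS, pfn) - 1
    return i >= 0 and pfn < BANKS[i][1]


def init_per_pfn(zone_start, zone_end):
    """Per-PFN style: test every PFN in the zone individually."""
    return [pfn for pfn in range(zone_start, zone_end) if pfn_valid(pfn)]


def init_per_region(zone_start, zone_end):
    """Region style: walk the banks and clamp each one to the zone."""
    pfns = []
    for start, end in BANKS:
        lo, hi = max(start, zone_start), min(end, zone_end)
        pfns.extend(range(lo, hi))  # empty range when the bank misses the zone
    return pfns


# Both approaches select the same PFNs; only the lookup cost differs.
print(init_per_pfn(2, 18) == init_per_region(2, 18))
```

Both functions return the same set of PFNs in this model, so the sketch only shows the difference in lookup cost, not the regression itself.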
 
For the boot time test, we used RHEL 8.1 as the guest OS.
The VM config is 84 vCPUs and 1TB vRAM.
 
Here are the actual performance numbers.
 
5.7 GA - 18.17 secs
Baoquan's commit - 21.6 secs (3.43 secs slower, roughly 19% longer than 5.7 GA)
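
The size of the relative change depends on which of the two numbers is used as the baseline. A quick check using only the two boot times above (the variable names are ours):

```python
baseline = 18.17   # 5.7 GA boot time, in seconds
regressed = 21.6   # boot time with Baoquan's commit, in seconds

delta = regressed - baseline
pct_vs_baseline = 100 * delta / baseline    # increase relative to 5.7 GA
pct_vs_regressed = 100 * delta / regressed  # same delta relative to the slower run

print(f"delta: {delta:.2f} s")
print(f"relative to 5.7 GA: {pct_vs_baseline:.1f}%")
print(f"relative to the slower run: {pct_vs_regressed:.1f}%")
```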
 
From the dmesg logs, we can see a significant delay around memmap
initialization.
 
Refer to the logs below.
 
Good commit
 
[0.033176] Normal zone: 1445888 pages used for memmap
[0.033176] Normal zone: 89391104 pages, LIFO batch:63
[0.035851] ACPI: PM-Timer IO Port: 0x448
 
Problem commit
 
[0.026874] Normal zone: 1445888 pages used for memmap
[0.026875] Normal zone: 89391104 pages, LIFO batch:63
[2.028450] ACPI: PM-Timer IO Port: 0x448
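
The gap is easy to miss by eye, so here is a small Python sketch that finds the largest jump between consecutive dmesg timestamps. The helper name `largest_gap` and the regular expression are our own; the log lines are the ones quoted above.

```python
import re

# Matches lines such as "[0.033176] Normal zone: ..." and captures time and message.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

GOOD = """\
[0.033176] Normal zone: 1445888 pages used for memmap
[0.033176] Normal zone: 89391104 pages, LIFO batch:63
[0.035851] ACPI: PM-Timer IO Port: 0x448
"""

PROBLEM = """\
[0.026874] Normal zone: 1445888 pages used for memmap
[0.026875] Normal zone: 89391104 pages, LIFO batch:63
[2.028450] ACPI: PM-Timer IO Port: 0x448
"""


def largest_gap(log_text):
    """Return (gap_seconds, message_before, message_after) for the largest jump."""
    entries = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            entries.append((float(m.group(1)), m.group(2)))
    gaps = [(b[0] - a[0], a[1], b[1]) for a, b in zip(entries, entries[1:])]
    return max(gaps, key=lambda g: g[0])


for name, log in (("good", GOOD), ("problem", PROBLEM)):
    gap, before, after = largest_gap(log)
    print(f"{name}: {gap:.6f} s between '{before}' and '{after}'")
```

In both logs the largest gap falls between the last "Normal zone" line and the ACPI line: under 3 milliseconds with the good commit and about 2 seconds with the problem commit.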
 
Our analysis suggests that with the problem commit, memory
initialization is no longer deferred to a later stage. Instead, the
huge chunk of memory is initialized serially during early boot. With
the good commit, the kernel was able to defer initialization of this
memory to a point where it could be done in parallel.


Rahul Gopakumar
Performance Engineering
VMware, Inc.



Thread overview: 21+ messages
2020-10-09 13:15 Performance regressions in "boot_time" tests in Linux 5.8 Kernel Rahul Gopakumar
2020-10-10  6:11 ` bhe
2020-10-12 17:21   ` Rahul Gopakumar
2020-10-13  5:08     ` bhe
2020-10-13 13:17     ` bhe
2020-10-20 13:45       ` Rahul Gopakumar
2020-10-20 15:18         ` bhe
2020-10-20 15:26           ` Rahul Gopakumar
2020-10-22  4:04             ` bhe
2020-10-22 17:21               ` Rahul Gopakumar
2020-11-02 14:15                 ` Rahul Gopakumar
2020-11-02 14:30                   ` bhe
2020-11-03 12:34                     ` Rahul Gopakumar
2020-11-03 14:03                       ` bhe
2020-11-12 14:51                       ` bhe
2020-11-20  3:11                         ` Rahul Gopakumar
2020-11-22  1:08                           ` bhe
2020-11-24 15:03                             ` Rahul Gopakumar
2020-11-30 16:55                               ` Mike Rapoport
2020-12-11 16:16                               ` Rahul Gopakumar
2020-12-13 15:15                                 ` bhe
