From: Rahul Gopakumar <gopakumarr@vmware.com>
To: "bhe@redhat.com" <bhe@redhat.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"natechancellor@gmail.com" <natechancellor@gmail.com>,
	"ndesaulniers@google.com" <ndesaulniers@google.com>,
	"clang-built-linux@googlegroups.com" 
	<clang-built-linux@googlegroups.com>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	Rajender M <manir@vmware.com>, Yiu Cho Lau <lauyiuch@vmware.com>,
	Peter Jonasson <pjonasson@vmware.com>,
	Venkatesh Rajaram <rajaramv@vmware.com>
Subject: Re: Performance regressions in "boot_time" tests in Linux 5.8 Kernel
Date: Mon, 2 Nov 2020 14:15:32 +0000
Message-ID: <DM6PR05MB5292DF14DF1C82FFE001AC24A4100@DM6PR05MB5292.namprd05.prod.outlook.com>
In-Reply-To: <DM6PR05MB5292D8B85FA9DDE263F6147AA41D0@DM6PR05MB5292.namprd05.prod.outlook.com>

Hi Baoquan,

There could still be some memory initialization problem with
the draft patch. I see a lot of page corruption errors.

BUG: Bad page state in process swapper  pfn:ab0803c

Here is the call trace:

[    0.262826]  dump_stack+0x57/0x6a
[    0.262827]  bad_page.cold.119+0x63/0x93
[    0.262828]  __free_pages_ok+0x31f/0x330
[    0.262829]  memblock_free_all+0x153/0x1bf
[    0.262830]  mem_init+0x23/0x1f2
[    0.262831]  start_kernel+0x299/0x57a
[    0.262832]  secondary_startup_64_no_verify+0xb8/0xbb

I don't see this in the dmesg log with the vanilla kernel.

It looks like the overhead due to this initialization problem
is around 3 secs, as the jump in timestamps below shows:

[    0.262831]  start_kernel+0x299/0x57a
[    0.262832]  secondary_startup_64_no_verify+0xb8/0xbb
[    3.758185] Memory: 3374072K/1073740756K available (12297K kernel code, 5778K rwdata, 4376K rodata, 2352K init, 6480K bss, 16999716K reserved, 0K cma-reserved)
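
The size of that jump can be read off the timestamps directly. Below is a minimal Python sketch, assuming the dmesg output is available as plain text; the helper names `parse_dmesg` and `largest_gap` are hypothetical and not part of any kernel tooling, and the last log line is shortened for readability.

```python
import re

# Matches the "[    0.262832] message" prefix that dmesg prints.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

def parse_dmesg(lines):
    """Return (timestamp_seconds, message) for each timestamped line."""
    entries = []
    for line in lines:
        m = TS_RE.match(line.strip())
        if m:
            entries.append((float(m.group(1)), m.group(2)))
    return entries

def largest_gap(entries):
    """Return (gap_seconds, message_before, message_after) for the widest gap."""
    best = (0.0, None, None)
    for (t0, msg0), (t1, msg1) in zip(entries, entries[1:]):
        if t1 - t0 > best[0]:
            best = (t1 - t0, msg0, msg1)
    return best

# Lines taken from the snippet above (Memory line shortened).
log = """\
[    0.262831]  start_kernel+0x299/0x57a
[    0.262832]  secondary_startup_64_no_verify+0xb8/0xbb
[    3.758185] Memory: 3374072K/1073740756K available
"""

gap, before, after = largest_gap(parse_dmesg(log.splitlines()))
print(f"{gap:.3f} s between '{before}' and '{after}'")
```

Note that this only measures the time between two consecutive log lines; it does not say how much of that interval the vanilla kernel would also spend in mem_init.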

However, the draft patch does fix the originally reported
delay of around 2 secs (log snippet below), which is why the
net delay is about 1 sec.

[    0.024752]   Normal zone: 1445888 pages used for memmap
[    0.024753]   Normal zone: 89391104 pages, LIFO batch:63
[    0.027379] ACPI: PM-Timer IO Port: 0x448


________________________________________
From: Rahul Gopakumar <gopakumarr@vmware.com>
Sent: 22 October 2020 10:51 PM
To: bhe@redhat.com
Cc: linux-mm@kvack.org; linux-kernel@vger.kernel.org; akpm@linux-foundation.org; natechancellor@gmail.com; ndesaulniers@google.com; clang-built-linux@googlegroups.com; rostedt@goodmis.org; Rajender M; Yiu Cho Lau; Peter Jonasson; Venkatesh Rajaram
Subject: Re: Performance regressions in "boot_time" tests in Linux 5.8 Kernel

Hi Baoquan,

>> Can you tell how you measure the boot time?

Our test is actually boothalt; the time reported by this test
includes both boot-up and shutdown time.

>> At above, you said "Patch on latest commit - 20.161 secs",
>> could you tell where this 20.161 secs comes from,

So this time is boot-up time + shutdown time.

From dmesg.log, it looks like memmap_init takes less time
with the patch applied. Let me take a closer look to
confirm this and also to find where the 1-sec delay in the patch
run is coming from.


From: bhe@redhat.com <bhe@redhat.com>
Sent: 22 October 2020 9:34 AM
To: Rahul Gopakumar <gopakumarr@vmware.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; akpm@linux-foundation.org <akpm@linux-foundation.org>; natechancellor@gmail.com <natechancellor@gmail.com>; ndesaulniers@google.com <ndesaulniers@google.com>; clang-built-linux@googlegroups.com <clang-built-linux@googlegroups.com>; rostedt@goodmis.org <rostedt@goodmis.org>; Rajender M <manir@vmware.com>; Yiu Cho Lau <lauyiuch@vmware.com>; Peter Jonasson <pjonasson@vmware.com>; Venkatesh Rajaram <rajaramv@vmware.com>
Subject: Re: Performance regressions in "boot_time" tests in Linux 5.8 Kernel

Hi Rahul,

On 10/20/20 at 03:26pm, Rahul Gopakumar wrote:
> >> Here, do you mean it even cost more time with the patch applied?
>
> Yes, we ran it multiple times and it looks like there is a
> very minor increase with the patch.
>
......
> On 10/20/20 at 01:45pm, Rahul Gopakumar wrote:
> > Hi Baoquan,
> >
> > We had some trouble applying the patch to the problem commit and the latest upstream commit. Steven (CC'ed) helped us by providing the updated draft patch. We applied it on the latest commit (3e4fb4346c781068610d03c12b16c0cfb0fd24a3), and it doesn't appear to improve the performance numbers.
>
> Thanks for your feedback. From the code, I am sure what the problem is,
> but I didn't test it on a system with huge memory. I forgot to mention
> that my draft patch is based on the akpm/master branch since it's an mm
> issue; it might differ a little from Linus's mainline kernel. Sorry for
> the inconvenience.
>
> I will test and debug this on a server with 4T memory in our lab, and
> update you on any progress.
>
> >
> > Patch on latest commit - 20.161 secs
> > Vanilla latest commit - 19.50 secs
>

Can you tell me how you measure the boot time? I checked the boot logs you
attached. E.g., in the two logs below, patch_dmesg.log even spends less
time during memmap init. I now have a machine with 1T memory for
testing, but didn't see an obvious increase in time cost. Above, you said
"Patch on latest commit - 20.161 secs"; could you tell me where this 20.161
secs comes from, so that I can investigate and reproduce on my system?

patch_dmesg.log:
[    0.023126] Initmem setup node 1 [mem 0x0000005600000000-0x000000aaffffffff]
[    0.023128] On node 1 totalpages: 89128960
[    0.023129]   Normal zone: 1392640 pages used for memmap
[    0.023130]   Normal zone: 89128960 pages, LIFO batch:63
[    0.023893] Initmem setup node 2 [mem 0x000000ab00000000-0x000001033fffffff]
[    0.023895] On node 2 totalpages: 89391104
[    0.023896]   Normal zone: 1445888 pages used for memmap
[    0.023897]   Normal zone: 89391104 pages, LIFO batch:63
[    0.026744] ACPI: PM-Timer IO Port: 0x448
[    0.026747] ACPI: Local APIC address 0xfee00000

vanilla_dmesg.log:
[    0.024295] Initmem setup node 1 [mem 0x0000005600000000-0x000000aaffffffff]
[    0.024298] On node 1 totalpages: 89128960
[    0.024299]   Normal zone: 1392640 pages used for memmap
[    0.024299]   Normal zone: 89128960 pages, LIFO batch:63
[    0.025289] Initmem setup node 2 [mem 0x000000ab00000000-0x000001033fffffff]
[    0.025291] On node 2 totalpages: 89391104
[    0.025292]   Normal zone: 1445888 pages used for memmap
[    0.025293]   Normal zone: 89391104 pages, LIFO batch:63
[    2.096982] ACPI: PM-Timer IO Port: 0x448
[    2.096987] ACPI: Local APIC address 0xfee00000
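
One way to compare the two logs is to align lines by message text and look at how long each step took relative to the previous line. The following is a minimal Python sketch under that assumption; `step_durations` and `compare` are hypothetical helper names, and only the last three lines of each log above are used as input.

```python
import re

TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

def step_durations(log_text):
    """Map each message to the time elapsed since the previous log line.

    If a message repeats, the last occurrence wins.
    """
    durations = {}
    prev = None
    for line in log_text.splitlines():
        m = TS_RE.match(line.strip())
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        if prev is not None:
            durations[msg] = ts - prev
        prev = ts
    return durations

def compare(patched, vanilla):
    """Return (message, patched_delta, vanilla_delta), largest difference first."""
    p, v = step_durations(patched), step_durations(vanilla)
    rows = [(msg, p[msg], v[msg]) for msg in p if msg in v]
    return sorted(rows, key=lambda r: abs(r[2] - r[1]), reverse=True)

patched_log = """\
[    0.023897]   Normal zone: 89391104 pages, LIFO batch:63
[    0.026744] ACPI: PM-Timer IO Port: 0x448
[    0.026747] ACPI: Local APIC address 0xfee00000
"""

vanilla_log = """\
[    0.025293]   Normal zone: 89391104 pages, LIFO batch:63
[    2.096982] ACPI: PM-Timer IO Port: 0x448
[    2.096987] ACPI: Local APIC address 0xfee00000
"""

rows = compare(patched_log, vanilla_log)
for msg, p_delta, v_delta in rows:
    print(f"{msg}: patched {p_delta:.6f} s, vanilla {v_delta:.6f} s")
```

On these excerpts the step ending at "ACPI: PM-Timer IO Port: 0x448" stands out: about 2.07 secs in the vanilla log versus under 3 ms in the patched log.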
