From: David Hildenbrand <david@redhat.com>
To: "Liang, Liang (Leo)" <Liang.Liang@amd.com>
Cc: David Hildenbrand <david@redhat.com>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>,
	linux-kernel@vger.kernel.org,
	amd-gfx list <amd-gfx@lists.freedesktop.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Huang, Ray" <Ray.Huang@amd.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	George Kennedy <george.kennedy@oracle.com>
Subject: Re: slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()")
Date: Sat, 13 Mar 2021 10:05:23 +0100	[thread overview]
Message-ID: <0AE49D98-171A-42B9-9CFC-9193A9BD3346@redhat.com> (raw)
In-Reply-To: <MW3PR12MB4537B49678884A1EB1F75AB5F36E9@MW3PR12MB4537.namprd12.prod.outlook.com>


> Am 13.03.2021 um 05:04 schrieb Liang, Liang (Leo) <Liang.Liang@amd.com>:
> 
> [AMD Public Use]
> 
> Hi David,
> 
> Which benchmark tool do you prefer? Memtest86+ or something else?

Hi Leo,

I think you want something that runs under Linux natively.

I'm planning to code up a kernel module that walks all 4MB pages in the freelists and runs a stream benchmark on each one individually. Then we might be able to identify the problematic range - if there is a problematic range :) I expect to have it running by Monday and will let you know.

Cheers!

> 
> BRs,
> Leo
> -----Original Message-----
> From: David Hildenbrand <david@redhat.com> 
> Sent: Saturday, March 13, 2021 12:47 AM
> To: Liang, Liang (Leo) <Liang.Liang@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; linux-kernel@vger.kernel.org; amd-gfx list <amd-gfx@lists.freedesktop.org>; Andrew Morton <akpm@linux-foundation.org>
> Cc: Huang, Ray <Ray.Huang@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Mike Rapoport <rppt@linux.ibm.com>; Rafael J. Wysocki <rafael@kernel.org>; George Kennedy <george.kennedy@oracle.com>
> Subject: Re: slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()")
> 
>> On 12.03.21 17:19, Liang, Liang (Leo) wrote:
>> [AMD Public Use]
>> 
>> Dmesg attached.
>> 
> 
> 
> So, it looks like the "real" slowdown starts once the buddy allocator is up and running (no surprise).
> 
> 
> [    0.044035] Memory: 6856724K/7200304K available (14345K kernel code, 9699K rwdata, 5276K rodata, 2628K init, 12104K bss, 343324K reserved, 0K cma-reserved)
> [    0.044045] random: get_random_u64 called from __kmem_cache_create+0x33/0x460 with crng_init=1
> [    0.049025] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
> [    0.050036] ftrace: allocating 47158 entries in 185 pages
> [    0.097487] ftrace: allocated 185 pages with 5 groups
> [    0.109210] rcu: Hierarchical RCU implementation.
> 
> vs.
> 
> [    0.041115] Memory: 6869396K/7200304K available (14345K kernel code, 3433K rwdata, 5284K rodata, 2624K init, 6088K bss, 330652K reserved, 0K cma-reserved)
> [    0.041127] random: get_random_u64 called from __kmem_cache_create+0x31/0x430 with crng_init=1
> [    0.041309] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
> [    0.041335] ftrace: allocating 47184 entries in 185 pages
> [    0.055719] ftrace: allocated 185 pages with 5 groups
> [    0.055863] rcu: Hierarchical RCU implementation.
> 
> 
> And it gets especially bad during ACPI table processing:
> 
> [    4.158303] ACPI: Added _OSI(Module Device)
> [    4.158767] ACPI: Added _OSI(Processor Device)
> [    4.159230] ACPI: Added _OSI(3.0 _SCP Extensions)
> [    4.159705] ACPI: Added _OSI(Processor Aggregator Device)
> [    4.160551] ACPI: Added _OSI(Linux-Dell-Video)
> [    4.161359] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
> [    4.162264] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
> [   17.713421] ACPI: 13 ACPI AML tables successfully acquired and loaded
> [   18.716065] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
> [   20.743828] ACPI: EC: EC started
> [   20.744155] ACPI: EC: interrupt blocked
> [   20.945956] ACPI: EC: EC_CMD/EC_SC=0x666, EC_DATA=0x662
> [   20.946618] ACPI: \_SB_.PCI0.LPC0.EC0_: Boot DSDT EC used to handle transactions
> [   20.947348] ACPI: Interpreter enabled
> [   20.951278] ACPI: (supports S0 S3 S4 S5)
> [   20.951632] ACPI: Using IOAPIC for interrupt routing
> 
> vs.
> 
> [    0.216039] ACPI: Added _OSI(Module Device)
> [    0.216041] ACPI: Added _OSI(Processor Device)
> [    0.216043] ACPI: Added _OSI(3.0 _SCP Extensions)
> [    0.216044] ACPI: Added _OSI(Processor Aggregator Device)
> [    0.216046] ACPI: Added _OSI(Linux-Dell-Video)
> [    0.216048] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
> [    0.216049] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
> [    0.228259] ACPI: 13 ACPI AML tables successfully acquired and loaded
> [    0.229527] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
> [    0.231663] ACPI: EC: EC started
> [    0.231666] ACPI: EC: interrupt blocked
> [    0.233664] ACPI: EC: EC_CMD/EC_SC=0x666, EC_DATA=0x662
> [    0.233667] ACPI: \_SB_.PCI0.LPC0.EC0_: Boot DSDT EC used to handle transactions
> [    0.233670] ACPI: Interpreter enabled
> [    0.233685] ACPI: (supports S0 S3 S4 S5)
> [    0.233687] ACPI: Using IOAPIC for interrupt routing
> 
> The jump from 4.1s -> 17.7s is especially bad.
> 
> This might in fact indicate that some very special, slow (ACPI?) memory is being used for ordinary purposes, interfering with actual ACPI users.
> 
> But again, this is just a wild guess: the system is extremely slow afterwards, yet we don't see any pauses that long without signs of life.
> 
> 
> It would be interesting to run a simple memory bandwidth benchmark on the fast kernel with increasing allocation sizes, up to the point of running OOM, to see whether some memory really is horribly slow once allocated and used.
> 
> --
> Thanks,
> 
> David / dhildenb
> 



Thread overview: 48+ messages
2021-03-11 14:41 slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()") Deucher, Alexander
2021-03-11 15:02 ` David Hildenbrand
2021-03-12 14:06   ` Deucher, Alexander
2021-03-12 14:12     ` David Hildenbrand
2021-03-12 15:39       ` Deucher, Alexander
2021-03-12 15:48         ` David Hildenbrand
2021-03-12 16:15           ` Deucher, Alexander
2021-03-12 16:19         ` Liang, Liang (Leo)
2021-03-12 16:46           ` David Hildenbrand
2021-03-13  4:04             ` Liang, Liang (Leo)
2021-03-13  9:05               ` David Hildenbrand [this message]
2021-03-13 13:48                 ` Mike Rapoport
2021-03-15 13:04                   ` David Hildenbrand
2021-03-16  0:36                     ` Liang, Liang (Leo)
2021-03-16  6:50                       ` Mike Rapoport
2021-03-16  8:00                         ` Liang, Liang (Leo)
2021-03-16  8:37                           ` David Hildenbrand
2021-03-16  8:43                             ` Liang, Liang (Leo)
2021-03-16  8:54                               ` David Hildenbrand
2021-03-16  8:58                                 ` Liang, Liang (Leo)
2021-03-16  9:08                                   ` David Hildenbrand
2021-03-16 10:30                                     ` Mike Rapoport
2021-03-16 11:02                                       ` Liang, Liang (Leo)
2021-03-16 11:10                                         ` David Hildenbrand
