From: "Liang, Liang (Leo)" <Liang.Liang@amd.com>
To: Mike Rapoport <rppt@linux.ibm.com>, David Hildenbrand <david@redhat.com>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	amd-gfx list <amd-gfx@lists.freedesktop.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Huang, Ray" <Ray.Huang@amd.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	George Kennedy <george.kennedy@oracle.com>
Subject: RE: slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()")
Date: Tue, 16 Mar 2021 11:02:50 +0000
Message-ID: <MW3PR12MB4537C3C6EFD9CA3A4B32084DF36B9@MW3PR12MB4537.namprd12.prod.outlook.com>
In-Reply-To: <YFCIqLmn3u1be1yo@linux.ibm.com>

[AMD Public Use]

Hi David and Mike,

It was a BIOS bug. It is now fixed by a new BIOS. Thank you so much! Cheers!

[    0.000034] MTRR variable ranges enabled:
[    0.000035]   0 base 000000000000 mask FFFF80000000 write-back
[    0.000037]   1 base 0000FFE00000 mask FFFFFFE00000 write-protect
[    0.000039]   2 base 0000FFDE0000 mask FFFFFFFE0000 write-protect
[    0.000040]   3 base 0000FF000000 mask FFFFFFF80000 write-protect
[    0.000041]   4 disabled
[    0.000042]   5 disabled
[    0.000043]   6 disabled
[    0.000044]   7 disabled
[    0.000045] TOM2: 0000000280000000 aka 10240M

root@scbu-Chachani:/home/scbu# cat /proc/mtrr
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x0ffe00000 ( 4094MB), size=    2MB, count=1: write-protect
reg02: base=0x0ffde0000 ( 4093MB), size=  128KB, count=1: write-protect
reg03: base=0x0ff000000 ( 4080MB), size=  512KB, count=1: write-protect
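
The kernel log prints each variable range as a base/mask pair, while /proc/mtrr prints base/size. As a rough sketch of how the two relate (assuming a 48-bit physical address width, which is only inferred from the 12 hex digits in the logged masks, and assuming each mask is contiguous), the size is the two's complement of the mask within that width. The helper name below is hypothetical and is not taken from the kernel.

```python
# Sketch only: convert the masks from the dmesg output above into sizes
# and compare them with the sizes that /proc/mtrr reports.
PHYS_ADDR_BITS = 48  # assumption: masks are logged as 12 hex digits


def mtrr_range_size(mask: int, phys_bits: int = PHYS_ADDR_BITS) -> int:
    """Return the number of bytes covered by a contiguous variable-MTRR mask."""
    full = (1 << phys_bits) - 1
    return (~mask & full) + 1


KB = 1 << 10
MB = 1 << 20

# (mask from the kernel log, size from /proc/mtrr) for the new BIOS
fixed_bios = [
    (0xFFFF80000000, 2048 * MB),  # reg00, write-back
    (0xFFFFFFE00000, 2 * MB),     # reg01, write-protect
    (0xFFFFFFFE0000, 128 * KB),   # reg02, write-protect
    (0xFFFFFFF80000, 512 * KB),   # reg03, write-protect
]

for mask, expected in fixed_bios:
    assert mtrr_range_size(mask) == expected

# The bogus range from the old BIOS: base 0x100000000, mask FFFFFF000000
assert mtrr_range_size(0xFFFFFF000000) == 16 * MB
```

With these inputs every assertion holds, so the log and /proc/mtrr describe the same ranges.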

BRs,
Leo
-----Original Message-----
From: Mike Rapoport <rppt@linux.ibm.com> 
Sent: Tuesday, March 16, 2021 6:30 PM
To: David Hildenbrand <david@redhat.com>
Cc: Liang, Liang (Leo) <Liang.Liang@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; linux-kernel@vger.kernel.org; amd-gfx list <amd-gfx@lists.freedesktop.org>; Andrew Morton <akpm@linux-foundation.org>; Huang, Ray <Ray.Huang@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Rafael J. Wysocki <rafael@kernel.org>; George Kennedy <george.kennedy@oracle.com>
Subject: Re: slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()")

On Tue, Mar 16, 2021 at 10:08:10AM +0100, David Hildenbrand wrote:
> On 16.03.21 09:58, Liang, Liang (Leo) wrote:
> > [AMD Public Use]
> > 
> > Hi David,
> > 
> > root@scbu-Chachani:~# cat /proc/mtrr
> > reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
> > reg01: base=0x0ffe00000 ( 4094MB), size=    2MB, count=1: write-protect
> > reg02: base=0x100000000 ( 4096MB), size=   16MB, count=1: write-protect
> 
> ^ there it is
> 
> https://wiki.osdev.org/MTRR
> 
> "Reads allocate cache lines on a cache miss. All writes update main memory.
> 
> Cache lines are not allocated on a write miss. Write hits invalidate
> the cache line and update main memory."
>
> AFAIU, writes completely bypass caches and store directly to main
> memory. If there are cache lines from a previous read, they are
> invalidated. So I think especially slow will be read(addr),
> write(addr), read(addr), ... which is what we have in the kstream benchmark.
> 
> 
> The question is:
> 
> who sets this up without owning the memory?
> Is the memory actually special/slow or is that setting wrong?

I really doubt that 16M at 0x100000000 in a system with 8G RAM would
*physically* differ from the neighbouring memory.

> Buggy firmware/BIOS?
> Buggy device driver?

[    0.000027] MTRR default type: uncachable
[    0.000028] MTRR fixed ranges enabled:
[    0.000030]   00000-9FFFF write-back
[    0.000031]   A0000-BFFFF uncachable
[    0.000032]   C0000-FFFFF write-through
[    0.000033] MTRR variable ranges enabled:
[    0.000034]   0 base 000000000000 mask FFFF80000000 write-back
[    0.000036]   1 base 0000FFE00000 mask FFFFFFE00000 write-protect
[    0.000037]   2 base 000100000000 mask FFFFFF000000 write-protect

As the range at 0x100000000 is already reported as write-protect that early in boot, I'd say it's the BIOS.

The question is how to reliably detect that this is a bogus setting...

[    0.000038]   3 base 0000FFDE0000 mask FFFFFFFE0000 write-protect
[    0.000039]   4 base 0000FF000000 mask FFFFFFF80000 write-protect
[    0.000040]   5 disabled
[    0.000041]   6 disabled
[    0.000042]   7 disabled
[    0.000042] TOM2: 0000000280000000 aka 10240M
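
One possible heuristic for the detection question, sketched in user space rather than as kernel code: parse /proc/mtrr and flag any range that is not write-back and overlaps the region between 4 GiB and TOM2. This is only an assumption about what "bogus" could mean here; the function names are hypothetical, the TOM2 value is copied from the log above, and a real check would have to take the memory map into account.

```python
import re

FOUR_GIB = 1 << 32
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

# Matches lines such as:
# reg02: base=0x100000000 ( 4096MB), size=   16MB, count=1: write-protect
LINE_RE = re.compile(
    r"reg(\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)([KMG])B, count=\d+: (\S.*)"
)


def parse_mtrr(text):
    """Return (reg, base, size_in_bytes, type) tuples from /proc/mtrr text."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            reg, base, size, unit, mtype = m.groups()
            entries.append(
                (int(reg), int(base, 16), int(size) * UNITS[unit], mtype.strip())
            )
    return entries


def suspect_ranges(entries, tom2):
    """Heuristic (assumption): non-write-back ranges overlapping [4 GiB, TOM2)."""
    suspects = []
    for reg, base, size, mtype in entries:
        overlaps_high_ram = base < tom2 and base + size > FOUR_GIB
        if mtype != "write-back" and overlaps_high_ram:
            suspects.append((reg, base, size, mtype))
    return suspects


TOM2 = 0x280000000  # from "TOM2: 0000000280000000 aka 10240M"

BUGGY = """\
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x0ffe00000 ( 4094MB), size=    2MB, count=1: write-protect
reg02: base=0x100000000 ( 4096MB), size=   16MB, count=1: write-protect
"""

for reg, base, size, mtype in suspect_ranges(parse_mtrr(BUGGY), TOM2):
    print(f"reg{reg:02d}: {mtype} range at {base:#x}, {size >> 20}MB, overlaps RAM above 4G")
```

On the listing quoted above this flags only reg02. The write-protect ranges just below 4G end at or before 0x100000000, so they are left alone.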


--
Sincerely yours,
Mike.
