From: David Hildenbrand <david@redhat.com>
To: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Michal Hocko <mhocko@kernel.org>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Sistare <steven.sistare@oracle.com>
Subject: Re: [PATCH] x86/mm: use max memory block size with unaligned memory end
Date: Thu, 4 Jun 2020 19:45:40 +0200
Message-ID: <ebc31650-9e98-f286-6fc2-aafdd3cd9272@redhat.com>
In-Reply-To: <20200604172213.f5lufktpqvqjkv4u@ca-dmjordan1.us.oracle.com>

On 04.06.20 19:22, Daniel Jordan wrote:
> On Thu, Jun 04, 2020 at 09:22:03AM +0200, David Hildenbrand wrote:
>> On 04.06.20 05:54, Daniel Jordan wrote:
>>> Some of our servers spend 14 out of the 21 seconds of kernel boot
>>> initializing memory block sysfs directories and then creating symlinks
>>> between them and the corresponding nodes.  The slowness happens because
>>> the machines get stuck with the smallest supported memory block size on
>>> x86 (128M), which results in 16,288 directories to cover the 2T of
>>> installed RAM, and each of these paths does a linear search of the
>>> memory blocks for every block id, with atomic ops at each step.
>>
>> With 4fb6eabf1037 ("drivers/base/memory.c: cache memory blocks in xarray
>> to accelerate lookup") merged by Linus today (strange, I thought this
>> would have been upstream long ago)
> 
> Ah, thanks for pointing this out!  It was only posted to LKML so I missed it.
> 
>> all linear searches should be gone and at least
>> the performance observation in this patch no longer applies.
> 
> The performance numbers as stated, that's certainly true, but this patch on top
> still improves kernel boot by 7%.  It's a savings of half a second -- I'll take
> it.
> 
> IMHO the root cause of this is really the small block size.  Building a cache
> on top to avoid iterating over tons of small blocks seems like papering over
> the problem, especially when one of the two affected paths in boot is a

The memory block size dictates your memory hot(un)plug granularity.
E.g., on powerpc that's 16MB so they have *a lot* of memory blocks.
That's why that's not papering over the problem. Increasing the memory
block size isn't always the answer.

(there are other, still fairly academic approaches to power down memory
banks where you also want small memory blocks instead)
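
To make the granularity trade-off concrete, here is a rough sketch of how
the number of memory blocks scales with the block size. It assumes
contiguous RAM with no holes, so it yields 16,384 blocks for 2T at 128M
rather than the slightly lower figure quoted above, and it does not try to
model whatever accounts for that difference. The helper name
memory_block_count() and the size macros are made up for this
illustration; they are not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Size units used only by this sketch. */
#define MiB (UINT64_C(1) << 20)
#define GiB (UINT64_C(1) << 30)
#define TiB (UINT64_C(1) << 40)

/*
 * Hypothetical helper: number of memory blocks needed to cover
 * ram_bytes of contiguous memory. Rounds up so that a partial
 * trailing block still counts as one block.
 */
static uint64_t memory_block_count(uint64_t ram_bytes, uint64_t block_bytes)
{
	assert(block_bytes != 0);
	return ram_bytes / block_bytes + (ram_bytes % block_bytes != 0);
}
```

With 2T of RAM this gives 16,384 blocks at 128M and 131,072 blocks at the
16MB powerpc size mentioned above. A 2G block size (an illustrative value
chosen for this sketch, not one stated in this mail) would give 1,024
blocks, at the cost of a 2G hot(un)plug granularity.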

> cautious check that might be ready to be removed by now[0]:

Yeah, we discussed that somewhere already. My change only highlighted
the problem. And now that it's cheap, it can just stay unless there is a
very good reason not to do it.
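
As a rough user-space illustration of why the duplicate check is cheap
once lookups are indexed by block id: the sketch below uses a plain array
as a stand-in for the xarray cache, so finding an existing block is a
single slot access instead of a walk over all blocks. Everything here
(struct block, block_index, register_block(), MAX_BLOCKS) is hypothetical
and simplified; it is not the code in drivers/base/memory.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_BLOCKS 1024	/* hypothetical capacity for this sketch */

struct block {
	unsigned long id;
};

/* Stand-in for the id-indexed cache: slot i holds the block with id i. */
static struct block *block_index[MAX_BLOCKS];

/* Constant-time lookup by block id; returns NULL if not registered. */
static struct block *find_block_by_id(unsigned long id)
{
	return id < MAX_BLOCKS ? block_index[id] : NULL;
}

/*
 * Register a block, refusing duplicates. The "does it already exist?"
 * check costs one array access, regardless of how many blocks exist.
 */
static int register_block(struct block *blk)
{
	if (blk->id >= MAX_BLOCKS)
		return -EINVAL;
	if (find_block_by_id(blk->id))
		return -EEXIST;
	block_index[blk->id] = blk;
	return 0;
}
```

Registering a second block with an id that is already present returns
-EEXIST without touching any other slot, which is the shape of the
cautious check quoted below.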

> 
>     static int init_memory_block(struct memory_block **memory,
>     			     unsigned long block_id, unsigned long state)
>     {
>             ...
>     	mem = find_memory_block_by_id(block_id);
>     	if (mem) {
>     		put_device(&mem->dev);
>     		return -EEXIST;
>     	}
> 
> Anyway, I guess I'll redo the changelog and post again.
> 
>> The memmap init should nowadays consume most time.
> 
> Yeah, but of course it's not as bad as it was now that it's fully parallelized.

Right. I also observed that computing whether a zone is contiguous can be
expensive.


-- 
Thanks,

David / dhildenb



