linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Yajun Deng <yajun.deng@linux.dev>
Cc: Greg KH <gregkh@linuxfoundation.org>,
	rafael@kernel.org, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid()
Date: Thu, 15 Jun 2023 09:20:21 +0300	[thread overview]
Message-ID: <20230615062021.GI52412@kernel.org> (raw)
In-Reply-To: <a96c998f9d73f03c85463d7314f6ea8a@linux.dev>

On Thu, Jun 15, 2023 at 03:02:58AM +0000, Yajun Deng wrote:
> June 14, 2023 7:53 PM, "Mike Rapoport" <rppt@kernel.org> wrote:
> 
> > Hi,
> > 
> > On Wed, Jun 14, 2023 at 11:28:32AM +0000, Yajun Deng wrote:
> > 
> >> June 14, 2023 7:09 PM, "Greg KH" <gregkh@linuxfoundation.org> wrote:
> >> 
> >> On Wed, Jun 14, 2023 at 07:03:24PM +0800, Yajun Deng wrote:
> >> 
> >> When the system boots, only one cpu is enabled before smp_init().
> >> So the spinlock is not needed in most cases, remove it.
> >> 
> >> Add spinlock in get_nid_for_pfn() because it is after smp_init().
> >> 
> >> So this is two different things at once in the same patch?
> >> 
> >> Or are they the same problem and both need to go in to solve it?
> >> 
> >> And if a spinlock is not needed at early boot, is it really causing any
> >> problems?
> >> 
> >> They are the same problem.
> >> I added pr_info in early_pfn_to_nid(), found get_nid_for_pfn() is the only
> >> case need to add spinlock.
> >> This patch tested on my x86 system.
> > 
> > Are you sure it'll work on !x86?
> >
> 
> I'm probably sure of that, although I don't have a !x86 machine.
> 
> early_pfn_to_nid() is called in smp_init() and kasan_init() on 
> different architectures. If it works well on x86, it'll work on
> !x86.

This is often not true. Please verify that other architectures do not call
early_pfn_to_nid() after smp_init(). The explanation why it is safe should
be a part of the changelog.

> >> Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
> >> ---
> >> drivers/base/node.c | 11 +++++++++--
> >> mm/mm_init.c | 18 +++---------------
> >> 2 files changed, 12 insertions(+), 17 deletions(-)
> >> 
> >> diff --git a/drivers/base/node.c b/drivers/base/node.c
> >> index 9de524e56307..844102570ff2 100644
> >> --- a/drivers/base/node.c
> >> +++ b/drivers/base/node.c
> >> @@ -748,8 +748,15 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
> >> static int __ref get_nid_for_pfn(unsigned long pfn)
> >> {
> >> #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
> >> - if (system_state < SYSTEM_RUNNING)
> >> - return early_pfn_to_nid(pfn);
> >> + static DEFINE_SPINLOCK(early_pfn_lock);
> >> + int nid;
> >> +
> >> + if (system_state < SYSTEM_RUNNING) {
> >> + spin_lock(&early_pfn_lock);
> >> + nid = early_pfn_to_nid(pfn);
> >> + spin_unlock(&early_pfn_lock);
> >> 
> >> Adding an external lock for when you call a function is VERY dangerous
> >> as you did not document this anywhere, and there's no way to enforce it
> >> properly at all.
> >> 
> >> I should add a comment before early_pfn_to_nid().
> >> 
> >> Does your change actually result in any boot time changes? How was this
> >> tested?
> >> 
> >> Just a bit.
> > 
> > Just a bit tested? Or just a bit of boot time changes?
> > For the latter, do you have numbers?
> > 
> 
> For the latter, the most beneficial function is memmap_init_reserved_pages(),
> the boot time changes depending on whether DEFERRED_STRUCT_PAGE_INIT
> is defined or not. 
> 
> -->memmap_init_reserved_pages()
>    -->for_each_reserved_mem_range()
>       reserve_bootmem_region()
>       -->for()
>          init_reserved_page()
>          --> early_pfn_to_nid()
 
A better solution would be to pass nid to reserve_bootmem_range() and drop
the call to early_pfn_to_nid() in init_reserved_page(). 

Then there won't be lock contention and no need for fragile changes in the
locking.

> If define CONFIG_DEFERRED_STRUCT_PAGE_INIT:
> 
> before:
> memmap_init_reserved_pages()   1.87 seconds
> after:
> memmap_init_reserved_pages()   1.27 seconds
> 
> 32% time reduction.

These measurements should be part of the changelog. 
 
> If not define CONFIG_DEFERRED_STRUCT_PAGE_INIT:
> 
> early_pfn_to_nid() is called by few, 
> boot time didn't change.
> 
> 
> By the way, this machine has 190GB RAM.
> 
> > --
> > Sincerely yours,
> > Mike.

-- 
Sincerely yours,
Mike.

  reply	other threads:[~2023-06-15  6:21 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-14 11:03 [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid() Yajun Deng
2023-06-14 11:09 ` Greg KH
2023-06-14 11:28 ` Yajun Deng
2023-06-14 11:53   ` Mike Rapoport
2023-06-15  3:02   ` Yajun Deng
2023-06-15  6:20     ` Mike Rapoport [this message]
2023-06-15  6:36     ` Yajun Deng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230615062021.GI52412@kernel.org \
    --to=rppt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rafael@kernel.org \
    --cc=yajun.deng@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).