LinuxPPC-Dev Archive on lore.kernel.org
From: Leonardo Bras <leobras.c@gmail.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Nathan Lynch <nathanl@linux.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Scott Cheloha <cheloha@linux.ibm.com>,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	Nicholas Piggin <npiggin@gmail.com>,
	Bharata B Rao <bharata@linux.ibm.com>,
	Paul Mackerras <paulus@samba.org>,
	Sandipan Das <sandipan@linux.ibm.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Laurent Dufour <ldufour@linux.ibm.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Mike Rapoport <rppt@kernel.org>
Subject: Re: [PATCH 2/3] powerpc/mm/hash: Avoid multiple HPT resize-ups on memory hotplug
Date: Thu, 08 Apr 2021 23:51:36 -0300
Message-ID: <e5c924479839815c025de29d650d82419b18c0dc.camel@gmail.com> (raw)
In-Reply-To: <YFhNd42RvobCV8tF@yekko.fritz.box>

Hello David, thanks for the feedback!

On Mon, 2021-03-22 at 18:55 +1100, David Gibson wrote:
> > +void hash_memory_batch_expand_prepare(unsigned long newsize)
> > +{
> > +	/*
> > +	 * Resizing-up HPT should never fail, but there are some cases system starts with higher
> > +	 * SHIFT than required, and we go through the funny case of resizing HPT down while
> > +	 * adding memory
> > +	 */
> > +
> > +	while (resize_hpt_for_hotplug(newsize, false) == -ENOSPC) {
> > +		newsize *= 2;
> > +		pr_warn("Hash collision while resizing HPT\n");
> 
> This unbounded increase in newsize makes me nervous - we should be
> bounded by the current size of the HPT at least.  In practice we
> should be fine, since the resize should always succeed by the time we
> reach our current HPT size, but that's far from obvious from this
> point in the code.

Sure, I will add bounds in v2.
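
One possible shape for such a bound, as a userspace sketch only and not
the v2 code: try_resize(), bounded_resize() and min_ok_shift are
hypothetical names, and working in shifts instead of byte sizes is an
assumption made to keep the example short. The retry loop stops once it
reaches the shift of the table already in place.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for one resize attempt: it fails with -ENOSPC
 * for any shift below min_ok_shift, mimicking a hash collision.
 */
static int try_resize(unsigned int shift, unsigned int min_ok_shift)
{
	return shift < min_ok_shift ? -ENOSPC : 0;
}

/*
 * Retry with the next larger shift, but never go past cur_shift, the
 * shift of the table already in place. Returns the shift that ends up
 * in use.
 */
static unsigned int bounded_resize(unsigned int target_shift,
				   unsigned int cur_shift,
				   unsigned int min_ok_shift)
{
	unsigned int shift;

	/* This sketch only models shrinking towards a smaller target. */
	assert(target_shift <= cur_shift);

	for (shift = target_shift; shift < cur_shift; shift++) {
		if (try_resize(shift, min_ok_shift) != -ENOSPC)
			return shift;
	}

	/* Every smaller size collided: keep the current table. */
	return cur_shift;
}
```

With a current shift of 24 and a target of 20, the loop tries 20, 21,
22 and 23 at most, then gives up and keeps the existing table.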

> 
> And... you're doubling newsize which is a value which might not be a
> power of 2.  I'm wondering if there's an edge case where this could
> actually cause us to skip the current size and erroneously resize to
> one bigger than we have currently.

I also thought that at first, but it turns out to be reliable.
Before using this value, htab_shift_for_mem_size() always rounds it up
to the next power of 2.

For example, any value from 0b0101 to 0b1000 is rounded up to 0b1000 for
the shift calculation. If we multiply by 2 (same as << 1), anything from
0b01010 to 0b10000 is rounded up to 0b10000, which is exactly one shift
higher.

So this works fine as long as we are multiplying.
Division could have the behavior you describe, as 0b0101 >> 1 becomes
0b010 and skips a shift.
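
The argument can be checked with a small userspace sketch. ceil_log2()
is a hypothetical helper that models only the round-up step; it is not
the kernel's htab_shift_for_mem_size(), which also accounts for page
size and a minimum table size.

```c
#include <assert.h>

/*
 * Hypothetical helper: smallest shift such that (1UL << shift) >= x.
 * Assumes x is non-zero and small enough that the loop cannot overflow.
 */
static unsigned int ceil_log2(unsigned long x)
{
	unsigned int shift = 0;

	assert(x != 0);
	while ((1UL << shift) < x)
		shift++;
	return shift;
}

/* Doubling before rounding raises the shift by exactly one. */
static int doubling_adds_one_shift(unsigned long x)
{
	return ceil_log2(x * 2) == ceil_log2(x) + 1;
}

/*
 * Halving with >> drops the low bit before rounding, so the shift may
 * fall by more than one. Assumes x >= 2.
 */
static unsigned int shifts_lost_by_halving(unsigned long x)
{
	return ceil_log2(x) - ceil_log2(x >> 1);
}
```

For 0b0101, doubling moves the shift from 3 to 4, while halving moves it
from 3 to 1, which is the skipped shift mentioned above.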
	
> > +void memory_batch_expand_prepare(unsigned long newsize)
> 
> This wrapper doesn't seem useful.

Yeah, it does little, but I can't call the hash_* functions directly
from hotplug-memory.c without knowing whether the system is using hash
page tables. (That is, in case the suggestion is to test for
disable_radix inside hash_memory_batch* instead.)

> 
> > +{
> > +	if (!radix_enabled())
> > +		hash_memory_batch_expand_prepare(newsize);
> > +}
> >  #endif /* CONFIG_MEMORY_HOTPLUG */
> >  
> > 
> > +	memory_batch_expand_prepare(memblock_phys_mem_size() +
> > +				     drmem_info->n_lmbs * drmem_lmb_size());
> 
> This doesn't look right.  memory_add_by_index() is adding a *single*
> LMB, I think using drmem_info->n_lmbs here means you're counting this
> as adding again as much memory as you already have hotplugged.

Yeah, my mistake. This makes sense.
I will change it to something like 
memblock_phys_mem_size() + drmem_lmb_size()
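
To make the difference concrete, here is a userspace sketch of the two
computations. The function names and the sizes in the usage note are
hypothetical and chosen only for illustration.

```c
#include <assert.h>

#define MIB (1ULL << 20)
#define GIB (1ULL << 30)

/* Target as computed in the posted patch: every LMB is counted again. */
static unsigned long long target_all_lmbs(unsigned long long cur_mem,
					  unsigned long long n_lmbs,
					  unsigned long long lmb_size)
{
	return cur_mem + n_lmbs * lmb_size;
}

/* Target when exactly one LMB is being added. */
static unsigned long long target_one_lmb(unsigned long long cur_mem,
					 unsigned long long lmb_size)
{
	return cur_mem + lmb_size;
}
```

Assuming 16 GiB already present as 64 LMBs of 256 MiB each, the first
form asks for a 32 GiB target, while the second asks for 16 GiB plus
256 MiB.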

> > 
> > +	memory_batch_expand_prepare(memblock_phys_mem_size() + lmbs_to_add * drmem_lmb_size());
> > +
> >  	for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
> >  		if (lmb->flags & DRCONF_MEM_ASSIGNED)
> >  			continue;
> 
> I don't see memory_batch_expand_prepare() suppressing any existing HPT
> resizes.  Won't this just resize to the right size for the full add,
> then resize several times again as we perform the add?  Or.. I guess
> that will be suppressed by patch 1/3. 

Correct.

>  That's seems kinda fragile, though.

What do you mean by fragile here?
What would you suggest doing differently?

Best regards,
Leonardo Bras



