On Wed, Jun 09, 2021 at 02:51:49AM -0300, Leonardo Brás wrote: > On Wed, 2021-06-09 at 14:40 +1000, David Gibson wrote: > > On Tue, Jun 08, 2021 at 09:52:10PM -0300, Leonardo Brás wrote: > > > On Mon, 2021-06-07 at 15:02 +1000, David Gibson wrote: > > > > On Fri, Apr 30, 2021 at 11:36:06AM -0300, Leonardo Bras wrote: > > > > > Because hypervisors may need to create HPTs without knowing the > > > > > guest > > > > > page size, the smallest used page-size (4k) may be chosen, > > > > > resulting in > > > > > a HPT that is possibly bigger than needed. > > > > > > > > > > On a guest with bigger page-sizes, the amount of entries for > > > > > HTP > > > > > may be > > > > > too high, causing the guest to ask for a HPT resize-down on the > > > > > first > > > > > hotplug. > > > > > > > > > > This becomes a problem when HPT resize-down fails, and causes > > > > > the > > > > > HPT resize to be performed on every LMB added, until HPT size > > > > > is > > > > > compatible to guest memory size, causing a major slowdown. > > > > > > > > > > So, avoiding HPT resizing-down on hot-add significantly > > > > > improves > > > > > memory > > > > > hotplug times. > > > > > > > > > > As an example, hotplugging 256GB on a 129GB guest took 710s > > > > > without > > > > > this > > > > > patch, and 21s after applied. > > > > > > > > > > Signed-off-by: Leonardo Bras > > > > > > > > Sorry it's taken me so long to look at these > > > > > > > > I don't love the extra statefulness that the 'shrinking' > > > > parameter > > > > adds, but I can't see an elegant way to avoid it, so: > > > > > > > > Reviewed-by: David Gibson > > > > > > np, thanks for reviewing! > > > > Actually... I take that back.  With the subsequent patches my > > discomfort with the complexity of implementing the batching grew. > > > > I think I can see a simpler way - although it wasn't as clear as I > > thought it might be, without some deep history on this feature. > > > > What's going on here is pretty hard to follow, because it starts in > > arch-specific code (arch/powerpc/platforms/pseries/hotplug-memory.c) > > where it processes the add/remove requests, then goes into generic > > code __add_memory() which eventually emerges back in arch specific > > code (hash__create_section_mapping()). > > > > The HPT resizing calls are in the "inner" arch specific section, > > whereas it's only the outer arch section that has the information to > > batch properly.  The mutex and 'shrinking' parameter in Leonardo's > > code are all about conveying information from the outer to inner > > section. > > > > Now, I think the reason I had the resize calls in the inner section > > was to accomodate the notion that a) pHyp might support resizing in > > future, and it could come in through a different path with its drmgr > > thingy and/or b) bare metal hash architectures might want to > > implement > > hash resizing, and this would make at least part of the path common. > > > > Given the decreasing relevance of hash MMUs, I think we can now > > safely > > say neither of these is ever going to happen. > > > > Therefore, we can simplify things by moving the HPT resize calls into > > the pseries LMB code, instead of create/remove_section_mapping.  Then > > to do batching without extra complications we just need this logic > > for > > all resizes (both add and remove): > > > >         let new_hpt_order = expected HPT size for new mem size; > > > >         if (new_hpt_order > current_hpt_order) > >                 resize to new_hpt_order > > > >         add/remove memory > > > >         if (new_hpt_order < current_hpt_order - 1) > >                 resize to new_hpt_order > > > > > > > Ok, that really does seem to simplify a lot the batching. > > Question: > by LMB code, you mean dlpar_memory_{add,remove}_by_* ? > (dealing only with dlpar_{add,remove}_lmb() would not be enough to deal > with batching) I was thinking of a two stage process. First moving the resizes to dlpar_{add,remote}_lmb() (not changing behaviour for the pseries dlpar path), then implementing the batching by moving to the {add,remove}_by functions. But.. > In my 3/3 repĺy I sent you some other examples of functions that > currently end up calling resize_hpt_for_hotplug() without comming from  > hotplug-memory.c. Is that ok that they do not call it anymore? ..as I replied there, I was wrong about it being safe to move the resizes all to the pseries dlpar code. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson