On Tue, 21 Mar 2023, Ilpo Järvinen wrote:
> On Mon, 20 Mar 2023, James Morse wrote:
>
> > When a CPU is taken offline resctrl may need to move the overflow or
> > limbo handlers to run on a different CPU.
> >
> > Once the offline callbacks have been split, cqm_setup_limbo_handler()
> > will be called while the CPU that is going offline is still present
> > in the cpu_mask.
> >
> > Pass the CPU to exclude to cqm_setup_limbo_handler() and
> > mbm_setup_overflow_handler(). These functions can use a variant of
> > cpumask_any_but() when selecting the CPU. -1 is used to indicate no CPUs
> > need excluding.
> >
> > Tested-by: Shaopeng Tan
> > Signed-off-by: James Morse
> > ---
> > Changes since v2:
> >  * Rephrased a comment to avoid a two letter bad-word. (we)
> >  * Avoid assigning mbm_work_cpu if the domain is going to be free()d
> >  * Added cpumask_any_housekeeping_but(), I dislike the name
> > ---
> >  arch/x86/kernel/cpu/resctrl/core.c     |  8 +++--
> >  arch/x86/kernel/cpu/resctrl/internal.h | 37 ++++++++++++++++++++--
> >  arch/x86/kernel/cpu/resctrl/monitor.c  | 43 +++++++++++++++++++++-----
> >  arch/x86/kernel/cpu/resctrl/rdtgroup.c |  6 ++--
> >  include/linux/resctrl.h                |  3 ++
> >  5 files changed, 83 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> > index 8e25ea49372e..aafe4b74587c 100644
> > --- a/arch/x86/kernel/cpu/resctrl/core.c
> > +++ b/arch/x86/kernel/cpu/resctrl/core.c
> > @@ -582,12 +582,16 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
> >  	if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) {
> >  		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
> >  			cancel_delayed_work(&d->mbm_over);
> > -			mbm_setup_overflow_handler(d, 0);
> > +			/*
> > +			 * exclude_cpu=-1 as this CPU has already been removed
> > +			 * by cpumask_clear_cpu()d
> > +			 */
> > +			mbm_setup_overflow_handler(d, 0, RESCTRL_PICK_ANY_CPU);
> >  		}
> >  		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
> >  		    has_busy_rmid(r, d)) {
> >  			cancel_delayed_work(&d->cqm_limbo);
> > -			cqm_setup_limbo_handler(d, 0);
> > +			cqm_setup_limbo_handler(d, 0, RESCTRL_PICK_ANY_CPU);
> >  		}
> >  	}
> >  }
> > diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> > index 3eb5b307b809..47838ba6876e 100644
> > --- a/arch/x86/kernel/cpu/resctrl/internal.h
> > +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> > @@ -78,6 +78,37 @@ static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask)
> >  	return cpu;
> >  }
> >
> > +/**
> > + * cpumask_any_housekeeping_but() - Chose any cpu in @mask, preferring those
> > + *			            that aren't marked nohz_full, excluding
> > + *				    the provided CPU
> > + * @mask:	The mask to pick a CPU from.
> > + * @exclude_cpu:The CPU to avoid picking.
> > + *
> > + * Returns a CPU from @mask, but not @but. If there are houskeeping CPUs that
> > + * don't use nohz_full, these are preferred.
> > + * Returns >= nr_cpu_ids if no CPUs are available.
> > + */
> > +static inline unsigned int
> > +cpumask_any_housekeeping_but(const struct cpumask *mask, int exclude_cpu)
> > +{
> > +	int cpu, hk_cpu;
> > +
> > +	cpu = cpumask_any_but(mask, exclude_cpu);
> > +	if (tick_nohz_full_cpu(cpu)) {
> > +		hk_cpu = cpumask_nth_andnot(0, mask, tick_nohz_full_mask);
> > +		if (hk_cpu == exclude_cpu) {
> > +			hk_cpu = cpumask_nth_andnot(1, mask,
> > +						    tick_nohz_full_mask);
>
> I'm left to wonder if it's okay to alter tick_nohz_full_mask in resctrl
> code??

I suppose it should do instead:

	hk_cpu = cpumask_nth_and(0, mask, tick_nohz_full_mask);
	if (hk_cpu == exclude_cpu)
		hk_cpu = cpumask_next_and(hk_cpu, mask, tick_nohz_full_mask);

-- 
 i.
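
For readers following the thread, here is a minimal userspace sketch of the
selection behaviour that the quoted kernel-doc comment describes: pick a CPU
from a mask, never the excluded one, and prefer CPUs that are not nohz_full.
It is not the kernel code and does not use the cpumask API. The 64-bit
bitmask representation and the names NR_CPUS_MODEL, PICK_ANY_CPU, first_cpu()
and pick_housekeeping_but() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL 64	/* hypothetical: one bit per CPU in a uint64_t */
#define PICK_ANY_CPU  (-1)	/* stands in for "no CPU needs excluding" */

/* Return the lowest-numbered CPU set in mask, or NR_CPUS_MODEL if none. */
static unsigned int first_cpu(uint64_t mask)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPUS_MODEL;
}

/*
 * Pick a CPU from mask, never exclude_cpu, preferring CPUs that are not
 * set in nohz_full. Returns NR_CPUS_MODEL if no CPU is available.
 */
static unsigned int pick_housekeeping_but(uint64_t mask, uint64_t nohz_full,
					  int exclude_cpu)
{
	unsigned int cpu;

	assert(exclude_cpu < NR_CPUS_MODEL);

	/* Drop the excluded CPU up front so neither pass can return it. */
	if (exclude_cpu >= 0)
		mask &= ~(UINT64_C(1) << exclude_cpu);

	/* First pass: housekeeping CPUs only (in mask, not nohz_full). */
	cpu = first_cpu(mask & ~nohz_full);
	if (cpu < NR_CPUS_MODEL)
		return cpu;

	/* Fallback: any remaining CPU, even a nohz_full one. */
	return first_cpu(mask);
}
```

Clearing the excluded CPU from a local copy of the mask before searching
means there is no "first match was the excluded CPU, try the second" step,
and the caller's masks are never modified.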