All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] x86/resctrl: fix an imbalance in domain_remove_cpu
@ 2019-12-11  3:30 Qian Cai
  2019-12-11 18:02 ` Reinette Chatre
  2019-12-30 18:42 ` [tip: x86/urgent] x86/resctrl: Fix an imbalance in domain_remove_cpu() tip-bot2 for Qian Cai
  0 siblings, 2 replies; 3+ messages in thread
From: Qian Cai @ 2019-12-11  3:30 UTC (permalink / raw)
  To: tglx, mingo, bp
  Cc: reinette.chatre, fenghua.yu, hpa, john.stultz, sboyd, tony.luck,
	tj, x86, linux-kernel, Qian Cai

A system that supports resource monitoring may have multiple resources
while not all of these resources are capable of monitoring. Monitoring
related state is initialized only for resources that are capable of
monitoring and correspondingly this state should subsequently only be
removed from these resources that are capable of monitoring.

domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
is true where it will initialize d->mbm_over. However,
domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
checking r->mon_capable resulting in an attempt to cancel d->mbm_over on
all resources, even those that never initialized d->mbm_over because
they are not capable of monitoring. Hence, it triggers a debugobjects
warning when offlining CPUs because those timer debugobjects are never
initialized.

ODEBUG: assert_init not available (active state 0) object type:
timer_list hint: 0x0
WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
debug_print_object+0xfe/0x140
Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS
I40 05/23/2018
RIP: 0010:debug_print_object+0xfe/0x140
Call Trace:
debug_object_assert_init+0x1f5/0x240
del_timer+0x6f/0xf0
try_to_grab_pending+0x42/0x3c0
cancel_delayed_work+0x7d/0x150
resctrl_offline_cpu+0x3c0/0x520
cpuhp_invoke_callback+0x197/0x1120
cpuhp_thread_fun+0x252/0x2f0
smpboot_thread_fn+0x255/0x440
kthread+0x1e6/0x210
ret_from_fork+0x3a/0x50

Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
Signed-off-by: Qian Cai <cai@lca.pw>
---

v2: update the commit log thanks to Reinette.

 arch/x86/kernel/cpu/resctrl/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 03eb90d00af0..89049b343c7a 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -618,7 +618,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		if (static_branch_unlikely(&rdt_mon_enable_key))
 			rmdir_mondata_subdir_allrdtgrp(r, d->id);
 		list_del(&d->list);
-		if (is_mbm_enabled())
+		if (r->mon_capable && is_mbm_enabled())
 			cancel_delayed_work(&d->mbm_over);
 		if (is_llc_occupancy_enabled() &&  has_busy_rmid(r, d)) {
 			/*
-- 
2.21.0 (Apple Git-122.2)


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] x86/resctrl: fix an imbalance in domain_remove_cpu
  2019-12-11  3:30 [PATCH v2] x86/resctrl: fix an imbalance in domain_remove_cpu Qian Cai
@ 2019-12-11 18:02 ` Reinette Chatre
  2019-12-30 18:42 ` [tip: x86/urgent] x86/resctrl: Fix an imbalance in domain_remove_cpu() tip-bot2 for Qian Cai
  1 sibling, 0 replies; 3+ messages in thread
From: Reinette Chatre @ 2019-12-11 18:02 UTC (permalink / raw)
  To: Qian Cai, tglx, mingo, bp
  Cc: fenghua.yu, hpa, john.stultz, sboyd, tony.luck, tj, x86, linux-kernel

Hi Qian,

On 12/10/2019 7:30 PM, Qian Cai wrote:
> A system that supports resource monitoring may have multiple resources
> while not all of these resources are capable of monitoring. Monitoring
> related state is initialized only for resources that are capable of
> monitoring and correspondingly this state should subsequently only be
> removed from these resources that are capable of monitoring.
> 
> domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
> is true where it will initialize d->mbm_over. However,
> domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
> checking r->mon_capable resulting in an attempt to cancel d->mbm_over on
> all resources, even those that never initialized d->mbm_over because
> they are not capable of monitoring. Hence, it triggers a debugobjects
> warning when offlining CPUs because those timer debugobjects are never
> initialized.
> 
> ODEBUG: assert_init not available (active state 0) object type:
> timer_list hint: 0x0
> WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
> debug_print_object+0xfe/0x140
> Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS
> I40 05/23/2018
> RIP: 0010:debug_print_object+0xfe/0x140
> Call Trace:
> debug_object_assert_init+0x1f5/0x240
> del_timer+0x6f/0xf0
> try_to_grab_pending+0x42/0x3c0
> cancel_delayed_work+0x7d/0x150
> resctrl_offline_cpu+0x3c0/0x520
> cpuhp_invoke_callback+0x197/0x1120
> cpuhp_thread_fun+0x252/0x2f0
> smpboot_thread_fn+0x255/0x440
> kthread+0x1e6/0x210
> ret_from_fork+0x3a/0x50
> 
> Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---

Thank you very much.

Acked-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [tip: x86/urgent] x86/resctrl: Fix an imbalance in domain_remove_cpu()
  2019-12-11  3:30 [PATCH v2] x86/resctrl: fix an imbalance in domain_remove_cpu Qian Cai
  2019-12-11 18:02 ` Reinette Chatre
@ 2019-12-30 18:42 ` tip-bot2 for Qian Cai
  1 sibling, 0 replies; 3+ messages in thread
From: tip-bot2 for Qian Cai @ 2019-12-30 18:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Qian Cai, Borislav Petkov, Reinette Chatre, Fenghua Yu,
	H. Peter Anvin, Ingo Molnar, john.stultz, sboyd, stable,
	Thomas Gleixner, tj, Tony Luck, Vikas Shivappa, x86-ml, LKML

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     e278af89f1ba0a9ef20947db6afc2c9afa37e85b
Gitweb:        https://git.kernel.org/tip/e278af89f1ba0a9ef20947db6afc2c9afa37e85b
Author:        Qian Cai <cai@lca.pw>
AuthorDate:    Tue, 10 Dec 2019 22:30:42 -05:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 30 Dec 2019 19:25:59 +01:00

x86/resctrl: Fix an imbalance in domain_remove_cpu()

A system that supports resource monitoring may have multiple resources
while not all of these resources are capable of monitoring. Monitoring
related state is initialized only for resources that are capable of
monitoring and correspondingly this state should subsequently only be
removed from these resources that are capable of monitoring.

domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
is true where it will initialize d->mbm_over. However,
domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
checking r->mon_capable resulting in an attempt to cancel d->mbm_over on
all resources, even those that never initialized d->mbm_over because
they are not capable of monitoring. Hence, it triggers a debugobjects
warning when offlining CPUs because those timer debugobjects are never
initialized:

  ODEBUG: assert_init not available (active state 0) object type:
  timer_list hint: 0x0
  WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
  debug_print_object
  Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS I40 05/23/2018
  RIP: 0010:debug_print_object
  Call Trace:
  debug_object_assert_init
  del_timer
  try_to_grab_pending
  cancel_delayed_work
  resctrl_offline_cpu
  cpuhp_invoke_callback
  cpuhp_thread_fun
  smpboot_thread_fn
  kthread
  ret_from_fork

Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: john.stultz@linaro.org
Cc: sboyd@kernel.org
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: tj@kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191211033042.2188-1-cai@lca.pw
---
 arch/x86/kernel/cpu/resctrl/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 03eb90d..89049b3 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -618,7 +618,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		if (static_branch_unlikely(&rdt_mon_enable_key))
 			rmdir_mondata_subdir_allrdtgrp(r, d->id);
 		list_del(&d->list);
-		if (is_mbm_enabled())
+		if (r->mon_capable && is_mbm_enabled())
 			cancel_delayed_work(&d->mbm_over);
 		if (is_llc_occupancy_enabled() &&  has_busy_rmid(r, d)) {
 			/*

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-12-30 18:42 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-11  3:30 [PATCH v2] x86/resctrl: fix an imbalance in domain_remove_cpu Qian Cai
2019-12-11 18:02 ` Reinette Chatre
2019-12-30 18:42 ` [tip: x86/urgent] x86/resctrl: Fix an imbalance in domain_remove_cpu() tip-bot2 for Qian Cai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.