* [PATCH] perf/x86/intel/cqm: Move WARN_ONs from intel_cqm_cpu_prepare to cqm_pick_event_reader
@ 2015-08-11 20:31 Yasuaki Ishimatsu
2015-08-12 11:00 ` Matt Fleming
0 siblings, 1 reply; 3+ messages in thread
From: Yasuaki Ishimatsu @ 2015-08-11 20:31 UTC (permalink / raw)
To: peterz; +Cc: linux-kernel, tglx, vikas.shivappa, kanaka.d.juvva, matt.fleming
When hot adding a CPU and onlining it, the following WARN_ON() messages
are shown:
[ 772.891448] ------------[ cut here ]------------
[ 772.896624] WARNING: CPU: 58 PID: 15169 at arch/x86/kernel/cpu/perf_event_intel_cqm.c:1268 intel_cqm_cpu_prepare+0x88/0x90()
[ 772.909167] Modules linked in:
[ 772.995134] CPU: 58 PID: 15169
[ 773.016633] 0000000000000000 0000000092fb60ed ffff88104febbba8 ffffffff8167b5fa
[ 773.024789] 0000000000000000 0000000000000000 ffff88104febbbe8 ffffffff810819ea
[ 773.033119] ffff88103be60000 ffff8c0fbc7ca020 ffffffff819fadf0 000000000000008f
[ 773.041461] Call Trace:
[ 773.044402] [<ffffffff8167b5fa>] dump_stack+0x45/0x57
[ 773.050160] [<ffffffff810819ea>] warn_slowpath_common+0x8a/0xc0
[ 773.056888] [<ffffffff81081b1a>] warn_slowpath_null+0x1a/0x20
[ 773.063426] [<ffffffff810365f8>] intel_cqm_cpu_prepare+0x88/0x90
[ 773.070253] [<ffffffff81036732>] intel_cqm_cpu_notifier+0x42/0x160
[ 773.077271] [<ffffffff810a0d3d>] notifier_call_chain+0x4d/0x80
[ 773.083901] [<ffffffff810a0e4e>] __raw_notifier_call_chain+0xe/0x10
[ 773.091007] [<ffffffff81081ef8>] _cpu_up+0xe8/0x190
[ 773.096555] [<ffffffff8108201a>] cpu_up+0x7a/0xa0
[ 773.101910] [<ffffffff816701b0>] cpu_subsys_online+0x40/0x90
[ 773.108332] [<ffffffff8143d777>] device_online+0x67/0x90
[ 773.114368] [<ffffffff8143d82a>] online_store+0x8a/0xa0
[ 773.120305] [<ffffffff8143aab8>] dev_attr_store+0x18/0x30
[ 773.126437] [<ffffffff8127224a>] sysfs_kf_write+0x3a/0x50
[ 773.132560] [<ffffffff812718d0>] kernfs_fop_write+0x120/0x170
[ 773.139078] [<ffffffff811f7657>] __vfs_write+0x37/0x100
[ 773.145019] [<ffffffff811fa398>] ? __sb_start_write+0x58/0x110
[ 773.151635] [<ffffffff8129d7ed>] ? security_file_permission+0x3d/0xc0
[ 773.158932] [<ffffffff811f7d59>] vfs_write+0xa9/0x190
[ 773.164674] [<ffffffff810234e6>] ? do_audit_syscall_entry+0x66/0x70
[ 773.171776] [<ffffffff811f8b55>] SyS_write+0x55/0xc0
[ 773.177423] [<ffffffff810672f0>] ? do_page_fault+0x30/0x80
[ 773.183654] [<ffffffff8168232e>] entry_SYSCALL_64_fastpath+0x12/0x71
[ 773.190843] ---[ end trace e6219d24386873bd ]---
[ 773.196573] smpboot: Booting Node 7 Processor 143 APIC 0x1f7
[ 773.221241] microcode: CPU143 sig=0x306f3, pf=0x80, revision=0x9
[ 773.228005] Will online and init hotplugged CPU: 143
Here is the root cause of the issue:
When intel_cqm_cpu_prepare() is called at the CPU_UP_PREPARE notification,
it checks that x86_cache_max_rmid is the same as cqm_max_rmid,
as follows:
  static void intel_cqm_cpu_prepare(unsigned int cpu)
  {
  ...
          WARN_ON(c->x86_cache_max_rmid != cqm_max_rmid);
But x86_cache_max_rmid of a hot-added CPU is not set yet, because it is
set in get_cpu_cap(), which is called after the CPU_UP_PREPARE notification.
So when onlining a hot-added CPU, the WARN_ON()s always fire.
To fix the issue, the patch moves WARN_ON()s from intel_cqm_cpu_prepare() to
cqm_pick_event_reader() which is called at CPU_STARTING notification.
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
CC: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
CC: Matt Fleming <matt.fleming@intel.com>
---
arch/x86/kernel/cpu/perf_event_intel_cqm.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
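The ordering problem described in the changelog can be illustrated with a small userspace model. This is only a sketch: every name prefixed with model_ or MODEL_ and the value 55 are hypothetical stand-ins, not kernel symbols, and the model assumes, as the changelog states, that the per-CPU value is filled in after CPU_UP_PREPARE and before CPU_STARTING.

```c
#include <stdio.h>

/* Userspace model only; none of these names are kernel APIs. */
#define MODEL_CQM_MAX_RMID 55	/* arbitrary stand-in for cqm_max_rmid */

struct model_cpu {
	int max_rmid;	/* stand-in for x86_cache_max_rmid; 0 until identified */
};

enum model_stage { MODEL_UP_PREPARE, MODEL_STARTING };

/* Returns 1 if the consistency check would warn, 0 otherwise. */
static int model_check(const struct model_cpu *c)
{
	if (c->max_rmid != MODEL_CQM_MAX_RMID) {
		printf("WARN: max_rmid %d != %d\n",
		       c->max_rmid, MODEL_CQM_MAX_RMID);
		return 1;
	}
	return 0;
}

/* Stand-in for get_cpu_cap(): fills in the per-CPU value. */
static void model_identify(struct model_cpu *c)
{
	c->max_rmid = MODEL_CQM_MAX_RMID;
}

/*
 * Bring a hot-added CPU online, running the check at the given stage.
 * Returns the number of warnings that fired.
 */
static int model_cpu_up(enum model_stage check_at)
{
	struct model_cpu c = { .max_rmid = 0 };	/* hot-added: not identified */
	int warnings = 0;

	/* UP_PREPARE stage: runs before the CPU has been identified. */
	if (check_at == MODEL_UP_PREPARE)
		warnings += model_check(&c);

	model_identify(&c);

	/* STARTING stage: runs after identification. */
	if (check_at == MODEL_STARTING)
		warnings += model_check(&c);

	return warnings;
}
```

In this model, model_cpu_up(MODEL_UP_PREPARE) reports one warning and model_cpu_up(MODEL_STARTING) reports none, which mirrors the behaviour the patch aims for.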
diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index 63eb68b..6196d3e 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -1244,9 +1244,13 @@ static struct pmu intel_cqm_pmu = {
 
 static inline void cqm_pick_event_reader(int cpu)
 {
+	struct cpuinfo_x86 *c = &cpu_data(cpu);
 	int phys_id = topology_physical_package_id(cpu);
 	int i;
 
+	WARN_ON(c->x86_cache_max_rmid != cqm_max_rmid);
+	WARN_ON(c->x86_cache_occ_scale != cqm_l3_scale);
+
 	for_each_cpu(i, &cqm_cpumask) {
 		if (phys_id == topology_physical_package_id(i))
 			return;	/* already got reader for this socket */
@@ -1258,14 +1262,10 @@ static inline void cqm_pick_event_reader(int cpu)
 static void intel_cqm_cpu_prepare(unsigned int cpu)
 {
 	struct intel_pqr_state *state = &per_cpu(pqr_state, cpu);
-	struct cpuinfo_x86 *c = &cpu_data(cpu);
 
 	state->rmid = 0;
 	state->closid = 0;
 	state->rmid_usecnt = 0;
-
-	WARN_ON(c->x86_cache_max_rmid != cqm_max_rmid);
-	WARN_ON(c->x86_cache_occ_scale != cqm_l3_scale);
 }
 
 static void intel_cqm_cpu_exit(unsigned int cpu)
--
1.8.3.1
* Re: [PATCH] perf/x86/intel/cqm: Move WARN_ONs from intel_cqm_cpu_prepare to cqm_pick_event_reader
From: Matt Fleming @ 2015-08-12 11:00 UTC (permalink / raw)
To: Yasuaki Ishimatsu
Cc: peterz, linux-kernel, tglx, vikas.shivappa, kanaka.d.juvva, matt.fleming
On Tue, 11 Aug, at 01:31:05PM, Yasuaki Ishimatsu wrote:
> When hot adding a CPU and onlining it, the following WARN_ON() messages
> are shown:
>
> [ 772.891448] ------------[ cut here ]------------
> [ 772.896624] WARNING: CPU: 58 PID: 15169 at arch/x86/kernel/cpu/perf_event_intel_cqm.c:1268 intel_cqm_cpu_prepare+0x88/0x90()
> [ 772.909167] Modules linked in:
> [ 772.995134] CPU: 58 PID: 15169
> [ 773.016633] 0000000000000000 0000000092fb60ed ffff88104febbba8 ffffffff8167b5fa
> [ 773.024789] 0000000000000000 0000000000000000 ffff88104febbbe8 ffffffff810819ea
> [ 773.033119] ffff88103be60000 ffff8c0fbc7ca020 ffffffff819fadf0 000000000000008f
> [ 773.041461] Call Trace:
> [ 773.044402] [<ffffffff8167b5fa>] dump_stack+0x45/0x57
> [ 773.050160] [<ffffffff810819ea>] warn_slowpath_common+0x8a/0xc0
> [ 773.056888] [<ffffffff81081b1a>] warn_slowpath_null+0x1a/0x20
> [ 773.063426] [<ffffffff810365f8>] intel_cqm_cpu_prepare+0x88/0x90
> [ 773.070253] [<ffffffff81036732>] intel_cqm_cpu_notifier+0x42/0x160
> [ 773.077271] [<ffffffff810a0d3d>] notifier_call_chain+0x4d/0x80
> [ 773.083901] [<ffffffff810a0e4e>] __raw_notifier_call_chain+0xe/0x10
> [ 773.091007] [<ffffffff81081ef8>] _cpu_up+0xe8/0x190
> [ 773.096555] [<ffffffff8108201a>] cpu_up+0x7a/0xa0
> [ 773.101910] [<ffffffff816701b0>] cpu_subsys_online+0x40/0x90
> [ 773.108332] [<ffffffff8143d777>] device_online+0x67/0x90
> [ 773.114368] [<ffffffff8143d82a>] online_store+0x8a/0xa0
> [ 773.120305] [<ffffffff8143aab8>] dev_attr_store+0x18/0x30
> [ 773.126437] [<ffffffff8127224a>] sysfs_kf_write+0x3a/0x50
> [ 773.132560] [<ffffffff812718d0>] kernfs_fop_write+0x120/0x170
> [ 773.139078] [<ffffffff811f7657>] __vfs_write+0x37/0x100
> [ 773.145019] [<ffffffff811fa398>] ? __sb_start_write+0x58/0x110
> [ 773.151635] [<ffffffff8129d7ed>] ? security_file_permission+0x3d/0xc0
> [ 773.158932] [<ffffffff811f7d59>] vfs_write+0xa9/0x190
> [ 773.164674] [<ffffffff810234e6>] ? do_audit_syscall_entry+0x66/0x70
> [ 773.171776] [<ffffffff811f8b55>] SyS_write+0x55/0xc0
> [ 773.177423] [<ffffffff810672f0>] ? do_page_fault+0x30/0x80
> [ 773.183654] [<ffffffff8168232e>] entry_SYSCALL_64_fastpath+0x12/0x71
> [ 773.190843] ---[ end trace e6219d24386873bd ]---
> [ 773.196573] smpboot: Booting Node 7 Processor 143 APIC 0x1f7
> [ 773.221241] microcode: CPU143 sig=0x306f3, pf=0x80, revision=0x9
> [ 773.228005] Will online and init hotplugged CPU: 143
>
> Here is the root cause of the issue:
> When intel_cqm_cpu_prepare() is called at the CPU_UP_PREPARE notification,
> it checks that x86_cache_max_rmid is the same as cqm_max_rmid,
> as follows:
>
> static void intel_cqm_cpu_prepare(unsigned int cpu)
> {
> ...
> WARN_ON(c->x86_cache_max_rmid != cqm_max_rmid);
>
> But x86_cache_max_rmid of a hot-added CPU is not set yet, because it is
> set in get_cpu_cap(), which is called after the CPU_UP_PREPARE notification.
>
> So when onlining a hot-added CPU, the WARN_ON()s always fire.
Thanks for the report! I submitted the following patch on August 6th to
fix this issue,
https://lkml.kernel.org/r/1438863163-14083-1-git-send-email-matt@codeblueprint.co.uk
> To fix the issue, the patch moves WARN_ON()s from intel_cqm_cpu_prepare() to
> cqm_pick_event_reader() which is called at CPU_STARTING notification.
I took a different approach and moved the notifier from CPU_UP_PREPARE
to CPU_STARTING since there's no reason to call the handler as early as
CPU_UP_PREPARE time at all.
Yes, we could collapse cqm_pick_event_reader() and
intel_cqm_cpu_prepare() together but I have a patch to add a
CPU_DOWN_FAILED call that invokes cqm_pick_event_reader() but not
intel_cqm_cpu_prepare(). I actually need to mail that out.
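To make the reason for keeping the two helpers separate concrete, here is a minimal userspace sketch of the dispatch described above. It is an illustration under assumptions, not the actual patch: the model_ and MODEL_ names are hypothetical, and the handler bodies only count calls instead of touching any real per-CPU state.

```c
/* Userspace model only; the enum and struct below are hypothetical. */
enum model_action {
	MODEL_CPU_STARTING,
	MODEL_CPU_DOWN_FAILED,
	MODEL_CPU_OTHER
};

struct model_calls {
	int prepare;		/* times the per-CPU state reset ran */
	int pick_reader;	/* times reader selection ran */
};

/* Stand-in for intel_cqm_cpu_prepare(): resets per-CPU state. */
static void model_cpu_prepare(struct model_calls *n)
{
	n->prepare++;
}

/* Stand-in for cqm_pick_event_reader(): picks a reader per socket. */
static void model_pick_event_reader(struct model_calls *n)
{
	n->pick_reader++;
}

/*
 * Dispatch on the hotplug action. A CPU that is starting needs both
 * steps; a CPU whose offlining failed only needs a reader picked again,
 * so its per-CPU state is left alone.
 */
static void model_notifier(enum model_action action, struct model_calls *n)
{
	switch (action) {
	case MODEL_CPU_STARTING:
		model_cpu_prepare(n);
		model_pick_event_reader(n);
		break;
	case MODEL_CPU_DOWN_FAILED:
		model_pick_event_reader(n);
		break;
	default:
		break;
	}
}
```

If the two helpers were collapsed into one, the MODEL_CPU_DOWN_FAILED case in this sketch could not skip the state reset.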
--
Matt Fleming, Intel Open Source Technology Center
* Re: [PATCH] perf/x86/intel/cqm: Move WARN_ONs from intel_cqm_cpu_prepare to cqm_pick_event_reader
From: Yasuaki Ishimatsu @ 2015-08-12 14:40 UTC (permalink / raw)
To: Matt Fleming
Cc: peterz, linux-kernel, tglx, vikas.shivappa, kanaka.d.juvva, matt.fleming
Hi Matt,
Thank you for the information.
Your patch looks good to me.
Thanks,
Yasuaki Ishimatsu
On Wed, 12 Aug 2015 12:00:25 +0100
Matt Fleming <matt@codeblueprint.co.uk> wrote:
> On Tue, 11 Aug, at 01:31:05PM, Yasuaki Ishimatsu wrote:
> > When hot adding a CPU and onlining it, the following WARN_ON() messages
> > are shown:
> >
> > [ 772.891448] ------------[ cut here ]------------
> > [ 772.896624] WARNING: CPU: 58 PID: 15169 at arch/x86/kernel/cpu/perf_event_intel_cqm.c:1268 intel_cqm_cpu_prepare+0x88/0x90()
> > [ 772.909167] Modules linked in:
> > [ 772.995134] CPU: 58 PID: 15169
> > [ 773.016633] 0000000000000000 0000000092fb60ed ffff88104febbba8 ffffffff8167b5fa
> > [ 773.024789] 0000000000000000 0000000000000000 ffff88104febbbe8 ffffffff810819ea
> > [ 773.033119] ffff88103be60000 ffff8c0fbc7ca020 ffffffff819fadf0 000000000000008f
> > [ 773.041461] Call Trace:
> > [ 773.044402] [<ffffffff8167b5fa>] dump_stack+0x45/0x57
> > [ 773.050160] [<ffffffff810819ea>] warn_slowpath_common+0x8a/0xc0
> > [ 773.056888] [<ffffffff81081b1a>] warn_slowpath_null+0x1a/0x20
> > [ 773.063426] [<ffffffff810365f8>] intel_cqm_cpu_prepare+0x88/0x90
> > [ 773.070253] [<ffffffff81036732>] intel_cqm_cpu_notifier+0x42/0x160
> > [ 773.077271] [<ffffffff810a0d3d>] notifier_call_chain+0x4d/0x80
> > [ 773.083901] [<ffffffff810a0e4e>] __raw_notifier_call_chain+0xe/0x10
> > [ 773.091007] [<ffffffff81081ef8>] _cpu_up+0xe8/0x190
> > [ 773.096555] [<ffffffff8108201a>] cpu_up+0x7a/0xa0
> > [ 773.101910] [<ffffffff816701b0>] cpu_subsys_online+0x40/0x90
> > [ 773.108332] [<ffffffff8143d777>] device_online+0x67/0x90
> > [ 773.114368] [<ffffffff8143d82a>] online_store+0x8a/0xa0
> > [ 773.120305] [<ffffffff8143aab8>] dev_attr_store+0x18/0x30
> > [ 773.126437] [<ffffffff8127224a>] sysfs_kf_write+0x3a/0x50
> > [ 773.132560] [<ffffffff812718d0>] kernfs_fop_write+0x120/0x170
> > [ 773.139078] [<ffffffff811f7657>] __vfs_write+0x37/0x100
> > [ 773.145019] [<ffffffff811fa398>] ? __sb_start_write+0x58/0x110
> > [ 773.151635] [<ffffffff8129d7ed>] ? security_file_permission+0x3d/0xc0
> > [ 773.158932] [<ffffffff811f7d59>] vfs_write+0xa9/0x190
> > [ 773.164674] [<ffffffff810234e6>] ? do_audit_syscall_entry+0x66/0x70
> > [ 773.171776] [<ffffffff811f8b55>] SyS_write+0x55/0xc0
> > [ 773.177423] [<ffffffff810672f0>] ? do_page_fault+0x30/0x80
> > [ 773.183654] [<ffffffff8168232e>] entry_SYSCALL_64_fastpath+0x12/0x71
> > [ 773.190843] ---[ end trace e6219d24386873bd ]---
> > [ 773.196573] smpboot: Booting Node 7 Processor 143 APIC 0x1f7
> > [ 773.221241] microcode: CPU143 sig=0x306f3, pf=0x80, revision=0x9
> > [ 773.228005] Will online and init hotplugged CPU: 143
> >
> > Here is the root cause of the issue:
> > When intel_cqm_cpu_prepare() is called at the CPU_UP_PREPARE notification,
> > it checks that x86_cache_max_rmid is the same as cqm_max_rmid,
> > as follows:
> >
> > static void intel_cqm_cpu_prepare(unsigned int cpu)
> > {
> > ...
> > WARN_ON(c->x86_cache_max_rmid != cqm_max_rmid);
> >
> > But x86_cache_max_rmid of a hot-added CPU is not set yet, because it is
> > set in get_cpu_cap(), which is called after the CPU_UP_PREPARE notification.
> >
> > So when onlining a hot-added CPU, the WARN_ON()s always fire.
>
> Thanks for the report! I submitted the following patch on August 6th to
> fix this issue,
>
> https://lkml.kernel.org/r/1438863163-14083-1-git-send-email-matt@codeblueprint.co.uk
>
> > To fix the issue, the patch moves WARN_ON()s from intel_cqm_cpu_prepare() to
> > cqm_pick_event_reader() which is called at CPU_STARTING notification.
>
> I took a different approach and moved the notifier from CPU_UP_PREPARE
> to CPU_STARTING since there's no reason to call the handler as early as
> CPU_UP_PREPARE time at all.
>
> Yes, we could collapse cqm_pick_event_reader() and
> intel_cqm_cpu_prepare() together but I have a patch to add a
> CPU_DOWN_FAILED call that invokes cqm_pick_event_reader() but not
> intel_cqm_cpu_prepare(). I actually need to mail that out.
>
> --
> Matt Fleming, Intel Open Source Technology Center