From: Steven Rostedt <rostedt@goodmis.org>
To: Joerg Roedel <jroedel@suse.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Borislav Petkov <bp@alien8.de>,
Andrew Morton <akpm@linux-foundation.org>,
Shile Zhang <shile.zhang@linux.alibaba.com>,
Andy Lutomirski <luto@amacapital.net>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
Subject: Re: [PATCH] percpu: Sync vmalloc mappings in pcpu_alloc() and free_percpu()
Date: Mon, 4 May 2020 13:40:42 -0400
Message-ID: <20200504134042.178409c3@gandalf.local.home>
In-Reply-To: <20200504151236.GI8135@suse.de>
On Mon, 4 May 2020 17:12:36 +0200
Joerg Roedel <jroedel@suse.de> wrote:
> On Thu, Apr 30, 2020 at 10:39:19PM -0400, Steven Rostedt wrote:
> > What's so damn special about alloc_percpu()? It's definitely not a fast
> > path. And it's not used often.
>
> Okay, I fixed it in the percpu code. It is definitely not a nice
> solution, but having to call vmalloc_sync_mappings/unmappings() is not a
> nice solution at any place in the code. Here is the patch which fixes
> this issue for me. I am also not sure what to put in the Fixes tag, as
> it is related to tracing code accessing per-cpu data from the page-fault
> handler, not sure when this got introduced. Maybe someone else can
> provide a meaningful Fixes- or stable tag.
>
> I also have an idea in mind how to make this all more robust and get rid
> of the vmalloc_sync_mappings/unmappings() interface, will show more when
> I know it works the way I think it does.
>
>
Seems that your patch caused a lockdep splat on my box:
========================================================
WARNING: possible irq lock inversion dependency detected
5.7.0-rc3-test+ #249 Not tainted
--------------------------------------------------------
swapper/4/0 just changed the state of lock:
ffff9a580fdd75a0 (&ndev->lock){++.-}-{2:2}, at: mld_ifc_timer_expire+0x3c/0x350
but this lock took another, SOFTIRQ-unsafe lock in the past:
(pgd_lock){+.+.}-{2:2}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(pgd_lock);
                               local_irq_disable();
                               lock(&ndev->lock);
                               lock(pgd_lock);
  <Interrupt>
    lock(&ndev->lock);
*** DEADLOCK ***
1 lock held by swapper/4/0:
#0: ffff9a581ab05e70 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x5/0x2f0
the shortest dependencies between 2nd lock and 1st lock:
-> (pgd_lock){+.+.}-{2:2} {
HARDIRQ-ON-W at:
lock_acquire+0xda/0x3d0
_raw_spin_lock+0x2f/0x40
sync_global_pgds_l4+0x77/0x180
pcpu_alloc+0x1fd/0x7b0
__kmem_cache_create+0x358/0x540
create_cache+0xe1/0x1f0
kmem_cache_create_usercopy+0x1a5/0x270
kmem_cache_create+0x12/0x20
acpi_os_create_cache+0x18/0x30
acpi_ut_create_caches+0x47/0xab
acpi_ut_init_globals+0xa/0x21a
acpi_initialize_subsystem+0x30/0xa5
acpi_early_init+0x62/0xd6
start_kernel+0x797/0x86a
secondary_startup_64+0xa4/0xb0
SOFTIRQ-ON-W at:
lock_acquire+0xda/0x3d0
_raw_spin_lock+0x2f/0x40
sync_global_pgds_l4+0x77/0x180
pcpu_alloc+0x1fd/0x7b0
__kmem_cache_create+0x358/0x540
create_cache+0xe1/0x1f0
kmem_cache_create_usercopy+0x1a5/0x270
kmem_cache_create+0x12/0x20
acpi_os_create_cache+0x18/0x30
acpi_ut_create_caches+0x47/0xab
acpi_ut_init_globals+0xa/0x21a
acpi_initialize_subsystem+0x30/0xa5
acpi_early_init+0x62/0xd6
start_kernel+0x797/0x86a
secondary_startup_64+0xa4/0xb0
INITIAL USE at:
}
... key at: [<ffffffffb96340b8>] pgd_lock+0x18/0x40
... acquired at:
_raw_spin_lock+0x2f/0x40
sync_global_pgds_l4+0x77/0x180
pcpu_alloc+0x1fd/0x7b0
fib_nh_common_init+0x53/0x110
fib6_nh_init+0x10c/0x700
ip6_route_info_create+0x344/0x440
ip6_route_add+0x18/0x90
addrconf_prefix_route.isra.48+0x17b/0x210
addrconf_notify+0x743/0x8c0
notifier_call_chain+0x47/0x70
__dev_notify_flags+0x9d/0x150
dev_change_flags+0x48/0x60
do_setlink+0x39d/0x1080
rtnl_setlink+0x116/0x190
rtnetlink_rcv_msg+0x188/0x4b0
netlink_rcv_skb+0x75/0x140
netlink_unicast+0x1ae/0x280
netlink_sendmsg+0x253/0x490
sock_sendmsg+0x5b/0x60
__sys_sendto+0x12c/0x190
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x60/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xb3
-> (&ndev->lock){++.-}-{2:2} {
HARDIRQ-ON-W at:
lock_acquire+0xda/0x3d0
_raw_write_lock_bh+0x34/0x40
ipv6_mc_init_dev+0x19/0xc0
ipv6_add_dev+0x2e5/0x490
addrconf_init+0x7f/0x250
inet6_init+0x1c3/0x373
do_one_initcall+0x70/0x340
kernel_init_freeable+0x249/0x2ca
kernel_init+0xa/0x10a
ret_from_fork+0x3a/0x50
HARDIRQ-ON-R at:
lock_acquire+0xda/0x3d0
_raw_read_lock_bh+0x37/0x50
addrconf_dad_work+0xc6/0x560
process_one_work+0x25e/0x5c0
worker_thread+0x30/0x380
kthread+0x139/0x160
ret_from_fork+0x3a/0x50
IN-SOFTIRQ-R at:
lock_acquire+0xda/0x3d0
_raw_read_lock_bh+0x37/0x50
mld_ifc_timer_expire+0x3c/0x350
call_timer_fn+0xa5/0x2f0
run_timer_softirq+0x1dd/0x580
__do_softirq+0xf8/0x4be
irq_exit+0xf1/0x100
smp_apic_timer_interrupt+0xd0/0x2a0
apic_timer_interrupt+0xf/0x20
cpuidle_enter_state+0xcd/0x440
cpuidle_enter+0x29/0x40
do_idle+0x24a/0x290
cpu_startup_entry+0x19/0x20
start_secondary+0x195/0x1e0
secondary_startup_64+0xa4/0xb0
INITIAL USE at:
lock_acquire+0xda/0x3d0
_raw_write_lock_bh+0x34/0x40
ipv6_mc_init_dev+0x19/0xc0
ipv6_add_dev+0x2e5/0x490
addrconf_init+0x7f/0x250
inet6_init+0x1c3/0x373
do_one_initcall+0x70/0x340
kernel_init_freeable+0x249/0x2ca
kernel_init+0xa/0x10a
ret_from_fork+0x3a/0x50
}
... key at: [<ffffffffbaf727f0>] __key.78650+0x0/0x10
... acquired at:
mark_lock+0x22e/0x740
__lock_acquire+0x9e1/0x1c30
lock_acquire+0xda/0x3d0
_raw_read_lock_bh+0x37/0x50
mld_ifc_timer_expire+0x3c/0x350
call_timer_fn+0xa5/0x2f0
run_timer_softirq+0x1dd/0x580
__do_softirq+0xf8/0x4be
irq_exit+0xf1/0x100
smp_apic_timer_interrupt+0xd0/0x2a0
apic_timer_interrupt+0xf/0x20
cpuidle_enter_state+0xcd/0x440
cpuidle_enter+0x29/0x40
do_idle+0x24a/0x290
cpu_startup_entry+0x19/0x20
start_secondary+0x195/0x1e0
secondary_startup_64+0xa4/0xb0
stack backtrace:
CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.7.0-rc3-test+ #249
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
Call Trace:
<IRQ>
dump_stack+0x8f/0xd0
check_usage_forwards.cold.61+0x1e/0x27
mark_lock+0x22e/0x740
? check_usage_backwards+0x1e0/0x1e0
__lock_acquire+0x9e1/0x1c30
lock_acquire+0xda/0x3d0
? mld_ifc_timer_expire+0x3c/0x350
? mld_dad_timer_expire+0xb0/0xb0
? mld_dad_timer_expire+0xb0/0xb0
_raw_read_lock_bh+0x37/0x50
? mld_ifc_timer_expire+0x3c/0x350
mld_ifc_timer_expire+0x3c/0x350
? mld_dad_timer_expire+0xb0/0xb0
? mld_dad_timer_expire+0xb0/0xb0
call_timer_fn+0xa5/0x2f0
? mld_dad_timer_expire+0xb0/0xb0
run_timer_softirq+0x1dd/0x580
__do_softirq+0xf8/0x4be
irq_exit+0xf1/0x100
smp_apic_timer_interrupt+0xd0/0x2a0
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:cpuidle_enter_state+0xcd/0x440
Code: 80 7c 24 13 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 0c 03 00 00 31 ff e8 6f 35 8b ff e8 1a 52 92 ff fb 66 0f 1f 44 00 00 <85> ed 0f 88 74 02 00 00 48 63 c5 4c 8b 3c 24 4c 2b 7c 24 08 48 8d
RSP: 0018:ffff9a581981fe70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000e2cf41 RBX: ffff9a581ab37400 RCX: 0000000000000000
RDX: ffff9a581982d100 RSI: 0000000000000006 RDI: ffff9a581982d100
RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffb96f14c0
R13: ffffffffb96f1678 R14: 0000000000000004 R15: 0000000000000004
cpuidle_enter+0x29/0x40
do_idle+0x24a/0x290
cpu_startup_entry+0x19/0x20
start_secondary+0x195/0x1e0
secondary_startup_64+0xa4/0xb0
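
For anyone reading along, here is a rough sketch (in Python, to keep it short) of how the usage flags in the splat can be read. The four characters inside the braces cover hardirq, hardirq-read, softirq and softirq-read in that order: '+' means the lock was acquired with that irq type enabled, '-' means it was acquired from inside that irq context, '?' means both, and '.' means neither. The helper names decode_usage() and irq_inversion() are made up for illustration. This is only a toy model of the one check that fired here: it ignores the read/write distinction and the dependency graph that lockdep actually walks.

```python
"""Toy model of the irq-inversion report above; not lockdep's real algorithm."""

# Order of the four usage characters that lockdep prints inside {...}.
USAGE_SLOTS = ("HARDIRQ-W", "HARDIRQ-R", "SOFTIRQ-W", "SOFTIRQ-R")


def decode_usage(usage):
    """Map a 4-character usage string such as '++.-' to a set of state names."""
    if len(usage) != len(USAGE_SLOTS):
        raise ValueError("expected 4 usage characters, got %r" % (usage,))
    states = set()
    for slot, char in zip(USAGE_SLOTS, usage):
        if char not in ".-+?":
            raise ValueError("unknown usage character %r" % (char,))
        context, mode = slot.split("-")
        if char in "-?":
            # Acquired from inside that irq context.
            states.add("IN-%s-%s" % (context, mode))
        if char in "+?":
            # Acquired while that irq type was still enabled.
            states.add("%s-ON-%s" % (context, mode))
    return states


def irq_inversion(held, held_usage, taken, taken_usage, context="SOFTIRQ"):
    """Return a message if taking `taken` while holding `held` looks unsafe.

    Simplified rule: `held` is used from inside `context`, and `taken` has
    been acquired somewhere with `context` enabled. Returns None otherwise.
    """
    used_in_irq = any(
        state.startswith("IN-" + context) for state in decode_usage(held_usage)
    )
    irq_unsafe = any(
        state.startswith(context + "-ON") for state in decode_usage(taken_usage)
    )
    if used_in_irq and irq_unsafe:
        return "%s is used in %s but nests %s-unsafe %s" % (
            held, context, context, taken)
    return None


if __name__ == "__main__":
    # Usage strings copied from the splat above.
    print(sorted(decode_usage("++.-")))   # &ndev->lock
    print(sorted(decode_usage("+.+.")))   # pgd_lock
    print(irq_inversion("&ndev->lock", "++.-", "pgd_lock", "+.+."))
```

With the two usage strings from the report, the decoded states match the HARDIRQ-ON-W, HARDIRQ-ON-R and IN-SOFTIRQ-R sections listed for &ndev->lock, and the HARDIRQ-ON-W and SOFTIRQ-ON-W sections listed for pgd_lock.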
-- Steve