From: Serhei Makarov <smakarov@redhat.com>
To: linux-audit@redhat.com, bpf@vger.kernel.org
Cc: Jerome Marchand <jmarchan@redhat.com>,
Daniel Borkmann <daniel@iogearbox.net>,
ast@kernel.org, Serhei Makarov <smakarov@redhat.com>,
Frank Eigler <fche@redhat.com>, Jiri Olsa <jolsa@redhat.com>,
guro@fb.com
Subject: deadlock bug related to bpf,audit subsystems
Date: Thu, 18 Mar 2021 10:43:00 -0400
Message-ID: <CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com>
In-Reply-To: <YFM+Ijeu4bN4IzH1@krava>
Moving this discussion to kernel mailing lists.
Problem description:
Upstream kernels 5.11.0-rc7 and later were found to deadlock during a
bpf_probe_read_compat call within a sched_switch tracepoint. The
problem is reproducible with the reg_alloc3 testcase from SystemTap's
BPF backend testsuite on x86_64, as well as with the runqlat and
runqslower tools from bcc on ppc64le. Example stack trace from [1]:
[ 730.868702] stack backtrace:
[ 730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1
[ 730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
[ 730.873278] Call Trace:
[ 730.873770] dump_stack+0x7f/0xa1
[ 730.874433] check_noncircular+0xdf/0x100
[ 730.875232] __lock_acquire+0x1202/0x1e10
[ 730.876031] ? __lock_acquire+0xfc0/0x1e10
[ 730.876844] lock_acquire+0xc2/0x3a0
[ 730.877551] ? __wake_up_common_lock+0x52/0x90
[ 730.878434] ? lock_acquire+0xc2/0x3a0
[ 730.879186] ? lock_is_held_type+0xa7/0x120
[ 730.880044] ? skb_queue_tail+0x1b/0x50
[ 730.880800] _raw_spin_lock_irqsave+0x4d/0x90
[ 730.881656] ? __wake_up_common_lock+0x52/0x90
[ 730.882532] __wake_up_common_lock+0x52/0x90
[ 730.883375] audit_log_end+0x5b/0x100
[ 730.884104] slow_avc_audit+0x69/0x90
[ 730.884836] avc_has_perm+0x8b/0xb0
[ 730.885532] selinux_lockdown+0xa5/0xd0
[ 730.886297] security_locked_down+0x20/0x40
[ 730.887133] bpf_probe_read_compat+0x66/0xd0
[ 730.887983] bpf_prog_250599c5469ac7b5+0x10f/0x820
[ 730.888917] trace_call_bpf+0xe9/0x240
[ 730.889672] perf_trace_run_bpf_submit+0x4d/0xc0
[ 730.890579] perf_trace_sched_switch+0x142/0x180
[ 730.891485] ? __schedule+0x6d8/0xb20
[ 730.892209] __schedule+0x6d8/0xb20
[ 730.892899] schedule+0x5b/0xc0
[ 730.893522] exit_to_user_mode_prepare+0x11d/0x240
[ 730.894457] syscall_exit_to_user_mode+0x27/0x70
[ 730.895361] entry_SYSCALL_64_after_hwframe+0x44/0xae
Jiri Olsa also reports seeing a similar deadlock on v5.10. I'm in the
middle of double-checking my bisection, which ended up at a
seemingly unrelated commit [2].
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1938312
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=2dcb3964544177c51853a210b6ad400de78ef17d
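As a rough illustration of why check_noncircular fires in the x86_64 trace
above, here is a small user-space model of a lock-order graph. It is only a
sketch, not kernel code: the names lock_graph, record_order and
would_deadlock are hypothetical, and the two lock classes are a
simplification. It also assumes (this is not stated in the report) that the
reverse ordering comes from the wake-up path, where the wait-queue lock is
held while the woken task's runqueue lock is taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the locks involved in the trace. */
enum lock_class { RQ_LOCK, WAITQ_LOCK, NUM_LOCKS };

/* edge[a][b] is true if lock b was ever acquired while lock a was held. */
struct lock_graph {
    bool edge[NUM_LOCKS][NUM_LOCKS];
};

static void graph_init(struct lock_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Remember that `wanted` was taken while `held` was already held. */
static void record_order(struct lock_graph *g,
                         enum lock_class held, enum lock_class wanted)
{
    assert(held < NUM_LOCKS && wanted < NUM_LOCKS);
    g->edge[held][wanted] = true;
}

/* Depth-first search: can `target` be reached from `from` via known edges? */
static bool reachable(const struct lock_graph *g, int from, int target,
                      bool *seen)
{
    if (from == target)
        return true;
    seen[from] = true;
    for (int next = 0; next < NUM_LOCKS; next++) {
        if (g->edge[from][next] && !seen[next] &&
            reachable(g, next, target, seen))
            return true;
    }
    return false;
}

/*
 * Taking `wanted` while holding `held` adds the edge held -> wanted.
 * That closes a cycle if `held` is already reachable from `wanted`.
 */
static bool would_deadlock(const struct lock_graph *g,
                           enum lock_class held, enum lock_class wanted)
{
    bool seen[NUM_LOCKS] = { false };

    return reachable(g, wanted, held, seen);
}
```

Under the assumption above, a caller would first record the wake-up ordering
with record_order(&g, WAITQ_LOCK, RQ_LOCK). The path in the trace
(__schedule, then the BPF program, then audit_log_end, then
__wake_up_common_lock) then asks for WAITQ_LOCK while RQ_LOCK is held, and
would_deadlock(&g, RQ_LOCK, WAITQ_LOCK) reports the inversion.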
Reasonable amount of context below:
On Thu, Mar 18, 2021 at 7:48 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > In that case the issue is in the selinux / audit department, not on bpf side.
> >
> > To be honest, I'm actually puzzled that from bpf_probe_read_*() we end up sending
> > audit messages, this seems highly questionable given those BPF helpers are used in
> > performance critical code, and they can be called from any contexts. So going and
> > allocating an skb for audit is just completely wrong. It should probably be at min
> > avc_has_perm_noaudit() if anything ...
>
> I just noticed this discussion is not on the list ;-)
> let's bring it there and include some audit folks
Yes, my apologies. This started as a quick note from me to Daniel to
glance at the RHBZ and the cc:s gradually snowballed from there.
- Serhei
> jirka
>
> >
> > > ----
> > > [ 56.866377] =============================
> > > [ 56.866397] [ BUG: Invalid wait context ]
> > > [ 56.866407] 5.11.0 #4 Tainted: G E
> > > [ 56.866420] -----------------------------
> > > [ 56.866438] swapper/69/0 is trying to lock:
> > > [ 56.866458] c000000002120038 (notif_lock){....}-{3:3}, at: avc_compute_av.isra.0+0x14c/0x430
> > > [ 56.866508] other info that might help us debug this:
> > > [ 56.866528] context-{2:2}
> > > [ 56.866545] 3 locks held by swapper/69/0:
> > > [ 56.866566] #0: c000001fff1f7a98 (&rq->lock){-.-.}-{2:2}, at: sched_ttwu_pending+0x5c/0x1e0
> > > [ 56.866613] #1: c00000000208b9d8 (rcu_read_lock){....}-{1:3}, at: bpf_trace_run1+0x8/0x240
> > > [ 56.866659] #2: c00000000208b9d8 (rcu_read_lock){....}-{1:3}, at: avc_compute_av.isra.0+0x7c/0x430
> > > [ 56.866704] stack backtrace:
> > > [ 56.866724] CPU: 69 PID: 0 Comm: swapper/69 Tainted: G E 5.11.0 #4
> > > [ 56.866761] Call Trace:
> > > [ 56.866778] [c0000000109fb310] [c000000000a42784] dump_stack+0xe8/0x144 (unreliable)
> > > [ 56.866817] [c0000000109fb360] [c0000000001f02a0] __lock_acquire+0xaa0/0x2800
> > > [ 56.866857] [c0000000109fb490] [c0000000001f2b40] lock_acquire.part.0+0xc0/0x390
> > > [ 56.866885] [c0000000109fb570] [c00000000118af0c] _raw_spin_lock_irqsave+0x6c/0xc0
> > > [ 56.866923] [c0000000109fb5b0] [c00000000089cc4c] avc_compute_av.isra.0+0x14c/0x430
> > > [ 56.866961] [c0000000109fb670] [c00000000089e0a0] avc_has_perm+0x2c0/0x300
> > > [ 56.866997] [c0000000109fb780] [c0000000008a7d34] selinux_lockdown+0xd4/0x100
> > > [ 56.867034] [c0000000109fb810] [c000000000891140] security_locked_down+0x50/0xb0
> > > [ 56.867086] [c0000000109fb840] [c000000000346b7c] bpf_probe_read_compat+0xbc/0x130
> > > [ 56.867125] [c0000000109fb880] [c00800000e63bd38] bpf_prog_3de2db9929262fab_raw_tracepoint__sched_wakeup+0x5c/0x4324
> > > [ 56.867167] [c0000000109fb8f0] [c000000000349784] bpf_trace_run1+0xe4/0x240
> > > [ 56.867204] [c0000000109fb940] [c00000000018f238] __bpf_trace_sched_wakeup_template+0x18/0x30
> > > [ 56.867243] [c0000000109fb960] [c000000000190834] trace_sched_wakeup+0xe4/0x200
> > > [ 56.867281] [c0000000109fb9a0] [c0000000001983bc] ttwu_do_wakeup+0x4c/0x1f0
> > > [ 56.867317] [c0000000109fba20] [c00000000019c190] sched_ttwu_pending+0x120/0x1e0
> > > [ 56.867355] [c0000000109fbac0] [c00000000026cd6c] flush_smp_call_function_queue+0x1bc/0x3c0
> > > [ 56.867397] [c0000000109fbb50] [c000000000059fd4] smp_ipi_demux_relaxed+0xf4/0x100
> > > [ 56.867436] [c0000000109fbb90] [c0000000000537fc] doorbell_exception+0xbc/0x370
> > > [ 56.867474] [c0000000109fbbd0] [c0000000000168d4] replay_soft_interrupts+0x1f4/0x2d0
> > > [ 56.867512] [c0000000109fbdb0] [c000000000016a20] arch_local_irq_restore+0x70/0xe0
> > > [ 56.867550] [c0000000109fbde0] [c000000000df9d34] cpuidle_enter_state+0x124/0x500
> > > [ 56.867587] [c0000000109fbe40] [c000000000dfa1ac] cpuidle_enter+0x4c/0x70
> > > [ 56.867613] [c0000000109fbe80] [c0000000001a5dc8] do_idle+0x338/0x450
> > > [ 56.867649] [c0000000109fbf10] [c0000000001a62bc] cpu_startup_entry+0x3c/0x40
> > > [ 56.867686] [c0000000109fbf40] [c00000000005ac34] start_secondary+0x2a4/0x2b0
> > > [ 56.867727] [c0000000109fbf90] [c00000000000c054] start_secondary_prolog+0x10/0x14
> > >
> >
>