bpf.vger.kernel.org archive mirror
* deadlock bug related to bpf,audit subsystems
       [not found]                   ` <YFM+Ijeu4bN4IzH1@krava>
@ 2021-03-18 14:43                     ` Serhei Makarov
  2021-03-18 16:43                       ` Serhei Makarov
  0 siblings, 1 reply; 5+ messages in thread
From: Serhei Makarov @ 2021-03-18 14:43 UTC (permalink / raw)
  To: linux-audit, bpf
  Cc: Daniel Borkmann, ast, Frank Eigler, guro, Jerome Marchand,
	Jiri Olsa, Serhei Makarov

Moving this discussion to kernel mailing lists.

Problem description:

Upstream kernels 5.11.0-rc7 and later were found to deadlock during a
bpf_probe_read_compat call within a sched_switch tracepoint. The
problem is reproducible with the reg_alloc3 testcase from SystemTap's
BPF backend testsuite on x86_64, as well as with the runqlat and
runqslower tools from bcc on ppc64le. Example stack trace from [1]:

[  730.868702] stack backtrace:
[  730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1
[  730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
[  730.873278] Call Trace:
[  730.873770]  dump_stack+0x7f/0xa1
[  730.874433]  check_noncircular+0xdf/0x100
[  730.875232]  __lock_acquire+0x1202/0x1e10
[  730.876031]  ? __lock_acquire+0xfc0/0x1e10
[  730.876844]  lock_acquire+0xc2/0x3a0
[  730.877551]  ? __wake_up_common_lock+0x52/0x90
[  730.878434]  ? lock_acquire+0xc2/0x3a0
[  730.879186]  ? lock_is_held_type+0xa7/0x120
[  730.880044]  ? skb_queue_tail+0x1b/0x50
[  730.880800]  _raw_spin_lock_irqsave+0x4d/0x90
[  730.881656]  ? __wake_up_common_lock+0x52/0x90
[  730.882532]  __wake_up_common_lock+0x52/0x90
[  730.883375]  audit_log_end+0x5b/0x100
[  730.884104]  slow_avc_audit+0x69/0x90
[  730.884836]  avc_has_perm+0x8b/0xb0
[  730.885532]  selinux_lockdown+0xa5/0xd0
[  730.886297]  security_locked_down+0x20/0x40
[  730.887133]  bpf_probe_read_compat+0x66/0xd0
[  730.887983]  bpf_prog_250599c5469ac7b5+0x10f/0x820
[  730.888917]  trace_call_bpf+0xe9/0x240
[  730.889672]  perf_trace_run_bpf_submit+0x4d/0xc0
[  730.890579]  perf_trace_sched_switch+0x142/0x180
[  730.891485]  ? __schedule+0x6d8/0xb20
[  730.892209]  __schedule+0x6d8/0xb20
[  730.892899]  schedule+0x5b/0xc0
[  730.893522]  exit_to_user_mode_prepare+0x11d/0x240
[  730.894457]  syscall_exit_to_user_mode+0x27/0x70
[  730.895361]  entry_SYSCALL_64_after_hwframe+0x44/0xae
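
The check_noncircular frame near the top of the trace belongs to
lockdep's lock-order validation, which rejects a new "held -> acquired"
dependency when it would close a cycle. A minimal Python model of that
idea is sketched below. The class, the lock names and the two orderings
are illustrative assumptions; they are not taken from the kernel source
or from the full lockdep report in [1].

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker: records 'held -> acquired' edges, rejects cycles."""

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded ordering edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is held.

        Returns False (and records nothing) if the edge would close a cycle.
        """
        if self._reachable(new, held):
            return False
        self.edges[held].add(new)
        return True


graph = LockOrderGraph()
# Assumed ordering 1: a wakeup holds a wait-queue lock, then takes the runqueue lock.
ok_first = graph.acquire("waitqueue_lock", "rq_lock")
# Assumed ordering 2: the tracepoint path holds the runqueue lock, then the
# audit wakeup asks for the wait-queue lock.
ok_second = graph.acquire("rq_lock", "waitqueue_lock")
print(ok_first, ok_second)
```

In this model the second acquisition is refused because the opposite
ordering is already on record, which is the kind of inversion a
circular-dependency report describes.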

Jiri Olsa also reports seeing a similar deadlock at v5.10. I'm in the
middle of double-checking my bisection, which ended up at a
seemingly-unrelated commit [2].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1938312
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=2dcb3964544177c51853a210b6ad400de78ef17d

Reasonable amount of context below:

On Thu, Mar 18, 2021 at 7:48 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > In that case the issue is in the selinux / audit department, not on bpf side.
> >
> > To be honest, I'm actually puzzled that from bpf_probe_read_*() we end up sending
> > audit messages, this seems highly questionable given those BPF helpers are used in
> > performance critical code, and they can be called from any contexts. So going and
> > allocating an skb for audit is just completely wrong. It should probably be at min
> > avc_has_perm_noaudit() if anything ...
>
> I just noticed this discussion is not on the list ;-)
> let's bring it there and include some audit folks

Yes, my apologies. This started as a quick note from me to Daniel to
glance at the RHBZ and the cc:s gradually snowballed from there.

- Serhei

> jirka
>
> >
> > > ----
> > > [   56.866377] =============================
> > > [   56.866397] [ BUG: Invalid wait context ]
> > > [   56.866407] 5.11.0 #4 Tainted: G            E
> > > [   56.866420] -----------------------------
> > > [   56.866438] swapper/69/0 is trying to lock:
> > > [   56.866458] c000000002120038 (notif_lock){....}-{3:3}, at: avc_compute_av.isra.0+0x14c/0x430
> > > [   56.866508] other info that might help us debug this:
> > > [   56.866528] context-{2:2}
> > > [   56.866545] 3 locks held by swapper/69/0:
> > > [   56.866566]  #0: c000001fff1f7a98 (&rq->lock){-.-.}-{2:2}, at: sched_ttwu_pending+0x5c/0x1e0
> > > [   56.866613]  #1: c00000000208b9d8 (rcu_read_lock){....}-{1:3}, at: bpf_trace_run1+0x8/0x240
> > > [   56.866659]  #2: c00000000208b9d8 (rcu_read_lock){....}-{1:3}, at: avc_compute_av.isra.0+0x7c/0x430
> > > [   56.866704] stack backtrace:
> > > [   56.866724] CPU: 69 PID: 0 Comm: swapper/69 Tainted: G            E     5.11.0 #4
> > > [   56.866761] Call Trace:
> > > [   56.866778] [c0000000109fb310] [c000000000a42784] dump_stack+0xe8/0x144 (unreliable)
> > > [   56.866817] [c0000000109fb360] [c0000000001f02a0] __lock_acquire+0xaa0/0x2800
> > > [   56.866857] [c0000000109fb490] [c0000000001f2b40] lock_acquire.part.0+0xc0/0x390
> > > [   56.866885] [c0000000109fb570] [c00000000118af0c] _raw_spin_lock_irqsave+0x6c/0xc0
> > > [   56.866923] [c0000000109fb5b0] [c00000000089cc4c] avc_compute_av.isra.0+0x14c/0x430
> > > [   56.866961] [c0000000109fb670] [c00000000089e0a0] avc_has_perm+0x2c0/0x300
> > > [   56.866997] [c0000000109fb780] [c0000000008a7d34] selinux_lockdown+0xd4/0x100
> > > [   56.867034] [c0000000109fb810] [c000000000891140] security_locked_down+0x50/0xb0
> > > [   56.867086] [c0000000109fb840] [c000000000346b7c] bpf_probe_read_compat+0xbc/0x130
> > > [   56.867125] [c0000000109fb880] [c00800000e63bd38] bpf_prog_3de2db9929262fab_raw_tracepoint__sched_wakeup+0x5c/0x4324
> > > [   56.867167] [c0000000109fb8f0] [c000000000349784] bpf_trace_run1+0xe4/0x240
> > > [   56.867204] [c0000000109fb940] [c00000000018f238] __bpf_trace_sched_wakeup_template+0x18/0x30
> > > [   56.867243] [c0000000109fb960] [c000000000190834] trace_sched_wakeup+0xe4/0x200
> > > [   56.867281] [c0000000109fb9a0] [c0000000001983bc] ttwu_do_wakeup+0x4c/0x1f0
> > > [   56.867317] [c0000000109fba20] [c00000000019c190] sched_ttwu_pending+0x120/0x1e0
> > > [   56.867355] [c0000000109fbac0] [c00000000026cd6c] flush_smp_call_function_queue+0x1bc/0x3c0
> > > [   56.867397] [c0000000109fbb50] [c000000000059fd4] smp_ipi_demux_relaxed+0xf4/0x100
> > > [   56.867436] [c0000000109fbb90] [c0000000000537fc] doorbell_exception+0xbc/0x370
> > > [   56.867474] [c0000000109fbbd0] [c0000000000168d4] replay_soft_interrupts+0x1f4/0x2d0
> > > [   56.867512] [c0000000109fbdb0] [c000000000016a20] arch_local_irq_restore+0x70/0xe0
> > > [   56.867550] [c0000000109fbde0] [c000000000df9d34] cpuidle_enter_state+0x124/0x500
> > > [   56.867587] [c0000000109fbe40] [c000000000dfa1ac] cpuidle_enter+0x4c/0x70
> > > [   56.867613] [c0000000109fbe80] [c0000000001a5dc8] do_idle+0x338/0x450
> > > [   56.867649] [c0000000109fbf10] [c0000000001a62bc] cpu_startup_entry+0x3c/0x40
> > > [   56.867686] [c0000000109fbf40] [c00000000005ac34] start_secondary+0x2a4/0x2b0
> > > [   56.867727] [c0000000109fbf90] [c00000000000c054] start_secondary_prolog+0x10/0x14
> > >
> >
>



* Re: deadlock bug related to bpf,audit subsystems
  2021-03-18 14:43                     ` deadlock bug related to bpf,audit subsystems Serhei Makarov
@ 2021-03-18 16:43                       ` Serhei Makarov
  2021-03-18 17:44                         ` Paul Moore
  0 siblings, 1 reply; 5+ messages in thread
From: Serhei Makarov @ 2021-03-18 16:43 UTC (permalink / raw)
  To: linux-audit, bpf
  Cc: Daniel Borkmann, ast, Frank Eigler, guro, Jerome Marchand, Jiri Olsa

On Thu, Mar 18, 2021 at 10:43 AM Serhei Makarov <smakarov@redhat.com> wrote:
> Jiri Olsa also reports seeing a similar deadlock at v5.10. I'm in the
> middle of double-checking my bisection which ended up at a
> seemingly-unrelated commit [2]
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1938312
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=2dcb3964544177c51853a210b6ad400de78ef17d

I've confirmed that my first bisection was incorrect by testing commit
1c2f67308af4 ("mm: thp: fix MADV_REMOVE deadlock on shmem THP")
and reproducing the deadlock. Previously this commit was marked as
good, so it seems a kernel with the bug can sometimes pass the test.
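
Because a buggy kernel can pass a single run, one way to make a
bisection step less error-prone is to repeat the reproducer and call a
commit good only when every attempt is clean. The sketch below assumes
a hypothetical run_reproducer() callable that returns True when the
deadlock is observed; the return values follow the `git bisect run`
convention (0 = good, 1 = bad).

```python
def bisect_verdict(run_reproducer, attempts=5):
    """Return a `git bisect run` style exit code for the commit under test.

    run_reproducer is a hypothetical callable returning True when the
    deadlock was observed. A single hit marks the commit bad; the commit
    is marked good only if all attempts come back clean.
    """
    for _ in range(attempts):
        if run_reproducer():
            return 1  # bad: the deadlock reproduced at least once
    return 0  # good: no attempt reproduced the deadlock


# Simulated reproducer that only hits the deadlock on its third run.
outcomes = iter([False, False, True])
verdict = bisect_verdict(lambda: next(outcomes), attempts=5)
print(verdict)
```

A clean result is still only probabilistic: more attempts lower the
chance of marking a buggy commit good, but cannot rule it out.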

I'll double check rc6 next since I have the kernel handy. If
5.11.0-rc6 can also be made to fail then, given Jiri Olsa's report,
it will be necessary to do a wider search.

There may be commits with intent similar to
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d92db5c04d103
which tightened some of the behaviour of kernel reads, but which
affect the audit subsystem instead?

The actual stack trace that leads to deadlock goes through
security_locked_down(), which has been present since the original patch
reworking probe_read into separate probe_read_{user,kernel} helpers:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=6ae08ae3dea2

-- Serhei



* Re: deadlock bug related to bpf,audit subsystems
  2021-03-18 16:43                       ` Serhei Makarov
@ 2021-03-18 17:44                         ` Paul Moore
  2021-03-18 17:45                           ` Paul Moore
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Moore @ 2021-03-18 17:44 UTC (permalink / raw)
  To: Serhei Makarov
  Cc: linux-audit, bpf, Jerome Marchand, Daniel Borkmann, ast,
	Frank Eigler, Jiri Olsa, guro

On Thu, Mar 18, 2021 at 12:57 PM Serhei Makarov <smakarov@redhat.com> wrote:
> On Thu, Mar 18, 2021 at 10:43 AM Serhei Makarov <smakarov@redhat.com> wrote:
> > Jiri Olsa also reports seeing a similar deadlock at v5.10. I'm in the
> > middle of double-checking my bisection which ended up at a
> > seemingly-unrelated commit [2]
> >
> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1938312
> > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=2dcb3964544177c51853a210b6ad400de78ef17d
>
> I've confirmed that my first bisection was incorrect by testing
> @1c2f67308af4 mm: thp: fix MADV_REMOVE deadlock on shmem THP
> and reproducing the deadlock. Previously this commit was marked as
> good, so it seems a kernel with the bug can sometimes pass the test.
>
> I'll double check rc6 next since I have the kernel handy. If
> 5.11.0-rc6 can also be made to fail, with Jiri Olsa's report it'd be
> necessary to do a wider search.
> There may be commits with intent similar to
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d92db5c04d103
> which tightened some of the behaviour of kernel reads, but affecting
> the audit subsystem?
> The actual stack trace that leads to deadlock goes through
> security_locked_down() which was present since the original patch
> reworking probe_read into separate probe_read_{user,kernel} helpers
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=6ae08ae3dea2

Added the SELinux list to the To/CC line; they should really be
involved.  I'm also CC'ing the LSM list for good measure as there may
be other people that care about this.

FYI, the first instance of this thread that I saw can be found here
via the linux-audit list:

https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/

-- 
paul moore
www.paul-moore.com


* Re: deadlock bug related to bpf,audit subsystems
  2021-03-18 17:44                         ` Paul Moore
@ 2021-03-18 17:45                           ` Paul Moore
  2021-03-18 18:19                             ` Paul Moore
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Moore @ 2021-03-18 17:45 UTC (permalink / raw)
  To: Serhei Makarov
  Cc: linux-audit, bpf, Jerome Marchand, Daniel Borkmann, ast,
	Frank Eigler, Jiri Olsa, guro, selinux, linux-security-module

On Thu, Mar 18, 2021 at 1:44 PM Paul Moore <paul@paul-moore.com> wrote:
> On Thu, Mar 18, 2021 at 12:57 PM Serhei Makarov <smakarov@redhat.com> wrote:
> > On Thu, Mar 18, 2021 at 10:43 AM Serhei Makarov <smakarov@redhat.com> wrote:
> > > Jiri Olsa also reports seeing a similar deadlock at v5.10. I'm in the
> > > middle of double-checking my bisection which ended up at a
> > > seemingly-unrelated commit [2]
> > >
> > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1938312
> > > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=2dcb3964544177c51853a210b6ad400de78ef17d
> >
> > I've confirmed that my first bisection was incorrect by testing
> > @1c2f67308af4 mm: thp: fix MADV_REMOVE deadlock on shmem THP
> > and reproducing the deadlock. Previously this commit was marked as
> > good, so it seems a kernel with the bug can sometimes pass the test.
> >
> > I'll double check rc6 next since I have the kernel handy. If
> > 5.11.0-rc6 can also be made to fail, with Jiri Olsa's report it'd be
> > necessary to do a wider search.
> > There may be commits with intent similar to
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d92db5c04d103
> > which tightened some of the behaviour of kernel reads, but affecting
> > the audit subsystem?
> > The actual stack trace that leads to deadlock goes through
> > security_locked_down() which was present since the original patch
> > reworking probe_read into separate probe_read_{user,kernel} helpers
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=6ae08ae3dea2
>
> Added the SELinux list to the To/CC line; they should really be
> involved.  I'm also CC'ing the LSM list for good measure as there may
> be other people that care about this.

Argh, hit send a bit too quickly :/

> FYI, the first instance of this thread that I saw can be found here
> via the linux-audit list:
>
> https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/

-- 
paul moore
www.paul-moore.com


* Re: deadlock bug related to bpf,audit subsystems
  2021-03-18 17:45                           ` Paul Moore
@ 2021-03-18 18:19                             ` Paul Moore
  0 siblings, 0 replies; 5+ messages in thread
From: Paul Moore @ 2021-03-18 18:19 UTC (permalink / raw)
  To: Serhei Makarov
  Cc: linux-audit, bpf, Jerome Marchand, Daniel Borkmann, ast,
	Frank Eigler, Jiri Olsa, guro, selinux, linux-security-module

On Thu, Mar 18, 2021 at 1:45 PM Paul Moore <paul@paul-moore.com> wrote:
> On Thu, Mar 18, 2021 at 1:44 PM Paul Moore <paul@paul-moore.com> wrote:
> > On Thu, Mar 18, 2021 at 12:57 PM Serhei Makarov <smakarov@redhat.com> wrote:
> > > On Thu, Mar 18, 2021 at 10:43 AM Serhei Makarov <smakarov@redhat.com> wrote:
> > > > Jiri Olsa also reports seeing a similar deadlock at v5.10. I'm in the
> > > > middle of double-checking my bisection which ended up at a
> > > > seemingly-unrelated commit [2]
> > > >
> > > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1938312
> > > > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=2dcb3964544177c51853a210b6ad400de78ef17d
> > >
> > > I've confirmed that my first bisection was incorrect by testing
> > > @1c2f67308af4 mm: thp: fix MADV_REMOVE deadlock on shmem THP
> > > and reproducing the deadlock. Previously this commit was marked as
> > > good, so it seems a kernel with the bug can sometimes pass the test.
> > >
> > > I'll double check rc6 next since I have the kernel handy. If
> > > 5.11.0-rc6 can also be made to fail, with Jiri Olsa's report it'd be
> > > necessary to do a wider search.
> > > There may be commits with intent similar to
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d92db5c04d103
> > > which tightened some of the behaviour of kernel reads, but affecting
> > > the audit subsystem?
> > > The actual stack trace that leads to deadlock goes through
> > > security_locked_down() which was present since the original patch
> > > reworking probe_read into separate probe_read_{user,kernel} helpers
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=6ae08ae3dea2
> >
> > Added the SELinux list to the To/CC line; they should really be
> > involved.  I'm also CC'ing the LSM list for good measure as there may
> > be other people that care about this.
>
> Argh, hit send a bit too quickly :/
>
> > FYI, the first instance of this thread that I saw can be found here
> > via the linux-audit list:
> >
> > https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/

Previously in the thread there was a question about why audit events
are being generated inside bpf_probe_read_compat(). The answer is
pretty simple: we do an access check in the security_locked_down()
hook, inside the call to bpf_probe_read_kernel_common(), and that can
result in an audit event depending on the LSM and its policy.
Skipping the audit event in the case of an LSM access denial, e.g. an
SELinux AVC denial, could result in a silent access denial, which can
be maddening both to users and admins.
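
The trade-off between an audited check and the "noaudit" variant
suggested earlier in the thread can be modelled in a few lines of
Python. This is a toy userspace sketch only: has_perm(),
has_perm_noaudit(), the policy dictionary and the permission names are
hypothetical stand-ins, not the SELinux AVC interfaces.

```python
audit_log = []


def has_perm_noaudit(policy, subject, action):
    """Pure policy lookup: no logging and no other side effects."""
    return action in policy.get(subject, set())


def has_perm(policy, subject, action):
    """Policy lookup that also records a denial so an admin can see it."""
    allowed = has_perm_noaudit(policy, subject, action)
    if not allowed:
        audit_log.append(f"denied {{ {action} }} for {subject}")
    return allowed


# Hypothetical policy: the tracer may do "integrity" but not "confidentiality".
policy = {"tracer_t": {"integrity"}}

audited = has_perm(policy, "tracer_t", "confidentiality")
silent = has_perm_noaudit(policy, "tracer_t", "confidentiality")
print(audited, silent, audit_log)
```

Both calls deny the access, but only the audited one leaves a record.
The noaudit path has no side effects, which is what makes it attractive
in restricted contexts, at the price of the silent denial described
above.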

-- 
paul moore
www.paul-moore.com

