linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: David Miller <davem@davemloft.net>,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Sebastian Sewior <bigeasy@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Clark Williams <williams@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Ingo Molnar <mingo@kernel.org>
Subject: [RFC patch 03/19] bpf: Update locking comment in hashtab code
Date: Fri, 14 Feb 2020 14:39:20 +0100	[thread overview]
Message-ID: <20200214161503.178987057@linutronix.de> (raw)
In-Reply-To: 20200214133917.304937432@linutronix.de

The comment where the bucket lock is acquired says:

  /* bpf_map_update_elem() can be called in_irq() */

which is not really helpful and aside of that it does not explain the
subtle details of the hash bucket locks expecially in the context of BPF
and perf, kprobes and tracing.

Add a comment at the top of the file which explains the protection scopes
and the details how potential deadlocks are prevented.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/bpf/hashtab.c |   24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -27,6 +27,26 @@
 	.map_delete_batch =			\
 	generic_map_delete_batch
 
+/*
+ * The bucket lock has two protection scopes:
+ *
+ * 1) Serializing concurrent operations from BPF programs on differrent
+ *    CPUs
+ *
+ * 2) Serializing concurrent operations from BPF programs and sys_bpf()
+ *
+ * BPF programs can execute in any context including perf, kprobes and
+ * tracing. As there are almost no limits where perf, kprobes and tracing
+ * can be invoked from the lock operations need to be protected against
+ * deadlocks. Deadlocks can be caused by recursion and by an invocation in
+ * the lock held section when functions which acquire this lock are invoked
+ * from sys_bpf(). BPF recursion is prevented by incrementing the per CPU
+ * variable bpf_prog_active, which prevents BPF programs attached to perf
+ * events, kprobes and tracing to be invoked before the prior invocation
+ * from one of these contexts completed. sys_bpf() uses the same mechanism
+ * by pinning the task to the current CPU and incrementing the recursion
+ * protection accross the map operation.
+ */
 struct bucket {
 	struct hlist_nulls_head head;
 	raw_spinlock_t lock;
@@ -872,7 +892,6 @@ static int htab_map_update_elem(struct b
 		 */
 	}
 
-	/* bpf_map_update_elem() can be called in_irq() */
 	raw_spin_lock_irqsave(&b->lock, flags);
 
 	l_old = lookup_elem_raw(head, hash, key, key_size);
@@ -952,7 +971,6 @@ static int htab_lru_map_update_elem(stru
 		return -ENOMEM;
 	memcpy(l_new->key + round_up(map->key_size, 8), value, map->value_size);
 
-	/* bpf_map_update_elem() can be called in_irq() */
 	raw_spin_lock_irqsave(&b->lock, flags);
 
 	l_old = lookup_elem_raw(head, hash, key, key_size);
@@ -1007,7 +1025,6 @@ static int __htab_percpu_map_update_elem
 	b = __select_bucket(htab, hash);
 	head = &b->head;
 
-	/* bpf_map_update_elem() can be called in_irq() */
 	raw_spin_lock_irqsave(&b->lock, flags);
 
 	l_old = lookup_elem_raw(head, hash, key, key_size);
@@ -1071,7 +1088,6 @@ static int __htab_lru_percpu_map_update_
 			return -ENOMEM;
 	}
 
-	/* bpf_map_update_elem() can be called in_irq() */
 	raw_spin_lock_irqsave(&b->lock, flags);
 
 	l_old = lookup_elem_raw(head, hash, key, key_size);


  parent reply	other threads:[~2020-02-14 16:45 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-14 13:39 [RFC patch 00/19] bpf: Make BPF and PREEMPT_RT co-exist Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 01/19] sched: Provide migrate_disable/enable() inlines Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 02/19] sched: Provide cant_migrate() Thomas Gleixner
2020-02-20 20:20   ` [tip: sched/rt] " tip-bot2 for Thomas Gleixner
2020-02-14 13:39 ` Thomas Gleixner [this message]
2020-02-14 13:39 ` [RFC patch 04/19] bpf/tracing: Remove redundant preempt_disable() in __bpf_trace_run() Thomas Gleixner
2020-02-19 16:54   ` Steven Rostedt
2020-02-19 17:26     ` Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 05/19] perf/bpf: Remove preempt disable around BPF invocation Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 06/19] bpf: Dont iterate over possible CPUs with interrupts disabled Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 07/19] bpf: Provide BPF_PROG_RUN_PIN_ON_CPU() macro Thomas Gleixner
2020-02-14 18:50   ` Mathieu Desnoyers
2020-02-14 19:36     ` Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 08/19] bpf: Replace cant_sleep() with cant_migrate() Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 09/19] bpf: Use BPF_PROG_RUN_PIN_ON_CPU() at simple call sites Thomas Gleixner
2020-02-19  1:39   ` Vinicius Costa Gomes
2020-02-19  9:00     ` Thomas Gleixner
2020-02-19 16:38       ` Alexei Starovoitov
2020-02-21  0:20       ` Kees Cook
2020-02-21 14:00         ` Thomas Gleixner
2020-02-21 14:05           ` Peter Zijlstra
2020-02-21 22:15           ` Kees Cook
2020-02-14 13:39 ` [RFC patch 10/19] trace/bpf: Use migrate disable in trace_call_bpf() Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 11/19] bpf/tests: Use migrate disable instead of preempt disable Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 12/19] bpf: Use migrate_disable/enabe() in trampoline code Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 13/19] bpf: Use migrate_disable/enable in array macros and cgroup/lirc code Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 14/19] bpf: Use migrate_disable() in hashtab code Thomas Gleixner
2020-02-14 19:11   ` Mathieu Desnoyers
2020-02-14 19:56     ` Thomas Gleixner
2020-02-18 23:36       ` Alexei Starovoitov
2020-02-19  0:49         ` Thomas Gleixner
2020-02-19  1:23           ` Alexei Starovoitov
2020-02-19 15:17         ` Mathieu Desnoyers
2020-02-20  4:19           ` Alexei Starovoitov
2020-02-14 13:39 ` [RFC patch 15/19] bpf: Use migrate_disable() in sys_bpf() Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 16/19] bpf: Factor out hashtab bucket lock operations Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 17/19] bpf: Prepare hashtab locking for PREEMPT_RT Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 18/19] bpf, lpm: Make locking RT friendly Thomas Gleixner
2020-02-14 13:39 ` [RFC patch 19/19] bpf/stackmap: Dont trylock mmap_sem with PREEMPT_RT and interrupts disabled Thomas Gleixner
2020-02-14 17:53 ` [RFC patch 00/19] bpf: Make BPF and PREEMPT_RT co-exist David Miller
2020-02-14 18:36   ` Thomas Gleixner
2020-02-17 12:59     ` [PATCH] bpf: Enforce map preallocation for all instrumentation programs Thomas Gleixner
2020-02-15 20:09 ` [RFC patch 00/19] bpf: Make BPF and PREEMPT_RT co-exist Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200214161503.178987057@linutronix.de \
    --to=tglx@linutronix.de \
    --cc=ast@kernel.org \
    --cc=bigeasy@linutronix.de \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).