All of lore.kernel.org
 help / color / mirror / Atom feed
* [net-next] bpf: avoid the multi checking
@ 2022-11-21 10:05 xiangxia.m.yue
  2022-11-21 10:05 ` [net-next] bpf: avoid hashtab deadlock with try_lock xiangxia.m.yue
  2022-11-22 22:16 ` [net-next] bpf: avoid the multi checking Daniel Borkmann
  0 siblings, 2 replies; 33+ messages in thread
From: xiangxia.m.yue @ 2022-11-21 10:05 UTC (permalink / raw)
  Cc: netdev, Tonghao Zhang, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

.map_alloc_check checked bpf_attr::max_entries, and if bpf_attr::max_entries
== 0, return -EINVAL. bpf_htab::n_buckets will not be 0, while -E2BIG is not
appropriate.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
 kernel/bpf/hashtab.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 50d254cd0709..22855d6ff6d3 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -500,9 +500,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 		htab->elem_size += round_up(htab->map.value_size, 8);
 
 	err = -E2BIG;
-	/* prevent zero size kmalloc and check for u32 overflow */
-	if (htab->n_buckets == 0 ||
-	    htab->n_buckets > U32_MAX / sizeof(struct bucket))
+	/* avoid zero size and u32 overflow kmalloc.
+	 * bpf_attr::max_entries checked in .map_alloc_check().
+	 */
+	if (htab->n_buckets > U32_MAX / sizeof(struct bucket))
 		goto free_htab;
 
 	err = -ENOMEM;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-21 10:05 [net-next] bpf: avoid the multi checking xiangxia.m.yue
@ 2022-11-21 10:05 ` xiangxia.m.yue
  2022-11-21 20:19   ` Jakub Kicinski
  2022-11-22  1:15   ` Hou Tao
  2022-11-22 22:16 ` [net-next] bpf: avoid the multi checking Daniel Borkmann
  1 sibling, 2 replies; 33+ messages in thread
From: xiangxia.m.yue @ 2022-11-21 10:05 UTC (permalink / raw)
  Cc: netdev, Tonghao Zhang, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

The commit 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked"),
try to fix deadlock, but in some case, the deadlock occurs:

* CPUn in task context with K1, and taking lock.
* CPUn interrupted by NMI context, with K2.
* They are using the same bucket, but different map_locked.

	    | Task
	    |
	+---v----+
	|  CPUn  |
	+---^----+
	    |
	    | NMI

Anyway, the lockdep still warn:
[   36.092222] ================================
[   36.092230] WARNING: inconsistent lock state
[   36.092234] 6.1.0-rc5+ #81 Tainted: G            E
[   36.092236] --------------------------------
[   36.092237] inconsistent {INITIAL USE} -> {IN-NMI} usage.
[   36.092238] perf/1515 [HC1[1]:SC0[0]:HE0:SE1] takes:
[   36.092242] ffff888341acd1a0 (&htab->lockdep_key){....}-{2:2}, at: htab_lock_bucket+0x4d/0x58
[   36.092253] {INITIAL USE} state was registered at:
[   36.092255]   mark_usage+0x1d/0x11d
[   36.092262]   __lock_acquire+0x3c9/0x6ed
[   36.092266]   lock_acquire+0x23d/0x29a
[   36.092270]   _raw_spin_lock_irqsave+0x43/0x7f
[   36.092274]   htab_lock_bucket+0x4d/0x58
[   36.092276]   htab_map_delete_elem+0x82/0xfb
[   36.092278]   map_delete_elem+0x156/0x1ac
[   36.092282]   __sys_bpf+0x138/0xb71
[   36.092285]   __do_sys_bpf+0xd/0x15
[   36.092288]   do_syscall_64+0x6d/0x84
[   36.092291]   entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   36.092295] irq event stamp: 120346
[   36.092296] hardirqs last  enabled at (120345): [<ffffffff8180b97f>] _raw_spin_unlock_irq+0x24/0x39
[   36.092299] hardirqs last disabled at (120346): [<ffffffff81169e85>] generic_exec_single+0x40/0xb9
[   36.092303] softirqs last  enabled at (120268): [<ffffffff81c00347>] __do_softirq+0x347/0x387
[   36.092307] softirqs last disabled at (120133): [<ffffffff810ba4f0>] __irq_exit_rcu+0x67/0xc6
[   36.092311]
[   36.092311] other info that might help us debug this:
[   36.092312]  Possible unsafe locking scenario:
[   36.092312]
[   36.092313]        CPU0
[   36.092313]        ----
[   36.092314]   lock(&htab->lockdep_key);
[   36.092315]   <Interrupt>
[   36.092316]     lock(&htab->lockdep_key);
[   36.092318]
[   36.092318]  *** DEADLOCK ***
[   36.092318]
[   36.092318] 3 locks held by perf/1515:
[   36.092320]  #0: ffff8881b9805cc0 (&cpuctx_mutex){+.+.}-{4:4}, at: perf_event_ctx_lock_nested+0x8e/0xba
[   36.092327]  #1: ffff8881075ecc20 (&event->child_mutex){+.+.}-{4:4}, at: perf_event_for_each_child+0x35/0x76
[   36.092332]  #2: ffff8881b9805c20 (&cpuctx_lock){-.-.}-{2:2}, at: perf_ctx_lock+0x12/0x27
[   36.092339]
[   36.092339] stack backtrace:
[   36.092341] CPU: 0 PID: 1515 Comm: perf Tainted: G            E      6.1.0-rc5+ #81
[   36.092344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   36.092349] Call Trace:
[   36.092351]  <NMI>
[   36.092354]  dump_stack_lvl+0x57/0x81
[   36.092359]  lock_acquire+0x1f4/0x29a
[   36.092363]  ? handle_pmi_common+0x13f/0x1f0
[   36.092366]  ? htab_lock_bucket+0x4d/0x58
[   36.092371]  _raw_spin_lock_irqsave+0x43/0x7f
[   36.092374]  ? htab_lock_bucket+0x4d/0x58
[   36.092377]  htab_lock_bucket+0x4d/0x58
[   36.092379]  htab_map_update_elem+0x11e/0x220
[   36.092386]  bpf_prog_f3a535ca81a8128a_bpf_prog2+0x3e/0x42
[   36.092392]  trace_call_bpf+0x177/0x215
[   36.092398]  perf_trace_run_bpf_submit+0x52/0xaa
[   36.092403]  ? x86_pmu_stop+0x97/0x97
[   36.092407]  perf_trace_nmi_handler+0xb7/0xe0
[   36.092415]  nmi_handle+0x116/0x254
[   36.092418]  ? x86_pmu_stop+0x97/0x97
[   36.092423]  default_do_nmi+0x3d/0xf6
[   36.092428]  exc_nmi+0xa1/0x109
[   36.092432]  end_repeat_nmi+0x16/0x67
[   36.092436] RIP: 0010:wrmsrl+0xd/0x1b
[   36.092441] Code: 04 01 00 00 c6 84 07 48 01 00 00 01 5b e9 46 15 80 00 5b c3 cc cc cc cc c3 cc cc cc cc 48 89 f2 89 f9 89 f0 48 c1 ea 20 0f 30 <66> 90 c3 cc cc cc cc 31 d2 e9 2f 04 49 00 0f 1f 44 00 00 40 0f6
[   36.092443] RSP: 0018:ffffc900043dfc48 EFLAGS: 00000002
[   36.092445] RAX: 000000000000000f RBX: ffff8881b96153e0 RCX: 000000000000038f
[   36.092447] RDX: 0000000000000007 RSI: 000000070000000f RDI: 000000000000038f
[   36.092449] RBP: 000000070000000f R08: ffffffffffffffff R09: ffff8881053bdaa8
[   36.092451] R10: ffff8881b9805d40 R11: 0000000000000005 R12: ffff8881b9805c00
[   36.092452] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8881075ec970
[   36.092460]  ? wrmsrl+0xd/0x1b
[   36.092465]  ? wrmsrl+0xd/0x1b
[   36.092469]  </NMI>
[   36.092469]  <TASK>
[   36.092470]  __intel_pmu_enable_all.constprop.0+0x7c/0xaf
[   36.092475]  event_function+0xb6/0xd3
[   36.092478]  ? cpu_to_node+0x1a/0x1a
[   36.092482]  ? cpu_to_node+0x1a/0x1a
[   36.092485]  remote_function+0x1e/0x4c
[   36.092489]  generic_exec_single+0x48/0xb9
[   36.092492]  ? __lock_acquire+0x666/0x6ed
[   36.092497]  smp_call_function_single+0xbf/0x106
[   36.092499]  ? cpu_to_node+0x1a/0x1a
[   36.092504]  ? kvm_sched_clock_read+0x5/0x11
[   36.092508]  ? __perf_event_task_sched_in+0x13d/0x13d
[   36.092513]  cpu_function_call+0x47/0x69
[   36.092516]  ? perf_event_update_time+0x52/0x52
[   36.092519]  event_function_call+0x89/0x117
[   36.092521]  ? __perf_event_task_sched_in+0x13d/0x13d
[   36.092526]  ? _perf_event_disable+0x4a/0x4a
[   36.092528]  perf_event_for_each_child+0x3d/0x76
[   36.092532]  ? _perf_event_disable+0x4a/0x4a
[   36.092533]  _perf_ioctl+0x564/0x590
[   36.092537]  ? __lock_release+0xd5/0x1b0
[   36.092543]  ? perf_event_ctx_lock_nested+0x8e/0xba
[   36.092547]  perf_ioctl+0x42/0x5f
[   36.092551]  vfs_ioctl+0x1e/0x2f
[   36.092554]  __do_sys_ioctl+0x66/0x89
[   36.092559]  do_syscall_64+0x6d/0x84
[   36.092563]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   36.092566] RIP: 0033:0x7fe7110f362b
[   36.092569] Code: 0f 1e fa 48 8b 05 5d b8 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2d b8 2c 00 f7 d8 64 89 018
[   36.092570] RSP: 002b:00007ffebb8e4b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   36.092573] RAX: ffffffffffffffda RBX: 0000000000002400 RCX: 00007fe7110f362b
[   36.092575] RDX: 0000000000000000 RSI: 0000000000002400 RDI: 0000000000000013
[   36.092576] RBP: 00007ffebb8e4b40 R08: 0000000000000001 R09: 000055c1db4a5b40
[   36.092577] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[   36.092579] R13: 000055c1db3b2a30 R14: 0000000000000000 R15: 0000000000000000
[   36.092586]  </TASK>

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
 kernel/bpf/hashtab.c | 96 +++++++++++++++++---------------------------
 1 file changed, 36 insertions(+), 60 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22855d6ff6d3..429acd97c869 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -80,9 +80,6 @@ struct bucket {
 	raw_spinlock_t raw_lock;
 };
 
-#define HASHTAB_MAP_LOCK_COUNT 8
-#define HASHTAB_MAP_LOCK_MASK (HASHTAB_MAP_LOCK_COUNT - 1)
-
 struct bpf_htab {
 	struct bpf_map map;
 	struct bpf_mem_alloc ma;
@@ -104,7 +101,6 @@ struct bpf_htab {
 	u32 elem_size;	/* size of each element in bytes */
 	u32 hashrnd;
 	struct lock_class_key lockdep_key;
-	int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT];
 };
 
 /* each htab element is struct htab_elem + key + value */
@@ -146,35 +142,26 @@ static void htab_init_buckets(struct bpf_htab *htab)
 	}
 }
 
-static inline int htab_lock_bucket(const struct bpf_htab *htab,
-				   struct bucket *b, u32 hash,
+static inline int htab_lock_bucket(struct bucket *b,
 				   unsigned long *pflags)
 {
 	unsigned long flags;
 
-	hash = hash & HASHTAB_MAP_LOCK_MASK;
-
-	preempt_disable();
-	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
-		__this_cpu_dec(*(htab->map_locked[hash]));
-		preempt_enable();
-		return -EBUSY;
+	if (in_nmi()) {
+		if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
+			return -EBUSY;
+	} else {
+		raw_spin_lock_irqsave(&b->raw_lock, flags);
 	}
 
-	raw_spin_lock_irqsave(&b->raw_lock, flags);
 	*pflags = flags;
-
 	return 0;
 }
 
-static inline void htab_unlock_bucket(const struct bpf_htab *htab,
-				      struct bucket *b, u32 hash,
+static inline void htab_unlock_bucket(struct bucket *b,
 				      unsigned long flags)
 {
-	hash = hash & HASHTAB_MAP_LOCK_MASK;
 	raw_spin_unlock_irqrestore(&b->raw_lock, flags);
-	__this_cpu_dec(*(htab->map_locked[hash]));
-	preempt_enable();
 }
 
 static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
@@ -467,7 +454,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
 	bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
 	struct bpf_htab *htab;
-	int err, i;
+	int err;
 
 	htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
 	if (!htab)
@@ -513,15 +500,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	if (!htab->buckets)
 		goto free_htab;
 
-	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) {
-		htab->map_locked[i] = bpf_map_alloc_percpu(&htab->map,
-							   sizeof(int),
-							   sizeof(int),
-							   GFP_USER);
-		if (!htab->map_locked[i])
-			goto free_map_locked;
-	}
-
 	if (htab->map.map_flags & BPF_F_ZERO_SEED)
 		htab->hashrnd = 0;
 	else
@@ -549,13 +527,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	if (htab->use_percpu_counter) {
 		err = percpu_counter_init(&htab->pcount, 0, GFP_KERNEL);
 		if (err)
-			goto free_map_locked;
+			goto free_buckets;
 	}
 
 	if (prealloc) {
 		err = prealloc_init(htab);
 		if (err)
-			goto free_map_locked;
+			goto free_buckets;
 
 		if (!percpu && !lru) {
 			/* lru itself can remove the least used element, so
@@ -568,12 +546,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	} else {
 		err = bpf_mem_alloc_init(&htab->ma, htab->elem_size, false);
 		if (err)
-			goto free_map_locked;
+			goto free_buckets;
 		if (percpu) {
 			err = bpf_mem_alloc_init(&htab->pcpu_ma,
 						 round_up(htab->map.value_size, 8), true);
 			if (err)
-				goto free_map_locked;
+				goto free_buckets;
 		}
 	}
 
@@ -581,11 +559,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 
 free_prealloc:
 	prealloc_destroy(htab);
-free_map_locked:
+free_buckets:
 	if (htab->use_percpu_counter)
 		percpu_counter_destroy(&htab->pcount);
-	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
-		free_percpu(htab->map_locked[i]);
+
 	bpf_map_area_free(htab->buckets);
 	bpf_mem_alloc_destroy(&htab->pcpu_ma);
 	bpf_mem_alloc_destroy(&htab->ma);
@@ -782,7 +759,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
 	b = __select_bucket(htab, tgt_l->hash);
 	head = &b->head;
 
-	ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
 	if (ret)
 		return false;
 
@@ -793,7 +770,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
 			break;
 		}
 
-	htab_unlock_bucket(htab, b, tgt_l->hash, flags);
+	htab_unlock_bucket(b, flags);
 
 	return l == tgt_l;
 }
@@ -1107,7 +1084,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
 		 */
 	}
 
-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
 	if (ret)
 		return ret;
 
@@ -1152,7 +1129,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
 	}
 	ret = 0;
 err:
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
 	return ret;
 }
 
@@ -1198,7 +1175,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
 	copy_map_value(&htab->map,
 		       l_new->key + round_up(map->key_size, 8), value);
 
-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
 	if (ret)
 		return ret;
 
@@ -1219,7 +1196,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
 	ret = 0;
 
 err:
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
 
 	if (ret)
 		htab_lru_push_free(htab, l_new);
@@ -1255,7 +1232,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
 	b = __select_bucket(htab, hash);
 	head = &b->head;
 
-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
 	if (ret)
 		return ret;
 
@@ -1280,7 +1257,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
 	}
 	ret = 0;
 err:
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
 	return ret;
 }
 
@@ -1321,7 +1298,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
 			return -ENOMEM;
 	}
 
-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
 	if (ret)
 		return ret;
 
@@ -1345,7 +1322,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
 	}
 	ret = 0;
 err:
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
 	if (l_new)
 		bpf_lru_push_free(&htab->lru, &l_new->lru_node);
 	return ret;
@@ -1384,7 +1361,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
 	b = __select_bucket(htab, hash);
 	head = &b->head;
 
-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
 	if (ret)
 		return ret;
 
@@ -1397,7 +1374,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
 		ret = -ENOENT;
 	}
 
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
 	return ret;
 }
 
@@ -1420,7 +1397,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
 	b = __select_bucket(htab, hash);
 	head = &b->head;
 
-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
 	if (ret)
 		return ret;
 
@@ -1431,7 +1408,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
 	else
 		ret = -ENOENT;
 
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
 	if (l)
 		htab_lru_push_free(htab, l);
 	return ret;
@@ -1494,7 +1471,6 @@ static void htab_map_free_timers(struct bpf_map *map)
 static void htab_map_free(struct bpf_map *map)
 {
 	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
-	int i;
 
 	/* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback.
 	 * bpf_free_used_maps() is called after bpf prog is no longer executing.
@@ -1517,10 +1493,10 @@ static void htab_map_free(struct bpf_map *map)
 	bpf_map_area_free(htab->buckets);
 	bpf_mem_alloc_destroy(&htab->pcpu_ma);
 	bpf_mem_alloc_destroy(&htab->ma);
+
 	if (htab->use_percpu_counter)
 		percpu_counter_destroy(&htab->pcount);
-	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
-		free_percpu(htab->map_locked[i]);
+
 	lockdep_unregister_key(&htab->lockdep_key);
 	bpf_map_area_free(htab);
 }
@@ -1564,7 +1540,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
 	b = __select_bucket(htab, hash);
 	head = &b->head;
 
-	ret = htab_lock_bucket(htab, b, hash, &bflags);
+	ret = htab_lock_bucket(b, &bflags);
 	if (ret)
 		return ret;
 
@@ -1602,7 +1578,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
 			free_htab_elem(htab, l);
 	}
 
-	htab_unlock_bucket(htab, b, hash, bflags);
+	htab_unlock_bucket(b, bflags);
 
 	if (is_lru_map && l)
 		htab_lru_push_free(htab, l);
@@ -1720,7 +1696,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
 	head = &b->head;
 	/* do not grab the lock unless need it (bucket_cnt > 0). */
 	if (locked) {
-		ret = htab_lock_bucket(htab, b, batch, &flags);
+		ret = htab_lock_bucket(b, &flags);
 		if (ret) {
 			rcu_read_unlock();
 			bpf_enable_instrumentation();
@@ -1743,7 +1719,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
 		/* Note that since bucket_cnt > 0 here, it is implicit
 		 * that the locked was grabbed, so release it.
 		 */
-		htab_unlock_bucket(htab, b, batch, flags);
+		htab_unlock_bucket(b, flags);
 		rcu_read_unlock();
 		bpf_enable_instrumentation();
 		goto after_loop;
@@ -1754,7 +1730,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
 		/* Note that since bucket_cnt > 0 here, it is implicit
 		 * that the locked was grabbed, so release it.
 		 */
-		htab_unlock_bucket(htab, b, batch, flags);
+		htab_unlock_bucket(b, flags);
 		rcu_read_unlock();
 		bpf_enable_instrumentation();
 		kvfree(keys);
@@ -1815,7 +1791,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
 		dst_val += value_size;
 	}
 
-	htab_unlock_bucket(htab, b, batch, flags);
+	htab_unlock_bucket(b, flags);
 	locked = false;
 
 	while (node_to_free) {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-21 10:05 ` [net-next] bpf: avoid hashtab deadlock with try_lock xiangxia.m.yue
@ 2022-11-21 20:19   ` Jakub Kicinski
  2022-11-22  1:15   ` Hou Tao
  1 sibling, 0 replies; 33+ messages in thread
From: Jakub Kicinski @ 2022-11-21 20:19 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: bpf, netdev, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa

On Mon, 21 Nov 2022 18:05:21 +0800 xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> 
> The commit 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked"),
> try to fix deadlock, but in some case, the deadlock occurs:
> 
> * CPUn in task context with K1, and taking lock.
> * CPUn interrupted by NMI context, with K2.
> * They are using the same bucket, but different map_locked.

You should really put bpf@ in the CC line for bpf patches.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-21 10:05 ` [net-next] bpf: avoid hashtab deadlock with try_lock xiangxia.m.yue
  2022-11-21 20:19   ` Jakub Kicinski
@ 2022-11-22  1:15   ` Hou Tao
  2022-11-22  3:12     ` Tonghao Zhang
  1 sibling, 1 reply; 33+ messages in thread
From: Hou Tao @ 2022-11-22  1:15 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf

Hi,

On 11/21/2022 6:05 PM, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> The commit 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked"),
> try to fix deadlock, but in some case, the deadlock occurs:
>
> * CPUn in task context with K1, and taking lock.
> * CPUn interrupted by NMI context, with K2.
> * They are using the same bucket, but different map_locked.
It is possible when n_buckets is less than HASHTAB_MAP_LOCK_COUNT (e.g.,
n_bucket=4). If using hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) as the
index of map_locked, I think the deadlock will be gone.
>
> 	    | Task
> 	    |
> 	+---v----+
> 	|  CPUn  |
> 	+---^----+
> 	    |
> 	    | NMI
>
> Anyway, the lockdep still warn:
> [   36.092222] ================================
> [   36.092230] WARNING: inconsistent lock state
> [   36.092234] 6.1.0-rc5+ #81 Tainted: G            E
> [   36.092236] --------------------------------
> [   36.092237] inconsistent {INITIAL USE} -> {IN-NMI} usage.
> [   36.092238] perf/1515 [HC1[1]:SC0[0]:HE0:SE1] takes:
> [   36.092242] ffff888341acd1a0 (&htab->lockdep_key){....}-{2:2}, at: htab_lock_bucket+0x4d/0x58
> [   36.092253] {INITIAL USE} state was registered at:
> [   36.092255]   mark_usage+0x1d/0x11d
> [   36.092262]   __lock_acquire+0x3c9/0x6ed
> [   36.092266]   lock_acquire+0x23d/0x29a
> [   36.092270]   _raw_spin_lock_irqsave+0x43/0x7f
> [   36.092274]   htab_lock_bucket+0x4d/0x58
> [   36.092276]   htab_map_delete_elem+0x82/0xfb
> [   36.092278]   map_delete_elem+0x156/0x1ac
> [   36.092282]   __sys_bpf+0x138/0xb71
> [   36.092285]   __do_sys_bpf+0xd/0x15
> [   36.092288]   do_syscall_64+0x6d/0x84
> [   36.092291]   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   36.092295] irq event stamp: 120346
> [   36.092296] hardirqs last  enabled at (120345): [<ffffffff8180b97f>] _raw_spin_unlock_irq+0x24/0x39
> [   36.092299] hardirqs last disabled at (120346): [<ffffffff81169e85>] generic_exec_single+0x40/0xb9
> [   36.092303] softirqs last  enabled at (120268): [<ffffffff81c00347>] __do_softirq+0x347/0x387
> [   36.092307] softirqs last disabled at (120133): [<ffffffff810ba4f0>] __irq_exit_rcu+0x67/0xc6
> [   36.092311]
> [   36.092311] other info that might help us debug this:
> [   36.092312]  Possible unsafe locking scenario:
> [   36.092312]
> [   36.092313]        CPU0
> [   36.092313]        ----
> [   36.092314]   lock(&htab->lockdep_key);
> [   36.092315]   <Interrupt>
> [   36.092316]     lock(&htab->lockdep_key);
> [   36.092318]
> [   36.092318]  *** DEADLOCK ***
> [   36.092318]
> [   36.092318] 3 locks held by perf/1515:
> [   36.092320]  #0: ffff8881b9805cc0 (&cpuctx_mutex){+.+.}-{4:4}, at: perf_event_ctx_lock_nested+0x8e/0xba
> [   36.092327]  #1: ffff8881075ecc20 (&event->child_mutex){+.+.}-{4:4}, at: perf_event_for_each_child+0x35/0x76
> [   36.092332]  #2: ffff8881b9805c20 (&cpuctx_lock){-.-.}-{2:2}, at: perf_ctx_lock+0x12/0x27
> [   36.092339]
> [   36.092339] stack backtrace:
> [   36.092341] CPU: 0 PID: 1515 Comm: perf Tainted: G            E      6.1.0-rc5+ #81
> [   36.092344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [   36.092349] Call Trace:
> [   36.092351]  <NMI>
> [   36.092354]  dump_stack_lvl+0x57/0x81
> [   36.092359]  lock_acquire+0x1f4/0x29a
> [   36.092363]  ? handle_pmi_common+0x13f/0x1f0
> [   36.092366]  ? htab_lock_bucket+0x4d/0x58
> [   36.092371]  _raw_spin_lock_irqsave+0x43/0x7f
> [   36.092374]  ? htab_lock_bucket+0x4d/0x58
> [   36.092377]  htab_lock_bucket+0x4d/0x58
> [   36.092379]  htab_map_update_elem+0x11e/0x220
> [   36.092386]  bpf_prog_f3a535ca81a8128a_bpf_prog2+0x3e/0x42
> [   36.092392]  trace_call_bpf+0x177/0x215
> [   36.092398]  perf_trace_run_bpf_submit+0x52/0xaa
> [   36.092403]  ? x86_pmu_stop+0x97/0x97
> [   36.092407]  perf_trace_nmi_handler+0xb7/0xe0
> [   36.092415]  nmi_handle+0x116/0x254
> [   36.092418]  ? x86_pmu_stop+0x97/0x97
> [   36.092423]  default_do_nmi+0x3d/0xf6
> [   36.092428]  exc_nmi+0xa1/0x109
> [   36.092432]  end_repeat_nmi+0x16/0x67
> [   36.092436] RIP: 0010:wrmsrl+0xd/0x1b
> [   36.092441] Code: 04 01 00 00 c6 84 07 48 01 00 00 01 5b e9 46 15 80 00 5b c3 cc cc cc cc c3 cc cc cc cc 48 89 f2 89 f9 89 f0 48 c1 ea 20 0f 30 <66> 90 c3 cc cc cc cc 31 d2 e9 2f 04 49 00 0f 1f 44 00 00 40 0f6
> [   36.092443] RSP: 0018:ffffc900043dfc48 EFLAGS: 00000002
> [   36.092445] RAX: 000000000000000f RBX: ffff8881b96153e0 RCX: 000000000000038f
> [   36.092447] RDX: 0000000000000007 RSI: 000000070000000f RDI: 000000000000038f
> [   36.092449] RBP: 000000070000000f R08: ffffffffffffffff R09: ffff8881053bdaa8
> [   36.092451] R10: ffff8881b9805d40 R11: 0000000000000005 R12: ffff8881b9805c00
> [   36.092452] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8881075ec970
> [   36.092460]  ? wrmsrl+0xd/0x1b
> [   36.092465]  ? wrmsrl+0xd/0x1b
> [   36.092469]  </NMI>
> [   36.092469]  <TASK>
> [   36.092470]  __intel_pmu_enable_all.constprop.0+0x7c/0xaf
> [   36.092475]  event_function+0xb6/0xd3
> [   36.092478]  ? cpu_to_node+0x1a/0x1a
> [   36.092482]  ? cpu_to_node+0x1a/0x1a
> [   36.092485]  remote_function+0x1e/0x4c
> [   36.092489]  generic_exec_single+0x48/0xb9
> [   36.092492]  ? __lock_acquire+0x666/0x6ed
> [   36.092497]  smp_call_function_single+0xbf/0x106
> [   36.092499]  ? cpu_to_node+0x1a/0x1a
> [   36.092504]  ? kvm_sched_clock_read+0x5/0x11
> [   36.092508]  ? __perf_event_task_sched_in+0x13d/0x13d
> [   36.092513]  cpu_function_call+0x47/0x69
> [   36.092516]  ? perf_event_update_time+0x52/0x52
> [   36.092519]  event_function_call+0x89/0x117
> [   36.092521]  ? __perf_event_task_sched_in+0x13d/0x13d
> [   36.092526]  ? _perf_event_disable+0x4a/0x4a
> [   36.092528]  perf_event_for_each_child+0x3d/0x76
> [   36.092532]  ? _perf_event_disable+0x4a/0x4a
> [   36.092533]  _perf_ioctl+0x564/0x590
> [   36.092537]  ? __lock_release+0xd5/0x1b0
> [   36.092543]  ? perf_event_ctx_lock_nested+0x8e/0xba
> [   36.092547]  perf_ioctl+0x42/0x5f
> [   36.092551]  vfs_ioctl+0x1e/0x2f
> [   36.092554]  __do_sys_ioctl+0x66/0x89
> [   36.092559]  do_syscall_64+0x6d/0x84
> [   36.092563]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   36.092566] RIP: 0033:0x7fe7110f362b
> [   36.092569] Code: 0f 1e fa 48 8b 05 5d b8 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2d b8 2c 00 f7 d8 64 89 018
> [   36.092570] RSP: 002b:00007ffebb8e4b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [   36.092573] RAX: ffffffffffffffda RBX: 0000000000002400 RCX: 00007fe7110f362b
> [   36.092575] RDX: 0000000000000000 RSI: 0000000000002400 RDI: 0000000000000013
> [   36.092576] RBP: 00007ffebb8e4b40 R08: 0000000000000001 R09: 000055c1db4a5b40
> [   36.092577] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> [   36.092579] R13: 000055c1db3b2a30 R14: 0000000000000000 R15: 0000000000000000
> [   36.092586]  </TASK>
>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Andrii Nakryiko <andrii@kernel.org>
> Cc: Martin KaFai Lau <martin.lau@linux.dev>
> Cc: Song Liu <song@kernel.org>
> Cc: Yonghong Song <yhs@fb.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: KP Singh <kpsingh@kernel.org>
> Cc: Stanislav Fomichev <sdf@google.com>
> Cc: Hao Luo <haoluo@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
>  kernel/bpf/hashtab.c | 96 +++++++++++++++++---------------------------
>  1 file changed, 36 insertions(+), 60 deletions(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 22855d6ff6d3..429acd97c869 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -80,9 +80,6 @@ struct bucket {
>  	raw_spinlock_t raw_lock;
>  };
>  
> -#define HASHTAB_MAP_LOCK_COUNT 8
> -#define HASHTAB_MAP_LOCK_MASK (HASHTAB_MAP_LOCK_COUNT - 1)
> -
>  struct bpf_htab {
>  	struct bpf_map map;
>  	struct bpf_mem_alloc ma;
> @@ -104,7 +101,6 @@ struct bpf_htab {
>  	u32 elem_size;	/* size of each element in bytes */
>  	u32 hashrnd;
>  	struct lock_class_key lockdep_key;
> -	int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT];
>  };
>  
>  /* each htab element is struct htab_elem + key + value */
> @@ -146,35 +142,26 @@ static void htab_init_buckets(struct bpf_htab *htab)
>  	}
>  }
>  
> -static inline int htab_lock_bucket(const struct bpf_htab *htab,
> -				   struct bucket *b, u32 hash,
> +static inline int htab_lock_bucket(struct bucket *b,
>  				   unsigned long *pflags)
>  {
>  	unsigned long flags;
>  
> -	hash = hash & HASHTAB_MAP_LOCK_MASK;
> -
> -	preempt_disable();
> -	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
> -		__this_cpu_dec(*(htab->map_locked[hash]));
> -		preempt_enable();
> -		return -EBUSY;
> +	if (in_nmi()) {
> +		if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
> +			return -EBUSY;
> +	} else {
> +		raw_spin_lock_irqsave(&b->raw_lock, flags);
>  	}
>  
> -	raw_spin_lock_irqsave(&b->raw_lock, flags);
>  	*pflags = flags;
> -
>  	return 0;
>  }
map_locked is also used to prevent the re-entrance of htab_lock_bucket() on the
same CPU, so only check in_nmi() is not enough.
>  
> -static inline void htab_unlock_bucket(const struct bpf_htab *htab,
> -				      struct bucket *b, u32 hash,
> +static inline void htab_unlock_bucket(struct bucket *b,
>  				      unsigned long flags)
>  {
> -	hash = hash & HASHTAB_MAP_LOCK_MASK;
>  	raw_spin_unlock_irqrestore(&b->raw_lock, flags);
> -	__this_cpu_dec(*(htab->map_locked[hash]));
> -	preempt_enable();
>  }
>  
>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
> @@ -467,7 +454,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>  	bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
>  	bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
>  	struct bpf_htab *htab;
> -	int err, i;
> +	int err;
>  
>  	htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
>  	if (!htab)
> @@ -513,15 +500,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>  	if (!htab->buckets)
>  		goto free_htab;
>  
> -	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) {
> -		htab->map_locked[i] = bpf_map_alloc_percpu(&htab->map,
> -							   sizeof(int),
> -							   sizeof(int),
> -							   GFP_USER);
> -		if (!htab->map_locked[i])
> -			goto free_map_locked;
> -	}
> -
>  	if (htab->map.map_flags & BPF_F_ZERO_SEED)
>  		htab->hashrnd = 0;
>  	else
> @@ -549,13 +527,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>  	if (htab->use_percpu_counter) {
>  		err = percpu_counter_init(&htab->pcount, 0, GFP_KERNEL);
>  		if (err)
> -			goto free_map_locked;
> +			goto free_buckets;
>  	}
>  
>  	if (prealloc) {
>  		err = prealloc_init(htab);
>  		if (err)
> -			goto free_map_locked;
> +			goto free_buckets;
>  
>  		if (!percpu && !lru) {
>  			/* lru itself can remove the least used element, so
> @@ -568,12 +546,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>  	} else {
>  		err = bpf_mem_alloc_init(&htab->ma, htab->elem_size, false);
>  		if (err)
> -			goto free_map_locked;
> +			goto free_buckets;
>  		if (percpu) {
>  			err = bpf_mem_alloc_init(&htab->pcpu_ma,
>  						 round_up(htab->map.value_size, 8), true);
>  			if (err)
> -				goto free_map_locked;
> +				goto free_buckets;
>  		}
>  	}
>  
> @@ -581,11 +559,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>  
>  free_prealloc:
>  	prealloc_destroy(htab);
> -free_map_locked:
> +free_buckets:
>  	if (htab->use_percpu_counter)
>  		percpu_counter_destroy(&htab->pcount);
> -	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
> -		free_percpu(htab->map_locked[i]);
> +
>  	bpf_map_area_free(htab->buckets);
>  	bpf_mem_alloc_destroy(&htab->pcpu_ma);
>  	bpf_mem_alloc_destroy(&htab->ma);
> @@ -782,7 +759,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
>  	b = __select_bucket(htab, tgt_l->hash);
>  	head = &b->head;
>  
> -	ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags);
> +	ret = htab_lock_bucket(b, &flags);
>  	if (ret)
>  		return false;
>  
> @@ -793,7 +770,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
>  			break;
>  		}
>  
> -	htab_unlock_bucket(htab, b, tgt_l->hash, flags);
> +	htab_unlock_bucket(b, flags);
>  
>  	return l == tgt_l;
>  }
> @@ -1107,7 +1084,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
>  		 */
>  	}
>  
> -	ret = htab_lock_bucket(htab, b, hash, &flags);
> +	ret = htab_lock_bucket(b, &flags);
>  	if (ret)
>  		return ret;
>  
> @@ -1152,7 +1129,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
>  	}
>  	ret = 0;
>  err:
> -	htab_unlock_bucket(htab, b, hash, flags);
> +	htab_unlock_bucket(b, flags);
>  	return ret;
>  }
>  
> @@ -1198,7 +1175,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>  	copy_map_value(&htab->map,
>  		       l_new->key + round_up(map->key_size, 8), value);
>  
> -	ret = htab_lock_bucket(htab, b, hash, &flags);
> +	ret = htab_lock_bucket(b, &flags);
>  	if (ret)
>  		return ret;
>  
> @@ -1219,7 +1196,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>  	ret = 0;
>  
>  err:
> -	htab_unlock_bucket(htab, b, hash, flags);
> +	htab_unlock_bucket(b, flags);
>  
>  	if (ret)
>  		htab_lru_push_free(htab, l_new);
> @@ -1255,7 +1232,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
>  	b = __select_bucket(htab, hash);
>  	head = &b->head;
>  
> -	ret = htab_lock_bucket(htab, b, hash, &flags);
> +	ret = htab_lock_bucket(b, &flags);
>  	if (ret)
>  		return ret;
>  
> @@ -1280,7 +1257,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
>  	}
>  	ret = 0;
>  err:
> -	htab_unlock_bucket(htab, b, hash, flags);
> +	htab_unlock_bucket(b, flags);
>  	return ret;
>  }
>  
> @@ -1321,7 +1298,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
>  			return -ENOMEM;
>  	}
>  
> -	ret = htab_lock_bucket(htab, b, hash, &flags);
> +	ret = htab_lock_bucket(b, &flags);
>  	if (ret)
>  		return ret;
>  
> @@ -1345,7 +1322,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
>  	}
>  	ret = 0;
>  err:
> -	htab_unlock_bucket(htab, b, hash, flags);
> +	htab_unlock_bucket(b, flags);
>  	if (l_new)
>  		bpf_lru_push_free(&htab->lru, &l_new->lru_node);
>  	return ret;
> @@ -1384,7 +1361,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
>  	b = __select_bucket(htab, hash);
>  	head = &b->head;
>  
> -	ret = htab_lock_bucket(htab, b, hash, &flags);
> +	ret = htab_lock_bucket(b, &flags);
>  	if (ret)
>  		return ret;
>  
> @@ -1397,7 +1374,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
>  		ret = -ENOENT;
>  	}
>  
> -	htab_unlock_bucket(htab, b, hash, flags);
> +	htab_unlock_bucket(b, flags);
>  	return ret;
>  }
>  
> @@ -1420,7 +1397,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
>  	b = __select_bucket(htab, hash);
>  	head = &b->head;
>  
> -	ret = htab_lock_bucket(htab, b, hash, &flags);
> +	ret = htab_lock_bucket(b, &flags);
>  	if (ret)
>  		return ret;
>  
> @@ -1431,7 +1408,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
>  	else
>  		ret = -ENOENT;
>  
> -	htab_unlock_bucket(htab, b, hash, flags);
> +	htab_unlock_bucket(b, flags);
>  	if (l)
>  		htab_lru_push_free(htab, l);
>  	return ret;
> @@ -1494,7 +1471,6 @@ static void htab_map_free_timers(struct bpf_map *map)
>  static void htab_map_free(struct bpf_map *map)
>  {
>  	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
> -	int i;
>  
>  	/* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback.
>  	 * bpf_free_used_maps() is called after bpf prog is no longer executing.
> @@ -1517,10 +1493,10 @@ static void htab_map_free(struct bpf_map *map)
>  	bpf_map_area_free(htab->buckets);
>  	bpf_mem_alloc_destroy(&htab->pcpu_ma);
>  	bpf_mem_alloc_destroy(&htab->ma);
> +
>  	if (htab->use_percpu_counter)
>  		percpu_counter_destroy(&htab->pcount);
> -	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
> -		free_percpu(htab->map_locked[i]);
> +
>  	lockdep_unregister_key(&htab->lockdep_key);
>  	bpf_map_area_free(htab);
>  }
> @@ -1564,7 +1540,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
>  	b = __select_bucket(htab, hash);
>  	head = &b->head;
>  
> -	ret = htab_lock_bucket(htab, b, hash, &bflags);
> +	ret = htab_lock_bucket(b, &bflags);
>  	if (ret)
>  		return ret;
>  
> @@ -1602,7 +1578,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
>  			free_htab_elem(htab, l);
>  	}
>  
> -	htab_unlock_bucket(htab, b, hash, bflags);
> +	htab_unlock_bucket(b, bflags);
>  
>  	if (is_lru_map && l)
>  		htab_lru_push_free(htab, l);
> @@ -1720,7 +1696,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>  	head = &b->head;
>  	/* do not grab the lock unless need it (bucket_cnt > 0). */
>  	if (locked) {
> -		ret = htab_lock_bucket(htab, b, batch, &flags);
> +		ret = htab_lock_bucket(b, &flags);
>  		if (ret) {
>  			rcu_read_unlock();
>  			bpf_enable_instrumentation();
> @@ -1743,7 +1719,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>  		/* Note that since bucket_cnt > 0 here, it is implicit
>  		 * that the locked was grabbed, so release it.
>  		 */
> -		htab_unlock_bucket(htab, b, batch, flags);
> +		htab_unlock_bucket(b, flags);
>  		rcu_read_unlock();
>  		bpf_enable_instrumentation();
>  		goto after_loop;
> @@ -1754,7 +1730,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>  		/* Note that since bucket_cnt > 0 here, it is implicit
>  		 * that the locked was grabbed, so release it.
>  		 */
> -		htab_unlock_bucket(htab, b, batch, flags);
> +		htab_unlock_bucket(b, flags);
>  		rcu_read_unlock();
>  		bpf_enable_instrumentation();
>  		kvfree(keys);
> @@ -1815,7 +1791,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>  		dst_val += value_size;
>  	}
>  
> -	htab_unlock_bucket(htab, b, batch, flags);
> +	htab_unlock_bucket(b, flags);
>  	locked = false;
>  
>  	while (node_to_free) {


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-22  1:15   ` Hou Tao
@ 2022-11-22  3:12     ` Tonghao Zhang
  2022-11-22  4:01       ` Hou Tao
  0 siblings, 1 reply; 33+ messages in thread
From: Tonghao Zhang @ 2022-11-22  3:12 UTC (permalink / raw)
  To: Hou Tao
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf

.

On Tue, Nov 22, 2022 at 9:16 AM Hou Tao <houtao1@huawei.com> wrote:
>
> Hi,
>
> On 11/21/2022 6:05 PM, xiangxia.m.yue@gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > The commit 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked"),
> > try to fix deadlock, but in some case, the deadlock occurs:
> >
> > * CPUn in task context with K1, and taking lock.
> > * CPUn interrupted by NMI context, with K2.
> > * They are using the same bucket, but different map_locked.
> It is possible when n_buckets is less than HASHTAB_MAP_LOCK_COUNT (e.g.,
> n_bucket=4). If using hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) as the
> index of map_locked, I think the deadlock will be gone.
Yes, but for saving memory, HASHTAB_MAP_LOCK_MASK should not be too
large(now this value is 8-1).
if user define n_bucket ,e.g 8192, the part of bucket only are
selected via hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1).
> >           | Task
> >           |
> >       +---v----+
> >       |  CPUn  |
> >       +---^----+
> >
> >           | NMI
> >
> > Anyway, the lockdep still warn:
> > [   36.092222] ================================
> > [   36.092230] WARNING: inconsistent lock state
> > [   36.092234] 6.1.0-rc5+ #81 Tainted: G            E
> > [   36.092236] --------------------------------
> > [   36.092237] inconsistent {INITIAL USE} -> {IN-NMI} usage.
> > [   36.092238] perf/1515 [HC1[1]:SC0[0]:HE0:SE1] takes:
> > [   36.092242] ffff888341acd1a0 (&htab->lockdep_key){....}-{2:2}, at: htab_lock_bucket+0x4d/0x58
> > [   36.092253] {INITIAL USE} state was registered at:
> > [   36.092255]   mark_usage+0x1d/0x11d
> > [   36.092262]   __lock_acquire+0x3c9/0x6ed
> > [   36.092266]   lock_acquire+0x23d/0x29a
> > [   36.092270]   _raw_spin_lock_irqsave+0x43/0x7f
> > [   36.092274]   htab_lock_bucket+0x4d/0x58
> > [   36.092276]   htab_map_delete_elem+0x82/0xfb
> > [   36.092278]   map_delete_elem+0x156/0x1ac
> > [   36.092282]   __sys_bpf+0x138/0xb71
> > [   36.092285]   __do_sys_bpf+0xd/0x15
> > [   36.092288]   do_syscall_64+0x6d/0x84
> > [   36.092291]   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > [   36.092295] irq event stamp: 120346
> > [   36.092296] hardirqs last  enabled at (120345): [<ffffffff8180b97f>] _raw_spin_unlock_irq+0x24/0x39
> > [   36.092299] hardirqs last disabled at (120346): [<ffffffff81169e85>] generic_exec_single+0x40/0xb9
> > [   36.092303] softirqs last  enabled at (120268): [<ffffffff81c00347>] __do_softirq+0x347/0x387
> > [   36.092307] softirqs last disabled at (120133): [<ffffffff810ba4f0>] __irq_exit_rcu+0x67/0xc6
> > [   36.092311]
> > [   36.092311] other info that might help us debug this:
> > [   36.092312]  Possible unsafe locking scenario:
> > [   36.092312]
> > [   36.092313]        CPU0
> > [   36.092313]        ----
> > [   36.092314]   lock(&htab->lockdep_key);
> > [   36.092315]   <Interrupt>
> > [   36.092316]     lock(&htab->lockdep_key);
> > [   36.092318]
> > [   36.092318]  *** DEADLOCK ***
> > [   36.092318]
> > [   36.092318] 3 locks held by perf/1515:
> > [   36.092320]  #0: ffff8881b9805cc0 (&cpuctx_mutex){+.+.}-{4:4}, at: perf_event_ctx_lock_nested+0x8e/0xba
> > [   36.092327]  #1: ffff8881075ecc20 (&event->child_mutex){+.+.}-{4:4}, at: perf_event_for_each_child+0x35/0x76
> > [   36.092332]  #2: ffff8881b9805c20 (&cpuctx_lock){-.-.}-{2:2}, at: perf_ctx_lock+0x12/0x27
> > [   36.092339]
> > [   36.092339] stack backtrace:
> > [   36.092341] CPU: 0 PID: 1515 Comm: perf Tainted: G            E      6.1.0-rc5+ #81
> > [   36.092344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [   36.092349] Call Trace:
> > [   36.092351]  <NMI>
> > [   36.092354]  dump_stack_lvl+0x57/0x81
> > [   36.092359]  lock_acquire+0x1f4/0x29a
> > [   36.092363]  ? handle_pmi_common+0x13f/0x1f0
> > [   36.092366]  ? htab_lock_bucket+0x4d/0x58
> > [   36.092371]  _raw_spin_lock_irqsave+0x43/0x7f
> > [   36.092374]  ? htab_lock_bucket+0x4d/0x58
> > [   36.092377]  htab_lock_bucket+0x4d/0x58
> > [   36.092379]  htab_map_update_elem+0x11e/0x220
> > [   36.092386]  bpf_prog_f3a535ca81a8128a_bpf_prog2+0x3e/0x42
> > [   36.092392]  trace_call_bpf+0x177/0x215
> > [   36.092398]  perf_trace_run_bpf_submit+0x52/0xaa
> > [   36.092403]  ? x86_pmu_stop+0x97/0x97
> > [   36.092407]  perf_trace_nmi_handler+0xb7/0xe0
> > [   36.092415]  nmi_handle+0x116/0x254
> > [   36.092418]  ? x86_pmu_stop+0x97/0x97
> > [   36.092423]  default_do_nmi+0x3d/0xf6
> > [   36.092428]  exc_nmi+0xa1/0x109
> > [   36.092432]  end_repeat_nmi+0x16/0x67
> > [   36.092436] RIP: 0010:wrmsrl+0xd/0x1b
> > [   36.092441] Code: 04 01 00 00 c6 84 07 48 01 00 00 01 5b e9 46 15 80 00 5b c3 cc cc cc cc c3 cc cc cc cc 48 89 f2 89 f9 89 f0 48 c1 ea 20 0f 30 <66> 90 c3 cc cc cc cc 31 d2 e9 2f 04 49 00 0f 1f 44 00 00 40 0f6
> > [   36.092443] RSP: 0018:ffffc900043dfc48 EFLAGS: 00000002
> > [   36.092445] RAX: 000000000000000f RBX: ffff8881b96153e0 RCX: 000000000000038f
> > [   36.092447] RDX: 0000000000000007 RSI: 000000070000000f RDI: 000000000000038f
> > [   36.092449] RBP: 000000070000000f R08: ffffffffffffffff R09: ffff8881053bdaa8
> > [   36.092451] R10: ffff8881b9805d40 R11: 0000000000000005 R12: ffff8881b9805c00
> > [   36.092452] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8881075ec970
> > [   36.092460]  ? wrmsrl+0xd/0x1b
> > [   36.092465]  ? wrmsrl+0xd/0x1b
> > [   36.092469]  </NMI>
> > [   36.092469]  <TASK>
> > [   36.092470]  __intel_pmu_enable_all.constprop.0+0x7c/0xaf
> > [   36.092475]  event_function+0xb6/0xd3
> > [   36.092478]  ? cpu_to_node+0x1a/0x1a
> > [   36.092482]  ? cpu_to_node+0x1a/0x1a
> > [   36.092485]  remote_function+0x1e/0x4c
> > [   36.092489]  generic_exec_single+0x48/0xb9
> > [   36.092492]  ? __lock_acquire+0x666/0x6ed
> > [   36.092497]  smp_call_function_single+0xbf/0x106
> > [   36.092499]  ? cpu_to_node+0x1a/0x1a
> > [   36.092504]  ? kvm_sched_clock_read+0x5/0x11
> > [   36.092508]  ? __perf_event_task_sched_in+0x13d/0x13d
> > [   36.092513]  cpu_function_call+0x47/0x69
> > [   36.092516]  ? perf_event_update_time+0x52/0x52
> > [   36.092519]  event_function_call+0x89/0x117
> > [   36.092521]  ? __perf_event_task_sched_in+0x13d/0x13d
> > [   36.092526]  ? _perf_event_disable+0x4a/0x4a
> > [   36.092528]  perf_event_for_each_child+0x3d/0x76
> > [   36.092532]  ? _perf_event_disable+0x4a/0x4a
> > [   36.092533]  _perf_ioctl+0x564/0x590
> > [   36.092537]  ? __lock_release+0xd5/0x1b0
> > [   36.092543]  ? perf_event_ctx_lock_nested+0x8e/0xba
> > [   36.092547]  perf_ioctl+0x42/0x5f
> > [   36.092551]  vfs_ioctl+0x1e/0x2f
> > [   36.092554]  __do_sys_ioctl+0x66/0x89
> > [   36.092559]  do_syscall_64+0x6d/0x84
> > [   36.092563]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > [   36.092566] RIP: 0033:0x7fe7110f362b
> > [   36.092569] Code: 0f 1e fa 48 8b 05 5d b8 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2d b8 2c 00 f7 d8 64 89 018
> > [   36.092570] RSP: 002b:00007ffebb8e4b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > [   36.092573] RAX: ffffffffffffffda RBX: 0000000000002400 RCX: 00007fe7110f362b
> > [   36.092575] RDX: 0000000000000000 RSI: 0000000000002400 RDI: 0000000000000013
> > [   36.092576] RBP: 00007ffebb8e4b40 R08: 0000000000000001 R09: 000055c1db4a5b40
> > [   36.092577] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > [   36.092579] R13: 000055c1db3b2a30 R14: 0000000000000000 R15: 0000000000000000
> > [   36.092586]  </TASK>
> >
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: Andrii Nakryiko <andrii@kernel.org>
> > Cc: Martin KaFai Lau <martin.lau@linux.dev>
> > Cc: Song Liu <song@kernel.org>
> > Cc: Yonghong Song <yhs@fb.com>
> > Cc: John Fastabend <john.fastabend@gmail.com>
> > Cc: KP Singh <kpsingh@kernel.org>
> > Cc: Stanislav Fomichev <sdf@google.com>
> > Cc: Hao Luo <haoluo@google.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > ---
> >  kernel/bpf/hashtab.c | 96 +++++++++++++++++---------------------------
> >  1 file changed, 36 insertions(+), 60 deletions(-)
> >
> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > index 22855d6ff6d3..429acd97c869 100644
> > --- a/kernel/bpf/hashtab.c
> > +++ b/kernel/bpf/hashtab.c
> > @@ -80,9 +80,6 @@ struct bucket {
> >       raw_spinlock_t raw_lock;
> >  };
> >
> > -#define HASHTAB_MAP_LOCK_COUNT 8
> > -#define HASHTAB_MAP_LOCK_MASK (HASHTAB_MAP_LOCK_COUNT - 1)
> > -
> >  struct bpf_htab {
> >       struct bpf_map map;
> >       struct bpf_mem_alloc ma;
> > @@ -104,7 +101,6 @@ struct bpf_htab {
> >       u32 elem_size;  /* size of each element in bytes */
> >       u32 hashrnd;
> >       struct lock_class_key lockdep_key;
> > -     int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT];
> >  };
> >
> >  /* each htab element is struct htab_elem + key + value */
> > @@ -146,35 +142,26 @@ static void htab_init_buckets(struct bpf_htab *htab)
> >       }
> >  }
> >
> > -static inline int htab_lock_bucket(const struct bpf_htab *htab,
> > -                                struct bucket *b, u32 hash,
> > +static inline int htab_lock_bucket(struct bucket *b,
> >                                  unsigned long *pflags)
> >  {
> >       unsigned long flags;
> >
> > -     hash = hash & HASHTAB_MAP_LOCK_MASK;
> > -
> > -     preempt_disable();
> > -     if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
> > -             __this_cpu_dec(*(htab->map_locked[hash]));
> > -             preempt_enable();
> > -             return -EBUSY;
> > +     if (in_nmi()) {
> > +             if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
> > +                     return -EBUSY;
> > +     } else {
> > +             raw_spin_lock_irqsave(&b->raw_lock, flags);
> >       }
> >
> > -     raw_spin_lock_irqsave(&b->raw_lock, flags);
> >       *pflags = flags;
> > -
> >       return 0;
> >  }
> map_locked is also used to prevent the re-entrance of htab_lock_bucket() on the
> same CPU, so only check in_nmi() is not enough.
NMI, IRQ, and preempt may interrupt the task context.
In htab_lock_bucket, raw_spin_lock_irqsave disable the preempt and
irq. so only NMI may interrupt the codes, right ?

> > -static inline void htab_unlock_bucket(const struct bpf_htab *htab,
> > -                                   struct bucket *b, u32 hash,
> > +static inline void htab_unlock_bucket(struct bucket *b,
> >                                     unsigned long flags)
> >  {
> > -     hash = hash & HASHTAB_MAP_LOCK_MASK;
> >       raw_spin_unlock_irqrestore(&b->raw_lock, flags);
> > -     __this_cpu_dec(*(htab->map_locked[hash]));
> > -     preempt_enable();
> >  }
> >
> >  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
> > @@ -467,7 +454,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >       bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
> >       bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
> >       struct bpf_htab *htab;
> > -     int err, i;
> > +     int err;
> >
> >       htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
> >       if (!htab)
> > @@ -513,15 +500,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >       if (!htab->buckets)
> >               goto free_htab;
> >
> > -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) {
> > -             htab->map_locked[i] = bpf_map_alloc_percpu(&htab->map,
> > -                                                        sizeof(int),
> > -                                                        sizeof(int),
> > -                                                        GFP_USER);
> > -             if (!htab->map_locked[i])
> > -                     goto free_map_locked;
> > -     }
> > -
> >       if (htab->map.map_flags & BPF_F_ZERO_SEED)
> >               htab->hashrnd = 0;
> >       else
> > @@ -549,13 +527,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >       if (htab->use_percpu_counter) {
> >               err = percpu_counter_init(&htab->pcount, 0, GFP_KERNEL);
> >               if (err)
> > -                     goto free_map_locked;
> > +                     goto free_buckets;
> >       }
> >
> >       if (prealloc) {
> >               err = prealloc_init(htab);
> >               if (err)
> > -                     goto free_map_locked;
> > +                     goto free_buckets;
> >
> >               if (!percpu && !lru) {
> >                       /* lru itself can remove the least used element, so
> > @@ -568,12 +546,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >       } else {
> >               err = bpf_mem_alloc_init(&htab->ma, htab->elem_size, false);
> >               if (err)
> > -                     goto free_map_locked;
> > +                     goto free_buckets;
> >               if (percpu) {
> >                       err = bpf_mem_alloc_init(&htab->pcpu_ma,
> >                                                round_up(htab->map.value_size, 8), true);
> >                       if (err)
> > -                             goto free_map_locked;
> > +                             goto free_buckets;
> >               }
> >       }
> >
> > @@ -581,11 +559,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >
> >  free_prealloc:
> >       prealloc_destroy(htab);
> > -free_map_locked:
> > +free_buckets:
> >       if (htab->use_percpu_counter)
> >               percpu_counter_destroy(&htab->pcount);
> > -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
> > -             free_percpu(htab->map_locked[i]);
> > +
> >       bpf_map_area_free(htab->buckets);
> >       bpf_mem_alloc_destroy(&htab->pcpu_ma);
> >       bpf_mem_alloc_destroy(&htab->ma);
> > @@ -782,7 +759,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
> >       b = __select_bucket(htab, tgt_l->hash);
> >       head = &b->head;
> >
> > -     ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags);
> > +     ret = htab_lock_bucket(b, &flags);
> >       if (ret)
> >               return false;
> >
> > @@ -793,7 +770,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
> >                       break;
> >               }
> >
> > -     htab_unlock_bucket(htab, b, tgt_l->hash, flags);
> > +     htab_unlock_bucket(b, flags);
> >
> >       return l == tgt_l;
> >  }
> > @@ -1107,7 +1084,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
> >                */
> >       }
> >
> > -     ret = htab_lock_bucket(htab, b, hash, &flags);
> > +     ret = htab_lock_bucket(b, &flags);
> >       if (ret)
> >               return ret;
> >
> > @@ -1152,7 +1129,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
> >       }
> >       ret = 0;
> >  err:
> > -     htab_unlock_bucket(htab, b, hash, flags);
> > +     htab_unlock_bucket(b, flags);
> >       return ret;
> >  }
> >
> > @@ -1198,7 +1175,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
> >       copy_map_value(&htab->map,
> >                      l_new->key + round_up(map->key_size, 8), value);
> >
> > -     ret = htab_lock_bucket(htab, b, hash, &flags);
> > +     ret = htab_lock_bucket(b, &flags);
> >       if (ret)
> >               return ret;
> >
> > @@ -1219,7 +1196,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
> >       ret = 0;
> >
> >  err:
> > -     htab_unlock_bucket(htab, b, hash, flags);
> > +     htab_unlock_bucket(b, flags);
> >
> >       if (ret)
> >               htab_lru_push_free(htab, l_new);
> > @@ -1255,7 +1232,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
> >       b = __select_bucket(htab, hash);
> >       head = &b->head;
> >
> > -     ret = htab_lock_bucket(htab, b, hash, &flags);
> > +     ret = htab_lock_bucket(b, &flags);
> >       if (ret)
> >               return ret;
> >
> > @@ -1280,7 +1257,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
> >       }
> >       ret = 0;
> >  err:
> > -     htab_unlock_bucket(htab, b, hash, flags);
> > +     htab_unlock_bucket(b, flags);
> >       return ret;
> >  }
> >
> > @@ -1321,7 +1298,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
> >                       return -ENOMEM;
> >       }
> >
> > -     ret = htab_lock_bucket(htab, b, hash, &flags);
> > +     ret = htab_lock_bucket(b, &flags);
> >       if (ret)
> >               return ret;
> >
> > @@ -1345,7 +1322,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
> >       }
> >       ret = 0;
> >  err:
> > -     htab_unlock_bucket(htab, b, hash, flags);
> > +     htab_unlock_bucket(b, flags);
> >       if (l_new)
> >               bpf_lru_push_free(&htab->lru, &l_new->lru_node);
> >       return ret;
> > @@ -1384,7 +1361,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
> >       b = __select_bucket(htab, hash);
> >       head = &b->head;
> >
> > -     ret = htab_lock_bucket(htab, b, hash, &flags);
> > +     ret = htab_lock_bucket(b, &flags);
> >       if (ret)
> >               return ret;
> >
> > @@ -1397,7 +1374,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
> >               ret = -ENOENT;
> >       }
> >
> > -     htab_unlock_bucket(htab, b, hash, flags);
> > +     htab_unlock_bucket(b, flags);
> >       return ret;
> >  }
> >
> > @@ -1420,7 +1397,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
> >       b = __select_bucket(htab, hash);
> >       head = &b->head;
> >
> > -     ret = htab_lock_bucket(htab, b, hash, &flags);
> > +     ret = htab_lock_bucket(b, &flags);
> >       if (ret)
> >               return ret;
> >
> > @@ -1431,7 +1408,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
> >       else
> >               ret = -ENOENT;
> >
> > -     htab_unlock_bucket(htab, b, hash, flags);
> > +     htab_unlock_bucket(b, flags);
> >       if (l)
> >               htab_lru_push_free(htab, l);
> >       return ret;
> > @@ -1494,7 +1471,6 @@ static void htab_map_free_timers(struct bpf_map *map)
> >  static void htab_map_free(struct bpf_map *map)
> >  {
> >       struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
> > -     int i;
> >
> >       /* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback.
> >        * bpf_free_used_maps() is called after bpf prog is no longer executing.
> > @@ -1517,10 +1493,10 @@ static void htab_map_free(struct bpf_map *map)
> >       bpf_map_area_free(htab->buckets);
> >       bpf_mem_alloc_destroy(&htab->pcpu_ma);
> >       bpf_mem_alloc_destroy(&htab->ma);
> > +
> >       if (htab->use_percpu_counter)
> >               percpu_counter_destroy(&htab->pcount);
> > -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
> > -             free_percpu(htab->map_locked[i]);
> > +
> >       lockdep_unregister_key(&htab->lockdep_key);
> >       bpf_map_area_free(htab);
> >  }
> > @@ -1564,7 +1540,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
> >       b = __select_bucket(htab, hash);
> >       head = &b->head;
> >
> > -     ret = htab_lock_bucket(htab, b, hash, &bflags);
> > +     ret = htab_lock_bucket(b, &bflags);
> >       if (ret)
> >               return ret;
> >
> > @@ -1602,7 +1578,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
> >                       free_htab_elem(htab, l);
> >       }
> >
> > -     htab_unlock_bucket(htab, b, hash, bflags);
> > +     htab_unlock_bucket(b, bflags);
> >
> >       if (is_lru_map && l)
> >               htab_lru_push_free(htab, l);
> > @@ -1720,7 +1696,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >       head = &b->head;
> >       /* do not grab the lock unless need it (bucket_cnt > 0). */
> >       if (locked) {
> > -             ret = htab_lock_bucket(htab, b, batch, &flags);
> > +             ret = htab_lock_bucket(b, &flags);
> >               if (ret) {
> >                       rcu_read_unlock();
> >                       bpf_enable_instrumentation();
> > @@ -1743,7 +1719,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >               /* Note that since bucket_cnt > 0 here, it is implicit
> >                * that the locked was grabbed, so release it.
> >                */
> > -             htab_unlock_bucket(htab, b, batch, flags);
> > +             htab_unlock_bucket(b, flags);
> >               rcu_read_unlock();
> >               bpf_enable_instrumentation();
> >               goto after_loop;
> > @@ -1754,7 +1730,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >               /* Note that since bucket_cnt > 0 here, it is implicit
> >                * that the locked was grabbed, so release it.
> >                */
> > -             htab_unlock_bucket(htab, b, batch, flags);
> > +             htab_unlock_bucket(b, flags);
> >               rcu_read_unlock();
> >               bpf_enable_instrumentation();
> >               kvfree(keys);
> > @@ -1815,7 +1791,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >               dst_val += value_size;
> >       }
> >
> > -     htab_unlock_bucket(htab, b, batch, flags);
> > +     htab_unlock_bucket(b, flags);
> >       locked = false;
> >
> >       while (node_to_free) {
>


-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-22  3:12     ` Tonghao Zhang
@ 2022-11-22  4:01       ` Hou Tao
  2022-11-22  4:06         ` Hou Tao
  0 siblings, 1 reply; 33+ messages in thread
From: Hou Tao @ 2022-11-22  4:01 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf

Hi,

On 11/22/2022 11:12 AM, Tonghao Zhang wrote:
> .
>
> On Tue, Nov 22, 2022 at 9:16 AM Hou Tao <houtao1@huawei.com> wrote:
>> Hi,
>>
>> On 11/21/2022 6:05 PM, xiangxia.m.yue@gmail.com wrote:
>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>
>>> The commit 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked"),
>>> try to fix deadlock, but in some case, the deadlock occurs:
>>>
>>> * CPUn in task context with K1, and taking lock.
>>> * CPUn interrupted by NMI context, with K2.
>>> * They are using the same bucket, but different map_locked.
>> It is possible when n_buckets is less than HASHTAB_MAP_LOCK_COUNT (e.g.,
>> n_bucket=4). If using hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) as the
>> index of map_locked, I think the deadlock will be gone.
> Yes, but for saving memory, HASHTAB_MAP_LOCK_MASK should not be too
> large(now this value is 8-1).
> if user define n_bucket ,e.g 8192, the part of bucket only are
> selected via hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1).
SNIP
>
>>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>>> index 22855d6ff6d3..429acd97c869 100644
>>> --- a/kernel/bpf/hashtab.c
>>> +++ b/kernel/bpf/hashtab.c
>>> @@ -80,9 +80,6 @@ struct bucket {
>>>       raw_spinlock_t raw_lock;
>>>  };
>>>
>>> -#define HASHTAB_MAP_LOCK_COUNT 8
>>> -#define HASHTAB_MAP_LOCK_MASK (HASHTAB_MAP_LOCK_COUNT - 1)
>>> -
>>>  struct bpf_htab {
>>>       struct bpf_map map;
>>>       struct bpf_mem_alloc ma;
>>> @@ -104,7 +101,6 @@ struct bpf_htab {
>>>       u32 elem_size;  /* size of each element in bytes */
>>>       u32 hashrnd;
>>>       struct lock_class_key lockdep_key;
>>> -     int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT];
>>>  };
>>>
>>>  /* each htab element is struct htab_elem + key + value */
>>> @@ -146,35 +142,26 @@ static void htab_init_buckets(struct bpf_htab *htab)
>>>       }
>>>  }
>>>
>>> -static inline int htab_lock_bucket(const struct bpf_htab *htab,
>>> -                                struct bucket *b, u32 hash,
>>> +static inline int htab_lock_bucket(struct bucket *b,
>>>                                  unsigned long *pflags)
>>>  {
>>>       unsigned long flags;
>>>
>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
>>> -
>>> -     preempt_disable();
>>> -     if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>>> -             __this_cpu_dec(*(htab->map_locked[hash]));
>>> -             preempt_enable();
>>> -             return -EBUSY;
>>> +     if (in_nmi()) {
>>> +             if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
>>> +                     return -EBUSY;
>>> +     } else {
>>> +             raw_spin_lock_irqsave(&b->raw_lock, flags);
>>>       }
>>>
>>> -     raw_spin_lock_irqsave(&b->raw_lock, flags);
>>>       *pflags = flags;
>>> -
>>>       return 0;
>>>  }
>> map_locked is also used to prevent the re-entrance of htab_lock_bucket() on the
>> same CPU, so only check in_nmi() is not enough.
> NMI, IRQ, and preempt may interrupt the task context.
> In htab_lock_bucket, raw_spin_lock_irqsave disable the preempt and
> irq. so only NMI may interrupt the codes, right ?
The re-entrance here means the nesting of bpf programs as show below:

bpf_prog A
update map X
    htab_lock_bucket
        raw_spin_lock_irqsave()
    lookup_elem_raw()
        // bpf prog B is attached on lookup_elem_raw()
        bpf prog B
            update map X again and update the element
                htab_lock_bucket()
                    // dead-lock
                    raw_spinlock_irqsave()
.

>>> -static inline void htab_unlock_bucket(const struct bpf_htab *htab,
>>> -                                   struct bucket *b, u32 hash,
>>> +static inline void htab_unlock_bucket(struct bucket *b,
>>>                                     unsigned long flags)
>>>  {
>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
>>>       raw_spin_unlock_irqrestore(&b->raw_lock, flags);
>>> -     __this_cpu_dec(*(htab->map_locked[hash]));
>>> -     preempt_enable();
>>>  }
>>>
>>>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
>>> @@ -467,7 +454,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>       bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
>>>       bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
>>>       struct bpf_htab *htab;
>>> -     int err, i;
>>> +     int err;
>>>
>>>       htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
>>>       if (!htab)
>>> @@ -513,15 +500,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>       if (!htab->buckets)
>>>               goto free_htab;
>>>
>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) {
>>> -             htab->map_locked[i] = bpf_map_alloc_percpu(&htab->map,
>>> -                                                        sizeof(int),
>>> -                                                        sizeof(int),
>>> -                                                        GFP_USER);
>>> -             if (!htab->map_locked[i])
>>> -                     goto free_map_locked;
>>> -     }
>>> -
>>>       if (htab->map.map_flags & BPF_F_ZERO_SEED)
>>>               htab->hashrnd = 0;
>>>       else
>>> @@ -549,13 +527,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>       if (htab->use_percpu_counter) {
>>>               err = percpu_counter_init(&htab->pcount, 0, GFP_KERNEL);
>>>               if (err)
>>> -                     goto free_map_locked;
>>> +                     goto free_buckets;
>>>       }
>>>
>>>       if (prealloc) {
>>>               err = prealloc_init(htab);
>>>               if (err)
>>> -                     goto free_map_locked;
>>> +                     goto free_buckets;
>>>
>>>               if (!percpu && !lru) {
>>>                       /* lru itself can remove the least used element, so
>>> @@ -568,12 +546,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>       } else {
>>>               err = bpf_mem_alloc_init(&htab->ma, htab->elem_size, false);
>>>               if (err)
>>> -                     goto free_map_locked;
>>> +                     goto free_buckets;
>>>               if (percpu) {
>>>                       err = bpf_mem_alloc_init(&htab->pcpu_ma,
>>>                                                round_up(htab->map.value_size, 8), true);
>>>                       if (err)
>>> -                             goto free_map_locked;
>>> +                             goto free_buckets;
>>>               }
>>>       }
>>>
>>> @@ -581,11 +559,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>
>>>  free_prealloc:
>>>       prealloc_destroy(htab);
>>> -free_map_locked:
>>> +free_buckets:
>>>       if (htab->use_percpu_counter)
>>>               percpu_counter_destroy(&htab->pcount);
>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
>>> -             free_percpu(htab->map_locked[i]);
>>> +
>>>       bpf_map_area_free(htab->buckets);
>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
>>>       bpf_mem_alloc_destroy(&htab->ma);
>>> @@ -782,7 +759,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
>>>       b = __select_bucket(htab, tgt_l->hash);
>>>       head = &b->head;
>>>
>>> -     ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags);
>>> +     ret = htab_lock_bucket(b, &flags);
>>>       if (ret)
>>>               return false;
>>>
>>> @@ -793,7 +770,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
>>>                       break;
>>>               }
>>>
>>> -     htab_unlock_bucket(htab, b, tgt_l->hash, flags);
>>> +     htab_unlock_bucket(b, flags);
>>>
>>>       return l == tgt_l;
>>>  }
>>> @@ -1107,7 +1084,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>                */
>>>       }
>>>
>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>> +     ret = htab_lock_bucket(b, &flags);
>>>       if (ret)
>>>               return ret;
>>>
>>> @@ -1152,7 +1129,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>       }
>>>       ret = 0;
>>>  err:
>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>> +     htab_unlock_bucket(b, flags);
>>>       return ret;
>>>  }
>>>
>>> @@ -1198,7 +1175,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>       copy_map_value(&htab->map,
>>>                      l_new->key + round_up(map->key_size, 8), value);
>>>
>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>> +     ret = htab_lock_bucket(b, &flags);
>>>       if (ret)
>>>               return ret;
>>>
>>> @@ -1219,7 +1196,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>       ret = 0;
>>>
>>>  err:
>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>> +     htab_unlock_bucket(b, flags);
>>>
>>>       if (ret)
>>>               htab_lru_push_free(htab, l_new);
>>> @@ -1255,7 +1232,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>       b = __select_bucket(htab, hash);
>>>       head = &b->head;
>>>
>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>> +     ret = htab_lock_bucket(b, &flags);
>>>       if (ret)
>>>               return ret;
>>>
>>> @@ -1280,7 +1257,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>       }
>>>       ret = 0;
>>>  err:
>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>> +     htab_unlock_bucket(b, flags);
>>>       return ret;
>>>  }
>>>
>>> @@ -1321,7 +1298,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>                       return -ENOMEM;
>>>       }
>>>
>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>> +     ret = htab_lock_bucket(b, &flags);
>>>       if (ret)
>>>               return ret;
>>>
>>> @@ -1345,7 +1322,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>       }
>>>       ret = 0;
>>>  err:
>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>> +     htab_unlock_bucket(b, flags);
>>>       if (l_new)
>>>               bpf_lru_push_free(&htab->lru, &l_new->lru_node);
>>>       return ret;
>>> @@ -1384,7 +1361,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
>>>       b = __select_bucket(htab, hash);
>>>       head = &b->head;
>>>
>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>> +     ret = htab_lock_bucket(b, &flags);
>>>       if (ret)
>>>               return ret;
>>>
>>> @@ -1397,7 +1374,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
>>>               ret = -ENOENT;
>>>       }
>>>
>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>> +     htab_unlock_bucket(b, flags);
>>>       return ret;
>>>  }
>>>
>>> @@ -1420,7 +1397,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
>>>       b = __select_bucket(htab, hash);
>>>       head = &b->head;
>>>
>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>> +     ret = htab_lock_bucket(b, &flags);
>>>       if (ret)
>>>               return ret;
>>>
>>> @@ -1431,7 +1408,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
>>>       else
>>>               ret = -ENOENT;
>>>
>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>> +     htab_unlock_bucket(b, flags);
>>>       if (l)
>>>               htab_lru_push_free(htab, l);
>>>       return ret;
>>> @@ -1494,7 +1471,6 @@ static void htab_map_free_timers(struct bpf_map *map)
>>>  static void htab_map_free(struct bpf_map *map)
>>>  {
>>>       struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
>>> -     int i;
>>>
>>>       /* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback.
>>>        * bpf_free_used_maps() is called after bpf prog is no longer executing.
>>> @@ -1517,10 +1493,10 @@ static void htab_map_free(struct bpf_map *map)
>>>       bpf_map_area_free(htab->buckets);
>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
>>>       bpf_mem_alloc_destroy(&htab->ma);
>>> +
>>>       if (htab->use_percpu_counter)
>>>               percpu_counter_destroy(&htab->pcount);
>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
>>> -             free_percpu(htab->map_locked[i]);
>>> +
>>>       lockdep_unregister_key(&htab->lockdep_key);
>>>       bpf_map_area_free(htab);
>>>  }
>>> @@ -1564,7 +1540,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
>>>       b = __select_bucket(htab, hash);
>>>       head = &b->head;
>>>
>>> -     ret = htab_lock_bucket(htab, b, hash, &bflags);
>>> +     ret = htab_lock_bucket(b, &bflags);
>>>       if (ret)
>>>               return ret;
>>>
>>> @@ -1602,7 +1578,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
>>>                       free_htab_elem(htab, l);
>>>       }
>>>
>>> -     htab_unlock_bucket(htab, b, hash, bflags);
>>> +     htab_unlock_bucket(b, bflags);
>>>
>>>       if (is_lru_map && l)
>>>               htab_lru_push_free(htab, l);
>>> @@ -1720,7 +1696,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>       head = &b->head;
>>>       /* do not grab the lock unless need it (bucket_cnt > 0). */
>>>       if (locked) {
>>> -             ret = htab_lock_bucket(htab, b, batch, &flags);
>>> +             ret = htab_lock_bucket(b, &flags);
>>>               if (ret) {
>>>                       rcu_read_unlock();
>>>                       bpf_enable_instrumentation();
>>> @@ -1743,7 +1719,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>               /* Note that since bucket_cnt > 0 here, it is implicit
>>>                * that the locked was grabbed, so release it.
>>>                */
>>> -             htab_unlock_bucket(htab, b, batch, flags);
>>> +             htab_unlock_bucket(b, flags);
>>>               rcu_read_unlock();
>>>               bpf_enable_instrumentation();
>>>               goto after_loop;
>>> @@ -1754,7 +1730,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>               /* Note that since bucket_cnt > 0 here, it is implicit
>>>                * that the locked was grabbed, so release it.
>>>                */
>>> -             htab_unlock_bucket(htab, b, batch, flags);
>>> +             htab_unlock_bucket(b, flags);
>>>               rcu_read_unlock();
>>>               bpf_enable_instrumentation();
>>>               kvfree(keys);
>>> @@ -1815,7 +1791,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>               dst_val += value_size;
>>>       }
>>>
>>> -     htab_unlock_bucket(htab, b, batch, flags);
>>> +     htab_unlock_bucket(b, flags);
>>>       locked = false;
>>>
>>>       while (node_to_free) {
>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-22  4:01       ` Hou Tao
@ 2022-11-22  4:06         ` Hou Tao
  2022-11-24 12:57           ` Tonghao Zhang
  0 siblings, 1 reply; 33+ messages in thread
From: Hou Tao @ 2022-11-22  4:06 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf

Hi,

On 11/22/2022 12:01 PM, Hou Tao wrote:
> Hi,
>
> On 11/22/2022 11:12 AM, Tonghao Zhang wrote:
>> .
>>
>> On Tue, Nov 22, 2022 at 9:16 AM Hou Tao <houtao1@huawei.com> wrote:
>>> Hi,
>>>
>>> On 11/21/2022 6:05 PM, xiangxia.m.yue@gmail.com wrote:
>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>
>>>> The commit 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked"),
>>>> try to fix deadlock, but in some case, the deadlock occurs:
>>>>
>>>> * CPUn in task context with K1, and taking lock.
>>>> * CPUn interrupted by NMI context, with K2.
>>>> * They are using the same bucket, but different map_locked.
>>> It is possible when n_buckets is less than HASHTAB_MAP_LOCK_COUNT (e.g.,
>>> n_bucket=4). If using hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) as the
>>> index of map_locked, I think the deadlock will be gone.
>> Yes, but for saving memory, HASHTAB_MAP_LOCK_MASK should not be too
>> large(now this value is 8-1).
>> if user define n_bucket ,e.g 8192, the part of bucket only are
>> selected via hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1).
I don't mean to extend map_locked. Using hash & min(HASHTAB_MAP_LOCK_MASK,
n_bucket - 1) as index of map_locked  can guarantee the same map_locked will be
used if different update processes are using the same bucket lock.
> SNIP
>>>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>>>> index 22855d6ff6d3..429acd97c869 100644
>>>> --- a/kernel/bpf/hashtab.c
>>>> +++ b/kernel/bpf/hashtab.c
>>>> @@ -80,9 +80,6 @@ struct bucket {
>>>>       raw_spinlock_t raw_lock;
>>>>  };
>>>>
>>>> -#define HASHTAB_MAP_LOCK_COUNT 8
>>>> -#define HASHTAB_MAP_LOCK_MASK (HASHTAB_MAP_LOCK_COUNT - 1)
>>>> -
>>>>  struct bpf_htab {
>>>>       struct bpf_map map;
>>>>       struct bpf_mem_alloc ma;
>>>> @@ -104,7 +101,6 @@ struct bpf_htab {
>>>>       u32 elem_size;  /* size of each element in bytes */
>>>>       u32 hashrnd;
>>>>       struct lock_class_key lockdep_key;
>>>> -     int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT];
>>>>  };
>>>>
>>>>  /* each htab element is struct htab_elem + key + value */
>>>> @@ -146,35 +142,26 @@ static void htab_init_buckets(struct bpf_htab *htab)
>>>>       }
>>>>  }
>>>>
>>>> -static inline int htab_lock_bucket(const struct bpf_htab *htab,
>>>> -                                struct bucket *b, u32 hash,
>>>> +static inline int htab_lock_bucket(struct bucket *b,
>>>>                                  unsigned long *pflags)
>>>>  {
>>>>       unsigned long flags;
>>>>
>>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
>>>> -
>>>> -     preempt_disable();
>>>> -     if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>>>> -             __this_cpu_dec(*(htab->map_locked[hash]));
>>>> -             preempt_enable();
>>>> -             return -EBUSY;
>>>> +     if (in_nmi()) {
>>>> +             if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
>>>> +                     return -EBUSY;
>>>> +     } else {
>>>> +             raw_spin_lock_irqsave(&b->raw_lock, flags);
>>>>       }
>>>>
>>>> -     raw_spin_lock_irqsave(&b->raw_lock, flags);
>>>>       *pflags = flags;
>>>> -
>>>>       return 0;
>>>>  }
>>> map_locked is also used to prevent the re-entrance of htab_lock_bucket() on the
>>> same CPU, so only check in_nmi() is not enough.
>> NMI, IRQ, and preempt may interrupt the task context.
>> In htab_lock_bucket, raw_spin_lock_irqsave disable the preempt and
>> irq. so only NMI may interrupt the codes, right ?
> The re-entrance here means the nesting of bpf programs as show below:
>
> bpf_prog A
> update map X
>     htab_lock_bucket
>         raw_spin_lock_irqsave()
>     lookup_elem_raw()
>         // bpf prog B is attached on lookup_elem_raw()
>         bpf prog B
>             update map X again and update the element
>                 htab_lock_bucket()
>                     // dead-lock
>                     raw_spinlock_irqsave()
> .
>
>>>> -static inline void htab_unlock_bucket(const struct bpf_htab *htab,
>>>> -                                   struct bucket *b, u32 hash,
>>>> +static inline void htab_unlock_bucket(struct bucket *b,
>>>>                                     unsigned long flags)
>>>>  {
>>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
>>>>       raw_spin_unlock_irqrestore(&b->raw_lock, flags);
>>>> -     __this_cpu_dec(*(htab->map_locked[hash]));
>>>> -     preempt_enable();
>>>>  }
>>>>
>>>>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
>>>> @@ -467,7 +454,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>       bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
>>>>       bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
>>>>       struct bpf_htab *htab;
>>>> -     int err, i;
>>>> +     int err;
>>>>
>>>>       htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
>>>>       if (!htab)
>>>> @@ -513,15 +500,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>       if (!htab->buckets)
>>>>               goto free_htab;
>>>>
>>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) {
>>>> -             htab->map_locked[i] = bpf_map_alloc_percpu(&htab->map,
>>>> -                                                        sizeof(int),
>>>> -                                                        sizeof(int),
>>>> -                                                        GFP_USER);
>>>> -             if (!htab->map_locked[i])
>>>> -                     goto free_map_locked;
>>>> -     }
>>>> -
>>>>       if (htab->map.map_flags & BPF_F_ZERO_SEED)
>>>>               htab->hashrnd = 0;
>>>>       else
>>>> @@ -549,13 +527,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>       if (htab->use_percpu_counter) {
>>>>               err = percpu_counter_init(&htab->pcount, 0, GFP_KERNEL);
>>>>               if (err)
>>>> -                     goto free_map_locked;
>>>> +                     goto free_buckets;
>>>>       }
>>>>
>>>>       if (prealloc) {
>>>>               err = prealloc_init(htab);
>>>>               if (err)
>>>> -                     goto free_map_locked;
>>>> +                     goto free_buckets;
>>>>
>>>>               if (!percpu && !lru) {
>>>>                       /* lru itself can remove the least used element, so
>>>> @@ -568,12 +546,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>       } else {
>>>>               err = bpf_mem_alloc_init(&htab->ma, htab->elem_size, false);
>>>>               if (err)
>>>> -                     goto free_map_locked;
>>>> +                     goto free_buckets;
>>>>               if (percpu) {
>>>>                       err = bpf_mem_alloc_init(&htab->pcpu_ma,
>>>>                                                round_up(htab->map.value_size, 8), true);
>>>>                       if (err)
>>>> -                             goto free_map_locked;
>>>> +                             goto free_buckets;
>>>>               }
>>>>       }
>>>>
>>>> @@ -581,11 +559,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>
>>>>  free_prealloc:
>>>>       prealloc_destroy(htab);
>>>> -free_map_locked:
>>>> +free_buckets:
>>>>       if (htab->use_percpu_counter)
>>>>               percpu_counter_destroy(&htab->pcount);
>>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
>>>> -             free_percpu(htab->map_locked[i]);
>>>> +
>>>>       bpf_map_area_free(htab->buckets);
>>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
>>>>       bpf_mem_alloc_destroy(&htab->ma);
>>>> @@ -782,7 +759,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
>>>>       b = __select_bucket(htab, tgt_l->hash);
>>>>       head = &b->head;
>>>>
>>>> -     ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags);
>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>       if (ret)
>>>>               return false;
>>>>
>>>> @@ -793,7 +770,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
>>>>                       break;
>>>>               }
>>>>
>>>> -     htab_unlock_bucket(htab, b, tgt_l->hash, flags);
>>>> +     htab_unlock_bucket(b, flags);
>>>>
>>>>       return l == tgt_l;
>>>>  }
>>>> @@ -1107,7 +1084,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>>                */
>>>>       }
>>>>
>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>       if (ret)
>>>>               return ret;
>>>>
>>>> @@ -1152,7 +1129,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>>       }
>>>>       ret = 0;
>>>>  err:
>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>> +     htab_unlock_bucket(b, flags);
>>>>       return ret;
>>>>  }
>>>>
>>>> @@ -1198,7 +1175,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>>       copy_map_value(&htab->map,
>>>>                      l_new->key + round_up(map->key_size, 8), value);
>>>>
>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>       if (ret)
>>>>               return ret;
>>>>
>>>> @@ -1219,7 +1196,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>>       ret = 0;
>>>>
>>>>  err:
>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>> +     htab_unlock_bucket(b, flags);
>>>>
>>>>       if (ret)
>>>>               htab_lru_push_free(htab, l_new);
>>>> @@ -1255,7 +1232,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>>       b = __select_bucket(htab, hash);
>>>>       head = &b->head;
>>>>
>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>       if (ret)
>>>>               return ret;
>>>>
>>>> @@ -1280,7 +1257,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>>       }
>>>>       ret = 0;
>>>>  err:
>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>> +     htab_unlock_bucket(b, flags);
>>>>       return ret;
>>>>  }
>>>>
>>>> @@ -1321,7 +1298,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>>                       return -ENOMEM;
>>>>       }
>>>>
>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>       if (ret)
>>>>               return ret;
>>>>
>>>> @@ -1345,7 +1322,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>>       }
>>>>       ret = 0;
>>>>  err:
>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>> +     htab_unlock_bucket(b, flags);
>>>>       if (l_new)
>>>>               bpf_lru_push_free(&htab->lru, &l_new->lru_node);
>>>>       return ret;
>>>> @@ -1384,7 +1361,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
>>>>       b = __select_bucket(htab, hash);
>>>>       head = &b->head;
>>>>
>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>       if (ret)
>>>>               return ret;
>>>>
>>>> @@ -1397,7 +1374,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
>>>>               ret = -ENOENT;
>>>>       }
>>>>
>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>> +     htab_unlock_bucket(b, flags);
>>>>       return ret;
>>>>  }
>>>>
>>>> @@ -1420,7 +1397,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
>>>>       b = __select_bucket(htab, hash);
>>>>       head = &b->head;
>>>>
>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>       if (ret)
>>>>               return ret;
>>>>
>>>> @@ -1431,7 +1408,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
>>>>       else
>>>>               ret = -ENOENT;
>>>>
>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>> +     htab_unlock_bucket(b, flags);
>>>>       if (l)
>>>>               htab_lru_push_free(htab, l);
>>>>       return ret;
>>>> @@ -1494,7 +1471,6 @@ static void htab_map_free_timers(struct bpf_map *map)
>>>>  static void htab_map_free(struct bpf_map *map)
>>>>  {
>>>>       struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
>>>> -     int i;
>>>>
>>>>       /* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback.
>>>>        * bpf_free_used_maps() is called after bpf prog is no longer executing.
>>>> @@ -1517,10 +1493,10 @@ static void htab_map_free(struct bpf_map *map)
>>>>       bpf_map_area_free(htab->buckets);
>>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
>>>>       bpf_mem_alloc_destroy(&htab->ma);
>>>> +
>>>>       if (htab->use_percpu_counter)
>>>>               percpu_counter_destroy(&htab->pcount);
>>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
>>>> -             free_percpu(htab->map_locked[i]);
>>>> +
>>>>       lockdep_unregister_key(&htab->lockdep_key);
>>>>       bpf_map_area_free(htab);
>>>>  }
>>>> @@ -1564,7 +1540,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
>>>>       b = __select_bucket(htab, hash);
>>>>       head = &b->head;
>>>>
>>>> -     ret = htab_lock_bucket(htab, b, hash, &bflags);
>>>> +     ret = htab_lock_bucket(b, &bflags);
>>>>       if (ret)
>>>>               return ret;
>>>>
>>>> @@ -1602,7 +1578,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
>>>>                       free_htab_elem(htab, l);
>>>>       }
>>>>
>>>> -     htab_unlock_bucket(htab, b, hash, bflags);
>>>> +     htab_unlock_bucket(b, bflags);
>>>>
>>>>       if (is_lru_map && l)
>>>>               htab_lru_push_free(htab, l);
>>>> @@ -1720,7 +1696,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>>       head = &b->head;
>>>>       /* do not grab the lock unless need it (bucket_cnt > 0). */
>>>>       if (locked) {
>>>> -             ret = htab_lock_bucket(htab, b, batch, &flags);
>>>> +             ret = htab_lock_bucket(b, &flags);
>>>>               if (ret) {
>>>>                       rcu_read_unlock();
>>>>                       bpf_enable_instrumentation();
>>>> @@ -1743,7 +1719,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>>               /* Note that since bucket_cnt > 0 here, it is implicit
>>>>                * that the locked was grabbed, so release it.
>>>>                */
>>>> -             htab_unlock_bucket(htab, b, batch, flags);
>>>> +             htab_unlock_bucket(b, flags);
>>>>               rcu_read_unlock();
>>>>               bpf_enable_instrumentation();
>>>>               goto after_loop;
>>>> @@ -1754,7 +1730,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>>               /* Note that since bucket_cnt > 0 here, it is implicit
>>>>                * that the locked was grabbed, so release it.
>>>>                */
>>>> -             htab_unlock_bucket(htab, b, batch, flags);
>>>> +             htab_unlock_bucket(b, flags);
>>>>               rcu_read_unlock();
>>>>               bpf_enable_instrumentation();
>>>>               kvfree(keys);
>>>> @@ -1815,7 +1791,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>>               dst_val += value_size;
>>>>       }
>>>>
>>>> -     htab_unlock_bucket(htab, b, batch, flags);
>>>> +     htab_unlock_bucket(b, flags);
>>>>       locked = false;
>>>>
>>>>       while (node_to_free) {
> .


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid the multi checking
  2022-11-21 10:05 [net-next] bpf: avoid the multi checking xiangxia.m.yue
  2022-11-21 10:05 ` [net-next] bpf: avoid hashtab deadlock with try_lock xiangxia.m.yue
@ 2022-11-22 22:16 ` Daniel Borkmann
  1 sibling, 0 replies; 33+ messages in thread
From: Daniel Borkmann @ 2022-11-22 22:16 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: netdev, Alexei Starovoitov, Andrii Nakryiko, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa

On 11/21/22 11:05 AM, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> 
> .map_alloc_check checked bpf_attr::max_entries, and if bpf_attr::max_entries
> == 0, return -EINVAL. bpf_htab::n_buckets will not be 0, while -E2BIG is not
> appropriate.
> 
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Andrii Nakryiko <andrii@kernel.org>
> Cc: Martin KaFai Lau <martin.lau@linux.dev>
> Cc: Song Liu <song@kernel.org>
> Cc: Yonghong Song <yhs@fb.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: KP Singh <kpsingh@kernel.org>
> Cc: Stanislav Fomichev <sdf@google.com>
> Cc: Hao Luo <haoluo@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>

Pls Cc bpf@vger.kernel.org list and $subj line should target bpf-next.

> ---
>   kernel/bpf/hashtab.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 50d254cd0709..22855d6ff6d3 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -500,9 +500,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>   		htab->elem_size += round_up(htab->map.value_size, 8);
>   
>   	err = -E2BIG;
> -	/* prevent zero size kmalloc and check for u32 overflow */
> -	if (htab->n_buckets == 0 ||
> -	    htab->n_buckets > U32_MAX / sizeof(struct bucket))
> +	/* avoid zero size and u32 overflow kmalloc.
> +	 * bpf_attr::max_entries checked in .map_alloc_check().
> +	 */
> +	if (htab->n_buckets > U32_MAX / sizeof(struct bucket))

This looks buggy to remove it, this is to guard against the previous
roundup_pow_of_two() on htab->n_buckets.

>   		goto free_htab;
>   
>   	err = -ENOMEM;
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-22  4:06         ` Hou Tao
@ 2022-11-24 12:57           ` Tonghao Zhang
  2022-11-24 14:13             ` Hou Tao
  0 siblings, 1 reply; 33+ messages in thread
From: Tonghao Zhang @ 2022-11-24 12:57 UTC (permalink / raw)
  To: Hou Tao
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf

On Tue, Nov 22, 2022 at 12:06 PM Hou Tao <houtao1@huawei.com> wrote:
>
> Hi,
>
> On 11/22/2022 12:01 PM, Hou Tao wrote:
> > Hi,
> >
> > On 11/22/2022 11:12 AM, Tonghao Zhang wrote:
> >> .
> >>
> >> On Tue, Nov 22, 2022 at 9:16 AM Hou Tao <houtao1@huawei.com> wrote:
> >>> Hi,
> >>>
> >>> On 11/21/2022 6:05 PM, xiangxia.m.yue@gmail.com wrote:
> >>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >>>>
> >>>> The commit 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked"),
> >>>> try to fix deadlock, but in some case, the deadlock occurs:
> >>>>
> >>>> * CPUn in task context with K1, and taking lock.
> >>>> * CPUn interrupted by NMI context, with K2.
> >>>> * They are using the same bucket, but different map_locked.
> >>> It is possible when n_buckets is less than HASHTAB_MAP_LOCK_COUNT (e.g.,
> >>> n_bucket=4). If using hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) as the
> >>> index of map_locked, I think the deadlock will be gone.
> >> Yes, but for saving memory, HASHTAB_MAP_LOCK_MASK should not be too
> >> large(now this value is 8-1).
> >> if user define n_bucket ,e.g 8192, the part of bucket only are
> >> selected via hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1).
> I don't mean to extend map_locked. Using hash & min(HASHTAB_MAP_LOCK_MASK,
> n_bucket - 1) as index of map_locked  can guarantee the same map_locked will be
> used if different update processes are using the same bucket lock.
Thanks, I got it. but I tried it using the hash = hash &
min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1) in
htab_lock_bucket/htab_unlock_bucket.
But the warning occur again.
  > > SNIP
> >>>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> >>>> index 22855d6ff6d3..429acd97c869 100644
> >>>> --- a/kernel/bpf/hashtab.c
> >>>> +++ b/kernel/bpf/hashtab.c
> >>>> @@ -80,9 +80,6 @@ struct bucket {
> >>>>       raw_spinlock_t raw_lock;
> >>>>  };
> >>>>
> >>>> -#define HASHTAB_MAP_LOCK_COUNT 8
> >>>> -#define HASHTAB_MAP_LOCK_MASK (HASHTAB_MAP_LOCK_COUNT - 1)
> >>>> -
> >>>>  struct bpf_htab {
> >>>>       struct bpf_map map;
> >>>>       struct bpf_mem_alloc ma;
> >>>> @@ -104,7 +101,6 @@ struct bpf_htab {
> >>>>       u32 elem_size;  /* size of each element in bytes */
> >>>>       u32 hashrnd;
> >>>>       struct lock_class_key lockdep_key;
> >>>> -     int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT];
> >>>>  };
> >>>>
> >>>>  /* each htab element is struct htab_elem + key + value */
> >>>> @@ -146,35 +142,26 @@ static void htab_init_buckets(struct bpf_htab *htab)
> >>>>       }
> >>>>  }
> >>>>
> >>>> -static inline int htab_lock_bucket(const struct bpf_htab *htab,
> >>>> -                                struct bucket *b, u32 hash,
> >>>> +static inline int htab_lock_bucket(struct bucket *b,
> >>>>                                  unsigned long *pflags)
> >>>>  {
> >>>>       unsigned long flags;
> >>>>
> >>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
> >>>> -
> >>>> -     preempt_disable();
> >>>> -     if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
> >>>> -             __this_cpu_dec(*(htab->map_locked[hash]));
> >>>> -             preempt_enable();
> >>>> -             return -EBUSY;
> >>>> +     if (in_nmi()) {
> >>>> +             if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
> >>>> +                     return -EBUSY;
> >>>> +     } else {
> >>>> +             raw_spin_lock_irqsave(&b->raw_lock, flags);
> >>>>       }
> >>>>
> >>>> -     raw_spin_lock_irqsave(&b->raw_lock, flags);
> >>>>       *pflags = flags;
> >>>> -
> >>>>       return 0;
> >>>>  }
> >>> map_locked is also used to prevent the re-entrance of htab_lock_bucket() on the
> >>> same CPU, so only check in_nmi() is not enough.
> >> NMI, IRQ, and preempt may interrupt the task context.
> >> In htab_lock_bucket, raw_spin_lock_irqsave disable the preempt and
> >> irq. so only NMI may interrupt the codes, right ?
> > The re-entrance here means the nesting of bpf programs as show below:
> >
> > bpf_prog A
> > update map X
> >     htab_lock_bucket
> >         raw_spin_lock_irqsave()
> >     lookup_elem_raw()
> >         // bpf prog B is attached on lookup_elem_raw()
I am confused, bpf_prog A disables preempt and irq with
raw_spin_lock_irqsave. Why bpf prog B run here?
> >         bpf prog B
> >             update map X again and update the element
> >                 htab_lock_bucket()
> >                     // dead-lock
> >                     raw_spinlock_irqsave()
> > .
> >
> >>>> -static inline void htab_unlock_bucket(const struct bpf_htab *htab,
> >>>> -                                   struct bucket *b, u32 hash,
> >>>> +static inline void htab_unlock_bucket(struct bucket *b,
> >>>>                                     unsigned long flags)
> >>>>  {
> >>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
> >>>>       raw_spin_unlock_irqrestore(&b->raw_lock, flags);
> >>>> -     __this_cpu_dec(*(htab->map_locked[hash]));
> >>>> -     preempt_enable();
> >>>>  }
> >>>>
> >>>>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
> >>>> @@ -467,7 +454,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>       bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
> >>>>       bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
> >>>>       struct bpf_htab *htab;
> >>>> -     int err, i;
> >>>> +     int err;
> >>>>
> >>>>       htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
> >>>>       if (!htab)
> >>>> @@ -513,15 +500,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>       if (!htab->buckets)
> >>>>               goto free_htab;
> >>>>
> >>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) {
> >>>> -             htab->map_locked[i] = bpf_map_alloc_percpu(&htab->map,
> >>>> -                                                        sizeof(int),
> >>>> -                                                        sizeof(int),
> >>>> -                                                        GFP_USER);
> >>>> -             if (!htab->map_locked[i])
> >>>> -                     goto free_map_locked;
> >>>> -     }
> >>>> -
> >>>>       if (htab->map.map_flags & BPF_F_ZERO_SEED)
> >>>>               htab->hashrnd = 0;
> >>>>       else
> >>>> @@ -549,13 +527,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>       if (htab->use_percpu_counter) {
> >>>>               err = percpu_counter_init(&htab->pcount, 0, GFP_KERNEL);
> >>>>               if (err)
> >>>> -                     goto free_map_locked;
> >>>> +                     goto free_buckets;
> >>>>       }
> >>>>
> >>>>       if (prealloc) {
> >>>>               err = prealloc_init(htab);
> >>>>               if (err)
> >>>> -                     goto free_map_locked;
> >>>> +                     goto free_buckets;
> >>>>
> >>>>               if (!percpu && !lru) {
> >>>>                       /* lru itself can remove the least used element, so
> >>>> @@ -568,12 +546,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>       } else {
> >>>>               err = bpf_mem_alloc_init(&htab->ma, htab->elem_size, false);
> >>>>               if (err)
> >>>> -                     goto free_map_locked;
> >>>> +                     goto free_buckets;
> >>>>               if (percpu) {
> >>>>                       err = bpf_mem_alloc_init(&htab->pcpu_ma,
> >>>>                                                round_up(htab->map.value_size, 8), true);
> >>>>                       if (err)
> >>>> -                             goto free_map_locked;
> >>>> +                             goto free_buckets;
> >>>>               }
> >>>>       }
> >>>>
> >>>> @@ -581,11 +559,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>
> >>>>  free_prealloc:
> >>>>       prealloc_destroy(htab);
> >>>> -free_map_locked:
> >>>> +free_buckets:
> >>>>       if (htab->use_percpu_counter)
> >>>>               percpu_counter_destroy(&htab->pcount);
> >>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
> >>>> -             free_percpu(htab->map_locked[i]);
> >>>> +
> >>>>       bpf_map_area_free(htab->buckets);
> >>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
> >>>>       bpf_mem_alloc_destroy(&htab->ma);
> >>>> @@ -782,7 +759,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
> >>>>       b = __select_bucket(htab, tgt_l->hash);
> >>>>       head = &b->head;
> >>>>
> >>>> -     ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags);
> >>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>       if (ret)
> >>>>               return false;
> >>>>
> >>>> @@ -793,7 +770,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
> >>>>                       break;
> >>>>               }
> >>>>
> >>>> -     htab_unlock_bucket(htab, b, tgt_l->hash, flags);
> >>>> +     htab_unlock_bucket(b, flags);
> >>>>
> >>>>       return l == tgt_l;
> >>>>  }
> >>>> @@ -1107,7 +1084,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
> >>>>                */
> >>>>       }
> >>>>
> >>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>       if (ret)
> >>>>               return ret;
> >>>>
> >>>> @@ -1152,7 +1129,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
> >>>>       }
> >>>>       ret = 0;
> >>>>  err:
> >>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>> +     htab_unlock_bucket(b, flags);
> >>>>       return ret;
> >>>>  }
> >>>>
> >>>> @@ -1198,7 +1175,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
> >>>>       copy_map_value(&htab->map,
> >>>>                      l_new->key + round_up(map->key_size, 8), value);
> >>>>
> >>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>       if (ret)
> >>>>               return ret;
> >>>>
> >>>> @@ -1219,7 +1196,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
> >>>>       ret = 0;
> >>>>
> >>>>  err:
> >>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>> +     htab_unlock_bucket(b, flags);
> >>>>
> >>>>       if (ret)
> >>>>               htab_lru_push_free(htab, l_new);
> >>>> @@ -1255,7 +1232,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
> >>>>       b = __select_bucket(htab, hash);
> >>>>       head = &b->head;
> >>>>
> >>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>       if (ret)
> >>>>               return ret;
> >>>>
> >>>> @@ -1280,7 +1257,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
> >>>>       }
> >>>>       ret = 0;
> >>>>  err:
> >>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>> +     htab_unlock_bucket(b, flags);
> >>>>       return ret;
> >>>>  }
> >>>>
> >>>> @@ -1321,7 +1298,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
> >>>>                       return -ENOMEM;
> >>>>       }
> >>>>
> >>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>       if (ret)
> >>>>               return ret;
> >>>>
> >>>> @@ -1345,7 +1322,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
> >>>>       }
> >>>>       ret = 0;
> >>>>  err:
> >>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>> +     htab_unlock_bucket(b, flags);
> >>>>       if (l_new)
> >>>>               bpf_lru_push_free(&htab->lru, &l_new->lru_node);
> >>>>       return ret;
> >>>> @@ -1384,7 +1361,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
> >>>>       b = __select_bucket(htab, hash);
> >>>>       head = &b->head;
> >>>>
> >>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>       if (ret)
> >>>>               return ret;
> >>>>
> >>>> @@ -1397,7 +1374,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
> >>>>               ret = -ENOENT;
> >>>>       }
> >>>>
> >>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>> +     htab_unlock_bucket(b, flags);
> >>>>       return ret;
> >>>>  }
> >>>>
> >>>> @@ -1420,7 +1397,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
> >>>>       b = __select_bucket(htab, hash);
> >>>>       head = &b->head;
> >>>>
> >>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>       if (ret)
> >>>>               return ret;
> >>>>
> >>>> @@ -1431,7 +1408,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
> >>>>       else
> >>>>               ret = -ENOENT;
> >>>>
> >>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>> +     htab_unlock_bucket(b, flags);
> >>>>       if (l)
> >>>>               htab_lru_push_free(htab, l);
> >>>>       return ret;
> >>>> @@ -1494,7 +1471,6 @@ static void htab_map_free_timers(struct bpf_map *map)
> >>>>  static void htab_map_free(struct bpf_map *map)
> >>>>  {
> >>>>       struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
> >>>> -     int i;
> >>>>
> >>>>       /* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback.
> >>>>        * bpf_free_used_maps() is called after bpf prog is no longer executing.
> >>>> @@ -1517,10 +1493,10 @@ static void htab_map_free(struct bpf_map *map)
> >>>>       bpf_map_area_free(htab->buckets);
> >>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
> >>>>       bpf_mem_alloc_destroy(&htab->ma);
> >>>> +
> >>>>       if (htab->use_percpu_counter)
> >>>>               percpu_counter_destroy(&htab->pcount);
> >>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
> >>>> -             free_percpu(htab->map_locked[i]);
> >>>> +
> >>>>       lockdep_unregister_key(&htab->lockdep_key);
> >>>>       bpf_map_area_free(htab);
> >>>>  }
> >>>> @@ -1564,7 +1540,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
> >>>>       b = __select_bucket(htab, hash);
> >>>>       head = &b->head;
> >>>>
> >>>> -     ret = htab_lock_bucket(htab, b, hash, &bflags);
> >>>> +     ret = htab_lock_bucket(b, &bflags);
> >>>>       if (ret)
> >>>>               return ret;
> >>>>
> >>>> @@ -1602,7 +1578,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
> >>>>                       free_htab_elem(htab, l);
> >>>>       }
> >>>>
> >>>> -     htab_unlock_bucket(htab, b, hash, bflags);
> >>>> +     htab_unlock_bucket(b, bflags);
> >>>>
> >>>>       if (is_lru_map && l)
> >>>>               htab_lru_push_free(htab, l);
> >>>> @@ -1720,7 +1696,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >>>>       head = &b->head;
> >>>>       /* do not grab the lock unless need it (bucket_cnt > 0). */
> >>>>       if (locked) {
> >>>> -             ret = htab_lock_bucket(htab, b, batch, &flags);
> >>>> +             ret = htab_lock_bucket(b, &flags);
> >>>>               if (ret) {
> >>>>                       rcu_read_unlock();
> >>>>                       bpf_enable_instrumentation();
> >>>> @@ -1743,7 +1719,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >>>>               /* Note that since bucket_cnt > 0 here, it is implicit
> >>>>                * that the locked was grabbed, so release it.
> >>>>                */
> >>>> -             htab_unlock_bucket(htab, b, batch, flags);
> >>>> +             htab_unlock_bucket(b, flags);
> >>>>               rcu_read_unlock();
> >>>>               bpf_enable_instrumentation();
> >>>>               goto after_loop;
> >>>> @@ -1754,7 +1730,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >>>>               /* Note that since bucket_cnt > 0 here, it is implicit
> >>>>                * that the locked was grabbed, so release it.
> >>>>                */
> >>>> -             htab_unlock_bucket(htab, b, batch, flags);
> >>>> +             htab_unlock_bucket(b, flags);
> >>>>               rcu_read_unlock();
> >>>>               bpf_enable_instrumentation();
> >>>>               kvfree(keys);
> >>>> @@ -1815,7 +1791,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >>>>               dst_val += value_size;
> >>>>       }
> >>>>
> >>>> -     htab_unlock_bucket(htab, b, batch, flags);
> >>>> +     htab_unlock_bucket(b, flags);
> >>>>       locked = false;
> >>>>
> >>>>       while (node_to_free) {
> > .
>


-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-24 12:57           ` Tonghao Zhang
@ 2022-11-24 14:13             ` Hou Tao
  2022-11-28  3:15               ` Tonghao Zhang
  0 siblings, 1 reply; 33+ messages in thread
From: Hou Tao @ 2022-11-24 14:13 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf

Hi,

On 11/24/2022 8:57 PM, Tonghao Zhang wrote:
> On Tue, Nov 22, 2022 at 12:06 PM Hou Tao <houtao1@huawei.com> wrote:
>> Hi,
>>
>> On 11/22/2022 12:01 PM, Hou Tao wrote:
>>> Hi,
>>>
>>> On 11/22/2022 11:12 AM, Tonghao Zhang wrote:
>>>> .
>>>>
>>>> On Tue, Nov 22, 2022 at 9:16 AM Hou Tao <houtao1@huawei.com> wrote:
>>>>> Hi,
>>>>>
>>>>> On 11/21/2022 6:05 PM, xiangxia.m.yue@gmail.com wrote:
>>>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>>>
>>>>>> The commit 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked"),
>>>>>> try to fix deadlock, but in some case, the deadlock occurs:
>>>>>>
>>>>>> * CPUn in task context with K1, and taking lock.
>>>>>> * CPUn interrupted by NMI context, with K2.
>>>>>> * They are using the same bucket, but different map_locked.
>>>>> It is possible when n_buckets is less than HASHTAB_MAP_LOCK_COUNT (e.g.,
>>>>> n_bucket=4). If using hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) as the
>>>>> index of map_locked, I think the deadlock will be gone.
>>>> Yes, but for saving memory, HASHTAB_MAP_LOCK_MASK should not be too
>>>> large(now this value is 8-1).
>>>> if user define n_bucket ,e.g 8192, the part of bucket only are
>>>> selected via hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1).
>> I don't mean to extend map_locked. Using hash & min(HASHTAB_MAP_LOCK_MASK,
>> n_bucket - 1) as index of map_locked  can guarantee the same map_locked will be
>> used if different update processes are using the same bucket lock.
> Thanks, I got it. but I tried it using the hash = hash &
> min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1) in
> htab_lock_bucket/htab_unlock_bucket.
> But the warning occur again.
Does the deadlock happen ? Or just get the warning from lockdep. Maybe it is
just a false alarm from lockdep. I will check tomorrow. Could you share the
steps on how to reproduce the problem, specially the size of the hash table ?
>   > > SNIP
>>>>>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>>>>>> index 22855d6ff6d3..429acd97c869 100644
>>>>>> --- a/kernel/bpf/hashtab.c
>>>>>> +++ b/kernel/bpf/hashtab.c
>>>>>> @@ -80,9 +80,6 @@ struct bucket {
>>>>>>       raw_spinlock_t raw_lock;
>>>>>>  };
>>>>>>
>>>>>> -#define HASHTAB_MAP_LOCK_COUNT 8
>>>>>> -#define HASHTAB_MAP_LOCK_MASK (HASHTAB_MAP_LOCK_COUNT - 1)
>>>>>> -
>>>>>>  struct bpf_htab {
>>>>>>       struct bpf_map map;
>>>>>>       struct bpf_mem_alloc ma;
>>>>>> @@ -104,7 +101,6 @@ struct bpf_htab {
>>>>>>       u32 elem_size;  /* size of each element in bytes */
>>>>>>       u32 hashrnd;
>>>>>>       struct lock_class_key lockdep_key;
>>>>>> -     int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT];
>>>>>>  };
>>>>>>
>>>>>>  /* each htab element is struct htab_elem + key + value */
>>>>>> @@ -146,35 +142,26 @@ static void htab_init_buckets(struct bpf_htab *htab)
>>>>>>       }
>>>>>>  }
>>>>>>
>>>>>> -static inline int htab_lock_bucket(const struct bpf_htab *htab,
>>>>>> -                                struct bucket *b, u32 hash,
>>>>>> +static inline int htab_lock_bucket(struct bucket *b,
>>>>>>                                  unsigned long *pflags)
>>>>>>  {
>>>>>>       unsigned long flags;
>>>>>>
>>>>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
>>>>>> -
>>>>>> -     preempt_disable();
>>>>>> -     if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>>>>>> -             __this_cpu_dec(*(htab->map_locked[hash]));
>>>>>> -             preempt_enable();
>>>>>> -             return -EBUSY;
>>>>>> +     if (in_nmi()) {
>>>>>> +             if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
>>>>>> +                     return -EBUSY;
>>>>>> +     } else {
>>>>>> +             raw_spin_lock_irqsave(&b->raw_lock, flags);
>>>>>>       }
>>>>>>
>>>>>> -     raw_spin_lock_irqsave(&b->raw_lock, flags);
>>>>>>       *pflags = flags;
>>>>>> -
>>>>>>       return 0;
>>>>>>  }
>>>>> map_locked is also used to prevent the re-entrance of htab_lock_bucket() on the
>>>>> same CPU, so only check in_nmi() is not enough.
>>>> NMI, IRQ, and preempt may interrupt the task context.
>>>> In htab_lock_bucket, raw_spin_lock_irqsave disable the preempt and
>>>> irq. so only NMI may interrupt the codes, right ?
>>> The re-entrance here means the nesting of bpf programs as show below:
>>>
>>> bpf_prog A
>>> update map X
>>>     htab_lock_bucket
>>>         raw_spin_lock_irqsave()
>>>     lookup_elem_raw()
>>>         // bpf prog B is attached on lookup_elem_raw()
> I am confused, bpf_prog A disables preempt and irq with
> raw_spin_lock_irqsave. Why bpf prog B run here?
Because program B (e.g., fentry program) has been attached on lookup_elem_raw(),
calling lookup_elem_raw() will call the fentry program first. I had written a
test case for the similar scenario in bpf selftests. The path of the test case
is tools/testing/selftests/bpf/prog_tests/htab_update.c, you can use ./test_prog
-t htab_update/reenter_update to run the test case.
>>>         bpf prog B
>>>             update map X again and update the element
>>>                 htab_lock_bucket()
>>>                     // dead-lock
>>>                     raw_spinlock_irqsave()
>>> .
>>>
>>>>>> -static inline void htab_unlock_bucket(const struct bpf_htab *htab,
>>>>>> -                                   struct bucket *b, u32 hash,
>>>>>> +static inline void htab_unlock_bucket(struct bucket *b,
>>>>>>                                     unsigned long flags)
>>>>>>  {
>>>>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
>>>>>>       raw_spin_unlock_irqrestore(&b->raw_lock, flags);
>>>>>> -     __this_cpu_dec(*(htab->map_locked[hash]));
>>>>>> -     preempt_enable();
>>>>>>  }
>>>>>>
>>>>>>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
>>>>>> @@ -467,7 +454,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>>>       bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
>>>>>>       bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
>>>>>>       struct bpf_htab *htab;
>>>>>> -     int err, i;
>>>>>> +     int err;
>>>>>>
>>>>>>       htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
>>>>>>       if (!htab)
>>>>>> @@ -513,15 +500,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>>>       if (!htab->buckets)
>>>>>>               goto free_htab;
>>>>>>
>>>>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) {
>>>>>> -             htab->map_locked[i] = bpf_map_alloc_percpu(&htab->map,
>>>>>> -                                                        sizeof(int),
>>>>>> -                                                        sizeof(int),
>>>>>> -                                                        GFP_USER);
>>>>>> -             if (!htab->map_locked[i])
>>>>>> -                     goto free_map_locked;
>>>>>> -     }
>>>>>> -
>>>>>>       if (htab->map.map_flags & BPF_F_ZERO_SEED)
>>>>>>               htab->hashrnd = 0;
>>>>>>       else
>>>>>> @@ -549,13 +527,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>>>       if (htab->use_percpu_counter) {
>>>>>>               err = percpu_counter_init(&htab->pcount, 0, GFP_KERNEL);
>>>>>>               if (err)
>>>>>> -                     goto free_map_locked;
>>>>>> +                     goto free_buckets;
>>>>>>       }
>>>>>>
>>>>>>       if (prealloc) {
>>>>>>               err = prealloc_init(htab);
>>>>>>               if (err)
>>>>>> -                     goto free_map_locked;
>>>>>> +                     goto free_buckets;
>>>>>>
>>>>>>               if (!percpu && !lru) {
>>>>>>                       /* lru itself can remove the least used element, so
>>>>>> @@ -568,12 +546,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>>>       } else {
>>>>>>               err = bpf_mem_alloc_init(&htab->ma, htab->elem_size, false);
>>>>>>               if (err)
>>>>>> -                     goto free_map_locked;
>>>>>> +                     goto free_buckets;
>>>>>>               if (percpu) {
>>>>>>                       err = bpf_mem_alloc_init(&htab->pcpu_ma,
>>>>>>                                                round_up(htab->map.value_size, 8), true);
>>>>>>                       if (err)
>>>>>> -                             goto free_map_locked;
>>>>>> +                             goto free_buckets;
>>>>>>               }
>>>>>>       }
>>>>>>
>>>>>> @@ -581,11 +559,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>>>>>>
>>>>>>  free_prealloc:
>>>>>>       prealloc_destroy(htab);
>>>>>> -free_map_locked:
>>>>>> +free_buckets:
>>>>>>       if (htab->use_percpu_counter)
>>>>>>               percpu_counter_destroy(&htab->pcount);
>>>>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
>>>>>> -             free_percpu(htab->map_locked[i]);
>>>>>> +
>>>>>>       bpf_map_area_free(htab->buckets);
>>>>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
>>>>>>       bpf_mem_alloc_destroy(&htab->ma);
>>>>>> @@ -782,7 +759,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
>>>>>>       b = __select_bucket(htab, tgt_l->hash);
>>>>>>       head = &b->head;
>>>>>>
>>>>>> -     ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags);
>>>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>>>       if (ret)
>>>>>>               return false;
>>>>>>
>>>>>> @@ -793,7 +770,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
>>>>>>                       break;
>>>>>>               }
>>>>>>
>>>>>> -     htab_unlock_bucket(htab, b, tgt_l->hash, flags);
>>>>>> +     htab_unlock_bucket(b, flags);
>>>>>>
>>>>>>       return l == tgt_l;
>>>>>>  }
>>>>>> @@ -1107,7 +1084,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>>>>                */
>>>>>>       }
>>>>>>
>>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>>>       if (ret)
>>>>>>               return ret;
>>>>>>
>>>>>> @@ -1152,7 +1129,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>>>>       }
>>>>>>       ret = 0;
>>>>>>  err:
>>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>>>> +     htab_unlock_bucket(b, flags);
>>>>>>       return ret;
>>>>>>  }
>>>>>>
>>>>>> @@ -1198,7 +1175,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>>>>       copy_map_value(&htab->map,
>>>>>>                      l_new->key + round_up(map->key_size, 8), value);
>>>>>>
>>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>>>       if (ret)
>>>>>>               return ret;
>>>>>>
>>>>>> @@ -1219,7 +1196,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>>>>>>       ret = 0;
>>>>>>
>>>>>>  err:
>>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>>>> +     htab_unlock_bucket(b, flags);
>>>>>>
>>>>>>       if (ret)
>>>>>>               htab_lru_push_free(htab, l_new);
>>>>>> @@ -1255,7 +1232,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>>>>       b = __select_bucket(htab, hash);
>>>>>>       head = &b->head;
>>>>>>
>>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>>>       if (ret)
>>>>>>               return ret;
>>>>>>
>>>>>> @@ -1280,7 +1257,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>>>>       }
>>>>>>       ret = 0;
>>>>>>  err:
>>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>>>> +     htab_unlock_bucket(b, flags);
>>>>>>       return ret;
>>>>>>  }
>>>>>>
>>>>>> @@ -1321,7 +1298,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>>>>                       return -ENOMEM;
>>>>>>       }
>>>>>>
>>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>>>       if (ret)
>>>>>>               return ret;
>>>>>>
>>>>>> @@ -1345,7 +1322,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
>>>>>>       }
>>>>>>       ret = 0;
>>>>>>  err:
>>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>>>> +     htab_unlock_bucket(b, flags);
>>>>>>       if (l_new)
>>>>>>               bpf_lru_push_free(&htab->lru, &l_new->lru_node);
>>>>>>       return ret;
>>>>>> @@ -1384,7 +1361,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
>>>>>>       b = __select_bucket(htab, hash);
>>>>>>       head = &b->head;
>>>>>>
>>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>>>       if (ret)
>>>>>>               return ret;
>>>>>>
>>>>>> @@ -1397,7 +1374,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
>>>>>>               ret = -ENOENT;
>>>>>>       }
>>>>>>
>>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>>>> +     htab_unlock_bucket(b, flags);
>>>>>>       return ret;
>>>>>>  }
>>>>>>
>>>>>> @@ -1420,7 +1397,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
>>>>>>       b = __select_bucket(htab, hash);
>>>>>>       head = &b->head;
>>>>>>
>>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
>>>>>> +     ret = htab_lock_bucket(b, &flags);
>>>>>>       if (ret)
>>>>>>               return ret;
>>>>>>
>>>>>> @@ -1431,7 +1408,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
>>>>>>       else
>>>>>>               ret = -ENOENT;
>>>>>>
>>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
>>>>>> +     htab_unlock_bucket(b, flags);
>>>>>>       if (l)
>>>>>>               htab_lru_push_free(htab, l);
>>>>>>       return ret;
>>>>>> @@ -1494,7 +1471,6 @@ static void htab_map_free_timers(struct bpf_map *map)
>>>>>>  static void htab_map_free(struct bpf_map *map)
>>>>>>  {
>>>>>>       struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
>>>>>> -     int i;
>>>>>>
>>>>>>       /* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback.
>>>>>>        * bpf_free_used_maps() is called after bpf prog is no longer executing.
>>>>>> @@ -1517,10 +1493,10 @@ static void htab_map_free(struct bpf_map *map)
>>>>>>       bpf_map_area_free(htab->buckets);
>>>>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
>>>>>>       bpf_mem_alloc_destroy(&htab->ma);
>>>>>> +
>>>>>>       if (htab->use_percpu_counter)
>>>>>>               percpu_counter_destroy(&htab->pcount);
>>>>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
>>>>>> -             free_percpu(htab->map_locked[i]);
>>>>>> +
>>>>>>       lockdep_unregister_key(&htab->lockdep_key);
>>>>>>       bpf_map_area_free(htab);
>>>>>>  }
>>>>>> @@ -1564,7 +1540,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
>>>>>>       b = __select_bucket(htab, hash);
>>>>>>       head = &b->head;
>>>>>>
>>>>>> -     ret = htab_lock_bucket(htab, b, hash, &bflags);
>>>>>> +     ret = htab_lock_bucket(b, &bflags);
>>>>>>       if (ret)
>>>>>>               return ret;
>>>>>>
>>>>>> @@ -1602,7 +1578,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
>>>>>>                       free_htab_elem(htab, l);
>>>>>>       }
>>>>>>
>>>>>> -     htab_unlock_bucket(htab, b, hash, bflags);
>>>>>> +     htab_unlock_bucket(b, bflags);
>>>>>>
>>>>>>       if (is_lru_map && l)
>>>>>>               htab_lru_push_free(htab, l);
>>>>>> @@ -1720,7 +1696,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>>>>       head = &b->head;
>>>>>>       /* do not grab the lock unless need it (bucket_cnt > 0). */
>>>>>>       if (locked) {
>>>>>> -             ret = htab_lock_bucket(htab, b, batch, &flags);
>>>>>> +             ret = htab_lock_bucket(b, &flags);
>>>>>>               if (ret) {
>>>>>>                       rcu_read_unlock();
>>>>>>                       bpf_enable_instrumentation();
>>>>>> @@ -1743,7 +1719,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>>>>               /* Note that since bucket_cnt > 0 here, it is implicit
>>>>>>                * that the locked was grabbed, so release it.
>>>>>>                */
>>>>>> -             htab_unlock_bucket(htab, b, batch, flags);
>>>>>> +             htab_unlock_bucket(b, flags);
>>>>>>               rcu_read_unlock();
>>>>>>               bpf_enable_instrumentation();
>>>>>>               goto after_loop;
>>>>>> @@ -1754,7 +1730,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>>>>               /* Note that since bucket_cnt > 0 here, it is implicit
>>>>>>                * that the locked was grabbed, so release it.
>>>>>>                */
>>>>>> -             htab_unlock_bucket(htab, b, batch, flags);
>>>>>> +             htab_unlock_bucket(b, flags);
>>>>>>               rcu_read_unlock();
>>>>>>               bpf_enable_instrumentation();
>>>>>>               kvfree(keys);
>>>>>> @@ -1815,7 +1791,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>>>>>>               dst_val += value_size;
>>>>>>       }
>>>>>>
>>>>>> -     htab_unlock_bucket(htab, b, batch, flags);
>>>>>> +     htab_unlock_bucket(b, flags);
>>>>>>       locked = false;
>>>>>>
>>>>>>       while (node_to_free) {
>>> .
>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-24 14:13             ` Hou Tao
@ 2022-11-28  3:15               ` Tonghao Zhang
  2022-11-28 21:55                 ` Hao Luo
  0 siblings, 1 reply; 33+ messages in thread
From: Tonghao Zhang @ 2022-11-28  3:15 UTC (permalink / raw)
  To: Hou Tao
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf

On Thu, Nov 24, 2022 at 10:13 PM Hou Tao <houtao1@huawei.com> wrote:
>
> Hi,
>
> On 11/24/2022 8:57 PM, Tonghao Zhang wrote:
> > On Tue, Nov 22, 2022 at 12:06 PM Hou Tao <houtao1@huawei.com> wrote:
> >> Hi,
> >>
> >> On 11/22/2022 12:01 PM, Hou Tao wrote:
> >>> Hi,
> >>>
> >>> On 11/22/2022 11:12 AM, Tonghao Zhang wrote:
> >>>> .
> >>>>
> >>>> On Tue, Nov 22, 2022 at 9:16 AM Hou Tao <houtao1@huawei.com> wrote:
> >>>>> Hi,
> >>>>>
> >>>>> On 11/21/2022 6:05 PM, xiangxia.m.yue@gmail.com wrote:
> >>>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >>>>>>
> >>>>>> The commit 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked"),
> >>>>>> try to fix deadlock, but in some case, the deadlock occurs:
> >>>>>>
> >>>>>> * CPUn in task context with K1, and taking lock.
> >>>>>> * CPUn interrupted by NMI context, with K2.
> >>>>>> * They are using the same bucket, but different map_locked.
> >>>>> It is possible when n_buckets is less than HASHTAB_MAP_LOCK_COUNT (e.g.,
> >>>>> n_bucket=4). If using hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) as the
> >>>>> index of map_locked, I think the deadlock will be gone.
> >>>> Yes, but for saving memory, HASHTAB_MAP_LOCK_MASK should not be too
> >>>> large(now this value is 8-1).
> >>>> if user define n_bucket ,e.g 8192, the part of bucket only are
> >>>> selected via hash & min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1).
> >> I don't mean to extend map_locked. Using hash & min(HASHTAB_MAP_LOCK_MASK,
> >> n_bucket - 1) as index of map_locked  can guarantee the same map_locked will be
> >> used if different update processes are using the same bucket lock.
> > Thanks, I got it. but I tried it using the hash = hash &
> > min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1) in
> > htab_lock_bucket/htab_unlock_bucket.
> > But the warning occur again.
> Does the deadlock happen ? Or just get the warning from lockdep. Maybe it is
> just a false alarm from lockdep. I will check tomorrow. Could you share the
> steps on how to reproduce the problem, specially the size of the hash table ?
Hi
only a warning from lockdep.
1. the kernel .config
#
# Debug Oops, Lockups and Hangs
#
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=0
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
# CONFIG_WQ_WATCHDOG is not set
# CONFIG_TEST_LOCKUP is not set
# end of Debug Oops, Lockups and Hangs

2. bpf.c, the map size is 2.
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__uint(max_entries, 2);
__uint(key_size, sizeof(unsigned int));
__uint(value_size, sizeof(unsigned int));
} map1 SEC(".maps");

static int bpf_update_data()
{
unsigned int val = 1, key = 0;

return bpf_map_update_elem(&map1, &key, &val, BPF_ANY);
}

SEC("kprobe/ip_rcv")
int bpf_prog1(struct pt_regs *regs)
{
bpf_update_data();
return 0;
}

SEC("tracepoint/nmi/nmi_handler")
int bpf_prog2(struct pt_regs *regs)
{
bpf_update_data();
return 0;
}

char _license[] SEC("license") = "GPL";
unsigned int _version SEC("version") = LINUX_VERSION_CODE;

3. bpf loader.
#include "kprobe-example.skel.h"

#include <unistd.h>
#include <errno.h>

#include <bpf/bpf.h>

int main()
{
struct kprobe_example *skel;
int map_fd, prog_fd;
int i;
int err = 0;

skel = kprobe_example__open_and_load();
if (!skel)
return -1;

err = kprobe_example__attach(skel);
if (err)
goto cleanup;

/* all libbpf APIs are usable */
prog_fd = bpf_program__fd(skel->progs.bpf_prog1);
map_fd = bpf_map__fd(skel->maps.map1);

printf("map_fd: %d\n", map_fd);

unsigned int val = 0, key = 0;

while (1) {
bpf_map_delete_elem(map_fd, &key);
bpf_map_update_elem(map_fd, &key, &val, BPF_ANY);
}

cleanup:
kprobe_example__destroy(skel);
return err;
}

4. run the bpf loader and perf record for nmi interrupts.  the warming occurs

> >   > > SNIP
> >>>>>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> >>>>>> index 22855d6ff6d3..429acd97c869 100644
> >>>>>> --- a/kernel/bpf/hashtab.c
> >>>>>> +++ b/kernel/bpf/hashtab.c
> >>>>>> @@ -80,9 +80,6 @@ struct bucket {
> >>>>>>       raw_spinlock_t raw_lock;
> >>>>>>  };
> >>>>>>
> >>>>>> -#define HASHTAB_MAP_LOCK_COUNT 8
> >>>>>> -#define HASHTAB_MAP_LOCK_MASK (HASHTAB_MAP_LOCK_COUNT - 1)
> >>>>>> -
> >>>>>>  struct bpf_htab {
> >>>>>>       struct bpf_map map;
> >>>>>>       struct bpf_mem_alloc ma;
> >>>>>> @@ -104,7 +101,6 @@ struct bpf_htab {
> >>>>>>       u32 elem_size;  /* size of each element in bytes */
> >>>>>>       u32 hashrnd;
> >>>>>>       struct lock_class_key lockdep_key;
> >>>>>> -     int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT];
> >>>>>>  };
> >>>>>>
> >>>>>>  /* each htab element is struct htab_elem + key + value */
> >>>>>> @@ -146,35 +142,26 @@ static void htab_init_buckets(struct bpf_htab *htab)
> >>>>>>       }
> >>>>>>  }
> >>>>>>
> >>>>>> -static inline int htab_lock_bucket(const struct bpf_htab *htab,
> >>>>>> -                                struct bucket *b, u32 hash,
> >>>>>> +static inline int htab_lock_bucket(struct bucket *b,
> >>>>>>                                  unsigned long *pflags)
> >>>>>>  {
> >>>>>>       unsigned long flags;
> >>>>>>
> >>>>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
> >>>>>> -
> >>>>>> -     preempt_disable();
> >>>>>> -     if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
> >>>>>> -             __this_cpu_dec(*(htab->map_locked[hash]));
> >>>>>> -             preempt_enable();
> >>>>>> -             return -EBUSY;
> >>>>>> +     if (in_nmi()) {
> >>>>>> +             if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
> >>>>>> +                     return -EBUSY;
> >>>>>> +     } else {
> >>>>>> +             raw_spin_lock_irqsave(&b->raw_lock, flags);
> >>>>>>       }
> >>>>>>
> >>>>>> -     raw_spin_lock_irqsave(&b->raw_lock, flags);
> >>>>>>       *pflags = flags;
> >>>>>> -
> >>>>>>       return 0;
> >>>>>>  }
> >>>>> map_locked is also used to prevent the re-entrance of htab_lock_bucket() on the
> >>>>> same CPU, so only check in_nmi() is not enough.
> >>>> NMI, IRQ, and preempt may interrupt the task context.
> >>>> In htab_lock_bucket, raw_spin_lock_irqsave disable the preempt and
> >>>> irq. so only NMI may interrupt the codes, right ?
> >>> The re-entrance here means the nesting of bpf programs as show below:
> >>>
> >>> bpf_prog A
> >>> update map X
> >>>     htab_lock_bucket
> >>>         raw_spin_lock_irqsave()
> >>>     lookup_elem_raw()
> >>>         // bpf prog B is attached on lookup_elem_raw()
> > I am confused, bpf_prog A disables preempt and irq with
> > raw_spin_lock_irqsave. Why bpf prog B run here?
> Because program B (e.g., fentry program) has been attached on lookup_elem_raw(),
> calling lookup_elem_raw() will call the fentry program first. I had written a
> test case for the similar scenario in bpf selftests. The path of the test case
> is tools/testing/selftests/bpf/prog_tests/htab_update.c, you can use ./test_prog
> -t htab_update/reenter_update to run the test case.
> >>>         bpf prog B
> >>>             update map X again and update the element
> >>>                 htab_lock_bucket()
> >>>                     // dead-lock
> >>>                     raw_spinlock_irqsave()
> >>> .
> >>>
> >>>>>> -static inline void htab_unlock_bucket(const struct bpf_htab *htab,
> >>>>>> -                                   struct bucket *b, u32 hash,
> >>>>>> +static inline void htab_unlock_bucket(struct bucket *b,
> >>>>>>                                     unsigned long flags)
> >>>>>>  {
> >>>>>> -     hash = hash & HASHTAB_MAP_LOCK_MASK;
> >>>>>>       raw_spin_unlock_irqrestore(&b->raw_lock, flags);
> >>>>>> -     __this_cpu_dec(*(htab->map_locked[hash]));
> >>>>>> -     preempt_enable();
> >>>>>>  }
> >>>>>>
> >>>>>>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
> >>>>>> @@ -467,7 +454,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>>>       bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
> >>>>>>       bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
> >>>>>>       struct bpf_htab *htab;
> >>>>>> -     int err, i;
> >>>>>> +     int err;
> >>>>>>
> >>>>>>       htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
> >>>>>>       if (!htab)
> >>>>>> @@ -513,15 +500,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>>>       if (!htab->buckets)
> >>>>>>               goto free_htab;
> >>>>>>
> >>>>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) {
> >>>>>> -             htab->map_locked[i] = bpf_map_alloc_percpu(&htab->map,
> >>>>>> -                                                        sizeof(int),
> >>>>>> -                                                        sizeof(int),
> >>>>>> -                                                        GFP_USER);
> >>>>>> -             if (!htab->map_locked[i])
> >>>>>> -                     goto free_map_locked;
> >>>>>> -     }
> >>>>>> -
> >>>>>>       if (htab->map.map_flags & BPF_F_ZERO_SEED)
> >>>>>>               htab->hashrnd = 0;
> >>>>>>       else
> >>>>>> @@ -549,13 +527,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>>>       if (htab->use_percpu_counter) {
> >>>>>>               err = percpu_counter_init(&htab->pcount, 0, GFP_KERNEL);
> >>>>>>               if (err)
> >>>>>> -                     goto free_map_locked;
> >>>>>> +                     goto free_buckets;
> >>>>>>       }
> >>>>>>
> >>>>>>       if (prealloc) {
> >>>>>>               err = prealloc_init(htab);
> >>>>>>               if (err)
> >>>>>> -                     goto free_map_locked;
> >>>>>> +                     goto free_buckets;
> >>>>>>
> >>>>>>               if (!percpu && !lru) {
> >>>>>>                       /* lru itself can remove the least used element, so
> >>>>>> @@ -568,12 +546,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>>>       } else {
> >>>>>>               err = bpf_mem_alloc_init(&htab->ma, htab->elem_size, false);
> >>>>>>               if (err)
> >>>>>> -                     goto free_map_locked;
> >>>>>> +                     goto free_buckets;
> >>>>>>               if (percpu) {
> >>>>>>                       err = bpf_mem_alloc_init(&htab->pcpu_ma,
> >>>>>>                                                round_up(htab->map.value_size, 8), true);
> >>>>>>                       if (err)
> >>>>>> -                             goto free_map_locked;
> >>>>>> +                             goto free_buckets;
> >>>>>>               }
> >>>>>>       }
> >>>>>>
> >>>>>> @@ -581,11 +559,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >>>>>>
> >>>>>>  free_prealloc:
> >>>>>>       prealloc_destroy(htab);
> >>>>>> -free_map_locked:
> >>>>>> +free_buckets:
> >>>>>>       if (htab->use_percpu_counter)
> >>>>>>               percpu_counter_destroy(&htab->pcount);
> >>>>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
> >>>>>> -             free_percpu(htab->map_locked[i]);
> >>>>>> +
> >>>>>>       bpf_map_area_free(htab->buckets);
> >>>>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
> >>>>>>       bpf_mem_alloc_destroy(&htab->ma);
> >>>>>> @@ -782,7 +759,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
> >>>>>>       b = __select_bucket(htab, tgt_l->hash);
> >>>>>>       head = &b->head;
> >>>>>>
> >>>>>> -     ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags);
> >>>>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>>>       if (ret)
> >>>>>>               return false;
> >>>>>>
> >>>>>> @@ -793,7 +770,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
> >>>>>>                       break;
> >>>>>>               }
> >>>>>>
> >>>>>> -     htab_unlock_bucket(htab, b, tgt_l->hash, flags);
> >>>>>> +     htab_unlock_bucket(b, flags);
> >>>>>>
> >>>>>>       return l == tgt_l;
> >>>>>>  }
> >>>>>> @@ -1107,7 +1084,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
> >>>>>>                */
> >>>>>>       }
> >>>>>>
> >>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>>>       if (ret)
> >>>>>>               return ret;
> >>>>>>
> >>>>>> @@ -1152,7 +1129,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
> >>>>>>       }
> >>>>>>       ret = 0;
> >>>>>>  err:
> >>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>>>> +     htab_unlock_bucket(b, flags);
> >>>>>>       return ret;
> >>>>>>  }
> >>>>>>
> >>>>>> @@ -1198,7 +1175,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
> >>>>>>       copy_map_value(&htab->map,
> >>>>>>                      l_new->key + round_up(map->key_size, 8), value);
> >>>>>>
> >>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>>>       if (ret)
> >>>>>>               return ret;
> >>>>>>
> >>>>>> @@ -1219,7 +1196,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
> >>>>>>       ret = 0;
> >>>>>>
> >>>>>>  err:
> >>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>>>> +     htab_unlock_bucket(b, flags);
> >>>>>>
> >>>>>>       if (ret)
> >>>>>>               htab_lru_push_free(htab, l_new);
> >>>>>> @@ -1255,7 +1232,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
> >>>>>>       b = __select_bucket(htab, hash);
> >>>>>>       head = &b->head;
> >>>>>>
> >>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>>>       if (ret)
> >>>>>>               return ret;
> >>>>>>
> >>>>>> @@ -1280,7 +1257,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
> >>>>>>       }
> >>>>>>       ret = 0;
> >>>>>>  err:
> >>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>>>> +     htab_unlock_bucket(b, flags);
> >>>>>>       return ret;
> >>>>>>  }
> >>>>>>
> >>>>>> @@ -1321,7 +1298,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
> >>>>>>                       return -ENOMEM;
> >>>>>>       }
> >>>>>>
> >>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>>>       if (ret)
> >>>>>>               return ret;
> >>>>>>
> >>>>>> @@ -1345,7 +1322,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
> >>>>>>       }
> >>>>>>       ret = 0;
> >>>>>>  err:
> >>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>>>> +     htab_unlock_bucket(b, flags);
> >>>>>>       if (l_new)
> >>>>>>               bpf_lru_push_free(&htab->lru, &l_new->lru_node);
> >>>>>>       return ret;
> >>>>>> @@ -1384,7 +1361,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
> >>>>>>       b = __select_bucket(htab, hash);
> >>>>>>       head = &b->head;
> >>>>>>
> >>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>>>       if (ret)
> >>>>>>               return ret;
> >>>>>>
> >>>>>> @@ -1397,7 +1374,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
> >>>>>>               ret = -ENOENT;
> >>>>>>       }
> >>>>>>
> >>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>>>> +     htab_unlock_bucket(b, flags);
> >>>>>>       return ret;
> >>>>>>  }
> >>>>>>
> >>>>>> @@ -1420,7 +1397,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
> >>>>>>       b = __select_bucket(htab, hash);
> >>>>>>       head = &b->head;
> >>>>>>
> >>>>>> -     ret = htab_lock_bucket(htab, b, hash, &flags);
> >>>>>> +     ret = htab_lock_bucket(b, &flags);
> >>>>>>       if (ret)
> >>>>>>               return ret;
> >>>>>>
> >>>>>> @@ -1431,7 +1408,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
> >>>>>>       else
> >>>>>>               ret = -ENOENT;
> >>>>>>
> >>>>>> -     htab_unlock_bucket(htab, b, hash, flags);
> >>>>>> +     htab_unlock_bucket(b, flags);
> >>>>>>       if (l)
> >>>>>>               htab_lru_push_free(htab, l);
> >>>>>>       return ret;
> >>>>>> @@ -1494,7 +1471,6 @@ static void htab_map_free_timers(struct bpf_map *map)
> >>>>>>  static void htab_map_free(struct bpf_map *map)
> >>>>>>  {
> >>>>>>       struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
> >>>>>> -     int i;
> >>>>>>
> >>>>>>       /* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback.
> >>>>>>        * bpf_free_used_maps() is called after bpf prog is no longer executing.
> >>>>>> @@ -1517,10 +1493,10 @@ static void htab_map_free(struct bpf_map *map)
> >>>>>>       bpf_map_area_free(htab->buckets);
> >>>>>>       bpf_mem_alloc_destroy(&htab->pcpu_ma);
> >>>>>>       bpf_mem_alloc_destroy(&htab->ma);
> >>>>>> +
> >>>>>>       if (htab->use_percpu_counter)
> >>>>>>               percpu_counter_destroy(&htab->pcount);
> >>>>>> -     for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
> >>>>>> -             free_percpu(htab->map_locked[i]);
> >>>>>> +
> >>>>>>       lockdep_unregister_key(&htab->lockdep_key);
> >>>>>>       bpf_map_area_free(htab);
> >>>>>>  }
> >>>>>> @@ -1564,7 +1540,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
> >>>>>>       b = __select_bucket(htab, hash);
> >>>>>>       head = &b->head;
> >>>>>>
> >>>>>> -     ret = htab_lock_bucket(htab, b, hash, &bflags);
> >>>>>> +     ret = htab_lock_bucket(b, &bflags);
> >>>>>>       if (ret)
> >>>>>>               return ret;
> >>>>>>
> >>>>>> @@ -1602,7 +1578,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
> >>>>>>                       free_htab_elem(htab, l);
> >>>>>>       }
> >>>>>>
> >>>>>> -     htab_unlock_bucket(htab, b, hash, bflags);
> >>>>>> +     htab_unlock_bucket(b, bflags);
> >>>>>>
> >>>>>>       if (is_lru_map && l)
> >>>>>>               htab_lru_push_free(htab, l);
> >>>>>> @@ -1720,7 +1696,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >>>>>>       head = &b->head;
> >>>>>>       /* do not grab the lock unless need it (bucket_cnt > 0). */
> >>>>>>       if (locked) {
> >>>>>> -             ret = htab_lock_bucket(htab, b, batch, &flags);
> >>>>>> +             ret = htab_lock_bucket(b, &flags);
> >>>>>>               if (ret) {
> >>>>>>                       rcu_read_unlock();
> >>>>>>                       bpf_enable_instrumentation();
> >>>>>> @@ -1743,7 +1719,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >>>>>>               /* Note that since bucket_cnt > 0 here, it is implicit
> >>>>>>                * that the locked was grabbed, so release it.
> >>>>>>                */
> >>>>>> -             htab_unlock_bucket(htab, b, batch, flags);
> >>>>>> +             htab_unlock_bucket(b, flags);
> >>>>>>               rcu_read_unlock();
> >>>>>>               bpf_enable_instrumentation();
> >>>>>>               goto after_loop;
> >>>>>> @@ -1754,7 +1730,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >>>>>>               /* Note that since bucket_cnt > 0 here, it is implicit
> >>>>>>                * that the locked was grabbed, so release it.
> >>>>>>                */
> >>>>>> -             htab_unlock_bucket(htab, b, batch, flags);
> >>>>>> +             htab_unlock_bucket(b, flags);
> >>>>>>               rcu_read_unlock();
> >>>>>>               bpf_enable_instrumentation();
> >>>>>>               kvfree(keys);
> >>>>>> @@ -1815,7 +1791,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >>>>>>               dst_val += value_size;
> >>>>>>       }
> >>>>>>
> >>>>>> -     htab_unlock_bucket(htab, b, batch, flags);
> >>>>>> +     htab_unlock_bucket(b, flags);
> >>>>>>       locked = false;
> >>>>>>
> >>>>>>       while (node_to_free) {
> >>> .
> >
>


-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-28  3:15               ` Tonghao Zhang
@ 2022-11-28 21:55                 ` Hao Luo
  2022-11-29  4:32                   ` Hou Tao
  0 siblings, 1 reply; 33+ messages in thread
From: Hao Luo @ 2022-11-28 21:55 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Hou Tao, netdev, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Jiri Olsa, bpf

On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>

Hi Tonghao,

With a quick look at the htab_lock_bucket() and your problem
statement, I agree with Hou Tao that using hash &
min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
to fix the potential deadlock. Can you actually send your changes as
v2 so we can take a look and better help you? Also, can you explain
your solution in your commit message? Right now, your commit message
has only a problem statement and is not very clear. Please include
more details on what you do to fix the issue.

Hao

> Hi
> only a warning from lockdep.
> 1. the kernel .config
> #
> # Debug Oops, Lockups and Hangs
> #
> CONFIG_PANIC_ON_OOPS=y
> CONFIG_PANIC_ON_OOPS_VALUE=1
> CONFIG_PANIC_TIMEOUT=0
> CONFIG_LOCKUP_DETECTOR=y
> CONFIG_SOFTLOCKUP_DETECTOR=y
> # CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
> CONFIG_HARDLOCKUP_DETECTOR_PERF=y
> CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
> CONFIG_HARDLOCKUP_DETECTOR=y
> CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> CONFIG_DETECT_HUNG_TASK=y
> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
> # CONFIG_WQ_WATCHDOG is not set
> # CONFIG_TEST_LOCKUP is not set
> # end of Debug Oops, Lockups and Hangs
>
> 2. bpf.c, the map size is 2.
> struct {
> __uint(type, BPF_MAP_TYPE_HASH);
> __uint(max_entries, 2);
> __uint(key_size, sizeof(unsigned int));
> __uint(value_size, sizeof(unsigned int));
> } map1 SEC(".maps");
>
> static int bpf_update_data()
> {
> unsigned int val = 1, key = 0;
>
> return bpf_map_update_elem(&map1, &key, &val, BPF_ANY);
> }
>
> SEC("kprobe/ip_rcv")
> int bpf_prog1(struct pt_regs *regs)
> {
> bpf_update_data();
> return 0;
> }
>
> SEC("tracepoint/nmi/nmi_handler")
> int bpf_prog2(struct pt_regs *regs)
> {
> bpf_update_data();
> return 0;
> }
>
> char _license[] SEC("license") = "GPL";
> unsigned int _version SEC("version") = LINUX_VERSION_CODE;
>
> 3. bpf loader.
> #include "kprobe-example.skel.h"
>
> #include <unistd.h>
> #include <errno.h>
>
> #include <bpf/bpf.h>
>
> int main()
> {
> struct kprobe_example *skel;
> int map_fd, prog_fd;
> int i;
> int err = 0;
>
> skel = kprobe_example__open_and_load();
> if (!skel)
> return -1;
>
> err = kprobe_example__attach(skel);
> if (err)
> goto cleanup;
>
> /* all libbpf APIs are usable */
> prog_fd = bpf_program__fd(skel->progs.bpf_prog1);
> map_fd = bpf_map__fd(skel->maps.map1);
>
> printf("map_fd: %d\n", map_fd);
>
> unsigned int val = 0, key = 0;
>
> while (1) {
> bpf_map_delete_elem(map_fd, &key);
> bpf_map_update_elem(map_fd, &key, &val, BPF_ANY);
> }
>
> cleanup:
> kprobe_example__destroy(skel);
> return err;
> }
>
> 4. run the bpf loader and perf record for nmi interrupts.  the warming occurs
>
> --
> Best regards, Tonghao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-28 21:55                 ` Hao Luo
@ 2022-11-29  4:32                   ` Hou Tao
  2022-11-29  6:06                     ` Tonghao Zhang
  0 siblings, 1 reply; 33+ messages in thread
From: Hou Tao @ 2022-11-29  4:32 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, Hao Luo

Hi,

On 11/29/2022 5:55 AM, Hao Luo wrote:
> On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> Hi Tonghao,
>
> With a quick look at the htab_lock_bucket() and your problem
> statement, I agree with Hou Tao that using hash &
> min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
> to fix the potential deadlock. Can you actually send your changes as
> v2 so we can take a look and better help you? Also, can you explain
> your solution in your commit message? Right now, your commit message
> has only a problem statement and is not very clear. Please include
> more details on what you do to fix the issue.
>
> Hao
It would be better if the test case below can be rewritten as a bpf selftests.
Please see comments below on how to improve it and reproduce the deadlock.
>
>> Hi
>> only a warning from lockdep.
Thanks for your details instruction.  I can reproduce the warning by using your
setup. I am not a lockdep expert, it seems that fixing such warning needs to set
different lockdep class to the different bucket. Because we use map_locked to
protect the acquisition of bucket lock, so I think we can define  lock_class_key
array in bpf_htab (e.g., lockdep_key[HASHTAB_MAP_LOCK_COUNT]) and initialize the
bucket lock accordingly.

>> 1. the kernel .config
>> #
>> # Debug Oops, Lockups and Hangs
>> #
>> CONFIG_PANIC_ON_OOPS=y
>> CONFIG_PANIC_ON_OOPS_VALUE=1
>> CONFIG_PANIC_TIMEOUT=0
>> CONFIG_LOCKUP_DETECTOR=y
>> CONFIG_SOFTLOCKUP_DETECTOR=y
>> # CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
>> CONFIG_HARDLOCKUP_DETECTOR_PERF=y
>> CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
>> CONFIG_HARDLOCKUP_DETECTOR=y
>> CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
>> CONFIG_DETECT_HUNG_TASK=y
>> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
>> # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
>> # CONFIG_WQ_WATCHDOG is not set
>> # CONFIG_TEST_LOCKUP is not set
>> # end of Debug Oops, Lockups and Hangs
>>
>> 2. bpf.c, the map size is 2.
>> struct {
>> __uint(type, BPF_MAP_TYPE_HASH);
Adding __uint(map_flags, BPF_F_ZERO_SEED); to ensure there will be no seed for
hash calculation, so we can use key=4 and key=20 to construct the case that
these two keys have the same bucket index but have different map_locked index.
>> __uint(max_entries, 2);
>> __uint(key_size, sizeof(unsigned int));
>> __uint(value_size, sizeof(unsigned int));
>> } map1 SEC(".maps");
>>
>> static int bpf_update_data()
>> {
>> unsigned int val = 1, key = 0;
key = 20
>>
>> return bpf_map_update_elem(&map1, &key, &val, BPF_ANY);
>> }
>>
>> SEC("kprobe/ip_rcv")
>> int bpf_prog1(struct pt_regs *regs)
>> {
>> bpf_update_data();
>> return 0;
>> }
kprobe on ip_rcv is unnecessary, you can just remove it.
>>
>> SEC("tracepoint/nmi/nmi_handler")
>> int bpf_prog2(struct pt_regs *regs)
>> {
>> bpf_update_data();
>> return 0;
>> }
Please use SEC("fentry/nmi_handle") instead of SEC("tracepoint") and unfold
bpf_update_data(), because the running of bpf program on tracepoint will be
blocked by bpf_prog_active which will be increased bpf_map_update_elem through
bpf_disable_instrumentation().
>>
>> char _license[] SEC("license") = "GPL";
>> unsigned int _version SEC("version") = LINUX_VERSION_CODE;
>>
>> 3. bpf loader.
>> #include "kprobe-example.skel.h"
>>
>> #include <unistd.h>
>> #include <errno.h>
>>
>> #include <bpf/bpf.h>
>>
>> int main()
>> {
>> struct kprobe_example *skel;
>> int map_fd, prog_fd;
>> int i;
>> int err = 0;
>>
>> skel = kprobe_example__open_and_load();
>> if (!skel)
>> return -1;
>>
>> err = kprobe_example__attach(skel);
>> if (err)
>> goto cleanup;
>>
>> /* all libbpf APIs are usable */
>> prog_fd = bpf_program__fd(skel->progs.bpf_prog1);
>> map_fd = bpf_map__fd(skel->maps.map1);
>>
>> printf("map_fd: %d\n", map_fd);
>>
>> unsigned int val = 0, key = 0;
>>
>> while (1) {
>> bpf_map_delete_elem(map_fd, &key);
No needed neither. Only do bpf_map_update_elem() is OK. Also change key=0 from
key=4, so it will have the same bucket index as key=20 but have different
map_locked index.
>> bpf_map_update_elem(map_fd, &key, &val, BPF_ANY);
>> }
Also need to pin the process on a specific CPU (e.g., CPU 0)
>>
>> cleanup:
>> kprobe_example__destroy(skel);
>> return err;
>> }
>>
>> 4. run the bpf loader and perf record for nmi interrupts.  the warming occurs
For perf event, you can reference prog_tests/find_vma.c on how to using
perf_event_open to trigger a perf nmi interrupt. The perf event also needs to
pin on a specific CPU as the caller of bpf_map_update_elem() does.

>>
>> --
>> Best regards, Tonghao
> .


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29  4:32                   ` Hou Tao
@ 2022-11-29  6:06                     ` Tonghao Zhang
  2022-11-29  7:56                       ` Hou Tao
  2022-11-29 12:45                       ` Hou Tao
  0 siblings, 2 replies; 33+ messages in thread
From: Tonghao Zhang @ 2022-11-29  6:06 UTC (permalink / raw)
  To: Hou Tao
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, Hao Luo

On Tue, Nov 29, 2022 at 12:32 PM Hou Tao <houtao1@huawei.com> wrote:
>
> Hi,
>
> On 11/29/2022 5:55 AM, Hao Luo wrote:
> > On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > Hi Tonghao,
> >
> > With a quick look at the htab_lock_bucket() and your problem
> > statement, I agree with Hou Tao that using hash &
> > min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
> > to fix the potential deadlock. Can you actually send your changes as
> > v2 so we can take a look and better help you? Also, can you explain
> > your solution in your commit message? Right now, your commit message
> > has only a problem statement and is not very clear. Please include
> > more details on what you do to fix the issue.
> >
> > Hao
> It would be better if the test case below can be rewritten as a bpf selftests.
> Please see comments below on how to improve it and reproduce the deadlock.
> >
> >> Hi
> >> only a warning from lockdep.
> Thanks for your details instruction.  I can reproduce the warning by using your
> setup. I am not a lockdep expert, it seems that fixing such warning needs to set
> different lockdep class to the different bucket. Because we use map_locked to
> protect the acquisition of bucket lock, so I think we can define  lock_class_key
> array in bpf_htab (e.g., lockdep_key[HASHTAB_MAP_LOCK_COUNT]) and initialize the
> bucket lock accordingly.
Hi
Thanks for your reply. define the lock_class_key array looks good.
Last question: how about using  raw_spin_trylock_irqsave, if the
bucket is locked on the same or other cpu.
raw_spin_trylock_irqsave will return the false, we should return the
-EBUSY in htab_lock_bucket.

static inline int htab_lock_bucket(struct bucket *b,
                                   unsigned long *pflags)
{
        unsigned long flags;

        if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
                return -EBUSY;

        *pflags = flags;
        return 0;
}

> >> 1. the kernel .config
> >> #
> >> # Debug Oops, Lockups and Hangs
> >> #
> >> CONFIG_PANIC_ON_OOPS=y
> >> CONFIG_PANIC_ON_OOPS_VALUE=1
> >> CONFIG_PANIC_TIMEOUT=0
> >> CONFIG_LOCKUP_DETECTOR=y
> >> CONFIG_SOFTLOCKUP_DETECTOR=y
> >> # CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
> >> CONFIG_HARDLOCKUP_DETECTOR_PERF=y
> >> CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
> >> CONFIG_HARDLOCKUP_DETECTOR=y
> >> CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> >> CONFIG_DETECT_HUNG_TASK=y
> >> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> >> # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
> >> # CONFIG_WQ_WATCHDOG is not set
> >> # CONFIG_TEST_LOCKUP is not set
> >> # end of Debug Oops, Lockups and Hangs
> >>
> >> 2. bpf.c, the map size is 2.
> >> struct {
> >> __uint(type, BPF_MAP_TYPE_HASH);
> Adding __uint(map_flags, BPF_F_ZERO_SEED); to ensure there will be no seed for
> hash calculation, so we can use key=4 and key=20 to construct the case that
> these two keys have the same bucket index but have different map_locked index.
> >> __uint(max_entries, 2);
> >> __uint(key_size, sizeof(unsigned int));
> >> __uint(value_size, sizeof(unsigned int));
> >> } map1 SEC(".maps");
> >>
> >> static int bpf_update_data()
> >> {
> >> unsigned int val = 1, key = 0;
> key = 20
> >>
> >> return bpf_map_update_elem(&map1, &key, &val, BPF_ANY);
> >> }
> >>
> >> SEC("kprobe/ip_rcv")
> >> int bpf_prog1(struct pt_regs *regs)
> >> {
> >> bpf_update_data();
> >> return 0;
> >> }
> kprobe on ip_rcv is unnecessary, you can just remove it.
> >>
> >> SEC("tracepoint/nmi/nmi_handler")
> >> int bpf_prog2(struct pt_regs *regs)
> >> {
> >> bpf_update_data();
> >> return 0;
> >> }
> Please use SEC("fentry/nmi_handle") instead of SEC("tracepoint") and unfold
> bpf_update_data(), because the running of bpf program on tracepoint will be
> blocked by bpf_prog_active which will be increased bpf_map_update_elem through
> bpf_disable_instrumentation().
> >>
> >> char _license[] SEC("license") = "GPL";
> >> unsigned int _version SEC("version") = LINUX_VERSION_CODE;
> >>
> >> 3. bpf loader.
> >> #include "kprobe-example.skel.h"
> >>
> >> #include <unistd.h>
> >> #include <errno.h>
> >>
> >> #include <bpf/bpf.h>
> >>
> >> int main()
> >> {
> >> struct kprobe_example *skel;
> >> int map_fd, prog_fd;
> >> int i;
> >> int err = 0;
> >>
> >> skel = kprobe_example__open_and_load();
> >> if (!skel)
> >> return -1;
> >>
> >> err = kprobe_example__attach(skel);
> >> if (err)
> >> goto cleanup;
> >>
> >> /* all libbpf APIs are usable */
> >> prog_fd = bpf_program__fd(skel->progs.bpf_prog1);
> >> map_fd = bpf_map__fd(skel->maps.map1);
> >>
> >> printf("map_fd: %d\n", map_fd);
> >>
> >> unsigned int val = 0, key = 0;
> >>
> >> while (1) {
> >> bpf_map_delete_elem(map_fd, &key);
> No needed neither. Only do bpf_map_update_elem() is OK. Also change key=0 from
> key=4, so it will have the same bucket index as key=20 but have different
> map_locked index.
> >> bpf_map_update_elem(map_fd, &key, &val, BPF_ANY);
> >> }
> Also need to pin the process on a specific CPU (e.g., CPU 0)
> >>
> >> cleanup:
> >> kprobe_example__destroy(skel);
> >> return err;
> >> }
> >>
> >> 4. run the bpf loader and perf record for nmi interrupts.  the warming occurs
> For perf event, you can reference prog_tests/find_vma.c on how to using
> perf_event_open to trigger a perf nmi interrupt. The perf event also needs to
> pin on a specific CPU as the caller of bpf_map_update_elem() does.
>
> >>
> >> --
> >> Best regards, Tonghao
> > .
>


-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29  6:06                     ` Tonghao Zhang
@ 2022-11-29  7:56                       ` Hou Tao
  2022-11-29 12:45                       ` Hou Tao
  1 sibling, 0 replies; 33+ messages in thread
From: Hou Tao @ 2022-11-29  7:56 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, Hao Luo

Hi,

On 11/29/2022 2:06 PM, Tonghao Zhang wrote:
> On Tue, Nov 29, 2022 at 12:32 PM Hou Tao <houtao1@huawei.com> wrote:
>> Hi,
>>
>> On 11/29/2022 5:55 AM, Hao Luo wrote:
>>> On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>>> Hi Tonghao,
>>>
>>> With a quick look at the htab_lock_bucket() and your problem
>>> statement, I agree with Hou Tao that using hash &
>>> min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
>>> to fix the potential deadlock. Can you actually send your changes as
>>> v2 so we can take a look and better help you? Also, can you explain
>>> your solution in your commit message? Right now, your commit message
>>> has only a problem statement and is not very clear. Please include
>>> more details on what you do to fix the issue.
>>>
>>> Hao
>> It would be better if the test case below can be rewritten as a bpf selftests.
>> Please see comments below on how to improve it and reproduce the deadlock.
>>>> Hi
>>>> only a warning from lockdep.
>> Thanks for your details instruction.  I can reproduce the warning by using your
>> setup. I am not a lockdep expert, it seems that fixing such warning needs to set
>> different lockdep class to the different bucket. Because we use map_locked to
>> protect the acquisition of bucket lock, so I think we can define  lock_class_key
>> array in bpf_htab (e.g., lockdep_key[HASHTAB_MAP_LOCK_COUNT]) and initialize the
>> bucket lock accordingly.
> Hi
> Thanks for your reply. define the lock_class_key array looks good.
> Last question: how about using  raw_spin_trylock_irqsave, if the
> bucket is locked on the same or other cpu.
> raw_spin_trylock_irqsave will return the false, we should return the
> -EBUSY in htab_lock_bucket.
>
> static inline int htab_lock_bucket(struct bucket *b,
>                                    unsigned long *pflags)
> {
>         unsigned long flags;
>
>         if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
>                 return -EBUSY;
>
>         *pflags = flags;
>         return 0;
> }
The flaw of trylock solution is that it can not distinguish between dead-lock
and lock with high contention. So I don't think it is a good idea to do that.
>
>>>> 1. the kernel .config
>>>> #
>>>> # Debug Oops, Lockups and Hangs
>>>> #
>>>> CONFIG_PANIC_ON_OOPS=y
>>>> CONFIG_PANIC_ON_OOPS_VALUE=1
>>>> CONFIG_PANIC_TIMEOUT=0
>>>> CONFIG_LOCKUP_DETECTOR=y
>>>> CONFIG_SOFTLOCKUP_DETECTOR=y
>>>> # CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
>>>> CONFIG_HARDLOCKUP_DETECTOR_PERF=y
>>>> CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
>>>> CONFIG_HARDLOCKUP_DETECTOR=y
>>>> CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
>>>> CONFIG_DETECT_HUNG_TASK=y
>>>> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
>>>> # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
>>>> # CONFIG_WQ_WATCHDOG is not set
>>>> # CONFIG_TEST_LOCKUP is not set
>>>> # end of Debug Oops, Lockups and Hangs
>>>>
>>>> 2. bpf.c, the map size is 2.
>>>> struct {
>>>> __uint(type, BPF_MAP_TYPE_HASH);
>> Adding __uint(map_flags, BPF_F_ZERO_SEED); to ensure there will be no seed for
>> hash calculation, so we can use key=4 and key=20 to construct the case that
>> these two keys have the same bucket index but have different map_locked index.
>>>> __uint(max_entries, 2);
>>>> __uint(key_size, sizeof(unsigned int));
>>>> __uint(value_size, sizeof(unsigned int));
>>>> } map1 SEC(".maps");
>>>>
>>>> static int bpf_update_data()
>>>> {
>>>> unsigned int val = 1, key = 0;
>> key = 20
>>>> return bpf_map_update_elem(&map1, &key, &val, BPF_ANY);
>>>> }
>>>>
>>>> SEC("kprobe/ip_rcv")
>>>> int bpf_prog1(struct pt_regs *regs)
>>>> {
>>>> bpf_update_data();
>>>> return 0;
>>>> }
>> kprobe on ip_rcv is unnecessary, you can just remove it.
>>>> SEC("tracepoint/nmi/nmi_handler")
>>>> int bpf_prog2(struct pt_regs *regs)
>>>> {
>>>> bpf_update_data();
>>>> return 0;
>>>> }
>> Please use SEC("fentry/nmi_handle") instead of SEC("tracepoint") and unfold
>> bpf_update_data(), because the running of bpf program on tracepoint will be
>> blocked by bpf_prog_active which will be increased bpf_map_update_elem through
>> bpf_disable_instrumentation().
>>>> char _license[] SEC("license") = "GPL";
>>>> unsigned int _version SEC("version") = LINUX_VERSION_CODE;
>>>>
>>>> 3. bpf loader.
>>>> #include "kprobe-example.skel.h"
>>>>
>>>> #include <unistd.h>
>>>> #include <errno.h>
>>>>
>>>> #include <bpf/bpf.h>
>>>>
>>>> int main()
>>>> {
>>>> struct kprobe_example *skel;
>>>> int map_fd, prog_fd;
>>>> int i;
>>>> int err = 0;
>>>>
>>>> skel = kprobe_example__open_and_load();
>>>> if (!skel)
>>>> return -1;
>>>>
>>>> err = kprobe_example__attach(skel);
>>>> if (err)
>>>> goto cleanup;
>>>>
>>>> /* all libbpf APIs are usable */
>>>> prog_fd = bpf_program__fd(skel->progs.bpf_prog1);
>>>> map_fd = bpf_map__fd(skel->maps.map1);
>>>>
>>>> printf("map_fd: %d\n", map_fd);
>>>>
>>>> unsigned int val = 0, key = 0;
>>>>
>>>> while (1) {
>>>> bpf_map_delete_elem(map_fd, &key);
>> No needed neither. Only do bpf_map_update_elem() is OK. Also change key=0 from
>> key=4, so it will have the same bucket index as key=20 but have different
>> map_locked index.
>>>> bpf_map_update_elem(map_fd, &key, &val, BPF_ANY);
>>>> }
>> Also need to pin the process on a specific CPU (e.g., CPU 0)
>>>> cleanup:
>>>> kprobe_example__destroy(skel);
>>>> return err;
>>>> }
>>>>
>>>> 4. run the bpf loader and perf record for nmi interrupts.  the warming occurs
>> For perf event, you can reference prog_tests/find_vma.c on how to using
>> perf_event_open to trigger a perf nmi interrupt. The perf event also needs to
>> pin on a specific CPU as the caller of bpf_map_update_elem() does.
>>
>>>> --
>>>> Best regards, Tonghao
>>> .
>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29  6:06                     ` Tonghao Zhang
  2022-11-29  7:56                       ` Hou Tao
@ 2022-11-29 12:45                       ` Hou Tao
  2022-11-29 16:06                         ` Waiman Long
  1 sibling, 1 reply; 33+ messages in thread
From: Hou Tao @ 2022-11-29 12:45 UTC (permalink / raw)
  To: Boqun Feng, Tonghao Zhang, Peter Zijlstra, Ingo Molnar,
	Will Deacon, Waiman Long
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, Hao Luo, houtao1,
	LKML

Hi,

On 11/29/2022 2:06 PM, Tonghao Zhang wrote:
> On Tue, Nov 29, 2022 at 12:32 PM Hou Tao <houtao1@huawei.com> wrote:
>> Hi,
>>
>> On 11/29/2022 5:55 AM, Hao Luo wrote:
>>> On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>>> Hi Tonghao,
>>>
>>> With a quick look at the htab_lock_bucket() and your problem
>>> statement, I agree with Hou Tao that using hash &
>>> min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
>>> to fix the potential deadlock. Can you actually send your changes as
>>> v2 so we can take a look and better help you? Also, can you explain
>>> your solution in your commit message? Right now, your commit message
>>> has only a problem statement and is not very clear. Please include
>>> more details on what you do to fix the issue.
>>>
>>> Hao
>> It would be better if the test case below can be rewritten as a bpf selftests.
>> Please see comments below on how to improve it and reproduce the deadlock.
>>>> Hi
>>>> only a warning from lockdep.
>> Thanks for your details instruction.  I can reproduce the warning by using your
>> setup. I am not a lockdep expert, it seems that fixing such warning needs to set
>> different lockdep class to the different bucket. Because we use map_locked to
>> protect the acquisition of bucket lock, so I think we can define  lock_class_key
>> array in bpf_htab (e.g., lockdep_key[HASHTAB_MAP_LOCK_COUNT]) and initialize the
>> bucket lock accordingly.
The proposed lockdep solution doesn't work. Still got lockdep warning after
that, so cc +locking expert +lkml.org for lockdep help.

Hi lockdep experts,

We are trying to fix the following lockdep warning from bpf subsystem:

[   36.092222] ================================
[   36.092230] WARNING: inconsistent lock state
[   36.092234] 6.1.0-rc5+ #81 Tainted: G            E
[   36.092236] --------------------------------
[   36.092237] inconsistent {INITIAL USE} -> {IN-NMI} usage.
[   36.092238] perf/1515 [HC1[1]:SC0[0]:HE0:SE1] takes:
[   36.092242] ffff888341acd1a0 (&htab->lockdep_key){....}-{2:2}, at:
htab_lock_bucket+0x4d/0x58
[   36.092253] {INITIAL USE} state was registered at:
[   36.092255]   mark_usage+0x1d/0x11d
[   36.092262]   __lock_acquire+0x3c9/0x6ed
[   36.092266]   lock_acquire+0x23d/0x29a
[   36.092270]   _raw_spin_lock_irqsave+0x43/0x7f
[   36.092274]   htab_lock_bucket+0x4d/0x58
[   36.092276]   htab_map_delete_elem+0x82/0xfb
[   36.092278]   map_delete_elem+0x156/0x1ac
[   36.092282]   __sys_bpf+0x138/0xb71
[   36.092285]   __do_sys_bpf+0xd/0x15
[   36.092288]   do_syscall_64+0x6d/0x84
[   36.092291]   entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   36.092295] irq event stamp: 120346
[   36.092296] hardirqs last  enabled at (120345): [<ffffffff8180b97f>]
_raw_spin_unlock_irq+0x24/0x39
[   36.092299] hardirqs last disabled at (120346): [<ffffffff81169e85>]
generic_exec_single+0x40/0xb9
[   36.092303] softirqs last  enabled at (120268): [<ffffffff81c00347>]
__do_softirq+0x347/0x387
[   36.092307] softirqs last disabled at (120133): [<ffffffff810ba4f0>]
__irq_exit_rcu+0x67/0xc6
[   36.092311]
[   36.092311] other info that might help us debug this:
[   36.092312]  Possible unsafe locking scenario:
[   36.092312]
[   36.092313]        CPU0
[   36.092313]        ----
[   36.092314]   lock(&htab->lockdep_key);
[   36.092315]   <Interrupt>
[   36.092316]     lock(&htab->lockdep_key);
[   36.092318]
[   36.092318]  *** DEADLOCK ***
[   36.092318]
[   36.092318] 3 locks held by perf/1515:
[   36.092320]  #0: ffff8881b9805cc0 (&cpuctx_mutex){+.+.}-{4:4}, at:
perf_event_ctx_lock_nested+0x8e/0xba
[   36.092327]  #1: ffff8881075ecc20 (&event->child_mutex){+.+.}-{4:4}, at:
perf_event_for_each_child+0x35/0x76
[   36.092332]  #2: ffff8881b9805c20 (&cpuctx_lock){-.-.}-{2:2}, at:
perf_ctx_lock+0x12/0x27
[   36.092339]
[   36.092339] stack backtrace:
[   36.092341] CPU: 0 PID: 1515 Comm: perf Tainted: G            E     
6.1.0-rc5+ #81
[   36.092344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   36.092349] Call Trace:
[   36.092351]  <NMI>
[   36.092354]  dump_stack_lvl+0x57/0x81
[   36.092359]  lock_acquire+0x1f4/0x29a
[   36.092363]  ? handle_pmi_common+0x13f/0x1f0
[   36.092366]  ? htab_lock_bucket+0x4d/0x58
[   36.092371]  _raw_spin_lock_irqsave+0x43/0x7f
[   36.092374]  ? htab_lock_bucket+0x4d/0x58
[   36.092377]  htab_lock_bucket+0x4d/0x58
[   36.092379]  htab_map_update_elem+0x11e/0x220
[   36.092386]  bpf_prog_f3a535ca81a8128a_bpf_prog2+0x3e/0x42
[   36.092392]  trace_call_bpf+0x177/0x215
[   36.092398]  perf_trace_run_bpf_submit+0x52/0xaa
[   36.092403]  ? x86_pmu_stop+0x97/0x97
[   36.092407]  perf_trace_nmi_handler+0xb7/0xe0
[   36.092415]  nmi_handle+0x116/0x254
[   36.092418]  ? x86_pmu_stop+0x97/0x97
[   36.092423]  default_do_nmi+0x3d/0xf6
[   36.092428]  exc_nmi+0xa1/0x109
[   36.092432]  end_repeat_nmi+0x16/0x67
[   36.092436] RIP: 0010:wrmsrl+0xd/0x1b
[   36.092441] Code: 04 01 00 00 c6 84 07 48 01 00 00 01 5b e9 46 15 80 00 5b c3
cc cc cc cc c3 cc cc cc cc 48 89 f2 89 f9 89 f0 48 c1 ea 20 0f 30 <66> 90 c3 cc
cc cc cc 31 d2 e9 2f 04 49 00 0f 1f 44 00 00 40 0f6
[   36.092443] RSP: 0018:ffffc900043dfc48 EFLAGS: 00000002
[   36.092445] RAX: 000000000000000f RBX: ffff8881b96153e0 RCX: 000000000000038f
[   36.092447] RDX: 0000000000000007 RSI: 000000070000000f RDI: 000000000000038f
[   36.092449] RBP: 000000070000000f R08: ffffffffffffffff R09: ffff8881053bdaa8
[   36.092451] R10: ffff8881b9805d40 R11: 0000000000000005 R12: ffff8881b9805c00
[   36.092452] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8881075ec970
[   36.092460]  ? wrmsrl+0xd/0x1b
[   36.092465]  ? wrmsrl+0xd/0x1b
[   36.092469]  </NMI>
[   36.092469]  <TASK>
[   36.092470]  __intel_pmu_enable_all.constprop.0+0x7c/0xaf
[   36.092475]  event_function+0xb6/0xd3
[   36.092478]  ? cpu_to_node+0x1a/0x1a
[   36.092482]  ? cpu_to_node+0x1a/0x1a
[   36.092485]  remote_function+0x1e/0x4c
[   36.092489]  generic_exec_single+0x48/0xb9
[   36.092492]  ? __lock_acquire+0x666/0x6ed
[   36.092497]  smp_call_function_single+0xbf/0x106
[   36.092499]  ? cpu_to_node+0x1a/0x1a
[   36.092504]  ? kvm_sched_clock_read+0x5/0x11
[   36.092508]  ? __perf_event_task_sched_in+0x13d/0x13d
[   36.092513]  cpu_function_call+0x47/0x69
[   36.092516]  ? perf_event_update_time+0x52/0x52
[   36.092519]  event_function_call+0x89/0x117
[   36.092521]  ? __perf_event_task_sched_in+0x13d/0x13d
[   36.092526]  ? _perf_event_disable+0x4a/0x4a
[   36.092528]  perf_event_for_each_child+0x3d/0x76
[   36.092532]  ? _perf_event_disable+0x4a/0x4a
[   36.092533]  _perf_ioctl+0x564/0x590
[   36.092537]  ? __lock_release+0xd5/0x1b0
[   36.092543]  ? perf_event_ctx_lock_nested+0x8e/0xba
[   36.092547]  perf_ioctl+0x42/0x5f
[   36.092551]  vfs_ioctl+0x1e/0x2f
[   36.092554]  __do_sys_ioctl+0x66/0x89
[   36.092559]  do_syscall_64+0x6d/0x84
[   36.092563]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   36.092566] RIP: 0033:0x7fe7110f362b
[   36.092569] Code: 0f 1e fa 48 8b 05 5d b8 2c 00 64 c7 00 26 00 00 00 48 c7 c0
ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0
ff ff 73 01 c3 48 8b 0d 2d b8 2c 00 f7 d8 64 89 018
[   36.092570] RSP: 002b:00007ffebb8e4b08 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[   36.092573] RAX: ffffffffffffffda RBX: 0000000000002400 RCX: 00007fe7110f362b
[   36.092575] RDX: 0000000000000000 RSI: 0000000000002400 RDI: 0000000000000013
[   36.092576] RBP: 00007ffebb8e4b40 R08: 0000000000000001 R09: 000055c1db4a5b40
[   36.092577] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[   36.092579] R13: 000055c1db3b2a30 R14: 0000000000000000 R15: 0000000000000000
[   36.092586]  </TASK>

The lockdep warning is a false alarm, because per-cpu map_locked must be zero
before acquire b->raw_lock. If b->raw_lock has already been acquired by a normal
process through htab_map_update_elem(), then a NMI interrupts the process and
tries to acquire the same b->raw_lock, the acquisition will fail because per-cpu
map_locked has already been increased by the process.

So beside using lockdep_off() and lockdep_on() to disable/enable lockdep
temporarily in htab_lock_bucket() and htab_unlock_bucket(), are there other ways
to fix the lockdep warning ?

Thanks,
Tao




> Hi
> Thanks for your reply. define the lock_class_key array looks good.
> Last question: how about using  raw_spin_trylock_irqsave, if the
> bucket is locked on the same or other cpu.
> raw_spin_trylock_irqsave will return the false, we should return the
> -EBUSY in htab_lock_bucket.
>
> static inline int htab_lock_bucket(struct bucket *b,
>                                    unsigned long *pflags)
> {
>         unsigned long flags;
>
>         if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
>                 return -EBUSY;
>
>         *pflags = flags;
>         return 0;
> }
>
>>>> 1. the kernel .config
>>>> #
>>>> # Debug Oops, Lockups and Hangs
>>>> #
>>>> CONFIG_PANIC_ON_OOPS=y
>>>> CONFIG_PANIC_ON_OOPS_VALUE=1
>>>> CONFIG_PANIC_TIMEOUT=0
>>>> CONFIG_LOCKUP_DETECTOR=y
>>>> CONFIG_SOFTLOCKUP_DETECTOR=y
>>>> # CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
>>>> CONFIG_HARDLOCKUP_DETECTOR_PERF=y
>>>> CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
>>>> CONFIG_HARDLOCKUP_DETECTOR=y
>>>> CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
>>>> CONFIG_DETECT_HUNG_TASK=y
>>>> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
>>>> # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
>>>> # CONFIG_WQ_WATCHDOG is not set
>>>> # CONFIG_TEST_LOCKUP is not set
>>>> # end of Debug Oops, Lockups and Hangs
>>>>
>>>> 2. bpf.c, the map size is 2.
>>>> struct {
>>>> __uint(type, BPF_MAP_TYPE_HASH);
>> Adding __uint(map_flags, BPF_F_ZERO_SEED); to ensure there will be no seed for
>> hash calculation, so we can use key=4 and key=20 to construct the case that
>> these two keys have the same bucket index but have different map_locked index.
>>>> __uint(max_entries, 2);
>>>> __uint(key_size, sizeof(unsigned int));
>>>> __uint(value_size, sizeof(unsigned int));
>>>> } map1 SEC(".maps");
>>>>
>>>> static int bpf_update_data()
>>>> {
>>>> unsigned int val = 1, key = 0;
>> key = 20
>>>> return bpf_map_update_elem(&map1, &key, &val, BPF_ANY);
>>>> }
>>>>
>>>> SEC("kprobe/ip_rcv")
>>>> int bpf_prog1(struct pt_regs *regs)
>>>> {
>>>> bpf_update_data();
>>>> return 0;
>>>> }
>> kprobe on ip_rcv is unnecessary, you can just remove it.
>>>> SEC("tracepoint/nmi/nmi_handler")
>>>> int bpf_prog2(struct pt_regs *regs)
>>>> {
>>>> bpf_update_data();
>>>> return 0;
>>>> }
>> Please use SEC("fentry/nmi_handle") instead of SEC("tracepoint") and unfold
>> bpf_update_data(), because the running of bpf program on tracepoint will be
>> blocked by bpf_prog_active which will be increased bpf_map_update_elem through
>> bpf_disable_instrumentation().
>>>> char _license[] SEC("license") = "GPL";
>>>> unsigned int _version SEC("version") = LINUX_VERSION_CODE;
>>>>
>>>> 3. bpf loader.
>>>> #include "kprobe-example.skel.h"
>>>>
>>>> #include <unistd.h>
>>>> #include <errno.h>
>>>>
>>>> #include <bpf/bpf.h>
>>>>
>>>> int main()
>>>> {
>>>> struct kprobe_example *skel;
>>>> int map_fd, prog_fd;
>>>> int i;
>>>> int err = 0;
>>>>
>>>> skel = kprobe_example__open_and_load();
>>>> if (!skel)
>>>> return -1;
>>>>
>>>> err = kprobe_example__attach(skel);
>>>> if (err)
>>>> goto cleanup;
>>>>
>>>> /* all libbpf APIs are usable */
>>>> prog_fd = bpf_program__fd(skel->progs.bpf_prog1);
>>>> map_fd = bpf_map__fd(skel->maps.map1);
>>>>
>>>> printf("map_fd: %d\n", map_fd);
>>>>
>>>> unsigned int val = 0, key = 0;
>>>>
>>>> while (1) {
>>>> bpf_map_delete_elem(map_fd, &key);
>> No needed neither. Only do bpf_map_update_elem() is OK. Also change key=0 from
>> key=4, so it will have the same bucket index as key=20 but have different
>> map_locked index.
>>>> bpf_map_update_elem(map_fd, &key, &val, BPF_ANY);
>>>> }
>> Also need to pin the process on a specific CPU (e.g., CPU 0)
>>>> cleanup:
>>>> kprobe_example__destroy(skel);
>>>> return err;
>>>> }
>>>>
>>>> 4. run the bpf loader and perf record for nmi interrupts.  the warming occurs
>> For perf event, you can reference prog_tests/find_vma.c on how to using
>> perf_event_open to trigger a perf nmi interrupt. The perf event also needs to
>> pin on a specific CPU as the caller of bpf_map_update_elem() does.
>>
>>>> --
>>>> Best regards, Tonghao
>>> .
>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29 12:45                       ` Hou Tao
@ 2022-11-29 16:06                         ` Waiman Long
  2022-11-29 17:23                           ` Boqun Feng
  0 siblings, 1 reply; 33+ messages in thread
From: Waiman Long @ 2022-11-29 16:06 UTC (permalink / raw)
  To: Hou Tao, Boqun Feng, Tonghao Zhang, Peter Zijlstra, Ingo Molnar,
	Will Deacon
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, Hao Luo, houtao1,
	LKML

On 11/29/22 07:45, Hou Tao wrote:
> Hi,
>
> On 11/29/2022 2:06 PM, Tonghao Zhang wrote:
>> On Tue, Nov 29, 2022 at 12:32 PM Hou Tao <houtao1@huawei.com> wrote:
>>> Hi,
>>>
>>> On 11/29/2022 5:55 AM, Hao Luo wrote:
>>>> On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>>>> Hi Tonghao,
>>>>
>>>> With a quick look at the htab_lock_bucket() and your problem
>>>> statement, I agree with Hou Tao that using hash &
>>>> min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
>>>> to fix the potential deadlock. Can you actually send your changes as
>>>> v2 so we can take a look and better help you? Also, can you explain
>>>> your solution in your commit message? Right now, your commit message
>>>> has only a problem statement and is not very clear. Please include
>>>> more details on what you do to fix the issue.
>>>>
>>>> Hao
>>> It would be better if the test case below can be rewritten as a bpf selftests.
>>> Please see comments below on how to improve it and reproduce the deadlock.
>>>>> Hi
>>>>> only a warning from lockdep.
>>> Thanks for your details instruction.  I can reproduce the warning by using your
>>> setup. I am not a lockdep expert, it seems that fixing such warning needs to set
>>> different lockdep class to the different bucket. Because we use map_locked to
>>> protect the acquisition of bucket lock, so I think we can define  lock_class_key
>>> array in bpf_htab (e.g., lockdep_key[HASHTAB_MAP_LOCK_COUNT]) and initialize the
>>> bucket lock accordingly.
> The proposed lockdep solution doesn't work. Still got lockdep warning after
> that, so cc +locking expert +lkml.org for lockdep help.
>
> Hi lockdep experts,
>
> We are trying to fix the following lockdep warning from bpf subsystem:
>
> [   36.092222] ================================
> [   36.092230] WARNING: inconsistent lock state
> [   36.092234] 6.1.0-rc5+ #81 Tainted: G            E
> [   36.092236] --------------------------------
> [   36.092237] inconsistent {INITIAL USE} -> {IN-NMI} usage.
> [   36.092238] perf/1515 [HC1[1]:SC0[0]:HE0:SE1] takes:
> [   36.092242] ffff888341acd1a0 (&htab->lockdep_key){....}-{2:2}, at:
> htab_lock_bucket+0x4d/0x58
> [   36.092253] {INITIAL USE} state was registered at:
> [   36.092255]   mark_usage+0x1d/0x11d
> [   36.092262]   __lock_acquire+0x3c9/0x6ed
> [   36.092266]   lock_acquire+0x23d/0x29a
> [   36.092270]   _raw_spin_lock_irqsave+0x43/0x7f
> [   36.092274]   htab_lock_bucket+0x4d/0x58
> [   36.092276]   htab_map_delete_elem+0x82/0xfb
> [   36.092278]   map_delete_elem+0x156/0x1ac
> [   36.092282]   __sys_bpf+0x138/0xb71
> [   36.092285]   __do_sys_bpf+0xd/0x15
> [   36.092288]   do_syscall_64+0x6d/0x84
> [   36.092291]   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   36.092295] irq event stamp: 120346
> [   36.092296] hardirqs last  enabled at (120345): [<ffffffff8180b97f>]
> _raw_spin_unlock_irq+0x24/0x39
> [   36.092299] hardirqs last disabled at (120346): [<ffffffff81169e85>]
> generic_exec_single+0x40/0xb9
> [   36.092303] softirqs last  enabled at (120268): [<ffffffff81c00347>]
> __do_softirq+0x347/0x387
> [   36.092307] softirqs last disabled at (120133): [<ffffffff810ba4f0>]
> __irq_exit_rcu+0x67/0xc6
> [   36.092311]
> [   36.092311] other info that might help us debug this:
> [   36.092312]  Possible unsafe locking scenario:
> [   36.092312]
> [   36.092313]        CPU0
> [   36.092313]        ----
> [   36.092314]   lock(&htab->lockdep_key);
> [   36.092315]   <Interrupt>
> [   36.092316]     lock(&htab->lockdep_key);
> [   36.092318]
> [   36.092318]  *** DEADLOCK ***
> [   36.092318]
> [   36.092318] 3 locks held by perf/1515:
> [   36.092320]  #0: ffff8881b9805cc0 (&cpuctx_mutex){+.+.}-{4:4}, at:
> perf_event_ctx_lock_nested+0x8e/0xba
> [   36.092327]  #1: ffff8881075ecc20 (&event->child_mutex){+.+.}-{4:4}, at:
> perf_event_for_each_child+0x35/0x76
> [   36.092332]  #2: ffff8881b9805c20 (&cpuctx_lock){-.-.}-{2:2}, at:
> perf_ctx_lock+0x12/0x27
> [   36.092339]
> [   36.092339] stack backtrace:
> [   36.092341] CPU: 0 PID: 1515 Comm: perf Tainted: G            E
> 6.1.0-rc5+ #81
> [   36.092344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [   36.092349] Call Trace:
> [   36.092351]  <NMI>
> [   36.092354]  dump_stack_lvl+0x57/0x81
> [   36.092359]  lock_acquire+0x1f4/0x29a
> [   36.092363]  ? handle_pmi_common+0x13f/0x1f0
> [   36.092366]  ? htab_lock_bucket+0x4d/0x58
> [   36.092371]  _raw_spin_lock_irqsave+0x43/0x7f
> [   36.092374]  ? htab_lock_bucket+0x4d/0x58
> [   36.092377]  htab_lock_bucket+0x4d/0x58
> [   36.092379]  htab_map_update_elem+0x11e/0x220
> [   36.092386]  bpf_prog_f3a535ca81a8128a_bpf_prog2+0x3e/0x42
> [   36.092392]  trace_call_bpf+0x177/0x215
> [   36.092398]  perf_trace_run_bpf_submit+0x52/0xaa
> [   36.092403]  ? x86_pmu_stop+0x97/0x97
> [   36.092407]  perf_trace_nmi_handler+0xb7/0xe0
> [   36.092415]  nmi_handle+0x116/0x254
> [   36.092418]  ? x86_pmu_stop+0x97/0x97
> [   36.092423]  default_do_nmi+0x3d/0xf6
> [   36.092428]  exc_nmi+0xa1/0x109
> [   36.092432]  end_repeat_nmi+0x16/0x67
> [   36.092436] RIP: 0010:wrmsrl+0xd/0x1b

So the lock is really taken in a NMI context. In general, we advise 
again using lock in a NMI context unless it is a lock that is used only 
in that context. Otherwise, deadlock is certainly a possibility as there 
is no way to mask off again NMI.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29 16:06                         ` Waiman Long
@ 2022-11-29 17:23                           ` Boqun Feng
  2022-11-29 17:32                             ` Boqun Feng
  2022-11-30  1:37                             ` Hou Tao
  0 siblings, 2 replies; 33+ messages in thread
From: Boqun Feng @ 2022-11-29 17:23 UTC (permalink / raw)
  To: Waiman Long
  Cc: Hou Tao, Tonghao Zhang, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, Hao Luo, houtao1,
	LKML

On Tue, Nov 29, 2022 at 11:06:51AM -0500, Waiman Long wrote:
> On 11/29/22 07:45, Hou Tao wrote:
> > Hi,
> > 
> > On 11/29/2022 2:06 PM, Tonghao Zhang wrote:
> > > On Tue, Nov 29, 2022 at 12:32 PM Hou Tao <houtao1@huawei.com> wrote:
> > > > Hi,
> > > > 
> > > > On 11/29/2022 5:55 AM, Hao Luo wrote:
> > > > > On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > > > Hi Tonghao,
> > > > > 
> > > > > With a quick look at the htab_lock_bucket() and your problem
> > > > > statement, I agree with Hou Tao that using hash &
> > > > > min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
> > > > > to fix the potential deadlock. Can you actually send your changes as
> > > > > v2 so we can take a look and better help you? Also, can you explain
> > > > > your solution in your commit message? Right now, your commit message
> > > > > has only a problem statement and is not very clear. Please include
> > > > > more details on what you do to fix the issue.
> > > > > 
> > > > > Hao
> > > > It would be better if the test case below can be rewritten as a bpf selftests.
> > > > Please see comments below on how to improve it and reproduce the deadlock.
> > > > > > Hi
> > > > > > only a warning from lockdep.
> > > > Thanks for your details instruction.  I can reproduce the warning by using your
> > > > setup. I am not a lockdep expert, it seems that fixing such warning needs to set
> > > > different lockdep class to the different bucket. Because we use map_locked to
> > > > protect the acquisition of bucket lock, so I think we can define  lock_class_key
> > > > array in bpf_htab (e.g., lockdep_key[HASHTAB_MAP_LOCK_COUNT]) and initialize the
> > > > bucket lock accordingly.
> > The proposed lockdep solution doesn't work. Still got lockdep warning after
> > that, so cc +locking expert +lkml.org for lockdep help.
> > 
> > Hi lockdep experts,
> > 
> > We are trying to fix the following lockdep warning from bpf subsystem:
> > 
> > [   36.092222] ================================
> > [   36.092230] WARNING: inconsistent lock state
> > [   36.092234] 6.1.0-rc5+ #81 Tainted: G            E
> > [   36.092236] --------------------------------
> > [   36.092237] inconsistent {INITIAL USE} -> {IN-NMI} usage.
> > [   36.092238] perf/1515 [HC1[1]:SC0[0]:HE0:SE1] takes:
> > [   36.092242] ffff888341acd1a0 (&htab->lockdep_key){....}-{2:2}, at:
> > htab_lock_bucket+0x4d/0x58
> > [   36.092253] {INITIAL USE} state was registered at:
> > [   36.092255]   mark_usage+0x1d/0x11d
> > [   36.092262]   __lock_acquire+0x3c9/0x6ed
> > [   36.092266]   lock_acquire+0x23d/0x29a
> > [   36.092270]   _raw_spin_lock_irqsave+0x43/0x7f
> > [   36.092274]   htab_lock_bucket+0x4d/0x58
> > [   36.092276]   htab_map_delete_elem+0x82/0xfb
> > [   36.092278]   map_delete_elem+0x156/0x1ac
> > [   36.092282]   __sys_bpf+0x138/0xb71
> > [   36.092285]   __do_sys_bpf+0xd/0x15
> > [   36.092288]   do_syscall_64+0x6d/0x84
> > [   36.092291]   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > [   36.092295] irq event stamp: 120346
> > [   36.092296] hardirqs last  enabled at (120345): [<ffffffff8180b97f>]
> > _raw_spin_unlock_irq+0x24/0x39
> > [   36.092299] hardirqs last disabled at (120346): [<ffffffff81169e85>]
> > generic_exec_single+0x40/0xb9
> > [   36.092303] softirqs last  enabled at (120268): [<ffffffff81c00347>]
> > __do_softirq+0x347/0x387
> > [   36.092307] softirqs last disabled at (120133): [<ffffffff810ba4f0>]
> > __irq_exit_rcu+0x67/0xc6
> > [   36.092311]
> > [   36.092311] other info that might help us debug this:
> > [   36.092312]  Possible unsafe locking scenario:
> > [   36.092312]
> > [   36.092313]        CPU0
> > [   36.092313]        ----
> > [   36.092314]   lock(&htab->lockdep_key);
> > [   36.092315]   <Interrupt>
> > [   36.092316]     lock(&htab->lockdep_key);
> > [   36.092318]
> > [   36.092318]  *** DEADLOCK ***
> > [   36.092318]
> > [   36.092318] 3 locks held by perf/1515:
> > [   36.092320]  #0: ffff8881b9805cc0 (&cpuctx_mutex){+.+.}-{4:4}, at:
> > perf_event_ctx_lock_nested+0x8e/0xba
> > [   36.092327]  #1: ffff8881075ecc20 (&event->child_mutex){+.+.}-{4:4}, at:
> > perf_event_for_each_child+0x35/0x76
> > [   36.092332]  #2: ffff8881b9805c20 (&cpuctx_lock){-.-.}-{2:2}, at:
> > perf_ctx_lock+0x12/0x27
> > [   36.092339]
> > [   36.092339] stack backtrace:
> > [   36.092341] CPU: 0 PID: 1515 Comm: perf Tainted: G            E
> > 6.1.0-rc5+ #81
> > [   36.092344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [   36.092349] Call Trace:
> > [   36.092351]  <NMI>
> > [   36.092354]  dump_stack_lvl+0x57/0x81
> > [   36.092359]  lock_acquire+0x1f4/0x29a
> > [   36.092363]  ? handle_pmi_common+0x13f/0x1f0
> > [   36.092366]  ? htab_lock_bucket+0x4d/0x58
> > [   36.092371]  _raw_spin_lock_irqsave+0x43/0x7f
> > [   36.092374]  ? htab_lock_bucket+0x4d/0x58
> > [   36.092377]  htab_lock_bucket+0x4d/0x58
> > [   36.092379]  htab_map_update_elem+0x11e/0x220
> > [   36.092386]  bpf_prog_f3a535ca81a8128a_bpf_prog2+0x3e/0x42
> > [   36.092392]  trace_call_bpf+0x177/0x215
> > [   36.092398]  perf_trace_run_bpf_submit+0x52/0xaa
> > [   36.092403]  ? x86_pmu_stop+0x97/0x97
> > [   36.092407]  perf_trace_nmi_handler+0xb7/0xe0
> > [   36.092415]  nmi_handle+0x116/0x254
> > [   36.092418]  ? x86_pmu_stop+0x97/0x97
> > [   36.092423]  default_do_nmi+0x3d/0xf6
> > [   36.092428]  exc_nmi+0xa1/0x109
> > [   36.092432]  end_repeat_nmi+0x16/0x67
> > [   36.092436] RIP: 0010:wrmsrl+0xd/0x1b
> 
> So the lock is really taken in a NMI context. In general, we advise again
> using lock in a NMI context unless it is a lock that is used only in that
> context. Otherwise, deadlock is certainly a possibility as there is no way
> to mask off again NMI.
> 

I think here they use a percpu counter as an "outer lock" to make the
accesses to the real lock exclusive:

	preempt_disable();
	a = __this_cpu_inc(->map_locked);
	if (a != 1) {
		__this_cpu_dec(->map_locked);
		preempt_enable();
		return -EBUSY;
	}
	preempt_enable();
		return -EBUSY;
	
	raw_spin_lock_irqsave(->raw_lock);

and lockdep is not aware that ->map_locked acts as a lock.

However, I feel this may be just a reinvented try_lock pattern, Hou Tao,
could you see if this can be refactored with a try_lock? Otherwise, you
may need to introduce a virtual lockclass for ->map_locked.

Regards,
Boqun

> Cheers,
> Longman
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29 17:23                           ` Boqun Feng
@ 2022-11-29 17:32                             ` Boqun Feng
  2022-11-29 19:36                               ` Hao Luo
  2022-11-30  1:37                             ` Hou Tao
  1 sibling, 1 reply; 33+ messages in thread
From: Boqun Feng @ 2022-11-29 17:32 UTC (permalink / raw)
  To: Waiman Long
  Cc: Hou Tao, Tonghao Zhang, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, Hao Luo, houtao1,
	LKML

On Tue, Nov 29, 2022 at 09:23:18AM -0800, Boqun Feng wrote:
> On Tue, Nov 29, 2022 at 11:06:51AM -0500, Waiman Long wrote:
> > On 11/29/22 07:45, Hou Tao wrote:
> > > Hi,
> > > 
> > > On 11/29/2022 2:06 PM, Tonghao Zhang wrote:
> > > > On Tue, Nov 29, 2022 at 12:32 PM Hou Tao <houtao1@huawei.com> wrote:
> > > > > Hi,
> > > > > 
> > > > > On 11/29/2022 5:55 AM, Hao Luo wrote:
> > > > > > On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > > > > Hi Tonghao,
> > > > > > 
> > > > > > With a quick look at the htab_lock_bucket() and your problem
> > > > > > statement, I agree with Hou Tao that using hash &
> > > > > > min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
> > > > > > to fix the potential deadlock. Can you actually send your changes as
> > > > > > v2 so we can take a look and better help you? Also, can you explain
> > > > > > your solution in your commit message? Right now, your commit message
> > > > > > has only a problem statement and is not very clear. Please include
> > > > > > more details on what you do to fix the issue.
> > > > > > 
> > > > > > Hao
> > > > > It would be better if the test case below can be rewritten as a bpf selftests.
> > > > > Please see comments below on how to improve it and reproduce the deadlock.
> > > > > > > Hi
> > > > > > > only a warning from lockdep.
> > > > > Thanks for your details instruction.  I can reproduce the warning by using your
> > > > > setup. I am not a lockdep expert, it seems that fixing such warning needs to set
> > > > > different lockdep class to the different bucket. Because we use map_locked to
> > > > > protect the acquisition of bucket lock, so I think we can define  lock_class_key
> > > > > array in bpf_htab (e.g., lockdep_key[HASHTAB_MAP_LOCK_COUNT]) and initialize the
> > > > > bucket lock accordingly.
> > > The proposed lockdep solution doesn't work. Still got lockdep warning after
> > > that, so cc +locking expert +lkml.org for lockdep help.
> > > 
> > > Hi lockdep experts,
> > > 
> > > We are trying to fix the following lockdep warning from bpf subsystem:
> > > 
> > > [   36.092222] ================================
> > > [   36.092230] WARNING: inconsistent lock state
> > > [   36.092234] 6.1.0-rc5+ #81 Tainted: G            E
> > > [   36.092236] --------------------------------
> > > [   36.092237] inconsistent {INITIAL USE} -> {IN-NMI} usage.
> > > [   36.092238] perf/1515 [HC1[1]:SC0[0]:HE0:SE1] takes:
> > > [   36.092242] ffff888341acd1a0 (&htab->lockdep_key){....}-{2:2}, at:
> > > htab_lock_bucket+0x4d/0x58
> > > [   36.092253] {INITIAL USE} state was registered at:
> > > [   36.092255]   mark_usage+0x1d/0x11d
> > > [   36.092262]   __lock_acquire+0x3c9/0x6ed
> > > [   36.092266]   lock_acquire+0x23d/0x29a
> > > [   36.092270]   _raw_spin_lock_irqsave+0x43/0x7f
> > > [   36.092274]   htab_lock_bucket+0x4d/0x58
> > > [   36.092276]   htab_map_delete_elem+0x82/0xfb
> > > [   36.092278]   map_delete_elem+0x156/0x1ac
> > > [   36.092282]   __sys_bpf+0x138/0xb71
> > > [   36.092285]   __do_sys_bpf+0xd/0x15
> > > [   36.092288]   do_syscall_64+0x6d/0x84
> > > [   36.092291]   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > [   36.092295] irq event stamp: 120346
> > > [   36.092296] hardirqs last  enabled at (120345): [<ffffffff8180b97f>]
> > > _raw_spin_unlock_irq+0x24/0x39
> > > [   36.092299] hardirqs last disabled at (120346): [<ffffffff81169e85>]
> > > generic_exec_single+0x40/0xb9
> > > [   36.092303] softirqs last  enabled at (120268): [<ffffffff81c00347>]
> > > __do_softirq+0x347/0x387
> > > [   36.092307] softirqs last disabled at (120133): [<ffffffff810ba4f0>]
> > > __irq_exit_rcu+0x67/0xc6
> > > [   36.092311]
> > > [   36.092311] other info that might help us debug this:
> > > [   36.092312]  Possible unsafe locking scenario:
> > > [   36.092312]
> > > [   36.092313]        CPU0
> > > [   36.092313]        ----
> > > [   36.092314]   lock(&htab->lockdep_key);
> > > [   36.092315]   <Interrupt>
> > > [   36.092316]     lock(&htab->lockdep_key);
> > > [   36.092318]
> > > [   36.092318]  *** DEADLOCK ***
> > > [   36.092318]
> > > [   36.092318] 3 locks held by perf/1515:
> > > [   36.092320]  #0: ffff8881b9805cc0 (&cpuctx_mutex){+.+.}-{4:4}, at:
> > > perf_event_ctx_lock_nested+0x8e/0xba
> > > [   36.092327]  #1: ffff8881075ecc20 (&event->child_mutex){+.+.}-{4:4}, at:
> > > perf_event_for_each_child+0x35/0x76
> > > [   36.092332]  #2: ffff8881b9805c20 (&cpuctx_lock){-.-.}-{2:2}, at:
> > > perf_ctx_lock+0x12/0x27
> > > [   36.092339]
> > > [   36.092339] stack backtrace:
> > > [   36.092341] CPU: 0 PID: 1515 Comm: perf Tainted: G            E
> > > 6.1.0-rc5+ #81
> > > [   36.092344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > > [   36.092349] Call Trace:
> > > [   36.092351]  <NMI>
> > > [   36.092354]  dump_stack_lvl+0x57/0x81
> > > [   36.092359]  lock_acquire+0x1f4/0x29a
> > > [   36.092363]  ? handle_pmi_common+0x13f/0x1f0
> > > [   36.092366]  ? htab_lock_bucket+0x4d/0x58
> > > [   36.092371]  _raw_spin_lock_irqsave+0x43/0x7f
> > > [   36.092374]  ? htab_lock_bucket+0x4d/0x58
> > > [   36.092377]  htab_lock_bucket+0x4d/0x58
> > > [   36.092379]  htab_map_update_elem+0x11e/0x220
> > > [   36.092386]  bpf_prog_f3a535ca81a8128a_bpf_prog2+0x3e/0x42
> > > [   36.092392]  trace_call_bpf+0x177/0x215
> > > [   36.092398]  perf_trace_run_bpf_submit+0x52/0xaa
> > > [   36.092403]  ? x86_pmu_stop+0x97/0x97
> > > [   36.092407]  perf_trace_nmi_handler+0xb7/0xe0
> > > [   36.092415]  nmi_handle+0x116/0x254
> > > [   36.092418]  ? x86_pmu_stop+0x97/0x97
> > > [   36.092423]  default_do_nmi+0x3d/0xf6
> > > [   36.092428]  exc_nmi+0xa1/0x109
> > > [   36.092432]  end_repeat_nmi+0x16/0x67
> > > [   36.092436] RIP: 0010:wrmsrl+0xd/0x1b
> > 
> > So the lock is really taken in a NMI context. In general, we advise again
> > using lock in a NMI context unless it is a lock that is used only in that
> > context. Otherwise, deadlock is certainly a possibility as there is no way
> > to mask off again NMI.
> > 
> 
> I think here they use a percpu counter as an "outer lock" to make the
> accesses to the real lock exclusive:
> 
> 	preempt_disable();
> 	a = __this_cpu_inc(->map_locked);
> 	if (a != 1) {
> 		__this_cpu_dec(->map_locked);
> 		preempt_enable();
> 		return -EBUSY;
> 	}
> 	preempt_enable();
> 		return -EBUSY;
> 	
> 	raw_spin_lock_irqsave(->raw_lock);
> 
> and lockdep is not aware that ->map_locked acts as a lock.
> 
> However, I feel this may be just a reinvented try_lock pattern, Hou Tao,
> could you see if this can be refactored with a try_lock? Otherwise, you

Just to be clear, I meant to refactor htab_lock_bucket() into a try
lock pattern. Also after a second thought, the below suggestion doesn't
work. I think the proper way is to make htab_lock_bucket() as a
raw_spin_trylock_irqsave().

Regards,
Boqun

> may need to introduce a virtual lockclass for ->map_locked.
> 
> Regards,
> Boqun
> 
> > Cheers,
> > Longman
> > 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29 17:32                             ` Boqun Feng
@ 2022-11-29 19:36                               ` Hao Luo
  2022-11-29 21:13                                 ` Waiman Long
  2022-11-30  1:50                                 ` Hou Tao
  0 siblings, 2 replies; 33+ messages in thread
From: Hao Luo @ 2022-11-29 19:36 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Waiman Long, Hou Tao, Tonghao Zhang, Peter Zijlstra, Ingo Molnar,
	Will Deacon, netdev, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Jiri Olsa, bpf,
	houtao1, LKML

On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> Just to be clear, I meant to refactor htab_lock_bucket() into a try
> lock pattern. Also after a second thought, the below suggestion doesn't
> work. I think the proper way is to make htab_lock_bucket() as a
> raw_spin_trylock_irqsave().
>
> Regards,
> Boqun
>

The potential deadlock happens when the lock is contended from the
same cpu. When the lock is contended from a remote cpu, we would like
the remote cpu to spin and wait, instead of giving up immediately. As
this gives better throughput. So replacing the current
raw_spin_lock_irqsave() with trylock sacrifices this performance gain.

I suspect the source of the problem is the 'hash' that we used in
htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
whether we should use a hash derived from 'bucket' rather than from
'key'. For example, from the memory address of the 'bucket'. Because,
different keys may fall into the same bucket, but yield different
hashes. If the same bucket can never have two different 'hashes' here,
the map_locked check should behave as intended. Also because
->map_locked is per-cpu, execution flows from two different cpus can
both pass.

Hao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29 19:36                               ` Hao Luo
@ 2022-11-29 21:13                                 ` Waiman Long
  2022-11-30  1:50                                 ` Hou Tao
  1 sibling, 0 replies; 33+ messages in thread
From: Waiman Long @ 2022-11-29 21:13 UTC (permalink / raw)
  To: Hao Luo, Boqun Feng
  Cc: Hou Tao, Tonghao Zhang, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, houtao1, LKML


On 11/29/22 14:36, Hao Luo wrote:
> On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>> Just to be clear, I meant to refactor htab_lock_bucket() into a try
>> lock pattern. Also after a second thought, the below suggestion doesn't
>> work. I think the proper way is to make htab_lock_bucket() as a
>> raw_spin_trylock_irqsave().
>>
>> Regards,
>> Boqun
>>
> The potential deadlock happens when the lock is contended from the
> same cpu. When the lock is contended from a remote cpu, we would like
> the remote cpu to spin and wait, instead of giving up immediately. As
> this gives better throughput. So replacing the current
> raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
>
> I suspect the source of the problem is the 'hash' that we used in
> htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
> whether we should use a hash derived from 'bucket' rather than from
> 'key'. For example, from the memory address of the 'bucket'. Because,
> different keys may fall into the same bucket, but yield different
> hashes. If the same bucket can never have two different 'hashes' here,
> the map_locked check should behave as intended. Also because
> ->map_locked is per-cpu, execution flows from two different cpus can
> both pass.

I would suggest that you add a in_nmi() check and if true use trylock to 
get the lock. You can continue to use raw_spin_lock_irqsave() in all 
other cases.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29 17:23                           ` Boqun Feng
  2022-11-29 17:32                             ` Boqun Feng
@ 2022-11-30  1:37                             ` Hou Tao
  1 sibling, 0 replies; 33+ messages in thread
From: Hou Tao @ 2022-11-30  1:37 UTC (permalink / raw)
  To: Boqun Feng, Waiman Long
  Cc: Tonghao Zhang, Peter Zijlstra, Ingo Molnar, Will Deacon, netdev,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, Hao Luo, houtao1,
	LKML

Hi,

On 11/30/2022 1:23 AM, Boqun Feng wrote:
> On Tue, Nov 29, 2022 at 11:06:51AM -0500, Waiman Long wrote:
>> On 11/29/22 07:45, Hou Tao wrote:
>>> Hi,
>>>
>>> On 11/29/2022 2:06 PM, Tonghao Zhang wrote:
>>>> On Tue, Nov 29, 2022 at 12:32 PM Hou Tao <houtao1@huawei.com> wrote:
>>>>> Hi,
>>>>>
>>>>> On 11/29/2022 5:55 AM, Hao Luo wrote:
>>>>>> On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>>>>>> Hi Tonghao,
>>>>>>
>>>>>> With a quick look at the htab_lock_bucket() and your problem
>>>>>> statement, I agree with Hou Tao that using hash &
>>>>>> min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
>>>>>> to fix the potential deadlock. Can you actually send your changes as
>>>>>> v2 so we can take a look and better help you? Also, can you explain
>>>>>> your solution in your commit message? Right now, your commit message
>>>>>> has only a problem statement and is not very clear. Please include
>>>>>> more details on what you do to fix the issue.
>>>>>>
>>>>>> Hao
>>>>> It would be better if the test case below can be rewritten as a bpf selftests.
>>>>> Please see comments below on how to improve it and reproduce the deadlock.
>>>>>>> Hi
>>>>>>> only a warning from lockdep.
>>>>> Thanks for your details instruction.  I can reproduce the warning by using your
>>>>> setup. I am not a lockdep expert, it seems that fixing such warning needs to set
>>>>> different lockdep class to the different bucket. Because we use map_locked to
>>>>> protect the acquisition of bucket lock, so I think we can define  lock_class_key
>>>>> array in bpf_htab (e.g., lockdep_key[HASHTAB_MAP_LOCK_COUNT]) and initialize the
>>>>> bucket lock accordingly.
>>> The proposed lockdep solution doesn't work. Still got lockdep warning after
>>> that, so cc +locking expert +lkml.org for lockdep help.
>>>
>>> Hi lockdep experts,
>>>
>>> We are trying to fix the following lockdep warning from bpf subsystem:
>>>
>>> [   36.092222] ================================
>>> [   36.092230] WARNING: inconsistent lock state
>>> [   36.092234] 6.1.0-rc5+ #81 Tainted: G            E
>>> [   36.092236] --------------------------------
>>> [   36.092237] inconsistent {INITIAL USE} -> {IN-NMI} usage.
>>> [   36.092238] perf/1515 [HC1[1]:SC0[0]:HE0:SE1] takes:
>>> [   36.092242] ffff888341acd1a0 (&htab->lockdep_key){....}-{2:2}, at:
>>> htab_lock_bucket+0x4d/0x58
>>> [   36.092253] {INITIAL USE} state was registered at:
>>> [   36.092255]   mark_usage+0x1d/0x11d
>>> [   36.092262]   __lock_acquire+0x3c9/0x6ed
>>> [   36.092266]   lock_acquire+0x23d/0x29a
>>> [   36.092270]   _raw_spin_lock_irqsave+0x43/0x7f
>>> [   36.092274]   htab_lock_bucket+0x4d/0x58
>>> [   36.092276]   htab_map_delete_elem+0x82/0xfb
>>> [   36.092278]   map_delete_elem+0x156/0x1ac
>>> [   36.092282]   __sys_bpf+0x138/0xb71
>>> [   36.092285]   __do_sys_bpf+0xd/0x15
>>> [   36.092288]   do_syscall_64+0x6d/0x84
>>> [   36.092291]   entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>> [   36.092295] irq event stamp: 120346
>>> [   36.092296] hardirqs last  enabled at (120345): [<ffffffff8180b97f>]
>>> _raw_spin_unlock_irq+0x24/0x39
>>> [   36.092299] hardirqs last disabled at (120346): [<ffffffff81169e85>]
>>> generic_exec_single+0x40/0xb9
>>> [   36.092303] softirqs last  enabled at (120268): [<ffffffff81c00347>]
>>> __do_softirq+0x347/0x387
>>> [   36.092307] softirqs last disabled at (120133): [<ffffffff810ba4f0>]
>>> __irq_exit_rcu+0x67/0xc6
>>> [   36.092311]
>>> [   36.092311] other info that might help us debug this:
>>> [   36.092312]  Possible unsafe locking scenario:
>>> [   36.092312]
>>> [   36.092313]        CPU0
>>> [   36.092313]        ----
>>> [   36.092314]   lock(&htab->lockdep_key);
>>> [   36.092315]   <Interrupt>
>>> [   36.092316]     lock(&htab->lockdep_key);
>>> [   36.092318]
>>> [   36.092318]  *** DEADLOCK ***
>>> [   36.092318]
>>> [   36.092318] 3 locks held by perf/1515:
>>> [   36.092320]  #0: ffff8881b9805cc0 (&cpuctx_mutex){+.+.}-{4:4}, at:
>>> perf_event_ctx_lock_nested+0x8e/0xba
>>> [   36.092327]  #1: ffff8881075ecc20 (&event->child_mutex){+.+.}-{4:4}, at:
>>> perf_event_for_each_child+0x35/0x76
>>> [   36.092332]  #2: ffff8881b9805c20 (&cpuctx_lock){-.-.}-{2:2}, at:
>>> perf_ctx_lock+0x12/0x27
>>> [   36.092339]
>>> [   36.092339] stack backtrace:
>>> [   36.092341] CPU: 0 PID: 1515 Comm: perf Tainted: G            E
>>> 6.1.0-rc5+ #81
>>> [   36.092344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>> rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
>>> [   36.092349] Call Trace:
>>> [   36.092351]  <NMI>
>>> [   36.092354]  dump_stack_lvl+0x57/0x81
>>> [   36.092359]  lock_acquire+0x1f4/0x29a
>>> [   36.092363]  ? handle_pmi_common+0x13f/0x1f0
>>> [   36.092366]  ? htab_lock_bucket+0x4d/0x58
>>> [   36.092371]  _raw_spin_lock_irqsave+0x43/0x7f
>>> [   36.092374]  ? htab_lock_bucket+0x4d/0x58
>>> [   36.092377]  htab_lock_bucket+0x4d/0x58
>>> [   36.092379]  htab_map_update_elem+0x11e/0x220
>>> [   36.092386]  bpf_prog_f3a535ca81a8128a_bpf_prog2+0x3e/0x42
>>> [   36.092392]  trace_call_bpf+0x177/0x215
>>> [   36.092398]  perf_trace_run_bpf_submit+0x52/0xaa
>>> [   36.092403]  ? x86_pmu_stop+0x97/0x97
>>> [   36.092407]  perf_trace_nmi_handler+0xb7/0xe0
>>> [   36.092415]  nmi_handle+0x116/0x254
>>> [   36.092418]  ? x86_pmu_stop+0x97/0x97
>>> [   36.092423]  default_do_nmi+0x3d/0xf6
>>> [   36.092428]  exc_nmi+0xa1/0x109
>>> [   36.092432]  end_repeat_nmi+0x16/0x67
>>> [   36.092436] RIP: 0010:wrmsrl+0xd/0x1b
>> So the lock is really taken in a NMI context. In general, we advise again
>> using lock in a NMI context unless it is a lock that is used only in that
>> context. Otherwise, deadlock is certainly a possibility as there is no way
>> to mask off again NMI.
>>
> I think here they use a percpu counter as an "outer lock" to make the
> accesses to the real lock exclusive:
>
> 	preempt_disable();
> 	a = __this_cpu_inc(->map_locked);
> 	if (a != 1) {
> 		__this_cpu_dec(->map_locked);
> 		preempt_enable();
> 		return -EBUSY;
> 	}
> 	preempt_enable();
> 		return -EBUSY;
> 	
> 	raw_spin_lock_irqsave(->raw_lock);
>
> and lockdep is not aware that ->map_locked acts as a lock.
>
> However, I feel this may be just a reinvented try_lock pattern, Hou Tao,
> could you see if this can be refactored with a try_lock? Otherwise, you
> may need to introduce a virtual lockclass for ->map_locked.
As said by Hao Luo, the problem of using trylock in nmi context is that it can
not distinguish between dead-lock and lock with high-contention. And map_locked
is still needed even trylock is used in NMI because htab_map_update_elem() may
be reentered in a normal context through attaching a bpf program to one function
called after taken the lock. So introducing a virtual lockclass for ->map_locked
is a better idea.

Thanks,
Tao
> Regards,
> Boqun
>
>> Cheers,
>> Longman
>>
> .


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-29 19:36                               ` Hao Luo
  2022-11-29 21:13                                 ` Waiman Long
@ 2022-11-30  1:50                                 ` Hou Tao
  2022-11-30  2:47                                   ` Tonghao Zhang
  1 sibling, 1 reply; 33+ messages in thread
From: Hou Tao @ 2022-11-30  1:50 UTC (permalink / raw)
  To: Hao Luo
  Cc: Waiman Long, Tonghao Zhang, Peter Zijlstra, Ingo Molnar,
	Will Deacon, netdev, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Jiri Olsa, bpf,
	houtao1, LKML, Boqun Feng

Hi Hao,

On 11/30/2022 3:36 AM, Hao Luo wrote:
> On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>> Just to be clear, I meant to refactor htab_lock_bucket() into a try
>> lock pattern. Also after a second thought, the below suggestion doesn't
>> work. I think the proper way is to make htab_lock_bucket() as a
>> raw_spin_trylock_irqsave().
>>
>> Regards,
>> Boqun
>>
> The potential deadlock happens when the lock is contended from the
> same cpu. When the lock is contended from a remote cpu, we would like
> the remote cpu to spin and wait, instead of giving up immediately. As
> this gives better throughput. So replacing the current
> raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
>
> I suspect the source of the problem is the 'hash' that we used in
> htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
> whether we should use a hash derived from 'bucket' rather than from
> 'key'. For example, from the memory address of the 'bucket'. Because,
> different keys may fall into the same bucket, but yield different
> hashes. If the same bucket can never have two different 'hashes' here,
> the map_locked check should behave as intended. Also because
> ->map_locked is per-cpu, execution flows from two different cpus can
> both pass.
The warning from lockdep is due to the reason the bucket lock A is used in a
no-NMI context firstly, then the same bucke lock is used a NMI context, so
lockdep deduces that may be a dead-lock. I have already tried to use the same
map_locked for keys with the same bucket, the dead-lock is gone, but still got
lockdep warning.
>
> Hao
> .


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-30  1:50                                 ` Hou Tao
@ 2022-11-30  2:47                                   ` Tonghao Zhang
  2022-11-30  3:06                                     ` Waiman Long
  2022-11-30  4:13                                     ` Hou Tao
  0 siblings, 2 replies; 33+ messages in thread
From: Tonghao Zhang @ 2022-11-30  2:47 UTC (permalink / raw)
  To: Hou Tao
  Cc: Hao Luo, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, houtao1, LKML,
	Boqun Feng

On Wed, Nov 30, 2022 at 9:50 AM Hou Tao <houtao@huaweicloud.com> wrote:
>
> Hi Hao,
>
> On 11/30/2022 3:36 AM, Hao Luo wrote:
> > On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> >> Just to be clear, I meant to refactor htab_lock_bucket() into a try
> >> lock pattern. Also after a second thought, the below suggestion doesn't
> >> work. I think the proper way is to make htab_lock_bucket() as a
> >> raw_spin_trylock_irqsave().
> >>
> >> Regards,
> >> Boqun
> >>
> > The potential deadlock happens when the lock is contended from the
> > same cpu. When the lock is contended from a remote cpu, we would like
> > the remote cpu to spin and wait, instead of giving up immediately. As
> > this gives better throughput. So replacing the current
> > raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
> >
> > I suspect the source of the problem is the 'hash' that we used in
> > htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
> > whether we should use a hash derived from 'bucket' rather than from
> > 'key'. For example, from the memory address of the 'bucket'. Because,
> > different keys may fall into the same bucket, but yield different
> > hashes. If the same bucket can never have two different 'hashes' here,
> > the map_locked check should behave as intended. Also because
> > ->map_locked is per-cpu, execution flows from two different cpus can
> > both pass.
> The warning from lockdep is due to the reason the bucket lock A is used in a
> no-NMI context firstly, then the same bucke lock is used a NMI context, so
Yes, I tested lockdep too, we can't use the lock in NMI(but only
try_lock work fine) context if we use them no-NMI context. otherwise
the lockdep prints the warning.
* for the dead-lock case: we can use the
1. hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1)
2. or hash bucket address.

* for lockdep warning, we should use in_nmi check with map_locked.

BTW, the patch doesn't work, so we can remove the lock_key
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c50eb518e262fa06bd334e6eec172eaf5d7a5bd9

static inline int htab_lock_bucket(const struct bpf_htab *htab,
                                   struct bucket *b, u32 hash,
                                   unsigned long *pflags)
{
        unsigned long flags;

        hash = hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1);

        preempt_disable();
        if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
                __this_cpu_dec(*(htab->map_locked[hash]));
                preempt_enable();
                return -EBUSY;
        }

        if (in_nmi()) {
                if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
                        return -EBUSY;
        } else {
                raw_spin_lock_irqsave(&b->raw_lock, flags);
        }

        *pflags = flags;
        return 0;
}


> lockdep deduces that may be a dead-lock. I have already tried to use the same
> map_locked for keys with the same bucket, the dead-lock is gone, but still got
> lockdep warning.
> >
> > Hao
> > .
>


-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-30  2:47                                   ` Tonghao Zhang
@ 2022-11-30  3:06                                     ` Waiman Long
  2022-11-30  3:32                                       ` Tonghao Zhang
  2022-11-30  4:13                                     ` Hou Tao
  1 sibling, 1 reply; 33+ messages in thread
From: Waiman Long @ 2022-11-30  3:06 UTC (permalink / raw)
  To: Tonghao Zhang, Hou Tao
  Cc: Hao Luo, Peter Zijlstra, Ingo Molnar, Will Deacon, netdev,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, houtao1, LKML,
	Boqun Feng

On 11/29/22 21:47, Tonghao Zhang wrote:
> On Wed, Nov 30, 2022 at 9:50 AM Hou Tao <houtao@huaweicloud.com> wrote:
>> Hi Hao,
>>
>> On 11/30/2022 3:36 AM, Hao Luo wrote:
>>> On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>>>> Just to be clear, I meant to refactor htab_lock_bucket() into a try
>>>> lock pattern. Also after a second thought, the below suggestion doesn't
>>>> work. I think the proper way is to make htab_lock_bucket() as a
>>>> raw_spin_trylock_irqsave().
>>>>
>>>> Regards,
>>>> Boqun
>>>>
>>> The potential deadlock happens when the lock is contended from the
>>> same cpu. When the lock is contended from a remote cpu, we would like
>>> the remote cpu to spin and wait, instead of giving up immediately. As
>>> this gives better throughput. So replacing the current
>>> raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
>>>
>>> I suspect the source of the problem is the 'hash' that we used in
>>> htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
>>> whether we should use a hash derived from 'bucket' rather than from
>>> 'key'. For example, from the memory address of the 'bucket'. Because,
>>> different keys may fall into the same bucket, but yield different
>>> hashes. If the same bucket can never have two different 'hashes' here,
>>> the map_locked check should behave as intended. Also because
>>> ->map_locked is per-cpu, execution flows from two different cpus can
>>> both pass.
>> The warning from lockdep is due to the reason the bucket lock A is used in a
>> no-NMI context firstly, then the same bucke lock is used a NMI context, so
> Yes, I tested lockdep too, we can't use the lock in NMI(but only
> try_lock work fine) context if we use them no-NMI context. otherwise
> the lockdep prints the warning.
> * for the dead-lock case: we can use the
> 1. hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1)
> 2. or hash bucket address.
>
> * for lockdep warning, we should use in_nmi check with map_locked.
>
> BTW, the patch doesn't work, so we can remove the lock_key
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c50eb518e262fa06bd334e6eec172eaf5d7a5bd9
>
> static inline int htab_lock_bucket(const struct bpf_htab *htab,
>                                     struct bucket *b, u32 hash,
>                                     unsigned long *pflags)
> {
>          unsigned long flags;
>
>          hash = hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1);
>
>          preempt_disable();
>          if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>                  __this_cpu_dec(*(htab->map_locked[hash]));
>                  preempt_enable();
>                  return -EBUSY;
>          }
>
>          if (in_nmi()) {
>                  if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
>                          return -EBUSY;
That is not right. You have to do the same step as above by decrementing 
the percpu count and enable preemption. So you may want to put all these 
busy_out steps after the return 0 and use "goto busy_out;" to jump there.
>          } else {
>                  raw_spin_lock_irqsave(&b->raw_lock, flags);
>          }
>
>          *pflags = flags;
>          return 0;
> }

BTW, with that change, I believe you can actually remove all the percpu 
map_locked count code.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-30  3:06                                     ` Waiman Long
@ 2022-11-30  3:32                                       ` Tonghao Zhang
  2022-11-30  4:07                                         ` Waiman Long
  0 siblings, 1 reply; 33+ messages in thread
From: Tonghao Zhang @ 2022-11-30  3:32 UTC (permalink / raw)
  To: Waiman Long, Hou Tao
  Cc: Hou Tao, Hao Luo, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, LKML, Boqun Feng

On Wed, Nov 30, 2022 at 11:07 AM Waiman Long <longman@redhat.com> wrote:
>
> On 11/29/22 21:47, Tonghao Zhang wrote:
> > On Wed, Nov 30, 2022 at 9:50 AM Hou Tao <houtao@huaweicloud.com> wrote:
> >> Hi Hao,
> >>
> >> On 11/30/2022 3:36 AM, Hao Luo wrote:
> >>> On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> >>>> Just to be clear, I meant to refactor htab_lock_bucket() into a try
> >>>> lock pattern. Also after a second thought, the below suggestion doesn't
> >>>> work. I think the proper way is to make htab_lock_bucket() as a
> >>>> raw_spin_trylock_irqsave().
> >>>>
> >>>> Regards,
> >>>> Boqun
> >>>>
> >>> The potential deadlock happens when the lock is contended from the
> >>> same cpu. When the lock is contended from a remote cpu, we would like
> >>> the remote cpu to spin and wait, instead of giving up immediately. As
> >>> this gives better throughput. So replacing the current
> >>> raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
> >>>
> >>> I suspect the source of the problem is the 'hash' that we used in
> >>> htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
> >>> whether we should use a hash derived from 'bucket' rather than from
> >>> 'key'. For example, from the memory address of the 'bucket'. Because,
> >>> different keys may fall into the same bucket, but yield different
> >>> hashes. If the same bucket can never have two different 'hashes' here,
> >>> the map_locked check should behave as intended. Also because
> >>> ->map_locked is per-cpu, execution flows from two different cpus can
> >>> both pass.
> >> The warning from lockdep is due to the reason the bucket lock A is used in a
> >> no-NMI context firstly, then the same bucke lock is used a NMI context, so
> > Yes, I tested lockdep too, we can't use the lock in NMI(but only
> > try_lock work fine) context if we use them no-NMI context. otherwise
> > the lockdep prints the warning.
> > * for the dead-lock case: we can use the
> > 1. hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1)
> > 2. or hash bucket address.
> >
> > * for lockdep warning, we should use in_nmi check with map_locked.
> >
> > BTW, the patch doesn't work, so we can remove the lock_key
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c50eb518e262fa06bd334e6eec172eaf5d7a5bd9
> >
> > static inline int htab_lock_bucket(const struct bpf_htab *htab,
> >                                     struct bucket *b, u32 hash,
> >                                     unsigned long *pflags)
> > {
> >          unsigned long flags;
> >
> >          hash = hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1);
> >
> >          preempt_disable();
> >          if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
> >                  __this_cpu_dec(*(htab->map_locked[hash]));
> >                  preempt_enable();
> >                  return -EBUSY;
> >          }
> >
> >          if (in_nmi()) {
> >                  if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
> >                          return -EBUSY;
> That is not right. You have to do the same step as above by decrementing
> the percpu count and enable preemption. So you may want to put all these
> busy_out steps after the return 0 and use "goto busy_out;" to jump there.
Yes, thanks Waiman, I should add the busy_out label.
> >          } else {
> >                  raw_spin_lock_irqsave(&b->raw_lock, flags);
> >          }
> >
> >          *pflags = flags;
> >          return 0;
> > }
>
> BTW, with that change, I believe you can actually remove all the percpu
> map_locked count code.
there are some case, for example, we run the bpf_prog A B in task
context on the same cpu.
bpf_prog A
update map X
    htab_lock_bucket
        raw_spin_lock_irqsave()
    lookup_elem_raw()
        // bpf prog B is attached on lookup_elem_raw()
        bpf prog B
            update map X again and update the element
                htab_lock_bucket()
                    // dead-lock
                    raw_spinlock_irqsave()
> Cheers,
> Longman
>


-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-30  3:32                                       ` Tonghao Zhang
@ 2022-11-30  4:07                                         ` Waiman Long
  0 siblings, 0 replies; 33+ messages in thread
From: Waiman Long @ 2022-11-30  4:07 UTC (permalink / raw)
  To: Tonghao Zhang, Hou Tao
  Cc: Hou Tao, Hao Luo, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, LKML, Boqun Feng

On 11/29/22 22:32, Tonghao Zhang wrote:
> On Wed, Nov 30, 2022 at 11:07 AM Waiman Long <longman@redhat.com> wrote:
>> On 11/29/22 21:47, Tonghao Zhang wrote:
>>> On Wed, Nov 30, 2022 at 9:50 AM Hou Tao <houtao@huaweicloud.com> wrote:
>>>> Hi Hao,
>>>>
>>>> On 11/30/2022 3:36 AM, Hao Luo wrote:
>>>>> On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>>>>>> Just to be clear, I meant to refactor htab_lock_bucket() into a try
>>>>>> lock pattern. Also after a second thought, the below suggestion doesn't
>>>>>> work. I think the proper way is to make htab_lock_bucket() as a
>>>>>> raw_spin_trylock_irqsave().
>>>>>>
>>>>>> Regards,
>>>>>> Boqun
>>>>>>
>>>>> The potential deadlock happens when the lock is contended from the
>>>>> same cpu. When the lock is contended from a remote cpu, we would like
>>>>> the remote cpu to spin and wait, instead of giving up immediately. As
>>>>> this gives better throughput. So replacing the current
>>>>> raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
>>>>>
>>>>> I suspect the source of the problem is the 'hash' that we used in
>>>>> htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
>>>>> whether we should use a hash derived from 'bucket' rather than from
>>>>> 'key'. For example, from the memory address of the 'bucket'. Because,
>>>>> different keys may fall into the same bucket, but yield different
>>>>> hashes. If the same bucket can never have two different 'hashes' here,
>>>>> the map_locked check should behave as intended. Also because
>>>>> ->map_locked is per-cpu, execution flows from two different cpus can
>>>>> both pass.
>>>> The warning from lockdep is due to the reason the bucket lock A is used in a
>>>> no-NMI context firstly, then the same bucke lock is used a NMI context, so
>>> Yes, I tested lockdep too, we can't use the lock in NMI(but only
>>> try_lock work fine) context if we use them no-NMI context. otherwise
>>> the lockdep prints the warning.
>>> * for the dead-lock case: we can use the
>>> 1. hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1)
>>> 2. or hash bucket address.
>>>
>>> * for lockdep warning, we should use in_nmi check with map_locked.
>>>
>>> BTW, the patch doesn't work, so we can remove the lock_key
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c50eb518e262fa06bd334e6eec172eaf5d7a5bd9
>>>
>>> static inline int htab_lock_bucket(const struct bpf_htab *htab,
>>>                                      struct bucket *b, u32 hash,
>>>                                      unsigned long *pflags)
>>> {
>>>           unsigned long flags;
>>>
>>>           hash = hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1);
>>>
>>>           preempt_disable();
>>>           if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>>>                   __this_cpu_dec(*(htab->map_locked[hash]));
>>>                   preempt_enable();
>>>                   return -EBUSY;
>>>           }
>>>
>>>           if (in_nmi()) {
>>>                   if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
>>>                           return -EBUSY;
>> That is not right. You have to do the same step as above by decrementing
>> the percpu count and enable preemption. So you may want to put all these
>> busy_out steps after the return 0 and use "goto busy_out;" to jump there.
> Yes, thanks Waiman, I should add the busy_out label.
>>>           } else {
>>>                   raw_spin_lock_irqsave(&b->raw_lock, flags);
>>>           }
>>>
>>>           *pflags = flags;
>>>           return 0;
>>> }
>> BTW, with that change, I believe you can actually remove all the percpu
>> map_locked count code.
> there are some case, for example, we run the bpf_prog A B in task
> context on the same cpu.
> bpf_prog A
> update map X
>      htab_lock_bucket
>          raw_spin_lock_irqsave()
>      lookup_elem_raw()
>          // bpf prog B is attached on lookup_elem_raw()
>          bpf prog B
>              update map X again and update the element
>                  htab_lock_bucket()
>                      // dead-lock
>                      raw_spinlock_irqsave()

I see, so nested locking is possible in this case. Beside using the 
percpu map_lock, another way is to have cpumask associated with each 
bucket lock and use each bit in the cpumask for to control access using 
test_and_set_bit() for each cpu. That will allow more concurrency and 
you can actually find out how contended is the lock. Anyway, it is just 
a thought.

Cheers,
Longman



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-30  2:47                                   ` Tonghao Zhang
  2022-11-30  3:06                                     ` Waiman Long
@ 2022-11-30  4:13                                     ` Hou Tao
  2022-11-30  5:02                                       ` Hao Luo
  2022-11-30  5:55                                       ` Tonghao Zhang
  1 sibling, 2 replies; 33+ messages in thread
From: Hou Tao @ 2022-11-30  4:13 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Hao Luo, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, houtao1, LKML,
	Boqun Feng

Hi,

On 11/30/2022 10:47 AM, Tonghao Zhang wrote:
> On Wed, Nov 30, 2022 at 9:50 AM Hou Tao <houtao@huaweicloud.com> wrote:
>> Hi Hao,
>>
>> On 11/30/2022 3:36 AM, Hao Luo wrote:
>>> On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>>>> Just to be clear, I meant to refactor htab_lock_bucket() into a try
>>>> lock pattern. Also after a second thought, the below suggestion doesn't
>>>> work. I think the proper way is to make htab_lock_bucket() as a
>>>> raw_spin_trylock_irqsave().
>>>>
>>>> Regards,
>>>> Boqun
>>>>
>>> The potential deadlock happens when the lock is contended from the
>>> same cpu. When the lock is contended from a remote cpu, we would like
>>> the remote cpu to spin and wait, instead of giving up immediately. As
>>> this gives better throughput. So replacing the current
>>> raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
>>>
>>> I suspect the source of the problem is the 'hash' that we used in
>>> htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
>>> whether we should use a hash derived from 'bucket' rather than from
>>> 'key'. For example, from the memory address of the 'bucket'. Because,
>>> different keys may fall into the same bucket, but yield different
>>> hashes. If the same bucket can never have two different 'hashes' here,
>>> the map_locked check should behave as intended. Also because
>>> ->map_locked is per-cpu, execution flows from two different cpus can
>>> both pass.
>> The warning from lockdep is due to the reason the bucket lock A is used in a
>> no-NMI context firstly, then the same bucke lock is used a NMI context, so
> Yes, I tested lockdep too, we can't use the lock in NMI(but only
> try_lock work fine) context if we use them no-NMI context. otherwise
> the lockdep prints the warning.
> * for the dead-lock case: we can use the
> 1. hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1)
> 2. or hash bucket address.
Use the computed hash will be better than hash bucket address, because the hash
buckets are allocated sequentially.
>
> * for lockdep warning, we should use in_nmi check with map_locked.
>
> BTW, the patch doesn't work, so we can remove the lock_key
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c50eb518e262fa06bd334e6eec172eaf5d7a5bd9
>
> static inline int htab_lock_bucket(const struct bpf_htab *htab,
>                                    struct bucket *b, u32 hash,
>                                    unsigned long *pflags)
> {
>         unsigned long flags;
>
>         hash = hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1);
>
>         preempt_disable();
>         if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>                 __this_cpu_dec(*(htab->map_locked[hash]));
>                 preempt_enable();
>                 return -EBUSY;
>         }
>
>         if (in_nmi()) {
>                 if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
>                         return -EBUSY;
The only purpose of trylock here is to make lockdep happy and it may lead to
unnecessary -EBUSY error for htab operations in NMI context. I still prefer add
a virtual lock-class for map_locked to fix the lockdep warning. So could you use
separated patches to fix the potential dead-lock and the lockdep warning ? It
will be better you can also add a bpf selftests for deadlock problem as said before.

Thanks,
Tao
>         } else {
>                 raw_spin_lock_irqsave(&b->raw_lock, flags);
>         }
>
>         *pflags = flags;
>         return 0;
> }
>
>
>> lockdep deduces that may be a dead-lock. I have already tried to use the same
>> map_locked for keys with the same bucket, the dead-lock is gone, but still got
>> lockdep warning.
>>> Hao
>>> .
>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-30  4:13                                     ` Hou Tao
@ 2022-11-30  5:02                                       ` Hao Luo
  2022-11-30  5:56                                         ` Tonghao Zhang
  2022-11-30  5:55                                       ` Tonghao Zhang
  1 sibling, 1 reply; 33+ messages in thread
From: Hao Luo @ 2022-11-30  5:02 UTC (permalink / raw)
  To: Hou Tao
  Cc: Tonghao Zhang, Waiman Long, Peter Zijlstra, Ingo Molnar,
	Will Deacon, netdev, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Jiri Olsa, bpf,
	houtao1, LKML, Boqun Feng

On Tue, Nov 29, 2022 at 8:13 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> On 11/30/2022 10:47 AM, Tonghao Zhang wrote:
<...>
> >         if (in_nmi()) {
> >                 if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
> >                         return -EBUSY;
>
> The only purpose of trylock here is to make lockdep happy and it may lead to
> unnecessary -EBUSY error for htab operations in NMI context. I still prefer add
> a virtual lock-class for map_locked to fix the lockdep warning. So could you use
> separated patches to fix the potential dead-lock and the lockdep warning ? It
> will be better you can also add a bpf selftests for deadlock problem as said before.
>

Agree with Tao here. Tonghao, could you send another version which:

- separates the fix to deadlock and the fix to lockdep warning
- includes a bpf selftest to verify the fix to deadlock
- with bpf-specific tag: [PATCH bpf-next]

There are multiple ideas on the fly in this thread, it's easy to lose
track of what has been proposed and what change you intend to make.

Thanks,
Hao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-30  4:13                                     ` Hou Tao
  2022-11-30  5:02                                       ` Hao Luo
@ 2022-11-30  5:55                                       ` Tonghao Zhang
  2022-12-01  2:53                                         ` Hou Tao
  1 sibling, 1 reply; 33+ messages in thread
From: Tonghao Zhang @ 2022-11-30  5:55 UTC (permalink / raw)
  To: Hou Tao
  Cc: Hao Luo, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, houtao1, LKML,
	Boqun Feng

On Wed, Nov 30, 2022 at 12:13 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> Hi,
>
> On 11/30/2022 10:47 AM, Tonghao Zhang wrote:
> > On Wed, Nov 30, 2022 at 9:50 AM Hou Tao <houtao@huaweicloud.com> wrote:
> >> Hi Hao,
> >>
> >> On 11/30/2022 3:36 AM, Hao Luo wrote:
> >>> On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> >>>> Just to be clear, I meant to refactor htab_lock_bucket() into a try
> >>>> lock pattern. Also after a second thought, the below suggestion doesn't
> >>>> work. I think the proper way is to make htab_lock_bucket() as a
> >>>> raw_spin_trylock_irqsave().
> >>>>
> >>>> Regards,
> >>>> Boqun
> >>>>
> >>> The potential deadlock happens when the lock is contended from the
> >>> same cpu. When the lock is contended from a remote cpu, we would like
> >>> the remote cpu to spin and wait, instead of giving up immediately. As
> >>> this gives better throughput. So replacing the current
> >>> raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
> >>>
> >>> I suspect the source of the problem is the 'hash' that we used in
> >>> htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
> >>> whether we should use a hash derived from 'bucket' rather than from
> >>> 'key'. For example, from the memory address of the 'bucket'. Because,
> >>> different keys may fall into the same bucket, but yield different
> >>> hashes. If the same bucket can never have two different 'hashes' here,
> >>> the map_locked check should behave as intended. Also because
> >>> ->map_locked is per-cpu, execution flows from two different cpus can
> >>> both pass.
> >> The warning from lockdep is due to the reason the bucket lock A is used in a
> >> no-NMI context firstly, then the same bucke lock is used a NMI context, so
> > Yes, I tested lockdep too, we can't use the lock in NMI(but only
> > try_lock work fine) context if we use them no-NMI context. otherwise
> > the lockdep prints the warning.
> > * for the dead-lock case: we can use the
> > 1. hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1)
> > 2. or hash bucket address.
> Use the computed hash will be better than hash bucket address, because the hash
> buckets are allocated sequentially.
> >
> > * for lockdep warning, we should use in_nmi check with map_locked.
> >
> > BTW, the patch doesn't work, so we can remove the lock_key
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c50eb518e262fa06bd334e6eec172eaf5d7a5bd9
> >
> > static inline int htab_lock_bucket(const struct bpf_htab *htab,
> >                                    struct bucket *b, u32 hash,
> >                                    unsigned long *pflags)
> > {
> >         unsigned long flags;
> >
> >         hash = hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1);
> >
> >         preempt_disable();
> >         if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
> >                 __this_cpu_dec(*(htab->map_locked[hash]));
> >                 preempt_enable();
> >                 return -EBUSY;
> >         }
> >
> >         if (in_nmi()) {
> >                 if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
> >                         return -EBUSY;
> The only purpose of trylock here is to make lockdep happy and it may lead to
> unnecessary -EBUSY error for htab operations in NMI context. I still prefer add
> a virtual lock-class for map_locked to fix the lockdep warning. So could you use
Hi, what is virtual lock-class ? Can you give me an example of what you mean?
> separated patches to fix the potential dead-lock and the lockdep warning ? It
> will be better you can also add a bpf selftests for deadlock problem as said before.
>
> Thanks,
> Tao
> >         } else {
> >                 raw_spin_lock_irqsave(&b->raw_lock, flags);
> >         }
> >
> >         *pflags = flags;
> >         return 0;
> > }
> >
> >
> >> lockdep deduces that may be a dead-lock. I have already tried to use the same
> >> map_locked for keys with the same bucket, the dead-lock is gone, but still got
> >> lockdep warning.
> >>> Hao
> >>> .
> >
>


-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-30  5:02                                       ` Hao Luo
@ 2022-11-30  5:56                                         ` Tonghao Zhang
  0 siblings, 0 replies; 33+ messages in thread
From: Tonghao Zhang @ 2022-11-30  5:56 UTC (permalink / raw)
  To: Hao Luo
  Cc: Hou Tao, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, houtao1, LKML,
	Boqun Feng

On Wed, Nov 30, 2022 at 1:02 PM Hao Luo <haoluo@google.com> wrote:
>
> On Tue, Nov 29, 2022 at 8:13 PM Hou Tao <houtao@huaweicloud.com> wrote:
> >
> > On 11/30/2022 10:47 AM, Tonghao Zhang wrote:
> <...>
> > >         if (in_nmi()) {
> > >                 if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
> > >                         return -EBUSY;
> >
> > The only purpose of trylock here is to make lockdep happy and it may lead to
> > unnecessary -EBUSY error for htab operations in NMI context. I still prefer add
> > a virtual lock-class for map_locked to fix the lockdep warning. So could you use
> > separated patches to fix the potential dead-lock and the lockdep warning ? It
> > will be better you can also add a bpf selftests for deadlock problem as said before.
> >
>
> Agree with Tao here. Tonghao, could you send another version which:
>
> - separates the fix to deadlock and the fix to lockdep warning
> - includes a bpf selftest to verify the fix to deadlock
> - with bpf-specific tag: [PATCH bpf-next]
>
> There are multiple ideas on the fly in this thread, it's easy to lose
> track of what has been proposed and what change you intend to make.
Hi, I will send v2 soon. Thanks.
> Thanks,
> Hao



-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
  2022-11-30  5:55                                       ` Tonghao Zhang
@ 2022-12-01  2:53                                         ` Hou Tao
  0 siblings, 0 replies; 33+ messages in thread
From: Hou Tao @ 2022-12-01  2:53 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Hao Luo, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon,
	netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Jiri Olsa, bpf, houtao1, LKML,
	Boqun Feng

Hi,

On 11/30/2022 1:55 PM, Tonghao Zhang wrote:
> On Wed, Nov 30, 2022 at 12:13 PM Hou Tao <houtao@huaweicloud.com> wrote:
>> Hi,
>>
>> On 11/30/2022 10:47 AM, Tonghao Zhang wrote:
>>> On Wed, Nov 30, 2022 at 9:50 AM Hou Tao <houtao@huaweicloud.com> wrote:
>>>> Hi Hao,
>>>>
>>>> On 11/30/2022 3:36 AM, Hao Luo wrote:
>>>>> On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>>>>>> Just to be clear, I meant to refactor htab_lock_bucket() into a try
>>>>>> lock pattern. Also after a second thought, the below suggestion doesn't
>>>>>> work. I think the proper way is to make htab_lock_bucket() as a
>>>>>> raw_spin_trylock_irqsave().
>>>>>>
>>>>>> Regards,
>>>>>> Boqun
>>>>>>
>>>>> The potential deadlock happens when the lock is contended from the
>>>>> same cpu. When the lock is contended from a remote cpu, we would like
>>>>> the remote cpu to spin and wait, instead of giving up immediately. As
>>>>> this gives better throughput. So replacing the current
>>>>> raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
>>>>>
>>>>> I suspect the source of the problem is the 'hash' that we used in
>>>>> htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
>>>>> whether we should use a hash derived from 'bucket' rather than from
>>>>> 'key'. For example, from the memory address of the 'bucket'. Because,
>>>>> different keys may fall into the same bucket, but yield different
>>>>> hashes. If the same bucket can never have two different 'hashes' here,
>>>>> the map_locked check should behave as intended. Also because
>>>>> ->map_locked is per-cpu, execution flows from two different cpus can
>>>>> both pass.
>>>> The warning from lockdep is due to the reason the bucket lock A is used in a
>>>> no-NMI context firstly, then the same bucke lock is used a NMI context, so
>>> Yes, I tested lockdep too, we can't use the lock in NMI(but only
>>> try_lock work fine) context if we use them no-NMI context. otherwise
>>> the lockdep prints the warning.
>>> * for the dead-lock case: we can use the
>>> 1. hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1)
>>> 2. or hash bucket address.
>> Use the computed hash will be better than hash bucket address, because the hash
>> buckets are allocated sequentially.
>>> * for lockdep warning, we should use in_nmi check with map_locked.
>>>
>>> BTW, the patch doesn't work, so we can remove the lock_key
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c50eb518e262fa06bd334e6eec172eaf5d7a5bd9
>>>
>>> static inline int htab_lock_bucket(const struct bpf_htab *htab,
>>>                                    struct bucket *b, u32 hash,
>>>                                    unsigned long *pflags)
>>> {
>>>         unsigned long flags;
>>>
>>>         hash = hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1);
>>>
>>>         preempt_disable();
>>>         if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>>>                 __this_cpu_dec(*(htab->map_locked[hash]));
>>>                 preempt_enable();
>>>                 return -EBUSY;
>>>         }
>>>
>>>         if (in_nmi()) {
>>>                 if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
>>>                         return -EBUSY;
>> The only purpose of trylock here is to make lockdep happy and it may lead to
>> unnecessary -EBUSY error for htab operations in NMI context. I still prefer add
>> a virtual lock-class for map_locked to fix the lockdep warning. So could you use
> Hi, what is virtual lock-class ? Can you give me an example of what you mean?
If LOCKDEP is enabled, raw_spinlock will add dep_map in the definition and it
also calls lock_acquire() and lock_release() to assist the deadlock check. Now
map_locked is not a lock but it acts like a raw_spin_trylock, so we need to add
dep_map to it manually, and then also call lock_acquire(trylock=1) and
lock_release() before increasing and decreasing map_locked. You can reference
the implementation of raw_spin_trylock and raw_spin_unlock for more details.
>> separated patches to fix the potential dead-lock and the lockdep warning ? It
>> will be better you can also add a bpf selftests for deadlock problem as said before.
>>
>> Thanks,
>> Tao
>>>         } else {
>>>                 raw_spin_lock_irqsave(&b->raw_lock, flags);
>>>         }
>>>
>>>         *pflags = flags;
>>>         return 0;
>>> }
>>>
>>>
>>>> lockdep deduces that may be a dead-lock. I have already tried to use the same
>>>> map_locked for keys with the same bucket, the dead-lock is gone, but still got
>>>> lockdep warning.
>>>>> Hao
>>>>> .
>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next] bpf: avoid hashtab deadlock with try_lock
@ 2022-11-23  0:06 kernel test robot
  0 siblings, 0 replies; 33+ messages in thread
From: kernel test robot @ 2022-11-23  0:06 UTC (permalink / raw)
  To: oe-kbuild; +Cc: lkp, Dan Carpenter

[-- Attachment #1: Type: text/plain, Size: 2713 bytes --]

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20221121100521.56601-2-xiangxia.m.yue@gmail.com>
References: <20221121100521.56601-2-xiangxia.m.yue@gmail.com>
TO: xiangxia.m.yue@gmail.com
CC: netdev@vger.kernel.org
CC: Tonghao Zhang <xiangxia.m.yue@gmail.com>
CC: Alexei Starovoitov <ast@kernel.org>
CC: Daniel Borkmann <daniel@iogearbox.net>
CC: Andrii Nakryiko <andrii@kernel.org>
CC: Martin KaFai Lau <martin.lau@linux.dev>
CC: Song Liu <song@kernel.org>
CC: Yonghong Song <yhs@fb.com>
CC: John Fastabend <john.fastabend@gmail.com>
CC: KP Singh <kpsingh@kernel.org>
CC: Stanislav Fomichev <sdf@google.com>
CC: Hao Luo <haoluo@google.com>
CC: Jiri Olsa <jolsa@kernel.org>

Hi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/xiangxia-m-yue-gmail-com/bpf-avoid-hashtab-deadlock-with-try_lock/20221121-180611
patch link:    https://lore.kernel.org/r/20221121100521.56601-2-xiangxia.m.yue%40gmail.com
patch subject: [net-next] bpf: avoid hashtab deadlock with try_lock
:::::: branch date: 2 days ago
:::::: commit date: 2 days ago
config: csky-randconfig-m031-20221121
compiler: csky-linux-gcc (GCC) 12.1.0

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <error27@gmail.com>

smatch warnings:
kernel/bpf/hashtab.c:157 htab_lock_bucket() error: uninitialized symbol 'flags'.

vim +/flags +157 kernel/bpf/hashtab.c

d01f9b198ca985b Thomas Gleixner 2020-02-24  144  
13daa24092fd546 Tonghao Zhang   2022-11-21  145  static inline int htab_lock_bucket(struct bucket *b,
20b6cc34ea74b6a Song Liu        2020-10-29  146  				   unsigned long *pflags)
d01f9b198ca985b Thomas Gleixner 2020-02-24  147  {
d01f9b198ca985b Thomas Gleixner 2020-02-24  148  	unsigned long flags;
d01f9b198ca985b Thomas Gleixner 2020-02-24  149  
13daa24092fd546 Tonghao Zhang   2022-11-21  150  	if (in_nmi()) {
13daa24092fd546 Tonghao Zhang   2022-11-21  151  		if (!raw_spin_trylock_irqsave(&b->raw_lock, flags))
20b6cc34ea74b6a Song Liu        2020-10-29  152  			return -EBUSY;
13daa24092fd546 Tonghao Zhang   2022-11-21  153  	} else {
13daa24092fd546 Tonghao Zhang   2022-11-21  154  		raw_spin_lock_irqsave(&b->raw_lock, flags);
20b6cc34ea74b6a Song Liu        2020-10-29  155  	}
20b6cc34ea74b6a Song Liu        2020-10-29  156  
20b6cc34ea74b6a Song Liu        2020-10-29 @157  	*pflags = flags;
20b6cc34ea74b6a Song Liu        2020-10-29  158  	return 0;
d01f9b198ca985b Thomas Gleixner 2020-02-24  159  }
d01f9b198ca985b Thomas Gleixner 2020-02-24  160  

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

[-- Attachment #2: config --]
[-- Type: text/plain, Size: 100827 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Linux/csky 6.1.0-rc5 Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="csky-linux-gcc (GCC) 12.1.0"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=120100
CONFIG_CLANG_VERSION=0
CONFIG_AS_IS_GNU=y
CONFIG_AS_VERSION=23800
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=23800
CONFIG_LLD_VERSION=0
CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
CONFIG_PAHOLE_VERSION=123
CONFIG_CONSTRUCTORS=y
CONFIG_IRQ_WORK=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_COMPILE_TEST=y
# CONFIG_WERROR is not set
CONFIG_LOCALVERSION=""
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_LZO is not set
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_WATCH_QUEUE=y
# CONFIG_CROSS_MEMORY_ATTACH is not set
CONFIG_USELIB=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_MAY_HAVE_SPARSE_IRQ=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_INJECTION=y
CONFIG_GENERIC_IRQ_CHIP=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_SIM=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
# CONFIG_SPARSE_IRQ is not set
CONFIG_GENERIC_IRQ_DEBUGFS=y
# end of IRQ subsystem

CONFIG_GENERIC_IRQ_MULTI_HANDLER=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_CONTEXT_TRACKING=y
CONFIG_CONTEXT_TRACKING_IDLE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
# end of Timers subsystem

CONFIG_BPF=y

#
# BPF subsystem
#
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
# end of BPF subsystem

CONFIG_PREEMPT_BUILD=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
CONFIG_PSI=y
# CONFIG_PSI_DEFAULT_DISABLED is not set
# end of CPU/Task time and stats accounting

CONFIG_CPU_ISOLATION=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU_GENERIC=y
CONFIG_TASKS_RCU=y
CONFIG_TASKS_TRACE_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
# end of RCU Subsystem

# CONFIG_IKCONFIG is not set
CONFIG_IKHEADERS=y
CONFIG_GENERIC_SCHED_CLOCK=y

#
# Scheduler features
#
# end of Scheduler features

CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5"
CONFIG_GCC12_NO_ARRAY_BOUNDS=y
CONFIG_CC_NO_ARRAY_BOUNDS=y
CONFIG_CGROUPS=y
CONFIG_PAGE_COUNTER=y
# CONFIG_CGROUP_FAVOR_DYNMODS is not set
CONFIG_MEMCG=y
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CFS_BANDWIDTH=y
# CONFIG_RT_GROUP_SCHED is not set
# CONFIG_CGROUP_PIDS is not set
# CONFIG_CGROUP_RDMA is not set
# CONFIG_CGROUP_FREEZER is not set
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_BPF=y
# CONFIG_CGROUP_MISC is not set
CONFIG_CGROUP_DEBUG=y
CONFIG_SOCK_CGROUP_DATA=y
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
CONFIG_RD_LZ4=y
CONFIG_RD_ZSTD=y
# CONFIG_BOOT_CONFIG is not set
CONFIG_INITRAMFS_PRESERVE_MTIME=y
# CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_EXPERT=y
# CONFIG_MULTIUSER is not set
CONFIG_SGETMASK_SYSCALL=y
# CONFIG_SYSFS_SYSCALL is not set
# CONFIG_FHANDLE is not set
CONFIG_POSIX_TIMERS=y
# CONFIG_PRINTK is not set
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_EPOLL=y
# CONFIG_SIGNALFD is not set
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
# CONFIG_SHMEM is not set
# CONFIG_AIO is not set
# CONFIG_IO_URING is not set
# CONFIG_ADVISE_SYSCALLS is not set
CONFIG_MEMBARRIER=y
# CONFIG_KALLSYMS is not set
CONFIG_KCMP=y
CONFIG_RSEQ=y
# CONFIG_DEBUG_RSEQ is not set
CONFIG_EMBEDDED=y
CONFIG_HAVE_PERF_EVENTS=y
CONFIG_PC104=y

#
# Kernel Performance Events And Counters
#
# CONFIG_PERF_EVENTS is not set
# end of Kernel Performance Events And Counters

CONFIG_SYSTEM_DATA_VERIFICATION=y
CONFIG_PROFILING=y
# end of General setup

CONFIG_CSKY=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_CPU_HAS_CACHEV2=y
CONFIG_CPU_HAS_FPUV2=y
CONFIG_CPU_HAS_TLBI=y
CONFIG_CPU_HAS_LDSTEX=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_CSUM=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_MMU=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_TIME_LOW_RES=y
CONFIG_CPU_TLB_SIZE=1024
CONFIG_CPU_ASID_BITS=12
CONFIG_L1_CACHE_SHIFT=6
CONFIG_ARCH_MMAP_RND_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_BITS_MAX=17

#
# Processor type and features
#
# CONFIG_CPU_CK610 is not set
# CONFIG_CPU_CK810 is not set
# CONFIG_CPU_CK807 is not set
CONFIG_CPU_CK860=y
CONFIG_PAGE_OFFSET_80000000=y
# CONFIG_PAGE_OFFSET_A0000000 is not set
CONFIG_PAGE_OFFSET=0x80000000
CONFIG_CPU_PM_NONE=y
# CONFIG_CPU_PM_WAIT is not set
# CONFIG_CPU_PM_DOZE is not set
# CONFIG_CPU_PM_STOP is not set
# CONFIG_CPU_HAS_FPU is not set
CONFIG_CPU_HAS_ICACHE_INS=y
# CONFIG_SMP is not set
CONFIG_HIGHMEM=y
CONFIG_ARCH_FORCE_MAX_ORDER=11
CONFIG_DRAM_BASE=0x0
# CONFIG_HAVE_EFFICIENT_UNALIGNED_STRING_OPS is not set
# end of Processor type and features

#
# Platform drivers selection
#
CONFIG_ARCH_CSKY_DW_APB_ICTL=y
# end of Platform drivers selection

CONFIG_HZ_100=y
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=100

#
# General architecture-dependent options
#
# CONFIG_KPROBES is not set
CONFIG_JUMP_LABEL=y
CONFIG_STATIC_KEYS_SELFTEST=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_32BIT_OFF_T=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_RSEQ=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
CONFIG_HAVE_ARCH_SECCOMP=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
# CONFIG_SECCOMP is not set
CONFIG_HAVE_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_LTO_NONE=y
CONFIG_HAVE_CONTEXT_TRACKING_USER=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_PGTABLE_LEVELS=2
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_ARCH_MMAP_RND_BITS=8
CONFIG_PAGE_SIZE_LESS_THAN_64KB=y
CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT=y
CONFIG_COMPAT_32BIT_TIME=y
# CONFIG_LOCK_EVENT_COUNTS is not set

#
# GCOV-based kernel profiling
#
CONFIG_GCOV_KERNEL=y
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# end of GCOV-based kernel profiling
# end of General architecture-dependent options

CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULE_SIG_FORMAT=y
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
# CONFIG_MODULE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_MODULE_SIG=y
CONFIG_MODULE_SIG_FORCE=y
CONFIG_MODULE_SIG_ALL=y
# CONFIG_MODULE_SIG_SHA1 is not set
# CONFIG_MODULE_SIG_SHA224 is not set
# CONFIG_MODULE_SIG_SHA256 is not set
# CONFIG_MODULE_SIG_SHA384 is not set
CONFIG_MODULE_SIG_SHA512=y
CONFIG_MODULE_SIG_HASH="sha512"
# CONFIG_MODULE_COMPRESS_NONE is not set
# CONFIG_MODULE_COMPRESS_GZIP is not set
CONFIG_MODULE_COMPRESS_XZ=y
# CONFIG_MODULE_COMPRESS_ZSTD is not set
# CONFIG_MODULE_DECOMPRESS is not set
# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
CONFIG_MODPROBE_PATH="/sbin/modprobe"
# CONFIG_BLOCK is not set
CONFIG_ASN1=y
CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_BINFMT_SCRIPT=m
CONFIG_BINFMT_MISC=m
# CONFIG_COREDUMP is not set
# end of Executable file formats

#
# Memory Management options
#

#
# SLAB allocator options
#
# CONFIG_SLAB is not set
# CONFIG_SLUB is not set
CONFIG_SLOB=y
# end of SLAB allocator options

CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
CONFIG_COMPAT_BRK=y
CONFIG_FLATMEM=y
CONFIG_EXCLUSIVE_SYSTEM_RAM=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_MEMORY_BALLOON=y
# CONFIG_BALLOON_COMPACTION is not set
CONFIG_COMPACTION=y
CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1
CONFIG_PAGE_REPORTING=y
CONFIG_MIGRATION=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_NEED_PER_CPU_KM=y
# CONFIG_CMA is not set
CONFIG_PAGE_IDLE_FLAG=y
CONFIG_IDLE_PAGE_TRACKING=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_PERCPU_STATS=y
# CONFIG_GUP_TEST is not set
CONFIG_KMAP_LOCAL=y
# CONFIG_USERFAULTFD is not set
CONFIG_LRU_GEN=y
# CONFIG_LRU_GEN_ENABLED is not set
CONFIG_LRU_GEN_STATS=y

#
# Data Access Monitoring
#
# CONFIG_DAMON is not set
# end of Data Access Monitoring
# end of Memory Management options

# CONFIG_NET is not set

#
# Device Drivers
#
CONFIG_HAVE_PCI=y
# CONFIG_PCI is not set
# CONFIG_PCCARD is not set

#
# Generic Driver Options
#
CONFIG_AUXILIARY_BUS=y
# CONFIG_UEVENT_HELPER is not set
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_DEVTMPFS_SAFE=y
CONFIG_STANDALONE=y
# CONFIG_PREVENT_FIRMWARE_BUILD is not set

#
# Firmware loader
#
CONFIG_FW_LOADER=y
CONFIG_FW_LOADER_PAGED_BUF=y
CONFIG_FW_LOADER_SYSFS=y
CONFIG_EXTRA_FIRMWARE=""
CONFIG_FW_LOADER_USER_HELPER=y
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
# CONFIG_FW_LOADER_COMPRESS is not set
CONFIG_FW_UPLOAD=y
# end of Firmware loader

CONFIG_WANT_DEV_COREDUMP=y
# CONFIG_ALLOW_DEV_COREDUMP is not set
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
CONFIG_GENERIC_CPU_DEVICES=y
CONFIG_REGMAP=y
CONFIG_REGMAP_I2C=m
CONFIG_REGMAP_SLIMBUS=m
CONFIG_REGMAP_SPMI=m
CONFIG_REGMAP_MMIO=y
CONFIG_REGMAP_IRQ=y
CONFIG_REGMAP_SCCB=m
CONFIG_DMA_SHARED_BUFFER=y
CONFIG_DMA_FENCE_TRACE=y
# end of Generic Driver Options

#
# Bus devices
#
# CONFIG_ARM_INTEGRATOR_LM is not set
# CONFIG_BT1_APB is not set
# CONFIG_BT1_AXI is not set
# CONFIG_HISILICON_LPC is not set
# CONFIG_INTEL_IXP4XX_EB is not set
# CONFIG_QCOM_EBI2 is not set
# CONFIG_MHI_BUS is not set
CONFIG_MHI_BUS_EP=m
# end of Bus devices

#
# Firmware Drivers
#

#
# ARM System Control and Management Interface Protocol
#
# CONFIG_ARM_SCMI_PROTOCOL is not set
CONFIG_ARM_SCMI_POWER_DOMAIN=y
# CONFIG_ARM_SCMI_POWER_CONTROL is not set
# end of ARM System Control and Management Interface Protocol

CONFIG_ARM_SCPI_POWER_DOMAIN=y
# CONFIG_FIRMWARE_MEMMAP is not set
# CONFIG_BCM47XX_NVRAM is not set
CONFIG_GOOGLE_FIRMWARE=y
# CONFIG_GOOGLE_COREBOOT_TABLE is not set

#
# Tegra firmware driver
#
# end of Tegra firmware driver
# end of Firmware Drivers

CONFIG_GNSS=m
CONFIG_GNSS_SERIAL=m
CONFIG_GNSS_MTK_SERIAL=m
CONFIG_GNSS_SIRF_SERIAL=m
# CONFIG_GNSS_UBX_SERIAL is not set
CONFIG_MTD=y
CONFIG_MTD_TESTS=m

#
# Partition parsers
#
# CONFIG_MTD_AR7_PARTS is not set
# CONFIG_MTD_BCM63XX_PARTS is not set
# CONFIG_MTD_BRCM_U_BOOT is not set
CONFIG_MTD_CMDLINE_PARTS=m
CONFIG_MTD_OF_PARTS=m
# CONFIG_MTD_OF_PARTS_BCM4908 is not set
# CONFIG_MTD_OF_PARTS_LINKSYS_NS is not set
# CONFIG_MTD_PARSER_IMAGETAG is not set
# CONFIG_MTD_PARSER_TRX is not set
# CONFIG_MTD_SHARPSL_PARTS is not set
CONFIG_MTD_REDBOOT_PARTS=m
CONFIG_MTD_REDBOOT_DIRECTORY_BLOCK=-1
CONFIG_MTD_REDBOOT_PARTS_UNALLOCATED=y
# CONFIG_MTD_REDBOOT_PARTS_READONLY is not set
# end of Partition parsers

#
# User Modules And Translation Layers
#
CONFIG_MTD_OOPS=m
# CONFIG_MTD_PARTITIONED_MASTER is not set

#
# RAM/ROM/Flash chip drivers
#
CONFIG_MTD_CFI=m
# CONFIG_MTD_JEDECPROBE is not set
CONFIG_MTD_GEN_PROBE=m
CONFIG_MTD_CFI_ADV_OPTIONS=y
CONFIG_MTD_CFI_NOSWAP=y
# CONFIG_MTD_CFI_BE_BYTE_SWAP is not set
# CONFIG_MTD_CFI_LE_BYTE_SWAP is not set
# CONFIG_MTD_CFI_GEOMETRY is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
CONFIG_MTD_OTP=y
# CONFIG_MTD_CFI_INTELEXT is not set
CONFIG_MTD_CFI_AMDSTD=m
CONFIG_MTD_CFI_STAA=m
CONFIG_MTD_CFI_UTIL=m
CONFIG_MTD_RAM=m
# CONFIG_MTD_ROM is not set
CONFIG_MTD_ABSENT=y
# end of RAM/ROM/Flash chip drivers

#
# Mapping drivers for chip access
#
CONFIG_MTD_COMPLEX_MAPPINGS=y
# CONFIG_MTD_PHYSMAP is not set
# CONFIG_MTD_SC520CDP is not set
# CONFIG_MTD_NETSC520 is not set
# CONFIG_MTD_TS5500 is not set
CONFIG_MTD_PLATRAM=m
# end of Mapping drivers for chip access

#
# Self-contained MTD device drivers
#
CONFIG_MTD_SPEAR_SMI=y
CONFIG_MTD_SLRAM=m
# CONFIG_MTD_PHRAM is not set
CONFIG_MTD_MTDRAM=y
CONFIG_MTDRAM_TOTAL_SIZE=4096
CONFIG_MTDRAM_ERASE_SIZE=128

#
# Disk-On-Chip Device Drivers
#
CONFIG_MTD_DOCG3=m
CONFIG_BCH_CONST_M=14
CONFIG_BCH_CONST_T=4
# end of Self-contained MTD device drivers

#
# NAND
#
CONFIG_MTD_NAND_CORE=y
CONFIG_MTD_ONENAND=y
# CONFIG_MTD_ONENAND_VERIFY_WRITE is not set
CONFIG_MTD_ONENAND_GENERIC=y
# CONFIG_MTD_ONENAND_SAMSUNG is not set
CONFIG_MTD_ONENAND_OTP=y
CONFIG_MTD_ONENAND_2X_PROGRAM=y
CONFIG_MTD_RAW_NAND=m

#
# Raw/parallel NAND flash controllers
#
CONFIG_MTD_NAND_DENALI=m
CONFIG_MTD_NAND_DENALI_DT=m
CONFIG_MTD_NAND_AMS_DELTA=m
# CONFIG_MTD_NAND_SHARPSL is not set
# CONFIG_MTD_NAND_ATMEL is not set
# CONFIG_MTD_NAND_MARVELL is not set
# CONFIG_MTD_NAND_SLC_LPC32XX is not set
# CONFIG_MTD_NAND_MLC_LPC32XX is not set
# CONFIG_MTD_NAND_BRCMNAND is not set
# CONFIG_MTD_NAND_OXNAS is not set
# CONFIG_MTD_NAND_FSL_IFC is not set
# CONFIG_MTD_NAND_VF610_NFC is not set
# CONFIG_MTD_NAND_MXC is not set
# CONFIG_MTD_NAND_SH_FLCTL is not set
# CONFIG_MTD_NAND_DAVINCI is not set
# CONFIG_MTD_NAND_TXX9NDFMC is not set
# CONFIG_MTD_NAND_FSMC is not set
# CONFIG_MTD_NAND_SUNXI is not set
# CONFIG_MTD_NAND_HISI504 is not set
# CONFIG_MTD_NAND_QCOM is not set
# CONFIG_MTD_NAND_MXIC is not set
# CONFIG_MTD_NAND_TEGRA is not set
# CONFIG_MTD_NAND_STM32_FMC2 is not set
# CONFIG_MTD_NAND_MESON is not set
CONFIG_MTD_NAND_GPIO=m
# CONFIG_MTD_NAND_PLATFORM is not set
# CONFIG_MTD_NAND_CADENCE is not set
# CONFIG_MTD_NAND_ARASAN is not set
CONFIG_MTD_NAND_INTEL_LGM=m
# CONFIG_MTD_NAND_RENESAS is not set

#
# Misc
#
CONFIG_MTD_NAND_NANDSIM=m
# CONFIG_MTD_NAND_DISKONCHIP is not set

#
# ECC engine support
#
CONFIG_MTD_NAND_ECC=y
# CONFIG_MTD_NAND_ECC_SW_HAMMING is not set
# CONFIG_MTD_NAND_ECC_SW_BCH is not set
# CONFIG_MTD_NAND_ECC_MXIC is not set
# CONFIG_MTD_NAND_ECC_MEDIATEK is not set
# end of ECC engine support
# end of NAND

#
# LPDDR & LPDDR2 PCM memory drivers
#
CONFIG_MTD_LPDDR=y
CONFIG_MTD_QINFO_PROBE=y
# end of LPDDR & LPDDR2 PCM memory drivers

# CONFIG_MTD_UBI is not set
CONFIG_MTD_HYPERBUS=m
# CONFIG_HBMC_AM654 is not set
CONFIG_DTC=y
CONFIG_OF=y
# CONFIG_OF_UNITTEST is not set
# CONFIG_OF_ALL_DTBS is not set
CONFIG_OF_FLATTREE=y
CONFIG_OF_EARLY_FLATTREE=y
CONFIG_OF_KOBJ=y
CONFIG_OF_DYNAMIC=y
CONFIG_OF_ADDRESS=y
CONFIG_OF_IRQ=y
CONFIG_OF_RESERVED_MEM=y
CONFIG_OF_RESOLVE=y
CONFIG_OF_OVERLAY=y
CONFIG_PARPORT=y
CONFIG_PARPORT_AX88796=y
CONFIG_PARPORT_1284=y
CONFIG_PARPORT_NOT_PC=y

#
# NVME Support
#
# end of NVME Support

#
# Misc devices
#
CONFIG_AD525X_DPOT=m
CONFIG_AD525X_DPOT_I2C=m
# CONFIG_DUMMY_IRQ is not set
CONFIG_ICS932S401=m
# CONFIG_ATMEL_SSC is not set
# CONFIG_ENCLOSURE_SERVICES is not set
CONFIG_HI6421V600_IRQ=m
# CONFIG_QCOM_COINCELL is not set
CONFIG_APDS9802ALS=m
CONFIG_ISL29003=m
CONFIG_ISL29020=m
CONFIG_SENSORS_TSL2550=m
CONFIG_SENSORS_BH1770=m
CONFIG_SENSORS_APDS990X=m
CONFIG_HMC6352=m
CONFIG_DS1682=m
# CONFIG_SRAM is not set
CONFIG_XILINX_SDFEC=m
CONFIG_OPEN_DICE=y
CONFIG_VCPU_STALL_DETECTOR=y
CONFIG_C2PORT=m

#
# EEPROM support
#
CONFIG_EEPROM_AT24=m
# CONFIG_EEPROM_LEGACY is not set
CONFIG_EEPROM_MAX6875=m
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_EEPROM_IDT_89HPESX is not set
# CONFIG_EEPROM_EE1004 is not set
# end of EEPROM support

#
# Texas Instruments shared transport line discipline
#
# end of Texas Instruments shared transport line discipline

#
# Altera FPGA firmware download module (requires I2C)
#
CONFIG_ALTERA_STAPL=m
# CONFIG_ECHO is not set
CONFIG_UACCE=m
CONFIG_PVPANIC=y
CONFIG_PVPANIC_MMIO=m
# end of Misc devices

#
# SCSI device support
#
# end of SCSI device support

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# end of IEEE 1394 (FireWire) support

#
# Input device support
#
# CONFIG_INPUT is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=m
CONFIG_SERIO_SERPORT=m
# CONFIG_SERIO_PARKBD is not set
CONFIG_SERIO_LIBPS2=m
CONFIG_SERIO_RAW=m
CONFIG_SERIO_ALTERA_PS2=m
CONFIG_SERIO_PS2MULT=m
CONFIG_SERIO_ARC_PS2=m
CONFIG_SERIO_APBPS2=m
# CONFIG_SERIO_OLPC_APSP is not set
# CONFIG_SERIO_SUN4I_PS2 is not set
CONFIG_SERIO_GPIO_PS2=m
CONFIG_USERIO=m
CONFIG_GAMEPORT=y
# CONFIG_GAMEPORT_NS558 is not set
# CONFIG_GAMEPORT_L4 is not set
# end of Hardware I/O ports
# end of Input device support

#
# Character devices
#
CONFIG_TTY=y
# CONFIG_VT is not set
# CONFIG_UNIX98_PTYS is not set
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
# CONFIG_LDISC_AUTOLOAD is not set

#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
# CONFIG_SERIAL_8250 is not set

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_AMBA_PL010 is not set
# CONFIG_SERIAL_ATMEL is not set
# CONFIG_SERIAL_MESON is not set
# CONFIG_SERIAL_CLPS711X is not set
# CONFIG_SERIAL_SAMSUNG is not set
# CONFIG_SERIAL_TEGRA is not set
# CONFIG_SERIAL_IMX is not set
# CONFIG_SERIAL_IMX_EARLYCON is not set
# CONFIG_SERIAL_UARTLITE is not set
# CONFIG_SERIAL_SH_SCI is not set
# CONFIG_SERIAL_HS_LPC32XX is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_MSM is not set
# CONFIG_SERIAL_VT8500 is not set
# CONFIG_SERIAL_OMAP is not set
# CONFIG_SERIAL_SIFIVE is not set
# CONFIG_SERIAL_LANTIQ is not set
CONFIG_SERIAL_SCCNXP=y
CONFIG_SERIAL_SCCNXP_CONSOLE=y
# CONFIG_SERIAL_SC16IS7XX is not set
# CONFIG_SERIAL_TIMBERDALE is not set
# CONFIG_SERIAL_BCM63XX is not set
CONFIG_SERIAL_ALTERA_JTAGUART=m
CONFIG_SERIAL_ALTERA_UART=m
CONFIG_SERIAL_ALTERA_UART_MAXPORTS=4
CONFIG_SERIAL_ALTERA_UART_BAUDRATE=115200
# CONFIG_SERIAL_MXS_AUART is not set
CONFIG_SERIAL_XILINX_PS_UART=m
# CONFIG_SERIAL_MPS2_UART is not set
CONFIG_SERIAL_ARC=m
CONFIG_SERIAL_ARC_NR_PORTS=1
CONFIG_SERIAL_FSL_LPUART=m
# CONFIG_SERIAL_FSL_LPUART_CONSOLE is not set
CONFIG_SERIAL_CONEXANT_DIGICOLOR=y
CONFIG_SERIAL_CONEXANT_DIGICOLOR_CONSOLE=y
# CONFIG_SERIAL_ST_ASC is not set
CONFIG_SERIAL_SPRD=y
CONFIG_SERIAL_SPRD_CONSOLE=y
# CONFIG_SERIAL_STM32 is not set
# CONFIG_SERIAL_MVEBU_UART is not set
# CONFIG_SERIAL_OWL is not set
# CONFIG_SERIAL_RDA is not set
# CONFIG_SERIAL_MILBEAUT_USIO is not set
# CONFIG_SERIAL_LITEUART is not set
# CONFIG_SERIAL_SUNPLUS is not set
# end of Serial drivers

# CONFIG_SERIAL_NONSTANDARD is not set
CONFIG_GOLDFISH_TTY=m
# CONFIG_NULL_TTY is not set
CONFIG_SERIAL_DEV_BUS=y
CONFIG_SERIAL_DEV_CTRL_TTYPORT=y
CONFIG_TTY_PRINTK=y
CONFIG_TTY_PRINTK_LEVEL=6
# CONFIG_PRINTER is not set
CONFIG_PPDEV=m
# CONFIG_VIRTIO_CONSOLE is not set
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
# CONFIG_IPMI_DEVICE_INTERFACE is not set
# CONFIG_IPMI_SI is not set
CONFIG_IPMI_SSIF=m
CONFIG_IPMI_IPMB=m
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m
# CONFIG_ASPEED_KCS_IPMI_BMC is not set
# CONFIG_NPCM7XX_KCS_IPMI_BMC is not set
# CONFIG_ASPEED_BT_IPMI_BMC is not set
CONFIG_IPMB_DEVICE_INTERFACE=m
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_TIMERIOMEM=y
CONFIG_HW_RANDOM_ATMEL=y
CONFIG_HW_RANDOM_BA431=y
CONFIG_HW_RANDOM_BCM2835=y
CONFIG_HW_RANDOM_IPROC_RNG200=y
CONFIG_HW_RANDOM_IXP4XX=y
CONFIG_HW_RANDOM_OMAP=y
CONFIG_HW_RANDOM_OMAP3_ROM=y
CONFIG_HW_RANDOM_VIRTIO=y
CONFIG_HW_RANDOM_IMX_RNGC=y
CONFIG_HW_RANDOM_NOMADIK=y
CONFIG_HW_RANDOM_STM32=y
CONFIG_HW_RANDOM_MESON=y
CONFIG_HW_RANDOM_MTK=y
CONFIG_HW_RANDOM_EXYNOS=y
CONFIG_HW_RANDOM_NPCM=y
CONFIG_HW_RANDOM_KEYSTONE=y
# CONFIG_HW_RANDOM_CCTRNG is not set
CONFIG_HW_RANDOM_XIPHERA=y
# CONFIG_DEVMEM is not set
CONFIG_TCG_TPM=y
CONFIG_HW_RANDOM_TPM=y
# CONFIG_TCG_TIS is not set
# CONFIG_TCG_TIS_I2C is not set
# CONFIG_TCG_TIS_SYNQUACER is not set
CONFIG_TCG_TIS_I2C_CR50=m
CONFIG_TCG_TIS_I2C_ATMEL=m
CONFIG_TCG_TIS_I2C_INFINEON=m
# CONFIG_TCG_TIS_I2C_NUVOTON is not set
CONFIG_TCG_ATMEL=y
# CONFIG_TCG_VTPM_PROXY is not set
CONFIG_TCG_TIS_ST33ZP24=m
CONFIG_TCG_TIS_ST33ZP24_I2C=m
# CONFIG_XILLYBUS is not set
# CONFIG_RANDOM_TRUST_CPU is not set
CONFIG_RANDOM_TRUST_BOOTLOADER=y
# end of Character devices

#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_BOARDINFO=y
# CONFIG_I2C_COMPAT is not set
CONFIG_I2C_CHARDEV=m
CONFIG_I2C_MUX=m

#
# Multiplexer I2C Chip support
#
# CONFIG_I2C_ARB_GPIO_CHALLENGE is not set
# CONFIG_I2C_MUX_GPIO is not set
CONFIG_I2C_MUX_GPMUX=m
CONFIG_I2C_MUX_LTC4306=m
CONFIG_I2C_MUX_PCA9541=m
# CONFIG_I2C_MUX_PCA954x is not set
CONFIG_I2C_MUX_PINCTRL=m
# CONFIG_I2C_MUX_REG is not set
# CONFIG_I2C_DEMUX_PINCTRL is not set
# CONFIG_I2C_MUX_MLXCPLD is not set
# end of Multiplexer I2C Chip support

# CONFIG_I2C_HELPER_AUTO is not set
CONFIG_I2C_SMBUS=m

#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
# CONFIG_I2C_ALGOPCF is not set
CONFIG_I2C_ALGOPCA=m
# end of I2C Algorithms

#
# I2C Hardware Bus support
#
# CONFIG_I2C_HIX5HD2 is not set

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_ALTERA is not set
# CONFIG_I2C_ASPEED is not set
# CONFIG_I2C_AT91 is not set
# CONFIG_I2C_AXXIA is not set
# CONFIG_I2C_BCM2835 is not set
# CONFIG_I2C_BCM_IPROC is not set
# CONFIG_I2C_BCM_KONA is not set
CONFIG_I2C_BRCMSTB=m
# CONFIG_I2C_CADENCE is not set
CONFIG_I2C_CBUS_GPIO=m
# CONFIG_I2C_DAVINCI is not set
# CONFIG_I2C_DESIGNWARE_PLATFORM is not set
# CONFIG_I2C_DIGICOLOR is not set
CONFIG_I2C_EMEV2=m
# CONFIG_I2C_EXYNOS5 is not set
CONFIG_I2C_GPIO=m
CONFIG_I2C_GPIO_FAULT_INJECTOR=y
# CONFIG_I2C_HIGHLANDER is not set
# CONFIG_I2C_HISI is not set
# CONFIG_I2C_IMG is not set
# CONFIG_I2C_IMX is not set
# CONFIG_I2C_IMX_LPI2C is not set
# CONFIG_I2C_IOP3XX is not set
# CONFIG_I2C_JZ4780 is not set
CONFIG_I2C_KEMPLD=m
# CONFIG_I2C_LPC2K is not set
# CONFIG_I2C_MESON is not set
# CONFIG_I2C_MICROCHIP_CORE is not set
# CONFIG_I2C_MT65XX is not set
# CONFIG_I2C_MT7621 is not set
# CONFIG_I2C_MV64XXX is not set
# CONFIG_I2C_MXS is not set
# CONFIG_I2C_NPCM is not set
CONFIG_I2C_OCORES=m
# CONFIG_I2C_OMAP is not set
# CONFIG_I2C_OWL is not set
# CONFIG_I2C_APPLE is not set
# CONFIG_I2C_PCA_PLATFORM is not set
# CONFIG_I2C_PNX is not set
# CONFIG_I2C_PXA is not set
# CONFIG_I2C_QCOM_CCI is not set
# CONFIG_I2C_QUP is not set
# CONFIG_I2C_RIIC is not set
# CONFIG_I2C_RK3X is not set
# CONFIG_I2C_RZV2M is not set
# CONFIG_I2C_S3C2410 is not set
# CONFIG_I2C_SH_MOBILE is not set
CONFIG_I2C_SIMTEC=m
# CONFIG_I2C_ST is not set
# CONFIG_I2C_STM32F4 is not set
# CONFIG_I2C_STM32F7 is not set
# CONFIG_I2C_SYNQUACER is not set
# CONFIG_I2C_TEGRA_BPMP is not set
# CONFIG_I2C_UNIPHIER is not set
# CONFIG_I2C_UNIPHIER_F is not set
# CONFIG_I2C_VERSATILE is not set
# CONFIG_I2C_WMT is not set
CONFIG_I2C_XILINX=m
# CONFIG_I2C_XLP9XX is not set
# CONFIG_I2C_RCAR is not set

#
# External I2C/SMBus adapter drivers
#
CONFIG_I2C_PARPORT=m
CONFIG_I2C_TAOS_EVM=m

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_MLXCPLD is not set
# CONFIG_I2C_VIRTIO is not set
# end of I2C Hardware Bus support

# CONFIG_I2C_STUB is not set
CONFIG_I2C_SLAVE=y
CONFIG_I2C_SLAVE_EEPROM=m
CONFIG_I2C_SLAVE_TESTUNIT=m
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# end of I2C support

# CONFIG_I3C is not set
# CONFIG_SPI is not set
CONFIG_SPMI=m
# CONFIG_SPMI_HISI3670 is not set
# CONFIG_SPMI_MSM_PMIC_ARB is not set
# CONFIG_SPMI_MTK_PMIF is not set
# CONFIG_HSI is not set
CONFIG_PPS=m
# CONFIG_PPS_DEBUG is not set

#
# PPS clients support
#
CONFIG_PPS_CLIENT_KTIMER=m
# CONFIG_PPS_CLIENT_LDISC is not set
CONFIG_PPS_CLIENT_PARPORT=m
CONFIG_PPS_CLIENT_GPIO=m

#
# PPS generators support
#

#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK_OPTIONAL=y

#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
# end of PTP clock support

CONFIG_PINCTRL=y
CONFIG_GENERIC_PINCTRL_GROUPS=y
CONFIG_PINMUX=y
CONFIG_GENERIC_PINMUX_FUNCTIONS=y
CONFIG_PINCONF=y
CONFIG_GENERIC_PINCONF=y
CONFIG_DEBUG_PINCTRL=y
# CONFIG_PINCTRL_AMD is not set
# CONFIG_PINCTRL_AT91PIO4 is not set
# CONFIG_PINCTRL_BM1880 is not set
CONFIG_PINCTRL_CY8C95X0=m
# CONFIG_PINCTRL_DA850_PUPD is not set
# CONFIG_PINCTRL_EQUILIBRIUM is not set
# CONFIG_PINCTRL_INGENIC is not set
# CONFIG_PINCTRL_LPC18XX is not set
CONFIG_PINCTRL_MCP23S08_I2C=m
CONFIG_PINCTRL_MCP23S08=m
# CONFIG_PINCTRL_MICROCHIP_SGPIO is not set
CONFIG_PINCTRL_OCELOT=y
# CONFIG_PINCTRL_PISTACHIO is not set
# CONFIG_PINCTRL_RK805 is not set
# CONFIG_PINCTRL_ROCKCHIP is not set
CONFIG_PINCTRL_SINGLE=m
# CONFIG_PINCTRL_STMFX is not set
# CONFIG_PINCTRL_OWL is not set
# CONFIG_PINCTRL_ASPEED_G4 is not set
# CONFIG_PINCTRL_ASPEED_G5 is not set
# CONFIG_PINCTRL_ASPEED_G6 is not set
# CONFIG_PINCTRL_BCM281XX is not set
# CONFIG_PINCTRL_BCM2835 is not set
# CONFIG_PINCTRL_BCM4908 is not set
# CONFIG_PINCTRL_BCM6318 is not set
# CONFIG_PINCTRL_BCM6328 is not set
# CONFIG_PINCTRL_BCM6358 is not set
# CONFIG_PINCTRL_BCM6362 is not set
# CONFIG_PINCTRL_BCM6368 is not set
# CONFIG_PINCTRL_BCM63268 is not set
# CONFIG_PINCTRL_IPROC_GPIO is not set
# CONFIG_PINCTRL_CYGNUS_MUX is not set
# CONFIG_PINCTRL_NS is not set
# CONFIG_PINCTRL_NSP_GPIO is not set
# CONFIG_PINCTRL_NS2_MUX is not set
# CONFIG_PINCTRL_NSP_MUX is not set
# CONFIG_PINCTRL_AS370 is not set
# CONFIG_PINCTRL_BERLIN_BG4CT is not set
CONFIG_PINCTRL_MADERA=m
CONFIG_PINCTRL_CS47L35=y
CONFIG_PINCTRL_CS47L92=y

#
# Intel pinctrl drivers
#
# end of Intel pinctrl drivers

#
# MediaTek pinctrl drivers
#
# CONFIG_EINT_MTK is not set
# CONFIG_PINCTRL_MT2701 is not set
# CONFIG_PINCTRL_MT7623 is not set
# CONFIG_PINCTRL_MT7629 is not set
# CONFIG_PINCTRL_MT8135 is not set
# CONFIG_PINCTRL_MT8127 is not set
# CONFIG_PINCTRL_MT2712 is not set
# CONFIG_PINCTRL_MT6765 is not set
# CONFIG_PINCTRL_MT6779 is not set
# CONFIG_PINCTRL_MT6795 is not set
# CONFIG_PINCTRL_MT6797 is not set
# CONFIG_PINCTRL_MT7622 is not set
# CONFIG_PINCTRL_MT7986 is not set
# CONFIG_PINCTRL_MT8167 is not set
# CONFIG_PINCTRL_MT8173 is not set
# CONFIG_PINCTRL_MT8183 is not set
# CONFIG_PINCTRL_MT8186 is not set
# CONFIG_PINCTRL_MT8188 is not set
# CONFIG_PINCTRL_MT8192 is not set
# CONFIG_PINCTRL_MT8195 is not set
# CONFIG_PINCTRL_MT8365 is not set
# CONFIG_PINCTRL_MT8516 is not set
# CONFIG_PINCTRL_MT6397 is not set
# end of MediaTek pinctrl drivers

CONFIG_PINCTRL_MESON=y
# CONFIG_PINCTRL_WPCM450 is not set
# CONFIG_PINCTRL_NPCM7XX is not set
# CONFIG_PINCTRL_PXA25X is not set
# CONFIG_PINCTRL_PXA27X is not set
# CONFIG_PINCTRL_MSM is not set
# CONFIG_PINCTRL_QCOM_SPMI_PMIC is not set
# CONFIG_PINCTRL_QCOM_SSBI_PMIC is not set
# CONFIG_PINCTRL_LPASS_LPI is not set

#
# Renesas pinctrl drivers
#
# CONFIG_PINCTRL_RENESAS is not set
# CONFIG_PINCTRL_PFC_EMEV2 is not set
# CONFIG_PINCTRL_PFC_R8A77995 is not set
# CONFIG_PINCTRL_PFC_R8A7794 is not set
# CONFIG_PINCTRL_PFC_R8A77990 is not set
# CONFIG_PINCTRL_PFC_R8A7779 is not set
# CONFIG_PINCTRL_PFC_R8A7790 is not set
# CONFIG_PINCTRL_PFC_R8A77950 is not set
# CONFIG_PINCTRL_PFC_R8A77951 is not set
# CONFIG_PINCTRL_PFC_R8A7778 is not set
# CONFIG_PINCTRL_PFC_R8A7793 is not set
# CONFIG_PINCTRL_PFC_R8A7791 is not set
# CONFIG_PINCTRL_PFC_R8A77965 is not set
# CONFIG_PINCTRL_PFC_R8A77960 is not set
# CONFIG_PINCTRL_PFC_R8A77961 is not set
# CONFIG_PINCTRL_PFC_R8A779F0 is not set
# CONFIG_PINCTRL_PFC_R8A7792 is not set
# CONFIG_PINCTRL_PFC_R8A77980 is not set
# CONFIG_PINCTRL_PFC_R8A77970 is not set
# CONFIG_PINCTRL_PFC_R8A779A0 is not set
# CONFIG_PINCTRL_PFC_R8A779G0 is not set
# CONFIG_PINCTRL_PFC_R8A7740 is not set
# CONFIG_PINCTRL_PFC_R8A73A4 is not set
# CONFIG_PINCTRL_RZA1 is not set
# CONFIG_PINCTRL_RZA2 is not set
# CONFIG_PINCTRL_RZG2L is not set
# CONFIG_PINCTRL_PFC_R8A77470 is not set
# CONFIG_PINCTRL_PFC_R8A7745 is not set
# CONFIG_PINCTRL_PFC_R8A7742 is not set
# CONFIG_PINCTRL_PFC_R8A7743 is not set
# CONFIG_PINCTRL_PFC_R8A7744 is not set
# CONFIG_PINCTRL_PFC_R8A774C0 is not set
# CONFIG_PINCTRL_PFC_R8A774E1 is not set
# CONFIG_PINCTRL_PFC_R8A774A1 is not set
# CONFIG_PINCTRL_PFC_R8A774B1 is not set
# CONFIG_PINCTRL_RZN1 is not set
# CONFIG_PINCTRL_RZV2M is not set
# CONFIG_PINCTRL_PFC_SH7203 is not set
# CONFIG_PINCTRL_PFC_SH7264 is not set
# CONFIG_PINCTRL_PFC_SH7269 is not set
# CONFIG_PINCTRL_PFC_SH7720 is not set
# CONFIG_PINCTRL_PFC_SH7722 is not set
# CONFIG_PINCTRL_PFC_SH7734 is not set
# CONFIG_PINCTRL_PFC_SH7757 is not set
# CONFIG_PINCTRL_PFC_SH7785 is not set
# CONFIG_PINCTRL_PFC_SH7786 is not set
# CONFIG_PINCTRL_PFC_SH73A0 is not set
# CONFIG_PINCTRL_PFC_SH7723 is not set
# CONFIG_PINCTRL_PFC_SH7724 is not set
# CONFIG_PINCTRL_PFC_SHX3 is not set
# end of Renesas pinctrl drivers

# CONFIG_PINCTRL_EXYNOS is not set
# CONFIG_PINCTRL_S3C24XX is not set
# CONFIG_PINCTRL_S3C64XX is not set
# CONFIG_PINCTRL_SPRD_SC9860 is not set
# CONFIG_PINCTRL_STARFIVE_JH7100 is not set
# CONFIG_PINCTRL_STM32F429 is not set
# CONFIG_PINCTRL_STM32F469 is not set
# CONFIG_PINCTRL_STM32F746 is not set
# CONFIG_PINCTRL_STM32F769 is not set
# CONFIG_PINCTRL_STM32H743 is not set
# CONFIG_PINCTRL_STM32MP135 is not set
# CONFIG_PINCTRL_STM32MP157 is not set
# CONFIG_PINCTRL_TI_IODELAY is not set
CONFIG_PINCTRL_UNIPHIER=y
# CONFIG_PINCTRL_UNIPHIER_LD4 is not set
# CONFIG_PINCTRL_UNIPHIER_PRO4 is not set
# CONFIG_PINCTRL_UNIPHIER_SLD8 is not set
# CONFIG_PINCTRL_UNIPHIER_PRO5 is not set
# CONFIG_PINCTRL_UNIPHIER_PXS2 is not set
# CONFIG_PINCTRL_UNIPHIER_LD6B is not set
# CONFIG_PINCTRL_UNIPHIER_LD11 is not set
# CONFIG_PINCTRL_UNIPHIER_LD20 is not set
# CONFIG_PINCTRL_UNIPHIER_PXS3 is not set
# CONFIG_PINCTRL_UNIPHIER_NX1 is not set
# CONFIG_PINCTRL_TMPV7700 is not set
CONFIG_GPIOLIB=y
CONFIG_GPIOLIB_FASTPATH_LIMIT=512
CONFIG_OF_GPIO=y
CONFIG_GPIOLIB_IRQCHIP=y
# CONFIG_DEBUG_GPIO is not set
# CONFIG_GPIO_SYSFS is not set
CONFIG_GPIO_CDEV=y
CONFIG_GPIO_CDEV_V1=y
CONFIG_GPIO_GENERIC=y

#
# Memory mapped GPIO drivers
#
CONFIG_GPIO_74XX_MMIO=y
# CONFIG_GPIO_ALTERA is not set
# CONFIG_GPIO_ASPEED is not set
# CONFIG_GPIO_ASPEED_SGPIO is not set
# CONFIG_GPIO_ATH79 is not set
# CONFIG_GPIO_RASPBERRYPI_EXP is not set
# CONFIG_GPIO_BCM_KONA is not set
# CONFIG_GPIO_BCM_XGS_IPROC is not set
# CONFIG_GPIO_BRCMSTB is not set
CONFIG_GPIO_CADENCE=y
# CONFIG_GPIO_CLPS711X is not set
# CONFIG_GPIO_DWAPB is not set
# CONFIG_GPIO_EIC_SPRD is not set
# CONFIG_GPIO_EM is not set
# CONFIG_GPIO_FTGPIO010 is not set
CONFIG_GPIO_GENERIC_PLATFORM=m
CONFIG_GPIO_GRGPIO=y
# CONFIG_GPIO_HISI is not set
# CONFIG_GPIO_HLWD is not set
# CONFIG_GPIO_IOP is not set
CONFIG_GPIO_LOGICVC=m
# CONFIG_GPIO_LPC18XX is not set
# CONFIG_GPIO_LPC32XX is not set
CONFIG_GPIO_MB86S7X=m
# CONFIG_GPIO_MPC8XXX is not set
# CONFIG_GPIO_MT7621 is not set
# CONFIG_GPIO_MXC is not set
# CONFIG_GPIO_MXS is not set
# CONFIG_GPIO_PMIC_EIC_SPRD is not set
# CONFIG_GPIO_PXA is not set
# CONFIG_GPIO_RCAR is not set
# CONFIG_GPIO_RDA is not set
# CONFIG_GPIO_ROCKCHIP is not set
# CONFIG_GPIO_SAMA5D2_PIOBU is not set
# CONFIG_GPIO_SIFIVE is not set
# CONFIG_GPIO_SNPS_CREG is not set
# CONFIG_GPIO_SPRD is not set
# CONFIG_GPIO_STP_XWAY is not set
CONFIG_GPIO_SYSCON=y
# CONFIG_GPIO_TEGRA is not set
# CONFIG_GPIO_TEGRA186 is not set
# CONFIG_GPIO_TS4800 is not set
# CONFIG_GPIO_UNIPHIER is not set
# CONFIG_GPIO_VISCONTI is not set
CONFIG_GPIO_WCD934X=m
# CONFIG_GPIO_XGENE_SB is not set
CONFIG_GPIO_XILINX=y
# CONFIG_GPIO_XLP is not set
CONFIG_GPIO_AMD_FCH=m
# CONFIG_GPIO_IDT3243X is not set
# end of Memory mapped GPIO drivers

#
# I2C GPIO expanders
#
CONFIG_GPIO_ADNP=m
# CONFIG_GPIO_GW_PLD is not set
# CONFIG_GPIO_MAX7300 is not set
CONFIG_GPIO_MAX732X=m
CONFIG_GPIO_PCA953X=m
CONFIG_GPIO_PCA953X_IRQ=y
CONFIG_GPIO_PCA9570=m
CONFIG_GPIO_PCF857X=m
CONFIG_GPIO_TPIC2810=m
# CONFIG_GPIO_TS4900 is not set
# end of I2C GPIO expanders

#
# MFD GPIO expanders
#
CONFIG_GPIO_ARIZONA=m
CONFIG_GPIO_BD9571MWV=m
CONFIG_GPIO_KEMPLD=m
CONFIG_GPIO_LP87565=m
CONFIG_GPIO_MADERA=m
CONFIG_GPIO_MAX77650=m
# CONFIG_GPIO_SL28CPLD is not set
CONFIG_GPIO_TPS65086=m
CONFIG_GPIO_TPS65218=m
CONFIG_GPIO_TQMX86=m
CONFIG_GPIO_WM8994=m
# end of MFD GPIO expanders

#
# Virtual GPIO drivers
#
CONFIG_GPIO_AGGREGATOR=y
CONFIG_GPIO_MOCKUP=y
CONFIG_GPIO_VIRTIO=m
# CONFIG_GPIO_SIM is not set
# end of Virtual GPIO drivers

CONFIG_W1=y

#
# 1-wire Bus Masters
#
CONFIG_W1_MASTER_DS2482=m
# CONFIG_W1_MASTER_MXC is not set
CONFIG_W1_MASTER_DS1WM=m
CONFIG_W1_MASTER_GPIO=m
CONFIG_W1_MASTER_SGI=m
# end of 1-wire Bus Masters

#
# 1-wire Slaves
#
CONFIG_W1_SLAVE_THERM=y
# CONFIG_W1_SLAVE_SMEM is not set
CONFIG_W1_SLAVE_DS2405=m
CONFIG_W1_SLAVE_DS2408=y
CONFIG_W1_SLAVE_DS2408_READBACK=y
CONFIG_W1_SLAVE_DS2413=m
CONFIG_W1_SLAVE_DS2406=y
CONFIG_W1_SLAVE_DS2423=m
CONFIG_W1_SLAVE_DS2805=y
# CONFIG_W1_SLAVE_DS2430 is not set
# CONFIG_W1_SLAVE_DS2431 is not set
# CONFIG_W1_SLAVE_DS2433 is not set
# CONFIG_W1_SLAVE_DS2438 is not set
# CONFIG_W1_SLAVE_DS250X is not set
CONFIG_W1_SLAVE_DS2780=y
# CONFIG_W1_SLAVE_DS2781 is not set
CONFIG_W1_SLAVE_DS28E04=m
CONFIG_W1_SLAVE_DS28E17=m
# end of 1-wire Slaves

CONFIG_POWER_RESET=y
CONFIG_POWER_RESET_ATC260X=m
# CONFIG_POWER_RESET_BRCMKONA is not set
# CONFIG_POWER_RESET_BRCMSTB is not set
# CONFIG_POWER_RESET_GEMINI_POWEROFF is not set
# CONFIG_POWER_RESET_GPIO is not set
# CONFIG_POWER_RESET_GPIO_RESTART is not set
# CONFIG_POWER_RESET_OCELOT_RESET is not set
CONFIG_POWER_RESET_LTC2952=y
CONFIG_POWER_RESET_REGULATOR=y
CONFIG_POWER_RESET_RESTART=y
CONFIG_POWER_RESET_TPS65086=y
# CONFIG_POWER_RESET_KEYSTONE is not set
# CONFIG_POWER_RESET_SYSCON is not set
CONFIG_POWER_RESET_SYSCON_POWEROFF=y
# CONFIG_POWER_RESET_RMOBILE is not set
CONFIG_REBOOT_MODE=y
# CONFIG_SYSCON_REBOOT_MODE is not set
# CONFIG_POWER_RESET_SC27XX is not set
CONFIG_NVMEM_REBOOT_MODE=y
# CONFIG_POWER_SUPPLY is not set
# CONFIG_HWMON is not set
CONFIG_THERMAL=y
CONFIG_THERMAL_STATISTICS=y
CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
CONFIG_THERMAL_OF=y
CONFIG_THERMAL_WRITABLE_TRIPS=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
CONFIG_THERMAL_GOV_FAIR_SHARE=y
CONFIG_THERMAL_GOV_STEP_WISE=y
# CONFIG_THERMAL_GOV_BANG_BANG is not set
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_CPU_THERMAL is not set
CONFIG_DEVFREQ_THERMAL=y
CONFIG_THERMAL_EMULATION=y
# CONFIG_THERMAL_MMIO is not set
CONFIG_HISI_THERMAL=y
# CONFIG_IMX_THERMAL is not set
# CONFIG_IMX8MM_THERMAL is not set
# CONFIG_K3_THERMAL is not set
# CONFIG_QORIQ_THERMAL is not set
# CONFIG_SPEAR_THERMAL is not set
# CONFIG_RCAR_THERMAL is not set
# CONFIG_RCAR_GEN3_THERMAL is not set
# CONFIG_RZG2L_THERMAL is not set
# CONFIG_KIRKWOOD_THERMAL is not set
# CONFIG_DOVE_THERMAL is not set
# CONFIG_ARMADA_THERMAL is not set
# CONFIG_DA9062_THERMAL is not set

#
# Intel thermal drivers
#

#
# ACPI INT340X thermal drivers
#
# end of ACPI INT340X thermal drivers
# end of Intel thermal drivers

#
# Broadcom thermal drivers
#
# CONFIG_BCM2711_THERMAL is not set
# CONFIG_BCM2835_THERMAL is not set
# CONFIG_BRCMSTB_THERMAL is not set
# CONFIG_BCM_NS_THERMAL is not set
# CONFIG_BCM_SR_THERMAL is not set
# end of Broadcom thermal drivers

#
# Texas Instruments thermal drivers
#
# CONFIG_TI_SOC_THERMAL is not set
# end of Texas Instruments thermal drivers

#
# Samsung thermal drivers
#
# CONFIG_EXYNOS_THERMAL is not set
# end of Samsung thermal drivers

#
# NVIDIA Tegra thermal drivers
#
# CONFIG_TEGRA_SOCTHERM is not set
# CONFIG_TEGRA_BPMP_THERMAL is not set
# CONFIG_TEGRA30_TSENSOR is not set
# end of NVIDIA Tegra thermal drivers

#
# Qualcomm thermal drivers
#
# end of Qualcomm thermal drivers

# CONFIG_UNIPHIER_THERMAL is not set
# CONFIG_SPRD_THERMAL is not set
# CONFIG_WATCHDOG is not set
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set
CONFIG_BCMA_POSSIBLE=y
CONFIG_BCMA=m
CONFIG_BCMA_HOST_SOC=y
# CONFIG_BCMA_DRIVER_MIPS is not set
# CONFIG_BCMA_SFLASH is not set
# CONFIG_BCMA_DRIVER_GMAC_CMN is not set
CONFIG_BCMA_DRIVER_GPIO=y
# CONFIG_BCMA_DEBUG is not set

#
# Multifunction device drivers
#
CONFIG_MFD_CORE=m
CONFIG_MFD_ACT8945A=m
# CONFIG_MFD_SUN4I_GPADC is not set
# CONFIG_MFD_AT91_USART is not set
CONFIG_MFD_ATMEL_FLEXCOM=m
CONFIG_MFD_ATMEL_HLCDC=m
CONFIG_MFD_BCM590XX=m
CONFIG_MFD_BD9571MWV=m
# CONFIG_MFD_AXP20X_I2C is not set
CONFIG_MFD_MADERA=m
CONFIG_MFD_MADERA_I2C=m
# CONFIG_MFD_CS47L15 is not set
CONFIG_MFD_CS47L35=y
# CONFIG_MFD_CS47L85 is not set
# CONFIG_MFD_CS47L90 is not set
CONFIG_MFD_CS47L92=y
# CONFIG_MFD_ASIC3 is not set
# CONFIG_MFD_DA9062 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_DA9150 is not set
# CONFIG_MFD_ENE_KB3930 is not set
# CONFIG_MFD_EXYNOS_LPASS is not set
CONFIG_MFD_GATEWORKS_GSC=m
CONFIG_MFD_MC13XXX=m
CONFIG_MFD_MC13XXX_I2C=m
CONFIG_MFD_MP2629=m
# CONFIG_MFD_MXS_LRADC is not set
# CONFIG_MFD_MX25_TSADC is not set
CONFIG_MFD_HI6421_PMIC=m
# CONFIG_MFD_HI6421_SPMI is not set
# CONFIG_MFD_HI655X_PMIC is not set
# CONFIG_HTC_PASIC3 is not set
CONFIG_MFD_IQS62X=m
CONFIG_MFD_KEMPLD=m
CONFIG_MFD_88PM800=m
CONFIG_MFD_88PM805=m
CONFIG_MFD_MAX14577=m
CONFIG_MFD_MAX77650=m
CONFIG_MFD_MAX77686=m
CONFIG_MFD_MAX77693=m
CONFIG_MFD_MAX77714=m
CONFIG_MFD_MAX8907=m
# CONFIG_MFD_MT6360 is not set
# CONFIG_MFD_MT6370 is not set
# CONFIG_MFD_MT6397 is not set
CONFIG_MFD_MENF21BMC=m
# CONFIG_MFD_NTXEC is not set
# CONFIG_MFD_RETU is not set
CONFIG_MFD_PCF50633=m
CONFIG_PCF50633_ADC=m
CONFIG_PCF50633_GPIO=m
# CONFIG_MFD_PM8XXX is not set
# CONFIG_MFD_SPMI_PMIC is not set
CONFIG_MFD_SY7636A=m
CONFIG_MFD_RT4831=m
CONFIG_MFD_RT5033=m
# CONFIG_MFD_RT5120 is not set
CONFIG_MFD_RK808=m
CONFIG_MFD_RN5T618=m
CONFIG_MFD_SI476X_CORE=m
CONFIG_MFD_SIMPLE_MFD_I2C=m
# CONFIG_MFD_SL28CPLD is not set
# CONFIG_MFD_SM501 is not set
CONFIG_MFD_SKY81452=m
# CONFIG_ABX500_CORE is not set
# CONFIG_MFD_SUN6I_PRCM is not set
CONFIG_MFD_SYSCON=y
CONFIG_MFD_TI_AM335X_TSCADC=m
# CONFIG_MFD_LP3943 is not set
CONFIG_MFD_TI_LMU=m
# CONFIG_TPS6105X is not set
# CONFIG_TPS65010 is not set
# CONFIG_TPS6507X is not set
CONFIG_MFD_TPS65086=m
# CONFIG_MFD_TPS65217 is not set
# CONFIG_MFD_TI_LP873X is not set
CONFIG_MFD_TI_LP87565=m
CONFIG_MFD_TPS65218=m
# CONFIG_MFD_TPS65912_I2C is not set
CONFIG_MFD_WL1273_CORE=m
# CONFIG_MFD_LM3533 is not set
CONFIG_MFD_TQMX86=m
CONFIG_MFD_ARIZONA=m
CONFIG_MFD_ARIZONA_I2C=m
CONFIG_MFD_CS47L24=y
CONFIG_MFD_WM5102=y
CONFIG_MFD_WM5110=y
# CONFIG_MFD_WM8997 is not set
CONFIG_MFD_WM8998=y
CONFIG_MFD_WM8994=m
# CONFIG_MFD_STW481X is not set
# CONFIG_MFD_STM32_LPTIMER is not set
# CONFIG_MFD_STM32_TIMERS is not set
CONFIG_MFD_STMFX=m
CONFIG_MFD_WCD934X=m
CONFIG_MFD_ATC260X=m
CONFIG_MFD_ATC260X_I2C=m
# CONFIG_MFD_KHADAS_MCU is not set
# CONFIG_MFD_ACER_A500_EC is not set
# CONFIG_MFD_QCOM_PM8008 is not set
# CONFIG_RAVE_SP_CORE is not set
# CONFIG_MFD_RSMU_I2C is not set
# end of Multifunction device drivers

CONFIG_REGULATOR=y
CONFIG_REGULATOR_DEBUG=y
CONFIG_REGULATOR_FIXED_VOLTAGE=m
# CONFIG_REGULATOR_VIRTUAL_CONSUMER is not set
# CONFIG_REGULATOR_USERSPACE_CONSUMER is not set
# CONFIG_REGULATOR_88PG86X is not set
CONFIG_REGULATOR_88PM800=m
# CONFIG_REGULATOR_ACT8945A is not set
CONFIG_REGULATOR_AD5398=m
# CONFIG_REGULATOR_ANATOP is not set
# CONFIG_REGULATOR_ATC260X is not set
CONFIG_REGULATOR_BCM590XX=m
CONFIG_REGULATOR_BD9571MWV=m
CONFIG_REGULATOR_DA9121=m
CONFIG_REGULATOR_DA9210=m
CONFIG_REGULATOR_DA9211=m
# CONFIG_REGULATOR_FAN53555 is not set
CONFIG_REGULATOR_FAN53880=m
# CONFIG_REGULATOR_GPIO is not set
CONFIG_REGULATOR_HI6421=m
CONFIG_REGULATOR_HI6421V530=m
CONFIG_REGULATOR_ISL9305=m
# CONFIG_REGULATOR_ISL6271A is not set
CONFIG_REGULATOR_LM363X=m
# CONFIG_REGULATOR_LP3971 is not set
CONFIG_REGULATOR_LP3972=m
CONFIG_REGULATOR_LP872X=m
CONFIG_REGULATOR_LP8755=m
# CONFIG_REGULATOR_LP87565 is not set
CONFIG_REGULATOR_LTC3589=m
CONFIG_REGULATOR_LTC3676=m
CONFIG_REGULATOR_MAX14577=m
CONFIG_REGULATOR_MAX1586=m
# CONFIG_REGULATOR_MAX77620 is not set
CONFIG_REGULATOR_MAX77650=m
CONFIG_REGULATOR_MAX8649=m
CONFIG_REGULATOR_MAX8660=m
# CONFIG_REGULATOR_MAX8893 is not set
CONFIG_REGULATOR_MAX8907=m
# CONFIG_REGULATOR_MAX8952 is not set
# CONFIG_REGULATOR_MAX8973 is not set
CONFIG_REGULATOR_MAX20086=m
CONFIG_REGULATOR_MAX77686=m
# CONFIG_REGULATOR_MAX77693 is not set
CONFIG_REGULATOR_MAX77802=m
# CONFIG_REGULATOR_MAX77826 is not set
CONFIG_REGULATOR_MC13XXX_CORE=m
CONFIG_REGULATOR_MC13783=m
# CONFIG_REGULATOR_MC13892 is not set
# CONFIG_REGULATOR_MCP16502 is not set
CONFIG_REGULATOR_MP5416=m
CONFIG_REGULATOR_MP8859=m
# CONFIG_REGULATOR_MP886X is not set
CONFIG_REGULATOR_MPQ7920=m
CONFIG_REGULATOR_MT6311=m
CONFIG_REGULATOR_MT6315=m
# CONFIG_REGULATOR_PBIAS is not set
CONFIG_REGULATOR_PCA9450=m
# CONFIG_REGULATOR_PCF50633 is not set
CONFIG_REGULATOR_PF8X00=m
CONFIG_REGULATOR_PFUZE100=m
CONFIG_REGULATOR_PV88060=m
# CONFIG_REGULATOR_PV88080 is not set
CONFIG_REGULATOR_PV88090=m
# CONFIG_REGULATOR_QCOM_RPMH is not set
# CONFIG_REGULATOR_QCOM_SPMI is not set
# CONFIG_REGULATOR_QCOM_USB_VBUS is not set
CONFIG_REGULATOR_RASPBERRYPI_TOUCHSCREEN_ATTINY=m
CONFIG_REGULATOR_RK808=m
CONFIG_REGULATOR_RN5T618=m
CONFIG_REGULATOR_RT4801=m
CONFIG_REGULATOR_RT4831=m
# CONFIG_REGULATOR_RT5033 is not set
CONFIG_REGULATOR_RT5190A=m
CONFIG_REGULATOR_RT5759=m
# CONFIG_REGULATOR_RT6160 is not set
# CONFIG_REGULATOR_RT6245 is not set
# CONFIG_REGULATOR_RTQ2134 is not set
CONFIG_REGULATOR_RTMV20=m
CONFIG_REGULATOR_RTQ6752=m
# CONFIG_REGULATOR_S2MPA01 is not set
# CONFIG_REGULATOR_S2MPS11 is not set
# CONFIG_REGULATOR_S5M8767 is not set
# CONFIG_REGULATOR_SC2731 is not set
# CONFIG_REGULATOR_SKY81452 is not set
CONFIG_REGULATOR_SLG51000=m
# CONFIG_REGULATOR_STM32_BOOSTER is not set
# CONFIG_REGULATOR_STM32_VREFBUF is not set
# CONFIG_REGULATOR_STM32_PWR is not set
# CONFIG_REGULATOR_TI_ABB is not set
# CONFIG_REGULATOR_STW481X_VMMC is not set
# CONFIG_REGULATOR_SY7636A is not set
CONFIG_REGULATOR_SY8106A=m
CONFIG_REGULATOR_SY8824X=m
CONFIG_REGULATOR_SY8827N=m
CONFIG_REGULATOR_TPS51632=m
CONFIG_REGULATOR_TPS62360=m
CONFIG_REGULATOR_TPS6286X=m
CONFIG_REGULATOR_TPS65023=m
# CONFIG_REGULATOR_TPS6507X is not set
CONFIG_REGULATOR_TPS65086=m
CONFIG_REGULATOR_TPS65132=m
# CONFIG_REGULATOR_TPS65218 is not set
# CONFIG_REGULATOR_TPS68470 is not set
# CONFIG_REGULATOR_UNIPHIER is not set
CONFIG_REGULATOR_VCTRL=m
CONFIG_REGULATOR_WM8994=m
CONFIG_REGULATOR_QCOM_LABIBB=m
CONFIG_CEC_CORE=m

#
# CEC support
#
CONFIG_MEDIA_CEC_SUPPORT=y
CONFIG_CEC_CH7322=m
# CONFIG_CEC_MESON_AO is not set
# CONFIG_CEC_MESON_G12A_AO is not set
# CONFIG_CEC_GPIO is not set
# CONFIG_CEC_SAMSUNG_S5P is not set
# CONFIG_CEC_STI is not set
# CONFIG_CEC_STM32 is not set
# CONFIG_CEC_TEGRA is not set
# end of CEC support

CONFIG_MEDIA_SUPPORT=y
CONFIG_MEDIA_SUPPORT_FILTER=y
# CONFIG_MEDIA_SUBDRV_AUTOSELECT is not set

#
# Media device types
#
CONFIG_MEDIA_CAMERA_SUPPORT=y
# CONFIG_MEDIA_ANALOG_TV_SUPPORT is not set
CONFIG_MEDIA_DIGITAL_TV_SUPPORT=y
CONFIG_MEDIA_RADIO_SUPPORT=y
CONFIG_MEDIA_SDR_SUPPORT=y
# CONFIG_MEDIA_PLATFORM_SUPPORT is not set
CONFIG_MEDIA_TEST_SUPPORT=y
# end of Media device types

CONFIG_VIDEO_DEV=m
CONFIG_MEDIA_CONTROLLER=y
CONFIG_DVB_CORE=m

#
# Video4Linux options
#
CONFIG_VIDEO_V4L2_I2C=y
CONFIG_VIDEO_V4L2_SUBDEV_API=y
CONFIG_VIDEO_ADV_DEBUG=y
# CONFIG_VIDEO_FIXED_MINOR_RANGES is not set
CONFIG_V4L2_FLASH_LED_CLASS=m
CONFIG_V4L2_FWNODE=m
CONFIG_V4L2_ASYNC=m
# end of Video4Linux options

#
# Media controller options
#
# CONFIG_MEDIA_CONTROLLER_DVB is not set
CONFIG_MEDIA_CONTROLLER_REQUEST_API=y
# end of Media controller options

#
# Digital TV options
#
# CONFIG_DVB_MMAP is not set
CONFIG_DVB_MAX_ADAPTERS=16
# CONFIG_DVB_DYNAMIC_MINORS is not set
# CONFIG_DVB_DEMUX_SECTION_LOSS_LOG is not set
CONFIG_DVB_ULE_DEBUG=y
# end of Digital TV options

#
# Media drivers
#

#
# Drivers filtered as selected at 'Filter media drivers'
#

#
# Media drivers
#
CONFIG_RADIO_ADAPTERS=m
# CONFIG_RADIO_SAA7706H is not set
CONFIG_RADIO_SI4713=m
# CONFIG_RADIO_TEA5764 is not set
# CONFIG_RADIO_TEF6862 is not set
# CONFIG_RADIO_WL1273 is not set
CONFIG_RADIO_SI470X=m
CONFIG_I2C_SI470X=m
CONFIG_PLATFORM_SI4713=m
CONFIG_I2C_SI4713=m
# CONFIG_V4L_RADIO_ISA_DRIVERS is not set
CONFIG_V4L_TEST_DRIVERS=y
# CONFIG_VIDEO_VIM2M is not set
# CONFIG_VIDEO_VICODEC is not set
CONFIG_VIDEO_VIMC=m
CONFIG_VIDEO_VIVID=m
CONFIG_VIDEO_VIVID_CEC=y
CONFIG_VIDEO_VIVID_MAX_DEVS=64
# CONFIG_DVB_TEST_DRIVERS is not set
CONFIG_VIDEO_V4L2_TPG=m
CONFIG_VIDEOBUF2_CORE=m
CONFIG_VIDEOBUF2_V4L2=m
CONFIG_VIDEOBUF2_MEMOPS=m
CONFIG_VIDEOBUF2_DMA_CONTIG=m
CONFIG_VIDEOBUF2_VMALLOC=m
# end of Media drivers

#
# Media ancillary drivers
#
CONFIG_MEDIA_ATTACH=y

#
# Camera sensor devices
#
CONFIG_VIDEO_APTINA_PLL=m
CONFIG_VIDEO_CCS_PLL=m
# CONFIG_VIDEO_AR0521 is not set
CONFIG_VIDEO_HI556=m
# CONFIG_VIDEO_HI846 is not set
CONFIG_VIDEO_HI847=m
CONFIG_VIDEO_IMX208=m
CONFIG_VIDEO_IMX214=m
# CONFIG_VIDEO_IMX219 is not set
CONFIG_VIDEO_IMX258=m
# CONFIG_VIDEO_IMX274 is not set
CONFIG_VIDEO_IMX290=m
CONFIG_VIDEO_IMX319=m
# CONFIG_VIDEO_IMX334 is not set
# CONFIG_VIDEO_IMX335 is not set
CONFIG_VIDEO_IMX355=m
CONFIG_VIDEO_IMX412=m
CONFIG_VIDEO_MAX9271_LIB=m
CONFIG_VIDEO_MT9M001=m
CONFIG_VIDEO_MT9M032=m
CONFIG_VIDEO_MT9M111=m
CONFIG_VIDEO_MT9P031=m
# CONFIG_VIDEO_MT9T001 is not set
CONFIG_VIDEO_MT9T112=m
CONFIG_VIDEO_MT9V011=m
# CONFIG_VIDEO_MT9V032 is not set
# CONFIG_VIDEO_MT9V111 is not set
CONFIG_VIDEO_NOON010PC30=m
CONFIG_VIDEO_OG01A1B=m
CONFIG_VIDEO_OV02A10=m
CONFIG_VIDEO_OV08D10=m
# CONFIG_VIDEO_OV13858 is not set
# CONFIG_VIDEO_OV13B10 is not set
CONFIG_VIDEO_OV2640=m
CONFIG_VIDEO_OV2659=m
# CONFIG_VIDEO_OV2680 is not set
CONFIG_VIDEO_OV2685=m
# CONFIG_VIDEO_OV2740 is not set
# CONFIG_VIDEO_OV5640 is not set
CONFIG_VIDEO_OV5645=m
# CONFIG_VIDEO_OV5647 is not set
# CONFIG_VIDEO_OV5670 is not set
CONFIG_VIDEO_OV5675=m
# CONFIG_VIDEO_OV5693 is not set
# CONFIG_VIDEO_OV5695 is not set
# CONFIG_VIDEO_OV6650 is not set
CONFIG_VIDEO_OV7251=m
# CONFIG_VIDEO_OV7640 is not set
CONFIG_VIDEO_OV7670=m
CONFIG_VIDEO_OV772X=m
# CONFIG_VIDEO_OV7740 is not set
CONFIG_VIDEO_OV8856=m
CONFIG_VIDEO_OV9282=m
# CONFIG_VIDEO_OV9640 is not set
CONFIG_VIDEO_OV9650=m
# CONFIG_VIDEO_OV9734 is not set
# CONFIG_VIDEO_RDACM20 is not set
CONFIG_VIDEO_RDACM21=m
CONFIG_VIDEO_RJ54N1=m
# CONFIG_VIDEO_S5K4ECGX is not set
CONFIG_VIDEO_S5K5BAF=m
CONFIG_VIDEO_S5K6A3=m
# CONFIG_VIDEO_S5K6AA is not set
CONFIG_VIDEO_SR030PC30=m
# CONFIG_VIDEO_VS6624 is not set
CONFIG_VIDEO_CCS=m
CONFIG_VIDEO_ET8EK8=m
CONFIG_VIDEO_M5MOLS=m
# end of Camera sensor devices

#
# Lens drivers
#
CONFIG_VIDEO_AD5820=m
# CONFIG_VIDEO_AK7375 is not set
CONFIG_VIDEO_DW9714=m
CONFIG_VIDEO_DW9768=m
CONFIG_VIDEO_DW9807_VCM=m
# end of Lens drivers

#
# Flash devices
#
CONFIG_VIDEO_ADP1653=m
CONFIG_VIDEO_LM3560=m
CONFIG_VIDEO_LM3646=m
# end of Flash devices

#
# Audio decoders, processors and mixers
#
CONFIG_VIDEO_CS3308=m
CONFIG_VIDEO_CS5345=m
# CONFIG_VIDEO_CS53L32A is not set
CONFIG_VIDEO_MSP3400=m
CONFIG_VIDEO_SONY_BTF_MPX=m
# CONFIG_VIDEO_TDA7432 is not set
CONFIG_VIDEO_TDA9840=m
CONFIG_VIDEO_TEA6415C=m
# CONFIG_VIDEO_TEA6420 is not set
# CONFIG_VIDEO_TLV320AIC23B is not set
CONFIG_VIDEO_TVAUDIO=m
CONFIG_VIDEO_UDA1342=m
# CONFIG_VIDEO_VP27SMPX is not set
# CONFIG_VIDEO_WM8739 is not set
CONFIG_VIDEO_WM8775=m
# end of Audio decoders, processors and mixers

#
# RDS decoders
#
# CONFIG_VIDEO_SAA6588 is not set
# end of RDS decoders

#
# Video decoders
#
CONFIG_VIDEO_ADV7180=m
CONFIG_VIDEO_ADV7183=m
# CONFIG_VIDEO_ADV748X is not set
CONFIG_VIDEO_ADV7604=m
# CONFIG_VIDEO_ADV7604_CEC is not set
CONFIG_VIDEO_ADV7842=m
# CONFIG_VIDEO_ADV7842_CEC is not set
# CONFIG_VIDEO_BT819 is not set
# CONFIG_VIDEO_BT856 is not set
CONFIG_VIDEO_BT866=m
CONFIG_VIDEO_ISL7998X=m
# CONFIG_VIDEO_KS0127 is not set
# CONFIG_VIDEO_MAX9286 is not set
CONFIG_VIDEO_ML86V7667=m
CONFIG_VIDEO_SAA7110=m
CONFIG_VIDEO_SAA711X=m
# CONFIG_VIDEO_TC358743 is not set
# CONFIG_VIDEO_TVP514X is not set
CONFIG_VIDEO_TVP5150=m
CONFIG_VIDEO_TVP7002=m
CONFIG_VIDEO_TW2804=m
CONFIG_VIDEO_TW9903=m
CONFIG_VIDEO_TW9906=m
CONFIG_VIDEO_TW9910=m
CONFIG_VIDEO_VPX3220=m

#
# Video and audio decoders
#
CONFIG_VIDEO_SAA717X=m
CONFIG_VIDEO_CX25840=m
# end of Video decoders

#
# Video encoders
#
# CONFIG_VIDEO_AD9389B is not set
CONFIG_VIDEO_ADV7170=m
# CONFIG_VIDEO_ADV7175 is not set
CONFIG_VIDEO_ADV7343=m
# CONFIG_VIDEO_ADV7393 is not set
# CONFIG_VIDEO_ADV7511 is not set
CONFIG_VIDEO_AK881X=m
# CONFIG_VIDEO_SAA7127 is not set
CONFIG_VIDEO_SAA7185=m
# CONFIG_VIDEO_THS8200 is not set
# end of Video encoders

#
# Video improvement chips
#
# CONFIG_VIDEO_UPD64031A is not set
CONFIG_VIDEO_UPD64083=m
# end of Video improvement chips

#
# Audio/Video compression chips
#
CONFIG_VIDEO_SAA6752HS=m
# end of Audio/Video compression chips

#
# SDR tuner chips
#
CONFIG_SDR_MAX2175=m
# end of SDR tuner chips

#
# Miscellaneous helper chips
#
# CONFIG_VIDEO_I2C is not set
# CONFIG_VIDEO_M52790 is not set
# CONFIG_VIDEO_ST_MIPID02 is not set
CONFIG_VIDEO_THS7303=m
# end of Miscellaneous helper chips

CONFIG_MEDIA_TUNER=m

#
# Customize TV tuners
#
CONFIG_MEDIA_TUNER_E4000=m
# CONFIG_MEDIA_TUNER_FC0011 is not set
CONFIG_MEDIA_TUNER_FC0012=m
CONFIG_MEDIA_TUNER_FC0013=m
CONFIG_MEDIA_TUNER_FC2580=m
CONFIG_MEDIA_TUNER_IT913X=m
# CONFIG_MEDIA_TUNER_M88RS6000T is not set
CONFIG_MEDIA_TUNER_MAX2165=m
CONFIG_MEDIA_TUNER_MC44S803=m
CONFIG_MEDIA_TUNER_MT2060=m
# CONFIG_MEDIA_TUNER_MT2063 is not set
CONFIG_MEDIA_TUNER_MT20XX=m
CONFIG_MEDIA_TUNER_MT2131=m
CONFIG_MEDIA_TUNER_MT2266=m
CONFIG_MEDIA_TUNER_MXL301RF=m
CONFIG_MEDIA_TUNER_MXL5005S=m
# CONFIG_MEDIA_TUNER_MXL5007T is not set
# CONFIG_MEDIA_TUNER_QM1D1B0004 is not set
CONFIG_MEDIA_TUNER_QM1D1C0042=m
CONFIG_MEDIA_TUNER_QT1010=m
# CONFIG_MEDIA_TUNER_R820T is not set
CONFIG_MEDIA_TUNER_SI2157=m
CONFIG_MEDIA_TUNER_SIMPLE=m
CONFIG_MEDIA_TUNER_TDA18212=m
CONFIG_MEDIA_TUNER_TDA18218=m
CONFIG_MEDIA_TUNER_TDA18250=m
CONFIG_MEDIA_TUNER_TDA18271=m
CONFIG_MEDIA_TUNER_TDA827X=m
# CONFIG_MEDIA_TUNER_TDA8290 is not set
CONFIG_MEDIA_TUNER_TDA9887=m
# CONFIG_MEDIA_TUNER_TEA5761 is not set
CONFIG_MEDIA_TUNER_TEA5767=m
CONFIG_MEDIA_TUNER_TUA9001=m
CONFIG_MEDIA_TUNER_XC2028=m
CONFIG_MEDIA_TUNER_XC4000=m
# CONFIG_MEDIA_TUNER_XC5000 is not set
# end of Customize TV tuners

#
# Customise DVB Frontends
#

#
# Multistandard (satellite) frontends
#
CONFIG_DVB_M88DS3103=m
# CONFIG_DVB_MXL5XX is not set
CONFIG_DVB_STB0899=m
# CONFIG_DVB_STB6100 is not set
# CONFIG_DVB_STV090x is not set
CONFIG_DVB_STV0910=m
CONFIG_DVB_STV6110x=m
CONFIG_DVB_STV6111=m

#
# Multistandard (cable + terrestrial) frontends
#
# CONFIG_DVB_DRXK is not set
# CONFIG_DVB_MN88472 is not set
CONFIG_DVB_MN88473=m
# CONFIG_DVB_SI2165 is not set
# CONFIG_DVB_TDA18271C2DD is not set

#
# DVB-S (satellite) frontends
#
CONFIG_DVB_CX24110=m
CONFIG_DVB_CX24116=m
CONFIG_DVB_CX24117=m
# CONFIG_DVB_CX24120 is not set
# CONFIG_DVB_CX24123 is not set
# CONFIG_DVB_DS3000 is not set
CONFIG_DVB_MB86A16=m
# CONFIG_DVB_MT312 is not set
# CONFIG_DVB_S5H1420 is not set
CONFIG_DVB_SI21XX=m
CONFIG_DVB_STB6000=m
# CONFIG_DVB_STV0288 is not set
CONFIG_DVB_STV0299=m
# CONFIG_DVB_STV0900 is not set
CONFIG_DVB_STV6110=m
# CONFIG_DVB_TDA10071 is not set
CONFIG_DVB_TDA10086=m
# CONFIG_DVB_TDA8083 is not set
CONFIG_DVB_TDA8261=m
# CONFIG_DVB_TDA826X is not set
CONFIG_DVB_TS2020=m
CONFIG_DVB_TUA6100=m
# CONFIG_DVB_TUNER_CX24113 is not set
# CONFIG_DVB_TUNER_ITD1000 is not set
CONFIG_DVB_VES1X93=m
CONFIG_DVB_ZL10036=m
# CONFIG_DVB_ZL10039 is not set

#
# DVB-T (terrestrial) frontends
#
# CONFIG_DVB_AF9013 is not set
CONFIG_DVB_CX22700=m
CONFIG_DVB_CX22702=m
# CONFIG_DVB_CXD2820R is not set
# CONFIG_DVB_CXD2841ER is not set
CONFIG_DVB_DIB3000MB=m
CONFIG_DVB_DIB3000MC=m
CONFIG_DVB_DIB7000M=m
CONFIG_DVB_DIB7000P=m
CONFIG_DVB_DIB9000=m
CONFIG_DVB_DRXD=m
CONFIG_DVB_EC100=m
# CONFIG_DVB_L64781 is not set
CONFIG_DVB_MT352=m
# CONFIG_DVB_NXT6000 is not set
CONFIG_DVB_RTL2830=m
# CONFIG_DVB_RTL2832 is not set
# CONFIG_DVB_S5H1432 is not set
CONFIG_DVB_SI2168=m
CONFIG_DVB_SP887X=m
CONFIG_DVB_STV0367=m
CONFIG_DVB_TDA10048=m
CONFIG_DVB_TDA1004X=m
CONFIG_DVB_ZD1301_DEMOD=m
CONFIG_DVB_ZL10353=m

#
# DVB-C (cable) frontends
#
# CONFIG_DVB_STV0297 is not set
CONFIG_DVB_TDA10021=m
# CONFIG_DVB_TDA10023 is not set
CONFIG_DVB_VES1820=m

#
# ATSC (North American/Korean Terrestrial/Cable DTV) frontends
#
CONFIG_DVB_AU8522=m
CONFIG_DVB_AU8522_DTV=m
# CONFIG_DVB_AU8522_V4L is not set
CONFIG_DVB_BCM3510=m
CONFIG_DVB_LG2160=m
CONFIG_DVB_LGDT3305=m
CONFIG_DVB_LGDT3306A=m
# CONFIG_DVB_LGDT330X is not set
CONFIG_DVB_MXL692=m
CONFIG_DVB_NXT200X=m
# CONFIG_DVB_OR51132 is not set
CONFIG_DVB_OR51211=m
# CONFIG_DVB_S5H1409 is not set
CONFIG_DVB_S5H1411=m

#
# ISDB-T (terrestrial) frontends
#
CONFIG_DVB_DIB8000=m
# CONFIG_DVB_MB86A20S is not set
CONFIG_DVB_S921=m

#
# ISDB-S (satellite) & ISDB-T (terrestrial) frontends
#
CONFIG_DVB_MN88443X=m
CONFIG_DVB_TC90522=m

#
# Digital terrestrial only tuners/PLL
#
CONFIG_DVB_PLL=m
CONFIG_DVB_TUNER_DIB0070=m
CONFIG_DVB_TUNER_DIB0090=m

#
# SEC control devices for DVB-S
#
CONFIG_DVB_A8293=m
CONFIG_DVB_AF9033=m
CONFIG_DVB_ASCOT2E=m
# CONFIG_DVB_ATBM8830 is not set
CONFIG_DVB_HELENE=m
CONFIG_DVB_HORUS3A=m
CONFIG_DVB_ISL6405=m
# CONFIG_DVB_ISL6421 is not set
CONFIG_DVB_ISL6423=m
CONFIG_DVB_IX2505V=m
CONFIG_DVB_LGS8GL5=m
# CONFIG_DVB_LGS8GXX is not set
CONFIG_DVB_LNBH25=m
CONFIG_DVB_LNBH29=m
CONFIG_DVB_LNBP21=m
CONFIG_DVB_LNBP22=m
CONFIG_DVB_M88RS2000=m
CONFIG_DVB_TDA665x=m
CONFIG_DVB_DRX39XYJ=m

#
# Common Interface (EN50221) controller drivers
#
CONFIG_DVB_CXD2099=m
CONFIG_DVB_SP2=m
# end of Customise DVB Frontends

#
# Tools to develop new frontends
#
CONFIG_DVB_DUMMY_FE=m
# end of Media ancillary drivers

#
# Graphics support
#
CONFIG_APERTURE_HELPERS=y
# CONFIG_IMX_IPUV3_CORE is not set
# CONFIG_DRM is not set
# CONFIG_DRM_DEBUG_MODESET_LOCK is not set

#
# ARM devices
#
# end of ARM devices

#
# Frame buffer Devices
#
CONFIG_FB_CMDLINE=y
CONFIG_FB_NOTIFY=y
CONFIG_FB=m
# CONFIG_FIRMWARE_EDID is not set
CONFIG_FB_CFB_FILLRECT=m
CONFIG_FB_CFB_COPYAREA=m
CONFIG_FB_CFB_IMAGEBLIT=m
CONFIG_FB_SYS_FILLRECT=m
CONFIG_FB_SYS_COPYAREA=m
CONFIG_FB_SYS_IMAGEBLIT=m
CONFIG_FB_FOREIGN_ENDIAN=y
# CONFIG_FB_BOTH_ENDIAN is not set
# CONFIG_FB_BIG_ENDIAN is not set
CONFIG_FB_LITTLE_ENDIAN=y
CONFIG_FB_SYS_FOPS=m
CONFIG_FB_DEFERRED_IO=y
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CLPS711X is not set
# CONFIG_FB_IMX is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_PVR2 is not set
CONFIG_FB_OPENCORES=m
CONFIG_FB_S1D13XXX=m
# CONFIG_FB_ATMEL is not set
# CONFIG_FB_PXA168 is not set
# CONFIG_FB_W100 is not set
# CONFIG_FB_SH_MOBILE_LCDC is not set
# CONFIG_FB_TMIO is not set
# CONFIG_FB_S3C is not set
CONFIG_FB_IBM_GXT4500=m
CONFIG_FB_GOLDFISH=m
# CONFIG_FB_DA8XX is not set
CONFIG_FB_VIRTUAL=m
CONFIG_FB_METRONOME=m
# CONFIG_FB_BROADSHEET is not set
CONFIG_FB_SIMPLE=m
# CONFIG_FB_SSD1307 is not set
# CONFIG_FB_OMAP2 is not set
# CONFIG_MMP_DISP is not set
# end of Frame buffer Devices

#
# Backlight & LCD device support
#
CONFIG_LCD_CLASS_DEVICE=m
CONFIG_LCD_PLATFORM=m
CONFIG_BACKLIGHT_CLASS_DEVICE=m
# CONFIG_BACKLIGHT_KTD253 is not set
# CONFIG_BACKLIGHT_OMAP1 is not set
CONFIG_BACKLIGHT_QCOM_WLED=m
CONFIG_BACKLIGHT_RT4831=m
CONFIG_BACKLIGHT_ADP8860=m
# CONFIG_BACKLIGHT_ADP8870 is not set
# CONFIG_BACKLIGHT_PCF50633 is not set
CONFIG_BACKLIGHT_LM3639=m
# CONFIG_BACKLIGHT_SKY81452 is not set
# CONFIG_BACKLIGHT_GPIO is not set
CONFIG_BACKLIGHT_LV5207LP=m
CONFIG_BACKLIGHT_BD6107=m
CONFIG_BACKLIGHT_ARCXCNN=m
# CONFIG_BACKLIGHT_LED is not set
# end of Backlight & LCD device support

CONFIG_HDMI=y
CONFIG_LOGO=y
CONFIG_LOGO_LINUX_MONO=y
# CONFIG_LOGO_LINUX_VGA16 is not set
CONFIG_LOGO_LINUX_CLUT224=y
# end of Graphics support

CONFIG_SOUND=y
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_JACK=y
# CONFIG_SND_OSSEMUL is not set
CONFIG_SND_PCM_TIMER=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_MAX_CARDS=32
CONFIG_SND_SUPPORT_OLD_API=y
CONFIG_SND_PROC_FS=y
CONFIG_SND_VERBOSE_PROCFS=y
CONFIG_SND_VERBOSE_PRINTK=y
# CONFIG_SND_CTL_FAST_LOOKUP is not set
# CONFIG_SND_DEBUG is not set
# CONFIG_SND_CTL_INPUT_VALIDATION is not set
# CONFIG_SND_SEQUENCER is not set
CONFIG_SND_DRIVERS=y
# CONFIG_SND_DUMMY is not set
CONFIG_SND_ALOOP=m
# CONFIG_SND_MTPAV is not set
CONFIG_SND_MTS64=m
# CONFIG_SND_SERIAL_U16550 is not set
CONFIG_SND_SERIAL_GENERIC=m
# CONFIG_SND_MPU401 is not set
# CONFIG_SND_PORTMAN2X4 is not set

#
# HD-Audio
#
# end of HD-Audio

CONFIG_SND_HDA_PREALLOC_SIZE=64
# CONFIG_SND_SOC is not set
CONFIG_SND_VIRTIO=m
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
# CONFIG_USB_SUPPORT is not set
CONFIG_MMC=m
CONFIG_PWRSEQ_EMMC=m
CONFIG_PWRSEQ_SIMPLE=m
CONFIG_SDIO_UART=m
# CONFIG_MMC_TEST is not set

#
# MMC/SD/SDIO Host Controller Drivers
#
CONFIG_MMC_DEBUG=y
CONFIG_MMC_SDHCI=m
CONFIG_MMC_SDHCI_PLTFM=m
# CONFIG_MMC_SDHCI_OF_ARASAN is not set
# CONFIG_MMC_SDHCI_OF_ASPEED is not set
# CONFIG_MMC_SDHCI_OF_AT91 is not set
# CONFIG_MMC_SDHCI_OF_ESDHC is not set
# CONFIG_MMC_SDHCI_OF_DWCMSHC is not set
# CONFIG_MMC_SDHCI_OF_SPARX5 is not set
CONFIG_MMC_SDHCI_CADENCE=m
# CONFIG_MMC_SDHCI_CNS3XXX is not set
# CONFIG_MMC_SDHCI_ESDHC_IMX is not set
# CONFIG_MMC_SDHCI_DOVE is not set
# CONFIG_MMC_SDHCI_TEGRA is not set
# CONFIG_MMC_SDHCI_S3C is not set
# CONFIG_MMC_SDHCI_PXAV3 is not set
# CONFIG_MMC_SDHCI_PXAV2 is not set
# CONFIG_MMC_SDHCI_SPEAR is not set
# CONFIG_MMC_SDHCI_BCM_KONA is not set
# CONFIG_MMC_SDHCI_F_SDH30 is not set
CONFIG_MMC_SDHCI_MILBEAUT=m
# CONFIG_MMC_SDHCI_IPROC is not set
# CONFIG_MMC_MESON_GX is not set
# CONFIG_MMC_MESON_MX_SDHC is not set
# CONFIG_MMC_MESON_MX_SDIO is not set
# CONFIG_MMC_MOXART is not set
# CONFIG_MMC_SDHCI_ST is not set
# CONFIG_MMC_OMAP_HS is not set
# CONFIG_MMC_SDHCI_MSM is not set
# CONFIG_MMC_DAVINCI is not set
# CONFIG_MMC_S3C is not set
# CONFIG_MMC_SDHCI_SPRD is not set
# CONFIG_MMC_TMIO is not set
# CONFIG_MMC_SDHI is not set
# CONFIG_MMC_UNIPHIER is not set
CONFIG_MMC_DW=m
CONFIG_MMC_DW_PLTFM=m
# CONFIG_MMC_DW_BLUEFIELD is not set
# CONFIG_MMC_DW_EXYNOS is not set
# CONFIG_MMC_DW_HI3798CV200 is not set
CONFIG_MMC_DW_K3=m
# CONFIG_MMC_SH_MMCIF is not set
CONFIG_MMC_USDHI6ROL0=m
CONFIG_MMC_CQHCI=m
# CONFIG_MMC_HSQ is not set
# CONFIG_MMC_BCM2835 is not set
CONFIG_MMC_MTK=m
# CONFIG_MMC_SDHCI_XENON is not set
# CONFIG_MMC_SDHCI_OMAP is not set
# CONFIG_MMC_SDHCI_AM654 is not set
# CONFIG_MMC_OWL is not set
CONFIG_MMC_LITEX=m
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=m
CONFIG_LEDS_CLASS_FLASH=m
CONFIG_LEDS_CLASS_MULTICOLOR=m
CONFIG_LEDS_BRIGHTNESS_HW_CHANGED=y

#
# LED drivers
#
CONFIG_LEDS_AN30259A=m
# CONFIG_LEDS_ARIEL is not set
CONFIG_LEDS_AW2013=m
# CONFIG_LEDS_BCM6328 is not set
# CONFIG_LEDS_BCM6358 is not set
# CONFIG_LEDS_TURRIS_OMNIA is not set
CONFIG_LEDS_LM3530=m
CONFIG_LEDS_LM3532=m
CONFIG_LEDS_LM3642=m
CONFIG_LEDS_LM3692X=m
# CONFIG_LEDS_S3C24XX is not set
# CONFIG_LEDS_COBALT_QUBE is not set
CONFIG_LEDS_GPIO=m
# CONFIG_LEDS_LP3944 is not set
CONFIG_LEDS_LP3952=m
CONFIG_LEDS_LP50XX=m
CONFIG_LEDS_LP55XX_COMMON=m
CONFIG_LEDS_LP5521=m
CONFIG_LEDS_LP5523=m
# CONFIG_LEDS_LP5562 is not set
# CONFIG_LEDS_LP8501 is not set
# CONFIG_LEDS_LP8860 is not set
# CONFIG_LEDS_PCA955X is not set
CONFIG_LEDS_PCA963X=m
# CONFIG_LEDS_REGULATOR is not set
CONFIG_LEDS_BD2802=m
# CONFIG_LEDS_LT3593 is not set
# CONFIG_LEDS_MC13783 is not set
CONFIG_LEDS_NS2=m
CONFIG_LEDS_NETXBIG=m
CONFIG_LEDS_TCA6507=m
# CONFIG_LEDS_TLC591XX is not set
CONFIG_LEDS_MAX77650=m
CONFIG_LEDS_LM355x=m
# CONFIG_LEDS_OT200 is not set
CONFIG_LEDS_MENF21BMC=m
CONFIG_LEDS_IS31FL319X=m
CONFIG_LEDS_IS31FL32XX=m

#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
# CONFIG_LEDS_BLINKM is not set
CONFIG_LEDS_MLXREG=m
CONFIG_LEDS_USER=m
# CONFIG_LEDS_TI_LMU_COMMON is not set
# CONFIG_LEDS_IP30 is not set
# CONFIG_LEDS_BCM63138 is not set
# CONFIG_LEDS_LGM is not set

#
# Flash and Torch LED drivers
#
CONFIG_LEDS_AAT1290=m
CONFIG_LEDS_AS3645A=m
CONFIG_LEDS_KTD2692=m
# CONFIG_LEDS_LM3601X is not set
CONFIG_LEDS_MAX77693=m
CONFIG_LEDS_RT4505=m
# CONFIG_LEDS_RT8515 is not set
CONFIG_LEDS_SGM3140=m

#
# RGB LED drivers
#

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
# CONFIG_LEDS_TRIGGER_TIMER is not set
CONFIG_LEDS_TRIGGER_ONESHOT=m
# CONFIG_LEDS_TRIGGER_MTD is not set
CONFIG_LEDS_TRIGGER_HEARTBEAT=m
CONFIG_LEDS_TRIGGER_BACKLIGHT=y
# CONFIG_LEDS_TRIGGER_CPU is not set
# CONFIG_LEDS_TRIGGER_ACTIVITY is not set
CONFIG_LEDS_TRIGGER_GPIO=m
CONFIG_LEDS_TRIGGER_DEFAULT_ON=m

#
# iptables trigger is under Netfilter config (LED target)
#
# CONFIG_LEDS_TRIGGER_TRANSIENT is not set
# CONFIG_LEDS_TRIGGER_CAMERA is not set
# CONFIG_LEDS_TRIGGER_PANIC is not set
CONFIG_LEDS_TRIGGER_PATTERN=m
CONFIG_LEDS_TRIGGER_AUDIO=y
CONFIG_LEDS_TRIGGER_TTY=m

#
# Simple LED drivers
#
CONFIG_ACCESSIBILITY=y

#
# Speakup console speech
#
# end of Speakup console speech

CONFIG_RTC_LIB=y
# CONFIG_RTC_CLASS is not set
CONFIG_DMADEVICES=y
# CONFIG_DMADEVICES_DEBUG is not set

#
# DMA Devices
#
CONFIG_DMA_ENGINE=y
CONFIG_DMA_VIRTUAL_CHANNELS=y
CONFIG_DMA_OF=y
# CONFIG_ALTERA_MSGDMA is not set
# CONFIG_APPLE_ADMAC is not set
# CONFIG_AXI_DMAC is not set
# CONFIG_DMA_JZ4780 is not set
# CONFIG_DMA_SA11X0 is not set
# CONFIG_DW_AXI_DMAC is not set
# CONFIG_EP93XX_DMA is not set
# CONFIG_FSL_EDMA is not set
# CONFIG_IMG_MDC_DMA is not set
CONFIG_INTEL_IDMA64=m
# CONFIG_INTEL_IOP_ADMA is not set
# CONFIG_K3_DMA is not set
# CONFIG_MCF_EDMA is not set
# CONFIG_MILBEAUT_HDMAC is not set
# CONFIG_MILBEAUT_XDMAC is not set
# CONFIG_MMP_PDMA is not set
# CONFIG_MMP_TDMA is not set
# CONFIG_MV_XOR is not set
# CONFIG_MXS_DMA is not set
# CONFIG_NBPFAXI_DMA is not set
# CONFIG_STM32_DMA is not set
# CONFIG_STM32_DMAMUX is not set
# CONFIG_STM32_MDMA is not set
# CONFIG_SPRD_DMA is not set
# CONFIG_S3C24XX_DMAC is not set
# CONFIG_TEGRA20_APB_DMA is not set
# CONFIG_TEGRA210_ADMA is not set
# CONFIG_TIMB_DMA is not set
# CONFIG_UNIPHIER_MDMAC is not set
# CONFIG_UNIPHIER_XDMAC is not set
# CONFIG_XGENE_DMA is not set
# CONFIG_XILINX_ZYNQMP_DMA is not set
CONFIG_XILINX_ZYNQMP_DPDMA=m
# CONFIG_MTK_HSDMA is not set
# CONFIG_MTK_CQDMA is not set
# CONFIG_QCOM_ADM is not set
CONFIG_QCOM_HIDMA_MGMT=y
CONFIG_QCOM_HIDMA=y
CONFIG_DW_DMAC_CORE=y
CONFIG_DW_DMAC=y
# CONFIG_RZN1_DMAMUX is not set
CONFIG_SF_PDMA=y
CONFIG_RENESAS_DMA=y
CONFIG_SH_DMAE_BASE=y
# CONFIG_SH_DMAE is not set
# CONFIG_RCAR_DMAC is not set
# CONFIG_RENESAS_USB_DMAC is not set
# CONFIG_RZ_DMAC is not set
CONFIG_TI_EDMA=y
CONFIG_DMA_OMAP=y
CONFIG_TI_DMA_CROSSBAR=y
# CONFIG_INTEL_LDMA is not set

#
# DMA Clients
#
CONFIG_ASYNC_TX_DMA=y
CONFIG_DMATEST=y
CONFIG_DMA_ENGINE_RAID=y

#
# DMABUF options
#
CONFIG_SYNC_FILE=y
CONFIG_SW_SYNC=y
# CONFIG_UDMABUF is not set
CONFIG_DMABUF_MOVE_NOTIFY=y
CONFIG_DMABUF_DEBUG=y
# CONFIG_DMABUF_SELFTESTS is not set
# CONFIG_DMABUF_HEAPS is not set
# CONFIG_DMABUF_SYSFS_STATS is not set
# end of DMABUF options

CONFIG_AUXDISPLAY=y
CONFIG_CHARLCD=y
# CONFIG_LINEDISP is not set
CONFIG_HD44780_COMMON=y
# CONFIG_HD44780 is not set
# CONFIG_IMG_ASCII_LCD is not set
CONFIG_LCD2S=m
CONFIG_PARPORT_PANEL=y
CONFIG_PANEL_PARPORT=0
CONFIG_PANEL_PROFILE=5
CONFIG_PANEL_CHANGE_MESSAGE=y
CONFIG_PANEL_BOOT_MESSAGE=""
# CONFIG_CHARLCD_BL_OFF is not set
CONFIG_CHARLCD_BL_ON=y
# CONFIG_CHARLCD_BL_FLASH is not set
CONFIG_PANEL=m
CONFIG_UIO=y
# CONFIG_UIO_PDRV_GENIRQ is not set
CONFIG_UIO_DMEM_GENIRQ=m
CONFIG_UIO_PRUSS=y
CONFIG_UIO_DFL=m
CONFIG_VFIO=m
CONFIG_VFIO_NOIOMMU=y
# CONFIG_VFIO_PLATFORM is not set
# CONFIG_VFIO_MDEV is not set
# CONFIG_VIRT_DRIVERS is not set
CONFIG_VIRTIO_ANCHOR=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_MENU=y
CONFIG_VIRTIO_BALLOON=y
CONFIG_VIRTIO_MMIO=m
CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y
# CONFIG_VHOST_MENU is not set

#
# Microsoft Hyper-V guest support
#
# end of Microsoft Hyper-V guest support

# CONFIG_GREYBUS is not set
CONFIG_COMEDI=y
# CONFIG_COMEDI_DEBUG is not set
CONFIG_COMEDI_DEFAULT_BUF_SIZE_KB=2048
CONFIG_COMEDI_DEFAULT_BUF_MAXSIZE_KB=20480
CONFIG_COMEDI_MISC_DRIVERS=y
CONFIG_COMEDI_BOND=m
CONFIG_COMEDI_TEST=y
# CONFIG_COMEDI_PARPORT is not set
# CONFIG_COMEDI_SSV_DNP is not set
CONFIG_COMEDI_ISA_DRIVERS=y
CONFIG_COMEDI_PCL711=m
CONFIG_COMEDI_PCL724=y
CONFIG_COMEDI_PCL726=y
# CONFIG_COMEDI_PCL730 is not set
# CONFIG_COMEDI_PCL812 is not set
CONFIG_COMEDI_PCL816=m
CONFIG_COMEDI_PCL818=m
CONFIG_COMEDI_PCM3724=y
# CONFIG_COMEDI_AMPLC_DIO200_ISA is not set
# CONFIG_COMEDI_AMPLC_PC236_ISA is not set
CONFIG_COMEDI_AMPLC_PC263_ISA=y
CONFIG_COMEDI_RTI800=m
CONFIG_COMEDI_RTI802=m
CONFIG_COMEDI_DAC02=m
CONFIG_COMEDI_DAS16M1=y
CONFIG_COMEDI_DAS08_ISA=m
CONFIG_COMEDI_DAS16=y
CONFIG_COMEDI_DAS800=y
CONFIG_COMEDI_DAS1800=m
CONFIG_COMEDI_DAS6402=y
CONFIG_COMEDI_DT2801=m
# CONFIG_COMEDI_DT2811 is not set
# CONFIG_COMEDI_DT2814 is not set
CONFIG_COMEDI_DT2815=m
CONFIG_COMEDI_DT2817=m
# CONFIG_COMEDI_DT282X is not set
CONFIG_COMEDI_DMM32AT=y
# CONFIG_COMEDI_FL512 is not set
# CONFIG_COMEDI_AIO_AIO12_8 is not set
CONFIG_COMEDI_AIO_IIRO_16=m
CONFIG_COMEDI_II_PCI20KC=m
# CONFIG_COMEDI_C6XDIGIO is not set
CONFIG_COMEDI_MPC624=m
CONFIG_COMEDI_ADQ12B=y
# CONFIG_COMEDI_NI_AT_A2150 is not set
CONFIG_COMEDI_NI_AT_AO=m
CONFIG_COMEDI_NI_ATMIO=y
CONFIG_COMEDI_NI_ATMIO16D=m
CONFIG_COMEDI_NI_LABPC_ISA=m
CONFIG_COMEDI_PCMAD=y
# CONFIG_COMEDI_PCMDA12 is not set
CONFIG_COMEDI_PCMMIO=y
CONFIG_COMEDI_PCMUIO=y
# CONFIG_COMEDI_MULTIQ3 is not set
CONFIG_COMEDI_S526=m
CONFIG_COMEDI_8254=y
CONFIG_COMEDI_8255=y
CONFIG_COMEDI_8255_SA=y
CONFIG_COMEDI_KCOMEDILIB=y
CONFIG_COMEDI_DAS08=m
CONFIG_COMEDI_NI_LABPC=m
CONFIG_COMEDI_NI_TIO=y
CONFIG_COMEDI_NI_ROUTING=y
CONFIG_COMEDI_TESTS=m
CONFIG_COMEDI_TESTS_EXAMPLE=m
# CONFIG_COMEDI_TESTS_NI_ROUTES is not set
# CONFIG_STAGING is not set
CONFIG_GOLDFISH=y
# CONFIG_GOLDFISH_PIPE is not set
# CONFIG_CHROME_PLATFORMS is not set
# CONFIG_MELLANOX_PLATFORM is not set
# CONFIG_OLPC_XO175 is not set
CONFIG_SURFACE_PLATFORMS=y
CONFIG_HAVE_CLK=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y

#
# Clock driver for ARM Reference designs
#
# CONFIG_CLK_ICST is not set
# CONFIG_CLK_SP810 is not set
# end of Clock driver for ARM Reference designs

# CONFIG_CLK_HSDK is not set
# CONFIG_COMMON_CLK_APPLE_NCO is not set
CONFIG_COMMON_CLK_MAX77686=m
CONFIG_COMMON_CLK_MAX9485=m
CONFIG_COMMON_CLK_RK808=m
# CONFIG_COMMON_CLK_HI655X is not set
# CONFIG_COMMON_CLK_SCMI is not set
# CONFIG_COMMON_CLK_SCPI is not set
CONFIG_COMMON_CLK_SI5341=m
CONFIG_COMMON_CLK_SI5351=m
CONFIG_COMMON_CLK_SI514=m
CONFIG_COMMON_CLK_SI544=m
CONFIG_COMMON_CLK_SI570=m
# CONFIG_COMMON_CLK_BM1880 is not set
CONFIG_COMMON_CLK_CDCE706=m
# CONFIG_COMMON_CLK_TPS68470 is not set
CONFIG_COMMON_CLK_CDCE925=m
CONFIG_COMMON_CLK_CS2000_CP=m
# CONFIG_COMMON_CLK_EN7523 is not set
# CONFIG_COMMON_CLK_FSL_FLEXSPI is not set
# CONFIG_COMMON_CLK_FSL_SAI is not set
# CONFIG_COMMON_CLK_GEMINI is not set
# CONFIG_COMMON_CLK_LAN966X is not set
# CONFIG_COMMON_CLK_ASPEED is not set
# CONFIG_COMMON_CLK_S2MPS11 is not set
CONFIG_COMMON_CLK_AXI_CLKGEN=m
# CONFIG_CLK_QORIQ is not set
# CONFIG_CLK_LS1028A_PLLDIG is not set
# CONFIG_COMMON_CLK_XGENE is not set
# CONFIG_COMMON_CLK_OXNAS is not set
CONFIG_COMMON_CLK_RS9_PCIE=m
# CONFIG_COMMON_CLK_VC5 is not set
CONFIG_COMMON_CLK_VC7=m
# CONFIG_COMMON_CLK_MMP2_AUDIO is not set
# CONFIG_COMMON_CLK_FIXED_MMIO is not set
# CONFIG_CLK_ACTIONS is not set
# CONFIG_CLK_BAIKAL_T1 is not set
# CONFIG_CLK_BCM2711_DVP is not set
# CONFIG_CLK_BCM2835 is not set
# CONFIG_CLK_BCM_63XX is not set
# CONFIG_CLK_BCM_63XX_GATE is not set
# CONFIG_CLK_BCM_KONA is not set
# CONFIG_CLK_BCM_CYGNUS is not set
# CONFIG_CLK_BCM_HR2 is not set
# CONFIG_CLK_BCM_NSP is not set
# CONFIG_CLK_BCM_NS2 is not set
# CONFIG_CLK_BCM_SR is not set
# CONFIG_CLK_RASPBERRYPI is not set
# CONFIG_COMMON_CLK_HI3516CV300 is not set
# CONFIG_COMMON_CLK_HI3519 is not set
# CONFIG_COMMON_CLK_HI3559A is not set
# CONFIG_COMMON_CLK_HI3660 is not set
# CONFIG_COMMON_CLK_HI3670 is not set
# CONFIG_COMMON_CLK_HI3798CV200 is not set
# CONFIG_COMMON_CLK_HI6220 is not set
# CONFIG_RESET_HISI is not set
# CONFIG_COMMON_CLK_BOSTON is not set
# CONFIG_MXC_CLK is not set
# CONFIG_CLK_IMX8MM is not set
# CONFIG_CLK_IMX8MN is not set
# CONFIG_CLK_IMX8MP is not set
# CONFIG_CLK_IMX8MQ is not set
# CONFIG_CLK_IMX8ULP is not set
# CONFIG_CLK_IMX93 is not set

#
# Ingenic SoCs drivers
#
# CONFIG_INGENIC_CGU_JZ4740 is not set
# CONFIG_INGENIC_CGU_JZ4725B is not set
# CONFIG_INGENIC_CGU_JZ4760 is not set
# CONFIG_INGENIC_CGU_JZ4770 is not set
# CONFIG_INGENIC_CGU_JZ4780 is not set
# CONFIG_INGENIC_CGU_X1000 is not set
# CONFIG_INGENIC_CGU_X1830 is not set
# CONFIG_INGENIC_TCU_CLK is not set
# end of Ingenic SoCs drivers

# CONFIG_COMMON_CLK_KEYSTONE is not set
# CONFIG_TI_SYSCON_CLK is not set

#
# Clock driver for MediaTek SoC
#
# CONFIG_COMMON_CLK_MT2701 is not set
# CONFIG_COMMON_CLK_MT2712 is not set
# CONFIG_COMMON_CLK_MT6765 is not set
# CONFIG_COMMON_CLK_MT6779 is not set
# CONFIG_COMMON_CLK_MT6795 is not set
# CONFIG_COMMON_CLK_MT6797 is not set
# CONFIG_COMMON_CLK_MT7622 is not set
# CONFIG_COMMON_CLK_MT7629 is not set
# CONFIG_COMMON_CLK_MT7986 is not set
# CONFIG_COMMON_CLK_MT8135 is not set
# CONFIG_COMMON_CLK_MT8167 is not set
# CONFIG_COMMON_CLK_MT8173 is not set
# CONFIG_COMMON_CLK_MT8183 is not set
# CONFIG_COMMON_CLK_MT8186 is not set
# CONFIG_COMMON_CLK_MT8192 is not set
# CONFIG_COMMON_CLK_MT8195 is not set
# CONFIG_COMMON_CLK_MT8365 is not set
# CONFIG_COMMON_CLK_MT8516 is not set
# end of Clock driver for MediaTek SoC

#
# Clock support for Amlogic platforms
#
# end of Clock support for Amlogic platforms

# CONFIG_MSTAR_MSC313_MPLL is not set
# CONFIG_MCHP_CLK_MPFS is not set
# CONFIG_COMMON_CLK_PISTACHIO is not set
# CONFIG_COMMON_CLK_QCOM is not set
# CONFIG_CLK_MT7621 is not set
# CONFIG_CLK_RENESAS is not set
# CONFIG_COMMON_CLK_SAMSUNG is not set
# CONFIG_S3C2410_COMMON_CLK is not set
# CONFIG_S3C2412_COMMON_CLK is not set
# CONFIG_S3C2443_COMMON_CLK is not set
# CONFIG_CLK_SIFIVE is not set
# CONFIG_CLK_INTEL_SOCFPGA is not set
# CONFIG_SPRD_COMMON_CLK is not set
# CONFIG_CLK_STARFIVE_JH7100 is not set
CONFIG_CLK_SUNXI=y
CONFIG_CLK_SUNXI_CLOCKS=y
CONFIG_CLK_SUNXI_PRCM_SUN6I=y
CONFIG_CLK_SUNXI_PRCM_SUN8I=y
CONFIG_CLK_SUNXI_PRCM_SUN9I=y
# CONFIG_SUNXI_CCU is not set
# CONFIG_COMMON_CLK_TI_ADPLL is not set
# CONFIG_CLK_UNIPHIER is not set
# CONFIG_COMMON_CLK_VISCONTI is not set
# CONFIG_CLK_LGM_CGU is not set
# CONFIG_XILINX_VCU is not set
# CONFIG_COMMON_CLK_XLNX_CLKWZRD is not set
# CONFIG_COMMON_CLK_ZYNQMP is not set
CONFIG_HWSPINLOCK=y
# CONFIG_HWSPINLOCK_OMAP is not set
# CONFIG_HWSPINLOCK_QCOM is not set
# CONFIG_HWSPINLOCK_SPRD is not set
# CONFIG_HWSPINLOCK_STM32 is not set
# CONFIG_HWSPINLOCK_SUN6I is not set
# CONFIG_HSEM_U8500 is not set

#
# Clock Source drivers
#
CONFIG_TIMER_OF=y
CONFIG_TIMER_PROBE=y
CONFIG_CLKSRC_MMIO=y
# CONFIG_BCM2835_TIMER is not set
# CONFIG_BCM_KONA_TIMER is not set
# CONFIG_DAVINCI_TIMER is not set
# CONFIG_DIGICOLOR_TIMER is not set
# CONFIG_OMAP_DM_TIMER is not set
CONFIG_DW_APB_TIMER=y
CONFIG_DW_APB_TIMER_OF=y
# CONFIG_FTTMR010_TIMER is not set
# CONFIG_IXP4XX_TIMER is not set
# CONFIG_MESON6_TIMER is not set
# CONFIG_OWL_TIMER is not set
# CONFIG_RDA_TIMER is not set
# CONFIG_SUN4I_TIMER is not set
# CONFIG_SUN5I_HSTIMER is not set
# CONFIG_TEGRA_TIMER is not set
# CONFIG_VT8500_TIMER is not set
# CONFIG_NPCM7XX_TIMER is not set
# CONFIG_CADENCE_TTC_TIMER is not set
# CONFIG_ASM9260_TIMER is not set
# CONFIG_CLKSRC_DBX500_PRCMU is not set
# CONFIG_CLPS711X_TIMER is not set
# CONFIG_MXS_TIMER is not set
# CONFIG_NSPIRE_TIMER is not set
# CONFIG_INTEGRATOR_AP_TIMER is not set
# CONFIG_CLKSRC_PISTACHIO is not set
# CONFIG_CLKSRC_TI_32K is not set
# CONFIG_CLKSRC_STM32_LP is not set
# CONFIG_CLKSRC_MPS2 is not set
# CONFIG_ARC_TIMERS is not set
# CONFIG_ARM_TIMER_SP804 is not set
# CONFIG_ARMV7M_SYSTICK is not set
# CONFIG_ATMEL_PIT is not set
# CONFIG_ATMEL_ST is not set
# CONFIG_CLKSRC_SAMSUNG_PWM is not set
# CONFIG_FSL_FTM_TIMER is not set
# CONFIG_OXNAS_RPS_TIMER is not set
# CONFIG_MTK_TIMER is not set
# CONFIG_SPRD_TIMER is not set
# CONFIG_CLKSRC_JCORE_PIT is not set
# CONFIG_SH_TIMER_CMT is not set
# CONFIG_SH_TIMER_MTU2 is not set
# CONFIG_RENESAS_OSTM is not set
# CONFIG_SH_TIMER_TMU is not set
# CONFIG_EM_TIMER_STI is not set
# CONFIG_CLKSRC_VERSATILE is not set
# CONFIG_CLKSRC_PXA is not set
# CONFIG_TIMER_IMX_SYS_CTR is not set
# CONFIG_CLKSRC_ST_LPC is not set
# CONFIG_GXP_TIMER is not set
CONFIG_CSKY_MP_TIMER=y
# CONFIG_GX6605S_TIMER is not set
# CONFIG_MSC313E_TIMER is not set
# CONFIG_INGENIC_TIMER is not set
# CONFIG_INGENIC_SYSOST is not set
# CONFIG_INGENIC_OST is not set
CONFIG_MICROCHIP_PIT64B=y
# end of Clock Source drivers

# CONFIG_MAILBOX is not set
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y

#
# Generic IOMMU Pagetable Support
#
# CONFIG_IOMMU_IO_PGTABLE_ARMV7S is not set
# end of Generic IOMMU Pagetable Support

# CONFIG_IOMMU_DEBUGFS is not set
# CONFIG_IOMMU_DEFAULT_DMA_STRICT is not set
# CONFIG_IOMMU_DEFAULT_DMA_LAZY is not set
CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y
CONFIG_OF_IOMMU=y
# CONFIG_OMAP_IOMMU is not set
# CONFIG_ROCKCHIP_IOMMU is not set
# CONFIG_SUN50I_IOMMU is not set
# CONFIG_EXYNOS_IOMMU is not set
# CONFIG_S390_CCW_IOMMU is not set
# CONFIG_S390_AP_IOMMU is not set
# CONFIG_MTK_IOMMU is not set
# CONFIG_SPRD_IOMMU is not set

#
# Remoteproc drivers
#
CONFIG_REMOTEPROC=y
# CONFIG_REMOTEPROC_CDEV is not set
# CONFIG_INGENIC_VPU_RPROC is not set
# CONFIG_MTK_SCP is not set
# CONFIG_MESON_MX_AO_ARC_REMOTEPROC is not set
# CONFIG_RCAR_REMOTEPROC is not set
# end of Remoteproc drivers

#
# Rpmsg drivers
#
# CONFIG_RPMSG_VIRTIO is not set
# end of Rpmsg drivers

# CONFIG_SOUNDWIRE is not set

#
# SOC (System On Chip) specific Drivers
#

#
# Amlogic SoC drivers
#
# CONFIG_MESON_CANVAS is not set
# CONFIG_MESON_CLK_MEASURE is not set
# CONFIG_MESON_GX_SOCINFO is not set
# CONFIG_MESON_MX_SOCINFO is not set
# end of Amlogic SoC drivers

#
# Apple SoC drivers
#
# CONFIG_APPLE_SART is not set
# end of Apple SoC drivers

#
# ASPEED SoC drivers
#
# CONFIG_ASPEED_LPC_CTRL is not set
# CONFIG_ASPEED_LPC_SNOOP is not set
# CONFIG_ASPEED_UART_ROUTING is not set
# CONFIG_ASPEED_P2A_CTRL is not set
# CONFIG_ASPEED_SOCINFO is not set
# end of ASPEED SoC drivers

# CONFIG_AT91_SOC_ID is not set
# CONFIG_AT91_SOC_SFR is not set

#
# Broadcom SoC drivers
#
# CONFIG_BCM2835_POWER is not set
# CONFIG_SOC_BCM63XX is not set
# CONFIG_SOC_BRCMSTB is not set
# CONFIG_BCM_PMB is not set
# end of Broadcom SoC drivers

#
# NXP/Freescale QorIQ SoC drivers
#
# CONFIG_QUICC_ENGINE is not set
CONFIG_DPAA2_CONSOLE=y
# end of NXP/Freescale QorIQ SoC drivers

#
# fujitsu SoC drivers
#
# end of fujitsu SoC drivers

#
# i.MX SoC drivers
#
# CONFIG_SOC_IMX8M is not set
# CONFIG_SOC_IMX9 is not set
# end of i.MX SoC drivers

#
# IXP4xx SoC drivers
#
# CONFIG_IXP4XX_QMGR is not set
# CONFIG_IXP4XX_NPE is not set
# end of IXP4xx SoC drivers

#
# Enable LiteX SoC Builder specific drivers
#
CONFIG_LITEX=y
CONFIG_LITEX_SOC_CONTROLLER=y
# end of Enable LiteX SoC Builder specific drivers

#
# MediaTek SoC drivers
#
# CONFIG_MTK_CMDQ is not set
# CONFIG_MTK_DEVAPC is not set
# CONFIG_MTK_INFRACFG is not set
# CONFIG_MTK_SCPSYS is not set
# CONFIG_MTK_MMSYS is not set
# end of MediaTek SoC drivers

#
# Qualcomm SoC drivers
#
# CONFIG_QCOM_COMMAND_DB is not set
# CONFIG_QCOM_GENI_SE is not set
# CONFIG_QCOM_GSBI is not set
# CONFIG_QCOM_LLCC is not set
# CONFIG_QCOM_RPMH is not set
# CONFIG_QCOM_SMEM is not set
# CONFIG_QCOM_SPM is not set
# CONFIG_QCOM_ICC_BWMON is not set
# end of Qualcomm SoC drivers

# CONFIG_SOC_RENESAS is not set
# CONFIG_ROCKCHIP_GRF is not set
# CONFIG_ROCKCHIP_IODOMAIN is not set
# CONFIG_SOC_SAMSUNG is not set
# CONFIG_SOC_TEGRA20_VOLTAGE_COUPLER is not set
# CONFIG_SOC_TEGRA30_VOLTAGE_COUPLER is not set
# CONFIG_SOC_TI is not set
# CONFIG_UX500_SOC_ID is not set

#
# Xilinx SoC drivers
#
# end of Xilinx SoC drivers
# end of SOC (System On Chip) specific Drivers

CONFIG_PM_DEVFREQ=y

#
# DEVFREQ Governors
#
CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=y
CONFIG_DEVFREQ_GOV_PERFORMANCE=m
CONFIG_DEVFREQ_GOV_POWERSAVE=m
# CONFIG_DEVFREQ_GOV_USERSPACE is not set
CONFIG_DEVFREQ_GOV_PASSIVE=y

#
# DEVFREQ Drivers
#
# CONFIG_ARM_EXYNOS_BUS_DEVFREQ is not set
# CONFIG_ARM_IMX_BUS_DEVFREQ is not set
# CONFIG_ARM_TEGRA_DEVFREQ is not set
# CONFIG_ARM_MEDIATEK_CCI_DEVFREQ is not set
# CONFIG_ARM_SUN8I_A33_MBUS_DEVFREQ is not set
# CONFIG_PM_DEVFREQ_EVENT is not set
CONFIG_EXTCON=y

#
# Extcon Device Drivers
#
CONFIG_EXTCON_GPIO=m
# CONFIG_EXTCON_MAX14577 is not set
CONFIG_EXTCON_MAX3355=m
CONFIG_EXTCON_PTN5150=m
# CONFIG_EXTCON_QCOM_SPMI_MISC is not set
# CONFIG_EXTCON_RT8973A is not set
# CONFIG_EXTCON_SM5502 is not set
# CONFIG_EXTCON_USB_GPIO is not set
CONFIG_MEMORY=y
CONFIG_DDR=y
# CONFIG_ATMEL_SDRAMC is not set
# CONFIG_ATMEL_EBI is not set
# CONFIG_BRCMSTB_DPFE is not set
# CONFIG_BRCMSTB_MEMC is not set
# CONFIG_BT1_L2_CTL is not set
# CONFIG_TI_AEMIF is not set
# CONFIG_TI_EMIF is not set
# CONFIG_OMAP_GPMC is not set
# CONFIG_FPGA_DFL_EMIF is not set
# CONFIG_MVEBU_DEVBUS is not set
# CONFIG_FSL_CORENET_CF is not set
# CONFIG_FSL_IFC is not set
# CONFIG_JZ4780_NEMC is not set
# CONFIG_MTK_SMI is not set
# CONFIG_DA8XX_DDRCTL is not set
# CONFIG_RENESAS_RPCIF is not set
# CONFIG_STM32_FMC2_EBI is not set
# CONFIG_SAMSUNG_MC is not set
CONFIG_TEGRA_MC=y
CONFIG_TEGRA20_EMC=y
CONFIG_TEGRA30_EMC=y
CONFIG_TEGRA124_EMC=y
# CONFIG_TEGRA210_EMC is not set
# CONFIG_IIO is not set
# CONFIG_PWM is not set

#
# IRQ chip support
#
CONFIG_IRQCHIP=y
# CONFIG_AL_FIC is not set
CONFIG_DW_APB_ICTL=y
CONFIG_MADERA_IRQ=m
# CONFIG_JCORE_AIC is not set
# CONFIG_RENESAS_INTC_IRQPIN is not set
# CONFIG_RENESAS_IRQC is not set
# CONFIG_RENESAS_RZA1_IRQC is not set
# CONFIG_RENESAS_RZG2L_IRQC is not set
# CONFIG_SL28CPLD_INTC is not set
# CONFIG_TS4800_IRQ is not set
# CONFIG_XILINX_INTC is not set
# CONFIG_INGENIC_TCU_IRQ is not set
# CONFIG_IRQ_UNIPHIER_AIDET is not set
# CONFIG_MESON_IRQ_GPIO is not set
CONFIG_CSKY_MPINTC=y
CONFIG_CSKY_APB_INTC=y
# CONFIG_IMX_IRQSTEER is not set
# CONFIG_IMX_INTMUX is not set
# CONFIG_IMX_MU_MSI is not set
# CONFIG_EXYNOS_IRQ_COMBINER is not set
# CONFIG_MST_IRQ is not set
# CONFIG_MCHP_EIC is not set
# CONFIG_SUNPLUS_SP7021_INTC is not set
# end of IRQ chip support

CONFIG_IPACK_BUS=m
CONFIG_SERIAL_IPOCTAL=m
# CONFIG_RESET_CONTROLLER is not set

#
# PHY Subsystem
#
CONFIG_GENERIC_PHY=y
CONFIG_GENERIC_PHY_MIPI_DPHY=y
# CONFIG_PHY_LPC18XX_USB_OTG is not set
# CONFIG_PHY_PISTACHIO_USB is not set
# CONFIG_PHY_XGENE is not set
# CONFIG_PHY_CAN_TRANSCEIVER is not set
# CONFIG_PHY_MESON8_HDMI_TX is not set
# CONFIG_PHY_MESON_G12A_MIPI_DPHY_ANALOG is not set
# CONFIG_PHY_MESON_G12A_USB2 is not set
# CONFIG_PHY_MESON_G12A_USB3_PCIE is not set
# CONFIG_PHY_MESON_AXG_PCIE is not set
# CONFIG_PHY_MESON_AXG_MIPI_PCIE_ANALOG is not set
# CONFIG_PHY_MESON_AXG_MIPI_DPHY is not set

#
# PHY drivers for Broadcom platforms
#
# CONFIG_PHY_BCM63XX_USBH is not set
# CONFIG_PHY_CYGNUS_PCIE is not set
# CONFIG_PHY_BCM_SR_USB is not set
# CONFIG_BCM_KONA_USB2_PHY is not set
# CONFIG_PHY_BCM_NS_USB2 is not set
# CONFIG_PHY_NS2_USB_DRD is not set
# CONFIG_PHY_BRCM_SATA is not set
# CONFIG_PHY_BRCM_USB is not set
# CONFIG_PHY_BCM_SR_PCIE is not set
# end of PHY drivers for Broadcom platforms

# CONFIG_PHY_CADENCE_TORRENT is not set
CONFIG_PHY_CADENCE_DPHY=y
# CONFIG_PHY_CADENCE_DPHY_RX is not set
CONFIG_PHY_CADENCE_SALVO=m
# CONFIG_PHY_FSL_IMX8MQ_USB is not set
# CONFIG_PHY_MIXEL_LVDS_PHY is not set
# CONFIG_PHY_MIXEL_MIPI_DPHY is not set
# CONFIG_PHY_FSL_IMX8M_PCIE is not set
# CONFIG_PHY_FSL_LYNX_28G is not set
# CONFIG_PHY_HI6220_USB is not set
# CONFIG_PHY_HI3660_USB is not set
# CONFIG_PHY_HI3670_USB is not set
# CONFIG_PHY_HI3670_PCIE is not set
# CONFIG_PHY_HISTB_COMBPHY is not set
# CONFIG_PHY_HISI_INNO_USB2 is not set
# CONFIG_PHY_LANTIQ_VRX200_PCIE is not set
# CONFIG_PHY_LANTIQ_RCU_USB2 is not set
# CONFIG_ARMADA375_USBCLUSTER_PHY is not set
# CONFIG_PHY_BERLIN_SATA is not set
CONFIG_PHY_MVEBU_A3700_UTMI=y
# CONFIG_PHY_MVEBU_A38X_COMPHY is not set
CONFIG_PHY_PXA_28NM_HSIC=y
CONFIG_PHY_PXA_28NM_USB2=m
# CONFIG_PHY_PXA_USB is not set
# CONFIG_PHY_MMP3_USB is not set
# CONFIG_PHY_MMP3_HSIC is not set
# CONFIG_PHY_MTK_PCIE is not set
# CONFIG_PHY_MTK_TPHY is not set
# CONFIG_PHY_MTK_UFS is not set
# CONFIG_PHY_MTK_XSPHY is not set
# CONFIG_PHY_MTK_HDMI is not set
# CONFIG_PHY_MTK_MIPI_DSI is not set
# CONFIG_PHY_MTK_DP is not set
# CONFIG_PHY_SPARX5_SERDES is not set
CONFIG_PHY_LAN966X_SERDES=y
CONFIG_PHY_OCELOT_SERDES=y
# CONFIG_PHY_ATH79_USB is not set
# CONFIG_PHY_QCOM_EDP is not set
# CONFIG_PHY_QCOM_IPQ4019_USB is not set
# CONFIG_PHY_QCOM_PCIE2 is not set
# CONFIG_PHY_QCOM_QMP is not set
# CONFIG_PHY_QCOM_QUSB2 is not set
# CONFIG_PHY_QCOM_USB_SNPS_FEMTO_V2 is not set
# CONFIG_PHY_QCOM_USB_HS_28NM is not set
# CONFIG_PHY_QCOM_USB_SS is not set
# CONFIG_PHY_QCOM_IPQ806X_USB is not set
# CONFIG_PHY_MT7621_PCI is not set
# CONFIG_PHY_RALINK_USB is not set
# CONFIG_PHY_RCAR_GEN3_USB3 is not set
# CONFIG_PHY_ROCKCHIP_DPHY_RX0 is not set
# CONFIG_PHY_ROCKCHIP_INNO_HDMI is not set
# CONFIG_PHY_ROCKCHIP_INNO_CSIDPHY is not set
# CONFIG_PHY_ROCKCHIP_INNO_DSIDPHY is not set
# CONFIG_PHY_ROCKCHIP_PCIE is not set
# CONFIG_PHY_ROCKCHIP_SNPS_PCIE3 is not set
# CONFIG_PHY_ROCKCHIP_TYPEC is not set
# CONFIG_PHY_EXYNOS_DP_VIDEO is not set
# CONFIG_PHY_EXYNOS_MIPI_VIDEO is not set
# CONFIG_PHY_EXYNOS_PCIE is not set
# CONFIG_PHY_SAMSUNG_UFS is not set
# CONFIG_PHY_SAMSUNG_USB2 is not set
# CONFIG_PHY_UNIPHIER_USB2 is not set
# CONFIG_PHY_UNIPHIER_USB3 is not set
# CONFIG_PHY_UNIPHIER_PCIE is not set
# CONFIG_PHY_UNIPHIER_AHCI is not set
# CONFIG_PHY_ST_SPEAR1310_MIPHY is not set
# CONFIG_PHY_ST_SPEAR1340_MIPHY is not set
# CONFIG_PHY_STM32_USBPHYC is not set
# CONFIG_PHY_SUNPLUS_USB is not set
# CONFIG_PHY_TEGRA194_P2U is not set
# CONFIG_PHY_DA8XX_USB is not set
# CONFIG_PHY_AM654_SERDES is not set
# CONFIG_PHY_J721E_WIZ is not set
# CONFIG_OMAP_CONTROL_PHY is not set
# CONFIG_TI_PIPE3 is not set
# CONFIG_PHY_INTEL_KEEMBAY_EMMC is not set
# CONFIG_PHY_INTEL_KEEMBAY_USB is not set
# CONFIG_PHY_INTEL_LGM_COMBO is not set
# CONFIG_PHY_INTEL_LGM_EMMC is not set
# CONFIG_PHY_INTEL_THUNDERBAY_EMMC is not set
# CONFIG_PHY_XILINX_ZYNQMP is not set
# end of PHY Subsystem

CONFIG_POWERCAP=y
# CONFIG_DTPM is not set
# CONFIG_MCB is not set
# CONFIG_RAS is not set

#
# Android
#
# CONFIG_ANDROID_BINDER_IPC is not set
# end of Android

# CONFIG_DAX is not set
CONFIG_NVMEM=y
CONFIG_NVMEM_SYSFS=y
# CONFIG_NVMEM_APPLE_EFUSES is not set
# CONFIG_NVMEM_BCM_OCOTP is not set
# CONFIG_NVMEM_BRCM_NVRAM is not set
# CONFIG_NVMEM_IMX_IIM is not set
# CONFIG_NVMEM_IMX_OCOTP is not set
# CONFIG_NVMEM_JZ4780_EFUSE is not set
# CONFIG_NVMEM_LAN9662_OTPC is not set
# CONFIG_NVMEM_LAYERSCAPE_SFP is not set
# CONFIG_NVMEM_LPC18XX_EEPROM is not set
# CONFIG_NVMEM_LPC18XX_OTP is not set
# CONFIG_NVMEM_MESON_MX_EFUSE is not set
# CONFIG_NVMEM_MICROCHIP_OTPC is not set
# CONFIG_NVMEM_MTK_EFUSE is not set
# CONFIG_NVMEM_MXS_OCOTP is not set
# CONFIG_NVMEM_NINTENDO_OTP is not set
# CONFIG_NVMEM_QCOM_QFPROM is not set
CONFIG_NVMEM_RMEM=m
# CONFIG_NVMEM_ROCKCHIP_EFUSE is not set
# CONFIG_NVMEM_ROCKCHIP_OTP is not set
# CONFIG_NVMEM_SC27XX_EFUSE is not set
# CONFIG_NVMEM_SNVS_LPGPR is not set
# CONFIG_NVMEM_SPMI_SDAM is not set
# CONFIG_NVMEM_SPRD_EFUSE is not set
# CONFIG_NVMEM_STM32_ROMEM is not set
# CONFIG_NVMEM_SUNPLUS_OCOTP is not set
# CONFIG_NVMEM_U_BOOT_ENV is not set
# CONFIG_NVMEM_UNIPHIER_EFUSE is not set
# CONFIG_NVMEM_VF610_OCOTP is not set

#
# HW tracing support
#
# CONFIG_STM is not set
CONFIG_INTEL_TH=m
CONFIG_INTEL_TH_GTH=m
CONFIG_INTEL_TH_MSU=m
CONFIG_INTEL_TH_PTI=m
CONFIG_INTEL_TH_DEBUG=y
# end of HW tracing support

CONFIG_FPGA=y
# CONFIG_FPGA_MGR_SOCFPGA is not set
# CONFIG_FPGA_MGR_SOCFPGA_A10 is not set
CONFIG_ALTERA_PR_IP_CORE=y
CONFIG_ALTERA_PR_IP_CORE_PLAT=y
# CONFIG_FPGA_MGR_ZYNQ_FPGA is not set
CONFIG_FPGA_BRIDGE=m
CONFIG_ALTERA_FREEZE_BRIDGE=m
CONFIG_XILINX_PR_DECOUPLER=m
CONFIG_FPGA_REGION=m
# CONFIG_OF_FPGA_REGION is not set
CONFIG_FPGA_DFL=m
CONFIG_FPGA_DFL_AFU=m
CONFIG_FPGA_DFL_NIOS_INTEL_PAC_N3000=m
# CONFIG_FPGA_MGR_ZYNQMP_FPGA is not set
# CONFIG_FPGA_MGR_VERSAL_FPGA is not set
# CONFIG_FSI is not set
# CONFIG_TEE is not set
CONFIG_MULTIPLEXER=m

#
# Multiplexer drivers
#
# CONFIG_MUX_ADG792A is not set
# CONFIG_MUX_GPIO is not set
CONFIG_MUX_MMIO=m
# end of Multiplexer drivers

CONFIG_PM_OPP=y
# CONFIG_SIOX is not set
CONFIG_SLIMBUS=m
CONFIG_SLIM_QCOM_CTRL=m
CONFIG_INTERCONNECT=y
# CONFIG_INTERCONNECT_IMX is not set
# CONFIG_INTERCONNECT_QCOM_OSM_L3 is not set
# CONFIG_INTERCONNECT_SAMSUNG is not set
CONFIG_COUNTER=m
# CONFIG_104_QUAD_8 is not set
# CONFIG_INTERRUPT_CNT is not set
# CONFIG_STM32_TIMER_CNT is not set
# CONFIG_STM32_LPTIMER_CNT is not set
# CONFIG_TI_EQEP is not set
CONFIG_FTM_QUADDEC=m
CONFIG_MICROCHIP_TCB_CAPTURE=m
# CONFIG_TI_ECAP_CAPTURE is not set
CONFIG_MOST=y
CONFIG_MOST_CDEV=y
# CONFIG_MOST_SND is not set
CONFIG_PECI=y
CONFIG_PECI_CPU=y
# CONFIG_PECI_ASPEED is not set
# CONFIG_HTE is not set
# end of Device Drivers

#
# File systems
#
# CONFIG_VALIDATE_FS_PARSER is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
# CONFIG_EXPORTFS_BLOCK_OPS is not set
# CONFIG_FILE_LOCKING is not set
CONFIG_FS_ENCRYPTION=y
# CONFIG_FS_VERITY is not set
CONFIG_FSNOTIFY=y
# CONFIG_DNOTIFY is not set
# CONFIG_INOTIFY_USER is not set
CONFIG_FANOTIFY=y
CONFIG_QUOTA=y
CONFIG_PRINT_QUOTA_WARNING=y
CONFIG_QUOTA_DEBUG=y
# CONFIG_QFMT_V1 is not set
# CONFIG_QFMT_V2 is not set
CONFIG_QUOTACTL=y
# CONFIG_AUTOFS4_FS is not set
CONFIG_AUTOFS_FS=y
CONFIG_FUSE_FS=m
# CONFIG_CUSE is not set
CONFIG_VIRTIO_FS=m
CONFIG_OVERLAY_FS=y
CONFIG_OVERLAY_FS_REDIRECT_DIR=y
# CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW is not set
# CONFIG_OVERLAY_FS_INDEX is not set
CONFIG_OVERLAY_FS_METACOPY=y

#
# Caches
#
CONFIG_NETFS_SUPPORT=y
CONFIG_NETFS_STATS=y
CONFIG_FSCACHE=y
# CONFIG_FSCACHE_STATS is not set
# CONFIG_FSCACHE_DEBUG is not set
# end of Caches

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
# CONFIG_PROC_KCORE is not set
CONFIG_PROC_SYSCTL=y
# CONFIG_PROC_PAGE_MONITOR is not set
CONFIG_PROC_CHILDREN=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_CONFIGFS_FS=y
# end of Pseudo filesystems

CONFIG_MISC_FILESYSTEMS=y
CONFIG_ORANGEFS_FS=m
CONFIG_ECRYPT_FS=m
# CONFIG_ECRYPT_FS_MESSAGING is not set
# CONFIG_JFFS2_FS is not set
# CONFIG_CRAMFS is not set
CONFIG_ROMFS_FS=m
CONFIG_ROMFS_BACKED_BY_MTD=y
CONFIG_ROMFS_ON_MTD=y
# CONFIG_PSTORE is not set
# CONFIG_NLS is not set
CONFIG_UNICODE=y
# CONFIG_UNICODE_NORMALIZATION_SELFTEST is not set
# end of File systems

#
# Security options
#
CONFIG_KEYS=y
# CONFIG_KEYS_REQUEST_CACHE is not set
CONFIG_PERSISTENT_KEYRINGS=y
CONFIG_TRUSTED_KEYS=y
CONFIG_TRUSTED_KEYS_TPM=y
CONFIG_ENCRYPTED_KEYS=m
CONFIG_USER_DECRYPTED_DATA=y
CONFIG_KEY_DH_OPERATIONS=y
CONFIG_KEY_NOTIFICATIONS=y
# CONFIG_SECURITY_DMESG_RESTRICT is not set
# CONFIG_SECURITYFS is not set
# CONFIG_STATIC_USERMODEHELPER is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,bpf"

#
# Kernel hardening options
#

#
# Memory initialization
#
CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_BARE=y
CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
CONFIG_INIT_STACK_NONE=y
# CONFIG_INIT_STACK_ALL_PATTERN is not set
# CONFIG_INIT_STACK_ALL_ZERO is not set
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
CONFIG_INIT_ON_FREE_DEFAULT_ON=y
CONFIG_CC_HAS_ZERO_CALL_USED_REGS=y
CONFIG_ZERO_CALL_USED_REGS=y
# end of Memory initialization

CONFIG_RANDSTRUCT_NONE=y
# end of Kernel hardening options
# end of Security options

CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_FIPS=y
CONFIG_CRYPTO_FIPS_NAME="Linux Kernel Cryptographic API"
CONFIG_CRYPTO_FIPS_CUSTOM_VERSION=y
CONFIG_CRYPTO_FIPS_VERSION="(none)"
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_SKCIPHER=y
CONFIG_CRYPTO_SKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_AKCIPHER=y
CONFIG_CRYPTO_KPP2=y
CONFIG_CRYPTO_KPP=y
CONFIG_CRYPTO_ACOMP2=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y
CONFIG_CRYPTO_GF128MUL=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_NULL2=y
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_AUTHENC=m
# CONFIG_CRYPTO_TEST is not set
# end of Crypto core or helper

#
# Public-key cryptography
#
CONFIG_CRYPTO_RSA=y
CONFIG_CRYPTO_DH=y
CONFIG_CRYPTO_DH_RFC7919_GROUPS=y
CONFIG_CRYPTO_ECC=y
CONFIG_CRYPTO_ECDH=m
CONFIG_CRYPTO_ECDSA=y
CONFIG_CRYPTO_ECRDSA=y
CONFIG_CRYPTO_SM2=m
# CONFIG_CRYPTO_CURVE25519 is not set
# end of Public-key cryptography

#
# Block ciphers
#
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_AES_TI=m
# CONFIG_CRYPTO_ARIA is not set
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_BLOWFISH_COMMON=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAST_COMMON=y
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=y
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_FCRYPT=y
CONFIG_CRYPTO_SERPENT=y
CONFIG_CRYPTO_SM4=m
CONFIG_CRYPTO_SM4_GENERIC=m
CONFIG_CRYPTO_TWOFISH=y
CONFIG_CRYPTO_TWOFISH_COMMON=y
# end of Block ciphers

#
# Length-preserving ciphers and modes
#
# CONFIG_CRYPTO_ADIANTUM is not set
CONFIG_CRYPTO_CHACHA20=m
CONFIG_CRYPTO_CBC=m
CONFIG_CRYPTO_CFB=y
CONFIG_CRYPTO_CTR=y
# CONFIG_CRYPTO_CTS is not set
CONFIG_CRYPTO_ECB=y
# CONFIG_CRYPTO_HCTR2 is not set
CONFIG_CRYPTO_KEYWRAP=y
CONFIG_CRYPTO_LRW=y
# CONFIG_CRYPTO_OFB is not set
CONFIG_CRYPTO_PCBC=y
CONFIG_CRYPTO_XTS=y
# end of Length-preserving ciphers and modes

#
# AEAD (authenticated encryption with associated data) ciphers
#
CONFIG_CRYPTO_AEGIS128=m
CONFIG_CRYPTO_CHACHA20POLY1305=m
CONFIG_CRYPTO_CCM=y
CONFIG_CRYPTO_GCM=m
CONFIG_CRYPTO_SEQIV=y
# CONFIG_CRYPTO_ECHAINIV is not set
CONFIG_CRYPTO_ESSIV=m
# end of AEAD (authenticated encryption with associated data) ciphers

#
# Hashes, digests, and MACs
#
CONFIG_CRYPTO_BLAKE2B=y
# CONFIG_CRYPTO_CMAC is not set
CONFIG_CRYPTO_GHASH=y
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_MD4 is not set
CONFIG_CRYPTO_MD5=m
# CONFIG_CRYPTO_MICHAEL_MIC is not set
CONFIG_CRYPTO_POLY1305=m
CONFIG_CRYPTO_RMD160=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=y
# CONFIG_CRYPTO_SHA3 is not set
CONFIG_CRYPTO_SM3=m
CONFIG_CRYPTO_SM3_GENERIC=m
CONFIG_CRYPTO_STREEBOG=y
CONFIG_CRYPTO_VMAC=m
CONFIG_CRYPTO_WP512=y
CONFIG_CRYPTO_XCBC=y
CONFIG_CRYPTO_XXHASH=y
# end of Hashes, digests, and MACs

#
# CRCs (cyclic redundancy checks)
#
CONFIG_CRYPTO_CRC32C=m
CONFIG_CRYPTO_CRC32=m
CONFIG_CRYPTO_CRCT10DIF=m
CONFIG_CRYPTO_CRC64_ROCKSOFT=m
# end of CRCs (cyclic redundancy checks)

#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
CONFIG_CRYPTO_LZO=m
CONFIG_CRYPTO_842=m
CONFIG_CRYPTO_LZ4=y
CONFIG_CRYPTO_LZ4HC=m
CONFIG_CRYPTO_ZSTD=y
# end of Compression

#
# Random number generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
CONFIG_CRYPTO_DRBG_HASH=y
# CONFIG_CRYPTO_DRBG_CTR is not set
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
CONFIG_CRYPTO_KDF800108_CTR=y
# end of Random number generation

#
# Userspace interface
#
# end of Userspace interface

CONFIG_CRYPTO_HASH_INFO=y
# CONFIG_CRYPTO_HW is not set
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
CONFIG_PKCS8_PRIVATE_KEY_PARSER=y
CONFIG_PKCS7_MESSAGE_PARSER=y
# CONFIG_PKCS7_TEST_KEY is not set
# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set
# CONFIG_FIPS_SIGNATURE_SELFTEST is not set

#
# Certificates for signature checking
#
CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
# CONFIG_MODULE_SIG_KEY_TYPE_RSA is not set
CONFIG_MODULE_SIG_KEY_TYPE_ECDSA=y
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_SYSTEM_TRUSTED_KEYS=""
# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
CONFIG_SECONDARY_TRUSTED_KEYRING=y
CONFIG_SYSTEM_BLACKLIST_KEYRING=y
CONFIG_SYSTEM_BLACKLIST_HASH_LIST=""
# CONFIG_SYSTEM_REVOCATION_LIST is not set
# CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE is not set
# end of Certificates for signature checking

CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_LINEAR_RANGES=y
CONFIG_PACKING=y
CONFIG_BITREVERSE=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_CORDIC=m
# CONFIG_PRIME_NUMBERS is not set
CONFIG_RATIONAL=y
CONFIG_GENERIC_PCI_IOMAP=y

#
# Crypto library routines
#
CONFIG_CRYPTO_LIB_UTILS=y
CONFIG_CRYPTO_LIB_AES=y
CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m
CONFIG_CRYPTO_LIB_CHACHA=m
# CONFIG_CRYPTO_LIB_CURVE25519 is not set
CONFIG_CRYPTO_LIB_DES=m
CONFIG_CRYPTO_LIB_POLY1305_RSIZE=1
CONFIG_CRYPTO_LIB_POLY1305_GENERIC=m
# CONFIG_CRYPTO_LIB_POLY1305 is not set
# CONFIG_CRYPTO_LIB_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_LIB_SHA1=y
CONFIG_CRYPTO_LIB_SHA256=y
# end of Crypto library routines

# CONFIG_CRC_CCITT is not set
CONFIG_CRC16=y
# CONFIG_CRC_T10DIF is not set
CONFIG_CRC64_ROCKSOFT=m
CONFIG_CRC_ITU_T=y
CONFIG_CRC32=y
CONFIG_CRC32_SELFTEST=m
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
CONFIG_CRC64=m
CONFIG_CRC4=m
CONFIG_CRC7=y
# CONFIG_LIBCRC32C is not set
# CONFIG_CRC8 is not set
CONFIG_XXHASH=y
CONFIG_RANDOM32_SELFTEST=y
CONFIG_842_COMPRESS=m
CONFIG_842_DECOMPRESS=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=m
CONFIG_LZO_DECOMPRESS=m
CONFIG_LZ4_COMPRESS=y
CONFIG_LZ4HC_COMPRESS=m
CONFIG_LZ4_DECOMPRESS=y
CONFIG_ZSTD_COMMON=y
CONFIG_ZSTD_COMPRESS=y
CONFIG_ZSTD_DECOMPRESS=y
CONFIG_XZ_DEC=m
CONFIG_XZ_DEC_X86=y
# CONFIG_XZ_DEC_POWERPC is not set
# CONFIG_XZ_DEC_IA64 is not set
# CONFIG_XZ_DEC_ARM is not set
CONFIG_XZ_DEC_ARMTHUMB=y
# CONFIG_XZ_DEC_SPARC is not set
CONFIG_XZ_DEC_MICROLZMA=y
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_DECOMPRESS_ZSTD=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_BCH=m
CONFIG_BCH_CONST_PARAMS=y
CONFIG_INTERVAL_TREE=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_DMA_DECLARE_COHERENT=y
CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE=y
CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU=y
CONFIG_ARCH_HAS_DMA_PREP_COHERENT=y
CONFIG_DMA_NONCOHERENT_MMAP=y
CONFIG_DMA_COHERENT_POOL=y
CONFIG_DMA_DIRECT_REMAP=y
# CONFIG_DMA_API_DEBUG is not set
CONFIG_DMA_MAP_BENCHMARK=y
CONFIG_SGL_ALLOC=y
CONFIG_GENERIC_ATOMIC64=y
CONFIG_CLZ_TAB=y
# CONFIG_IRQ_POLL is not set
CONFIG_MPILIB=y
CONFIG_LIBFDT=y
CONFIG_OID_REGISTRY=y
CONFIG_HAVE_GENERIC_VDSO=y
CONFIG_GENERIC_GETTIMEOFDAY=y
CONFIG_GENERIC_VDSO_32=y
CONFIG_FONT_SUPPORT=m
CONFIG_FONT_8x16=y
CONFIG_FONT_AUTOSELECT=y
CONFIG_STACKDEPOT=y
# CONFIG_PARMAN is not set
# CONFIG_OBJAGG is not set
# end of Library routines

CONFIG_GENERIC_IOREMAP=y
CONFIG_GENERIC_LIB_ASHLDI3=y
CONFIG_GENERIC_LIB_ASHRDI3=y
CONFIG_GENERIC_LIB_LSHRDI3=y
CONFIG_GENERIC_LIB_MULDI3=y
CONFIG_GENERIC_LIB_CMPDI2=y
CONFIG_GENERIC_LIB_UCMPDI2=y
CONFIG_ASN1_ENCODER=y

#
# Kernel hacking
#

#
# printk and dmesg options
#
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_CONSOLE_LOGLEVEL_QUIET=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_SYMBOLIC_ERRNAME=y
CONFIG_DEBUG_BUGVERBOSE=y
# end of printk and dmesg options

CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_MISC is not set

#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
CONFIG_AS_HAS_NON_CONST_LEB128=y
# CONFIG_DEBUG_INFO_NONE is not set
# CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is not set
CONFIG_DEBUG_INFO_DWARF4=y
# CONFIG_DEBUG_INFO_DWARF5 is not set
CONFIG_DEBUG_INFO_REDUCED=y
# CONFIG_DEBUG_INFO_COMPRESSED is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
CONFIG_PAHOLE_HAS_SPLIT_BTF=y
CONFIG_GDB_SCRIPTS=y
CONFIG_FRAME_WARN=1024
# CONFIG_STRIP_ASM_SYMS is not set
CONFIG_READABLE_ASM=y
# CONFIG_HEADERS_INSTALL is not set
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_VMLINUX_MAP=y
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
# end of Compile-time checks and compiler options

#
# Generic Kernel Debugging Instruments
#
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
# CONFIG_MAGIC_SYSRQ_SERIAL is not set
CONFIG_DEBUG_FS=y
# CONFIG_DEBUG_FS_ALLOW_ALL is not set
CONFIG_DEBUG_FS_DISALLOW_MOUNT=y
# CONFIG_DEBUG_FS_ALLOW_NONE is not set
CONFIG_UBSAN=y
CONFIG_CC_HAS_UBSAN_BOUNDS=y
# CONFIG_UBSAN_BOUNDS is not set
CONFIG_UBSAN_SHIFT=y
# CONFIG_UBSAN_DIV_ZERO is not set
CONFIG_UBSAN_UNREACHABLE=y
CONFIG_UBSAN_BOOL=y
CONFIG_UBSAN_ENUM=y
CONFIG_TEST_UBSAN=m
CONFIG_HAVE_KCSAN_COMPILER=y
# end of Generic Kernel Debugging Instruments

#
# Networking Debugging
#
# end of Networking Debugging

#
# Memory Debugging
#
CONFIG_PAGE_EXTENSION=y
CONFIG_DEBUG_PAGEALLOC=y
# CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT is not set
CONFIG_PAGE_OWNER=y
CONFIG_PAGE_POISONING=y
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_SELFTEST=y
# CONFIG_DEBUG_OBJECTS_FREE is not set
CONFIG_DEBUG_OBJECTS_TIMERS=y
# CONFIG_DEBUG_OBJECTS_WORK is not set
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
# CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER is not set
CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
CONFIG_SHRINKER_DEBUG=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_SCHED_STACK_END_CHECK is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_MEMORY_INIT is not set
CONFIG_DEBUG_KMAP_LOCAL=y
CONFIG_DEBUG_HIGHMEM=y
CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
# end of Memory Debugging

# CONFIG_DEBUG_SHIRQ is not set

#
# Debug Oops, Lockups and Hangs
#
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=0
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
# CONFIG_DETECT_HUNG_TASK is not set
# CONFIG_WQ_WATCHDOG is not set
CONFIG_TEST_LOCKUP=m
# end of Debug Oops, Lockups and Hangs

#
# Scheduler Debugging
#
# CONFIG_SCHED_DEBUG is not set
# CONFIG_SCHEDSTATS is not set
# end of Scheduler Debugging

CONFIG_DEBUG_TIMEKEEPING=y
# CONFIG_DEBUG_PREEMPT is not set

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_LOCK_DEBUGGING_SUPPORT=y
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
CONFIG_DEBUG_RWSEMS=y
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_DEBUG_ATOMIC_SLEEP is not set
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
CONFIG_LOCK_TORTURE_TEST=y
CONFIG_WW_MUTEX_SELFTEST=m
# CONFIG_SCF_TORTURE_TEST is not set
# end of Lock Debugging (spinlocks, mutexes, etc...)

# CONFIG_DEBUG_IRQFLAGS is not set
CONFIG_STACKTRACE=y
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
# CONFIG_DEBUG_KOBJECT is not set
# CONFIG_DEBUG_KOBJECT_RELEASE is not set
CONFIG_HAVE_DEBUG_BUGVERBOSE=y

#
# Debug kernel data structures
#
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_PLIST is not set
CONFIG_DEBUG_SG=y
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_BUG_ON_DATA_CORRUPTION is not set
CONFIG_DEBUG_MAPLE_TREE=y
# end of Debug kernel data structures

CONFIG_DEBUG_CREDENTIALS=y

#
# RCU Debugging
#
CONFIG_TORTURE_TEST=y
CONFIG_RCU_SCALE_TEST=y
CONFIG_RCU_TORTURE_TEST=m
CONFIG_RCU_REF_SCALE_TEST=m
CONFIG_RCU_CPU_STALL_TIMEOUT=21
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# end of RCU Debugging

CONFIG_DEBUG_WQ_FORCE_RR_CPU=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_TRACING_SUPPORT=y
# CONFIG_FTRACE is not set
# CONFIG_SAMPLES is not set

#
# csky Debugging
#
# end of csky Debugging

#
# Kernel Testing and Coverage
#
# CONFIG_KUNIT is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
CONFIG_FAULT_INJECTION=y
CONFIG_FAIL_PAGE_ALLOC=y
# CONFIG_FAULT_INJECTION_USERCOPY is not set
CONFIG_FAIL_FUTEX=y
# CONFIG_FAULT_INJECTION_DEBUG_FS is not set
CONFIG_CC_HAS_SANCOV_TRACE_PC=y
# CONFIG_RUNTIME_TESTING_MENU is not set
# end of Kernel Testing and Coverage

#
# Rust hacking
#
# end of Rust hacking

# CONFIG_WARN_MISSING_DOCUMENTS is not set
# CONFIG_WARN_ABI_ERRORS is not set
# end of Kernel hacking

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2022-12-01  2:53 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-21 10:05 [net-next] bpf: avoid the multi checking xiangxia.m.yue
2022-11-21 10:05 ` [net-next] bpf: avoid hashtab deadlock with try_lock xiangxia.m.yue
2022-11-21 20:19   ` Jakub Kicinski
2022-11-22  1:15   ` Hou Tao
2022-11-22  3:12     ` Tonghao Zhang
2022-11-22  4:01       ` Hou Tao
2022-11-22  4:06         ` Hou Tao
2022-11-24 12:57           ` Tonghao Zhang
2022-11-24 14:13             ` Hou Tao
2022-11-28  3:15               ` Tonghao Zhang
2022-11-28 21:55                 ` Hao Luo
2022-11-29  4:32                   ` Hou Tao
2022-11-29  6:06                     ` Tonghao Zhang
2022-11-29  7:56                       ` Hou Tao
2022-11-29 12:45                       ` Hou Tao
2022-11-29 16:06                         ` Waiman Long
2022-11-29 17:23                           ` Boqun Feng
2022-11-29 17:32                             ` Boqun Feng
2022-11-29 19:36                               ` Hao Luo
2022-11-29 21:13                                 ` Waiman Long
2022-11-30  1:50                                 ` Hou Tao
2022-11-30  2:47                                   ` Tonghao Zhang
2022-11-30  3:06                                     ` Waiman Long
2022-11-30  3:32                                       ` Tonghao Zhang
2022-11-30  4:07                                         ` Waiman Long
2022-11-30  4:13                                     ` Hou Tao
2022-11-30  5:02                                       ` Hao Luo
2022-11-30  5:56                                         ` Tonghao Zhang
2022-11-30  5:55                                       ` Tonghao Zhang
2022-12-01  2:53                                         ` Hou Tao
2022-11-30  1:37                             ` Hou Tao
2022-11-22 22:16 ` [net-next] bpf: avoid the multi checking Daniel Borkmann
2022-11-23  0:06 [net-next] bpf: avoid hashtab deadlock with try_lock kernel test robot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.