All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] kprobes,lib: kretprobe scalability improvement
@ 2022-11-01  1:43 wuqiang
  2022-11-01 12:27 ` kernel test robot
  2022-11-02  2:30 ` [PATCH v4] " wuqiang
  0 siblings, 2 replies; 23+ messages in thread
From: wuqiang @ 2022-11-01  1:43 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, ast, akpm
  Cc: linux-kernel, mattwu, wuqiang

kretprobe is using freelist to manage return-instances, but freelist,
as LIFO queue based on singly linked list, scales badly and reduces
the overall throughput of kretprobed routines, especially for high
contention scenarios.

Here's a typical throughput test of sys_flock (counts in 10 seconds,
measured with perf stat -a -I 10000 -e syscalls:sys_enter_flock):

OS: Debian 10 X86_64, Linux 6.1rc2
HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s

      1X       2X       4X       6X       8X      12X      16X
34762430 36546920 17949900 13101899 12569595 12646601 14729195
     24X      32X      48X      64X      72X      96X     128X
19263546 10102064  8985418 11936495 11493980  7127789  9330985

This patch implements a scalable, lock-less and numa-aware object pool,
which brings near-linear scalability to kretprobed routines. Tests of
kretprobe throughput show the biggest ratio as 333.9x of the original
freelist. Here's the comparison:

                  1X         2X         4X         8X        16X
freelist:   34762430   36546920   17949900   12569595   14729195
objpool:    35627544   72182095  144068494  287564688  576903916
                 32X        48X        64X        96X       128X
freelist:   10102064    8985418   11936495    7127789    9330985
objpool:  1158876372 1737828164 2324371724 2380310472 2463182819

Tests on 96-core ARM64 system output similar results, with biggest
ratio up to 642.2x:

OS: Debian 10 AARCH64, Linux 6.1rc2
HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s

                  1X         2X         4X         8X        16X
freelist:   17498299   10887037   10224710    8499132    6421751
objpool:    18715726   35549845   71615884  144258971  283707220
                 24X        32X        48X        64X        96X
freelist:    5339868    4819116    3593919    3121575    2687167
objpool:   419830913  571609748  877456139 1143316315 1725668029

The object pool is a scalable implementation of high performance queue
for objects allocation and reclamation, such as kretprobe instances.

With leveraging percpu ring-array to mitigate the hot spots of memory
contention, it could deliver near-linear scalability for high parallel
scenarios. The ring-array is compactly managed in a single cacheline
for most cases or continuous cachelines if more than 4 instances are
pre-allocated for every core.

Changes since V2:
1) the percpu-extended version of the freelist replaced by new percpu-
   ring-array. freelist has data-contention in freelist_node (refs and
   next) even after node is removed from freelist and the node could
   be polluted easily (with freelist_node defined as union)
2) routines split to objpool.h and objpool.c, the latter moved to lib
3) test module (test_objpool.ko) added to lib for functional testings

Changes since V1:
1) reformat to a single patch as Masami Hiramatsu suggested
2) use __vmalloc_node to replace vmalloc_node for vmalloc
3) a few minor fixes: typo and coding-style issues

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 include/linux/freelist.h |  129 -----
 include/linux/kprobes.h  |    9 +-
 include/linux/objpool.h  |  151 ++++++
 include/linux/rethook.h  |   15 +-
 kernel/kprobes.c         |   95 ++--
 kernel/trace/fprobe.c    |   17 +-
 kernel/trace/rethook.c   |   80 +--
 lib/Kconfig.debug        |   11 +
 lib/Makefile             |    4 +-
 lib/objpool.c            |  480 ++++++++++++++++++
 lib/test_objpool.c       | 1031 ++++++++++++++++++++++++++++++++++++++
 11 files changed, 1772 insertions(+), 250 deletions(-)
 delete mode 100644 include/linux/freelist.h
 create mode 100644 include/linux/objpool.h
 create mode 100644 lib/objpool.c
 create mode 100644 lib/test_objpool.c

diff --git a/include/linux/freelist.h b/include/linux/freelist.h
deleted file mode 100644
index fc1842b96469..000000000000
--- a/include/linux/freelist.h
+++ /dev/null
@@ -1,129 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */
-#ifndef FREELIST_H
-#define FREELIST_H
-
-#include <linux/atomic.h>
-
-/*
- * Copyright: cameron@moodycamel.com
- *
- * A simple CAS-based lock-free free list. Not the fastest thing in the world
- * under heavy contention, but simple and correct (assuming nodes are never
- * freed until after the free list is destroyed), and fairly speedy under low
- * contention.
- *
- * Adapted from: https://moodycamel.com/blog/2014/solving-the-aba-problem-for-lock-free-free-lists
- */
-
-struct freelist_node {
-	atomic_t		refs;
-	struct freelist_node	*next;
-};
-
-struct freelist_head {
-	struct freelist_node	*head;
-};
-
-#define REFS_ON_FREELIST 0x80000000
-#define REFS_MASK	 0x7FFFFFFF
-
-static inline void __freelist_add(struct freelist_node *node, struct freelist_head *list)
-{
-	/*
-	 * Since the refcount is zero, and nobody can increase it once it's
-	 * zero (except us, and we run only one copy of this method per node at
-	 * a time, i.e. the single thread case), then we know we can safely
-	 * change the next pointer of the node; however, once the refcount is
-	 * back above zero, then other threads could increase it (happens under
-	 * heavy contention, when the refcount goes to zero in between a load
-	 * and a refcount increment of a node in try_get, then back up to
-	 * something non-zero, then the refcount increment is done by the other
-	 * thread) -- so if the CAS to add the node to the actual list fails,
-	 * decrese the refcount and leave the add operation to the next thread
-	 * who puts the refcount back to zero (which could be us, hence the
-	 * loop).
-	 */
-	struct freelist_node *head = READ_ONCE(list->head);
-
-	for (;;) {
-		WRITE_ONCE(node->next, head);
-		atomic_set_release(&node->refs, 1);
-
-		if (!try_cmpxchg_release(&list->head, &head, node)) {
-			/*
-			 * Hmm, the add failed, but we can only try again when
-			 * the refcount goes back to zero.
-			 */
-			if (atomic_fetch_add_release(REFS_ON_FREELIST - 1, &node->refs) == 1)
-				continue;
-		}
-		return;
-	}
-}
-
-static inline void freelist_add(struct freelist_node *node, struct freelist_head *list)
-{
-	/*
-	 * We know that the should-be-on-freelist bit is 0 at this point, so
-	 * it's safe to set it using a fetch_add.
-	 */
-	if (!atomic_fetch_add_release(REFS_ON_FREELIST, &node->refs)) {
-		/*
-		 * Oh look! We were the last ones referencing this node, and we
-		 * know we want to add it to the free list, so let's do it!
-		 */
-		__freelist_add(node, list);
-	}
-}
-
-static inline struct freelist_node *freelist_try_get(struct freelist_head *list)
-{
-	struct freelist_node *prev, *next, *head = smp_load_acquire(&list->head);
-	unsigned int refs;
-
-	while (head) {
-		prev = head;
-		refs = atomic_read(&head->refs);
-		if ((refs & REFS_MASK) == 0 ||
-		    !atomic_try_cmpxchg_acquire(&head->refs, &refs, refs+1)) {
-			head = smp_load_acquire(&list->head);
-			continue;
-		}
-
-		/*
-		 * Good, reference count has been incremented (it wasn't at
-		 * zero), which means we can read the next and not worry about
-		 * it changing between now and the time we do the CAS.
-		 */
-		next = READ_ONCE(head->next);
-		if (try_cmpxchg_acquire(&list->head, &head, next)) {
-			/*
-			 * Yay, got the node. This means it was on the list,
-			 * which means should-be-on-freelist must be false no
-			 * matter the refcount (because nobody else knows it's
-			 * been taken off yet, it can't have been put back on).
-			 */
-			WARN_ON_ONCE(atomic_read(&head->refs) & REFS_ON_FREELIST);
-
-			/*
-			 * Decrease refcount twice, once for our ref, and once
-			 * for the list's ref.
-			 */
-			atomic_fetch_add(-2, &head->refs);
-
-			return head;
-		}
-
-		/*
-		 * OK, the head must have changed on us, but we still need to decrement
-		 * the refcount we increased.
-		 */
-		refs = atomic_fetch_add(-1, &prev->refs);
-		if (refs == REFS_ON_FREELIST + 1)
-			__freelist_add(prev, list);
-	}
-
-	return NULL;
-}
-
-#endif /* FREELIST_H */
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index a0b92be98984..f13f01e600c2 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -27,7 +27,7 @@
 #include <linux/mutex.h>
 #include <linux/ftrace.h>
 #include <linux/refcount.h>
-#include <linux/freelist.h>
+#include <linux/objpool.h>
 #include <linux/rethook.h>
 #include <asm/kprobes.h>
 
@@ -141,6 +141,7 @@ static inline bool kprobe_ftrace(struct kprobe *p)
  */
 struct kretprobe_holder {
 	struct kretprobe	*rp;
+	struct objpool_head	oh;
 	refcount_t		ref;
 };
 
@@ -154,7 +155,6 @@ struct kretprobe {
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
 	struct rethook *rh;
 #else
-	struct freelist_head freelist;
 	struct kretprobe_holder *rph;
 #endif
 };
@@ -165,10 +165,7 @@ struct kretprobe_instance {
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
 	struct rethook_node node;
 #else
-	union {
-		struct freelist_node freelist;
-		struct rcu_head rcu;
-	};
+	struct rcu_head rcu;
 	struct llist_node llist;
 	struct kretprobe_holder *rph;
 	kprobe_opcode_t *ret_addr;
diff --git a/include/linux/objpool.h b/include/linux/objpool.h
new file mode 100644
index 000000000000..0b746187482a
--- /dev/null
+++ b/include/linux/objpool.h
@@ -0,0 +1,151 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_OBJPOOL_H
+#define _LINUX_OBJPOOL_H
+
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/atomic.h>
+
+/*
+ * objpool: ring-array based lockless MPMC/FIFO queues
+ *
+ * Copyright: wuqiang.matt@bytedance.com
+ *
+ * The object pool is a scalable implementaion of high performance queue
+ * for objects allocation and reclamation, such as kretprobe instances.
+ *
+ * With leveraging per-cpu ring-array to mitigate the hot spots of memory
+ * contention, it could deliver near-linear scalability for high parallel
+ * cases. Meanwhile, it also achieves high throughput with benifiting from
+ * warmed cache on each core.
+ *
+ * The object pool are best suited for the following cases:
+ * 1) memory allocation or reclamation is prohibited or too expensive
+ * 2) the objects are allocated/used/reclaimed very frequently
+ *
+ * Before using, you must be aware of it's limitations:
+ * 1) Maximum number of objects is determined during pool initializing
+ * 2) The memory of objects won't be freed until the poll is de-allocated
+ * 3) Both allocation and reclamation could be nested
+ */
+
+/*
+ * objpool_slot: per-cpu ring array
+ *
+ * Represents a cpu-local array-based ring buffer, its size is specialized
+ * during initialization of object pool.
+ *
+ * The objpool_slot is allocated from local memory for NUMA system, and to
+ * be kept compact in a single cacheline. ages[] is stored just after the
+ * body of objpool_slot, and ents[] is after ages[]. ages[] describes the
+ * revision of epoch of the item, solely used to avoid ABA. ents[] contains
+ * the object pointers.
+ *
+ * The default size of objpool_slot is a single cacheline, aka. 64 bytes.
+ *
+ * 64bit:
+ *        4      8      12     16        32                 64
+ * | head | tail | size | mask | ages[4] | ents[4]: (8 * 4) |
+ *
+ * 32bit:
+ *        4      8      12     16        32        48       64
+ * | head | tail | size | mask | ages[4] | ents[4] | unused |
+ *
+ */
+
+struct objpool_slot {
+	uint32_t                os_head;	/* head of ring array */
+	uint32_t                os_tail;	/* tail of ring array */
+	uint32_t                os_size;	/* max item slots, pow of 2 */
+	uint32_t                os_mask;	/* os_size - 1 */
+/*
+ *	uint32_t                os_ages[];	// ring epoch id
+ *	void                   *os_ents[];	// objects array
+ */
+};
+
+/* caller-specified object initial callback to setup each object, only called once */
+typedef int (*objpool_init_node_cb)(void *context, void *obj);
+
+/* caller-specified cleanup callback for private objects/pool/context */
+typedef int (*objpool_release_cb)(void *context, void *ptr, uint32_t flags);
+
+/* called for object releasing: ptr points to an object */
+#define OBJPOOL_FLAG_NODE        (0x00000001)
+/* for user pool and context releasing, ptr could be NULL */
+#define OBJPOOL_FLAG_POOL        (0x00001000)
+/* the object or pool to be released is user-managed */
+#define OBJPOOL_FLAG_USER        (0x00008000)
+
+/*
+ * objpool_head: object pooling metadata
+ */
+
+struct objpool_head {
+	uint32_t                oh_objsz;	/* object & element size */
+	uint32_t                oh_nobjs;	/* total objs (pre-allocated) */
+	uint32_t                oh_nents;	/* max objects per cpuslot */
+	uint32_t                oh_ncpus;	/* num of possible cpus */
+	uint32_t                oh_in_user:1;	/* user-specified buffer */
+	uint32_t                oh_in_slot:1;	/* objs alloced with slots */
+	uint32_t                oh_vmalloc:1;	/* alloc from vmalloc zone */
+	gfp_t                   oh_gfp;		/* k/vmalloc gfp flags */
+	uint32_t                oh_sz_pool;	/* user pool size in byes */
+	void                   *oh_pool;	/* user managed memory pool */
+	struct objpool_slot   **oh_slots;	/* array of percpu slots */
+	uint32_t               *oh_sz_slots;	/* size in bytes of slots */
+	objpool_release_cb      oh_release;	/* resource cleanup callback */
+	void                   *oh_context;	/* caller-provided context */
+};
+
+/* initialize object pool and pre-allocate objects */
+int objpool_init(struct objpool_head *oh,
+		int nobjs, int max, int objsz,
+		gfp_t gfp, void *context,
+		objpool_init_node_cb objinit,
+		objpool_release_cb release);
+
+/* add objects in batch from user provided pool */
+int objpool_populate(struct objpool_head *oh, void *buf,
+			int size, int objsz, void *context,
+			objpool_init_node_cb objinit);
+
+/* add pre-allocated object (managed by user) to objpool */
+int objpool_add(void *obj, struct objpool_head *oh);
+
+/* allocate an object from objects pool */
+void *objpool_pop(struct objpool_head *oh);
+
+/* reclaim an object and return it back to objects pool */
+int objpool_push(void *node, struct objpool_head *oh);
+
+/* cleanup the whole object pool (including all chained objects) */
+void objpool_fini(struct objpool_head *oh);
+
+/* whether the object is pre-allocated with percpu slots */
+static inline int objpool_is_inslot(void *obj, struct objpool_head *oh)
+{
+	void *slot;
+	int i;
+
+	if (!obj)
+		return 0;
+
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		slot = oh->oh_slots[i];
+		if (obj >= slot && obj < slot + oh->oh_sz_slots[i])
+			return 1;
+	}
+
+	return 0;
+}
+
+/* whether the object is from user pool (batched adding) */
+static inline int objpool_is_inpool(void *obj, struct objpool_head *oh)
+{
+	return (obj && oh->oh_pool && obj >= oh->oh_pool &&
+		obj < oh->oh_pool + oh->oh_sz_pool);
+}
+
+#endif /* _LINUX_OBJPOOL_H */
diff --git a/include/linux/rethook.h b/include/linux/rethook.h
index c8ac1e5afcd1..278ec65e71fe 100644
--- a/include/linux/rethook.h
+++ b/include/linux/rethook.h
@@ -6,7 +6,7 @@
 #define _LINUX_RETHOOK_H
 
 #include <linux/compiler.h>
-#include <linux/freelist.h>
+#include <linux/objpool.h>
 #include <linux/kallsyms.h>
 #include <linux/llist.h>
 #include <linux/rcupdate.h>
@@ -30,14 +30,14 @@ typedef void (*rethook_handler_t) (struct rethook_node *, void *, struct pt_regs
 struct rethook {
 	void			*data;
 	rethook_handler_t	handler;
-	struct freelist_head	pool;
+	struct objpool_head	pool;
 	refcount_t		ref;
 	struct rcu_head		rcu;
 };
 
 /**
  * struct rethook_node - The rethook shadow-stack entry node.
- * @freelist: The freelist, linked to struct rethook::pool.
+ * @nod: The objpool node, linked to struct rethook::pool.
  * @rcu: The rcu_head for deferred freeing.
  * @llist: The llist, linked to a struct task_struct::rethooks.
  * @rethook: The pointer to the struct rethook.
@@ -48,19 +48,15 @@ struct rethook {
  * on each entry of the shadow stack.
  */
 struct rethook_node {
-	union {
-		struct freelist_node freelist;
-		struct rcu_head      rcu;
-	};
+	struct rcu_head		rcu;
 	struct llist_node	llist;
 	struct rethook		*rethook;
 	unsigned long		ret_addr;
 	unsigned long		frame;
 };
 
-struct rethook *rethook_alloc(void *data, rethook_handler_t handler);
+struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp, int size, int max);
 void rethook_free(struct rethook *rh);
-void rethook_add_node(struct rethook *rh, struct rethook_node *node);
 struct rethook_node *rethook_try_get(struct rethook *rh);
 void rethook_recycle(struct rethook_node *node);
 void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
@@ -97,4 +93,3 @@ void rethook_flush_task(struct task_struct *tk);
 #endif
 
 #endif
-
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index b781dee3f552..42cb708c3248 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1865,10 +1865,12 @@ static struct notifier_block kprobe_exceptions_nb = {
 static void free_rp_inst_rcu(struct rcu_head *head)
 {
 	struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
+	struct kretprobe_holder *rph = ri->rph;
 
-	if (refcount_dec_and_test(&ri->rph->ref))
-		kfree(ri->rph);
-	kfree(ri);
+	if (refcount_dec_and_test(&rph->ref)) {
+		objpool_fini(&rph->oh);
+		kfree(rph);
+	}
 }
 NOKPROBE_SYMBOL(free_rp_inst_rcu);
 
@@ -1877,7 +1879,7 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
 	struct kretprobe *rp = get_kretprobe(ri);
 
 	if (likely(rp))
-		freelist_add(&ri->freelist, &rp->freelist);
+		objpool_push(ri, &rp->rph->oh);
 	else
 		call_rcu(&ri->rcu, free_rp_inst_rcu);
 }
@@ -1914,23 +1916,19 @@ NOKPROBE_SYMBOL(kprobe_flush_task);
 
 static inline void free_rp_inst(struct kretprobe *rp)
 {
-	struct kretprobe_instance *ri;
-	struct freelist_node *node;
-	int count = 0;
-
-	node = rp->freelist.head;
-	while (node) {
-		ri = container_of(node, struct kretprobe_instance, freelist);
-		node = node->next;
-
-		kfree(ri);
-		count++;
-	}
+	struct kretprobe_holder *rph = rp->rph;
+	void *nod;
 
-	if (refcount_sub_and_test(count, &rp->rph->ref)) {
-		kfree(rp->rph);
-		rp->rph = NULL;
-	}
+	rp->rph = NULL;
+	do {
+		nod = objpool_pop(&rph->oh);
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&rph->ref)) {
+			objpool_fini(&rph->oh);
+			kfree(rph);
+			break;
+		}
+	} while (nod);
 }
 
 /* This assumes the 'tsk' is the current task or the is not running. */
@@ -2072,19 +2070,17 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
 static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 {
 	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
+	struct kretprobe_holder *rph = rp->rph;
 	struct kretprobe_instance *ri;
-	struct freelist_node *fn;
 
-	fn = freelist_try_get(&rp->freelist);
-	if (!fn) {
+	ri = objpool_pop(&rph->oh);
+	if (!ri) {
 		rp->nmissed++;
 		return 0;
 	}
 
-	ri = container_of(fn, struct kretprobe_instance, freelist);
-
 	if (rp->entry_handler && rp->entry_handler(ri, regs)) {
-		freelist_add(&ri->freelist, &rp->freelist);
+		objpool_push(ri, &rph->oh);
 		return 0;
 	}
 
@@ -2174,10 +2170,19 @@ int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long o
 	return 0;
 }
 
+#ifndef CONFIG_KRETPROBE_ON_RETHOOK
+static int kretprobe_init_inst(void *context, void *nod)
+{
+	struct kretprobe_instance *ri = nod;
+
+	ri->rph = context;
+	return 0;
+}
+#endif
+
 int register_kretprobe(struct kretprobe *rp)
 {
 	int ret;
-	struct kretprobe_instance *inst;
 	int i;
 	void *addr;
 
@@ -2215,20 +2220,12 @@ int register_kretprobe(struct kretprobe *rp)
 #endif
 	}
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
-	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler);
+	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler, GFP_KERNEL,
+				sizeof(struct kretprobe_instance) + rp->data_size,
+				rp->maxactive);
 	if (!rp->rh)
 		return -ENOMEM;
 
-	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
-		if (inst == NULL) {
-			rethook_free(rp->rh);
-			rp->rh = NULL;
-			return -ENOMEM;
-		}
-		rethook_add_node(rp->rh, &inst->node);
-	}
 	rp->nmissed = 0;
 	/* Establish function entry probe point */
 	ret = register_kprobe(&rp->kp);
@@ -2237,25 +2234,19 @@ int register_kretprobe(struct kretprobe *rp)
 		rp->rh = NULL;
 	}
 #else	/* !CONFIG_KRETPROBE_ON_RETHOOK */
-	rp->freelist.head = NULL;
 	rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
 	if (!rp->rph)
 		return -ENOMEM;
 
-	rp->rph->rp = rp;
-	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
-		if (inst == NULL) {
-			refcount_set(&rp->rph->ref, i);
-			free_rp_inst(rp);
-			return -ENOMEM;
-		}
-		inst->rph = rp->rph;
-		freelist_add(&inst->freelist, &rp->freelist);
+	if (objpool_init(&rp->rph->oh, rp->maxactive, rp->maxactive,
+			  rp->data_size + sizeof(struct kretprobe_instance),
+			  GFP_KERNEL, rp->rph, kretprobe_init_inst, NULL)) {
+		kfree(rp->rph);
+		rp->rph = NULL;
+		return -ENOMEM;
 	}
-	refcount_set(&rp->rph->ref, i);
-
+	refcount_set(&rp->rph->ref, rp->maxactive + 1);
+	rp->rph->rp = rp;
 	rp->nmissed = 0;
 	/* Establish function entry probe point */
 	ret = register_kprobe(&rp->kp);
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index aac63ca9c3d1..d2521a0ab2ae 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -140,18 +140,11 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
 	if (size < 0)
 		return -E2BIG;
 
-	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler);
-	for (i = 0; i < size; i++) {
-		struct fprobe_rethook_node *node;
-
-		node = kzalloc(sizeof(*node), GFP_KERNEL);
-		if (!node) {
-			rethook_free(fp->rethook);
-			fp->rethook = NULL;
-			return -ENOMEM;
-		}
-		rethook_add_node(fp->rethook, &node->node);
-	}
+	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, GFP_KERNEL,
+				sizeof(struct fprobe_rethook_node), size);
+	if (!fp->rethook)
+		return -ENOMEM;
+
 	return 0;
 }
 
diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
index c69d82273ce7..01df98db2fbe 100644
--- a/kernel/trace/rethook.c
+++ b/kernel/trace/rethook.c
@@ -36,21 +36,17 @@ void rethook_flush_task(struct task_struct *tk)
 static void rethook_free_rcu(struct rcu_head *head)
 {
 	struct rethook *rh = container_of(head, struct rethook, rcu);
-	struct rethook_node *rhn;
-	struct freelist_node *node;
-	int count = 1;
+	struct rethook_node *nod;
 
-	node = rh->pool.head;
-	while (node) {
-		rhn = container_of(node, struct rethook_node, freelist);
-		node = node->next;
-		kfree(rhn);
-		count++;
-	}
-
-	/* The rh->ref is the number of pooled node + 1 */
-	if (refcount_sub_and_test(count, &rh->ref))
-		kfree(rh);
+	do {
+		nod = objpool_pop(&rh->pool);
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&rh->ref)) {
+			objpool_fini(&rh->pool);
+			kfree(rh);
+			break;
+		}
+	} while (nod);
 }
 
 /**
@@ -70,16 +66,28 @@ void rethook_free(struct rethook *rh)
 	call_rcu(&rh->rcu, rethook_free_rcu);
 }
 
+static int rethook_init_node(void *context, void *nod)
+{
+	struct rethook_node *node = nod;
+
+	node->rethook = context;
+	return 0;
+}
+
 /**
  * rethook_alloc() - Allocate struct rethook.
  * @data: a data to pass the @handler when hooking the return.
  * @handler: the return hook callback function.
+ * @gfp: default gfp for objpool allocation
+ * @size: rethook node size
+ * @max: number of rethook nodes to be preallocated
  *
  * Allocate and initialize a new rethook with @data and @handler.
  * Return NULL if memory allocation fails or @handler is NULL.
  * Note that @handler == NULL means this rethook is going to be freed.
  */
-struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
+struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp,
+			      int size, int max)
 {
 	struct rethook *rh = kzalloc(sizeof(struct rethook), GFP_KERNEL);
 
@@ -88,34 +96,26 @@ struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
 
 	rh->data = data;
 	rh->handler = handler;
-	rh->pool.head = NULL;
-	refcount_set(&rh->ref, 1);
 
+	/* initialize the objpool for rethook nodes */
+	if (objpool_init(&rh->pool, max, max, size, gfp, rh, rethook_init_node,
+			NULL)) {
+		kfree(rh);
+		return NULL;
+	}
+	refcount_set(&rh->ref, max + 1);
 	return rh;
 }
 
-/**
- * rethook_add_node() - Add a new node to the rethook.
- * @rh: the struct rethook.
- * @node: the struct rethook_node to be added.
- *
- * Add @node to @rh. User must allocate @node (as a part of user's
- * data structure.) The @node fields are initialized in this function.
- */
-void rethook_add_node(struct rethook *rh, struct rethook_node *node)
-{
-	node->rethook = rh;
-	freelist_add(&node->freelist, &rh->pool);
-	refcount_inc(&rh->ref);
-}
-
 static void free_rethook_node_rcu(struct rcu_head *head)
 {
 	struct rethook_node *node = container_of(head, struct rethook_node, rcu);
+	struct rethook *rh = node->rethook;
 
-	if (refcount_dec_and_test(&node->rethook->ref))
-		kfree(node->rethook);
-	kfree(node);
+	if (refcount_dec_and_test(&rh->ref)) {
+		objpool_fini(&rh->pool);
+		kfree(rh);
+	}
 }
 
 /**
@@ -130,7 +130,7 @@ void rethook_recycle(struct rethook_node *node)
 	lockdep_assert_preemption_disabled();
 
 	if (likely(READ_ONCE(node->rethook->handler)))
-		freelist_add(&node->freelist, &node->rethook->pool);
+		objpool_push(node, &node->rethook->pool);
 	else
 		call_rcu(&node->rcu, free_rethook_node_rcu);
 }
@@ -146,7 +146,7 @@ NOKPROBE_SYMBOL(rethook_recycle);
 struct rethook_node *rethook_try_get(struct rethook *rh)
 {
 	rethook_handler_t handler = READ_ONCE(rh->handler);
-	struct freelist_node *fn;
+	struct rethook_node *nod;
 
 	lockdep_assert_preemption_disabled();
 
@@ -163,11 +163,11 @@ struct rethook_node *rethook_try_get(struct rethook *rh)
 	if (unlikely(!rcu_is_watching()))
 		return NULL;
 
-	fn = freelist_try_get(&rh->pool);
-	if (!fn)
+	nod = (struct rethook_node *)objpool_pop(&rh->pool);
+	if (!nod)
 		return NULL;
 
-	return container_of(fn, struct rethook_node, freelist);
+	return nod;
 }
 NOKPROBE_SYMBOL(rethook_try_get);
 
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3fc7abffc7aa..b12cc71754cf 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2737,6 +2737,17 @@ config TEST_CLOCKSOURCE_WATCHDOG
 
 	  If unsure, say N.
 
+config TEST_OBJPOOL
+	tristate "Test module for correctness and stress of objpool"
+	default n
+       depends on m
+	help
+	  This builds the "test_objpool" module that should be used for
+	  correctness verification and concurrent testings of objects
+	  allocation and reclamation.
+
+	  If unsure, say N.
+
 endif # RUNTIME_TESTING_MENU
 
 config ARCH_USE_MEMTEST
diff --git a/lib/Makefile b/lib/Makefile
index 161d6a724ff7..4aa282fa0cfc 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -34,7 +34,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
 	 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
 	 nmi_backtrace.o win_minmax.o memcat_p.o \
-	 buildid.o
+	 buildid.o objpool.o
 
 lib-$(CONFIG_PRINTK) += dump_stack.o
 lib-$(CONFIG_SMP) += cpumask.o
@@ -99,6 +99,8 @@ obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
 obj-$(CONFIG_TEST_REF_TRACKER) += test_ref_tracker.o
 CFLAGS_test_fprobe.o += $(CC_FLAGS_FTRACE)
 obj-$(CONFIG_FPROBE_SANITY_TEST) += test_fprobe.o
+obj-$(CONFIG_TEST_OBJPOOL) += test_objpool.o
+
 #
 # CFLAGS for compiling floating point code inside the kernel. x86/Makefile turns
 # off the generation of FPU/SSE* instructions for kernel proper but FPU_FLAGS
diff --git a/lib/objpool.c b/lib/objpool.c
new file mode 100644
index 000000000000..51b3499ff9da
--- /dev/null
+++ b/lib/objpool.c
@@ -0,0 +1,480 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/objpool.h>
+
+/*
+ * objpool: ring-array based lockless MPMC/FIFO queues
+ *
+ * Copyright: wuqiang.matt@bytedance.com
+ */
+
+/* compute the suitable num of objects to be managed by slot */
+static inline uint32_t __objpool_num_of_objs(uint32_t size)
+{
+	return rounddown_pow_of_two((size - sizeof(struct objpool_slot)) /
+			(sizeof(uint32_t) + sizeof(void *)));
+}
+
+#define SLOT_AGES(s) ((uint32_t *)((char *)(s) + sizeof(struct objpool_slot)))
+#define SLOT_ENTS(s) ((void **)((char *)(s) + sizeof(struct objpool_slot) + \
+			sizeof(uint32_t) * (s)->os_size))
+#define SLOT_OBJS(s) ((void *)((char *)(s) + sizeof(struct objpool_slot) + \
+			(sizeof(uint32_t) + sizeof(void *)) * (s)->os_size))
+
+/* allocate and initialize percpu slots */
+static inline int
+__objpool_init_percpu_slots(struct objpool_head *oh, uint32_t nobjs,
+			void *context, objpool_init_node_cb objinit)
+{
+	uint32_t i, j, size, objsz, nents = oh->oh_nents;
+
+	/* aligned object size by sizeof(void *) */
+	objsz = ALIGN(oh->oh_objsz, sizeof(void *));
+	/* shall we allocate objects along with objpool_slot */
+	if (objsz)
+		oh->oh_in_slot = 1;
+
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		struct objpool_slot *os;
+		uint32_t n;
+
+		/* compute how many objects to be managed by this slot */
+		n = nobjs / oh->oh_ncpus;
+		if (i < (nobjs % oh->oh_ncpus))
+			n++;
+		size = sizeof(struct objpool_slot) + sizeof(void *) * nents +
+		       sizeof(uint32_t) * nents + objsz * n;
+
+		/* decide which pool shall the slot be allocated from */
+		if (0 == i) {
+			if ((oh->oh_gfp & GFP_ATOMIC) || size < PAGE_SIZE / 2)
+				oh->oh_vmalloc = 0;
+			else
+				oh->oh_vmalloc = 1;
+		}
+
+		/* allocate percpu slot & objects from local memory */
+		if (oh->oh_vmalloc)
+			os = vmalloc_node(size, cpu_to_node(i));
+		else
+			os = kmalloc_node(size, oh->oh_gfp, cpu_to_node(i));
+		if (!os)
+			return -ENOMEM;
+
+		/* initialize percpu slot for the i-th cpu */
+		memset(os, 0, size);
+		os->os_size = oh->oh_nents;
+		os->os_mask = os->os_size - 1;
+		oh->oh_slots[i] = os;
+		oh->oh_sz_slots[i] = size;
+
+		/*
+		 * start from 2nd round to avoid conflict of 1st item.
+		 * we assume that the head item is ready for retrieval
+		 * iff head is equal to ages[head & mask]. but ages is
+		 * initialized as 0, so in view of the caller of pop(),
+		 * the 1st item (0th) is always ready, but fact could
+		 * be: push() is stalled before the final update, thus
+		 * the item being inserted will be lost forever.
+		 */
+		os->os_head = os->os_tail = oh->oh_nents;
+
+		for (j = 0; oh->oh_in_slot && j < n; j++) {
+			uint32_t *ages = SLOT_AGES(os);
+			void **ents = SLOT_ENTS(os);
+			void *obj = SLOT_OBJS(os) + j * objsz;
+			uint32_t ie = os->os_tail & os->os_mask;
+
+			/* perform object initialization */
+			if (objinit) {
+				int rc = objinit(context, obj);
+				if (rc)
+					return rc;
+			}
+
+			/* add obj into the ring array */
+			ents[ie] = obj;
+			ages[ie] = os->os_tail;
+			os->os_tail++;
+			oh->oh_nobjs++;
+		}
+	}
+
+	return 0;
+}
+
+/* cleanup all percpu slots of the object pool */
+static inline void __objpool_fini_percpu_slots(struct objpool_head *oh)
+{
+	uint32_t i;
+
+	if (!oh->oh_slots)
+		return;
+
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		if (!oh->oh_slots[i])
+			continue;
+		if (oh->oh_vmalloc)
+			vfree(oh->oh_slots[i]);
+		else
+			kfree(oh->oh_slots[i]);
+	}
+	kfree(oh->oh_slots);
+	oh->oh_slots = NULL;
+	oh->oh_sz_slots = NULL;
+}
+
+/**
+ * objpool_init: initialize object pool and pre-allocate objects
+ *
+ * args:
+ * @oh:    the object pool to be initialized, declared by the caller
+ * @nojbs: total objects to be allocated by this object pool
+ * @max:   max objs this objpool could manage, use nobjs if 0
+ * @ojbsz: size of an object, to be pre-allocated if objsz is not 0
+ * @gfp:   gfp flags of caller's context for memory allocation
+ * @context: user context for object initialization callback
+ * @objinit: object initialization callback for extra setting-up
+ * @release: cleanup callback for private objects/pool/context
+ *
+ * return:
+ *         0 for success, otherwise error code
+ *
+ * All pre-allocated objects are to be zeroed. Caller could do extra
+ * initialization in objinit callback. The objinit callback will be
+ * called once and only once after the slot allocation
+ */
+int objpool_init(struct objpool_head *oh,
+		int nobjs, int max, int objsz,
+		gfp_t gfp, void *context,
+		objpool_init_node_cb objinit,
+		objpool_release_cb release)
+{
+	uint32_t nents, cpus = num_possible_cpus();
+	int rc;
+
+	/* calculate percpu slot size (rounded to pow of 2) */
+	if (max < nobjs)
+		max = nobjs;
+	nents = max / cpus;
+	if (nents < __objpool_num_of_objs(L1_CACHE_BYTES))
+		nents = __objpool_num_of_objs(L1_CACHE_BYTES);
+	nents = roundup_pow_of_two(nents);
+	while (nents * cpus < nobjs)
+		nents = nents << 1;
+
+	memset(oh, 0, sizeof(struct objpool_head));
+	oh->oh_ncpus = cpus;
+	oh->oh_objsz = objsz;
+	oh->oh_nents = nents;
+	oh->oh_gfp = gfp & ~__GFP_ZERO;
+	oh->oh_context = context;
+	oh->oh_release = release;
+
+	/* allocate array for percpu slots */
+	oh->oh_slots = kzalloc(oh->oh_ncpus * sizeof(void *) +
+			       oh->oh_ncpus * sizeof(uint32_t), oh->oh_gfp);
+	if (!oh->oh_slots)
+		return -ENOMEM;
+	oh->oh_sz_slots = (uint32_t *)&oh->oh_slots[oh->oh_ncpus];
+
+	/* initialize per-cpu slots */
+	rc = __objpool_init_percpu_slots(oh, nobjs, context, objinit);
+	if (rc)
+		__objpool_fini_percpu_slots(oh);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(objpool_init);
+
+/* adding object to slot tail, the given slot mustn't be full */
+static inline int __objpool_add_slot(void *obj, struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	uint32_t tail = atomic_inc_return((atomic_t *)&os->os_tail) - 1;
+
+	WRITE_ONCE(ents[tail & os->os_mask], obj);
+
+	/* order matters: obj must be updated before tail updating */
+	smp_store_release(&ages[tail & os->os_mask], tail);
+	return 0;
+}
+
+/* adding object to slot, abort if the slot was already full */
+static inline int __objpool_try_add_slot(void *obj, struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	uint32_t head, tail;
+
+	do {
+		/* perform memory loading for both head and tail */
+		head = READ_ONCE(os->os_head);
+		tail = READ_ONCE(os->os_tail);
+		/* just abort if slot is full */
+		if (tail >= head + os->os_size)
+			return -ENOENT;
+		/* try to extend tail by 1 using CAS to avoid races */
+		if (try_cmpxchg_acquire(&os->os_tail, &tail, tail + 1))
+			break;
+	} while (1);
+
+	/* the tail-th of slot is reserved for the given obj */
+	WRITE_ONCE(ents[tail & os->os_mask], obj);
+	/* update epoch id to make this object available for pop() */
+	smp_store_release(&ages[tail & os->os_mask], tail);
+	return 0;
+}
+
+/**
+ * objpool_populate: add objects from user provided pool in batch
+ *
+ * args:
+ * @oh:  object pool
+ * @buf: user buffer for pre-allocated objects
+ * @size: size of user buffer
+ * @objsz: size of object & element
+ * @context: user context for objinit callback
+ * @objinit: object initialization callback
+ *
+ * return: 0 or error code
+ */
+int objpool_populate(struct objpool_head *oh, void *buf, int size, int objsz,
+		    void *context, objpool_init_node_cb objinit)
+{
+	int n = oh->oh_nobjs, used = 0, i;
+
+	if (oh->oh_pool || !buf || size < objsz)
+		return -EINVAL;
+	if (oh->oh_objsz && oh->oh_objsz != objsz)
+		return -EINVAL;
+	if (oh->oh_context && context && oh->oh_context != context)
+		return -EINVAL;
+	if (oh->oh_nobjs >= oh->oh_ncpus * oh->oh_nents)
+		return -ENOENT;
+
+	WARN_ON_ONCE(((unsigned long)buf) & (sizeof(void *) - 1));
+	WARN_ON_ONCE(((uint32_t)objsz) & (sizeof(void *) - 1));
+
+	/* align object size by sizeof(void *) */
+	oh->oh_objsz = objsz;
+	objsz = ALIGN(objsz, sizeof(void *));
+	if (objsz <= 0)
+		return -EINVAL;
+
+	while (used + objsz <= size) {
+		void *obj = buf + used;
+
+		/* perform object initialization */
+		if (objinit) {
+			int rc = objinit(context, obj);
+			if (rc)
+				return rc;
+		}
+
+		/* insert obj to its corresponding objpool slot */
+		i = (n + used * oh->oh_ncpus/size) % oh->oh_ncpus;
+		if (!__objpool_try_add_slot(obj, oh->oh_slots[i]))
+			oh->oh_nobjs++;
+
+		used += objsz;
+	}
+
+	if (!used)
+		return -ENOENT;
+
+	oh->oh_context = context;
+	oh->oh_pool = buf;
+	oh->oh_sz_pool = size;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(objpool_populate);
+
+/**
+ * objpool_add: add pre-allocated object to objpool during pool
+ * initialization
+ *
+ * args:
+ * @obj: object pointer to be added to objpool
+ * @oh:  object pool to be inserted into
+ *
+ * return:
+ *     0 or error code
+ *
+ * objpool_add_node doesn't handle race conditions, can only be
+ * called during objpool initialization
+ */
+int objpool_add(void *obj, struct objpool_head *oh)
+{
+	uint32_t i, cpu;
+
+	if (!obj)
+		return -EINVAL;
+	if (oh->oh_nobjs >= oh->oh_ncpus * oh->oh_nents)
+		return -ENOENT;
+
+	cpu = oh->oh_nobjs % oh->oh_ncpus;
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		if (!__objpool_try_add_slot(obj, oh->oh_slots[cpu])) {
+			oh->oh_nobjs++;
+			return 0;
+		}
+
+		if (++cpu >= oh->oh_ncpus)
+			cpu = 0;
+	}
+
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(objpool_add);
+
+/**
+ * objpool_push: reclaim the object and return back to objects pool
+ *
+ * args:
+ * @obj: object pointer to be pushed to object pool
+ * @oh:  object pool
+ *
+ * return:
+ *     0 or error code: it fails only when objects pool are full
+ *
+ * objpool_push is non-blockable, and can be nested
+ */
+int objpool_push(void *obj, struct objpool_head *oh)
+{
+	uint32_t cpu = raw_smp_processor_id();
+
+	do {
+		if (oh->oh_nobjs > oh->oh_nents) {
+			if (!__objpool_try_add_slot(obj, oh->oh_slots[cpu]))
+				return 0;
+		} else {
+			if (!__objpool_add_slot(obj, oh->oh_slots[cpu]))
+				return 0;
+		}
+		if (++cpu >= oh->oh_ncpus)
+			cpu = 0;
+	} while (1);
+
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(objpool_push);
+
+/* try to retrieve object from slot */
+static inline void *__objpool_try_get_slot(struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	/* do memory load of os_head to local head */
+	uint32_t head = smp_load_acquire(&os->os_head);
+
+	/* loop if slot isn't empty */
+	while (head != READ_ONCE(os->os_tail)) {
+		uint32_t id = head & os->os_mask, prev = head;
+
+		/* do prefetching of object ents */
+		prefetch(&ents[id]);
+
+		/*
+		 * check whether this item was ready for retrieval ? There's
+		 * possibility * in theory * we might retrieve wrong object,
+		 * in case ages[id] overflows when current task is sleeping,
+		 * but it will take very very long to overflow an uint32_t
+		 */
+		if (smp_load_acquire(&ages[id]) == head) {
+			/* node must have been udpated by push() */
+			void *node = READ_ONCE(ents[id]);
+			/* commit and move forward head of the slot */
+			if (try_cmpxchg_release(&os->os_head, &head, head + 1))
+				return node;
+		}
+
+		/* re-load head from memory continue trying */
+		head = READ_ONCE(os->os_head);
+		/*
+		 * head stays unchanged, so it's very likely current pop()
+		 * just preempted/interrupted an ongoing push() operation
+		 */
+		if (head == prev)
+			break;
+	}
+
+	return NULL;
+}
+
+/**
+ * objpool_pop: allocate an object from objects pool
+ *
+ * args:
+ * @oh:  object pool
+ *
+ * return:
+ *   object: NULL if failed (object pool is empty)
+ *
+ * objpool_pop can be nested, so can be used in any context.
+ */
+void *objpool_pop(struct objpool_head *oh)
+{
+	uint32_t i, cpu = raw_smp_processor_id();
+	void *obj = NULL;
+
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		struct objpool_slot *slot = oh->oh_slots[cpu];
+		obj = __objpool_try_get_slot(slot);
+		if (obj)
+			break;
+		if (++cpu >= oh->oh_ncpus)
+			cpu = 0;
+	}
+
+	return obj;
+}
+EXPORT_SYMBOL_GPL(objpool_pop);
+
+/**
+ * objpool_fini: cleanup the whole object pool (releasing all objects)
+ *
+ * args:
+ * @head: object pool to be released
+ *
+ */
+void objpool_fini(struct objpool_head *oh)
+{
+	uint32_t i, flags;
+
+	if (!oh->oh_slots)
+		return;
+
+	if (!oh->oh_release) {
+		__objpool_fini_percpu_slots(oh);
+		return;
+	}
+
+	/* cleanup all objects remained in objpool */
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		void *obj;
+		do {
+			flags = OBJPOOL_FLAG_NODE;
+			obj = __objpool_try_get_slot(oh->oh_slots[i]);
+			if (!obj)
+				break;
+			if (!objpool_is_inpool(obj, oh) &&
+			    !objpool_is_inslot(obj, oh)) {
+				flags |= OBJPOOL_FLAG_USER;
+			}
+			oh->oh_release(oh->oh_context, obj, flags);
+		} while (obj);
+	}
+
+	/* release percpu slots */
+	__objpool_fini_percpu_slots(oh);
+
+	/* cleanup user private pool and related context */
+	flags = OBJPOOL_FLAG_POOL;
+	if (oh->oh_pool)
+		flags |= OBJPOOL_FLAG_USER;
+	oh->oh_release(oh->oh_context, oh->oh_pool, flags);
+}
+EXPORT_SYMBOL_GPL(objpool_fini);
diff --git a/lib/test_objpool.c b/lib/test_objpool.c
new file mode 100644
index 000000000000..c1341ddf77b5
--- /dev/null
+++ b/lib/test_objpool.c
@@ -0,0 +1,1031 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Test module for lockless object pool
+ * (C) 2022 Matt Wu <wuqiang.matt@bytedance.com>
+ */
+
+#include <linux/version.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/sched.h>
+#include <linux/cpumask.h>
+#include <linux/completion.h>
+#include <linux/kthread.h>
+#include <linux/cpu.h>
+#include <linux/cpuset.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
+#include <linux/objpool.h>
+
+#define OT_NR_MAX_BULK (16)
+
+struct ot_ctrl {
+	unsigned int mode;
+	unsigned int duration; /* ms */
+	unsigned int delay; /* ms */
+	unsigned int bulk_normal;
+	unsigned int bulk_irq;
+	unsigned long hrtimer; /* ms */
+	const char *name;
+};
+
+struct ot_stat {
+	unsigned long nhits;
+	unsigned long nmiss;
+};
+
+struct ot_item {
+	struct objpool_head *pool; /* pool head */
+	struct ot_ctrl *ctrl; /* ctrl parameters */
+
+	void (*worker)(struct ot_item *item, int irq);
+
+	/* hrtimer control */
+	ktime_t hrtcycle;
+	struct hrtimer hrtimer;
+
+	int bulk[2]; /* for thread and irq */
+	int delay;
+	u32 niters;
+
+	/* results summary */
+	struct ot_stat stat[2]; /* thread and irq */
+
+	u64 duration;
+};
+
+struct ot_mem_stat {
+	atomic_long_t alloc;
+	atomic_long_t free;
+};
+
+struct ot_data {
+	struct rw_semaphore start;
+	struct completion wait;
+	struct completion rcu;
+	atomic_t nthreads ____cacheline_aligned_in_smp;
+	atomic_t stop ____cacheline_aligned_in_smp;
+	struct ot_mem_stat kmalloc;
+	struct ot_mem_stat vmalloc;
+} g_ot_data;
+
+/*
+ * memory leakage checking
+ */
+
+void *ot_kzalloc(long size)
+{
+	void *ptr = kzalloc(size, GFP_KERNEL);
+
+	if (ptr)
+		atomic_long_add(size, &g_ot_data.kmalloc.alloc);
+	return ptr;
+}
+
+void ot_kfree(void *ptr, long size)
+{
+	if (!ptr)
+		return;
+	atomic_long_add(size, &g_ot_data.kmalloc.free);
+	kfree(ptr);
+}
+
+void *ot_vmalloc(long size)
+{
+	void *ptr = vmalloc(size);
+
+	if (ptr)
+		atomic_long_add(size, &g_ot_data.vmalloc.alloc);
+	return ptr;
+}
+
+void ot_vfree(void *ptr, long size)
+{
+	if (!ptr)
+		return;
+	atomic_long_add(size, &g_ot_data.vmalloc.free);
+	vfree(ptr);
+}
+
+static void ot_mem_report(struct ot_ctrl *ctrl)
+{
+	long alloc, free;
+
+	pr_info("memory allocation summary for %s\n", ctrl->name);
+
+	alloc = atomic_long_read(&g_ot_data.kmalloc.alloc);
+	free = atomic_long_read(&g_ot_data.kmalloc.free);
+	pr_info("  kmalloc: %lu - %lu = %lu\n", alloc, free, alloc - free);
+
+	alloc = atomic_long_read(&g_ot_data.vmalloc.alloc);
+	free = atomic_long_read(&g_ot_data.vmalloc.free);
+	pr_info("  vmalloc: %lu - %lu = %lu\n", alloc, free, alloc - free);
+}
+
+/*
+ * general structs & routines
+ */
+
+struct ot_node {
+	void *owner;
+	unsigned long data;
+	unsigned long refs;
+};
+
+struct ot_context {
+	struct objpool_head pool;
+	void *ptr;
+	unsigned long size;
+	refcount_t refs;
+	struct rcu_head rcu;
+};
+
+static DEFINE_PER_CPU(struct ot_item, ot_pcup_items);
+
+static int ot_init_data(struct ot_data *data)
+{
+	memset(data, 0, sizeof(*data));
+	init_rwsem(&data->start);
+	init_completion(&data->wait);
+	init_completion(&data->rcu);
+	atomic_set(&data->nthreads, 1);
+
+	return 0;
+}
+
+static void ot_reset_data(struct ot_data *data)
+{
+	reinit_completion(&data->wait);
+	reinit_completion(&data->rcu);
+	atomic_set(&data->nthreads, 1);
+	atomic_set(&data->stop, 0);
+	memset(&data->kmalloc, 0, sizeof(data->kmalloc));
+	memset(&data->vmalloc, 0, sizeof(data->vmalloc));
+}
+
+static int ot_init_node(void *context, void *nod)
+{
+	struct ot_context *sop = context;
+	struct ot_node *on = nod;
+
+	on->owner = &sop->pool;
+	return 0;
+}
+
+static enum hrtimer_restart ot_hrtimer_handler(struct hrtimer *hrt)
+{
+	struct ot_item *item = container_of(hrt, struct ot_item, hrtimer);
+
+	if (atomic_read_acquire(&g_ot_data.stop))
+		return HRTIMER_NORESTART;
+
+	/* do bulk-testings for objects pop/push */
+	item->worker(item, 1);
+
+	hrtimer_forward(hrt, hrt->base->get_time(), item->hrtcycle);
+	return HRTIMER_RESTART;
+}
+
+static void ot_start_hrtimer(struct ot_item *item)
+{
+	if (!item->ctrl->hrtimer)
+		return;
+	hrtimer_start(&item->hrtimer, item->hrtcycle, HRTIMER_MODE_REL);
+}
+
+static void ot_stop_hrtimer(struct ot_item *item)
+{
+	if (!item->ctrl->hrtimer)
+		return;
+	hrtimer_cancel(&item->hrtimer);
+}
+
+static int ot_init_hrtimer(struct ot_item *item, unsigned long hrtimer)
+{
+	struct hrtimer *hrt = &item->hrtimer;
+
+	if (!hrtimer)
+		return -ENOENT;
+
+	item->hrtcycle = ktime_set(0, hrtimer * 1000000UL);
+	hrtimer_init(hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrt->function = ot_hrtimer_handler;
+	return 0;
+}
+
+static int ot_init_cpu_item(struct ot_item *item,
+			struct ot_ctrl *ctrl,
+			struct objpool_head *pool,
+			void (*worker)(struct ot_item *, int))
+{
+	memset(item, 0, sizeof(*item));
+	item->pool = pool;
+	item->ctrl = ctrl;
+	item->worker = worker;
+
+	item->bulk[0] = ctrl->bulk_normal;
+	item->bulk[1] = ctrl->bulk_irq;
+	item->delay = ctrl->delay;
+
+	/* initialize hrtimer */
+	ot_init_hrtimer(item, item->ctrl->hrtimer);
+	return 0;
+}
+
+static int ot_thread_worker(void *arg)
+{
+	struct ot_item *item = arg;
+	ktime_t start;
+
+	sched_set_normal(current, 50);
+
+	atomic_inc(&g_ot_data.nthreads);
+	down_read(&g_ot_data.start);
+	up_read(&g_ot_data.start);
+	start = ktime_get();
+	ot_start_hrtimer(item);
+	do {
+		if (atomic_read_acquire(&g_ot_data.stop))
+			break;
+		/* do bulk-testings for objects pop/push */
+		item->worker(item, 0);
+	} while (!kthread_should_stop());
+	ot_stop_hrtimer(item);
+	item->duration = (u64) ktime_us_delta(ktime_get(), start);
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	return 0;
+}
+
+static void ot_perf_report(struct ot_ctrl *ctrl, u64 duration)
+{
+	struct ot_stat total, normal = {0}, irq = {0};
+	int cpu, nthreads = 0;
+
+	pr_info("\n");
+	pr_info("Testing summary for %s\n", ctrl->name);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		if (!item->duration)
+			continue;
+		normal.nhits += item->stat[0].nhits;
+		normal.nmiss += item->stat[0].nmiss;
+		irq.nhits += item->stat[1].nhits;
+		irq.nmiss += item->stat[1].nmiss;
+		pr_info("CPU: %d  duration: %lluus\n", cpu, item->duration);
+		pr_info("\tthread:\t%16lu hits \t%16lu miss\n",
+			item->stat[0].nhits, item->stat[0].nmiss);
+		pr_info("\tirq:   \t%16lu hits \t%16lu miss\n",
+			item->stat[1].nhits, item->stat[1].nmiss);
+		pr_info("\ttotal: \t%16lu hits \t%16lu miss\n",
+			item->stat[0].nhits + item->stat[1].nhits,
+			item->stat[0].nmiss + item->stat[1].nmiss);
+		nthreads++;
+	}
+
+	total.nhits = normal.nhits + irq.nhits;
+	total.nmiss = normal.nmiss + irq.nmiss;
+
+	pr_info("ALL: \tnthreads: %d  duration: %lluus\n", nthreads, duration);
+	pr_info("SUM: \t%16lu hits \t%16lu miss\n",
+		total.nhits, total.nmiss);
+}
+
+/*
+ * synchronous test cases for objpool manipulation
+ */
+
+/* objpool manipulation for synchronous mode 0 (percpu objpool) */
+static struct ot_context *ot_init_sync_m0(void)
+{
+	struct ot_context *sop = NULL;
+	int max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	if (objpool_init(&sop->pool, max, max, sizeof(struct ot_node),
+			GFP_KERNEL, sop, ot_init_node, NULL)) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON(max != sop->pool.oh_nobjs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m0(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+	ot_kfree(sop, sizeof(*sop));
+}
+
+/* objpool manipulation for synchronous mode 1 (private pool) */
+static struct ot_context *ot_init_sync_m1(void)
+{
+	struct ot_context *sop = NULL;
+	unsigned long size;
+	int rc, szobj, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	size = sizeof(struct ot_node) * max;
+	sop->ptr = ot_vmalloc(size);
+	sop->size = size;
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	memset(sop->ptr, 0, size);
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL, NULL);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		sizeof(struct ot_node), sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	szobj = ALIGN(sizeof(struct ot_node), sizeof(void *));
+	WARN_ON((size / szobj) != sop->pool.oh_nobjs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m1(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+
+	ot_vfree(sop->ptr, sop->size);
+	ot_kfree(sop, sizeof(*sop));
+}
+
+/* objpool manipulation for synchronous mode 2 (private objects) */
+static int ot_objpool_release(void *context, void *ptr, uint32_t flags)
+{
+	struct ot_context *sop = context;
+
+	/* here we need release all user-allocated objects */
+	if ((flags & OBJPOOL_FLAG_NODE) && (flags & OBJPOOL_FLAG_USER)) {
+		struct ot_node *on = ptr;
+		WARN_ON(on->data != 0xDEADBEEF);
+		ot_kfree(on, sizeof(struct ot_node));
+	} else if (flags & OBJPOOL_FLAG_POOL) {
+		/* release user preallocated pool */
+		if (sop->ptr) {
+			WARN_ON(sop->ptr != ptr);
+			WARN_ON(!(flags & OBJPOOL_FLAG_USER));
+			ot_vfree(sop->ptr, sop->size);
+		}
+		/* do context cleaning if needed */
+		ot_kfree(sop, sizeof(*sop));
+	}
+
+	return 0;
+}
+
+static struct ot_context *ot_init_sync_m2(void)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	int rc, i, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < max; i++) {
+		on = ot_kzalloc(sizeof(struct ot_node));
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+		}
+	}
+	WARN_ON(max != sop->pool.oh_nobjs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m2(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+}
+
+/* objpool manipulation for synchronous mode 3 (mixed mode) */
+static struct ot_context *ot_init_sync_m3(void)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	unsigned long size;
+	int rc, i, szobj, nobjs;
+	int max = num_possible_cpus() << 4;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	/* create and initialize objpool as empty (no objects) */
+	nobjs = num_possible_cpus() * 2;
+	rc = objpool_init(&sop->pool, nobjs, max, sizeof(struct ot_node),
+			GFP_KERNEL, sop, ot_init_node, ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	size = sizeof(struct ot_node) * num_possible_cpus() * 4;
+	sop->ptr = ot_vmalloc(size);
+	if (!sop->ptr) {
+		objpool_fini(&sop->pool);
+		return NULL;
+	}
+	sop->size = size;
+	memset(sop->ptr, 0, size);
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		sizeof(struct ot_node), sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+	szobj = ALIGN(sizeof(struct ot_node), sizeof(void *));
+	nobjs += size / szobj;
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < num_possible_cpus() * 2; i++) {
+		on = ot_kzalloc(sizeof(struct ot_node));
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			if (!objpool_add(on, &sop->pool))
+				nobjs++;
+			else
+				ot_kfree(on, sizeof(struct ot_node));
+		}
+	}
+	WARN_ON(nobjs != sop->pool.oh_nobjs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m3(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+}
+
+struct {
+	struct ot_context * (*init)(void);
+	void (*fini)(struct ot_context *sop);
+} g_ot_sync_ops[4] = {
+	{ot_init_sync_m0, ot_fini_sync_m0},
+	{ot_init_sync_m1, ot_fini_sync_m1},
+	{ot_init_sync_m2, ot_fini_sync_m2},
+	{ot_init_sync_m3, ot_fini_sync_m3},
+};
+
+/*
+ * synchronous test cases: performance mode
+ */
+
+static void ot_bulk_sync(struct ot_item *item, int irq)
+{
+	struct ot_node *nods[OT_NR_MAX_BULK];
+	int i;
+
+	for (i = 0; i < item->bulk[irq]; i++)
+		nods[i] = objpool_pop(item->pool);
+
+	if (!irq && (item->delay || !(++(item->niters) & 0x7FFF)))
+		msleep(item->delay);
+
+	while (i-- > 0) {
+		struct ot_node *on = nods[i];
+		if (on) {
+			on->refs++;
+			objpool_push(on, item->pool);
+			item->stat[irq].nhits++;
+		} else {
+			item->stat[irq].nmiss++;
+		}
+	}
+}
+
+static int ot_start_sync(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop;
+	ktime_t start;
+	u64 duration;
+	unsigned long timeout;
+	int cpu, rc;
+
+	/* initialize objpool for syncrhonous testcase */
+	sop = g_ot_sync_ops[ctrl->mode].init();
+	if (!sop)
+		return -ENOMEM;
+
+	/* grab rwsem to block testing threads */
+	down_write(&g_ot_data.start);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		struct task_struct *work;
+
+		ot_init_cpu_item(item, ctrl, &sop->pool, ot_bulk_sync);
+
+		/* skip offline cpus */
+		if (!cpu_online(cpu))
+			continue;
+
+		work = kthread_create_on_node(ot_thread_worker, item,
+				cpu_to_node(cpu), "ot_worker_%d", cpu);
+		if (IS_ERR(work)) {
+			pr_err("failed to create thread for cpu %d\n", cpu);
+		} else {
+			kthread_bind(work, cpu);
+			wake_up_process(work);
+		}
+	}
+
+	/* wait a while to make sure all threads waiting at start line */
+	msleep(20);
+
+	/* in case no threads were created: memory insufficient ? */
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	// sched_set_fifo_low(current);
+
+	/* start objpool testing threads */
+	start = ktime_get();
+	up_write(&g_ot_data.start);
+
+	/* yeild cpu to worker threads for duration ms */
+	timeout = msecs_to_jiffies(ctrl->duration);
+	rc = schedule_timeout_interruptible(timeout);
+
+	/* tell workers threads to quit */
+	atomic_set_release(&g_ot_data.stop, 1);
+
+	/* wait all workers threads finish and quit */
+	wait_for_completion(&g_ot_data.wait);
+	duration = (u64) ktime_us_delta(ktime_get(), start);
+
+	/* cleanup objpool */
+	g_ot_sync_ops[ctrl->mode].fini(sop);
+
+	/* report testing summary and performance results */
+	ot_perf_report(ctrl, duration);
+
+	/* report memory allocation summary */
+	ot_mem_report(ctrl);
+
+	return rc;
+}
+
+/*
+ * asynchronous test cases: pool lifecycle controlled by refcount
+ */
+
+static void ot_fini_async_rcu(struct rcu_head *rcu)
+{
+	struct ot_context *sop = container_of(rcu, struct ot_context, rcu);
+	struct ot_node *on;
+
+	/* here all cpus are aware of the stop event: g_ot_data.stop = 1 */
+	WARN_ON(!atomic_read_acquire(&g_ot_data.stop));
+
+	do {
+		/* release all objects remained in objpool */
+		on = objpool_pop(&sop->pool);
+		if (on && !objpool_is_inslot(on, &sop->pool) &&
+			!objpool_is_inpool(on, &sop->pool)) {
+			/* private object managed by user */
+			WARN_ON(on->data != 0xDEADBEEF);
+			ot_kfree(on, sizeof(struct ot_node));
+		}
+
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&sop->refs)) {
+			objpool_fini(&sop->pool);
+			break;
+		}
+	} while (on);
+
+	complete(&g_ot_data.rcu);
+}
+
+static void ot_fini_async(struct ot_context *sop)
+{
+	/* make sure the stop event is acknowledged by all cores */
+	call_rcu(&sop->rcu, ot_fini_async_rcu);
+}
+
+static struct ot_context *ot_init_async_m0(void)
+{
+	struct ot_context *sop = NULL;
+	int max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	if (objpool_init(&sop->pool, max, max, sizeof(struct ot_node),
+			GFP_KERNEL, sop, ot_init_node, ot_objpool_release)) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON(max != sop->pool.oh_nobjs);
+	refcount_set(&sop->refs, max + 1);
+
+	return sop;
+}
+
+static struct ot_context *ot_init_async_m1(void)
+{
+	struct ot_context *sop = NULL;
+	unsigned long size;
+	int szobj, rc, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	size = sizeof(struct ot_node) * max;
+	sop->ptr = ot_vmalloc(size);
+	sop->size = size;
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	memset(sop->ptr, 0, size);
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		sizeof(struct ot_node), sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+
+	/* calculate total number of objects stored in ptr */
+	szobj = ALIGN(sizeof(struct ot_node), sizeof(void *));
+	WARN_ON(size / szobj != sop->pool.oh_nobjs);
+	refcount_set(&sop->refs, size / szobj + 1);
+
+	return sop;
+}
+
+static struct ot_context *ot_init_async_m2(void)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	int rc, i, nobjs = 0, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < max; i++) {
+		on = ot_kzalloc(sizeof(struct ot_node));
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+			nobjs++;
+		}
+	}
+	WARN_ON(nobjs != sop->pool.oh_nobjs);
+	refcount_set(&sop->refs, nobjs + 1);
+
+	return sop;
+}
+
+/* objpool manipulation for synchronous mode 3 (mixed mode) */
+static struct ot_context *ot_init_async_m3(void)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	unsigned long size;
+	int szobj, nobjs, rc, i, max = num_possible_cpus() << 4;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	/* create and initialize objpool as empty (no objects) */
+	nobjs = num_possible_cpus() * 2;
+	rc = objpool_init(&sop->pool, nobjs, max, sizeof(struct ot_node),
+			GFP_KERNEL, sop, ot_init_node, ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	size = sizeof(struct ot_node) * num_possible_cpus() * 4;
+	sop->ptr = ot_vmalloc(size);
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	sop->size = size;
+	memset(sop->ptr, 0, size);
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		sizeof(struct ot_node), sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+
+	/* calculate total number of objects stored in ptr */
+	szobj = ALIGN(sizeof(struct ot_node), sizeof(void *));
+	nobjs += size / szobj;
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < num_possible_cpus() * 2; i++) {
+		on = ot_kzalloc(sizeof(struct ot_node));
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+			nobjs++;
+		}
+	}
+	WARN_ON(nobjs != sop->pool.oh_nobjs);
+	refcount_set(&sop->refs, nobjs + 1);
+
+	return sop;
+}
+
+struct {
+	struct ot_context * (*init)(void);
+	void (*fini)(struct ot_context *sop);
+} g_ot_async_ops[4] = {
+	{ot_init_async_m0, ot_fini_async},
+	{ot_init_async_m1, ot_fini_async},
+	{ot_init_async_m2, ot_fini_async},
+	{ot_init_async_m3, ot_fini_async},
+};
+
+static void ot_nod_recycle(struct ot_node *on, struct objpool_head *pool,
+			int release)
+{
+	struct ot_context *sop;
+
+	on->refs++;
+
+	if (!release) {
+		/* push object back to opjpool for reuse */
+		objpool_push(on, pool);
+		return;
+	}
+
+	sop = container_of(pool, struct ot_context, pool);
+	WARN_ON(sop != pool->oh_context);
+
+	if (objpool_is_inslot(on, pool)) {
+		/* object is alloced from percpu slots */
+	} else if (objpool_is_inpool(on, pool)) {
+		/* object is alloced from user-manged pool */
+	} else {
+		/* private object managed by user */
+		WARN_ON(on->data != 0xDEADBEEF);
+		ot_kfree(on, sizeof(struct ot_node));
+	}
+
+	/* unref objpool with nod removed forever */
+	if (refcount_dec_and_test(&sop->refs))
+		objpool_fini(pool);
+}
+
+static void ot_bulk_async(struct ot_item *item, int irq)
+{
+	struct ot_node *nods[OT_NR_MAX_BULK];
+	int i, stop;
+
+	for (i = 0; i < item->bulk[irq]; i++)
+		nods[i] = objpool_pop(item->pool);
+
+	if (!irq) {
+		if (item->delay || !(++(item->niters) & 0x7FFF))
+			msleep(item->delay);
+		get_cpu();
+	}
+
+	stop = atomic_read_acquire(&g_ot_data.stop);
+
+	/* drop all objects and deref objpool */
+	while (i-- > 0) {
+		struct ot_node *on = nods[i];
+
+		if (on) {
+			on->refs++;
+			ot_nod_recycle(on, item->pool, stop);
+			item->stat[irq].nhits++;
+		} else {
+			item->stat[irq].nmiss++;
+		}
+	}
+
+	if (!irq)
+		put_cpu();
+}
+
+static int ot_start_async(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop;
+	ktime_t start;
+	u64 duration;
+	unsigned long timeout;
+	int cpu, rc;
+
+	/* initialize objpool for syncrhonous testcase */
+	sop = g_ot_async_ops[ctrl->mode].init();
+	if (!sop)
+		return -ENOMEM;
+
+	/* grab rwsem to block testing threads */
+	down_write(&g_ot_data.start);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		struct task_struct *work;
+
+		ot_init_cpu_item(item, ctrl, &sop->pool, ot_bulk_async);
+
+		/* skip offline cpus */
+		if (!cpu_online(cpu))
+			continue;
+
+		work = kthread_create_on_node(ot_thread_worker, item,
+				cpu_to_node(cpu), "ot_worker_%d", cpu);
+		if (IS_ERR(work)) {
+			pr_err("failed to create thread for cpu %d\n", cpu);
+		} else {
+			kthread_bind(work, cpu);
+			wake_up_process(work);
+		}
+	}
+
+	/* wait a while to make sure all threads waiting at start line */
+	msleep(20);
+
+	/* in case no threads were created: memory insufficient ? */
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	/* start objpool testing threads */
+	start = ktime_get();
+	up_write(&g_ot_data.start);
+
+	/* yeild cpu to worker threads for duration ms */
+	timeout = msecs_to_jiffies(ctrl->duration);
+	rc = schedule_timeout_interruptible(timeout);
+
+	/* tell workers threads to quit */
+	atomic_set_release(&g_ot_data.stop, 1);
+
+	/* do async-finalization */
+	g_ot_async_ops[ctrl->mode].fini(sop);
+
+	/* wait all workers threads finish and quit */
+	wait_for_completion(&g_ot_data.wait);
+	duration = (u64) ktime_us_delta(ktime_get(), start);
+
+	/* assure rcu callback is triggered */
+	wait_for_completion(&g_ot_data.rcu);
+
+	/*
+	 * now we are sure that objpool is finalized either
+	 * by rcu callback or by worker threads
+	 */
+
+	/* report testing summary and performance results */
+	ot_perf_report(ctrl, duration);
+
+	/* report memory allocation summary */
+	ot_mem_report(ctrl);
+
+	return rc;
+}
+
+/*
+ * predefined testing cases:
+ *   4 synchronous cases / 4 overrun cases / 2 async cases
+ *
+ * mode: unsigned int, could be 0/1/2/3, see name
+ * duration: unsigned int, total test time in ms
+ * delay: unsigned int, delay (in ms) between each iteration
+ * bulk_normal: unsigned int, repeat times for thread worker
+ * bulk_irq: unsigned int, repeat times for irq consumer
+ * hrtimer: unsigned long, hrtimer intervnal in ms
+ * name: char *, tag for current test ot_item
+ */
+
+struct ot_ctrl g_ot_sync[] = {
+	{0, 1000, 0,  1,  0,  0, "sync: percpu objpool"},
+	{1, 1000, 0,  1,  0,  0, "sync: user objpool"},
+	{2, 1000, 0,  1,  0,  0, "sync: user objects"},
+	{3, 1000, 0,  1,  0,  0, "sync: mixed pools & objs"},
+};
+
+struct ot_ctrl g_ot_miss[] = {
+	{0, 1000, 0, 16,  0,  0, "sync overrun: percpu objpool"},
+	{1, 1000, 0, 16,  0,  0, "sync overrun: user objpool"},
+	{2, 1000, 0, 16,  0,  0, "sync overrun: user objects"},
+	{3, 1000, 0, 16,  0,  0, "sync overrun: mixed pools & objs"},
+};
+
+struct ot_ctrl g_ot_async[] = {
+	{0, 1000, 4,  8,  8,  6, "async: percpu objpool"},
+	{1, 1000, 4,  8,  8,  6, "async: user objpool"},
+	{2, 1000, 4,  8,  8,  6, "async: user objects"},
+	{3, 1000, 4,  8,  8,  6, "async: mixed pools & objs"},
+};
+
+static int __init ot_mod_init(void)
+{
+	int i;
+
+	ot_init_data(&g_ot_data);
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_sync); i++) {
+		if (ot_start_sync(&g_ot_sync[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_miss); i++) {
+		if (ot_start_sync(&g_ot_miss[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_async); i++) {
+		if (ot_start_async(&g_ot_async[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+out:
+	return -EAGAIN;
+}
+
+static void __exit ot_mod_exit(void)
+{
+}
+
+module_init(ot_mod_init);
+module_exit(ot_mod_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Matt Wu");
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] kprobes,lib: kretprobe scalability improvement
  2022-11-01  1:43 [PATCH v3] kprobes,lib: kretprobe scalability improvement wuqiang
@ 2022-11-01 12:27 ` kernel test robot
  2022-11-02  2:30 ` [PATCH v4] " wuqiang
  1 sibling, 0 replies; 23+ messages in thread
From: kernel test robot @ 2022-11-01 12:27 UTC (permalink / raw)
  To: wuqiang, mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao,
	rostedt, ast, akpm
  Cc: oe-kbuild-all, linux-kernel, mattwu, wuqiang

[-- Attachment #1: Type: text/plain, Size: 3557 bytes --]

Hi wuqiang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.1-rc3 next-20221101]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/wuqiang/kprobes-lib-kretprobe-scalability-improvement/20221101-110242
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20221101014346.150812-1-wuqiang.matt%40bytedance.com
patch subject: [PATCH v3] kprobes,lib: kretprobe scalability improvement
config: x86_64-rhel-8.3-kselftests
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/a0deeba1c316e59b94856c8eda40f6680fd511f8
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review wuqiang/kprobes-lib-kretprobe-scalability-improvement/20221101-110242
        git checkout a0deeba1c316e59b94856c8eda40f6680fd511f8
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash kernel/trace/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   kernel/trace/fprobe.c: In function 'fprobe_init_rethook':
>> kernel/trace/fprobe.c:128:13: warning: unused variable 'i' [-Wunused-variable]
     128 |         int i, size;
         |             ^


vim +/i +128 kernel/trace/fprobe.c

cad9931f64dc7f Masami Hiramatsu 2022-03-15  125  
5b0ab78998e325 Masami Hiramatsu 2022-03-15  126  static int fprobe_init_rethook(struct fprobe *fp, int num)
5b0ab78998e325 Masami Hiramatsu 2022-03-15  127  {
5b0ab78998e325 Masami Hiramatsu 2022-03-15 @128  	int i, size;
5b0ab78998e325 Masami Hiramatsu 2022-03-15  129  
5b0ab78998e325 Masami Hiramatsu 2022-03-15  130  	if (num < 0)
5b0ab78998e325 Masami Hiramatsu 2022-03-15  131  		return -EINVAL;
5b0ab78998e325 Masami Hiramatsu 2022-03-15  132  
5b0ab78998e325 Masami Hiramatsu 2022-03-15  133  	if (!fp->exit_handler) {
5b0ab78998e325 Masami Hiramatsu 2022-03-15  134  		fp->rethook = NULL;
5b0ab78998e325 Masami Hiramatsu 2022-03-15  135  		return 0;
5b0ab78998e325 Masami Hiramatsu 2022-03-15  136  	}
5b0ab78998e325 Masami Hiramatsu 2022-03-15  137  
5b0ab78998e325 Masami Hiramatsu 2022-03-15  138  	/* Initialize rethook if needed */
5b0ab78998e325 Masami Hiramatsu 2022-03-15  139  	size = num * num_possible_cpus() * 2;
5b0ab78998e325 Masami Hiramatsu 2022-03-15  140  	if (size < 0)
5b0ab78998e325 Masami Hiramatsu 2022-03-15  141  		return -E2BIG;
5b0ab78998e325 Masami Hiramatsu 2022-03-15  142  
a0deeba1c316e5 wuqiang          2022-11-01  143  	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, GFP_KERNEL,
a0deeba1c316e5 wuqiang          2022-11-01  144  				sizeof(struct fprobe_rethook_node), size);
a0deeba1c316e5 wuqiang          2022-11-01  145  	if (!fp->rethook)
5b0ab78998e325 Masami Hiramatsu 2022-03-15  146  		return -ENOMEM;
a0deeba1c316e5 wuqiang          2022-11-01  147  
5b0ab78998e325 Masami Hiramatsu 2022-03-15  148  	return 0;
5b0ab78998e325 Masami Hiramatsu 2022-03-15  149  }
5b0ab78998e325 Masami Hiramatsu 2022-03-15  150  

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

[-- Attachment #2: config --]
[-- Type: text/plain, Size: 172575 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 6.1.0-rc3 Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="gcc-11 (Debian 11.3.0-8) 11.3.0"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=110300
CONFIG_CLANG_VERSION=0
CONFIG_AS_IS_GNU=y
CONFIG_AS_VERSION=23900
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=23900
CONFIG_LLD_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_CAN_LINK_STATIC=y
CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
CONFIG_PAHOLE_VERSION=123
CONFIG_CONSTRUCTORS=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_TABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
# CONFIG_WERROR is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_HAVE_KERNEL_ZSTD=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
# CONFIG_KERNEL_ZSTD is not set
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
# CONFIG_WATCH_QUEUE is not set
CONFIG_CROSS_MEMORY_ATTACH=y
# CONFIG_USELIB is not set
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_GENERIC_IRQ_INJECTION=y
CONFIG_HARDIRQS_SW_RESEND=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_SIM=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_IRQ_MSI_IOMMU=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
# end of IRQ subsystem

CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_INIT=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y
CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y
CONFIG_CONTEXT_TRACKING=y
CONFIG_CONTEXT_TRACKING_IDLE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
CONFIG_CONTEXT_TRACKING_USER=y
# CONFIG_CONTEXT_TRACKING_USER_FORCE is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_CLOCKSOURCE_WATCHDOG_MAX_SKEW_US=100
# end of Timers subsystem

CONFIG_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y

#
# BPF subsystem
#
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
CONFIG_USERMODE_DRIVER=y
CONFIG_BPF_PRELOAD=y
CONFIG_BPF_PRELOAD_UMD=y
# CONFIG_BPF_LSM is not set
# end of BPF subsystem

CONFIG_PREEMPT_BUILD=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y
# CONFIG_PREEMPT_DYNAMIC is not set
# CONFIG_SCHED_CORE is not set

#
# CPU/Task time and stats accounting
#
CONFIG_VIRT_CPU_ACCOUNTING=y
CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_SCHED_AVG_IRQ=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
# CONFIG_PSI is not set
# end of CPU/Task time and stats accounting

CONFIG_CPU_ISOLATION=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU_GENERIC=y
CONFIG_TASKS_RCU=y
CONFIG_TASKS_RUDE_RCU=y
CONFIG_TASKS_TRACE_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_RCU_NOCB_CPU=y
# CONFIG_RCU_NOCB_CPU_DEFAULT_ALL is not set
# end of RCU Subsystem

CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_IKHEADERS is not set
CONFIG_LOG_BUF_SHIFT=21
CONFIG_LOG_CPU_MAX_BUF_SHIFT=0
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
# CONFIG_PRINTK_INDEX is not set
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y

#
# Scheduler features
#
# CONFIG_UCLAMP_TASK is not set
# end of Scheduler features

CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_CC_HAS_INT128=y
CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5"
CONFIG_GCC12_NO_ARRAY_BOUNDS=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_NUMA_BALANCING=y
CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
CONFIG_CGROUPS=y
CONFIG_PAGE_COUNTER=y
# CONFIG_CGROUP_FAVOR_DYNMODS is not set
CONFIG_MEMCG=y
CONFIG_MEMCG_KMEM=y
CONFIG_BLK_CGROUP=y
CONFIG_CGROUP_WRITEBACK=y
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CFS_BANDWIDTH=y
CONFIG_RT_GROUP_SCHED=y
CONFIG_CGROUP_PIDS=y
CONFIG_CGROUP_RDMA=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_BPF=y
# CONFIG_CGROUP_MISC is not set
# CONFIG_CGROUP_DEBUG is not set
CONFIG_SOCK_CGROUP_DATA=y
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_TIME_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_CHECKPOINT_RESTORE=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
CONFIG_RD_ZSTD=y
CONFIG_BOOT_CONFIG=y
# CONFIG_BOOT_CONFIG_EMBED is not set
CONFIG_INITRAMFS_PRESERVE_MTIME=y
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_LD_ORPHAN_WARN=y
CONFIG_SYSCTL=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_EXPERT=y
CONFIG_UID16=y
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
CONFIG_FHANDLE=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_IO_URING=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_MEMBARRIER=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_KCMP=y
CONFIG_RSEQ=y
# CONFIG_DEBUG_RSEQ is not set
CONFIG_EMBEDDED=y
CONFIG_HAVE_PERF_EVENTS=y
CONFIG_GUEST_PERF_EVENTS=y
# CONFIG_PC104 is not set

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
# end of Kernel Performance Events And Counters

CONFIG_SYSTEM_DATA_VERIFICATION=y
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
# end of General setup

CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_CSUM=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_NR_GPIO=1024
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_AUDIT_ARCH=y
CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
CONFIG_HAVE_INTEL_TXT=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DYNAMIC_PHYSICAL_MASK=y
CONFIG_PGTABLE_LEVELS=5
CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

#
# Processor type and features
#
CONFIG_SMP=y
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_X2APIC=y
CONFIG_X86_MPPARSE=y
# CONFIG_GOLDFISH is not set
CONFIG_X86_CPU_RESCTRL=y
CONFIG_X86_EXTENDED_PLATFORM=y
# CONFIG_X86_NUMACHIP is not set
# CONFIG_X86_VSMP is not set
CONFIG_X86_UV=y
# CONFIG_X86_GOLDFISH is not set
# CONFIG_X86_INTEL_MID is not set
CONFIG_X86_INTEL_LPSS=y
# CONFIG_X86_AMD_PLATFORM_DEVICE is not set
CONFIG_IOSF_MBI=y
# CONFIG_IOSF_MBI_DEBUG is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
# CONFIG_SCHED_OMIT_FRAME_POINTER is not set
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
CONFIG_PARAVIRT_SPINLOCKS=y
CONFIG_X86_HV_CALLBACK_VECTOR=y
# CONFIG_XEN is not set
CONFIG_KVM_GUEST=y
CONFIG_ARCH_CPUIDLE_HALTPOLL=y
# CONFIG_PVH is not set
CONFIG_PARAVIRT_TIME_ACCOUNTING=y
CONFIG_PARAVIRT_CLOCK=y
# CONFIG_JAILHOUSE_GUEST is not set
# CONFIG_ACRN_GUEST is not set
CONFIG_INTEL_TDX_GUEST=y
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
CONFIG_GENERIC_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_IA32_FEAT_CTL=y
CONFIG_X86_VMX_FEATURE_NAMES=y
CONFIG_PROCESSOR_SELECT=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
# CONFIG_CPU_SUP_HYGON is not set
# CONFIG_CPU_SUP_CENTAUR is not set
# CONFIG_CPU_SUP_ZHAOXIN is not set
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
CONFIG_BOOT_VESA_SUPPORT=y
CONFIG_MAXSMP=y
CONFIG_NR_CPUS_RANGE_BEGIN=8192
CONFIG_NR_CPUS_RANGE_END=8192
CONFIG_NR_CPUS_DEFAULT=8192
CONFIG_NR_CPUS=8192
CONFIG_SCHED_CLUSTER=y
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
CONFIG_SCHED_MC_PRIO=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_X86_MCE=y
CONFIG_X86_MCELOG_LEGACY=y
CONFIG_X86_MCE_INTEL=y
# CONFIG_X86_MCE_AMD is not set
CONFIG_X86_MCE_THRESHOLD=y
CONFIG_X86_MCE_INJECT=m

#
# Performance monitoring
#
CONFIG_PERF_EVENTS_INTEL_UNCORE=m
CONFIG_PERF_EVENTS_INTEL_RAPL=m
CONFIG_PERF_EVENTS_INTEL_CSTATE=m
# CONFIG_PERF_EVENTS_AMD_POWER is not set
# CONFIG_PERF_EVENTS_AMD_UNCORE is not set
# CONFIG_PERF_EVENTS_AMD_BRS is not set
# end of Performance monitoring

CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
CONFIG_X86_IOPL_IOPERM=y
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
# CONFIG_MICROCODE_AMD is not set
CONFIG_MICROCODE_LATE_LOADING=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_X86_5LEVEL=y
CONFIG_X86_DIRECT_GBPAGES=y
# CONFIG_X86_CPA_STATISTICS is not set
CONFIG_X86_MEM_ENCRYPT=y
# CONFIG_AMD_MEM_ENCRYPT is not set
CONFIG_NUMA=y
# CONFIG_AMD_NUMA is not set
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_NUMA_EMU=y
CONFIG_NODES_SHIFT=10
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
# CONFIG_ARCH_MEMORY_PROBE is not set
CONFIG_ARCH_PROC_KCORE_TEXT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_X86_PMEM_LEGACY_DEVICE=y
CONFIG_X86_PMEM_LEGACY=m
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
# CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK is not set
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=1
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_X86_UMIP=y
CONFIG_CC_HAS_IBT=y
# CONFIG_X86_KERNEL_IBT is not set
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
CONFIG_X86_INTEL_TSX_MODE_OFF=y
# CONFIG_X86_INTEL_TSX_MODE_ON is not set
# CONFIG_X86_INTEL_TSX_MODE_AUTO is not set
CONFIG_X86_SGX=y
CONFIG_EFI=y
CONFIG_EFI_STUB=y
CONFIG_EFI_MIXED=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
CONFIG_ARCH_HAS_KEXEC_PURGATORY=y
# CONFIG_KEXEC_SIG is not set
CONFIG_CRASH_DUMP=y
CONFIG_KEXEC_JUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_PHYSICAL_ALIGN=0x200000
CONFIG_DYNAMIC_MEMORY_LAYOUT=y
CONFIG_HOTPLUG_CPU=y
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
# CONFIG_COMPAT_VDSO is not set
# CONFIG_LEGACY_VSYSCALL_XONLY is not set
CONFIG_LEGACY_VSYSCALL_NONE=y
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
# CONFIG_STRICT_SIGALTSTACK_SIZE is not set
CONFIG_HAVE_LIVEPATCH=y
CONFIG_LIVEPATCH=y
# end of Processor type and features

CONFIG_CC_HAS_SLS=y
CONFIG_CC_HAS_RETURN_THUNK=y
CONFIG_SPECULATION_MITIGATIONS=y
CONFIG_PAGE_TABLE_ISOLATION=y
# CONFIG_RETPOLINE is not set
CONFIG_CPU_IBPB_ENTRY=y
CONFIG_CPU_IBRS_ENTRY=y
# CONFIG_SLS is not set
CONFIG_ARCH_HAS_ADD_PAGES=y
CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y

#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
# CONFIG_SUSPEND_SKIP_SYNC is not set
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
CONFIG_HIBERNATION_SNAPSHOT_DEV=y
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_USERSPACE_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM=y
CONFIG_PM_DEBUG=y
# CONFIG_PM_ADVANCED_DEBUG is not set
# CONFIG_PM_TEST_SUSPEND is not set
CONFIG_PM_SLEEP_DEBUG=y
# CONFIG_DPM_WATCHDOG is not set
# CONFIG_PM_TRACE_RTC is not set
CONFIG_PM_CLK=y
# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
# CONFIG_ENERGY_MODEL is not set
CONFIG_ARCH_SUPPORTS_ACPI=y
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
# CONFIG_ACPI_DEBUGGER is not set
CONFIG_ACPI_SPCR_TABLE=y
# CONFIG_ACPI_FPDT is not set
CONFIG_ACPI_LPIT=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
CONFIG_ACPI_EC_DEBUGFS=m
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_TAD=m
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_ACPI_PROCESSOR_CSTATE=y
CONFIG_ACPI_PROCESSOR_IDLE=y
CONFIG_ACPI_CPPC_LIB=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_IPMI=m
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_PROCESSOR_AGGREGATOR=m
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_PLATFORM_PROFILE=m
CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
CONFIG_ACPI_TABLE_UPGRADE=y
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_PCI_SLOT=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_HOTPLUG_MEMORY=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
CONFIG_ACPI_SBS=m
CONFIG_ACPI_HED=y
# CONFIG_ACPI_CUSTOM_METHOD is not set
CONFIG_ACPI_BGRT=y
# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
CONFIG_ACPI_NFIT=m
# CONFIG_NFIT_SECURITY_DEBUG is not set
CONFIG_ACPI_NUMA=y
# CONFIG_ACPI_HMAT is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
CONFIG_ACPI_APEI=y
CONFIG_ACPI_APEI_GHES=y
CONFIG_ACPI_APEI_PCIEAER=y
CONFIG_ACPI_APEI_MEMORY_FAILURE=y
CONFIG_ACPI_APEI_EINJ=m
# CONFIG_ACPI_APEI_ERST_DEBUG is not set
# CONFIG_ACPI_DPTF is not set
CONFIG_ACPI_WATCHDOG=y
CONFIG_ACPI_EXTLOG=m
CONFIG_ACPI_ADXL=y
# CONFIG_ACPI_CONFIGFS is not set
# CONFIG_ACPI_PFRUT is not set
CONFIG_ACPI_PCC=y
CONFIG_PMIC_OPREGION=y
CONFIG_ACPI_PRMT=y
CONFIG_X86_PM_TIMER=y

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y

#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
# CONFIG_X86_PCC_CPUFREQ is not set
# CONFIG_X86_AMD_PSTATE is not set
# CONFIG_X86_AMD_PSTATE_UT is not set
CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_X86_ACPI_CPUFREQ_CPB=y
CONFIG_X86_POWERNOW_K8=y
# CONFIG_X86_AMD_FREQ_SENSITIVITY is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
CONFIG_X86_P4_CLOCKMOD=m

#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=m
# end of CPU Frequency scaling

#
# CPU Idle
#
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y
# CONFIG_CPU_IDLE_GOV_TEO is not set
# CONFIG_CPU_IDLE_GOV_HALTPOLL is not set
CONFIG_HALTPOLL_CPUIDLE=y
# end of CPU Idle

CONFIG_INTEL_IDLE=y
# end of Power management and ACPI options

#
# Bus options (PCI etc.)
#
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_MMCONF_FAM10H=y
# CONFIG_PCI_CNB20LE_QUIRK is not set
# CONFIG_ISA_BUS is not set
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
# end of Bus options (PCI etc.)

#
# Binary Emulations
#
CONFIG_IA32_EMULATION=y
# CONFIG_X86_X32_ABI is not set
CONFIG_COMPAT_32=y
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
# end of Binary Emulations

CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_PFNCACHE=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_DIRTY_RING=y
CONFIG_HAVE_KVM_DIRTY_RING_TSO=y
CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
CONFIG_KVM_COMPAT=y
CONFIG_HAVE_KVM_IRQ_BYPASS=y
CONFIG_HAVE_KVM_NO_POLL=y
CONFIG_KVM_XFER_TO_GUEST_WORK=y
CONFIG_HAVE_KVM_PM_NOTIFIER=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
# CONFIG_KVM_WERROR is not set
CONFIG_KVM_INTEL=m
# CONFIG_X86_SGX_KVM is not set
# CONFIG_KVM_AMD is not set
# CONFIG_KVM_XEN is not set
CONFIG_AS_AVX512=y
CONFIG_AS_SHA1_NI=y
CONFIG_AS_SHA256_NI=y
CONFIG_AS_TPAUSE=y

#
# General architecture-dependent options
#
CONFIG_CRASH_CORE=y
CONFIG_KEXEC_CORE=y
CONFIG_HAVE_IMA_KEXEC=y
CONFIG_HOTPLUG_SMT=y
CONFIG_GENERIC_ENTRY=y
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
# CONFIG_STATIC_KEYS_SELFTEST is not set
# CONFIG_STATIC_CALL_SELFTEST is not set
CONFIG_OPTPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_UPROBES=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_KRETPROBES=y
CONFIG_KRETPROBE_ON_RETHOOK=y
CONFIG_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y
CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
CONFIG_HAVE_NMI=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_ARCH_HAS_SET_DIRECT_MAP=y
CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
CONFIG_ARCH_WANTS_NO_INSTR=y
CONFIG_HAVE_ASM_MODVERSIONS=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_RSEQ=y
CONFIG_HAVE_RUST=y
CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
CONFIG_MMU_GATHER_TABLE_FREE=y
CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
CONFIG_MMU_GATHER_MERGE_VMAS=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP=y
CONFIG_SECCOMP_FILTER=y
# CONFIG_SECCOMP_CACHE_DEBUG is not set
CONFIG_HAVE_ARCH_STACKLEAK=y
CONFIG_HAVE_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
CONFIG_LTO_NONE=y
CONFIG_ARCH_SUPPORTS_CFI_CLANG=y
CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
CONFIG_HAVE_CONTEXT_TRACKING_USER=y
CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_MOVE_PUD=y
CONFIG_HAVE_MOVE_PMD=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
CONFIG_SOFTIRQ_ON_OWN_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_HAVE_EXIT_THREAD=y
CONFIG_ARCH_MMAP_RND_BITS=28
CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES=y
CONFIG_PAGE_SIZE_LESS_THAN_64KB=y
CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
CONFIG_HAVE_OBJTOOL=y
CONFIG_HAVE_JUMP_LABEL_HACK=y
CONFIG_HAVE_NOINSTR_HACK=y
CONFIG_HAVE_NOINSTR_VALIDATION=y
CONFIG_HAVE_UACCESS_VALIDATION=y
CONFIG_HAVE_STACK_VALIDATION=y
CONFIG_HAVE_RELIABLE_STACKTRACE=y
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y
CONFIG_COMPAT_32BIT_TIME=y
CONFIG_HAVE_ARCH_VMAP_STACK=y
CONFIG_VMAP_STACK=y
CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y
CONFIG_RANDOMIZE_KSTACK_OFFSET=y
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y
CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
CONFIG_STRICT_MODULE_RWX=y
CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y
CONFIG_ARCH_USE_MEMREMAP_PROT=y
# CONFIG_LOCK_EVENT_COUNTS is not set
CONFIG_ARCH_HAS_MEM_ENCRYPT=y
CONFIG_ARCH_HAS_CC_PLATFORM=y
CONFIG_HAVE_STATIC_CALL=y
CONFIG_HAVE_STATIC_CALL_INLINE=y
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y
CONFIG_ARCH_HAS_ELFCORE_COMPAT=y
CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH=y
CONFIG_DYNAMIC_SIGFRAME=y
CONFIG_HAVE_ARCH_NODE_DEV_GROUP=y
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y

#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# end of GCOV-based kernel profiling

CONFIG_HAVE_GCC_PLUGINS=y
CONFIG_GCC_PLUGINS=y
# CONFIG_GCC_PLUGIN_LATENT_ENTROPY is not set
# end of General architecture-dependent options

CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULE_SIG_FORMAT=y
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODULE_UNLOAD_TAINT_TRACKING is not set
CONFIG_MODVERSIONS=y
CONFIG_ASM_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_MODULE_SIG=y
# CONFIG_MODULE_SIG_FORCE is not set
CONFIG_MODULE_SIG_ALL=y
# CONFIG_MODULE_SIG_SHA1 is not set
# CONFIG_MODULE_SIG_SHA224 is not set
CONFIG_MODULE_SIG_SHA256=y
# CONFIG_MODULE_SIG_SHA384 is not set
# CONFIG_MODULE_SIG_SHA512 is not set
CONFIG_MODULE_SIG_HASH="sha256"
CONFIG_MODULE_COMPRESS_NONE=y
# CONFIG_MODULE_COMPRESS_GZIP is not set
# CONFIG_MODULE_COMPRESS_XZ is not set
# CONFIG_MODULE_COMPRESS_ZSTD is not set
# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
CONFIG_MODPROBE_PATH="/sbin/modprobe"
# CONFIG_TRIM_UNUSED_KSYMS is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLOCK_LEGACY_AUTOLOAD=y
CONFIG_BLK_CGROUP_RWSTAT=y
CONFIG_BLK_DEV_BSG_COMMON=y
CONFIG_BLK_ICQ=y
CONFIG_BLK_DEV_BSGLIB=y
CONFIG_BLK_DEV_INTEGRITY=y
CONFIG_BLK_DEV_INTEGRITY_T10=m
# CONFIG_BLK_DEV_ZONED is not set
CONFIG_BLK_DEV_THROTTLING=y
# CONFIG_BLK_DEV_THROTTLING_LOW is not set
CONFIG_BLK_WBT=y
CONFIG_BLK_WBT_MQ=y
CONFIG_BLK_CGROUP_IOLATENCY=y
# CONFIG_BLK_CGROUP_IOCOST is not set
# CONFIG_BLK_CGROUP_IOPRIO is not set
CONFIG_BLK_DEBUG_FS=y
# CONFIG_BLK_SED_OPAL is not set
# CONFIG_BLK_INLINE_ENCRYPTION is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_AIX_PARTITION is not set
CONFIG_OSF_PARTITION=y
CONFIG_AMIGA_PARTITION=y
# CONFIG_ATARI_PARTITION is not set
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
# CONFIG_LDM_PARTITION is not set
CONFIG_SGI_PARTITION=y
# CONFIG_ULTRIX_PARTITION is not set
CONFIG_SUN_PARTITION=y
CONFIG_KARMA_PARTITION=y
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
# CONFIG_CMDLINE_PARTITION is not set
# end of Partition Types

CONFIG_BLOCK_COMPAT=y
CONFIG_BLK_MQ_PCI=y
CONFIG_BLK_MQ_VIRTIO=y
CONFIG_BLK_PM=y
CONFIG_BLOCK_HOLDER_DEPRECATED=y
CONFIG_BLK_MQ_STACKING=y

#
# IO Schedulers
#
CONFIG_MQ_IOSCHED_DEADLINE=y
CONFIG_MQ_IOSCHED_KYBER=y
CONFIG_IOSCHED_BFQ=y
CONFIG_BFQ_GROUP_IOSCHED=y
# CONFIG_BFQ_CGROUP_DEBUG is not set
# end of IO Schedulers

CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_PADATA=y
CONFIG_ASN1=y
CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y
CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
CONFIG_FREEZER=y

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
CONFIG_BINFMT_MISC=y
CONFIG_COREDUMP=y
# end of Executable file formats

#
# Memory Management options
#
CONFIG_ZPOOL=y
CONFIG_SWAP=y
CONFIG_ZSWAP=y
# CONFIG_ZSWAP_DEFAULT_ON is not set
# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_DEFLATE is not set
CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZO=y
# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_842 is not set
# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4 is not set
# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4HC is not set
# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_ZSTD is not set
CONFIG_ZSWAP_COMPRESSOR_DEFAULT="lzo"
CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y
# CONFIG_ZSWAP_ZPOOL_DEFAULT_Z3FOLD is not set
# CONFIG_ZSWAP_ZPOOL_DEFAULT_ZSMALLOC is not set
CONFIG_ZSWAP_ZPOOL_DEFAULT="zbud"
CONFIG_ZBUD=y
# CONFIG_Z3FOLD is not set
CONFIG_ZSMALLOC=y
CONFIG_ZSMALLOC_STAT=y

#
# SLAB allocator options
#
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_SLAB_MERGE_DEFAULT=y
CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_SLAB_FREELIST_HARDENED=y
# CONFIG_SLUB_STATS is not set
CONFIG_SLUB_CPU_PARTIAL=y
# end of SLAB allocator options

CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
# CONFIG_COMPAT_BRK is not set
CONFIG_SPARSEMEM=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_FAST_GUP=y
CONFIG_NUMA_KEEP_MEMINFO=y
CONFIG_MEMORY_ISOLATION=y
CONFIG_EXCLUSIVE_SYSTEM_RAM=y
CONFIG_HAVE_BOOTMEM_INFO_NODE=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_MEMORY_HOTPLUG=y
# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_MHP_MEMMAP_ON_MEMORY=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_MEMORY_BALLOON=y
CONFIG_BALLOON_COMPACTION=y
CONFIG_COMPACTION=y
CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1
CONFIG_PAGE_REPORTING=y
CONFIG_MIGRATION=y
CONFIG_DEVICE_MIGRATION=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_ARCH_ENABLE_THP_MIGRATION=y
CONFIG_CONTIG_ALLOC=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_MMU_NOTIFIER=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
CONFIG_MEMORY_FAILURE=y
CONFIG_HWPOISON_INJECT=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ARCH_WANTS_THP_SWAP=y
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_THP_SWAP=y
# CONFIG_READ_ONLY_THP_FOR_FS is not set
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_FRONTSWAP=y
CONFIG_CMA=y
# CONFIG_CMA_DEBUG is not set
# CONFIG_CMA_DEBUGFS is not set
# CONFIG_CMA_SYSFS is not set
CONFIG_CMA_AREAS=7
CONFIG_MEM_SOFT_DIRTY=y
CONFIG_GENERIC_EARLY_IOREMAP=y
CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
CONFIG_PAGE_IDLE_FLAG=y
CONFIG_IDLE_PAGE_TRACKING=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y
CONFIG_ARCH_HAS_PTE_DEVMAP=y
CONFIG_ARCH_HAS_ZONE_DMA_SET=y
CONFIG_ZONE_DMA=y
CONFIG_ZONE_DMA32=y
CONFIG_ZONE_DEVICE=y
CONFIG_HMM_MIRROR=y
CONFIG_GET_FREE_REGION=y
CONFIG_DEVICE_PRIVATE=y
CONFIG_VMAP_PFN=y
CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
CONFIG_ARCH_HAS_PKEYS=y
CONFIG_VM_EVENT_COUNTERS=y
# CONFIG_PERCPU_STATS is not set
CONFIG_GUP_TEST=y
CONFIG_ARCH_HAS_PTE_SPECIAL=y
# CONFIG_ANON_VMA_NAME is not set
CONFIG_USERFAULTFD=y
CONFIG_HAVE_ARCH_USERFAULTFD_WP=y
CONFIG_HAVE_ARCH_USERFAULTFD_MINOR=y
CONFIG_PTE_MARKER=y
CONFIG_PTE_MARKER_UFFD_WP=y
# CONFIG_LRU_GEN is not set

#
# Data Access Monitoring
#
CONFIG_DAMON=y
CONFIG_DAMON_VADDR=y
CONFIG_DAMON_PADDR=y
# CONFIG_DAMON_SYSFS is not set
CONFIG_DAMON_DBGFS=y
# CONFIG_DAMON_RECLAIM is not set
# CONFIG_DAMON_LRU_SORT is not set
# end of Data Access Monitoring
# end of Memory Management options

CONFIG_NET=y
CONFIG_NET_INGRESS=y
CONFIG_NET_EGRESS=y
CONFIG_NET_REDIRECT=y
CONFIG_SKB_EXTENSIONS=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=m
CONFIG_UNIX=y
CONFIG_UNIX_SCM=y
CONFIG_AF_UNIX_OOB=y
CONFIG_UNIX_DIAG=m
CONFIG_TLS=m
CONFIG_TLS_DEVICE=y
# CONFIG_TLS_TOE is not set
CONFIG_XFRM=y
CONFIG_XFRM_OFFLOAD=y
CONFIG_XFRM_ALGO=y
CONFIG_XFRM_USER=y
# CONFIG_XFRM_USER_COMPAT is not set
# CONFIG_XFRM_INTERFACE is not set
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
CONFIG_XFRM_STATISTICS=y
CONFIG_XFRM_AH=m
CONFIG_XFRM_ESP=m
CONFIG_XFRM_IPCOMP=m
CONFIG_NET_KEY=m
CONFIG_NET_KEY_MIGRATE=y
CONFIG_XDP_SOCKETS=y
CONFIG_XDP_SOCKETS_DIAG=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_FIB_TRIE_STATS=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_ROUTE_CLASSID=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
# CONFIG_IP_PNP_BOOTP is not set
# CONFIG_IP_PNP_RARP is not set
CONFIG_NET_IPIP=y
CONFIG_NET_IPGRE_DEMUX=y
CONFIG_NET_IP_TUNNEL=y
CONFIG_NET_IPGRE=y
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE_COMMON=y
CONFIG_IP_MROUTE=y
CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_SYN_COOKIES=y
CONFIG_NET_IPVTI=m
CONFIG_NET_UDP_TUNNEL=y
CONFIG_NET_FOU=y
CONFIG_NET_FOU_IP_TUNNELS=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_ESP_OFFLOAD=m
# CONFIG_INET_ESPINTCP is not set
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=y
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
CONFIG_INET_UDP_DIAG=m
CONFIG_INET_RAW_DIAG=m
# CONFIG_INET_DIAG_DESTROY is not set
CONFIG_TCP_CONG_ADVANCED=y
CONFIG_TCP_CONG_BIC=m
CONFIG_TCP_CONG_CUBIC=y
CONFIG_TCP_CONG_WESTWOOD=m
CONFIG_TCP_CONG_HTCP=m
CONFIG_TCP_CONG_HSTCP=m
CONFIG_TCP_CONG_HYBLA=m
CONFIG_TCP_CONG_VEGAS=m
CONFIG_TCP_CONG_NV=m
CONFIG_TCP_CONG_SCALABLE=m
CONFIG_TCP_CONG_LP=m
CONFIG_TCP_CONG_VENO=m
CONFIG_TCP_CONG_YEAH=m
CONFIG_TCP_CONG_ILLINOIS=m
CONFIG_TCP_CONG_DCTCP=m
# CONFIG_TCP_CONG_CDG is not set
CONFIG_TCP_CONG_BBR=m
# CONFIG_DEFAULT_CUBIC is not set
CONFIG_DEFAULT_RENO=y
CONFIG_DEFAULT_TCP_CONG="reno"
CONFIG_TCP_MD5SIG=y
CONFIG_IPV6=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
CONFIG_IPV6_OPTIMISTIC_DAD=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_ESP_OFFLOAD=m
# CONFIG_INET6_ESPINTCP is not set
CONFIG_INET6_IPCOMP=m
CONFIG_IPV6_MIP6=y
# CONFIG_IPV6_ILA is not set
CONFIG_INET6_XFRM_TUNNEL=m
CONFIG_INET6_TUNNEL=y
CONFIG_IPV6_VTI=m
CONFIG_IPV6_SIT=m
CONFIG_IPV6_SIT_6RD=y
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_IPV6_TUNNEL=y
CONFIG_IPV6_GRE=y
CONFIG_IPV6_FOU=y
CONFIG_IPV6_FOU_TUNNEL=y
CONFIG_IPV6_MULTIPLE_TABLES=y
CONFIG_IPV6_SUBTREES=y
CONFIG_IPV6_MROUTE=y
CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y
CONFIG_IPV6_PIMSM_V2=y
CONFIG_IPV6_SEG6_LWTUNNEL=y
# CONFIG_IPV6_SEG6_HMAC is not set
CONFIG_IPV6_SEG6_BPF=y
# CONFIG_IPV6_RPL_LWTUNNEL is not set
CONFIG_IPV6_IOAM6_LWTUNNEL=y
CONFIG_NETLABEL=y
CONFIG_MPTCP=y
CONFIG_INET_MPTCP_DIAG=m
CONFIG_MPTCP_IPV6=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NET_PTP_CLASSIFY=y
CONFIG_NETWORK_PHY_TIMESTAMPING=y
CONFIG_NETFILTER=y
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=m

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_EGRESS=y
CONFIG_NETFILTER_SKIP_EGRESS=y
CONFIG_NETFILTER_NETLINK=y
CONFIG_NETFILTER_FAMILY_BRIDGE=y
CONFIG_NETFILTER_FAMILY_ARP=y
# CONFIG_NETFILTER_NETLINK_HOOK is not set
# CONFIG_NETFILTER_NETLINK_ACCT is not set
CONFIG_NETFILTER_NETLINK_QUEUE=y
CONFIG_NETFILTER_NETLINK_LOG=y
CONFIG_NETFILTER_NETLINK_OSF=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_LOG_SYSLOG=m
CONFIG_NETFILTER_CONNCOUNT=m
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_ZONES=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_EVENTS=y
CONFIG_NF_CONNTRACK_TIMEOUT=y
CONFIG_NF_CONNTRACK_TIMESTAMP=y
CONFIG_NF_CONNTRACK_LABELS=y
CONFIG_NF_CT_PROTO_DCCP=y
CONFIG_NF_CT_PROTO_GRE=y
CONFIG_NF_CT_PROTO_SCTP=y
CONFIG_NF_CT_PROTO_UDPLITE=y
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_BROADCAST=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
CONFIG_NF_CONNTRACK_SNMP=m
CONFIG_NF_CONNTRACK_PPTP=m
CONFIG_NF_CONNTRACK_SANE=m
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
CONFIG_NF_CT_NETLINK_TIMEOUT=m
CONFIG_NF_CT_NETLINK_HELPER=m
CONFIG_NETFILTER_NETLINK_GLUE_CT=y
CONFIG_NF_NAT=m
CONFIG_NF_NAT_AMANDA=m
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_SIP=m
CONFIG_NF_NAT_TFTP=m
CONFIG_NF_NAT_REDIRECT=y
CONFIG_NF_NAT_MASQUERADE=y
CONFIG_NETFILTER_SYNPROXY=m
CONFIG_NF_TABLES=m
CONFIG_NF_TABLES_INET=y
CONFIG_NF_TABLES_NETDEV=y
CONFIG_NFT_NUMGEN=m
CONFIG_NFT_CT=m
CONFIG_NFT_FLOW_OFFLOAD=m
CONFIG_NFT_CONNLIMIT=m
CONFIG_NFT_LOG=m
CONFIG_NFT_LIMIT=m
CONFIG_NFT_MASQ=m
CONFIG_NFT_REDIR=m
CONFIG_NFT_NAT=m
# CONFIG_NFT_TUNNEL is not set
CONFIG_NFT_OBJREF=m
CONFIG_NFT_QUEUE=m
CONFIG_NFT_QUOTA=m
CONFIG_NFT_REJECT=m
CONFIG_NFT_REJECT_INET=m
CONFIG_NFT_COMPAT=m
CONFIG_NFT_HASH=m
CONFIG_NFT_FIB=m
CONFIG_NFT_FIB_INET=m
# CONFIG_NFT_XFRM is not set
CONFIG_NFT_SOCKET=m
# CONFIG_NFT_OSF is not set
CONFIG_NFT_TPROXY=m
CONFIG_NFT_SYNPROXY=m
CONFIG_NF_DUP_NETDEV=m
CONFIG_NFT_DUP_NETDEV=m
CONFIG_NFT_FWD_NETDEV=m
CONFIG_NFT_FIB_NETDEV=m
# CONFIG_NFT_REJECT_NETDEV is not set
CONFIG_NF_FLOW_TABLE_INET=m
CONFIG_NF_FLOW_TABLE=m
# CONFIG_NF_FLOW_TABLE_PROCFS is not set
CONFIG_NETFILTER_XTABLES=y
CONFIG_NETFILTER_XTABLES_COMPAT=y

#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=m
CONFIG_NETFILTER_XT_CONNMARK=m
CONFIG_NETFILTER_XT_SET=m

#
# Xtables targets
#
CONFIG_NETFILTER_XT_TARGET_AUDIT=m
CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
CONFIG_NETFILTER_XT_TARGET_CT=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_HL=m
CONFIG_NETFILTER_XT_TARGET_HMARK=m
CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
# CONFIG_NETFILTER_XT_TARGET_LED is not set
CONFIG_NETFILTER_XT_TARGET_LOG=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_NAT=m
CONFIG_NETFILTER_XT_TARGET_NETMAP=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
CONFIG_NETFILTER_XT_TARGET_RATEEST=m
CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m
CONFIG_NETFILTER_XT_TARGET_TEE=m
CONFIG_NETFILTER_XT_TARGET_TPROXY=m
CONFIG_NETFILTER_XT_TARGET_TRACE=m
CONFIG_NETFILTER_XT_TARGET_SECMARK=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m

#
# Xtables matches
#
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
CONFIG_NETFILTER_XT_MATCH_BPF=y
CONFIG_NETFILTER_XT_MATCH_CGROUP=m
CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_CPU=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ECN=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
CONFIG_NETFILTER_XT_MATCH_HL=m
# CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set
CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
CONFIG_NETFILTER_XT_MATCH_IPVS=m
# CONFIG_NETFILTER_XT_MATCH_L2TP is not set
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
# CONFIG_NETFILTER_XT_MATCH_NFACCT is not set
CONFIG_NETFILTER_XT_MATCH_OSF=m
CONFIG_NETFILTER_XT_MATCH_OWNER=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_RATEEST=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_RECENT=m
CONFIG_NETFILTER_XT_MATCH_SCTP=m
CONFIG_NETFILTER_XT_MATCH_SOCKET=m
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=y
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
# CONFIG_NETFILTER_XT_MATCH_TIME is not set
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
# end of Core Netfilter Configuration

CONFIG_IP_SET=m
CONFIG_IP_SET_MAX=256
CONFIG_IP_SET_BITMAP_IP=m
CONFIG_IP_SET_BITMAP_IPMAC=m
CONFIG_IP_SET_BITMAP_PORT=m
CONFIG_IP_SET_HASH_IP=m
CONFIG_IP_SET_HASH_IPMARK=m
CONFIG_IP_SET_HASH_IPPORT=m
CONFIG_IP_SET_HASH_IPPORTIP=m
CONFIG_IP_SET_HASH_IPPORTNET=m
CONFIG_IP_SET_HASH_IPMAC=m
CONFIG_IP_SET_HASH_MAC=m
CONFIG_IP_SET_HASH_NETPORTNET=m
CONFIG_IP_SET_HASH_NET=m
CONFIG_IP_SET_HASH_NETNET=m
CONFIG_IP_SET_HASH_NETPORT=m
CONFIG_IP_SET_HASH_NETIFACE=m
CONFIG_IP_SET_LIST_SET=m
CONFIG_IP_VS=m
CONFIG_IP_VS_IPV6=y
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_AH_ESP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y
CONFIG_IP_VS_PROTO_SCTP=y

#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_FO=m
CONFIG_IP_VS_OVF=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
# CONFIG_IP_VS_MH is not set
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m
# CONFIG_IP_VS_TWOS is not set

#
# IPVS SH scheduler
#
CONFIG_IP_VS_SH_TAB_BITS=8

#
# IPVS MH scheduler
#
CONFIG_IP_VS_MH_TAB_INDEX=12

#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m
CONFIG_IP_VS_NFCT=y
CONFIG_IP_VS_PE_SIP=m

#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_NF_SOCKET_IPV4=m
CONFIG_NF_TPROXY_IPV4=m
CONFIG_NF_TABLES_IPV4=y
CONFIG_NFT_REJECT_IPV4=m
CONFIG_NFT_DUP_IPV4=m
CONFIG_NFT_FIB_IPV4=m
CONFIG_NF_TABLES_ARP=y
CONFIG_NF_DUP_IPV4=m
CONFIG_NF_LOG_ARP=m
CONFIG_NF_LOG_IPV4=m
CONFIG_NF_REJECT_IPV4=m
CONFIG_NF_NAT_SNMP_BASIC=m
CONFIG_NF_NAT_PPTP=m
CONFIG_NF_NAT_H323=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_RPFILTER=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_SYNPROXY=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_MANGLE=m
# CONFIG_IP_NF_TARGET_CLUSTERIP is not set
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_SECURITY=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
# end of IP: Netfilter Configuration

#
# IPv6: Netfilter Configuration
#
CONFIG_NF_SOCKET_IPV6=m
CONFIG_NF_TPROXY_IPV6=m
CONFIG_NF_TABLES_IPV6=y
CONFIG_NFT_REJECT_IPV6=m
CONFIG_NFT_DUP_IPV6=m
CONFIG_NFT_FIB_IPV6=m
CONFIG_NF_DUP_IPV6=m
CONFIG_NF_REJECT_IPV6=m
CONFIG_NF_LOG_IPV6=m
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_AH=m
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_MH=m
CONFIG_IP6_NF_MATCH_RPFILTER=m
CONFIG_IP6_NF_MATCH_RT=m
# CONFIG_IP6_NF_MATCH_SRH is not set
# CONFIG_IP6_NF_TARGET_HL is not set
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_REJECT=m
CONFIG_IP6_NF_TARGET_SYNPROXY=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_RAW=m
CONFIG_IP6_NF_SECURITY=m
CONFIG_IP6_NF_NAT=m
CONFIG_IP6_NF_TARGET_MASQUERADE=m
CONFIG_IP6_NF_TARGET_NPT=m
# end of IPv6: Netfilter Configuration

CONFIG_NF_DEFRAG_IPV6=m
CONFIG_NF_TABLES_BRIDGE=m
# CONFIG_NFT_BRIDGE_META is not set
CONFIG_NFT_BRIDGE_REJECT=m
# CONFIG_NF_CONNTRACK_BRIDGE is not set
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_IP6=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_NFLOG=m
CONFIG_BPFILTER=y
CONFIG_BPFILTER_UMH=m
# CONFIG_IP_DCCP is not set
CONFIG_IP_SCTP=m
# CONFIG_SCTP_DBG_OBJCNT is not set
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set
CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1=y
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE is not set
CONFIG_SCTP_COOKIE_HMAC_MD5=y
CONFIG_SCTP_COOKIE_HMAC_SHA1=y
CONFIG_INET_SCTP_DIAG=m
# CONFIG_RDS is not set
CONFIG_TIPC=m
CONFIG_TIPC_MEDIA_UDP=y
CONFIG_TIPC_CRYPTO=y
CONFIG_TIPC_DIAG=m
CONFIG_ATM=m
CONFIG_ATM_CLIP=m
# CONFIG_ATM_CLIP_NO_ICMP is not set
CONFIG_ATM_LANE=m
# CONFIG_ATM_MPOA is not set
CONFIG_ATM_BR2684=m
# CONFIG_ATM_BR2684_IPFILTER is not set
CONFIG_L2TP=m
CONFIG_L2TP_DEBUGFS=m
CONFIG_L2TP_V3=y
CONFIG_L2TP_IP=m
CONFIG_L2TP_ETH=m
CONFIG_STP=y
CONFIG_GARP=y
CONFIG_MRP=y
CONFIG_BRIDGE=y
CONFIG_BRIDGE_IGMP_SNOOPING=y
CONFIG_BRIDGE_VLAN_FILTERING=y
# CONFIG_BRIDGE_MRP is not set
# CONFIG_BRIDGE_CFM is not set
# CONFIG_NET_DSA is not set
CONFIG_VLAN_8021Q=y
CONFIG_VLAN_8021Q_GVRP=y
CONFIG_VLAN_8021Q_MVRP=y
CONFIG_LLC=y
# CONFIG_LLC2 is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
CONFIG_6LOWPAN=m
# CONFIG_6LOWPAN_DEBUGFS is not set
# CONFIG_6LOWPAN_NHC is not set
# CONFIG_IEEE802154 is not set
CONFIG_NET_SCHED=y

#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_HFSC=m
CONFIG_NET_SCH_ATM=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_MULTIQ=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFB=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_CBS=m
CONFIG_NET_SCH_ETF=m
CONFIG_NET_SCH_TAPRIO=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_NETEM=y
CONFIG_NET_SCH_DRR=m
CONFIG_NET_SCH_MQPRIO=m
CONFIG_NET_SCH_SKBPRIO=m
CONFIG_NET_SCH_CHOKE=m
CONFIG_NET_SCH_QFQ=m
CONFIG_NET_SCH_CODEL=m
CONFIG_NET_SCH_FQ_CODEL=y
CONFIG_NET_SCH_CAKE=m
CONFIG_NET_SCH_FQ=m
CONFIG_NET_SCH_HHF=m
CONFIG_NET_SCH_PIE=m
CONFIG_NET_SCH_FQ_PIE=m
CONFIG_NET_SCH_INGRESS=y
CONFIG_NET_SCH_PLUG=m
CONFIG_NET_SCH_ETS=m
CONFIG_NET_SCH_DEFAULT=y
# CONFIG_DEFAULT_FQ is not set
# CONFIG_DEFAULT_CODEL is not set
CONFIG_DEFAULT_FQ_CODEL=y
# CONFIG_DEFAULT_FQ_PIE is not set
# CONFIG_DEFAULT_SFQ is not set
# CONFIG_DEFAULT_PFIFO_FAST is not set
CONFIG_DEFAULT_NET_SCH="fq_codel"

#
# Classification
#
CONFIG_NET_CLS=y
CONFIG_NET_CLS_BASIC=m
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_CLS_U32_PERF=y
CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_FLOW=m
CONFIG_NET_CLS_CGROUP=y
CONFIG_NET_CLS_BPF=m
CONFIG_NET_CLS_FLOWER=m
CONFIG_NET_CLS_MATCHALL=m
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
CONFIG_NET_EMATCH_CMP=m
CONFIG_NET_EMATCH_NBYTE=m
CONFIG_NET_EMATCH_U32=m
CONFIG_NET_EMATCH_META=m
CONFIG_NET_EMATCH_TEXT=m
CONFIG_NET_EMATCH_CANID=m
CONFIG_NET_EMATCH_IPSET=m
CONFIG_NET_EMATCH_IPT=m
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_ACT_GACT=m
CONFIG_GACT_PROB=y
CONFIG_NET_ACT_MIRRED=m
CONFIG_NET_ACT_SAMPLE=m
CONFIG_NET_ACT_IPT=m
CONFIG_NET_ACT_NAT=m
CONFIG_NET_ACT_PEDIT=m
CONFIG_NET_ACT_SIMP=m
CONFIG_NET_ACT_SKBEDIT=m
CONFIG_NET_ACT_CSUM=m
CONFIG_NET_ACT_MPLS=m
CONFIG_NET_ACT_VLAN=m
CONFIG_NET_ACT_BPF=m
CONFIG_NET_ACT_CONNMARK=m
CONFIG_NET_ACT_CTINFO=m
CONFIG_NET_ACT_SKBMOD=m
CONFIG_NET_ACT_IFE=m
CONFIG_NET_ACT_TUNNEL_KEY=m
CONFIG_NET_ACT_CT=m
CONFIG_NET_ACT_GATE=m
CONFIG_NET_IFE_SKBMARK=m
CONFIG_NET_IFE_SKBPRIO=m
CONFIG_NET_IFE_SKBTCINDEX=m
CONFIG_NET_TC_SKB_EXT=y
CONFIG_NET_SCH_FIFO=y
CONFIG_DCB=y
CONFIG_DNS_RESOLVER=y
# CONFIG_BATMAN_ADV is not set
CONFIG_OPENVSWITCH=m
CONFIG_OPENVSWITCH_GRE=m
CONFIG_OPENVSWITCH_VXLAN=m
CONFIG_OPENVSWITCH_GENEVE=m
CONFIG_VSOCKETS=m
CONFIG_VSOCKETS_DIAG=m
CONFIG_VSOCKETS_LOOPBACK=m
CONFIG_VMWARE_VMCI_VSOCKETS=m
CONFIG_VIRTIO_VSOCKETS=m
CONFIG_VIRTIO_VSOCKETS_COMMON=m
CONFIG_NETLINK_DIAG=m
CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
CONFIG_MPLS_ROUTING=m
CONFIG_MPLS_IPTUNNEL=m
CONFIG_NET_NSH=y
# CONFIG_HSR is not set
CONFIG_NET_SWITCHDEV=y
CONFIG_NET_L3_MASTER_DEV=y
# CONFIG_QRTR is not set
# CONFIG_NET_NCSI is not set
CONFIG_PCPU_DEV_REFCNT=y
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_SOCK_RX_QUEUE_MAPPING=y
CONFIG_XPS=y
CONFIG_CGROUP_NET_PRIO=y
CONFIG_CGROUP_NET_CLASSID=y
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_NET_FLOW_LIMIT=y

#
# Network testing
#
CONFIG_NET_PKTGEN=m
CONFIG_NET_DROP_MONITOR=y
# end of Network testing
# end of Networking options

# CONFIG_HAMRADIO is not set
CONFIG_CAN=m
CONFIG_CAN_RAW=m
CONFIG_CAN_BCM=m
CONFIG_CAN_GW=m
# CONFIG_CAN_J1939 is not set
# CONFIG_CAN_ISOTP is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_AF_KCM is not set
CONFIG_STREAM_PARSER=y
# CONFIG_MCTP is not set
CONFIG_FIB_RULES=y
CONFIG_WIRELESS=y
CONFIG_CFG80211=m
# CONFIG_NL80211_TESTMODE is not set
# CONFIG_CFG80211_DEVELOPER_WARNINGS is not set
# CONFIG_CFG80211_CERTIFICATION_ONUS is not set
CONFIG_CFG80211_REQUIRE_SIGNED_REGDB=y
CONFIG_CFG80211_USE_KERNEL_REGDB_KEYS=y
CONFIG_CFG80211_DEFAULT_PS=y
# CONFIG_CFG80211_DEBUGFS is not set
CONFIG_CFG80211_CRDA_SUPPORT=y
# CONFIG_CFG80211_WEXT is not set
CONFIG_MAC80211=m
CONFIG_MAC80211_HAS_RC=y
CONFIG_MAC80211_RC_MINSTREL=y
CONFIG_MAC80211_RC_DEFAULT_MINSTREL=y
CONFIG_MAC80211_RC_DEFAULT="minstrel_ht"
# CONFIG_MAC80211_MESH is not set
CONFIG_MAC80211_LEDS=y
CONFIG_MAC80211_DEBUGFS=y
# CONFIG_MAC80211_MESSAGE_TRACING is not set
# CONFIG_MAC80211_DEBUG_MENU is not set
CONFIG_MAC80211_STA_HASH_MAX_SIZE=0
CONFIG_RFKILL=m
CONFIG_RFKILL_LEDS=y
CONFIG_RFKILL_INPUT=y
# CONFIG_RFKILL_GPIO is not set
CONFIG_NET_9P=y
CONFIG_NET_9P_FD=y
CONFIG_NET_9P_VIRTIO=y
# CONFIG_NET_9P_DEBUG is not set
# CONFIG_CAIF is not set
CONFIG_CEPH_LIB=m
# CONFIG_CEPH_LIB_PRETTYDEBUG is not set
CONFIG_CEPH_LIB_USE_DNS_RESOLVER=y
CONFIG_NFC=m
# CONFIG_NFC_DIGITAL is not set
CONFIG_NFC_NCI=m
# CONFIG_NFC_NCI_SPI is not set
# CONFIG_NFC_NCI_UART is not set
# CONFIG_NFC_HCI is not set

#
# Near Field Communication (NFC) devices
#
CONFIG_NFC_VIRTUAL_NCI=m
# CONFIG_NFC_FDP is not set
# CONFIG_NFC_PN533_USB is not set
# CONFIG_NFC_PN533_I2C is not set
# CONFIG_NFC_MRVL_USB is not set
# CONFIG_NFC_ST_NCI_I2C is not set
# CONFIG_NFC_ST_NCI_SPI is not set
# CONFIG_NFC_NXP_NCI is not set
# CONFIG_NFC_S3FWRN5_I2C is not set
# end of Near Field Communication (NFC) devices

CONFIG_PSAMPLE=m
CONFIG_NET_IFE=m
CONFIG_LWTUNNEL=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_DST_CACHE=y
CONFIG_GRO_CELLS=y
CONFIG_SOCK_VALIDATE_XMIT=y
CONFIG_NET_SELFTESTS=y
CONFIG_NET_SOCK_MSG=y
CONFIG_NET_DEVLINK=y
CONFIG_PAGE_POOL=y
# CONFIG_PAGE_POOL_STATS is not set
CONFIG_FAILOVER=y
CONFIG_ETHTOOL_NETLINK=y

#
# Device Drivers
#
CONFIG_HAVE_EISA=y
# CONFIG_EISA is not set
CONFIG_HAVE_PCI=y
CONFIG_PCI=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_PCIEAER=y
CONFIG_PCIEAER_INJECT=m
CONFIG_PCIE_ECRC=y
CONFIG_PCIEASPM=y
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
CONFIG_PCIE_PME=y
CONFIG_PCIE_DPC=y
# CONFIG_PCIE_PTM is not set
# CONFIG_PCIE_EDR is not set
CONFIG_PCI_MSI=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y
CONFIG_PCI_QUIRKS=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_REALLOC_ENABLE_AUTO is not set
CONFIG_PCI_STUB=y
CONFIG_PCI_PF_STUB=m
CONFIG_PCI_ATS=y
CONFIG_PCI_LOCKLESS_CONFIG=y
CONFIG_PCI_IOV=y
CONFIG_PCI_PRI=y
CONFIG_PCI_PASID=y
# CONFIG_PCI_P2PDMA is not set
CONFIG_PCI_LABEL=y
# CONFIG_PCIE_BUS_TUNE_OFF is not set
CONFIG_PCIE_BUS_DEFAULT=y
# CONFIG_PCIE_BUS_SAFE is not set
# CONFIG_PCIE_BUS_PERFORMANCE is not set
# CONFIG_PCIE_BUS_PEER2PEER is not set
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=64
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_HOTPLUG_PCI_ACPI_IBM=m
# CONFIG_HOTPLUG_PCI_CPCI is not set
CONFIG_HOTPLUG_PCI_SHPC=y

#
# PCI controller drivers
#
CONFIG_VMD=y

#
# DesignWare PCI Core Support
#
# CONFIG_PCIE_DW_PLAT_HOST is not set
# CONFIG_PCI_MESON is not set
# end of DesignWare PCI Core Support

#
# Mobiveil PCIe Core Support
#
# end of Mobiveil PCIe Core Support

#
# Cadence PCIe controllers support
#
# end of Cadence PCIe controllers support
# end of PCI controller drivers

#
# PCI Endpoint
#
# CONFIG_PCI_ENDPOINT is not set
# end of PCI Endpoint

#
# PCI switch controller drivers
#
# CONFIG_PCI_SW_SWITCHTEC is not set
# end of PCI switch controller drivers

# CONFIG_CXL_BUS is not set
# CONFIG_PCCARD is not set
# CONFIG_RAPIDIO is not set

#
# Generic Driver Options
#
CONFIG_AUXILIARY_BUS=y
# CONFIG_UEVENT_HELPER is not set
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# CONFIG_DEVTMPFS_SAFE is not set
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y

#
# Firmware loader
#
CONFIG_FW_LOADER=y
CONFIG_FW_LOADER_PAGED_BUF=y
CONFIG_FW_LOADER_SYSFS=y
CONFIG_EXTRA_FIRMWARE=""
CONFIG_FW_LOADER_USER_HELPER=y
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
# CONFIG_FW_LOADER_COMPRESS is not set
CONFIG_FW_CACHE=y
CONFIG_FW_UPLOAD=y
# end of Firmware loader

CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_GENERIC_CPU_VULNERABILITIES=y
CONFIG_REGMAP=y
CONFIG_REGMAP_I2C=m
CONFIG_REGMAP_SPI=m
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_DMA_FENCE_TRACE is not set
# end of Generic Driver Options

#
# Bus devices
#
# CONFIG_MHI_BUS is not set
# CONFIG_MHI_BUS_EP is not set
# end of Bus devices

CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y

#
# Firmware Drivers
#

#
# ARM System Control and Management Interface Protocol
#
# end of ARM System Control and Management Interface Protocol

CONFIG_EDD=m
# CONFIG_EDD_OFF is not set
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_DMIID=y
CONFIG_DMI_SYSFS=y
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
# CONFIG_ISCSI_IBFT is not set
CONFIG_FW_CFG_SYSFS=y
# CONFIG_FW_CFG_SYSFS_CMDLINE is not set
CONFIG_SYSFB=y
# CONFIG_SYSFB_SIMPLEFB is not set
# CONFIG_GOOGLE_FIRMWARE is not set

#
# EFI (Extensible Firmware Interface) Support
#
CONFIG_EFI_ESRT=y
CONFIG_EFI_VARS_PSTORE=y
CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE=y
CONFIG_EFI_RUNTIME_MAP=y
# CONFIG_EFI_FAKE_MEMMAP is not set
CONFIG_EFI_DXE_MEM_ATTRIBUTES=y
CONFIG_EFI_RUNTIME_WRAPPERS=y
CONFIG_EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER=y
# CONFIG_EFI_BOOTLOADER_CONTROL is not set
# CONFIG_EFI_CAPSULE_LOADER is not set
# CONFIG_EFI_TEST is not set
# CONFIG_APPLE_PROPERTIES is not set
# CONFIG_RESET_ATTACK_MITIGATION is not set
# CONFIG_EFI_RCI2_TABLE is not set
# CONFIG_EFI_DISABLE_PCI_DMA is not set
CONFIG_EFI_EARLYCON=y
CONFIG_EFI_CUSTOM_SSDT_OVERLAYS=y
# CONFIG_EFI_DISABLE_RUNTIME is not set
# CONFIG_EFI_COCO_SECRET is not set
# end of EFI (Extensible Firmware Interface) Support

CONFIG_UEFI_CPER=y
CONFIG_UEFI_CPER_X86=y

#
# Tegra firmware driver
#
# end of Tegra firmware driver
# end of Firmware Drivers

# CONFIG_GNSS is not set
# CONFIG_MTD is not set
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_SERIAL=m
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_AX88796 is not set
CONFIG_PARPORT_1284=y
CONFIG_PNP=y
# CONFIG_PNP_DEBUG_MESSAGES is not set

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_NULL_BLK=m
# CONFIG_BLK_DEV_NULL_BLK_FAULT_INJECTION is not set
# CONFIG_BLK_DEV_FD is not set
CONFIG_CDROM=m
# CONFIG_PARIDE is not set
# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
CONFIG_ZRAM=m
CONFIG_ZRAM_DEF_COMP_LZORLE=y
# CONFIG_ZRAM_DEF_COMP_LZO is not set
CONFIG_ZRAM_DEF_COMP="lzo-rle"
CONFIG_ZRAM_WRITEBACK=y
# CONFIG_ZRAM_MEMORY_TRACKING is not set
# CONFIG_ZRAM_MULTI_COMP is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_LOOP_MIN_COUNT=0
# CONFIG_BLK_DEV_DRBD is not set
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
# CONFIG_ATA_OVER_ETH is not set
CONFIG_VIRTIO_BLK=y
CONFIG_BLK_DEV_RBD=m
# CONFIG_BLK_DEV_UBLK is not set

#
# NVME Support
#
CONFIG_NVME_CORE=m
CONFIG_BLK_DEV_NVME=m
CONFIG_NVME_MULTIPATH=y
# CONFIG_NVME_VERBOSE_ERRORS is not set
# CONFIG_NVME_HWMON is not set
CONFIG_NVME_FABRICS=m
# CONFIG_NVME_FC is not set
# CONFIG_NVME_TCP is not set
# CONFIG_NVME_AUTH is not set
CONFIG_NVME_TARGET=m
# CONFIG_NVME_TARGET_PASSTHRU is not set
CONFIG_NVME_TARGET_LOOP=m
CONFIG_NVME_TARGET_FC=m
# CONFIG_NVME_TARGET_TCP is not set
# CONFIG_NVME_TARGET_AUTH is not set
# end of NVME Support

#
# Misc devices
#
CONFIG_SENSORS_LIS3LV02D=m
# CONFIG_AD525X_DPOT is not set
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
CONFIG_TIFM_CORE=m
CONFIG_TIFM_7XX1=m
# CONFIG_ICS932S401 is not set
CONFIG_ENCLOSURE_SERVICES=m
CONFIG_SGI_XP=m
CONFIG_HP_ILO=m
CONFIG_SGI_GRU=m
# CONFIG_SGI_GRU_DEBUG is not set
CONFIG_APDS9802ALS=m
CONFIG_ISL29003=m
CONFIG_ISL29020=m
CONFIG_SENSORS_TSL2550=m
CONFIG_SENSORS_BH1770=m
CONFIG_SENSORS_APDS990X=m
# CONFIG_HMC6352 is not set
# CONFIG_DS1682 is not set
CONFIG_VMWARE_BALLOON=m
# CONFIG_LATTICE_ECP3_CONFIG is not set
# CONFIG_SRAM is not set
# CONFIG_DW_XDATA_PCIE is not set
# CONFIG_PCI_ENDPOINT_TEST is not set
# CONFIG_XILINX_SDFEC is not set
CONFIG_MISC_RTSX=m
# CONFIG_C2PORT is not set

#
# EEPROM support
#
# CONFIG_EEPROM_AT24 is not set
# CONFIG_EEPROM_AT25 is not set
CONFIG_EEPROM_LEGACY=m
CONFIG_EEPROM_MAX6875=m
CONFIG_EEPROM_93CX6=m
# CONFIG_EEPROM_93XX46 is not set
# CONFIG_EEPROM_IDT_89HPESX is not set
# CONFIG_EEPROM_EE1004 is not set
# end of EEPROM support

CONFIG_CB710_CORE=m
# CONFIG_CB710_DEBUG is not set
CONFIG_CB710_DEBUG_ASSUMPTIONS=y

#
# Texas Instruments shared transport line discipline
#
# CONFIG_TI_ST is not set
# end of Texas Instruments shared transport line discipline

CONFIG_SENSORS_LIS3_I2C=m
CONFIG_ALTERA_STAPL=m
CONFIG_INTEL_MEI=m
CONFIG_INTEL_MEI_ME=m
# CONFIG_INTEL_MEI_TXE is not set
# CONFIG_INTEL_MEI_GSC is not set
# CONFIG_INTEL_MEI_HDCP is not set
# CONFIG_INTEL_MEI_PXP is not set
CONFIG_VMWARE_VMCI=m
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_BCM_VK is not set
# CONFIG_MISC_ALCOR_PCI is not set
CONFIG_MISC_RTSX_PCI=m
# CONFIG_MISC_RTSX_USB is not set
# CONFIG_HABANA_AI is not set
# CONFIG_UACCE is not set
CONFIG_PVPANIC=y
# CONFIG_PVPANIC_MMIO is not set
# CONFIG_PVPANIC_PCI is not set
# CONFIG_GP_PCI1XXXX is not set
# end of Misc devices

#
# SCSI device support
#
CONFIG_SCSI_MOD=y
CONFIG_RAID_ATTRS=m
CONFIG_SCSI_COMMON=y
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_NETLINK=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
CONFIG_CHR_DEV_ST=m
CONFIG_BLK_DEV_SR=m
CONFIG_CHR_DEV_SG=m
CONFIG_BLK_DEV_BSG=y
CONFIG_CHR_DEV_SCH=m
CONFIG_SCSI_ENCLOSURE=m
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_FC_ATTRS=m
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m
CONFIG_SCSI_SAS_LIBSAS=m
CONFIG_SCSI_SAS_ATA=y
CONFIG_SCSI_SAS_HOST_SMP=y
CONFIG_SCSI_SRP_ATTRS=m
# end of SCSI Transports

CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
# CONFIG_ISCSI_BOOT_SYSFS is not set
# CONFIG_SCSI_CXGB3_ISCSI is not set
# CONFIG_SCSI_CXGB4_ISCSI is not set
# CONFIG_SCSI_BNX2_ISCSI is not set
# CONFIG_BE2ISCSI is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_HPSA is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_3W_SAS is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
# CONFIG_SCSI_MVSAS is not set
# CONFIG_SCSI_MVUMI is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_SCSI_ESAS2R is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
CONFIG_SCSI_MPT3SAS=m
CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
# CONFIG_SCSI_MPT2SAS is not set
# CONFIG_SCSI_MPI3MR is not set
# CONFIG_SCSI_SMARTPQI is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_MYRB is not set
# CONFIG_SCSI_MYRS is not set
# CONFIG_VMWARE_PVSCSI is not set
# CONFIG_LIBFC is not set
# CONFIG_SCSI_SNIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_FDOMAIN_PCI is not set
CONFIG_SCSI_ISCI=m
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_FC is not set
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_EFCT is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_WD719X is not set
# CONFIG_SCSI_DEBUG is not set
# CONFIG_SCSI_PMCRAID is not set
# CONFIG_SCSI_PM8001 is not set
# CONFIG_SCSI_BFA_FC is not set
# CONFIG_SCSI_VIRTIO is not set
# CONFIG_SCSI_CHELSIO_FCOE is not set
CONFIG_SCSI_DH=y
CONFIG_SCSI_DH_RDAC=y
CONFIG_SCSI_DH_HP_SW=y
CONFIG_SCSI_DH_EMC=y
CONFIG_SCSI_DH_ALUA=y
# end of SCSI device support

CONFIG_ATA=m
CONFIG_SATA_HOST=y
CONFIG_PATA_TIMINGS=y
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_ATA_FORCE=y
CONFIG_ATA_ACPI=y
# CONFIG_SATA_ZPODD is not set
CONFIG_SATA_PMP=y

#
# Controllers with non-SFF native interface
#
CONFIG_SATA_AHCI=m
CONFIG_SATA_MOBILE_LPM_POLICY=0
CONFIG_SATA_AHCI_PLATFORM=m
# CONFIG_AHCI_DWC is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_SATA_ACARD_AHCI is not set
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y

#
# SFF controllers with custom DMA interface
#
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_SX4 is not set
CONFIG_ATA_BMDMA=y

#
# SATA SFF controllers with BMDMA
#
CONFIG_ATA_PIIX=m
# CONFIG_SATA_DWC is not set
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_SVW is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set

#
# PATA SFF controllers with BMDMA
#
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_ATP867X is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RDC is not set
# CONFIG_PATA_SCH is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_TOSHIBA is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set

#
# PIO-only SFF controllers
#
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_RZ1000 is not set

#
# Generic fallback / legacy drivers
#
# CONFIG_PATA_ACPI is not set
CONFIG_ATA_GENERIC=m
# CONFIG_PATA_LEGACY is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
# CONFIG_MD_MULTIPATH is not set
CONFIG_MD_FAULTY=m
CONFIG_MD_CLUSTER=m
# CONFIG_BCACHE is not set
CONFIG_BLK_DEV_DM_BUILTIN=y
CONFIG_BLK_DEV_DM=m
CONFIG_DM_DEBUG=y
CONFIG_DM_BUFIO=m
# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set
CONFIG_DM_BIO_PRISON=m
CONFIG_DM_PERSISTENT_DATA=m
# CONFIG_DM_UNSTRIPED is not set
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_THIN_PROVISIONING=m
CONFIG_DM_CACHE=m
CONFIG_DM_CACHE_SMQ=m
CONFIG_DM_WRITECACHE=m
# CONFIG_DM_EBS is not set
CONFIG_DM_ERA=m
# CONFIG_DM_CLONE is not set
CONFIG_DM_MIRROR=m
CONFIG_DM_LOG_USERSPACE=m
CONFIG_DM_RAID=m
CONFIG_DM_ZERO=m
CONFIG_DM_MULTIPATH=m
CONFIG_DM_MULTIPATH_QL=m
CONFIG_DM_MULTIPATH_ST=m
# CONFIG_DM_MULTIPATH_HST is not set
# CONFIG_DM_MULTIPATH_IOA is not set
CONFIG_DM_DELAY=m
# CONFIG_DM_DUST is not set
CONFIG_DM_UEVENT=y
CONFIG_DM_FLAKEY=m
CONFIG_DM_VERITY=m
# CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG is not set
# CONFIG_DM_VERITY_FEC is not set
CONFIG_DM_SWITCH=m
CONFIG_DM_LOG_WRITES=m
CONFIG_DM_INTEGRITY=m
CONFIG_DM_AUDIT=y
CONFIG_TARGET_CORE=m
CONFIG_TCM_IBLOCK=m
CONFIG_TCM_FILEIO=m
CONFIG_TCM_PSCSI=m
CONFIG_TCM_USER2=m
CONFIG_LOOPBACK_TARGET=m
CONFIG_ISCSI_TARGET=m
# CONFIG_SBP_TARGET is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#
CONFIG_FIREWIRE=m
CONFIG_FIREWIRE_OHCI=m
CONFIG_FIREWIRE_SBP2=m
CONFIG_FIREWIRE_NET=m
# CONFIG_FIREWIRE_NOSY is not set
# end of IEEE 1394 (FireWire) support

CONFIG_MACINTOSH_DRIVERS=y
CONFIG_MAC_EMUMOUSEBTN=y
CONFIG_NETDEVICES=y
CONFIG_MII=y
CONFIG_NET_CORE=y
CONFIG_BONDING=m
CONFIG_DUMMY=y
# CONFIG_WIREGUARD is not set
# CONFIG_EQUALIZER is not set
# CONFIG_NET_FC is not set
CONFIG_IFB=m
# CONFIG_NET_TEAM is not set
# CONFIG_MACVLAN is not set
# CONFIG_IPVLAN is not set
CONFIG_VXLAN=y
CONFIG_GENEVE=y
CONFIG_BAREUDP=m
# CONFIG_GTP is not set
CONFIG_AMT=m
CONFIG_MACSEC=y
CONFIG_NETCONSOLE=m
CONFIG_NETCONSOLE_DYNAMIC=y
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_TUN=m
# CONFIG_TUN_VNET_CROSS_LE is not set
CONFIG_VETH=y
CONFIG_VIRTIO_NET=y
# CONFIG_NLMON is not set
CONFIG_NET_VRF=y
# CONFIG_VSOCKMON is not set
# CONFIG_ARCNET is not set
CONFIG_ATM_DRIVERS=y
# CONFIG_ATM_DUMMY is not set
# CONFIG_ATM_TCP is not set
# CONFIG_ATM_LANAI is not set
# CONFIG_ATM_ENI is not set
# CONFIG_ATM_NICSTAR is not set
# CONFIG_ATM_IDT77252 is not set
# CONFIG_ATM_IA is not set
# CONFIG_ATM_FORE200E is not set
# CONFIG_ATM_HE is not set
# CONFIG_ATM_SOLOS is not set
CONFIG_ETHERNET=y
CONFIG_MDIO=y
# CONFIG_NET_VENDOR_3COM is not set
CONFIG_NET_VENDOR_ADAPTEC=y
# CONFIG_ADAPTEC_STARFIRE is not set
CONFIG_NET_VENDOR_AGERE=y
# CONFIG_ET131X is not set
CONFIG_NET_VENDOR_ALACRITECH=y
# CONFIG_SLICOSS is not set
CONFIG_NET_VENDOR_ALTEON=y
# CONFIG_ACENIC is not set
# CONFIG_ALTERA_TSE is not set
CONFIG_NET_VENDOR_AMAZON=y
# CONFIG_ENA_ETHERNET is not set
# CONFIG_NET_VENDOR_AMD is not set
CONFIG_NET_VENDOR_AQUANTIA=y
# CONFIG_AQTION is not set
CONFIG_NET_VENDOR_ARC=y
CONFIG_NET_VENDOR_ASIX=y
# CONFIG_SPI_AX88796C is not set
CONFIG_NET_VENDOR_ATHEROS=y
# CONFIG_ATL2 is not set
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
# CONFIG_ATL1C is not set
# CONFIG_ALX is not set
# CONFIG_CX_ECAT is not set
CONFIG_NET_VENDOR_BROADCOM=y
# CONFIG_B44 is not set
# CONFIG_BCMGENET is not set
# CONFIG_BNX2 is not set
# CONFIG_CNIC is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2X is not set
# CONFIG_SYSTEMPORT is not set
# CONFIG_BNXT is not set
CONFIG_NET_VENDOR_CADENCE=y
# CONFIG_MACB is not set
CONFIG_NET_VENDOR_CAVIUM=y
# CONFIG_THUNDER_NIC_PF is not set
# CONFIG_THUNDER_NIC_VF is not set
# CONFIG_THUNDER_NIC_BGX is not set
# CONFIG_THUNDER_NIC_RGX is not set
CONFIG_CAVIUM_PTP=y
# CONFIG_LIQUIDIO is not set
# CONFIG_LIQUIDIO_VF is not set
CONFIG_NET_VENDOR_CHELSIO=y
# CONFIG_CHELSIO_T1 is not set
# CONFIG_CHELSIO_T3 is not set
# CONFIG_CHELSIO_T4 is not set
# CONFIG_CHELSIO_T4VF is not set
CONFIG_NET_VENDOR_CISCO=y
# CONFIG_ENIC is not set
CONFIG_NET_VENDOR_CORTINA=y
CONFIG_NET_VENDOR_DAVICOM=y
# CONFIG_DM9051 is not set
# CONFIG_DNET is not set
CONFIG_NET_VENDOR_DEC=y
# CONFIG_NET_TULIP is not set
CONFIG_NET_VENDOR_DLINK=y
# CONFIG_DL2K is not set
# CONFIG_SUNDANCE is not set
CONFIG_NET_VENDOR_EMULEX=y
# CONFIG_BE2NET is not set
CONFIG_NET_VENDOR_ENGLEDER=y
# CONFIG_TSNEP is not set
CONFIG_NET_VENDOR_EZCHIP=y
CONFIG_NET_VENDOR_FUNGIBLE=y
# CONFIG_FUN_ETH is not set
CONFIG_NET_VENDOR_GOOGLE=y
# CONFIG_GVE is not set
CONFIG_NET_VENDOR_HUAWEI=y
# CONFIG_HINIC is not set
CONFIG_NET_VENDOR_I825XX=y
CONFIG_NET_VENDOR_INTEL=y
# CONFIG_E100 is not set
CONFIG_E1000=y
CONFIG_E1000E=y
CONFIG_E1000E_HWTS=y
CONFIG_IGB=y
CONFIG_IGB_HWMON=y
# CONFIG_IGBVF is not set
# CONFIG_IXGB is not set
CONFIG_IXGBE=y
CONFIG_IXGBE_HWMON=y
# CONFIG_IXGBE_DCB is not set
# CONFIG_IXGBE_IPSEC is not set
# CONFIG_IXGBEVF is not set
CONFIG_I40E=y
# CONFIG_I40E_DCB is not set
# CONFIG_I40EVF is not set
# CONFIG_ICE is not set
# CONFIG_FM10K is not set
CONFIG_IGC=y
CONFIG_NET_VENDOR_WANGXUN=y
# CONFIG_NGBE is not set
# CONFIG_TXGBE is not set
# CONFIG_JME is not set
CONFIG_NET_VENDOR_ADI=y
# CONFIG_ADIN1110 is not set
CONFIG_NET_VENDOR_LITEX=y
CONFIG_NET_VENDOR_MARVELL=y
# CONFIG_MVMDIO is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
# CONFIG_OCTEON_EP is not set
# CONFIG_PRESTERA is not set
CONFIG_NET_VENDOR_MELLANOX=y
# CONFIG_MLX4_EN is not set
# CONFIG_MLX5_CORE is not set
# CONFIG_MLXSW_CORE is not set
# CONFIG_MLXFW is not set
CONFIG_NET_VENDOR_MICREL=y
# CONFIG_KS8842 is not set
# CONFIG_KS8851 is not set
# CONFIG_KS8851_MLL is not set
# CONFIG_KSZ884X_PCI is not set
CONFIG_NET_VENDOR_MICROCHIP=y
# CONFIG_ENC28J60 is not set
# CONFIG_ENCX24J600 is not set
# CONFIG_LAN743X is not set
CONFIG_NET_VENDOR_MICROSEMI=y
CONFIG_NET_VENDOR_MICROSOFT=y
CONFIG_NET_VENDOR_MYRI=y
# CONFIG_MYRI10GE is not set
# CONFIG_FEALNX is not set
CONFIG_NET_VENDOR_NI=y
# CONFIG_NI_XGE_MANAGEMENT_ENET is not set
CONFIG_NET_VENDOR_NATSEMI=y
# CONFIG_NATSEMI is not set
# CONFIG_NS83820 is not set
CONFIG_NET_VENDOR_NETERION=y
# CONFIG_S2IO is not set
CONFIG_NET_VENDOR_NETRONOME=y
# CONFIG_NFP is not set
CONFIG_NET_VENDOR_8390=y
# CONFIG_NE2K_PCI is not set
CONFIG_NET_VENDOR_NVIDIA=y
# CONFIG_FORCEDETH is not set
CONFIG_NET_VENDOR_OKI=y
# CONFIG_ETHOC is not set
CONFIG_NET_VENDOR_PACKET_ENGINES=y
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
CONFIG_NET_VENDOR_PENSANDO=y
# CONFIG_IONIC is not set
CONFIG_NET_VENDOR_QLOGIC=y
# CONFIG_QLA3XXX is not set
# CONFIG_QLCNIC is not set
# CONFIG_NETXEN_NIC is not set
# CONFIG_QED is not set
CONFIG_NET_VENDOR_BROCADE=y
# CONFIG_BNA is not set
CONFIG_NET_VENDOR_QUALCOMM=y
# CONFIG_QCOM_EMAC is not set
# CONFIG_RMNET is not set
CONFIG_NET_VENDOR_RDC=y
# CONFIG_R6040 is not set
CONFIG_NET_VENDOR_REALTEK=y
# CONFIG_ATP is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
CONFIG_R8169=y
CONFIG_NET_VENDOR_RENESAS=y
CONFIG_NET_VENDOR_ROCKER=y
# CONFIG_ROCKER is not set
CONFIG_NET_VENDOR_SAMSUNG=y
# CONFIG_SXGBE_ETH is not set
CONFIG_NET_VENDOR_SEEQ=y
CONFIG_NET_VENDOR_SILAN=y
# CONFIG_SC92031 is not set
CONFIG_NET_VENDOR_SIS=y
# CONFIG_SIS900 is not set
# CONFIG_SIS190 is not set
CONFIG_NET_VENDOR_SOLARFLARE=y
# CONFIG_SFC is not set
# CONFIG_SFC_FALCON is not set
# CONFIG_SFC_SIENA is not set
CONFIG_NET_VENDOR_SMSC=y
# CONFIG_EPIC100 is not set
# CONFIG_SMSC911X is not set
# CONFIG_SMSC9420 is not set
CONFIG_NET_VENDOR_SOCIONEXT=y
CONFIG_NET_VENDOR_STMICRO=y
# CONFIG_STMMAC_ETH is not set
CONFIG_NET_VENDOR_SUN=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
# CONFIG_NIU is not set
CONFIG_NET_VENDOR_SYNOPSYS=y
# CONFIG_DWC_XLGMAC is not set
CONFIG_NET_VENDOR_TEHUTI=y
# CONFIG_TEHUTI is not set
CONFIG_NET_VENDOR_TI=y
# CONFIG_TI_CPSW_PHY_SEL is not set
# CONFIG_TLAN is not set
CONFIG_NET_VENDOR_VERTEXCOM=y
# CONFIG_MSE102X is not set
CONFIG_NET_VENDOR_VIA=y
# CONFIG_VIA_RHINE is not set
# CONFIG_VIA_VELOCITY is not set
CONFIG_NET_VENDOR_WIZNET=y
# CONFIG_WIZNET_W5100 is not set
# CONFIG_WIZNET_W5300 is not set
CONFIG_NET_VENDOR_XILINX=y
# CONFIG_XILINX_EMACLITE is not set
# CONFIG_XILINX_AXI_EMAC is not set
# CONFIG_XILINX_LL_TEMAC is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_NET_SB1000 is not set
CONFIG_PHYLINK=y
CONFIG_PHYLIB=y
CONFIG_SWPHY=y
# CONFIG_LED_TRIGGER_PHY is not set
CONFIG_FIXED_PHY=y
# CONFIG_SFP is not set

#
# MII PHY device drivers
#
# CONFIG_AMD_PHY is not set
# CONFIG_ADIN_PHY is not set
# CONFIG_ADIN1100_PHY is not set
# CONFIG_AQUANTIA_PHY is not set
CONFIG_AX88796B_PHY=y
# CONFIG_BROADCOM_PHY is not set
# CONFIG_BCM54140_PHY is not set
# CONFIG_BCM7XXX_PHY is not set
# CONFIG_BCM84881_PHY is not set
# CONFIG_BCM87XX_PHY is not set
# CONFIG_CICADA_PHY is not set
# CONFIG_CORTINA_PHY is not set
# CONFIG_DAVICOM_PHY is not set
# CONFIG_ICPLUS_PHY is not set
# CONFIG_LXT_PHY is not set
# CONFIG_INTEL_XWAY_PHY is not set
# CONFIG_LSI_ET1011C_PHY is not set
# CONFIG_MARVELL_PHY is not set
# CONFIG_MARVELL_10G_PHY is not set
# CONFIG_MARVELL_88X2222_PHY is not set
# CONFIG_MAXLINEAR_GPHY is not set
# CONFIG_MEDIATEK_GE_PHY is not set
# CONFIG_MICREL_PHY is not set
# CONFIG_MICROCHIP_PHY is not set
# CONFIG_MICROCHIP_T1_PHY is not set
# CONFIG_MICROSEMI_PHY is not set
# CONFIG_MOTORCOMM_PHY is not set
# CONFIG_NATIONAL_PHY is not set
# CONFIG_NXP_C45_TJA11XX_PHY is not set
# CONFIG_NXP_TJA11XX_PHY is not set
# CONFIG_QSEMI_PHY is not set
CONFIG_REALTEK_PHY=y
# CONFIG_RENESAS_PHY is not set
# CONFIG_ROCKCHIP_PHY is not set
# CONFIG_SMSC_PHY is not set
# CONFIG_STE10XP is not set
# CONFIG_TERANETICS_PHY is not set
# CONFIG_DP83822_PHY is not set
# CONFIG_DP83TC811_PHY is not set
# CONFIG_DP83848_PHY is not set
# CONFIG_DP83867_PHY is not set
# CONFIG_DP83869_PHY is not set
# CONFIG_DP83TD510_PHY is not set
# CONFIG_VITESSE_PHY is not set
# CONFIG_XILINX_GMII2RGMII is not set
# CONFIG_MICREL_KS8995MA is not set
# CONFIG_PSE_CONTROLLER is not set
CONFIG_CAN_DEV=m
CONFIG_CAN_VCAN=m
# CONFIG_CAN_VXCAN is not set
CONFIG_CAN_NETLINK=y
CONFIG_CAN_CALC_BITTIMING=y
# CONFIG_CAN_CAN327 is not set
# CONFIG_CAN_KVASER_PCIEFD is not set
CONFIG_CAN_SLCAN=m
CONFIG_CAN_C_CAN=m
CONFIG_CAN_C_CAN_PLATFORM=m
CONFIG_CAN_C_CAN_PCI=m
CONFIG_CAN_CC770=m
# CONFIG_CAN_CC770_ISA is not set
CONFIG_CAN_CC770_PLATFORM=m
# CONFIG_CAN_CTUCANFD_PCI is not set
# CONFIG_CAN_IFI_CANFD is not set
# CONFIG_CAN_M_CAN is not set
# CONFIG_CAN_PEAK_PCIEFD is not set
CONFIG_CAN_SJA1000=m
CONFIG_CAN_EMS_PCI=m
# CONFIG_CAN_F81601 is not set
CONFIG_CAN_KVASER_PCI=m
CONFIG_CAN_PEAK_PCI=m
CONFIG_CAN_PEAK_PCIEC=y
CONFIG_CAN_PLX_PCI=m
# CONFIG_CAN_SJA1000_ISA is not set
# CONFIG_CAN_SJA1000_PLATFORM is not set
CONFIG_CAN_SOFTING=m

#
# CAN SPI interfaces
#
# CONFIG_CAN_HI311X is not set
# CONFIG_CAN_MCP251X is not set
# CONFIG_CAN_MCP251XFD is not set
# end of CAN SPI interfaces

#
# CAN USB interfaces
#
# CONFIG_CAN_8DEV_USB is not set
# CONFIG_CAN_EMS_USB is not set
# CONFIG_CAN_ESD_USB is not set
# CONFIG_CAN_ETAS_ES58X is not set
# CONFIG_CAN_GS_USB is not set
# CONFIG_CAN_KVASER_USB is not set
# CONFIG_CAN_MCBA_USB is not set
# CONFIG_CAN_PEAK_USB is not set
# CONFIG_CAN_UCAN is not set
# end of CAN USB interfaces

# CONFIG_CAN_DEBUG_DEVICES is not set
CONFIG_MDIO_DEVICE=y
CONFIG_MDIO_BUS=y
CONFIG_FWNODE_MDIO=y
CONFIG_ACPI_MDIO=y
CONFIG_MDIO_DEVRES=y
# CONFIG_MDIO_BITBANG is not set
# CONFIG_MDIO_BCM_UNIMAC is not set
# CONFIG_MDIO_MVUSB is not set
# CONFIG_MDIO_THUNDER is not set

#
# MDIO Multiplexers
#

#
# PCS device drivers
#
# end of PCS device drivers

# CONFIG_PLIP is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
CONFIG_USB_NET_DRIVERS=y
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
CONFIG_USB_RTL8152=y
# CONFIG_USB_LAN78XX is not set
CONFIG_USB_USBNET=y
CONFIG_USB_NET_AX8817X=y
CONFIG_USB_NET_AX88179_178A=y
# CONFIG_USB_NET_CDCETHER is not set
# CONFIG_USB_NET_CDC_EEM is not set
# CONFIG_USB_NET_CDC_NCM is not set
# CONFIG_USB_NET_HUAWEI_CDC_NCM is not set
# CONFIG_USB_NET_CDC_MBIM is not set
# CONFIG_USB_NET_DM9601 is not set
# CONFIG_USB_NET_SR9700 is not set
# CONFIG_USB_NET_SR9800 is not set
# CONFIG_USB_NET_SMSC75XX is not set
# CONFIG_USB_NET_SMSC95XX is not set
# CONFIG_USB_NET_GL620A is not set
# CONFIG_USB_NET_NET1080 is not set
# CONFIG_USB_NET_PLUSB is not set
# CONFIG_USB_NET_MCS7830 is not set
# CONFIG_USB_NET_RNDIS_HOST is not set
# CONFIG_USB_NET_CDC_SUBSET is not set
# CONFIG_USB_NET_ZAURUS is not set
# CONFIG_USB_NET_CX82310_ETH is not set
# CONFIG_USB_NET_KALMIA is not set
# CONFIG_USB_NET_QMI_WWAN is not set
# CONFIG_USB_HSO is not set
# CONFIG_USB_NET_INT51X1 is not set
# CONFIG_USB_IPHETH is not set
# CONFIG_USB_SIERRA_NET is not set
# CONFIG_USB_NET_CH9200 is not set
# CONFIG_USB_NET_AQC111 is not set
CONFIG_WLAN=y
CONFIG_WLAN_VENDOR_ADMTEK=y
# CONFIG_ADM8211 is not set
CONFIG_WLAN_VENDOR_ATH=y
# CONFIG_ATH_DEBUG is not set
# CONFIG_ATH5K is not set
# CONFIG_ATH5K_PCI is not set
# CONFIG_ATH9K is not set
# CONFIG_ATH9K_HTC is not set
# CONFIG_CARL9170 is not set
# CONFIG_ATH6KL is not set
# CONFIG_AR5523 is not set
# CONFIG_WIL6210 is not set
# CONFIG_ATH10K is not set
# CONFIG_WCN36XX is not set
# CONFIG_ATH11K is not set
CONFIG_WLAN_VENDOR_ATMEL=y
# CONFIG_ATMEL is not set
# CONFIG_AT76C50X_USB is not set
CONFIG_WLAN_VENDOR_BROADCOM=y
# CONFIG_B43 is not set
# CONFIG_B43LEGACY is not set
# CONFIG_BRCMSMAC is not set
# CONFIG_BRCMFMAC is not set
CONFIG_WLAN_VENDOR_CISCO=y
# CONFIG_AIRO is not set
CONFIG_WLAN_VENDOR_INTEL=y
# CONFIG_IPW2100 is not set
# CONFIG_IPW2200 is not set
# CONFIG_IWL4965 is not set
# CONFIG_IWL3945 is not set
# CONFIG_IWLWIFI is not set
CONFIG_WLAN_VENDOR_INTERSIL=y
# CONFIG_HOSTAP is not set
# CONFIG_HERMES is not set
# CONFIG_P54_COMMON is not set
CONFIG_WLAN_VENDOR_MARVELL=y
# CONFIG_LIBERTAS is not set
# CONFIG_LIBERTAS_THINFIRM is not set
# CONFIG_MWIFIEX is not set
# CONFIG_MWL8K is not set
# CONFIG_WLAN_VENDOR_MEDIATEK is not set
CONFIG_WLAN_VENDOR_MICROCHIP=y
# CONFIG_WILC1000_SDIO is not set
# CONFIG_WILC1000_SPI is not set
CONFIG_WLAN_VENDOR_PURELIFI=y
# CONFIG_PLFXLC is not set
CONFIG_WLAN_VENDOR_RALINK=y
# CONFIG_RT2X00 is not set
CONFIG_WLAN_VENDOR_REALTEK=y
# CONFIG_RTL8180 is not set
# CONFIG_RTL8187 is not set
CONFIG_RTL_CARDS=m
# CONFIG_RTL8192CE is not set
# CONFIG_RTL8192SE is not set
# CONFIG_RTL8192DE is not set
# CONFIG_RTL8723AE is not set
# CONFIG_RTL8723BE is not set
# CONFIG_RTL8188EE is not set
# CONFIG_RTL8192EE is not set
# CONFIG_RTL8821AE is not set
# CONFIG_RTL8192CU is not set
# CONFIG_RTL8XXXU is not set
# CONFIG_RTW88 is not set
# CONFIG_RTW89 is not set
CONFIG_WLAN_VENDOR_RSI=y
# CONFIG_RSI_91X is not set
CONFIG_WLAN_VENDOR_SILABS=y
# CONFIG_WFX is not set
CONFIG_WLAN_VENDOR_ST=y
# CONFIG_CW1200 is not set
CONFIG_WLAN_VENDOR_TI=y
# CONFIG_WL1251 is not set
# CONFIG_WL12XX is not set
# CONFIG_WL18XX is not set
# CONFIG_WLCORE is not set
CONFIG_WLAN_VENDOR_ZYDAS=y
# CONFIG_USB_ZD1201 is not set
# CONFIG_ZD1211RW is not set
CONFIG_WLAN_VENDOR_QUANTENNA=y
# CONFIG_QTNFMAC_PCIE is not set
# CONFIG_MAC80211_HWSIM is not set
# CONFIG_USB_NET_RNDIS_WLAN is not set
# CONFIG_VIRT_WIFI is not set
# CONFIG_WAN is not set

#
# Wireless WAN
#
# CONFIG_WWAN is not set
# end of Wireless WAN

# CONFIG_VMXNET3 is not set
# CONFIG_FUJITSU_ES is not set
CONFIG_NETDEVSIM=m
CONFIG_NET_FAILOVER=y
# CONFIG_ISDN is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_LEDS=y
CONFIG_INPUT_FF_MEMLESS=y
CONFIG_INPUT_SPARSEKMAP=m
# CONFIG_INPUT_MATRIXKMAP is not set
CONFIG_INPUT_VIVALDIFMAP=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADP5588 is not set
# CONFIG_KEYBOARD_ADP5589 is not set
# CONFIG_KEYBOARD_APPLESPI is not set
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_QT1050 is not set
# CONFIG_KEYBOARD_QT1070 is not set
# CONFIG_KEYBOARD_QT2160 is not set
# CONFIG_KEYBOARD_DLINK_DIR685 is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_GPIO is not set
# CONFIG_KEYBOARD_GPIO_POLLED is not set
# CONFIG_KEYBOARD_TCA6416 is not set
# CONFIG_KEYBOARD_TCA8418 is not set
# CONFIG_KEYBOARD_MATRIX is not set
# CONFIG_KEYBOARD_LM8323 is not set
# CONFIG_KEYBOARD_LM8333 is not set
# CONFIG_KEYBOARD_MAX7359 is not set
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_SAMSUNG is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_TM2_TOUCHKEY is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_CYPRESS_SF is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_BYD=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_SYNAPTICS_SMBUS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
CONFIG_MOUSE_PS2_ELANTECH=y
CONFIG_MOUSE_PS2_ELANTECH_SMBUS=y
CONFIG_MOUSE_PS2_SENTELIC=y
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_PS2_FOCALTECH=y
CONFIG_MOUSE_PS2_VMMOUSE=y
CONFIG_MOUSE_PS2_SMBUS=y
CONFIG_MOUSE_SERIAL=m
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_BCM5974 is not set
CONFIG_MOUSE_CYAPA=m
CONFIG_MOUSE_ELAN_I2C=m
CONFIG_MOUSE_ELAN_I2C_I2C=y
CONFIG_MOUSE_ELAN_I2C_SMBUS=y
CONFIG_MOUSE_VSXXXAA=m
# CONFIG_MOUSE_GPIO is not set
CONFIG_MOUSE_SYNAPTICS_I2C=m
# CONFIG_MOUSE_SYNAPTICS_USB is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
CONFIG_RMI4_CORE=m
CONFIG_RMI4_I2C=m
CONFIG_RMI4_SPI=m
CONFIG_RMI4_SMB=m
CONFIG_RMI4_F03=y
CONFIG_RMI4_F03_SERIO=m
CONFIG_RMI4_2D_SENSOR=y
CONFIG_RMI4_F11=y
CONFIG_RMI4_F12=y
CONFIG_RMI4_F30=y
CONFIG_RMI4_F34=y
# CONFIG_RMI4_F3A is not set
CONFIG_RMI4_F55=y

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
CONFIG_SERIO_ALTERA_PS2=m
# CONFIG_SERIO_PS2MULT is not set
CONFIG_SERIO_ARC_PS2=m
# CONFIG_SERIO_GPIO_PS2 is not set
# CONFIG_USERIO is not set
# CONFIG_GAMEPORT is not set
# end of Hardware I/O ports
# end of Input device support

#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
CONFIG_LDISC_AUTOLOAD=y

#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
CONFIG_SERIAL_8250_PNP=y
# CONFIG_SERIAL_8250_16550A_VARIANTS is not set
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_DMA=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_EXAR=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
CONFIG_SERIAL_8250_DETECT_IRQ=y
CONFIG_SERIAL_8250_RSA=y
CONFIG_SERIAL_8250_DWLIB=y
CONFIG_SERIAL_8250_DW=y
# CONFIG_SERIAL_8250_RT288X is not set
CONFIG_SERIAL_8250_LPSS=y
CONFIG_SERIAL_8250_MID=y
CONFIG_SERIAL_8250_PERICOM=y

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_MAX3100 is not set
# CONFIG_SERIAL_MAX310X is not set
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_SERIAL_JSM=m
# CONFIG_SERIAL_LANTIQ is not set
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_SC16IS7XX is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
CONFIG_SERIAL_ARC=m
CONFIG_SERIAL_ARC_NR_PORTS=1
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_SERIAL_FSL_LINFLEXUART is not set
# CONFIG_SERIAL_SPRD is not set
# end of Serial drivers

CONFIG_SERIAL_MCTRL_GPIO=y
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_MOXA_INTELLIO is not set
# CONFIG_MOXA_SMARTIO is not set
CONFIG_SYNCLINK_GT=m
CONFIG_N_HDLC=m
CONFIG_N_GSM=m
CONFIG_NOZOMI=m
# CONFIG_NULL_TTY is not set
CONFIG_HVC_DRIVER=y
# CONFIG_SERIAL_DEV_BUS is not set
# CONFIG_TTY_PRINTK is not set
CONFIG_PRINTER=m
# CONFIG_LP_CONSOLE is not set
CONFIG_PPDEV=m
CONFIG_VIRTIO_CONSOLE=y
CONFIG_IPMI_HANDLER=m
CONFIG_IPMI_DMI_DECODE=y
CONFIG_IPMI_PLAT_DATA=y
CONFIG_IPMI_PANIC_EVENT=y
CONFIG_IPMI_PANIC_STRING=y
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
CONFIG_IPMI_SSIF=m
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_TIMERIOMEM=m
CONFIG_HW_RANDOM_INTEL=m
# CONFIG_HW_RANDOM_AMD is not set
# CONFIG_HW_RANDOM_BA431 is not set
CONFIG_HW_RANDOM_VIA=m
CONFIG_HW_RANDOM_VIRTIO=y
# CONFIG_HW_RANDOM_XIPHERA is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
CONFIG_DEVMEM=y
CONFIG_NVRAM=y
CONFIG_DEVPORT=y
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
# CONFIG_HPET_MMAP_DEFAULT is not set
CONFIG_HANGCHECK_TIMER=m
CONFIG_UV_MMTIMER=m
CONFIG_TCG_TPM=y
CONFIG_HW_RANDOM_TPM=y
CONFIG_TCG_TIS_CORE=y
CONFIG_TCG_TIS=y
# CONFIG_TCG_TIS_SPI is not set
# CONFIG_TCG_TIS_I2C is not set
# CONFIG_TCG_TIS_I2C_CR50 is not set
CONFIG_TCG_TIS_I2C_ATMEL=m
CONFIG_TCG_TIS_I2C_INFINEON=m
CONFIG_TCG_TIS_I2C_NUVOTON=m
CONFIG_TCG_NSC=m
CONFIG_TCG_ATMEL=m
CONFIG_TCG_INFINEON=m
CONFIG_TCG_CRB=y
# CONFIG_TCG_VTPM_PROXY is not set
CONFIG_TCG_TIS_ST33ZP24=m
CONFIG_TCG_TIS_ST33ZP24_I2C=m
# CONFIG_TCG_TIS_ST33ZP24_SPI is not set
CONFIG_TELCLOCK=m
# CONFIG_XILLYBUS is not set
# CONFIG_XILLYUSB is not set
CONFIG_RANDOM_TRUST_CPU=y
CONFIG_RANDOM_TRUST_BOOTLOADER=y
# end of Character devices

#
# I2C support
#
CONFIG_I2C=y
CONFIG_ACPI_I2C_OPREGION=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_COMPAT=y
CONFIG_I2C_CHARDEV=m
CONFIG_I2C_MUX=m

#
# Multiplexer I2C Chip support
#
# CONFIG_I2C_MUX_GPIO is not set
# CONFIG_I2C_MUX_LTC4306 is not set
# CONFIG_I2C_MUX_PCA9541 is not set
# CONFIG_I2C_MUX_PCA954x is not set
# CONFIG_I2C_MUX_REG is not set
CONFIG_I2C_MUX_MLXCPLD=m
# end of Multiplexer I2C Chip support

CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_SMBUS=m
CONFIG_I2C_ALGOBIT=y
CONFIG_I2C_ALGOPCA=m

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
# CONFIG_I2C_AMD_MP2 is not set
CONFIG_I2C_I801=m
CONFIG_I2C_ISCH=m
CONFIG_I2C_ISMT=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_NFORCE2_S4985=m
# CONFIG_I2C_NVIDIA_GPU is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
CONFIG_I2C_SIS96X=m
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m

#
# ACPI drivers
#
CONFIG_I2C_SCMI=m

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_CBUS_GPIO is not set
CONFIG_I2C_DESIGNWARE_CORE=m
# CONFIG_I2C_DESIGNWARE_SLAVE is not set
CONFIG_I2C_DESIGNWARE_PLATFORM=m
# CONFIG_I2C_DESIGNWARE_AMDPSP is not set
CONFIG_I2C_DESIGNWARE_BAYTRAIL=y
# CONFIG_I2C_DESIGNWARE_PCI is not set
# CONFIG_I2C_EMEV2 is not set
# CONFIG_I2C_GPIO is not set
# CONFIG_I2C_OCORES is not set
CONFIG_I2C_PCA_PLATFORM=m
CONFIG_I2C_SIMTEC=m
# CONFIG_I2C_XILINX is not set

#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_DIOLAN_U2C is not set
# CONFIG_I2C_CP2615 is not set
CONFIG_I2C_PARPORT=m
# CONFIG_I2C_PCI1XXXX is not set
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
# CONFIG_I2C_TAOS_EVM is not set
# CONFIG_I2C_TINY_USB is not set

#
# Other I2C/SMBus bus drivers
#
CONFIG_I2C_MLXCPLD=m
# CONFIG_I2C_VIRTIO is not set
# end of I2C Hardware Bus support

CONFIG_I2C_STUB=m
# CONFIG_I2C_SLAVE is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# end of I2C support

# CONFIG_I3C is not set
CONFIG_SPI=y
# CONFIG_SPI_DEBUG is not set
CONFIG_SPI_MASTER=y
# CONFIG_SPI_MEM is not set

#
# SPI Master Controller Drivers
#
# CONFIG_SPI_ALTERA is not set
# CONFIG_SPI_AXI_SPI_ENGINE is not set
# CONFIG_SPI_BITBANG is not set
# CONFIG_SPI_BUTTERFLY is not set
# CONFIG_SPI_CADENCE is not set
# CONFIG_SPI_DESIGNWARE is not set
# CONFIG_SPI_NXP_FLEXSPI is not set
# CONFIG_SPI_GPIO is not set
# CONFIG_SPI_LM70_LLP is not set
# CONFIG_SPI_MICROCHIP_CORE is not set
# CONFIG_SPI_MICROCHIP_CORE_QSPI is not set
# CONFIG_SPI_LANTIQ_SSC is not set
# CONFIG_SPI_OC_TINY is not set
# CONFIG_SPI_PXA2XX is not set
# CONFIG_SPI_ROCKCHIP is not set
# CONFIG_SPI_SC18IS602 is not set
# CONFIG_SPI_SIFIVE is not set
# CONFIG_SPI_MXIC is not set
# CONFIG_SPI_XCOMM is not set
# CONFIG_SPI_XILINX is not set
# CONFIG_SPI_ZYNQMP_GQSPI is not set
# CONFIG_SPI_AMD is not set

#
# SPI Multiplexer support
#
# CONFIG_SPI_MUX is not set

#
# SPI Protocol Masters
#
# CONFIG_SPI_SPIDEV is not set
# CONFIG_SPI_LOOPBACK_TEST is not set
# CONFIG_SPI_TLE62X0 is not set
# CONFIG_SPI_SLAVE is not set
CONFIG_SPI_DYNAMIC=y
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
CONFIG_PPS=y
# CONFIG_PPS_DEBUG is not set

#
# PPS clients support
#
# CONFIG_PPS_CLIENT_KTIMER is not set
CONFIG_PPS_CLIENT_LDISC=m
CONFIG_PPS_CLIENT_PARPORT=m
CONFIG_PPS_CLIENT_GPIO=m

#
# PPS generators support
#

#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK=y
CONFIG_PTP_1588_CLOCK_OPTIONAL=y
# CONFIG_DP83640_PHY is not set
# CONFIG_PTP_1588_CLOCK_INES is not set
CONFIG_PTP_1588_CLOCK_KVM=m
# CONFIG_PTP_1588_CLOCK_IDT82P33 is not set
# CONFIG_PTP_1588_CLOCK_IDTCM is not set
# CONFIG_PTP_1588_CLOCK_VMW is not set
# end of PTP clock support

CONFIG_PINCTRL=y
# CONFIG_DEBUG_PINCTRL is not set
# CONFIG_PINCTRL_AMD is not set
# CONFIG_PINCTRL_CY8C95X0 is not set
# CONFIG_PINCTRL_MCP23S08 is not set
# CONFIG_PINCTRL_SX150X is not set

#
# Intel pinctrl drivers
#
# CONFIG_PINCTRL_BAYTRAIL is not set
# CONFIG_PINCTRL_CHERRYVIEW is not set
# CONFIG_PINCTRL_LYNXPOINT is not set
# CONFIG_PINCTRL_ALDERLAKE is not set
# CONFIG_PINCTRL_BROXTON is not set
# CONFIG_PINCTRL_CANNONLAKE is not set
# CONFIG_PINCTRL_CEDARFORK is not set
# CONFIG_PINCTRL_DENVERTON is not set
# CONFIG_PINCTRL_ELKHARTLAKE is not set
# CONFIG_PINCTRL_EMMITSBURG is not set
# CONFIG_PINCTRL_GEMINILAKE is not set
# CONFIG_PINCTRL_ICELAKE is not set
# CONFIG_PINCTRL_JASPERLAKE is not set
# CONFIG_PINCTRL_LAKEFIELD is not set
# CONFIG_PINCTRL_LEWISBURG is not set
# CONFIG_PINCTRL_METEORLAKE is not set
# CONFIG_PINCTRL_SUNRISEPOINT is not set
# CONFIG_PINCTRL_TIGERLAKE is not set
# end of Intel pinctrl drivers

#
# Renesas pinctrl drivers
#
# end of Renesas pinctrl drivers

CONFIG_GPIOLIB=y
CONFIG_GPIOLIB_FASTPATH_LIMIT=512
CONFIG_GPIO_ACPI=y
# CONFIG_DEBUG_GPIO is not set
CONFIG_GPIO_SYSFS=y
CONFIG_GPIO_CDEV=y
CONFIG_GPIO_CDEV_V1=y

#
# Memory mapped GPIO drivers
#
# CONFIG_GPIO_AMDPT is not set
# CONFIG_GPIO_DWAPB is not set
# CONFIG_GPIO_EXAR is not set
# CONFIG_GPIO_GENERIC_PLATFORM is not set
CONFIG_GPIO_ICH=m
# CONFIG_GPIO_MB86S7X is not set
# CONFIG_GPIO_VX855 is not set
# CONFIG_GPIO_AMD_FCH is not set
# end of Memory mapped GPIO drivers

#
# Port-mapped I/O GPIO drivers
#
# CONFIG_GPIO_F7188X is not set
# CONFIG_GPIO_IT87 is not set
# CONFIG_GPIO_SCH is not set
# CONFIG_GPIO_SCH311X is not set
# CONFIG_GPIO_WINBOND is not set
# CONFIG_GPIO_WS16C48 is not set
# end of Port-mapped I/O GPIO drivers

#
# I2C GPIO expanders
#
# CONFIG_GPIO_MAX7300 is not set
# CONFIG_GPIO_MAX732X is not set
# CONFIG_GPIO_PCA953X is not set
# CONFIG_GPIO_PCA9570 is not set
# CONFIG_GPIO_PCF857X is not set
# CONFIG_GPIO_TPIC2810 is not set
# end of I2C GPIO expanders

#
# MFD GPIO expanders
#
# end of MFD GPIO expanders

#
# PCI GPIO expanders
#
# CONFIG_GPIO_AMD8111 is not set
# CONFIG_GPIO_BT8XX is not set
# CONFIG_GPIO_ML_IOH is not set
# CONFIG_GPIO_PCI_IDIO_16 is not set
# CONFIG_GPIO_PCIE_IDIO_24 is not set
# CONFIG_GPIO_RDC321X is not set
# end of PCI GPIO expanders

#
# SPI GPIO expanders
#
# CONFIG_GPIO_MAX3191X is not set
# CONFIG_GPIO_MAX7301 is not set
# CONFIG_GPIO_MC33880 is not set
# CONFIG_GPIO_PISOSR is not set
# CONFIG_GPIO_XRA1403 is not set
# end of SPI GPIO expanders

#
# USB GPIO expanders
#
# end of USB GPIO expanders

#
# Virtual GPIO drivers
#
# CONFIG_GPIO_AGGREGATOR is not set
CONFIG_GPIO_MOCKUP=m
# CONFIG_GPIO_VIRTIO is not set
CONFIG_GPIO_SIM=m
# end of Virtual GPIO drivers

# CONFIG_W1 is not set
CONFIG_POWER_RESET=y
# CONFIG_POWER_RESET_RESTART is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
CONFIG_POWER_SUPPLY_HWMON=y
# CONFIG_PDA_POWER is not set
# CONFIG_IP5XXX_POWER is not set
# CONFIG_TEST_POWER is not set
# CONFIG_CHARGER_ADP5061 is not set
# CONFIG_BATTERY_CW2015 is not set
# CONFIG_BATTERY_DS2780 is not set
# CONFIG_BATTERY_DS2781 is not set
# CONFIG_BATTERY_DS2782 is not set
# CONFIG_BATTERY_SAMSUNG_SDI is not set
# CONFIG_BATTERY_SBS is not set
# CONFIG_CHARGER_SBS is not set
# CONFIG_MANAGER_SBS is not set
# CONFIG_BATTERY_BQ27XXX is not set
# CONFIG_BATTERY_MAX17040 is not set
# CONFIG_BATTERY_MAX17042 is not set
# CONFIG_CHARGER_MAX8903 is not set
# CONFIG_CHARGER_LP8727 is not set
# CONFIG_CHARGER_GPIO is not set
# CONFIG_CHARGER_LT3651 is not set
# CONFIG_CHARGER_LTC4162L is not set
# CONFIG_CHARGER_MAX77976 is not set
# CONFIG_CHARGER_BQ2415X is not set
# CONFIG_CHARGER_BQ24257 is not set
# CONFIG_CHARGER_BQ24735 is not set
# CONFIG_CHARGER_BQ2515X is not set
# CONFIG_CHARGER_BQ25890 is not set
# CONFIG_CHARGER_BQ25980 is not set
# CONFIG_CHARGER_BQ256XX is not set
# CONFIG_BATTERY_GAUGE_LTC2941 is not set
# CONFIG_BATTERY_GOLDFISH is not set
# CONFIG_BATTERY_RT5033 is not set
# CONFIG_CHARGER_RT9455 is not set
# CONFIG_CHARGER_BD99954 is not set
# CONFIG_BATTERY_UG3105 is not set
CONFIG_HWMON=y
CONFIG_HWMON_VID=m
# CONFIG_HWMON_DEBUG_CHIP is not set

#
# Native drivers
#
CONFIG_SENSORS_ABITUGURU=m
CONFIG_SENSORS_ABITUGURU3=m
# CONFIG_SENSORS_AD7314 is not set
CONFIG_SENSORS_AD7414=m
CONFIG_SENSORS_AD7418=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1026=m
CONFIG_SENSORS_ADM1029=m
CONFIG_SENSORS_ADM1031=m
# CONFIG_SENSORS_ADM1177 is not set
CONFIG_SENSORS_ADM9240=m
CONFIG_SENSORS_ADT7X10=m
# CONFIG_SENSORS_ADT7310 is not set
CONFIG_SENSORS_ADT7410=m
CONFIG_SENSORS_ADT7411=m
CONFIG_SENSORS_ADT7462=m
CONFIG_SENSORS_ADT7470=m
CONFIG_SENSORS_ADT7475=m
# CONFIG_SENSORS_AHT10 is not set
# CONFIG_SENSORS_AQUACOMPUTER_D5NEXT is not set
# CONFIG_SENSORS_AS370 is not set
CONFIG_SENSORS_ASC7621=m
# CONFIG_SENSORS_AXI_FAN_CONTROL is not set
CONFIG_SENSORS_K8TEMP=m
CONFIG_SENSORS_K10TEMP=m
CONFIG_SENSORS_FAM15H_POWER=m
CONFIG_SENSORS_APPLESMC=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_ATXP1=m
# CONFIG_SENSORS_CORSAIR_CPRO is not set
# CONFIG_SENSORS_CORSAIR_PSU is not set
# CONFIG_SENSORS_DRIVETEMP is not set
CONFIG_SENSORS_DS620=m
CONFIG_SENSORS_DS1621=m
# CONFIG_SENSORS_DELL_SMM is not set
CONFIG_SENSORS_I5K_AMB=m
CONFIG_SENSORS_F71805F=m
CONFIG_SENSORS_F71882FG=m
CONFIG_SENSORS_F75375S=m
CONFIG_SENSORS_FSCHMD=m
# CONFIG_SENSORS_FTSTEUTATES is not set
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_GL520SM=m
CONFIG_SENSORS_G760A=m
# CONFIG_SENSORS_G762 is not set
# CONFIG_SENSORS_HIH6130 is not set
CONFIG_SENSORS_IBMAEM=m
CONFIG_SENSORS_IBMPEX=m
CONFIG_SENSORS_I5500=m
CONFIG_SENSORS_CORETEMP=m
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_JC42=m
# CONFIG_SENSORS_POWR1220 is not set
CONFIG_SENSORS_LINEAGE=m
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC2947_I2C is not set
# CONFIG_SENSORS_LTC2947_SPI is not set
# CONFIG_SENSORS_LTC2990 is not set
# CONFIG_SENSORS_LTC2992 is not set
CONFIG_SENSORS_LTC4151=m
CONFIG_SENSORS_LTC4215=m
# CONFIG_SENSORS_LTC4222 is not set
CONFIG_SENSORS_LTC4245=m
# CONFIG_SENSORS_LTC4260 is not set
CONFIG_SENSORS_LTC4261=m
# CONFIG_SENSORS_MAX1111 is not set
# CONFIG_SENSORS_MAX127 is not set
CONFIG_SENSORS_MAX16065=m
CONFIG_SENSORS_MAX1619=m
CONFIG_SENSORS_MAX1668=m
CONFIG_SENSORS_MAX197=m
# CONFIG_SENSORS_MAX31722 is not set
# CONFIG_SENSORS_MAX31730 is not set
# CONFIG_SENSORS_MAX31760 is not set
# CONFIG_SENSORS_MAX6620 is not set
# CONFIG_SENSORS_MAX6621 is not set
CONFIG_SENSORS_MAX6639=m
CONFIG_SENSORS_MAX6650=m
CONFIG_SENSORS_MAX6697=m
# CONFIG_SENSORS_MAX31790 is not set
CONFIG_SENSORS_MCP3021=m
# CONFIG_SENSORS_MLXREG_FAN is not set
# CONFIG_SENSORS_TC654 is not set
# CONFIG_SENSORS_TPS23861 is not set
# CONFIG_SENSORS_MR75203 is not set
# CONFIG_SENSORS_ADCXX is not set
CONFIG_SENSORS_LM63=m
# CONFIG_SENSORS_LM70 is not set
CONFIG_SENSORS_LM73=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM87=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_LM92=m
CONFIG_SENSORS_LM93=m
CONFIG_SENSORS_LM95234=m
CONFIG_SENSORS_LM95241=m
CONFIG_SENSORS_LM95245=m
CONFIG_SENSORS_PC87360=m
CONFIG_SENSORS_PC87427=m
# CONFIG_SENSORS_NCT6683 is not set
CONFIG_SENSORS_NCT6775_CORE=m
CONFIG_SENSORS_NCT6775=m
# CONFIG_SENSORS_NCT6775_I2C is not set
# CONFIG_SENSORS_NCT7802 is not set
# CONFIG_SENSORS_NCT7904 is not set
# CONFIG_SENSORS_NPCM7XX is not set
# CONFIG_SENSORS_NZXT_KRAKEN2 is not set
# CONFIG_SENSORS_NZXT_SMART2 is not set
CONFIG_SENSORS_PCF8591=m
CONFIG_PMBUS=m
CONFIG_SENSORS_PMBUS=m
# CONFIG_SENSORS_ADM1266 is not set
CONFIG_SENSORS_ADM1275=m
# CONFIG_SENSORS_BEL_PFE is not set
# CONFIG_SENSORS_BPA_RS600 is not set
# CONFIG_SENSORS_DELTA_AHE50DC_FAN is not set
# CONFIG_SENSORS_FSP_3Y is not set
# CONFIG_SENSORS_IBM_CFFPS is not set
# CONFIG_SENSORS_DPS920AB is not set
# CONFIG_SENSORS_INSPUR_IPSPS is not set
# CONFIG_SENSORS_IR35221 is not set
# CONFIG_SENSORS_IR36021 is not set
# CONFIG_SENSORS_IR38064 is not set
# CONFIG_SENSORS_IRPS5401 is not set
# CONFIG_SENSORS_ISL68137 is not set
CONFIG_SENSORS_LM25066=m
# CONFIG_SENSORS_LT7182S is not set
CONFIG_SENSORS_LTC2978=m
# CONFIG_SENSORS_LTC3815 is not set
# CONFIG_SENSORS_MAX15301 is not set
CONFIG_SENSORS_MAX16064=m
# CONFIG_SENSORS_MAX16601 is not set
# CONFIG_SENSORS_MAX20730 is not set
# CONFIG_SENSORS_MAX20751 is not set
# CONFIG_SENSORS_MAX31785 is not set
CONFIG_SENSORS_MAX34440=m
CONFIG_SENSORS_MAX8688=m
# CONFIG_SENSORS_MP2888 is not set
# CONFIG_SENSORS_MP2975 is not set
# CONFIG_SENSORS_MP5023 is not set
# CONFIG_SENSORS_PIM4328 is not set
# CONFIG_SENSORS_PLI1209BC is not set
# CONFIG_SENSORS_PM6764TR is not set
# CONFIG_SENSORS_PXE1610 is not set
# CONFIG_SENSORS_Q54SJ108A2 is not set
# CONFIG_SENSORS_STPDDC60 is not set
# CONFIG_SENSORS_TPS40422 is not set
# CONFIG_SENSORS_TPS53679 is not set
# CONFIG_SENSORS_TPS546D24 is not set
CONFIG_SENSORS_UCD9000=m
CONFIG_SENSORS_UCD9200=m
# CONFIG_SENSORS_XDPE152 is not set
# CONFIG_SENSORS_XDPE122 is not set
CONFIG_SENSORS_ZL6100=m
# CONFIG_SENSORS_SBTSI is not set
# CONFIG_SENSORS_SBRMI is not set
CONFIG_SENSORS_SHT15=m
CONFIG_SENSORS_SHT21=m
# CONFIG_SENSORS_SHT3x is not set
# CONFIG_SENSORS_SHT4x is not set
# CONFIG_SENSORS_SHTC1 is not set
CONFIG_SENSORS_SIS5595=m
CONFIG_SENSORS_DME1737=m
CONFIG_SENSORS_EMC1403=m
# CONFIG_SENSORS_EMC2103 is not set
# CONFIG_SENSORS_EMC2305 is not set
CONFIG_SENSORS_EMC6W201=m
CONFIG_SENSORS_SMSC47M1=m
CONFIG_SENSORS_SMSC47M192=m
CONFIG_SENSORS_SMSC47B397=m
CONFIG_SENSORS_SCH56XX_COMMON=m
CONFIG_SENSORS_SCH5627=m
CONFIG_SENSORS_SCH5636=m
# CONFIG_SENSORS_STTS751 is not set
# CONFIG_SENSORS_SMM665 is not set
# CONFIG_SENSORS_ADC128D818 is not set
CONFIG_SENSORS_ADS7828=m
# CONFIG_SENSORS_ADS7871 is not set
CONFIG_SENSORS_AMC6821=m
CONFIG_SENSORS_INA209=m
CONFIG_SENSORS_INA2XX=m
# CONFIG_SENSORS_INA238 is not set
# CONFIG_SENSORS_INA3221 is not set
# CONFIG_SENSORS_TC74 is not set
CONFIG_SENSORS_THMC50=m
CONFIG_SENSORS_TMP102=m
# CONFIG_SENSORS_TMP103 is not set
# CONFIG_SENSORS_TMP108 is not set
CONFIG_SENSORS_TMP401=m
CONFIG_SENSORS_TMP421=m
# CONFIG_SENSORS_TMP464 is not set
# CONFIG_SENSORS_TMP513 is not set
CONFIG_SENSORS_VIA_CPUTEMP=m
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_VT1211=m
CONFIG_SENSORS_VT8231=m
# CONFIG_SENSORS_W83773G is not set
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83791D=m
CONFIG_SENSORS_W83792D=m
CONFIG_SENSORS_W83793=m
CONFIG_SENSORS_W83795=m
# CONFIG_SENSORS_W83795_FANCTRL is not set
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83L786NG=m
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
# CONFIG_SENSORS_XGENE is not set

#
# ACPI drivers
#
CONFIG_SENSORS_ACPI_POWER=m
CONFIG_SENSORS_ATK0110=m
# CONFIG_SENSORS_ASUS_WMI is not set
# CONFIG_SENSORS_ASUS_EC is not set
CONFIG_THERMAL=y
# CONFIG_THERMAL_NETLINK is not set
# CONFIG_THERMAL_STATISTICS is not set
CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
CONFIG_THERMAL_HWMON=y
CONFIG_THERMAL_WRITABLE_TRIPS=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
CONFIG_THERMAL_GOV_FAIR_SHARE=y
CONFIG_THERMAL_GOV_STEP_WISE=y
CONFIG_THERMAL_GOV_BANG_BANG=y
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_THERMAL_EMULATION is not set

#
# Intel thermal drivers
#
CONFIG_INTEL_POWERCLAMP=y
CONFIG_X86_THERMAL_VECTOR=y
CONFIG_X86_PKG_TEMP_THERMAL=m
# CONFIG_INTEL_SOC_DTS_THERMAL is not set

#
# ACPI INT340X thermal drivers
#
# CONFIG_INT340X_THERMAL is not set
# end of ACPI INT340X thermal drivers

CONFIG_INTEL_PCH_THERMAL=m
# CONFIG_INTEL_TCC_COOLING is not set
# CONFIG_INTEL_MENLOW is not set
# CONFIG_INTEL_HFI_THERMAL is not set
# end of Intel thermal drivers

CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y
CONFIG_WATCHDOG_OPEN_TIMEOUT=0
CONFIG_WATCHDOG_SYSFS=y
# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set

#
# Watchdog Pretimeout Governors
#
# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set

#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
CONFIG_WDAT_WDT=m
# CONFIG_XILINX_WATCHDOG is not set
# CONFIG_ZIIRAVE_WATCHDOG is not set
# CONFIG_MLX_WDT is not set
# CONFIG_CADENCE_WATCHDOG is not set
# CONFIG_DW_WATCHDOG is not set
# CONFIG_MAX63XX_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
CONFIG_ALIM1535_WDT=m
CONFIG_ALIM7101_WDT=m
# CONFIG_EBC_C384_WDT is not set
# CONFIG_EXAR_WDT is not set
CONFIG_F71808E_WDT=m
# CONFIG_SP5100_TCO is not set
CONFIG_SBC_FITPC2_WATCHDOG=m
# CONFIG_EUROTECH_WDT is not set
CONFIG_IB700_WDT=m
CONFIG_IBMASR=m
# CONFIG_WAFER_WDT is not set
CONFIG_I6300ESB_WDT=y
CONFIG_IE6XX_WDT=m
CONFIG_ITCO_WDT=y
CONFIG_ITCO_VENDOR_SUPPORT=y
CONFIG_IT8712F_WDT=m
CONFIG_IT87_WDT=m
CONFIG_HP_WATCHDOG=m
CONFIG_HPWDT_NMI_DECODING=y
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
CONFIG_NV_TCO=m
# CONFIG_60XX_WDT is not set
# CONFIG_CPU5_WDT is not set
CONFIG_SMSC_SCH311X_WDT=m
# CONFIG_SMSC37B787_WDT is not set
# CONFIG_TQMX86_WDT is not set
CONFIG_VIA_WDT=m
CONFIG_W83627HF_WDT=m
CONFIG_W83877F_WDT=m
CONFIG_W83977F_WDT=m
CONFIG_MACHZ_WDT=m
# CONFIG_SBC_EPX_C3_WATCHDOG is not set
CONFIG_INTEL_MEI_WDT=m
# CONFIG_NI903X_WDT is not set
# CONFIG_NIC7018_WDT is not set
# CONFIG_MEN_A21_WDT is not set

#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=m
CONFIG_WDTPCI=m

#
# USB-based Watchdog Cards
#
# CONFIG_USBPCWATCHDOG is not set
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set
CONFIG_BCMA_POSSIBLE=y
CONFIG_BCMA=m
CONFIG_BCMA_HOST_PCI_POSSIBLE=y
CONFIG_BCMA_HOST_PCI=y
# CONFIG_BCMA_HOST_SOC is not set
CONFIG_BCMA_DRIVER_PCI=y
CONFIG_BCMA_DRIVER_GMAC_CMN=y
CONFIG_BCMA_DRIVER_GPIO=y
# CONFIG_BCMA_DEBUG is not set

#
# Multifunction device drivers
#
CONFIG_MFD_CORE=y
# CONFIG_MFD_AS3711 is not set
# CONFIG_PMIC_ADP5520 is not set
# CONFIG_MFD_AAT2870_CORE is not set
# CONFIG_MFD_BCM590XX is not set
# CONFIG_MFD_BD9571MWV is not set
# CONFIG_MFD_AXP20X_I2C is not set
# CONFIG_MFD_MADERA is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_DA9052_SPI is not set
# CONFIG_MFD_DA9052_I2C is not set
# CONFIG_MFD_DA9055 is not set
# CONFIG_MFD_DA9062 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_DA9150 is not set
# CONFIG_MFD_DLN2 is not set
# CONFIG_MFD_MC13XXX_SPI is not set
# CONFIG_MFD_MC13XXX_I2C is not set
# CONFIG_MFD_MP2629 is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_HTC_I2CPLD is not set
# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
CONFIG_LPC_ICH=m
CONFIG_LPC_SCH=m
CONFIG_MFD_INTEL_LPSS=y
CONFIG_MFD_INTEL_LPSS_ACPI=y
CONFIG_MFD_INTEL_LPSS_PCI=y
# CONFIG_MFD_INTEL_PMC_BXT is not set
# CONFIG_MFD_IQS62X is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_88PM800 is not set
# CONFIG_MFD_88PM805 is not set
# CONFIG_MFD_88PM860X is not set
# CONFIG_MFD_MAX14577 is not set
# CONFIG_MFD_MAX77693 is not set
# CONFIG_MFD_MAX77843 is not set
# CONFIG_MFD_MAX8907 is not set
# CONFIG_MFD_MAX8925 is not set
# CONFIG_MFD_MAX8997 is not set
# CONFIG_MFD_MAX8998 is not set
# CONFIG_MFD_MT6360 is not set
# CONFIG_MFD_MT6370 is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_MENF21BMC is not set
# CONFIG_MFD_OCELOT is not set
# CONFIG_EZX_PCAP is not set
# CONFIG_MFD_VIPERBOARD is not set
# CONFIG_MFD_RETU is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_MFD_SY7636A is not set
# CONFIG_MFD_RDC321X is not set
# CONFIG_MFD_RT4831 is not set
# CONFIG_MFD_RT5033 is not set
# CONFIG_MFD_RT5120 is not set
# CONFIG_MFD_RC5T583 is not set
# CONFIG_MFD_SI476X_CORE is not set
CONFIG_MFD_SM501=m
CONFIG_MFD_SM501_GPIO=y
# CONFIG_MFD_SKY81452 is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_LP3943 is not set
# CONFIG_MFD_LP8788 is not set
# CONFIG_MFD_TI_LMU is not set
# CONFIG_MFD_PALMAS is not set
# CONFIG_TPS6105X is not set
# CONFIG_TPS65010 is not set
# CONFIG_TPS6507X is not set
# CONFIG_MFD_TPS65086 is not set
# CONFIG_MFD_TPS65090 is not set
# CONFIG_MFD_TI_LP873X is not set
# CONFIG_MFD_TPS6586X is not set
# CONFIG_MFD_TPS65910 is not set
# CONFIG_MFD_TPS65912_I2C is not set
# CONFIG_MFD_TPS65912_SPI is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_TWL6040_CORE is not set
# CONFIG_MFD_WL1273_CORE is not set
# CONFIG_MFD_LM3533 is not set
# CONFIG_MFD_TQMX86 is not set
CONFIG_MFD_VX855=m
# CONFIG_MFD_ARIZONA_I2C is not set
# CONFIG_MFD_ARIZONA_SPI is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM831X_I2C is not set
# CONFIG_MFD_WM831X_SPI is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_WM8994 is not set
# CONFIG_MFD_ATC260X_I2C is not set
# CONFIG_MFD_INTEL_M10_BMC is not set
# end of Multifunction device drivers

# CONFIG_REGULATOR is not set
CONFIG_RC_CORE=y
CONFIG_BPF_LIRC_MODE2=y
CONFIG_LIRC=y
CONFIG_RC_MAP=m
CONFIG_RC_DECODERS=y
CONFIG_IR_IMON_DECODER=m
CONFIG_IR_JVC_DECODER=m
CONFIG_IR_MCE_KBD_DECODER=m
CONFIG_IR_NEC_DECODER=m
CONFIG_IR_RC5_DECODER=m
CONFIG_IR_RC6_DECODER=m
# CONFIG_IR_RCMM_DECODER is not set
CONFIG_IR_SANYO_DECODER=m
CONFIG_IR_SHARP_DECODER=m
CONFIG_IR_SONY_DECODER=m
# CONFIG_IR_XMP_DECODER is not set
CONFIG_RC_DEVICES=y
CONFIG_IR_ENE=m
CONFIG_IR_FINTEK=m
# CONFIG_IR_IGORPLUGUSB is not set
# CONFIG_IR_IGUANA is not set
# CONFIG_IR_IMON is not set
# CONFIG_IR_IMON_RAW is not set
CONFIG_IR_ITE_CIR=m
# CONFIG_IR_MCEUSB is not set
CONFIG_IR_NUVOTON=m
# CONFIG_IR_REDRAT3 is not set
CONFIG_IR_SERIAL=m
CONFIG_IR_SERIAL_TRANSMITTER=y
# CONFIG_IR_STREAMZAP is not set
# CONFIG_IR_TOY is not set
# CONFIG_IR_TTUSBIR is not set
CONFIG_IR_WINBOND_CIR=m
# CONFIG_RC_ATI_REMOTE is not set
CONFIG_RC_LOOPBACK=m
# CONFIG_RC_XBOX_DVD is not set

#
# CEC support
#
# CONFIG_MEDIA_CEC_SUPPORT is not set
# end of CEC support

CONFIG_MEDIA_SUPPORT=m
CONFIG_MEDIA_SUPPORT_FILTER=y
CONFIG_MEDIA_SUBDRV_AUTOSELECT=y

#
# Media device types
#
# CONFIG_MEDIA_CAMERA_SUPPORT is not set
# CONFIG_MEDIA_ANALOG_TV_SUPPORT is not set
# CONFIG_MEDIA_DIGITAL_TV_SUPPORT is not set
# CONFIG_MEDIA_RADIO_SUPPORT is not set
# CONFIG_MEDIA_SDR_SUPPORT is not set
# CONFIG_MEDIA_PLATFORM_SUPPORT is not set
# CONFIG_MEDIA_TEST_SUPPORT is not set
# end of Media device types

#
# Media drivers
#

#
# Drivers filtered as selected at 'Filter media drivers'
#

#
# Media drivers
#
# CONFIG_MEDIA_USB_SUPPORT is not set
# CONFIG_MEDIA_PCI_SUPPORT is not set
# end of Media drivers

#
# Media ancillary drivers
#
# end of Media ancillary drivers

#
# Graphics support
#
CONFIG_APERTURE_HELPERS=y
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_INTEL=y
CONFIG_AGP_SIS=y
CONFIG_AGP_VIA=y
CONFIG_INTEL_GTT=y
CONFIG_VGA_SWITCHEROO=y
CONFIG_DRM=y
CONFIG_DRM_MIPI_DSI=y
# CONFIG_DRM_DEBUG_MM is not set
CONFIG_DRM_USE_DYNAMIC_DEBUG=y
CONFIG_DRM_KMS_HELPER=m
# CONFIG_DRM_DEBUG_DP_MST_TOPOLOGY_REFS is not set
CONFIG_DRM_DEBUG_MODESET_LOCK=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_FBDEV_OVERALLOC=100
# CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM is not set
CONFIG_DRM_LOAD_EDID_FIRMWARE=y
CONFIG_DRM_DISPLAY_HELPER=m
CONFIG_DRM_DISPLAY_DP_HELPER=y
CONFIG_DRM_DISPLAY_HDCP_HELPER=y
CONFIG_DRM_DISPLAY_HDMI_HELPER=y
CONFIG_DRM_DP_AUX_CHARDEV=y
# CONFIG_DRM_DP_CEC is not set
CONFIG_DRM_TTM=m
CONFIG_DRM_BUDDY=m
CONFIG_DRM_VRAM_HELPER=m
CONFIG_DRM_TTM_HELPER=m
CONFIG_DRM_GEM_SHMEM_HELPER=y

#
# I2C encoder or helper chips
#
CONFIG_DRM_I2C_CH7006=m
CONFIG_DRM_I2C_SIL164=m
# CONFIG_DRM_I2C_NXP_TDA998X is not set
# CONFIG_DRM_I2C_NXP_TDA9950 is not set
# end of I2C encoder or helper chips

#
# ARM devices
#
# end of ARM devices

# CONFIG_DRM_RADEON is not set
# CONFIG_DRM_AMDGPU is not set
# CONFIG_DRM_NOUVEAU is not set
CONFIG_DRM_I915=m
CONFIG_DRM_I915_FORCE_PROBE=""
CONFIG_DRM_I915_CAPTURE_ERROR=y
CONFIG_DRM_I915_COMPRESS_ERROR=y
CONFIG_DRM_I915_USERPTR=y
# CONFIG_DRM_I915_GVT_KVMGT is not set

#
# drm/i915 Debugging
#
# CONFIG_DRM_I915_WERROR is not set
# CONFIG_DRM_I915_DEBUG is not set
# CONFIG_DRM_I915_DEBUG_MMIO is not set
# CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS is not set
# CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is not set
# CONFIG_DRM_I915_DEBUG_GUC is not set
# CONFIG_DRM_I915_SELFTEST is not set
# CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS is not set
# CONFIG_DRM_I915_DEBUG_VBLANK_EVADE is not set
# CONFIG_DRM_I915_DEBUG_RUNTIME_PM is not set
# end of drm/i915 Debugging

#
# drm/i915 Profile Guided Optimisation
#
CONFIG_DRM_I915_REQUEST_TIMEOUT=20000
CONFIG_DRM_I915_FENCE_TIMEOUT=10000
CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND=250
CONFIG_DRM_I915_HEARTBEAT_INTERVAL=2500
CONFIG_DRM_I915_PREEMPT_TIMEOUT=640
CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT=8000
CONFIG_DRM_I915_STOP_TIMEOUT=100
CONFIG_DRM_I915_TIMESLICE_DURATION=1
# end of drm/i915 Profile Guided Optimisation

CONFIG_DRM_VGEM=y
# CONFIG_DRM_VKMS is not set
# CONFIG_DRM_VMWGFX is not set
CONFIG_DRM_GMA500=m
# CONFIG_DRM_UDL is not set
CONFIG_DRM_AST=m
# CONFIG_DRM_MGAG200 is not set
CONFIG_DRM_QXL=m
CONFIG_DRM_VIRTIO_GPU=m
CONFIG_DRM_PANEL=y

#
# Display Panels
#
# CONFIG_DRM_PANEL_RASPBERRYPI_TOUCHSCREEN is not set
# CONFIG_DRM_PANEL_WIDECHIPS_WS2401 is not set
# end of Display Panels

CONFIG_DRM_BRIDGE=y
CONFIG_DRM_PANEL_BRIDGE=y

#
# Display Interface Bridges
#
# CONFIG_DRM_ANALOGIX_ANX78XX is not set
# end of Display Interface Bridges

# CONFIG_DRM_ETNAVIV is not set
CONFIG_DRM_BOCHS=m
CONFIG_DRM_CIRRUS_QEMU=m
# CONFIG_DRM_GM12U320 is not set
# CONFIG_DRM_PANEL_MIPI_DBI is not set
# CONFIG_DRM_SIMPLEDRM is not set
# CONFIG_TINYDRM_HX8357D is not set
# CONFIG_TINYDRM_ILI9163 is not set
# CONFIG_TINYDRM_ILI9225 is not set
# CONFIG_TINYDRM_ILI9341 is not set
# CONFIG_TINYDRM_ILI9486 is not set
# CONFIG_TINYDRM_MI0283QT is not set
# CONFIG_TINYDRM_REPAPER is not set
# CONFIG_TINYDRM_ST7586 is not set
# CONFIG_TINYDRM_ST7735R is not set
# CONFIG_DRM_VBOXVIDEO is not set
# CONFIG_DRM_GUD is not set
# CONFIG_DRM_SSD130X is not set
# CONFIG_DRM_LEGACY is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y
CONFIG_DRM_NOMODESET=y
CONFIG_DRM_PRIVACY_SCREEN=y

#
# Frame buffer Devices
#
CONFIG_FB_CMDLINE=y
CONFIG_FB_NOTIFY=y
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_SYS_FILLRECT=m
CONFIG_FB_SYS_COPYAREA=m
CONFIG_FB_SYS_IMAGEBLIT=m
# CONFIG_FB_FOREIGN_ENDIAN is not set
CONFIG_FB_SYS_FOPS=m
CONFIG_FB_DEFERRED_IO=y
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I740 is not set
# CONFIG_FB_LE80578 is not set
# CONFIG_FB_INTEL is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_SM501 is not set
# CONFIG_FB_SMSCUFX is not set
# CONFIG_FB_UDL is not set
# CONFIG_FB_IBM_GXT4500 is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_SIMPLE is not set
# CONFIG_FB_SSD1307 is not set
# CONFIG_FB_SM712 is not set
# end of Frame buffer Devices

#
# Backlight & LCD device support
#
CONFIG_LCD_CLASS_DEVICE=m
# CONFIG_LCD_L4F00242T03 is not set
# CONFIG_LCD_LMS283GF05 is not set
# CONFIG_LCD_LTV350QV is not set
# CONFIG_LCD_ILI922X is not set
# CONFIG_LCD_ILI9320 is not set
# CONFIG_LCD_TDO24M is not set
# CONFIG_LCD_VGG2432A4 is not set
CONFIG_LCD_PLATFORM=m
# CONFIG_LCD_AMS369FG06 is not set
# CONFIG_LCD_LMS501KF03 is not set
# CONFIG_LCD_HX8357 is not set
# CONFIG_LCD_OTM3225A is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_KTD253 is not set
# CONFIG_BACKLIGHT_PWM is not set
CONFIG_BACKLIGHT_APPLE=m
# CONFIG_BACKLIGHT_QCOM_WLED is not set
# CONFIG_BACKLIGHT_SAHARA is not set
# CONFIG_BACKLIGHT_ADP8860 is not set
# CONFIG_BACKLIGHT_ADP8870 is not set
# CONFIG_BACKLIGHT_LM3630A is not set
# CONFIG_BACKLIGHT_LM3639 is not set
CONFIG_BACKLIGHT_LP855X=m
# CONFIG_BACKLIGHT_GPIO is not set
# CONFIG_BACKLIGHT_LV5207LP is not set
# CONFIG_BACKLIGHT_BD6107 is not set
# CONFIG_BACKLIGHT_ARCXCNN is not set
# end of Backlight & LCD device support

CONFIG_HDMI=y

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
CONFIG_FRAMEBUFFER_CONSOLE=y
# CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION is not set
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
# CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER is not set
# end of Console display driver support

CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
# CONFIG_LOGO_LINUX_VGA16 is not set
CONFIG_LOGO_LINUX_CLUT224=y
# end of Graphics support

# CONFIG_SOUND is not set

#
# HID support
#
CONFIG_HID=y
CONFIG_HID_BATTERY_STRENGTH=y
CONFIG_HIDRAW=y
CONFIG_UHID=m
CONFIG_HID_GENERIC=y

#
# Special HID drivers
#
CONFIG_HID_A4TECH=y
# CONFIG_HID_ACCUTOUCH is not set
CONFIG_HID_ACRUX=m
# CONFIG_HID_ACRUX_FF is not set
CONFIG_HID_APPLE=m
# CONFIG_HID_APPLEIR is not set
CONFIG_HID_ASUS=m
CONFIG_HID_AUREAL=m
CONFIG_HID_BELKIN=y
# CONFIG_HID_BETOP_FF is not set
# CONFIG_HID_BIGBEN_FF is not set
CONFIG_HID_CHERRY=y
# CONFIG_HID_CHICONY is not set
# CONFIG_HID_CORSAIR is not set
# CONFIG_HID_COUGAR is not set
# CONFIG_HID_MACALLY is not set
CONFIG_HID_CMEDIA=m
# CONFIG_HID_CP2112 is not set
# CONFIG_HID_CREATIVE_SB0540 is not set
CONFIG_HID_CYPRESS=y
CONFIG_HID_DRAGONRISE=y
# CONFIG_DRAGONRISE_FF is not set
# CONFIG_HID_EMS_FF is not set
# CONFIG_HID_ELAN is not set
CONFIG_HID_ELECOM=m
# CONFIG_HID_ELO is not set
CONFIG_HID_EZKEY=y
# CONFIG_HID_FT260 is not set
CONFIG_HID_GEMBIRD=m
CONFIG_HID_GFRM=m
# CONFIG_HID_GLORIOUS is not set
# CONFIG_HID_HOLTEK is not set
# CONFIG_HID_VIVALDI is not set
# CONFIG_HID_GT683R is not set
CONFIG_HID_KEYTOUCH=m
CONFIG_HID_KYE=y
# CONFIG_HID_UCLOGIC is not set
CONFIG_HID_WALTOP=m
# CONFIG_HID_VIEWSONIC is not set
# CONFIG_HID_VRC2 is not set
# CONFIG_HID_XIAOMI is not set
CONFIG_HID_GYRATION=y
CONFIG_HID_ICADE=m
CONFIG_HID_ITE=m
CONFIG_HID_JABRA=m
CONFIG_HID_TWINHAN=y
CONFIG_HID_KENSINGTON=y
CONFIG_HID_LCPOWER=m
CONFIG_HID_LED=m
CONFIG_HID_LENOVO=m
# CONFIG_HID_LETSKETCH is not set
CONFIG_HID_LOGITECH=m
CONFIG_HID_LOGITECH_DJ=m
CONFIG_HID_LOGITECH_HIDPP=m
# CONFIG_LOGITECH_FF is not set
# CONFIG_LOGIRUMBLEPAD2_FF is not set
# CONFIG_LOGIG940_FF is not set
# CONFIG_LOGIWHEELS_FF is not set
CONFIG_HID_MAGICMOUSE=y
# CONFIG_HID_MALTRON is not set
# CONFIG_HID_MAYFLASH is not set
# CONFIG_HID_MEGAWORLD_FF is not set
# CONFIG_HID_REDRAGON is not set
CONFIG_HID_MICROSOFT=y
CONFIG_HID_MONTEREY=y
CONFIG_HID_MULTITOUCH=m
# CONFIG_HID_NINTENDO is not set
CONFIG_HID_NTI=m
# CONFIG_HID_NTRIG is not set
CONFIG_HID_ORTEK=m
CONFIG_HID_PANTHERLORD=y
# CONFIG_PANTHERLORD_FF is not set
# CONFIG_HID_PENMOUNT is not set
CONFIG_HID_PETALYNX=y
CONFIG_HID_PICOLCD=m
CONFIG_HID_PICOLCD_FB=y
CONFIG_HID_PICOLCD_BACKLIGHT=y
CONFIG_HID_PICOLCD_LCD=y
CONFIG_HID_PICOLCD_LEDS=y
CONFIG_HID_PICOLCD_CIR=y
CONFIG_HID_PLANTRONICS=m
# CONFIG_HID_PXRC is not set
# CONFIG_HID_RAZER is not set
CONFIG_HID_PRIMAX=m
# CONFIG_HID_RETRODE is not set
# CONFIG_HID_ROCCAT is not set
CONFIG_HID_SAITEK=m
CONFIG_HID_SAMSUNG=m
# CONFIG_HID_SEMITEK is not set
# CONFIG_HID_SIGMAMICRO is not set
# CONFIG_HID_SONY is not set
CONFIG_HID_SPEEDLINK=m
# CONFIG_HID_STEAM is not set
CONFIG_HID_STEELSERIES=m
CONFIG_HID_SUNPLUS=y
CONFIG_HID_RMI=m
CONFIG_HID_GREENASIA=y
# CONFIG_GREENASIA_FF is not set
CONFIG_HID_SMARTJOYPLUS=y
# CONFIG_SMARTJOYPLUS_FF is not set
CONFIG_HID_TIVO=m
CONFIG_HID_TOPSEED=y
# CONFIG_HID_TOPRE is not set
CONFIG_HID_THINGM=m
CONFIG_HID_THRUSTMASTER=m
# CONFIG_THRUSTMASTER_FF is not set
# CONFIG_HID_UDRAW_PS3 is not set
# CONFIG_HID_U2FZERO is not set
# CONFIG_HID_WACOM is not set
CONFIG_HID_WIIMOTE=m
CONFIG_HID_XINMO=m
CONFIG_HID_ZEROPLUS=y
CONFIG_ZEROPLUS_FF=y
CONFIG_HID_ZYDACRON=m
CONFIG_HID_SENSOR_HUB=y
CONFIG_HID_SENSOR_CUSTOM_SENSOR=m
CONFIG_HID_ALPS=m
# CONFIG_HID_MCP2221 is not set
# end of Special HID drivers

#
# USB HID support
#
CONFIG_USB_HID=y
# CONFIG_HID_PID is not set
# CONFIG_USB_HIDDEV is not set
# end of USB HID support

#
# I2C HID support
#
# CONFIG_I2C_HID_ACPI is not set
# end of I2C HID support

#
# Intel ISH HID support
#
CONFIG_INTEL_ISH_HID=m
# CONFIG_INTEL_ISH_FIRMWARE_DOWNLOADER is not set
# end of Intel ISH HID support

#
# AMD SFH HID Support
#
# CONFIG_AMD_SFH_HID is not set
# end of AMD SFH HID Support
# end of HID support

CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
# CONFIG_USB_LED_TRIG is not set
# CONFIG_USB_ULPI_BUS is not set
# CONFIG_USB_CONN_GPIO is not set
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_PCI=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y

#
# Miscellaneous USB options
#
CONFIG_USB_DEFAULT_PERSIST=y
# CONFIG_USB_FEW_INIT_RETRIES is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_OTG is not set
# CONFIG_USB_OTG_PRODUCTLIST is not set
# CONFIG_USB_OTG_DISABLE_EXTERNAL_HUB is not set
CONFIG_USB_LEDS_TRIGGER_USBPORT=y
CONFIG_USB_AUTOSUSPEND_DELAY=2
CONFIG_USB_MON=y

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
CONFIG_USB_XHCI_HCD=y
# CONFIG_USB_XHCI_DBGCAP is not set
CONFIG_USB_XHCI_PCI=y
# CONFIG_USB_XHCI_PCI_RENESAS is not set
# CONFIG_USB_XHCI_PLATFORM is not set
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
CONFIG_USB_EHCI_PCI=y
# CONFIG_USB_EHCI_FSL is not set
# CONFIG_USB_EHCI_HCD_PLATFORM is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_FOTG210_HCD is not set
# CONFIG_USB_MAX3421_HCD is not set
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_OHCI_HCD_PCI=y
# CONFIG_USB_OHCI_HCD_PLATFORM is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_HCD_BCMA is not set
# CONFIG_USB_HCD_TEST_MODE is not set

#
# USB Device Class drivers
#
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set
# CONFIG_USB_WDM is not set
# CONFIG_USB_TMC is not set

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_REALTEK is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
# CONFIG_USB_STORAGE_FREECOM is not set
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_USBAT is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
# CONFIG_USB_STORAGE_SDDR55 is not set
# CONFIG_USB_STORAGE_JUMPSHOT is not set
# CONFIG_USB_STORAGE_ALAUDA is not set
# CONFIG_USB_STORAGE_ONETOUCH is not set
# CONFIG_USB_STORAGE_KARMA is not set
# CONFIG_USB_STORAGE_CYPRESS_ATACB is not set
# CONFIG_USB_STORAGE_ENE_UB6250 is not set
# CONFIG_USB_UAS is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USBIP_CORE is not set
# CONFIG_USB_CDNS_SUPPORT is not set
# CONFIG_USB_MUSB_HDRC is not set
# CONFIG_USB_DWC3 is not set
# CONFIG_USB_DWC2 is not set
# CONFIG_USB_CHIPIDEA is not set
# CONFIG_USB_ISP1760 is not set

#
# USB port drivers
#
# CONFIG_USB_USS720 is not set
CONFIG_USB_SERIAL=m
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_SIMPLE is not set
# CONFIG_USB_SERIAL_AIRCABLE is not set
# CONFIG_USB_SERIAL_ARK3116 is not set
# CONFIG_USB_SERIAL_BELKIN is not set
# CONFIG_USB_SERIAL_CH341 is not set
# CONFIG_USB_SERIAL_WHITEHEAT is not set
# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set
# CONFIG_USB_SERIAL_CP210X is not set
# CONFIG_USB_SERIAL_CYPRESS_M8 is not set
# CONFIG_USB_SERIAL_EMPEG is not set
# CONFIG_USB_SERIAL_FTDI_SIO is not set
# CONFIG_USB_SERIAL_VISOR is not set
# CONFIG_USB_SERIAL_IPAQ is not set
# CONFIG_USB_SERIAL_IR is not set
# CONFIG_USB_SERIAL_EDGEPORT is not set
# CONFIG_USB_SERIAL_EDGEPORT_TI is not set
# CONFIG_USB_SERIAL_F81232 is not set
# CONFIG_USB_SERIAL_F8153X is not set
# CONFIG_USB_SERIAL_GARMIN is not set
# CONFIG_USB_SERIAL_IPW is not set
# CONFIG_USB_SERIAL_IUU is not set
# CONFIG_USB_SERIAL_KEYSPAN_PDA is not set
# CONFIG_USB_SERIAL_KEYSPAN is not set
# CONFIG_USB_SERIAL_KLSI is not set
# CONFIG_USB_SERIAL_KOBIL_SCT is not set
# CONFIG_USB_SERIAL_MCT_U232 is not set
# CONFIG_USB_SERIAL_METRO is not set
# CONFIG_USB_SERIAL_MOS7720 is not set
# CONFIG_USB_SERIAL_MOS7840 is not set
# CONFIG_USB_SERIAL_MXUPORT is not set
# CONFIG_USB_SERIAL_NAVMAN is not set
# CONFIG_USB_SERIAL_PL2303 is not set
# CONFIG_USB_SERIAL_OTI6858 is not set
# CONFIG_USB_SERIAL_QCAUX is not set
# CONFIG_USB_SERIAL_QUALCOMM is not set
# CONFIG_USB_SERIAL_SPCP8X5 is not set
# CONFIG_USB_SERIAL_SAFE is not set
# CONFIG_USB_SERIAL_SIERRAWIRELESS is not set
# CONFIG_USB_SERIAL_SYMBOL is not set
# CONFIG_USB_SERIAL_TI is not set
# CONFIG_USB_SERIAL_CYBERJACK is not set
# CONFIG_USB_SERIAL_OPTION is not set
# CONFIG_USB_SERIAL_OMNINET is not set
# CONFIG_USB_SERIAL_OPTICON is not set
# CONFIG_USB_SERIAL_XSENS_MT is not set
# CONFIG_USB_SERIAL_WISHBONE is not set
# CONFIG_USB_SERIAL_SSU100 is not set
# CONFIG_USB_SERIAL_QT2 is not set
# CONFIG_USB_SERIAL_UPD78F0730 is not set
# CONFIG_USB_SERIAL_XR is not set
CONFIG_USB_SERIAL_DEBUG=m

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_SEVSEG is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_APPLE_MFI_FASTCHARGE is not set
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
# CONFIG_USB_TEST is not set
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_YUREX is not set
# CONFIG_USB_EZUSB_FX2 is not set
# CONFIG_USB_HUB_USB251XB is not set
# CONFIG_USB_HSIC_USB3503 is not set
# CONFIG_USB_HSIC_USB4604 is not set
# CONFIG_USB_LINK_LAYER_TEST is not set
# CONFIG_USB_CHAOSKEY is not set
# CONFIG_USB_ATM is not set

#
# USB Physical Layer drivers
#
# CONFIG_NOP_USB_XCEIV is not set
# CONFIG_USB_GPIO_VBUS is not set
# CONFIG_USB_ISP1301 is not set
# end of USB Physical Layer drivers

# CONFIG_USB_GADGET is not set
CONFIG_TYPEC=y
# CONFIG_TYPEC_TCPM is not set
CONFIG_TYPEC_UCSI=y
# CONFIG_UCSI_CCG is not set
CONFIG_UCSI_ACPI=y
# CONFIG_UCSI_STM32G0 is not set
# CONFIG_TYPEC_TPS6598X is not set
# CONFIG_TYPEC_RT1719 is not set
# CONFIG_TYPEC_STUSB160X is not set
# CONFIG_TYPEC_WUSB3801 is not set

#
# USB Type-C Multiplexer/DeMultiplexer Switch support
#
# CONFIG_TYPEC_MUX_FSA4480 is not set
# CONFIG_TYPEC_MUX_PI3USB30532 is not set
# end of USB Type-C Multiplexer/DeMultiplexer Switch support

#
# USB Type-C Alternate Mode drivers
#
# CONFIG_TYPEC_DP_ALTMODE is not set
# end of USB Type-C Alternate Mode drivers

# CONFIG_USB_ROLE_SWITCH is not set
CONFIG_MMC=m
CONFIG_MMC_BLOCK=m
CONFIG_MMC_BLOCK_MINORS=8
CONFIG_SDIO_UART=m
# CONFIG_MMC_TEST is not set

#
# MMC/SD/SDIO Host Controller Drivers
#
# CONFIG_MMC_DEBUG is not set
CONFIG_MMC_SDHCI=m
CONFIG_MMC_SDHCI_IO_ACCESSORS=y
CONFIG_MMC_SDHCI_PCI=m
CONFIG_MMC_RICOH_MMC=y
CONFIG_MMC_SDHCI_ACPI=m
CONFIG_MMC_SDHCI_PLTFM=m
# CONFIG_MMC_SDHCI_F_SDH30 is not set
# CONFIG_MMC_WBSD is not set
# CONFIG_MMC_TIFM_SD is not set
# CONFIG_MMC_SPI is not set
# CONFIG_MMC_CB710 is not set
# CONFIG_MMC_VIA_SDMMC is not set
# CONFIG_MMC_VUB300 is not set
# CONFIG_MMC_USHC is not set
# CONFIG_MMC_USDHI6ROL0 is not set
# CONFIG_MMC_REALTEK_PCI is not set
CONFIG_MMC_CQHCI=m
# CONFIG_MMC_HSQ is not set
# CONFIG_MMC_TOSHIBA_PCI is not set
# CONFIG_MMC_MTK is not set
# CONFIG_MMC_SDHCI_XENON is not set
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
# CONFIG_LEDS_CLASS_FLASH is not set
# CONFIG_LEDS_CLASS_MULTICOLOR is not set
# CONFIG_LEDS_BRIGHTNESS_HW_CHANGED is not set

#
# LED drivers
#
# CONFIG_LEDS_APU is not set
CONFIG_LEDS_LM3530=m
# CONFIG_LEDS_LM3532 is not set
# CONFIG_LEDS_LM3642 is not set
# CONFIG_LEDS_PCA9532 is not set
# CONFIG_LEDS_GPIO is not set
CONFIG_LEDS_LP3944=m
# CONFIG_LEDS_LP3952 is not set
# CONFIG_LEDS_LP50XX is not set
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_PCA963X is not set
# CONFIG_LEDS_DAC124S085 is not set
# CONFIG_LEDS_PWM is not set
# CONFIG_LEDS_BD2802 is not set
CONFIG_LEDS_INTEL_SS4200=m
CONFIG_LEDS_LT3593=m
# CONFIG_LEDS_TCA6507 is not set
# CONFIG_LEDS_TLC591XX is not set
# CONFIG_LEDS_LM355x is not set
# CONFIG_LEDS_IS31FL319X is not set

#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
CONFIG_LEDS_BLINKM=m
CONFIG_LEDS_MLXCPLD=m
# CONFIG_LEDS_MLXREG is not set
# CONFIG_LEDS_USER is not set
# CONFIG_LEDS_NIC78BX is not set
# CONFIG_LEDS_TI_LMU_COMMON is not set

#
# Flash and Torch LED drivers
#

#
# RGB LED drivers
#

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=m
CONFIG_LEDS_TRIGGER_ONESHOT=m
# CONFIG_LEDS_TRIGGER_DISK is not set
CONFIG_LEDS_TRIGGER_HEARTBEAT=m
CONFIG_LEDS_TRIGGER_BACKLIGHT=m
# CONFIG_LEDS_TRIGGER_CPU is not set
# CONFIG_LEDS_TRIGGER_ACTIVITY is not set
CONFIG_LEDS_TRIGGER_GPIO=m
CONFIG_LEDS_TRIGGER_DEFAULT_ON=m

#
# iptables trigger is under Netfilter config (LED target)
#
CONFIG_LEDS_TRIGGER_TRANSIENT=m
CONFIG_LEDS_TRIGGER_CAMERA=m
# CONFIG_LEDS_TRIGGER_PANIC is not set
# CONFIG_LEDS_TRIGGER_NETDEV is not set
# CONFIG_LEDS_TRIGGER_PATTERN is not set
CONFIG_LEDS_TRIGGER_AUDIO=m
# CONFIG_LEDS_TRIGGER_TTY is not set

#
# Simple LED drivers
#
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=y
CONFIG_EDAC_LEGACY_SYSFS=y
# CONFIG_EDAC_DEBUG is not set
CONFIG_EDAC_GHES=y
CONFIG_EDAC_E752X=m
CONFIG_EDAC_I82975X=m
CONFIG_EDAC_I3000=m
CONFIG_EDAC_I3200=m
CONFIG_EDAC_IE31200=m
CONFIG_EDAC_X38=m
CONFIG_EDAC_I5400=m
CONFIG_EDAC_I7CORE=m
CONFIG_EDAC_I5000=m
CONFIG_EDAC_I5100=m
CONFIG_EDAC_I7300=m
CONFIG_EDAC_SBRIDGE=m
CONFIG_EDAC_SKX=m
# CONFIG_EDAC_I10NM is not set
CONFIG_EDAC_PND2=m
# CONFIG_EDAC_IGEN6 is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_MC146818_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_SYSTOHC is not set
# CONFIG_RTC_DEBUG is not set
CONFIG_RTC_NVMEM=y

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_ABB5ZES3 is not set
# CONFIG_RTC_DRV_ABEOZ9 is not set
# CONFIG_RTC_DRV_ABX80X is not set
CONFIG_RTC_DRV_DS1307=m
# CONFIG_RTC_DRV_DS1307_CENTURY is not set
CONFIG_RTC_DRV_DS1374=m
# CONFIG_RTC_DRV_DS1374_WDT is not set
CONFIG_RTC_DRV_DS1672=m
CONFIG_RTC_DRV_MAX6900=m
CONFIG_RTC_DRV_RS5C372=m
CONFIG_RTC_DRV_ISL1208=m
CONFIG_RTC_DRV_ISL12022=m
CONFIG_RTC_DRV_X1205=m
CONFIG_RTC_DRV_PCF8523=m
# CONFIG_RTC_DRV_PCF85063 is not set
# CONFIG_RTC_DRV_PCF85363 is not set
CONFIG_RTC_DRV_PCF8563=m
CONFIG_RTC_DRV_PCF8583=m
CONFIG_RTC_DRV_M41T80=m
CONFIG_RTC_DRV_M41T80_WDT=y
CONFIG_RTC_DRV_BQ32K=m
# CONFIG_RTC_DRV_S35390A is not set
CONFIG_RTC_DRV_FM3130=m
# CONFIG_RTC_DRV_RX8010 is not set
CONFIG_RTC_DRV_RX8581=m
CONFIG_RTC_DRV_RX8025=m
CONFIG_RTC_DRV_EM3027=m
# CONFIG_RTC_DRV_RV3028 is not set
# CONFIG_RTC_DRV_RV3032 is not set
# CONFIG_RTC_DRV_RV8803 is not set
# CONFIG_RTC_DRV_SD3078 is not set

#
# SPI RTC drivers
#
# CONFIG_RTC_DRV_M41T93 is not set
# CONFIG_RTC_DRV_M41T94 is not set
# CONFIG_RTC_DRV_DS1302 is not set
# CONFIG_RTC_DRV_DS1305 is not set
# CONFIG_RTC_DRV_DS1343 is not set
# CONFIG_RTC_DRV_DS1347 is not set
# CONFIG_RTC_DRV_DS1390 is not set
# CONFIG_RTC_DRV_MAX6916 is not set
# CONFIG_RTC_DRV_R9701 is not set
CONFIG_RTC_DRV_RX4581=m
# CONFIG_RTC_DRV_RS5C348 is not set
# CONFIG_RTC_DRV_MAX6902 is not set
# CONFIG_RTC_DRV_PCF2123 is not set
# CONFIG_RTC_DRV_MCP795 is not set
CONFIG_RTC_I2C_AND_SPI=y

#
# SPI and I2C RTC drivers
#
CONFIG_RTC_DRV_DS3232=m
CONFIG_RTC_DRV_DS3232_HWMON=y
# CONFIG_RTC_DRV_PCF2127 is not set
CONFIG_RTC_DRV_RV3029C2=m
# CONFIG_RTC_DRV_RV3029_HWMON is not set
# CONFIG_RTC_DRV_RX6110 is not set

#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
CONFIG_RTC_DRV_DS1286=m
CONFIG_RTC_DRV_DS1511=m
CONFIG_RTC_DRV_DS1553=m
# CONFIG_RTC_DRV_DS1685_FAMILY is not set
CONFIG_RTC_DRV_DS1742=m
CONFIG_RTC_DRV_DS2404=m
CONFIG_RTC_DRV_STK17TA8=m
# CONFIG_RTC_DRV_M48T86 is not set
CONFIG_RTC_DRV_M48T35=m
CONFIG_RTC_DRV_M48T59=m
CONFIG_RTC_DRV_MSM6242=m
CONFIG_RTC_DRV_BQ4802=m
CONFIG_RTC_DRV_RP5C01=m
CONFIG_RTC_DRV_V3020=m

#
# on-CPU RTC drivers
#
# CONFIG_RTC_DRV_FTRTC010 is not set

#
# HID Sensor RTC drivers
#
# CONFIG_RTC_DRV_GOLDFISH is not set
CONFIG_DMADEVICES=y
# CONFIG_DMADEVICES_DEBUG is not set

#
# DMA Devices
#
CONFIG_DMA_ENGINE=y
CONFIG_DMA_VIRTUAL_CHANNELS=y
CONFIG_DMA_ACPI=y
# CONFIG_ALTERA_MSGDMA is not set
CONFIG_INTEL_IDMA64=m
# CONFIG_INTEL_IDXD is not set
# CONFIG_INTEL_IDXD_COMPAT is not set
CONFIG_INTEL_IOATDMA=m
# CONFIG_PLX_DMA is not set
# CONFIG_AMD_PTDMA is not set
# CONFIG_QCOM_HIDMA_MGMT is not set
# CONFIG_QCOM_HIDMA is not set
CONFIG_DW_DMAC_CORE=y
CONFIG_DW_DMAC=m
CONFIG_DW_DMAC_PCI=y
# CONFIG_DW_EDMA is not set
# CONFIG_DW_EDMA_PCIE is not set
CONFIG_HSU_DMA=y
# CONFIG_SF_PDMA is not set
# CONFIG_INTEL_LDMA is not set

#
# DMA Clients
#
CONFIG_ASYNC_TX_DMA=y
CONFIG_DMATEST=m
CONFIG_DMA_ENGINE_RAID=y

#
# DMABUF options
#
CONFIG_SYNC_FILE=y
CONFIG_SW_SYNC=y
CONFIG_UDMABUF=y
# CONFIG_DMABUF_MOVE_NOTIFY is not set
# CONFIG_DMABUF_DEBUG is not set
# CONFIG_DMABUF_SELFTESTS is not set
CONFIG_DMABUF_HEAPS=y
# CONFIG_DMABUF_SYSFS_STATS is not set
CONFIG_DMABUF_HEAPS_SYSTEM=y
# CONFIG_DMABUF_HEAPS_CMA is not set
# end of DMABUF options

CONFIG_DCA=m
# CONFIG_AUXDISPLAY is not set
# CONFIG_PANEL is not set
CONFIG_UIO=m
CONFIG_UIO_CIF=m
CONFIG_UIO_PDRV_GENIRQ=m
# CONFIG_UIO_DMEM_GENIRQ is not set
CONFIG_UIO_AEC=m
CONFIG_UIO_SERCOS3=m
CONFIG_UIO_PCI_GENERIC=m
# CONFIG_UIO_NETX is not set
# CONFIG_UIO_PRUSS is not set
# CONFIG_UIO_MF624 is not set
CONFIG_VFIO=m
CONFIG_VFIO_IOMMU_TYPE1=m
CONFIG_VFIO_VIRQFD=m
CONFIG_VFIO_NOIOMMU=y
CONFIG_VFIO_PCI_CORE=m
CONFIG_VFIO_PCI_MMAP=y
CONFIG_VFIO_PCI_INTX=y
CONFIG_VFIO_PCI=m
# CONFIG_VFIO_PCI_VGA is not set
# CONFIG_VFIO_PCI_IGD is not set
CONFIG_VFIO_MDEV=m
CONFIG_IRQ_BYPASS_MANAGER=m
CONFIG_VIRT_DRIVERS=y
CONFIG_VMGENID=y
# CONFIG_VBOXGUEST is not set
# CONFIG_NITRO_ENCLAVES is not set
# CONFIG_EFI_SECRET is not set
CONFIG_VIRTIO_ANCHOR=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_PCI_LIB=y
CONFIG_VIRTIO_PCI_LIB_LEGACY=y
CONFIG_VIRTIO_MENU=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_LEGACY=y
# CONFIG_VIRTIO_PMEM is not set
CONFIG_VIRTIO_BALLOON=y
# CONFIG_VIRTIO_MEM is not set
CONFIG_VIRTIO_INPUT=m
# CONFIG_VIRTIO_MMIO is not set
CONFIG_VIRTIO_DMA_SHARED_BUFFER=m
# CONFIG_VDPA is not set
CONFIG_VHOST_IOTLB=m
CONFIG_VHOST=m
CONFIG_VHOST_MENU=y
CONFIG_VHOST_NET=m
# CONFIG_VHOST_SCSI is not set
CONFIG_VHOST_VSOCK=m
# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set

#
# Microsoft Hyper-V guest support
#
# CONFIG_HYPERV is not set
# end of Microsoft Hyper-V guest support

# CONFIG_GREYBUS is not set
# CONFIG_COMEDI is not set
CONFIG_STAGING=y
# CONFIG_PRISM2_USB is not set
# CONFIG_RTL8192U is not set
# CONFIG_RTLLIB is not set
# CONFIG_RTL8723BS is not set
# CONFIG_R8712U is not set
# CONFIG_R8188EU is not set
# CONFIG_RTS5208 is not set
# CONFIG_VT6655 is not set
# CONFIG_VT6656 is not set
# CONFIG_FB_SM750 is not set
# CONFIG_STAGING_MEDIA is not set
# CONFIG_LTE_GDM724X is not set
# CONFIG_FB_TFT is not set
# CONFIG_KS7010 is not set
# CONFIG_PI433 is not set
# CONFIG_FIELDBUS_DEV is not set
# CONFIG_QLGE is not set
# CONFIG_VME_BUS is not set
# CONFIG_CHROME_PLATFORMS is not set
CONFIG_MELLANOX_PLATFORM=y
CONFIG_MLXREG_HOTPLUG=m
# CONFIG_MLXREG_IO is not set
# CONFIG_MLXREG_LC is not set
# CONFIG_NVSW_SN2201 is not set
CONFIG_SURFACE_PLATFORMS=y
# CONFIG_SURFACE3_WMI is not set
# CONFIG_SURFACE_3_POWER_OPREGION is not set
# CONFIG_SURFACE_GPE is not set
# CONFIG_SURFACE_HOTPLUG is not set
# CONFIG_SURFACE_PRO3_BUTTON is not set
CONFIG_X86_PLATFORM_DEVICES=y
CONFIG_ACPI_WMI=m
CONFIG_WMI_BMOF=m
# CONFIG_HUAWEI_WMI is not set
# CONFIG_UV_SYSFS is not set
CONFIG_MXM_WMI=m
# CONFIG_PEAQ_WMI is not set
# CONFIG_NVIDIA_WMI_EC_BACKLIGHT is not set
# CONFIG_XIAOMI_WMI is not set
# CONFIG_GIGABYTE_WMI is not set
# CONFIG_YOGABOOK_WMI is not set
CONFIG_ACERHDF=m
# CONFIG_ACER_WIRELESS is not set
CONFIG_ACER_WMI=m
# CONFIG_AMD_PMF is not set
# CONFIG_AMD_PMC is not set
# CONFIG_AMD_HSMP is not set
# CONFIG_ADV_SWBUTTON is not set
CONFIG_APPLE_GMUX=m
CONFIG_ASUS_LAPTOP=m
# CONFIG_ASUS_WIRELESS is not set
CONFIG_ASUS_WMI=m
CONFIG_ASUS_NB_WMI=m
# CONFIG_ASUS_TF103C_DOCK is not set
# CONFIG_MERAKI_MX100 is not set
CONFIG_EEEPC_LAPTOP=m
CONFIG_EEEPC_WMI=m
# CONFIG_X86_PLATFORM_DRIVERS_DELL is not set
CONFIG_AMILO_RFKILL=m
CONFIG_FUJITSU_LAPTOP=m
CONFIG_FUJITSU_TABLET=m
# CONFIG_GPD_POCKET_FAN is not set
CONFIG_HP_ACCEL=m
# CONFIG_WIRELESS_HOTKEY is not set
CONFIG_HP_WMI=m
# CONFIG_IBM_RTL is not set
CONFIG_IDEAPAD_LAPTOP=m
CONFIG_SENSORS_HDAPS=m
CONFIG_THINKPAD_ACPI=m
# CONFIG_THINKPAD_ACPI_DEBUGFACILITIES is not set
# CONFIG_THINKPAD_ACPI_DEBUG is not set
# CONFIG_THINKPAD_ACPI_UNSAFE_LEDS is not set
CONFIG_THINKPAD_ACPI_VIDEO=y
CONFIG_THINKPAD_ACPI_HOTKEY_POLL=y
# CONFIG_THINKPAD_LMI is not set
# CONFIG_INTEL_ATOMISP2_PM is not set
# CONFIG_INTEL_SAR_INT1092 is not set
CONFIG_INTEL_PMC_CORE=m

#
# Intel Speed Select Technology interface support
#
# CONFIG_INTEL_SPEED_SELECT_INTERFACE is not set
# end of Intel Speed Select Technology interface support

CONFIG_INTEL_WMI=y
# CONFIG_INTEL_WMI_SBL_FW_UPDATE is not set
CONFIG_INTEL_WMI_THUNDERBOLT=m

#
# Intel Uncore Frequency Control
#
# CONFIG_INTEL_UNCORE_FREQ_CONTROL is not set
# end of Intel Uncore Frequency Control

CONFIG_INTEL_HID_EVENT=m
CONFIG_INTEL_VBTN=m
# CONFIG_INTEL_INT0002_VGPIO is not set
CONFIG_INTEL_OAKTRAIL=m
# CONFIG_INTEL_ISHTP_ECLITE is not set
# CONFIG_INTEL_PUNIT_IPC is not set
CONFIG_INTEL_RST=m
# CONFIG_INTEL_SMARTCONNECT is not set
CONFIG_INTEL_TURBO_MAX_3=y
# CONFIG_INTEL_VSEC is not set
CONFIG_MSI_LAPTOP=m
CONFIG_MSI_WMI=m
# CONFIG_PCENGINES_APU2 is not set
# CONFIG_BARCO_P50_GPIO is not set
CONFIG_SAMSUNG_LAPTOP=m
CONFIG_SAMSUNG_Q10=m
CONFIG_TOSHIBA_BT_RFKILL=m
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_TOSHIBA_WMI is not set
CONFIG_ACPI_CMPC=m
CONFIG_COMPAL_LAPTOP=m
# CONFIG_LG_LAPTOP is not set
CONFIG_PANASONIC_LAPTOP=m
CONFIG_SONY_LAPTOP=m
CONFIG_SONYPI_COMPAT=y
# CONFIG_SYSTEM76_ACPI is not set
CONFIG_TOPSTAR_LAPTOP=m
# CONFIG_SERIAL_MULTI_INSTANTIATE is not set
CONFIG_MLX_PLATFORM=m
CONFIG_INTEL_IPS=m
# CONFIG_INTEL_SCU_PCI is not set
# CONFIG_INTEL_SCU_PLATFORM is not set
# CONFIG_SIEMENS_SIMATIC_IPC is not set
# CONFIG_WINMATE_FM07_KEYS is not set
CONFIG_P2SB=y
CONFIG_HAVE_CLK=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y
# CONFIG_LMK04832 is not set
# CONFIG_COMMON_CLK_MAX9485 is not set
# CONFIG_COMMON_CLK_SI5341 is not set
# CONFIG_COMMON_CLK_SI5351 is not set
# CONFIG_COMMON_CLK_SI544 is not set
# CONFIG_COMMON_CLK_CDCE706 is not set
# CONFIG_COMMON_CLK_CS2000_CP is not set
# CONFIG_COMMON_CLK_PWM is not set
# CONFIG_XILINX_VCU is not set
CONFIG_HWSPINLOCK=y

#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# end of Clock Source drivers

CONFIG_MAILBOX=y
CONFIG_PCC=y
# CONFIG_ALTERA_MBOX is not set
CONFIG_IOMMU_IOVA=y
CONFIG_IOASID=y
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y

#
# Generic IOMMU Pagetable Support
#
# end of Generic IOMMU Pagetable Support

# CONFIG_IOMMU_DEBUGFS is not set
# CONFIG_IOMMU_DEFAULT_DMA_STRICT is not set
CONFIG_IOMMU_DEFAULT_DMA_LAZY=y
# CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set
CONFIG_IOMMU_DMA=y
CONFIG_IOMMU_SVA=y
# CONFIG_AMD_IOMMU is not set
CONFIG_DMAR_TABLE=y
CONFIG_INTEL_IOMMU=y
CONFIG_INTEL_IOMMU_SVM=y
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
CONFIG_INTEL_IOMMU_FLOPPY_WA=y
CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON=y
CONFIG_IRQ_REMAP=y
# CONFIG_VIRTIO_IOMMU is not set

#
# Remoteproc drivers
#
# CONFIG_REMOTEPROC is not set
# end of Remoteproc drivers

#
# Rpmsg drivers
#
# CONFIG_RPMSG_QCOM_GLINK_RPM is not set
# CONFIG_RPMSG_VIRTIO is not set
# end of Rpmsg drivers

# CONFIG_SOUNDWIRE is not set

#
# SOC (System On Chip) specific Drivers
#

#
# Amlogic SoC drivers
#
# end of Amlogic SoC drivers

#
# Broadcom SoC drivers
#
# end of Broadcom SoC drivers

#
# NXP/Freescale QorIQ SoC drivers
#
# end of NXP/Freescale QorIQ SoC drivers

#
# fujitsu SoC drivers
#
# end of fujitsu SoC drivers

#
# i.MX SoC drivers
#
# end of i.MX SoC drivers

#
# Enable LiteX SoC Builder specific drivers
#
# end of Enable LiteX SoC Builder specific drivers

#
# Qualcomm SoC drivers
#
# end of Qualcomm SoC drivers

# CONFIG_SOC_TI is not set

#
# Xilinx SoC drivers
#
# end of Xilinx SoC drivers
# end of SOC (System On Chip) specific Drivers

# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
CONFIG_NTB=m
# CONFIG_NTB_MSI is not set
# CONFIG_NTB_AMD is not set
# CONFIG_NTB_IDT is not set
# CONFIG_NTB_INTEL is not set
# CONFIG_NTB_EPF is not set
# CONFIG_NTB_SWITCHTEC is not set
# CONFIG_NTB_PINGPONG is not set
# CONFIG_NTB_TOOL is not set
# CONFIG_NTB_PERF is not set
# CONFIG_NTB_TRANSPORT is not set
CONFIG_PWM=y
CONFIG_PWM_SYSFS=y
# CONFIG_PWM_DEBUG is not set
# CONFIG_PWM_CLK is not set
# CONFIG_PWM_DWC is not set
CONFIG_PWM_LPSS=m
CONFIG_PWM_LPSS_PCI=m
CONFIG_PWM_LPSS_PLATFORM=m
# CONFIG_PWM_PCA9685 is not set

#
# IRQ chip support
#
# end of IRQ chip support

# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set

#
# PHY Subsystem
#
CONFIG_GENERIC_PHY=y
# CONFIG_USB_LGM_PHY is not set
# CONFIG_PHY_CAN_TRANSCEIVER is not set

#
# PHY drivers for Broadcom platforms
#
# CONFIG_BCM_KONA_USB2_PHY is not set
# end of PHY drivers for Broadcom platforms

# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# CONFIG_PHY_INTEL_LGM_EMMC is not set
# end of PHY Subsystem

CONFIG_POWERCAP=y
CONFIG_INTEL_RAPL_CORE=m
CONFIG_INTEL_RAPL=m
# CONFIG_IDLE_INJECT is not set
# CONFIG_MCB is not set

#
# Performance monitor support
#
# end of Performance monitor support

CONFIG_RAS=y
# CONFIG_RAS_CEC is not set
# CONFIG_USB4 is not set

#
# Android
#
# CONFIG_ANDROID_BINDER_IPC is not set
# end of Android

CONFIG_LIBNVDIMM=m
CONFIG_BLK_DEV_PMEM=m
CONFIG_ND_CLAIM=y
CONFIG_ND_BTT=m
CONFIG_BTT=y
CONFIG_ND_PFN=m
CONFIG_NVDIMM_PFN=y
CONFIG_NVDIMM_DAX=y
CONFIG_NVDIMM_KEYS=y
CONFIG_DAX=y
CONFIG_DEV_DAX=m
CONFIG_DEV_DAX_PMEM=m
CONFIG_DEV_DAX_KMEM=m
CONFIG_NVMEM=y
CONFIG_NVMEM_SYSFS=y
# CONFIG_NVMEM_RMEM is not set

#
# HW tracing support
#
CONFIG_STM=m
# CONFIG_STM_PROTO_BASIC is not set
# CONFIG_STM_PROTO_SYS_T is not set
CONFIG_STM_DUMMY=m
CONFIG_STM_SOURCE_CONSOLE=m
CONFIG_STM_SOURCE_HEARTBEAT=m
CONFIG_STM_SOURCE_FTRACE=m
CONFIG_INTEL_TH=m
CONFIG_INTEL_TH_PCI=m
CONFIG_INTEL_TH_ACPI=m
CONFIG_INTEL_TH_GTH=m
CONFIG_INTEL_TH_STH=m
CONFIG_INTEL_TH_MSU=m
CONFIG_INTEL_TH_PTI=m
# CONFIG_INTEL_TH_DEBUG is not set
# end of HW tracing support

# CONFIG_FPGA is not set
# CONFIG_TEE is not set
# CONFIG_SIOX is not set
# CONFIG_SLIMBUS is not set
# CONFIG_INTERCONNECT is not set
# CONFIG_COUNTER is not set
# CONFIG_MOST is not set
# CONFIG_PECI is not set
# CONFIG_HTE is not set
# end of Device Drivers

#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
CONFIG_VALIDATE_FS_PARSER=y
CONFIG_FS_IOMAP=y
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
CONFIG_EXT4_FS=y
CONFIG_EXT4_USE_FOR_EXT2=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_XFS_FS=m
CONFIG_XFS_SUPPORT_V4=y
CONFIG_XFS_QUOTA=y
CONFIG_XFS_POSIX_ACL=y
CONFIG_XFS_RT=y
CONFIG_XFS_ONLINE_SCRUB=y
# CONFIG_XFS_ONLINE_REPAIR is not set
CONFIG_XFS_DEBUG=y
CONFIG_XFS_ASSERT_FATAL=y
CONFIG_GFS2_FS=m
CONFIG_GFS2_FS_LOCKING_DLM=y
CONFIG_OCFS2_FS=m
CONFIG_OCFS2_FS_O2CB=m
CONFIG_OCFS2_FS_USERSPACE_CLUSTER=m
CONFIG_OCFS2_FS_STATS=y
CONFIG_OCFS2_DEBUG_MASKLOG=y
# CONFIG_OCFS2_DEBUG_FS is not set
CONFIG_BTRFS_FS=m
CONFIG_BTRFS_FS_POSIX_ACL=y
# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
# CONFIG_BTRFS_DEBUG is not set
# CONFIG_BTRFS_ASSERT is not set
# CONFIG_BTRFS_FS_REF_VERIFY is not set
# CONFIG_NILFS2_FS is not set
CONFIG_F2FS_FS=m
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
# CONFIG_F2FS_FS_SECURITY is not set
# CONFIG_F2FS_CHECK_FS is not set
# CONFIG_F2FS_FAULT_INJECTION is not set
# CONFIG_F2FS_FS_COMPRESSION is not set
CONFIG_F2FS_IOSTAT=y
# CONFIG_F2FS_UNFAIR_RWSEM is not set
CONFIG_FS_DAX=y
CONFIG_FS_DAX_PMD=y
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
CONFIG_EXPORTFS_BLOCK_OPS=y
CONFIG_FILE_LOCKING=y
CONFIG_FS_ENCRYPTION=y
CONFIG_FS_ENCRYPTION_ALGS=y
# CONFIG_FS_VERITY is not set
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_FANOTIFY=y
CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
CONFIG_PRINT_QUOTA_WARNING=y
# CONFIG_QUOTA_DEBUG is not set
CONFIG_QUOTA_TREE=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_AUTOFS4_FS=y
CONFIG_AUTOFS_FS=y
CONFIG_FUSE_FS=m
CONFIG_CUSE=m
# CONFIG_VIRTIO_FS is not set
CONFIG_OVERLAY_FS=m
# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set
# CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW is not set
# CONFIG_OVERLAY_FS_INDEX is not set
# CONFIG_OVERLAY_FS_XINO_AUTO is not set
# CONFIG_OVERLAY_FS_METACOPY is not set

#
# Caches
#
CONFIG_NETFS_SUPPORT=y
CONFIG_NETFS_STATS=y
CONFIG_FSCACHE=m
CONFIG_FSCACHE_STATS=y
# CONFIG_FSCACHE_DEBUG is not set
CONFIG_CACHEFILES=m
# CONFIG_CACHEFILES_DEBUG is not set
# CONFIG_CACHEFILES_ERROR_INJECTION is not set
# CONFIG_CACHEFILES_ONDEMAND is not set
# end of Caches

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
# end of CD-ROM/DVD Filesystems

#
# DOS/FAT/EXFAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="ascii"
# CONFIG_FAT_DEFAULT_UTF8 is not set
# CONFIG_EXFAT_FS is not set
# CONFIG_NTFS_FS is not set
# CONFIG_NTFS3_FS is not set
# end of DOS/FAT/EXFAT/NT Filesystems

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_VMCORE_DEVICE_DUMP=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_PROC_CHILDREN=y
CONFIG_PROC_PID_ARCH_STATUS=y
CONFIG_PROC_CPU_RESCTRL=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
# CONFIG_TMPFS_INODE64 is not set
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y
CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y
# CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON is not set
CONFIG_MEMFD_CREATE=y
CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
CONFIG_CONFIGFS_FS=y
CONFIG_EFIVAR_FS=y
# end of Pseudo filesystems

CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ORANGEFS_FS is not set
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=m
CONFIG_CRAMFS_BLOCKDEV=y
CONFIG_SQUASHFS=m
# CONFIG_SQUASHFS_FILE_CACHE is not set
CONFIG_SQUASHFS_FILE_DIRECT=y
CONFIG_SQUASHFS_DECOMP_SINGLE=y
# CONFIG_SQUASHFS_CHOICE_DECOMP_BY_MOUNT is not set
CONFIG_SQUASHFS_COMPILE_DECOMP_SINGLE=y
# CONFIG_SQUASHFS_COMPILE_DECOMP_MULTI is not set
# CONFIG_SQUASHFS_COMPILE_DECOMP_MULTI_PERCPU is not set
CONFIG_SQUASHFS_XATTR=y
CONFIG_SQUASHFS_ZLIB=y
# CONFIG_SQUASHFS_LZ4 is not set
CONFIG_SQUASHFS_LZO=y
CONFIG_SQUASHFS_XZ=y
# CONFIG_SQUASHFS_ZSTD is not set
# CONFIG_SQUASHFS_4K_DEVBLK_SIZE is not set
# CONFIG_SQUASHFS_EMBEDDED is not set
CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX6FS_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_PSTORE=y
CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240
CONFIG_PSTORE_DEFLATE_COMPRESS=y
# CONFIG_PSTORE_LZO_COMPRESS is not set
# CONFIG_PSTORE_LZ4_COMPRESS is not set
# CONFIG_PSTORE_LZ4HC_COMPRESS is not set
# CONFIG_PSTORE_842_COMPRESS is not set
# CONFIG_PSTORE_ZSTD_COMPRESS is not set
CONFIG_PSTORE_COMPRESS=y
CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y
CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
CONFIG_PSTORE_CONSOLE=y
CONFIG_PSTORE_PMSG=y
# CONFIG_PSTORE_FTRACE is not set
CONFIG_PSTORE_RAM=m
# CONFIG_PSTORE_BLK is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_EROFS_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
# CONFIG_NFS_V2 is not set
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=m
# CONFIG_NFS_SWAP is not set
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_PNFS_FLEXFILE_LAYOUT=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_V4_1_MIGRATION is not set
CONFIG_NFS_V4_SECURITY_LABEL=y
CONFIG_ROOT_NFS=y
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFS_DEBUG=y
CONFIG_NFS_DISABLE_UDP_SUPPORT=y
# CONFIG_NFS_V4_2_READ_PLUS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_PNFS=y
# CONFIG_NFSD_BLOCKLAYOUT is not set
CONFIG_NFSD_SCSILAYOUT=y
# CONFIG_NFSD_FLEXFILELAYOUT is not set
# CONFIG_NFSD_V4_2_INTER_SSC is not set
CONFIG_NFSD_V4_SECURITY_LABEL=y
CONFIG_GRACE_PERIOD=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
CONFIG_NFS_V4_2_SSC_HELPER=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=m
CONFIG_SUNRPC_BACKCHANNEL=y
CONFIG_RPCSEC_GSS_KRB5=m
# CONFIG_SUNRPC_DISABLE_INSECURE_ENCTYPES is not set
CONFIG_SUNRPC_DEBUG=y
CONFIG_CEPH_FS=m
# CONFIG_CEPH_FSCACHE is not set
CONFIG_CEPH_FS_POSIX_ACL=y
# CONFIG_CEPH_FS_SECURITY_LABEL is not set
CONFIG_CIFS=m
CONFIG_CIFS_STATS2=y
CONFIG_CIFS_ALLOW_INSECURE_LEGACY=y
CONFIG_CIFS_UPCALL=y
CONFIG_CIFS_XATTR=y
CONFIG_CIFS_POSIX=y
CONFIG_CIFS_DEBUG=y
# CONFIG_CIFS_DEBUG2 is not set
# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set
CONFIG_CIFS_DFS_UPCALL=y
# CONFIG_CIFS_SWN_UPCALL is not set
# CONFIG_CIFS_FSCACHE is not set
# CONFIG_SMB_SERVER is not set
CONFIG_SMBFS_COMMON=m
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_9P_FS=y
CONFIG_9P_FS_POSIX_ACL=y
CONFIG_9P_FS_SECURITY=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_MAC_ROMAN=m
CONFIG_NLS_MAC_CELTIC=m
CONFIG_NLS_MAC_CENTEURO=m
CONFIG_NLS_MAC_CROATIAN=m
CONFIG_NLS_MAC_CYRILLIC=m
CONFIG_NLS_MAC_GAELIC=m
CONFIG_NLS_MAC_GREEK=m
CONFIG_NLS_MAC_ICELAND=m
CONFIG_NLS_MAC_INUIT=m
CONFIG_NLS_MAC_ROMANIAN=m
CONFIG_NLS_MAC_TURKISH=m
CONFIG_NLS_UTF8=m
CONFIG_DLM=m
# CONFIG_DLM_DEPRECATED_API is not set
CONFIG_DLM_DEBUG=y
# CONFIG_UNICODE is not set
CONFIG_IO_WQ=y
# end of File systems

#
# Security options
#
CONFIG_KEYS=y
# CONFIG_KEYS_REQUEST_CACHE is not set
CONFIG_PERSISTENT_KEYRINGS=y
CONFIG_TRUSTED_KEYS=y
CONFIG_TRUSTED_KEYS_TPM=y
CONFIG_ENCRYPTED_KEYS=y
# CONFIG_USER_DECRYPTED_DATA is not set
# CONFIG_KEY_DH_OPERATIONS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
CONFIG_SECURITY=y
CONFIG_SECURITYFS=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_NETWORK_XFRM=y
CONFIG_SECURITY_PATH=y
CONFIG_INTEL_TXT=y
CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
CONFIG_HARDENED_USERCOPY=y
CONFIG_FORTIFY_SOURCE=y
# CONFIG_STATIC_USERMODEHELPER is not set
# CONFIG_SECURITY_SELINUX is not set
# CONFIG_SECURITY_SMACK is not set
# CONFIG_SECURITY_TOMOYO is not set
# CONFIG_SECURITY_APPARMOR is not set
# CONFIG_SECURITY_LOADPIN is not set
CONFIG_SECURITY_YAMA=y
# CONFIG_SECURITY_SAFESETID is not set
# CONFIG_SECURITY_LOCKDOWN_LSM is not set
CONFIG_SECURITY_LANDLOCK=y
CONFIG_INTEGRITY=y
CONFIG_INTEGRITY_SIGNATURE=y
CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y
CONFIG_INTEGRITY_TRUSTED_KEYRING=y
# CONFIG_INTEGRITY_PLATFORM_KEYRING is not set
CONFIG_INTEGRITY_AUDIT=y
CONFIG_IMA=y
# CONFIG_IMA_KEXEC is not set
CONFIG_IMA_MEASURE_PCR_IDX=10
CONFIG_IMA_NG_TEMPLATE=y
# CONFIG_IMA_SIG_TEMPLATE is not set
CONFIG_IMA_DEFAULT_TEMPLATE="ima-ng"
CONFIG_IMA_DEFAULT_HASH_SHA1=y
# CONFIG_IMA_DEFAULT_HASH_SHA256 is not set
# CONFIG_IMA_DEFAULT_HASH_SHA512 is not set
CONFIG_IMA_DEFAULT_HASH="sha1"
CONFIG_IMA_WRITE_POLICY=y
CONFIG_IMA_READ_POLICY=y
CONFIG_IMA_APPRAISE=y
CONFIG_IMA_ARCH_POLICY=y
# CONFIG_IMA_APPRAISE_BUILD_POLICY is not set
CONFIG_IMA_APPRAISE_BOOTPARAM=y
# CONFIG_IMA_APPRAISE_MODSIG is not set
CONFIG_IMA_TRUSTED_KEYRING=y
# CONFIG_IMA_BLACKLIST_KEYRING is not set
# CONFIG_IMA_LOAD_X509 is not set
CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS=y
CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS=y
CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT=y
# CONFIG_IMA_DISABLE_HTABLE is not set
# CONFIG_EVM is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,bpf"

#
# Kernel hardening options
#

#
# Memory initialization
#
CONFIG_INIT_STACK_NONE=y
# CONFIG_GCC_PLUGIN_STRUCTLEAK_USER is not set
# CONFIG_GCC_PLUGIN_STACKLEAK is not set
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
CONFIG_CC_HAS_ZERO_CALL_USED_REGS=y
# CONFIG_ZERO_CALL_USED_REGS is not set
# end of Memory initialization

CONFIG_RANDSTRUCT_NONE=y
# CONFIG_RANDSTRUCT_FULL is not set
# CONFIG_RANDSTRUCT_PERFORMANCE is not set
# end of Kernel hardening options
# end of Security options

CONFIG_XOR_BLOCKS=m
CONFIG_ASYNC_CORE=m
CONFIG_ASYNC_MEMCPY=m
CONFIG_ASYNC_XOR=m
CONFIG_ASYNC_PQ=m
CONFIG_ASYNC_RAID6_RECOV=m
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_SKCIPHER=y
CONFIG_CRYPTO_SKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_AKCIPHER=y
CONFIG_CRYPTO_KPP2=y
CONFIG_CRYPTO_KPP=m
CONFIG_CRYPTO_ACOMP2=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_USER=m
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_GF128MUL=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_NULL2=y
CONFIG_CRYPTO_PCRYPT=m
CONFIG_CRYPTO_CRYPTD=y
CONFIG_CRYPTO_AUTHENC=m
# CONFIG_CRYPTO_TEST is not set
CONFIG_CRYPTO_SIMD=y
CONFIG_CRYPTO_ENGINE=y
# end of Crypto core or helper

#
# Public-key cryptography
#
CONFIG_CRYPTO_RSA=y
CONFIG_CRYPTO_DH=m
# CONFIG_CRYPTO_DH_RFC7919_GROUPS is not set
CONFIG_CRYPTO_ECC=m
CONFIG_CRYPTO_ECDH=m
# CONFIG_CRYPTO_ECDSA is not set
# CONFIG_CRYPTO_ECRDSA is not set
# CONFIG_CRYPTO_SM2 is not set
# CONFIG_CRYPTO_CURVE25519 is not set
# end of Public-key cryptography

#
# Block ciphers
#
CONFIG_CRYPTO_AES=y
# CONFIG_CRYPTO_AES_TI is not set
CONFIG_CRYPTO_ANUBIS=m
# CONFIG_CRYPTO_ARIA is not set
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_BLOWFISH_COMMON=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAST_COMMON=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_FCRYPT=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_SEED=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_SM4=y
CONFIG_CRYPTO_SM4_GENERIC=y
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
# end of Block ciphers

#
# Length-preserving ciphers and modes
#
# CONFIG_CRYPTO_ADIANTUM is not set
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_CHACHA20=m
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CFB=y
CONFIG_CRYPTO_CTR=y
CONFIG_CRYPTO_CTS=m
CONFIG_CRYPTO_ECB=y
# CONFIG_CRYPTO_HCTR2 is not set
# CONFIG_CRYPTO_KEYWRAP is not set
CONFIG_CRYPTO_LRW=m
# CONFIG_CRYPTO_OFB is not set
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_XTS=m
# end of Length-preserving ciphers and modes

#
# AEAD (authenticated encryption with associated data) ciphers
#
# CONFIG_CRYPTO_AEGIS128 is not set
# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_CCM=m
CONFIG_CRYPTO_GCM=y
CONFIG_CRYPTO_SEQIV=y
CONFIG_CRYPTO_ECHAINIV=m
CONFIG_CRYPTO_ESSIV=m
# end of AEAD (authenticated encryption with associated data) ciphers

#
# Hashes, digests, and MACs
#
CONFIG_CRYPTO_BLAKE2B=y
CONFIG_CRYPTO_CMAC=m
CONFIG_CRYPTO_GHASH=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
# CONFIG_CRYPTO_POLY1305 is not set
CONFIG_CRYPTO_RMD160=m
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=y
CONFIG_CRYPTO_SHA3=m
# CONFIG_CRYPTO_SM3_GENERIC is not set
# CONFIG_CRYPTO_STREEBOG is not set
CONFIG_CRYPTO_VMAC=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_XCBC=m
CONFIG_CRYPTO_XXHASH=y
# end of Hashes, digests, and MACs

#
# CRCs (cyclic redundancy checks)
#
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_CRC32=m
CONFIG_CRYPTO_CRCT10DIF=y
CONFIG_CRYPTO_CRC64_ROCKSOFT=m
# end of CRCs (cyclic redundancy checks)

#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
CONFIG_CRYPTO_LZO=y
# CONFIG_CRYPTO_842 is not set
# CONFIG_CRYPTO_LZ4 is not set
# CONFIG_CRYPTO_LZ4HC is not set
# CONFIG_CRYPTO_ZSTD is not set
# end of Compression

#
# Random number generation
#
CONFIG_CRYPTO_ANSI_CPRNG=m
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
CONFIG_CRYPTO_DRBG_HASH=y
CONFIG_CRYPTO_DRBG_CTR=y
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
# end of Random number generation

#
# Userspace interface
#
CONFIG_CRYPTO_USER_API=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CRYPTO_USER_API_SKCIPHER=y
CONFIG_CRYPTO_USER_API_RNG=y
# CONFIG_CRYPTO_USER_API_RNG_CAVP is not set
CONFIG_CRYPTO_USER_API_AEAD=y
CONFIG_CRYPTO_USER_API_ENABLE_OBSOLETE=y
# CONFIG_CRYPTO_STATS is not set
# end of Userspace interface

CONFIG_CRYPTO_HASH_INFO=y

#
# Accelerated Cryptographic Algorithms for CPU (x86)
#
# CONFIG_CRYPTO_CURVE25519_X86 is not set
CONFIG_CRYPTO_AES_NI_INTEL=y
CONFIG_CRYPTO_BLOWFISH_X86_64=m
CONFIG_CRYPTO_CAMELLIA_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64=m
CONFIG_CRYPTO_CAST5_AVX_X86_64=m
CONFIG_CRYPTO_CAST6_AVX_X86_64=m
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
CONFIG_CRYPTO_SERPENT_SSE2_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX2_X86_64=m
# CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64 is not set
# CONFIG_CRYPTO_SM4_AESNI_AVX2_X86_64 is not set
CONFIG_CRYPTO_TWOFISH_X86_64=m
CONFIG_CRYPTO_TWOFISH_X86_64_3WAY=m
CONFIG_CRYPTO_TWOFISH_AVX_X86_64=m
# CONFIG_CRYPTO_ARIA_AESNI_AVX_X86_64 is not set
CONFIG_CRYPTO_CHACHA20_X86_64=m
# CONFIG_CRYPTO_AEGIS128_AESNI_SSE2 is not set
# CONFIG_CRYPTO_NHPOLY1305_SSE2 is not set
# CONFIG_CRYPTO_NHPOLY1305_AVX2 is not set
# CONFIG_CRYPTO_BLAKE2S_X86 is not set
# CONFIG_CRYPTO_POLYVAL_CLMUL_NI is not set
# CONFIG_CRYPTO_POLY1305_X86_64 is not set
CONFIG_CRYPTO_SHA1_SSSE3=y
CONFIG_CRYPTO_SHA256_SSSE3=y
CONFIG_CRYPTO_SHA512_SSSE3=m
# CONFIG_CRYPTO_SM3_AVX_X86_64 is not set
CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL=m
CONFIG_CRYPTO_CRC32C_INTEL=m
CONFIG_CRYPTO_CRC32_PCLMUL=m
CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
# end of Accelerated Cryptographic Algorithms for CPU (x86)

CONFIG_CRYPTO_HW=y
CONFIG_CRYPTO_DEV_PADLOCK=m
CONFIG_CRYPTO_DEV_PADLOCK_AES=m
CONFIG_CRYPTO_DEV_PADLOCK_SHA=m
# CONFIG_CRYPTO_DEV_ATMEL_ECC is not set
# CONFIG_CRYPTO_DEV_ATMEL_SHA204A is not set
CONFIG_CRYPTO_DEV_CCP=y
CONFIG_CRYPTO_DEV_CCP_DD=m
CONFIG_CRYPTO_DEV_SP_CCP=y
CONFIG_CRYPTO_DEV_CCP_CRYPTO=m
CONFIG_CRYPTO_DEV_SP_PSP=y
# CONFIG_CRYPTO_DEV_CCP_DEBUGFS is not set
CONFIG_CRYPTO_DEV_QAT=m
CONFIG_CRYPTO_DEV_QAT_DH895xCC=m
CONFIG_CRYPTO_DEV_QAT_C3XXX=m
CONFIG_CRYPTO_DEV_QAT_C62X=m
# CONFIG_CRYPTO_DEV_QAT_4XXX is not set
CONFIG_CRYPTO_DEV_QAT_DH895xCCVF=m
CONFIG_CRYPTO_DEV_QAT_C3XXXVF=m
CONFIG_CRYPTO_DEV_QAT_C62XVF=m
CONFIG_CRYPTO_DEV_NITROX=m
CONFIG_CRYPTO_DEV_NITROX_CNN55XX=m
CONFIG_CRYPTO_DEV_VIRTIO=y
# CONFIG_CRYPTO_DEV_SAFEXCEL is not set
# CONFIG_CRYPTO_DEV_AMLOGIC_GXL is not set
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
# CONFIG_PKCS8_PRIVATE_KEY_PARSER is not set
CONFIG_PKCS7_MESSAGE_PARSER=y
# CONFIG_PKCS7_TEST_KEY is not set
CONFIG_SIGNED_PE_FILE_VERIFICATION=y
# CONFIG_FIPS_SIGNATURE_SELFTEST is not set

#
# Certificates for signature checking
#
CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
CONFIG_MODULE_SIG_KEY_TYPE_RSA=y
# CONFIG_MODULE_SIG_KEY_TYPE_ECDSA is not set
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_SYSTEM_TRUSTED_KEYS=""
# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
# CONFIG_SECONDARY_TRUSTED_KEYRING is not set
CONFIG_SYSTEM_BLACKLIST_KEYRING=y
CONFIG_SYSTEM_BLACKLIST_HASH_LIST=""
# CONFIG_SYSTEM_REVOCATION_LIST is not set
# CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE is not set
# end of Certificates for signature checking

CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_RAID6_PQ=m
CONFIG_RAID6_PQ_BENCHMARK=y
# CONFIG_PACKING is not set
CONFIG_BITREVERSE=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_CORDIC=m
CONFIG_PRIME_NUMBERS=m
CONFIG_RATIONAL=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_ARCH_USE_SYM_ANNOTATIONS=y

#
# Crypto library routines
#
CONFIG_CRYPTO_LIB_UTILS=y
CONFIG_CRYPTO_LIB_AES=y
CONFIG_CRYPTO_LIB_ARC4=m
CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA=m
CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m
# CONFIG_CRYPTO_LIB_CHACHA is not set
# CONFIG_CRYPTO_LIB_CURVE25519 is not set
CONFIG_CRYPTO_LIB_DES=m
CONFIG_CRYPTO_LIB_POLY1305_RSIZE=11
# CONFIG_CRYPTO_LIB_POLY1305 is not set
# CONFIG_CRYPTO_LIB_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_LIB_SHA1=y
CONFIG_CRYPTO_LIB_SHA256=y
# end of Crypto library routines

CONFIG_CRC_CCITT=y
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC64_ROCKSOFT=m
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
CONFIG_CRC64=m
# CONFIG_CRC4 is not set
CONFIG_CRC7=m
CONFIG_LIBCRC32C=m
CONFIG_CRC8=m
CONFIG_XXHASH=y
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_ZSTD_COMMON=y
CONFIG_ZSTD_COMPRESS=m
CONFIG_ZSTD_DECOMPRESS=y
CONFIG_XZ_DEC=y
CONFIG_XZ_DEC_X86=y
CONFIG_XZ_DEC_POWERPC=y
CONFIG_XZ_DEC_IA64=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
CONFIG_XZ_DEC_SPARC=y
# CONFIG_XZ_DEC_MICROLZMA is not set
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_DECOMPRESS_ZSTD=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_REED_SOLOMON=m
CONFIG_REED_SOLOMON_ENC8=y
CONFIG_REED_SOLOMON_DEC8=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m
CONFIG_INTERVAL_TREE=y
CONFIG_XARRAY_MULTI=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_DMA_OPS=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED=y
CONFIG_SWIOTLB=y
CONFIG_DMA_CMA=y
# CONFIG_DMA_PERNUMA_CMA is not set

#
# Default contiguous memory area size:
#
CONFIG_CMA_SIZE_MBYTES=0
CONFIG_CMA_SIZE_SEL_MBYTES=y
# CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set
# CONFIG_CMA_SIZE_SEL_MIN is not set
# CONFIG_CMA_SIZE_SEL_MAX is not set
CONFIG_CMA_ALIGNMENT=8
# CONFIG_DMA_API_DEBUG is not set
CONFIG_DMA_MAP_BENCHMARK=y
CONFIG_SGL_ALLOC=y
CONFIG_IOMMU_HELPER=y
CONFIG_CHECK_SIGNATURE=y
CONFIG_CPUMASK_OFFSTACK=y
CONFIG_UNFORCE_NR_CPUS=y
# CONFIG_FORCE_NR_CPUS is not set
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
CONFIG_CLZ_TAB=y
CONFIG_IRQ_POLL=y
CONFIG_MPILIB=y
CONFIG_SIGNATURE=y
CONFIG_OID_REGISTRY=y
CONFIG_UCS2_STRING=y
CONFIG_HAVE_GENERIC_VDSO=y
CONFIG_GENERIC_GETTIMEOFDAY=y
CONFIG_GENERIC_VDSO_TIME_NS=y
CONFIG_FONT_SUPPORT=y
CONFIG_FONTS=y
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
# CONFIG_FONT_6x11 is not set
# CONFIG_FONT_7x14 is not set
# CONFIG_FONT_PEARL_8x8 is not set
# CONFIG_FONT_ACORN_8x8 is not set
# CONFIG_FONT_MINI_4x6 is not set
# CONFIG_FONT_6x10 is not set
# CONFIG_FONT_10x18 is not set
# CONFIG_FONT_SUN8x16 is not set
# CONFIG_FONT_SUN12x22 is not set
# CONFIG_FONT_TER16x32 is not set
# CONFIG_FONT_6x8 is not set
CONFIG_SG_POOL=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_MEMREGION=y
CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y
CONFIG_ARCH_HAS_COPY_MC=y
CONFIG_ARCH_STACKWALK=y
CONFIG_STACKDEPOT=y
CONFIG_STACKDEPOT_ALWAYS_INIT=y
CONFIG_SBITMAP=y
# end of Library routines

CONFIG_ASN1_ENCODER=y

#
# Kernel hacking
#

#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
CONFIG_PRINTK_CALLER=y
# CONFIG_STACKTRACE_BUILD_ID is not set
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_CONSOLE_LOGLEVEL_QUIET=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_BOOT_PRINTK_DELAY=y
CONFIG_DYNAMIC_DEBUG=y
CONFIG_DYNAMIC_DEBUG_CORE=y
CONFIG_SYMBOLIC_ERRNAME=y
CONFIG_DEBUG_BUGVERBOSE=y
# end of printk and dmesg options

CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_MISC=y

#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
CONFIG_AS_HAS_NON_CONST_LEB128=y
# CONFIG_DEBUG_INFO_NONE is not set
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_DEBUG_INFO_DWARF5 is not set
# CONFIG_DEBUG_INFO_REDUCED is not set
# CONFIG_DEBUG_INFO_COMPRESSED is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
CONFIG_DEBUG_INFO_BTF=y
CONFIG_PAHOLE_HAS_SPLIT_BTF=y
CONFIG_DEBUG_INFO_BTF_MODULES=y
# CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set
# CONFIG_GDB_SCRIPTS is not set
CONFIG_FRAME_WARN=8192
CONFIG_STRIP_ASM_SYMS=y
# CONFIG_READABLE_ASM is not set
# CONFIG_HEADERS_INSTALL is not set
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set
CONFIG_OBJTOOL=y
# CONFIG_VMLINUX_MAP is not set
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
# end of Compile-time checks and compiler options

#
# Generic Kernel Debugging Instruments
#
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_MAGIC_SYSRQ_SERIAL=y
CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE=""
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_FS_ALLOW_ALL=y
# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
# CONFIG_DEBUG_FS_ALLOW_NONE is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
CONFIG_UBSAN=y
# CONFIG_UBSAN_TRAP is not set
CONFIG_CC_HAS_UBSAN_BOUNDS=y
CONFIG_UBSAN_BOUNDS=y
CONFIG_UBSAN_ONLY_BOUNDS=y
CONFIG_UBSAN_SHIFT=y
# CONFIG_UBSAN_DIV_ZERO is not set
# CONFIG_UBSAN_BOOL is not set
# CONFIG_UBSAN_ENUM is not set
# CONFIG_UBSAN_ALIGNMENT is not set
CONFIG_UBSAN_SANITIZE_ALL=y
# CONFIG_TEST_UBSAN is not set
CONFIG_HAVE_ARCH_KCSAN=y
CONFIG_HAVE_KCSAN_COMPILER=y
# end of Generic Kernel Debugging Instruments

#
# Networking Debugging
#
# CONFIG_NET_DEV_REFCNT_TRACKER is not set
# CONFIG_NET_NS_REFCNT_TRACKER is not set
# CONFIG_DEBUG_NET is not set
# end of Networking Debugging

#
# Memory Debugging
#
CONFIG_PAGE_EXTENSION=y
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_SLUB_DEBUG=y
# CONFIG_SLUB_DEBUG_ON is not set
CONFIG_PAGE_OWNER=y
# CONFIG_PAGE_TABLE_CHECK is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_PAGE_REF is not set
# CONFIG_DEBUG_RODATA_TEST is not set
CONFIG_ARCH_HAS_DEBUG_WX=y
# CONFIG_DEBUG_WX is not set
CONFIG_GENERIC_PTDUMP=y
# CONFIG_PTDUMP_DEBUGFS is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SHRINKER_DEBUG is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_STACK_USAGE is not set
CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VM_PGTABLE is not set
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
# CONFIG_DEBUG_VIRTUAL is not set
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m
# CONFIG_DEBUG_PER_CPU_MAPS is not set
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
# CONFIG_KASAN_OUTLINE is not set
CONFIG_KASAN_INLINE=y
CONFIG_KASAN_STACK=y
CONFIG_KASAN_VMALLOC=y
# CONFIG_KASAN_MODULE_TEST is not set
CONFIG_HAVE_ARCH_KFENCE=y
# CONFIG_KFENCE is not set
CONFIG_HAVE_ARCH_KMSAN=y
# end of Memory Debugging

CONFIG_DEBUG_SHIRQ=y

#
# Debug Oops, Lockups and Hangs
#
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=0
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=480
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
CONFIG_WQ_WATCHDOG=y
# CONFIG_TEST_LOCKUP is not set
# end of Debug Oops, Lockups and Hangs

#
# Scheduler Debugging
#
CONFIG_SCHED_DEBUG=y
CONFIG_SCHED_INFO=y
CONFIG_SCHEDSTATS=y
# end of Scheduler Debugging

# CONFIG_DEBUG_TIMEKEEPING is not set
CONFIG_DEBUG_PREEMPT=y

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_LOCK_DEBUGGING_SUPPORT=y
CONFIG_PROVE_LOCKING=y
# CONFIG_PROVE_RAW_LOCK_NESTING is not set
# CONFIG_LOCK_STAT is not set
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y
CONFIG_DEBUG_RWSEMS=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_LOCKDEP=y
CONFIG_LOCKDEP_BITS=15
CONFIG_LOCKDEP_CHAINS_BITS=16
CONFIG_LOCKDEP_STACK_TRACE_BITS=19
CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12
# CONFIG_DEBUG_LOCKDEP is not set
CONFIG_DEBUG_ATOMIC_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
CONFIG_WW_MUTEX_SELFTEST=m
# CONFIG_SCF_TORTURE_TEST is not set
# CONFIG_CSD_LOCK_WAIT_DEBUG is not set
# end of Lock Debugging (spinlocks, mutexes, etc...)

CONFIG_TRACE_IRQFLAGS=y
CONFIG_TRACE_IRQFLAGS_NMI=y
# CONFIG_DEBUG_IRQFLAGS is not set
CONFIG_STACKTRACE=y
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
# CONFIG_DEBUG_KOBJECT is not set

#
# Debug kernel data structures
#
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_PLIST=y
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
CONFIG_BUG_ON_DATA_CORRUPTION=y
# CONFIG_DEBUG_MAPLE_TREE is not set
# end of Debug kernel data structures

CONFIG_DEBUG_CREDENTIALS=y

#
# RCU Debugging
#
CONFIG_PROVE_RCU=y
# CONFIG_RCU_SCALE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_REF_SCALE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# end of RCU Debugging

# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
CONFIG_LATENCYTOP=y
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_RETHOOK=y
CONFIG_RETHOOK=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
CONFIG_HAVE_DYNAMIC_FTRACE_NO_PATCHABLE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_OBJTOOL_MCOUNT=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_HAVE_BUILDTIME_MCOUNT_SORT=y
CONFIG_BUILDTIME_MCOUNT_SORT=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_RING_BUFFER_ALLOW_SWAP=y
CONFIG_PREEMPTIRQ_TRACEPOINTS=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_BOOTTIME_TRACING=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y
CONFIG_FPROBE=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_STACK_TRACER=y
CONFIG_IRQSOFF_TRACER=y
# CONFIG_PREEMPT_TRACER is not set
CONFIG_SCHED_TRACER=y
CONFIG_HWLAT_TRACER=y
# CONFIG_OSNOISE_TRACER is not set
# CONFIG_TIMERLAT_TRACER is not set
# CONFIG_MMIOTRACE is not set
CONFIG_FTRACE_SYSCALLS=y
CONFIG_TRACER_SNAPSHOT=y
CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENTS=y
# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set
CONFIG_UPROBE_EVENTS=y
CONFIG_BPF_EVENTS=y
CONFIG_DYNAMIC_EVENTS=y
CONFIG_PROBE_EVENTS=y
CONFIG_BPF_KPROBE_OVERRIDE=y
CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE_MCOUNT_USE_CC=y
CONFIG_TRACING_MAP=y
CONFIG_SYNTH_EVENTS=y
CONFIG_HIST_TRIGGERS=y
# CONFIG_TRACE_EVENT_INJECT is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
CONFIG_RING_BUFFER_BENCHMARK=m
# CONFIG_TRACE_EVAL_MAP_FILE is not set
# CONFIG_FTRACE_RECORD_RECURSION is not set
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_FTRACE_SORT_STARTUP_TEST is not set
# CONFIG_RING_BUFFER_STARTUP_TEST is not set
# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set
CONFIG_PREEMPTIRQ_DELAY_TEST=m
# CONFIG_SYNTH_EVENT_GEN_TEST is not set
# CONFIG_KPROBE_EVENT_GEN_TEST is not set
# CONFIG_HIST_TRIGGERS_DEBUG is not set
# CONFIG_RV is not set
CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
CONFIG_SAMPLES=y
# CONFIG_SAMPLE_AUXDISPLAY is not set
# CONFIG_SAMPLE_TRACE_EVENTS is not set
# CONFIG_SAMPLE_TRACE_CUSTOM_EVENTS is not set
CONFIG_SAMPLE_TRACE_PRINTK=m
CONFIG_SAMPLE_FTRACE_DIRECT=m
# CONFIG_SAMPLE_FTRACE_DIRECT_MULTI is not set
# CONFIG_SAMPLE_TRACE_ARRAY is not set
# CONFIG_SAMPLE_KOBJECT is not set
# CONFIG_SAMPLE_KPROBES is not set
# CONFIG_SAMPLE_HW_BREAKPOINT is not set
# CONFIG_SAMPLE_FPROBE is not set
# CONFIG_SAMPLE_KFIFO is not set
# CONFIG_SAMPLE_LIVEPATCH is not set
# CONFIG_SAMPLE_CONFIGFS is not set
# CONFIG_SAMPLE_VFIO_MDEV_MTTY is not set
# CONFIG_SAMPLE_VFIO_MDEV_MDPY is not set
# CONFIG_SAMPLE_VFIO_MDEV_MDPY_FB is not set
# CONFIG_SAMPLE_VFIO_MDEV_MBOCHS is not set
# CONFIG_SAMPLE_WATCHDOG is not set
CONFIG_HAVE_SAMPLE_FTRACE_DIRECT=y
CONFIG_HAVE_SAMPLE_FTRACE_DIRECT_MULTI=y
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
CONFIG_STRICT_DEVMEM=y
# CONFIG_IO_STRICT_DEVMEM is not set

#
# x86 Debugging
#
CONFIG_EARLY_PRINTK_USB=y
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
CONFIG_EARLY_PRINTK_USB_XDBC=y
# CONFIG_EFI_PGT_DUMP is not set
# CONFIG_DEBUG_TLBFLUSH is not set
# CONFIG_IOMMU_DEBUG is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
# CONFIG_X86_DECODER_SELFTEST is not set
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEBUG_BOOT_PARAMS=y
# CONFIG_CPA_DEBUG is not set
# CONFIG_DEBUG_ENTRY is not set
# CONFIG_DEBUG_NMI_SELFTEST is not set
# CONFIG_X86_DEBUG_FPU is not set
# CONFIG_PUNIT_ATOM_DEBUG is not set
CONFIG_UNWINDER_ORC=y
# CONFIG_UNWINDER_FRAME_POINTER is not set
# end of x86 Debugging

#
# Kernel Testing and Coverage
#
# CONFIG_KUNIT is not set
CONFIG_NOTIFIER_ERROR_INJECTION=y
CONFIG_PM_NOTIFIER_ERROR_INJECT=m
# CONFIG_NETDEV_NOTIFIER_ERROR_INJECT is not set
CONFIG_FUNCTION_ERROR_INJECTION=y
CONFIG_FAULT_INJECTION=y
# CONFIG_FAILSLAB is not set
# CONFIG_FAIL_PAGE_ALLOC is not set
# CONFIG_FAULT_INJECTION_USERCOPY is not set
# CONFIG_FAIL_MAKE_REQUEST is not set
# CONFIG_FAIL_IO_TIMEOUT is not set
# CONFIG_FAIL_FUTEX is not set
CONFIG_FAULT_INJECTION_DEBUG_FS=y
CONFIG_FAIL_FUNCTION=y
# CONFIG_FAIL_MMC_REQUEST is not set
# CONFIG_FAIL_SUNRPC is not set
# CONFIG_FAULT_INJECTION_STACKTRACE_FILTER is not set
CONFIG_ARCH_HAS_KCOV=y
CONFIG_CC_HAS_SANCOV_TRACE_PC=y
# CONFIG_KCOV is not set
CONFIG_RUNTIME_TESTING_MENU=y
CONFIG_LKDTM=y
# CONFIG_TEST_MIN_HEAP is not set
# CONFIG_TEST_DIV64 is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_TEST_REF_TRACKER is not set
# CONFIG_RBTREE_TEST is not set
# CONFIG_REED_SOLOMON_TEST is not set
# CONFIG_INTERVAL_TREE_TEST is not set
# CONFIG_PERCPU_TEST is not set
# CONFIG_ATOMIC64_SELFTEST is not set
# CONFIG_ASYNC_RAID6_TEST is not set
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_STRING_SELFTEST is not set
# CONFIG_TEST_STRING_HELPERS is not set
CONFIG_TEST_STRSCPY=m
# CONFIG_TEST_KSTRTOX is not set
CONFIG_TEST_PRINTF=m
CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
# CONFIG_TEST_UUID is not set
# CONFIG_TEST_XARRAY is not set
# CONFIG_TEST_MAPLE_TREE is not set
# CONFIG_TEST_RHASHTABLE is not set
# CONFIG_TEST_SIPHASH is not set
# CONFIG_TEST_IDA is not set
CONFIG_TEST_LKM=m
CONFIG_TEST_BITOPS=m
CONFIG_TEST_VMALLOC=m
CONFIG_TEST_USER_COPY=m
CONFIG_TEST_BPF=m
CONFIG_TEST_BLACKHOLE_DEV=m
# CONFIG_FIND_BIT_BENCHMARK is not set
CONFIG_TEST_FIRMWARE=y
CONFIG_TEST_SYSCTL=y
# CONFIG_TEST_UDELAY is not set
CONFIG_TEST_STATIC_KEYS=m
# CONFIG_TEST_DYNAMIC_DEBUG is not set
CONFIG_TEST_KMOD=m
# CONFIG_TEST_MEMCAT_P is not set
CONFIG_TEST_LIVEPATCH=m
# CONFIG_TEST_MEMINIT is not set
CONFIG_TEST_HMM=m
# CONFIG_TEST_FREE_PAGES is not set
CONFIG_TEST_FPU=m
# CONFIG_TEST_CLOCKSOURCE_WATCHDOG is not set
# CONFIG_TEST_OBJPOOL is not set
CONFIG_ARCH_USE_MEMTEST=y
# CONFIG_MEMTEST is not set
# end of Kernel Testing and Coverage

#
# Rust hacking
#
# end of Rust hacking
# end of Kernel hacking

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v4] kprobes,lib: kretprobe scalability improvement
  2022-11-01  1:43 [PATCH v3] kprobes,lib: kretprobe scalability improvement wuqiang
  2022-11-01 12:27 ` kernel test robot
@ 2022-11-02  2:30 ` wuqiang
  2022-11-02 21:33   ` Andrew Morton
                     ` (3 more replies)
  1 sibling, 4 replies; 23+ messages in thread
From: wuqiang @ 2022-11-02  2:30 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, mattwu, wuqiang, kernel test robot

kretprobe is using freelist to manage return-instances, but freelist,
as LIFO queue based on singly linked list, scales badly and reduces
the overall throughput of kretprobed routines, especially for high
contention scenarios.

Here's a typical throughput test of sys_flock (counts in 10 seconds,
measured with perf stat -a -I 10000 -e syscalls:sys_enter_flock):

OS: Debian 10 X86_64, Linux 6.1rc2
HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s

      1X       2X       4X       6X       8X      12X      16X
34762430 36546920 17949900 13101899 12569595 12646601 14729195
     24X      32X      48X      64X      72X      96X     128X
19263546 10102064  8985418 11936495 11493980  7127789  9330985

This patch implements a scalable, lock-less and numa-aware object pool,
which brings near-linear scalability to kretprobed routines. Tests of
kretprobe throughput show the biggest ratio as 333.9x of the original
freelist. Here's the comparison:

                  1X         2X         4X         8X        16X
freelist:   34762430   36546920   17949900   12569595   14729195
objpool:    35627544   72182095  144068494  287564688  576903916
                 32X        48X        64X        96X       128X
freelist:   10102064    8985418   11936495    7127789    9330985
objpool:  1158876372 1737828164 2324371724 2380310472 2463182819

Tests on 96-core ARM64 system output similarly, but with the biggest
ratio up to 642.2x:

OS: Debian 10 AARCH64, Linux 6.1rc2
HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s

                  1X         2X         4X         8X        16X
freelist:   17498299   10887037   10224710    8499132    6421751
objpool:    18715726   35549845   71615884  144258971  283707220
                 24X        32X        48X        64X        96X
freelist:    5339868    4819116    3593919    3121575    2687167
objpool:   419830913  571609748  877456139 1143316315 1725668029

The object pool, leveraging percpu ring-array to mitigate hot spots
of memory contention, could deliver near-linear scalability for high
parallel scenarios. The ring-array is compactly managed in a single
cacheline (64 bytes) to benefit from warmed L1 cache for most cases
(<= 4 instances per core) and objects are managed in the continuous
cachelines just after ring-array.

Changes since V3:
1) build warning: unused variable in fprobe_init_rethook
Reported-by: kernel test robot <lkp@intel.com>

Changes since V2:
1) the percpu-extended version of the freelist replaced by new percpu-
   ring-array. freelist has data-contention in freelist_node (refs and
   next) even after node is removed from freelist and the node could
   be polluted easily (with freelist_node defined in union)
2) routines split to objpool.h and objpool.c according to cold & hot
   pathes, and the latter moved to lib, as suggested by Masami
3) test module (test_objpool.ko) added to lib for functional testings

Changes since V1:
1) reformat to a single patch as Masami Hiramatsu suggested
2) use __vmalloc_node to replace vmalloc_node for vmalloc
3) a few minor fixes: typo and coding-style issues

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 include/linux/freelist.h |  129 -----
 include/linux/kprobes.h  |    9 +-
 include/linux/objpool.h  |  151 ++++++
 include/linux/rethook.h  |   15 +-
 kernel/kprobes.c         |   95 ++--
 kernel/trace/fprobe.c    |   17 +-
 kernel/trace/rethook.c   |   80 +--
 lib/Kconfig.debug        |   11 +
 lib/Makefile             |    4 +-
 lib/objpool.c            |  480 ++++++++++++++++++
 lib/test_objpool.c       | 1031 ++++++++++++++++++++++++++++++++++++++
 11 files changed, 1772 insertions(+), 250 deletions(-)
 delete mode 100644 include/linux/freelist.h
 create mode 100644 include/linux/objpool.h
 create mode 100644 lib/objpool.c
 create mode 100644 lib/test_objpool.c

diff --git a/include/linux/freelist.h b/include/linux/freelist.h
deleted file mode 100644
index fc1842b96469..000000000000
--- a/include/linux/freelist.h
+++ /dev/null
@@ -1,129 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */
-#ifndef FREELIST_H
-#define FREELIST_H
-
-#include <linux/atomic.h>
-
-/*
- * Copyright: cameron@moodycamel.com
- *
- * A simple CAS-based lock-free free list. Not the fastest thing in the world
- * under heavy contention, but simple and correct (assuming nodes are never
- * freed until after the free list is destroyed), and fairly speedy under low
- * contention.
- *
- * Adapted from: https://moodycamel.com/blog/2014/solving-the-aba-problem-for-lock-free-free-lists
- */
-
-struct freelist_node {
-	atomic_t		refs;
-	struct freelist_node	*next;
-};
-
-struct freelist_head {
-	struct freelist_node	*head;
-};
-
-#define REFS_ON_FREELIST 0x80000000
-#define REFS_MASK	 0x7FFFFFFF
-
-static inline void __freelist_add(struct freelist_node *node, struct freelist_head *list)
-{
-	/*
-	 * Since the refcount is zero, and nobody can increase it once it's
-	 * zero (except us, and we run only one copy of this method per node at
-	 * a time, i.e. the single thread case), then we know we can safely
-	 * change the next pointer of the node; however, once the refcount is
-	 * back above zero, then other threads could increase it (happens under
-	 * heavy contention, when the refcount goes to zero in between a load
-	 * and a refcount increment of a node in try_get, then back up to
-	 * something non-zero, then the refcount increment is done by the other
-	 * thread) -- so if the CAS to add the node to the actual list fails,
-	 * decrese the refcount and leave the add operation to the next thread
-	 * who puts the refcount back to zero (which could be us, hence the
-	 * loop).
-	 */
-	struct freelist_node *head = READ_ONCE(list->head);
-
-	for (;;) {
-		WRITE_ONCE(node->next, head);
-		atomic_set_release(&node->refs, 1);
-
-		if (!try_cmpxchg_release(&list->head, &head, node)) {
-			/*
-			 * Hmm, the add failed, but we can only try again when
-			 * the refcount goes back to zero.
-			 */
-			if (atomic_fetch_add_release(REFS_ON_FREELIST - 1, &node->refs) == 1)
-				continue;
-		}
-		return;
-	}
-}
-
-static inline void freelist_add(struct freelist_node *node, struct freelist_head *list)
-{
-	/*
-	 * We know that the should-be-on-freelist bit is 0 at this point, so
-	 * it's safe to set it using a fetch_add.
-	 */
-	if (!atomic_fetch_add_release(REFS_ON_FREELIST, &node->refs)) {
-		/*
-		 * Oh look! We were the last ones referencing this node, and we
-		 * know we want to add it to the free list, so let's do it!
-		 */
-		__freelist_add(node, list);
-	}
-}
-
-static inline struct freelist_node *freelist_try_get(struct freelist_head *list)
-{
-	struct freelist_node *prev, *next, *head = smp_load_acquire(&list->head);
-	unsigned int refs;
-
-	while (head) {
-		prev = head;
-		refs = atomic_read(&head->refs);
-		if ((refs & REFS_MASK) == 0 ||
-		    !atomic_try_cmpxchg_acquire(&head->refs, &refs, refs+1)) {
-			head = smp_load_acquire(&list->head);
-			continue;
-		}
-
-		/*
-		 * Good, reference count has been incremented (it wasn't at
-		 * zero), which means we can read the next and not worry about
-		 * it changing between now and the time we do the CAS.
-		 */
-		next = READ_ONCE(head->next);
-		if (try_cmpxchg_acquire(&list->head, &head, next)) {
-			/*
-			 * Yay, got the node. This means it was on the list,
-			 * which means should-be-on-freelist must be false no
-			 * matter the refcount (because nobody else knows it's
-			 * been taken off yet, it can't have been put back on).
-			 */
-			WARN_ON_ONCE(atomic_read(&head->refs) & REFS_ON_FREELIST);
-
-			/*
-			 * Decrease refcount twice, once for our ref, and once
-			 * for the list's ref.
-			 */
-			atomic_fetch_add(-2, &head->refs);
-
-			return head;
-		}
-
-		/*
-		 * OK, the head must have changed on us, but we still need to decrement
-		 * the refcount we increased.
-		 */
-		refs = atomic_fetch_add(-1, &prev->refs);
-		if (refs == REFS_ON_FREELIST + 1)
-			__freelist_add(prev, list);
-	}
-
-	return NULL;
-}
-
-#endif /* FREELIST_H */
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index a0b92be98984..f13f01e600c2 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -27,7 +27,7 @@
 #include <linux/mutex.h>
 #include <linux/ftrace.h>
 #include <linux/refcount.h>
-#include <linux/freelist.h>
+#include <linux/objpool.h>
 #include <linux/rethook.h>
 #include <asm/kprobes.h>
 
@@ -141,6 +141,7 @@ static inline bool kprobe_ftrace(struct kprobe *p)
  */
 struct kretprobe_holder {
 	struct kretprobe	*rp;
+	struct objpool_head	oh;
 	refcount_t		ref;
 };
 
@@ -154,7 +155,6 @@ struct kretprobe {
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
 	struct rethook *rh;
 #else
-	struct freelist_head freelist;
 	struct kretprobe_holder *rph;
 #endif
 };
@@ -165,10 +165,7 @@ struct kretprobe_instance {
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
 	struct rethook_node node;
 #else
-	union {
-		struct freelist_node freelist;
-		struct rcu_head rcu;
-	};
+	struct rcu_head rcu;
 	struct llist_node llist;
 	struct kretprobe_holder *rph;
 	kprobe_opcode_t *ret_addr;
diff --git a/include/linux/objpool.h b/include/linux/objpool.h
new file mode 100644
index 000000000000..0b746187482a
--- /dev/null
+++ b/include/linux/objpool.h
@@ -0,0 +1,151 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_OBJPOOL_H
+#define _LINUX_OBJPOOL_H
+
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/atomic.h>
+
+/*
+ * objpool: ring-array based lockless MPMC/FIFO queues
+ *
+ * Copyright: wuqiang.matt@bytedance.com
+ *
+ * The object pool is a scalable implementaion of high performance queue
+ * for objects allocation and reclamation, such as kretprobe instances.
+ *
+ * With leveraging per-cpu ring-array to mitigate the hot spots of memory
+ * contention, it could deliver near-linear scalability for high parallel
+ * cases. Meanwhile, it also achieves high throughput with benifiting from
+ * warmed cache on each core.
+ *
+ * The object pool are best suited for the following cases:
+ * 1) memory allocation or reclamation is prohibited or too expensive
+ * 2) the objects are allocated/used/reclaimed very frequently
+ *
+ * Before using, you must be aware of it's limitations:
+ * 1) Maximum number of objects is determined during pool initializing
+ * 2) The memory of objects won't be freed until the poll is de-allocated
+ * 3) Both allocation and reclamation could be nested
+ */
+
+/*
+ * objpool_slot: per-cpu ring array
+ *
+ * Represents a cpu-local array-based ring buffer, its size is specialized
+ * during initialization of object pool.
+ *
+ * The objpool_slot is allocated from local memory for NUMA system, and to
+ * be kept compact in a single cacheline. ages[] is stored just after the
+ * body of objpool_slot, and ents[] is after ages[]. ages[] describes the
+ * revision of epoch of the item, solely used to avoid ABA. ents[] contains
+ * the object pointers.
+ *
+ * The default size of objpool_slot is a single cacheline, aka. 64 bytes.
+ *
+ * 64bit:
+ *        4      8      12     16        32                 64
+ * | head | tail | size | mask | ages[4] | ents[4]: (8 * 4) |
+ *
+ * 32bit:
+ *        4      8      12     16        32        48       64
+ * | head | tail | size | mask | ages[4] | ents[4] | unused |
+ *
+ */
+
+struct objpool_slot {
+	uint32_t                os_head;	/* head of ring array */
+	uint32_t                os_tail;	/* tail of ring array */
+	uint32_t                os_size;	/* max item slots, pow of 2 */
+	uint32_t                os_mask;	/* os_size - 1 */
+/*
+ *	uint32_t                os_ages[];	// ring epoch id
+ *	void                   *os_ents[];	// objects array
+ */
+};
+
+/* caller-specified object initial callback to setup each object, only called once */
+typedef int (*objpool_init_node_cb)(void *context, void *obj);
+
+/* caller-specified cleanup callback for private objects/pool/context */
+typedef int (*objpool_release_cb)(void *context, void *ptr, uint32_t flags);
+
+/* called for object releasing: ptr points to an object */
+#define OBJPOOL_FLAG_NODE        (0x00000001)
+/* for user pool and context releasing, ptr could be NULL */
+#define OBJPOOL_FLAG_POOL        (0x00001000)
+/* the object or pool to be released is user-managed */
+#define OBJPOOL_FLAG_USER        (0x00008000)
+
+/*
+ * objpool_head: object pooling metadata
+ */
+
+struct objpool_head {
+	uint32_t                oh_objsz;	/* object & element size */
+	uint32_t                oh_nobjs;	/* total objs (pre-allocated) */
+	uint32_t                oh_nents;	/* max objects per cpuslot */
+	uint32_t                oh_ncpus;	/* num of possible cpus */
+	uint32_t                oh_in_user:1;	/* user-specified buffer */
+	uint32_t                oh_in_slot:1;	/* objs alloced with slots */
+	uint32_t                oh_vmalloc:1;	/* alloc from vmalloc zone */
+	gfp_t                   oh_gfp;		/* k/vmalloc gfp flags */
+	uint32_t                oh_sz_pool;	/* user pool size in byes */
+	void                   *oh_pool;	/* user managed memory pool */
+	struct objpool_slot   **oh_slots;	/* array of percpu slots */
+	uint32_t               *oh_sz_slots;	/* size in bytes of slots */
+	objpool_release_cb      oh_release;	/* resource cleanup callback */
+	void                   *oh_context;	/* caller-provided context */
+};
+
+/* initialize object pool and pre-allocate objects */
+int objpool_init(struct objpool_head *oh,
+		int nobjs, int max, int objsz,
+		gfp_t gfp, void *context,
+		objpool_init_node_cb objinit,
+		objpool_release_cb release);
+
+/* add objects in batch from user provided pool */
+int objpool_populate(struct objpool_head *oh, void *buf,
+			int size, int objsz, void *context,
+			objpool_init_node_cb objinit);
+
+/* add pre-allocated object (managed by user) to objpool */
+int objpool_add(void *obj, struct objpool_head *oh);
+
+/* allocate an object from objects pool */
+void *objpool_pop(struct objpool_head *oh);
+
+/* reclaim an object and return it back to objects pool */
+int objpool_push(void *node, struct objpool_head *oh);
+
+/* cleanup the whole object pool (including all chained objects) */
+void objpool_fini(struct objpool_head *oh);
+
+/* whether the object is pre-allocated with percpu slots */
+static inline int objpool_is_inslot(void *obj, struct objpool_head *oh)
+{
+	void *slot;
+	int i;
+
+	if (!obj)
+		return 0;
+
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		slot = oh->oh_slots[i];
+		if (obj >= slot && obj < slot + oh->oh_sz_slots[i])
+			return 1;
+	}
+
+	return 0;
+}
+
+/* whether the object is from user pool (batched adding) */
+static inline int objpool_is_inpool(void *obj, struct objpool_head *oh)
+{
+	return (obj && oh->oh_pool && obj >= oh->oh_pool &&
+		obj < oh->oh_pool + oh->oh_sz_pool);
+}
+
+#endif /* _LINUX_OBJPOOL_H */
diff --git a/include/linux/rethook.h b/include/linux/rethook.h
index c8ac1e5afcd1..278ec65e71fe 100644
--- a/include/linux/rethook.h
+++ b/include/linux/rethook.h
@@ -6,7 +6,7 @@
 #define _LINUX_RETHOOK_H
 
 #include <linux/compiler.h>
-#include <linux/freelist.h>
+#include <linux/objpool.h>
 #include <linux/kallsyms.h>
 #include <linux/llist.h>
 #include <linux/rcupdate.h>
@@ -30,14 +30,14 @@ typedef void (*rethook_handler_t) (struct rethook_node *, void *, struct pt_regs
 struct rethook {
 	void			*data;
 	rethook_handler_t	handler;
-	struct freelist_head	pool;
+	struct objpool_head	pool;
 	refcount_t		ref;
 	struct rcu_head		rcu;
 };
 
 /**
  * struct rethook_node - The rethook shadow-stack entry node.
- * @freelist: The freelist, linked to struct rethook::pool.
+ * @nod: The objpool node, linked to struct rethook::pool.
  * @rcu: The rcu_head for deferred freeing.
  * @llist: The llist, linked to a struct task_struct::rethooks.
  * @rethook: The pointer to the struct rethook.
@@ -48,19 +48,15 @@ struct rethook {
  * on each entry of the shadow stack.
  */
 struct rethook_node {
-	union {
-		struct freelist_node freelist;
-		struct rcu_head      rcu;
-	};
+	struct rcu_head		rcu;
 	struct llist_node	llist;
 	struct rethook		*rethook;
 	unsigned long		ret_addr;
 	unsigned long		frame;
 };
 
-struct rethook *rethook_alloc(void *data, rethook_handler_t handler);
+struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp, int size, int max);
 void rethook_free(struct rethook *rh);
-void rethook_add_node(struct rethook *rh, struct rethook_node *node);
 struct rethook_node *rethook_try_get(struct rethook *rh);
 void rethook_recycle(struct rethook_node *node);
 void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
@@ -97,4 +93,3 @@ void rethook_flush_task(struct task_struct *tk);
 #endif
 
 #endif
-
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index b781dee3f552..42cb708c3248 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1865,10 +1865,12 @@ static struct notifier_block kprobe_exceptions_nb = {
 static void free_rp_inst_rcu(struct rcu_head *head)
 {
 	struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
+	struct kretprobe_holder *rph = ri->rph;
 
-	if (refcount_dec_and_test(&ri->rph->ref))
-		kfree(ri->rph);
-	kfree(ri);
+	if (refcount_dec_and_test(&rph->ref)) {
+		objpool_fini(&rph->oh);
+		kfree(rph);
+	}
 }
 NOKPROBE_SYMBOL(free_rp_inst_rcu);
 
@@ -1877,7 +1879,7 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
 	struct kretprobe *rp = get_kretprobe(ri);
 
 	if (likely(rp))
-		freelist_add(&ri->freelist, &rp->freelist);
+		objpool_push(ri, &rp->rph->oh);
 	else
 		call_rcu(&ri->rcu, free_rp_inst_rcu);
 }
@@ -1914,23 +1916,19 @@ NOKPROBE_SYMBOL(kprobe_flush_task);
 
 static inline void free_rp_inst(struct kretprobe *rp)
 {
-	struct kretprobe_instance *ri;
-	struct freelist_node *node;
-	int count = 0;
-
-	node = rp->freelist.head;
-	while (node) {
-		ri = container_of(node, struct kretprobe_instance, freelist);
-		node = node->next;
-
-		kfree(ri);
-		count++;
-	}
+	struct kretprobe_holder *rph = rp->rph;
+	void *nod;
 
-	if (refcount_sub_and_test(count, &rp->rph->ref)) {
-		kfree(rp->rph);
-		rp->rph = NULL;
-	}
+	rp->rph = NULL;
+	do {
+		nod = objpool_pop(&rph->oh);
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&rph->ref)) {
+			objpool_fini(&rph->oh);
+			kfree(rph);
+			break;
+		}
+	} while (nod);
 }
 
 /* This assumes the 'tsk' is the current task or the is not running. */
@@ -2072,19 +2070,17 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
 static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 {
 	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
+	struct kretprobe_holder *rph = rp->rph;
 	struct kretprobe_instance *ri;
-	struct freelist_node *fn;
 
-	fn = freelist_try_get(&rp->freelist);
-	if (!fn) {
+	ri = objpool_pop(&rph->oh);
+	if (!ri) {
 		rp->nmissed++;
 		return 0;
 	}
 
-	ri = container_of(fn, struct kretprobe_instance, freelist);
-
 	if (rp->entry_handler && rp->entry_handler(ri, regs)) {
-		freelist_add(&ri->freelist, &rp->freelist);
+		objpool_push(ri, &rph->oh);
 		return 0;
 	}
 
@@ -2174,10 +2170,19 @@ int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long o
 	return 0;
 }
 
+#ifndef CONFIG_KRETPROBE_ON_RETHOOK
+static int kretprobe_init_inst(void *context, void *nod)
+{
+	struct kretprobe_instance *ri = nod;
+
+	ri->rph = context;
+	return 0;
+}
+#endif
+
 int register_kretprobe(struct kretprobe *rp)
 {
 	int ret;
-	struct kretprobe_instance *inst;
 	int i;
 	void *addr;
 
@@ -2215,20 +2220,12 @@ int register_kretprobe(struct kretprobe *rp)
 #endif
 	}
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
-	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler);
+	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler, GFP_KERNEL,
+				sizeof(struct kretprobe_instance) + rp->data_size,
+				rp->maxactive);
 	if (!rp->rh)
 		return -ENOMEM;
 
-	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
-		if (inst == NULL) {
-			rethook_free(rp->rh);
-			rp->rh = NULL;
-			return -ENOMEM;
-		}
-		rethook_add_node(rp->rh, &inst->node);
-	}
 	rp->nmissed = 0;
 	/* Establish function entry probe point */
 	ret = register_kprobe(&rp->kp);
@@ -2237,25 +2234,19 @@ int register_kretprobe(struct kretprobe *rp)
 		rp->rh = NULL;
 	}
 #else	/* !CONFIG_KRETPROBE_ON_RETHOOK */
-	rp->freelist.head = NULL;
 	rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
 	if (!rp->rph)
 		return -ENOMEM;
 
-	rp->rph->rp = rp;
-	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
-		if (inst == NULL) {
-			refcount_set(&rp->rph->ref, i);
-			free_rp_inst(rp);
-			return -ENOMEM;
-		}
-		inst->rph = rp->rph;
-		freelist_add(&inst->freelist, &rp->freelist);
+	if (objpool_init(&rp->rph->oh, rp->maxactive, rp->maxactive,
+			  rp->data_size + sizeof(struct kretprobe_instance),
+			  GFP_KERNEL, rp->rph, kretprobe_init_inst, NULL)) {
+		kfree(rp->rph);
+		rp->rph = NULL;
+		return -ENOMEM;
 	}
-	refcount_set(&rp->rph->ref, i);
-
+	refcount_set(&rp->rph->ref, rp->maxactive + 1);
+	rp->rph->rp = rp;
 	rp->nmissed = 0;
 	/* Establish function entry probe point */
 	ret = register_kprobe(&rp->kp);
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index aac63ca9c3d1..99b4ab0f6468 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -125,7 +125,7 @@ static void fprobe_init(struct fprobe *fp)
 
 static int fprobe_init_rethook(struct fprobe *fp, int num)
 {
-	int i, size;
+	int size;
 
 	if (num < 0)
 		return -EINVAL;
@@ -140,18 +140,11 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
 	if (size < 0)
 		return -E2BIG;
 
-	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler);
-	for (i = 0; i < size; i++) {
-		struct fprobe_rethook_node *node;
+	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, GFP_KERNEL,
+				sizeof(struct fprobe_rethook_node), size);
+	if (!fp->rethook)
+		return -ENOMEM;
 
-		node = kzalloc(sizeof(*node), GFP_KERNEL);
-		if (!node) {
-			rethook_free(fp->rethook);
-			fp->rethook = NULL;
-			return -ENOMEM;
-		}
-		rethook_add_node(fp->rethook, &node->node);
-	}
 	return 0;
 }
 
diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
index c69d82273ce7..01df98db2fbe 100644
--- a/kernel/trace/rethook.c
+++ b/kernel/trace/rethook.c
@@ -36,21 +36,17 @@ void rethook_flush_task(struct task_struct *tk)
 static void rethook_free_rcu(struct rcu_head *head)
 {
 	struct rethook *rh = container_of(head, struct rethook, rcu);
-	struct rethook_node *rhn;
-	struct freelist_node *node;
-	int count = 1;
+	struct rethook_node *nod;
 
-	node = rh->pool.head;
-	while (node) {
-		rhn = container_of(node, struct rethook_node, freelist);
-		node = node->next;
-		kfree(rhn);
-		count++;
-	}
-
-	/* The rh->ref is the number of pooled node + 1 */
-	if (refcount_sub_and_test(count, &rh->ref))
-		kfree(rh);
+	do {
+		nod = objpool_pop(&rh->pool);
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&rh->ref)) {
+			objpool_fini(&rh->pool);
+			kfree(rh);
+			break;
+		}
+	} while (nod);
 }
 
 /**
@@ -70,16 +66,28 @@ void rethook_free(struct rethook *rh)
 	call_rcu(&rh->rcu, rethook_free_rcu);
 }
 
+static int rethook_init_node(void *context, void *nod)
+{
+	struct rethook_node *node = nod;
+
+	node->rethook = context;
+	return 0;
+}
+
 /**
  * rethook_alloc() - Allocate struct rethook.
  * @data: a data to pass the @handler when hooking the return.
  * @handler: the return hook callback function.
+ * @gfp: default gfp for objpool allocation
+ * @size: rethook node size
+ * @max: number of rethook nodes to be preallocated
  *
  * Allocate and initialize a new rethook with @data and @handler.
  * Return NULL if memory allocation fails or @handler is NULL.
  * Note that @handler == NULL means this rethook is going to be freed.
  */
-struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
+struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp,
+			      int size, int max)
 {
 	struct rethook *rh = kzalloc(sizeof(struct rethook), GFP_KERNEL);
 
@@ -88,34 +96,26 @@ struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
 
 	rh->data = data;
 	rh->handler = handler;
-	rh->pool.head = NULL;
-	refcount_set(&rh->ref, 1);
 
+	/* initialize the objpool for rethook nodes */
+	if (objpool_init(&rh->pool, max, max, size, gfp, rh, rethook_init_node,
+			NULL)) {
+		kfree(rh);
+		return NULL;
+	}
+	refcount_set(&rh->ref, max + 1);
 	return rh;
 }
 
-/**
- * rethook_add_node() - Add a new node to the rethook.
- * @rh: the struct rethook.
- * @node: the struct rethook_node to be added.
- *
- * Add @node to @rh. User must allocate @node (as a part of user's
- * data structure.) The @node fields are initialized in this function.
- */
-void rethook_add_node(struct rethook *rh, struct rethook_node *node)
-{
-	node->rethook = rh;
-	freelist_add(&node->freelist, &rh->pool);
-	refcount_inc(&rh->ref);
-}
-
 static void free_rethook_node_rcu(struct rcu_head *head)
 {
 	struct rethook_node *node = container_of(head, struct rethook_node, rcu);
+	struct rethook *rh = node->rethook;
 
-	if (refcount_dec_and_test(&node->rethook->ref))
-		kfree(node->rethook);
-	kfree(node);
+	if (refcount_dec_and_test(&rh->ref)) {
+		objpool_fini(&rh->pool);
+		kfree(rh);
+	}
 }
 
 /**
@@ -130,7 +130,7 @@ void rethook_recycle(struct rethook_node *node)
 	lockdep_assert_preemption_disabled();
 
 	if (likely(READ_ONCE(node->rethook->handler)))
-		freelist_add(&node->freelist, &node->rethook->pool);
+		objpool_push(node, &node->rethook->pool);
 	else
 		call_rcu(&node->rcu, free_rethook_node_rcu);
 }
@@ -146,7 +146,7 @@ NOKPROBE_SYMBOL(rethook_recycle);
 struct rethook_node *rethook_try_get(struct rethook *rh)
 {
 	rethook_handler_t handler = READ_ONCE(rh->handler);
-	struct freelist_node *fn;
+	struct rethook_node *nod;
 
 	lockdep_assert_preemption_disabled();
 
@@ -163,11 +163,11 @@ struct rethook_node *rethook_try_get(struct rethook *rh)
 	if (unlikely(!rcu_is_watching()))
 		return NULL;
 
-	fn = freelist_try_get(&rh->pool);
-	if (!fn)
+	nod = (struct rethook_node *)objpool_pop(&rh->pool);
+	if (!nod)
 		return NULL;
 
-	return container_of(fn, struct rethook_node, freelist);
+	return nod;
 }
 NOKPROBE_SYMBOL(rethook_try_get);
 
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3fc7abffc7aa..b12cc71754cf 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2737,6 +2737,17 @@ config TEST_CLOCKSOURCE_WATCHDOG
 
 	  If unsure, say N.
 
+config TEST_OBJPOOL
+	tristate "Test module for correctness and stress of objpool"
+	default n
+       depends on m
+	help
+	  This builds the "test_objpool" module that should be used for
+	  correctness verification and concurrent testings of objects
+	  allocation and reclamation.
+
+	  If unsure, say N.
+
 endif # RUNTIME_TESTING_MENU
 
 config ARCH_USE_MEMTEST
diff --git a/lib/Makefile b/lib/Makefile
index 161d6a724ff7..4aa282fa0cfc 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -34,7 +34,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
 	 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
 	 nmi_backtrace.o win_minmax.o memcat_p.o \
-	 buildid.o
+	 buildid.o objpool.o
 
 lib-$(CONFIG_PRINTK) += dump_stack.o
 lib-$(CONFIG_SMP) += cpumask.o
@@ -99,6 +99,8 @@ obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
 obj-$(CONFIG_TEST_REF_TRACKER) += test_ref_tracker.o
 CFLAGS_test_fprobe.o += $(CC_FLAGS_FTRACE)
 obj-$(CONFIG_FPROBE_SANITY_TEST) += test_fprobe.o
+obj-$(CONFIG_TEST_OBJPOOL) += test_objpool.o
+
 #
 # CFLAGS for compiling floating point code inside the kernel. x86/Makefile turns
 # off the generation of FPU/SSE* instructions for kernel proper but FPU_FLAGS
diff --git a/lib/objpool.c b/lib/objpool.c
new file mode 100644
index 000000000000..51b3499ff9da
--- /dev/null
+++ b/lib/objpool.c
@@ -0,0 +1,480 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/objpool.h>
+
+/*
+ * objpool: ring-array based lockless MPMC/FIFO queues
+ *
+ * Copyright: wuqiang.matt@bytedance.com
+ */
+
+/* compute the suitable num of objects to be managed by slot */
+static inline uint32_t __objpool_num_of_objs(uint32_t size)
+{
+	return rounddown_pow_of_two((size - sizeof(struct objpool_slot)) /
+			(sizeof(uint32_t) + sizeof(void *)));
+}
+
+#define SLOT_AGES(s) ((uint32_t *)((char *)(s) + sizeof(struct objpool_slot)))
+#define SLOT_ENTS(s) ((void **)((char *)(s) + sizeof(struct objpool_slot) + \
+			sizeof(uint32_t) * (s)->os_size))
+#define SLOT_OBJS(s) ((void *)((char *)(s) + sizeof(struct objpool_slot) + \
+			(sizeof(uint32_t) + sizeof(void *)) * (s)->os_size))
+
+/* allocate and initialize percpu slots */
+static inline int
+__objpool_init_percpu_slots(struct objpool_head *oh, uint32_t nobjs,
+			void *context, objpool_init_node_cb objinit)
+{
+	uint32_t i, j, size, objsz, nents = oh->oh_nents;
+
+	/* aligned object size by sizeof(void *) */
+	objsz = ALIGN(oh->oh_objsz, sizeof(void *));
+	/* shall we allocate objects along with objpool_slot */
+	if (objsz)
+		oh->oh_in_slot = 1;
+
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		struct objpool_slot *os;
+		uint32_t n;
+
+		/* compute how many objects to be managed by this slot */
+		n = nobjs / oh->oh_ncpus;
+		if (i < (nobjs % oh->oh_ncpus))
+			n++;
+		size = sizeof(struct objpool_slot) + sizeof(void *) * nents +
+		       sizeof(uint32_t) * nents + objsz * n;
+
+		/* decide which pool shall the slot be allocated from */
+		if (0 == i) {
+			if ((oh->oh_gfp & GFP_ATOMIC) || size < PAGE_SIZE / 2)
+				oh->oh_vmalloc = 0;
+			else
+				oh->oh_vmalloc = 1;
+		}
+
+		/* allocate percpu slot & objects from local memory */
+		if (oh->oh_vmalloc)
+			os = vmalloc_node(size, cpu_to_node(i));
+		else
+			os = kmalloc_node(size, oh->oh_gfp, cpu_to_node(i));
+		if (!os)
+			return -ENOMEM;
+
+		/* initialize percpu slot for the i-th cpu */
+		memset(os, 0, size);
+		os->os_size = oh->oh_nents;
+		os->os_mask = os->os_size - 1;
+		oh->oh_slots[i] = os;
+		oh->oh_sz_slots[i] = size;
+
+		/*
+		 * start from 2nd round to avoid conflict of 1st item.
+		 * we assume that the head item is ready for retrieval
+		 * iff head is equal to ages[head & mask]. but ages is
+		 * initialized as 0, so in view of the caller of pop(),
+		 * the 1st item (0th) is always ready, but fact could
+		 * be: push() is stalled before the final update, thus
+		 * the item being inserted will be lost forever.
+		 */
+		os->os_head = os->os_tail = oh->oh_nents;
+
+		for (j = 0; oh->oh_in_slot && j < n; j++) {
+			uint32_t *ages = SLOT_AGES(os);
+			void **ents = SLOT_ENTS(os);
+			void *obj = SLOT_OBJS(os) + j * objsz;
+			uint32_t ie = os->os_tail & os->os_mask;
+
+			/* perform object initialization */
+			if (objinit) {
+				int rc = objinit(context, obj);
+				if (rc)
+					return rc;
+			}
+
+			/* add obj into the ring array */
+			ents[ie] = obj;
+			ages[ie] = os->os_tail;
+			os->os_tail++;
+			oh->oh_nobjs++;
+		}
+	}
+
+	return 0;
+}
+
+/* cleanup all percpu slots of the object pool */
+static inline void __objpool_fini_percpu_slots(struct objpool_head *oh)
+{
+	uint32_t i;
+
+	if (!oh->oh_slots)
+		return;
+
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		if (!oh->oh_slots[i])
+			continue;
+		if (oh->oh_vmalloc)
+			vfree(oh->oh_slots[i]);
+		else
+			kfree(oh->oh_slots[i]);
+	}
+	kfree(oh->oh_slots);
+	oh->oh_slots = NULL;
+	oh->oh_sz_slots = NULL;
+}
+
+/**
+ * objpool_init: initialize object pool and pre-allocate objects
+ *
+ * args:
+ * @oh:    the object pool to be initialized, declared by the caller
+ * @nojbs: total objects to be allocated by this object pool
+ * @max:   max objs this objpool could manage, use nobjs if 0
+ * @ojbsz: size of an object, to be pre-allocated if objsz is not 0
+ * @gfp:   gfp flags of caller's context for memory allocation
+ * @context: user context for object initialization callback
+ * @objinit: object initialization callback for extra setting-up
+ * @release: cleanup callback for private objects/pool/context
+ *
+ * return:
+ *         0 for success, otherwise error code
+ *
+ * All pre-allocated objects are to be zeroed. Caller could do extra
+ * initialization in objinit callback. The objinit callback will be
+ * called once and only once after the slot allocation
+ */
+int objpool_init(struct objpool_head *oh,
+		int nobjs, int max, int objsz,
+		gfp_t gfp, void *context,
+		objpool_init_node_cb objinit,
+		objpool_release_cb release)
+{
+	uint32_t nents, cpus = num_possible_cpus();
+	int rc;
+
+	/* calculate percpu slot size (rounded to pow of 2) */
+	if (max < nobjs)
+		max = nobjs;
+	nents = max / cpus;
+	if (nents < __objpool_num_of_objs(L1_CACHE_BYTES))
+		nents = __objpool_num_of_objs(L1_CACHE_BYTES);
+	nents = roundup_pow_of_two(nents);
+	while (nents * cpus < nobjs)
+		nents = nents << 1;
+
+	memset(oh, 0, sizeof(struct objpool_head));
+	oh->oh_ncpus = cpus;
+	oh->oh_objsz = objsz;
+	oh->oh_nents = nents;
+	oh->oh_gfp = gfp & ~__GFP_ZERO;
+	oh->oh_context = context;
+	oh->oh_release = release;
+
+	/* allocate array for percpu slots */
+	oh->oh_slots = kzalloc(oh->oh_ncpus * sizeof(void *) +
+			       oh->oh_ncpus * sizeof(uint32_t), oh->oh_gfp);
+	if (!oh->oh_slots)
+		return -ENOMEM;
+	oh->oh_sz_slots = (uint32_t *)&oh->oh_slots[oh->oh_ncpus];
+
+	/* initialize per-cpu slots */
+	rc = __objpool_init_percpu_slots(oh, nobjs, context, objinit);
+	if (rc)
+		__objpool_fini_percpu_slots(oh);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(objpool_init);
+
+/* adding object to slot tail, the given slot mustn't be full */
+static inline int __objpool_add_slot(void *obj, struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	uint32_t tail = atomic_inc_return((atomic_t *)&os->os_tail) - 1;
+
+	WRITE_ONCE(ents[tail & os->os_mask], obj);
+
+	/* order matters: obj must be updated before tail updating */
+	smp_store_release(&ages[tail & os->os_mask], tail);
+	return 0;
+}
+
+/* adding object to slot, abort if the slot was already full */
+static inline int __objpool_try_add_slot(void *obj, struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	uint32_t head, tail;
+
+	do {
+		/* perform memory loading for both head and tail */
+		head = READ_ONCE(os->os_head);
+		tail = READ_ONCE(os->os_tail);
+		/* just abort if slot is full */
+		if (tail >= head + os->os_size)
+			return -ENOENT;
+		/* try to extend tail by 1 using CAS to avoid races */
+		if (try_cmpxchg_acquire(&os->os_tail, &tail, tail + 1))
+			break;
+	} while (1);
+
+	/* the tail-th of slot is reserved for the given obj */
+	WRITE_ONCE(ents[tail & os->os_mask], obj);
+	/* update epoch id to make this object available for pop() */
+	smp_store_release(&ages[tail & os->os_mask], tail);
+	return 0;
+}
+
+/**
+ * objpool_populate: add objects from user provided pool in batch
+ *
+ * args:
+ * @oh:  object pool
+ * @buf: user buffer for pre-allocated objects
+ * @size: size of user buffer
+ * @objsz: size of object & element
+ * @context: user context for objinit callback
+ * @objinit: object initialization callback
+ *
+ * return: 0 or error code
+ */
+int objpool_populate(struct objpool_head *oh, void *buf, int size, int objsz,
+		    void *context, objpool_init_node_cb objinit)
+{
+	int n = oh->oh_nobjs, used = 0, i;
+
+	if (oh->oh_pool || !buf || size < objsz)
+		return -EINVAL;
+	if (oh->oh_objsz && oh->oh_objsz != objsz)
+		return -EINVAL;
+	if (oh->oh_context && context && oh->oh_context != context)
+		return -EINVAL;
+	if (oh->oh_nobjs >= oh->oh_ncpus * oh->oh_nents)
+		return -ENOENT;
+
+	WARN_ON_ONCE(((unsigned long)buf) & (sizeof(void *) - 1));
+	WARN_ON_ONCE(((uint32_t)objsz) & (sizeof(void *) - 1));
+
+	/* align object size by sizeof(void *) */
+	oh->oh_objsz = objsz;
+	objsz = ALIGN(objsz, sizeof(void *));
+	if (objsz <= 0)
+		return -EINVAL;
+
+	while (used + objsz <= size) {
+		void *obj = buf + used;
+
+		/* perform object initialization */
+		if (objinit) {
+			int rc = objinit(context, obj);
+			if (rc)
+				return rc;
+		}
+
+		/* insert obj to its corresponding objpool slot */
+		i = (n + used * oh->oh_ncpus/size) % oh->oh_ncpus;
+		if (!__objpool_try_add_slot(obj, oh->oh_slots[i]))
+			oh->oh_nobjs++;
+
+		used += objsz;
+	}
+
+	if (!used)
+		return -ENOENT;
+
+	oh->oh_context = context;
+	oh->oh_pool = buf;
+	oh->oh_sz_pool = size;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(objpool_populate);
+
+/**
+ * objpool_add: add pre-allocated object to objpool during pool
+ * initialization
+ *
+ * args:
+ * @obj: object pointer to be added to objpool
+ * @oh:  object pool to be inserted into
+ *
+ * return:
+ *     0 or error code
+ *
+ * objpool_add_node doesn't handle race conditions, can only be
+ * called during objpool initialization
+ */
+int objpool_add(void *obj, struct objpool_head *oh)
+{
+	uint32_t i, cpu;
+
+	if (!obj)
+		return -EINVAL;
+	if (oh->oh_nobjs >= oh->oh_ncpus * oh->oh_nents)
+		return -ENOENT;
+
+	cpu = oh->oh_nobjs % oh->oh_ncpus;
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		if (!__objpool_try_add_slot(obj, oh->oh_slots[cpu])) {
+			oh->oh_nobjs++;
+			return 0;
+		}
+
+		if (++cpu >= oh->oh_ncpus)
+			cpu = 0;
+	}
+
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(objpool_add);
+
+/**
+ * objpool_push: reclaim the object and return back to objects pool
+ *
+ * args:
+ * @obj: object pointer to be pushed to object pool
+ * @oh:  object pool
+ *
+ * return:
+ *     0 or error code: it fails only when objects pool are full
+ *
+ * objpool_push is non-blockable, and can be nested
+ */
+int objpool_push(void *obj, struct objpool_head *oh)
+{
+	uint32_t cpu = raw_smp_processor_id();
+
+	do {
+		if (oh->oh_nobjs > oh->oh_nents) {
+			if (!__objpool_try_add_slot(obj, oh->oh_slots[cpu]))
+				return 0;
+		} else {
+			if (!__objpool_add_slot(obj, oh->oh_slots[cpu]))
+				return 0;
+		}
+		if (++cpu >= oh->oh_ncpus)
+			cpu = 0;
+	} while (1);
+
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(objpool_push);
+
+/* try to retrieve object from slot */
+static inline void *__objpool_try_get_slot(struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	/* do memory load of os_head to local head */
+	uint32_t head = smp_load_acquire(&os->os_head);
+
+	/* loop if slot isn't empty */
+	while (head != READ_ONCE(os->os_tail)) {
+		uint32_t id = head & os->os_mask, prev = head;
+
+		/* do prefetching of object ents */
+		prefetch(&ents[id]);
+
+		/*
+		 * check whether this item was ready for retrieval ? There's
+		 * possibility * in theory * we might retrieve wrong object,
+		 * in case ages[id] overflows when current task is sleeping,
+		 * but it will take very very long to overflow an uint32_t
+		 */
+		if (smp_load_acquire(&ages[id]) == head) {
+			/* node must have been udpated by push() */
+			void *node = READ_ONCE(ents[id]);
+			/* commit and move forward head of the slot */
+			if (try_cmpxchg_release(&os->os_head, &head, head + 1))
+				return node;
+		}
+
+		/* re-load head from memory continue trying */
+		head = READ_ONCE(os->os_head);
+		/*
+		 * head stays unchanged, so it's very likely current pop()
+		 * just preempted/interrupted an ongoing push() operation
+		 */
+		if (head == prev)
+			break;
+	}
+
+	return NULL;
+}
+
+/**
+ * objpool_pop: allocate an object from objects pool
+ *
+ * args:
+ * @oh:  object pool
+ *
+ * return:
+ *   object: NULL if failed (object pool is empty)
+ *
+ * objpool_pop can be nested, so can be used in any context.
+ */
+void *objpool_pop(struct objpool_head *oh)
+{
+	uint32_t i, cpu = raw_smp_processor_id();
+	void *obj = NULL;
+
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		struct objpool_slot *slot = oh->oh_slots[cpu];
+		obj = __objpool_try_get_slot(slot);
+		if (obj)
+			break;
+		if (++cpu >= oh->oh_ncpus)
+			cpu = 0;
+	}
+
+	return obj;
+}
+EXPORT_SYMBOL_GPL(objpool_pop);
+
+/**
+ * objpool_fini: cleanup the whole object pool (releasing all objects)
+ *
+ * args:
+ * @head: object pool to be released
+ *
+ */
+void objpool_fini(struct objpool_head *oh)
+{
+	uint32_t i, flags;
+
+	if (!oh->oh_slots)
+		return;
+
+	if (!oh->oh_release) {
+		__objpool_fini_percpu_slots(oh);
+		return;
+	}
+
+	/* cleanup all objects remained in objpool */
+	for (i = 0; i < oh->oh_ncpus; i++) {
+		void *obj;
+		do {
+			flags = OBJPOOL_FLAG_NODE;
+			obj = __objpool_try_get_slot(oh->oh_slots[i]);
+			if (!obj)
+				break;
+			if (!objpool_is_inpool(obj, oh) &&
+			    !objpool_is_inslot(obj, oh)) {
+				flags |= OBJPOOL_FLAG_USER;
+			}
+			oh->oh_release(oh->oh_context, obj, flags);
+		} while (obj);
+	}
+
+	/* release percpu slots */
+	__objpool_fini_percpu_slots(oh);
+
+	/* cleanup user private pool and related context */
+	flags = OBJPOOL_FLAG_POOL;
+	if (oh->oh_pool)
+		flags |= OBJPOOL_FLAG_USER;
+	oh->oh_release(oh->oh_context, oh->oh_pool, flags);
+}
+EXPORT_SYMBOL_GPL(objpool_fini);
diff --git a/lib/test_objpool.c b/lib/test_objpool.c
new file mode 100644
index 000000000000..c1341ddf77b5
--- /dev/null
+++ b/lib/test_objpool.c
@@ -0,0 +1,1031 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Test module for lockless object pool
+ * (C) 2022 Matt Wu <wuqiang.matt@bytedance.com>
+ */
+
+#include <linux/version.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/sched.h>
+#include <linux/cpumask.h>
+#include <linux/completion.h>
+#include <linux/kthread.h>
+#include <linux/cpu.h>
+#include <linux/cpuset.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
+#include <linux/objpool.h>
+
+#define OT_NR_MAX_BULK (16)
+
+struct ot_ctrl {
+	unsigned int mode;
+	unsigned int duration; /* ms */
+	unsigned int delay; /* ms */
+	unsigned int bulk_normal;
+	unsigned int bulk_irq;
+	unsigned long hrtimer; /* ms */
+	const char *name;
+};
+
+struct ot_stat {
+	unsigned long nhits;
+	unsigned long nmiss;
+};
+
+struct ot_item {
+	struct objpool_head *pool; /* pool head */
+	struct ot_ctrl *ctrl; /* ctrl parameters */
+
+	void (*worker)(struct ot_item *item, int irq);
+
+	/* hrtimer control */
+	ktime_t hrtcycle;
+	struct hrtimer hrtimer;
+
+	int bulk[2]; /* for thread and irq */
+	int delay;
+	u32 niters;
+
+	/* results summary */
+	struct ot_stat stat[2]; /* thread and irq */
+
+	u64 duration;
+};
+
+struct ot_mem_stat {
+	atomic_long_t alloc;
+	atomic_long_t free;
+};
+
+struct ot_data {
+	struct rw_semaphore start;
+	struct completion wait;
+	struct completion rcu;
+	atomic_t nthreads ____cacheline_aligned_in_smp;
+	atomic_t stop ____cacheline_aligned_in_smp;
+	struct ot_mem_stat kmalloc;
+	struct ot_mem_stat vmalloc;
+} g_ot_data;
+
+/*
+ * memory leakage checking
+ */
+
+void *ot_kzalloc(long size)
+{
+	void *ptr = kzalloc(size, GFP_KERNEL);
+
+	if (ptr)
+		atomic_long_add(size, &g_ot_data.kmalloc.alloc);
+	return ptr;
+}
+
+void ot_kfree(void *ptr, long size)
+{
+	if (!ptr)
+		return;
+	atomic_long_add(size, &g_ot_data.kmalloc.free);
+	kfree(ptr);
+}
+
+void *ot_vmalloc(long size)
+{
+	void *ptr = vmalloc(size);
+
+	if (ptr)
+		atomic_long_add(size, &g_ot_data.vmalloc.alloc);
+	return ptr;
+}
+
+void ot_vfree(void *ptr, long size)
+{
+	if (!ptr)
+		return;
+	atomic_long_add(size, &g_ot_data.vmalloc.free);
+	vfree(ptr);
+}
+
+static void ot_mem_report(struct ot_ctrl *ctrl)
+{
+	long alloc, free;
+
+	pr_info("memory allocation summary for %s\n", ctrl->name);
+
+	alloc = atomic_long_read(&g_ot_data.kmalloc.alloc);
+	free = atomic_long_read(&g_ot_data.kmalloc.free);
+	pr_info("  kmalloc: %lu - %lu = %lu\n", alloc, free, alloc - free);
+
+	alloc = atomic_long_read(&g_ot_data.vmalloc.alloc);
+	free = atomic_long_read(&g_ot_data.vmalloc.free);
+	pr_info("  vmalloc: %lu - %lu = %lu\n", alloc, free, alloc - free);
+}
+
+/*
+ * general structs & routines
+ */
+
+struct ot_node {
+	void *owner;
+	unsigned long data;
+	unsigned long refs;
+};
+
+struct ot_context {
+	struct objpool_head pool;
+	void *ptr;
+	unsigned long size;
+	refcount_t refs;
+	struct rcu_head rcu;
+};
+
+static DEFINE_PER_CPU(struct ot_item, ot_pcup_items);
+
+static int ot_init_data(struct ot_data *data)
+{
+	memset(data, 0, sizeof(*data));
+	init_rwsem(&data->start);
+	init_completion(&data->wait);
+	init_completion(&data->rcu);
+	atomic_set(&data->nthreads, 1);
+
+	return 0;
+}
+
+static void ot_reset_data(struct ot_data *data)
+{
+	reinit_completion(&data->wait);
+	reinit_completion(&data->rcu);
+	atomic_set(&data->nthreads, 1);
+	atomic_set(&data->stop, 0);
+	memset(&data->kmalloc, 0, sizeof(data->kmalloc));
+	memset(&data->vmalloc, 0, sizeof(data->vmalloc));
+}
+
+static int ot_init_node(void *context, void *nod)
+{
+	struct ot_context *sop = context;
+	struct ot_node *on = nod;
+
+	on->owner = &sop->pool;
+	return 0;
+}
+
+static enum hrtimer_restart ot_hrtimer_handler(struct hrtimer *hrt)
+{
+	struct ot_item *item = container_of(hrt, struct ot_item, hrtimer);
+
+	if (atomic_read_acquire(&g_ot_data.stop))
+		return HRTIMER_NORESTART;
+
+	/* do bulk-testings for objects pop/push */
+	item->worker(item, 1);
+
+	hrtimer_forward(hrt, hrt->base->get_time(), item->hrtcycle);
+	return HRTIMER_RESTART;
+}
+
+static void ot_start_hrtimer(struct ot_item *item)
+{
+	if (!item->ctrl->hrtimer)
+		return;
+	hrtimer_start(&item->hrtimer, item->hrtcycle, HRTIMER_MODE_REL);
+}
+
+static void ot_stop_hrtimer(struct ot_item *item)
+{
+	if (!item->ctrl->hrtimer)
+		return;
+	hrtimer_cancel(&item->hrtimer);
+}
+
+static int ot_init_hrtimer(struct ot_item *item, unsigned long hrtimer)
+{
+	struct hrtimer *hrt = &item->hrtimer;
+
+	if (!hrtimer)
+		return -ENOENT;
+
+	item->hrtcycle = ktime_set(0, hrtimer * 1000000UL);
+	hrtimer_init(hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrt->function = ot_hrtimer_handler;
+	return 0;
+}
+
+static int ot_init_cpu_item(struct ot_item *item,
+			struct ot_ctrl *ctrl,
+			struct objpool_head *pool,
+			void (*worker)(struct ot_item *, int))
+{
+	memset(item, 0, sizeof(*item));
+	item->pool = pool;
+	item->ctrl = ctrl;
+	item->worker = worker;
+
+	item->bulk[0] = ctrl->bulk_normal;
+	item->bulk[1] = ctrl->bulk_irq;
+	item->delay = ctrl->delay;
+
+	/* initialize hrtimer */
+	ot_init_hrtimer(item, item->ctrl->hrtimer);
+	return 0;
+}
+
+static int ot_thread_worker(void *arg)
+{
+	struct ot_item *item = arg;
+	ktime_t start;
+
+	sched_set_normal(current, 50);
+
+	atomic_inc(&g_ot_data.nthreads);
+	down_read(&g_ot_data.start);
+	up_read(&g_ot_data.start);
+	start = ktime_get();
+	ot_start_hrtimer(item);
+	do {
+		if (atomic_read_acquire(&g_ot_data.stop))
+			break;
+		/* do bulk-testings for objects pop/push */
+		item->worker(item, 0);
+	} while (!kthread_should_stop());
+	ot_stop_hrtimer(item);
+	item->duration = (u64) ktime_us_delta(ktime_get(), start);
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	return 0;
+}
+
+static void ot_perf_report(struct ot_ctrl *ctrl, u64 duration)
+{
+	struct ot_stat total, normal = {0}, irq = {0};
+	int cpu, nthreads = 0;
+
+	pr_info("\n");
+	pr_info("Testing summary for %s\n", ctrl->name);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		if (!item->duration)
+			continue;
+		normal.nhits += item->stat[0].nhits;
+		normal.nmiss += item->stat[0].nmiss;
+		irq.nhits += item->stat[1].nhits;
+		irq.nmiss += item->stat[1].nmiss;
+		pr_info("CPU: %d  duration: %lluus\n", cpu, item->duration);
+		pr_info("\tthread:\t%16lu hits \t%16lu miss\n",
+			item->stat[0].nhits, item->stat[0].nmiss);
+		pr_info("\tirq:   \t%16lu hits \t%16lu miss\n",
+			item->stat[1].nhits, item->stat[1].nmiss);
+		pr_info("\ttotal: \t%16lu hits \t%16lu miss\n",
+			item->stat[0].nhits + item->stat[1].nhits,
+			item->stat[0].nmiss + item->stat[1].nmiss);
+		nthreads++;
+	}
+
+	total.nhits = normal.nhits + irq.nhits;
+	total.nmiss = normal.nmiss + irq.nmiss;
+
+	pr_info("ALL: \tnthreads: %d  duration: %lluus\n", nthreads, duration);
+	pr_info("SUM: \t%16lu hits \t%16lu miss\n",
+		total.nhits, total.nmiss);
+}
+
+/*
+ * synchronous test cases for objpool manipulation
+ */
+
+/* objpool manipulation for synchronous mode 0 (percpu objpool) */
+static struct ot_context *ot_init_sync_m0(void)
+{
+	struct ot_context *sop = NULL;
+	int max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	if (objpool_init(&sop->pool, max, max, sizeof(struct ot_node),
+			GFP_KERNEL, sop, ot_init_node, NULL)) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON(max != sop->pool.oh_nobjs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m0(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+	ot_kfree(sop, sizeof(*sop));
+}
+
+/* objpool manipulation for synchronous mode 1 (private pool) */
+static struct ot_context *ot_init_sync_m1(void)
+{
+	struct ot_context *sop = NULL;
+	unsigned long size;
+	int rc, szobj, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	size = sizeof(struct ot_node) * max;
+	sop->ptr = ot_vmalloc(size);
+	sop->size = size;
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	memset(sop->ptr, 0, size);
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL, NULL);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		sizeof(struct ot_node), sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	szobj = ALIGN(sizeof(struct ot_node), sizeof(void *));
+	WARN_ON((size / szobj) != sop->pool.oh_nobjs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m1(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+
+	ot_vfree(sop->ptr, sop->size);
+	ot_kfree(sop, sizeof(*sop));
+}
+
+/* objpool manipulation for synchronous mode 2 (private objects) */
+static int ot_objpool_release(void *context, void *ptr, uint32_t flags)
+{
+	struct ot_context *sop = context;
+
+	/* here we need release all user-allocated objects */
+	if ((flags & OBJPOOL_FLAG_NODE) && (flags & OBJPOOL_FLAG_USER)) {
+		struct ot_node *on = ptr;
+		WARN_ON(on->data != 0xDEADBEEF);
+		ot_kfree(on, sizeof(struct ot_node));
+	} else if (flags & OBJPOOL_FLAG_POOL) {
+		/* release user preallocated pool */
+		if (sop->ptr) {
+			WARN_ON(sop->ptr != ptr);
+			WARN_ON(!(flags & OBJPOOL_FLAG_USER));
+			ot_vfree(sop->ptr, sop->size);
+		}
+		/* do context cleaning if needed */
+		ot_kfree(sop, sizeof(*sop));
+	}
+
+	return 0;
+}
+
+static struct ot_context *ot_init_sync_m2(void)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	int rc, i, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < max; i++) {
+		on = ot_kzalloc(sizeof(struct ot_node));
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+		}
+	}
+	WARN_ON(max != sop->pool.oh_nobjs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m2(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+}
+
+/* objpool manipulation for synchronous mode 3 (mixed mode) */
+static struct ot_context *ot_init_sync_m3(void)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	unsigned long size;
+	int rc, i, szobj, nobjs;
+	int max = num_possible_cpus() << 4;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	/* create and initialize objpool as empty (no objects) */
+	nobjs = num_possible_cpus() * 2;
+	rc = objpool_init(&sop->pool, nobjs, max, sizeof(struct ot_node),
+			GFP_KERNEL, sop, ot_init_node, ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	size = sizeof(struct ot_node) * num_possible_cpus() * 4;
+	sop->ptr = ot_vmalloc(size);
+	if (!sop->ptr) {
+		objpool_fini(&sop->pool);
+		return NULL;
+	}
+	sop->size = size;
+	memset(sop->ptr, 0, size);
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		sizeof(struct ot_node), sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+	szobj = ALIGN(sizeof(struct ot_node), sizeof(void *));
+	nobjs += size / szobj;
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < num_possible_cpus() * 2; i++) {
+		on = ot_kzalloc(sizeof(struct ot_node));
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			if (!objpool_add(on, &sop->pool))
+				nobjs++;
+			else
+				ot_kfree(on, sizeof(struct ot_node));
+		}
+	}
+	WARN_ON(nobjs != sop->pool.oh_nobjs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m3(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+}
+
+struct {
+	struct ot_context * (*init)(void);
+	void (*fini)(struct ot_context *sop);
+} g_ot_sync_ops[4] = {
+	{ot_init_sync_m0, ot_fini_sync_m0},
+	{ot_init_sync_m1, ot_fini_sync_m1},
+	{ot_init_sync_m2, ot_fini_sync_m2},
+	{ot_init_sync_m3, ot_fini_sync_m3},
+};
+
+/*
+ * synchronous test cases: performance mode
+ */
+
+static void ot_bulk_sync(struct ot_item *item, int irq)
+{
+	struct ot_node *nods[OT_NR_MAX_BULK];
+	int i;
+
+	for (i = 0; i < item->bulk[irq]; i++)
+		nods[i] = objpool_pop(item->pool);
+
+	if (!irq && (item->delay || !(++(item->niters) & 0x7FFF)))
+		msleep(item->delay);
+
+	while (i-- > 0) {
+		struct ot_node *on = nods[i];
+		if (on) {
+			on->refs++;
+			objpool_push(on, item->pool);
+			item->stat[irq].nhits++;
+		} else {
+			item->stat[irq].nmiss++;
+		}
+	}
+}
+
+static int ot_start_sync(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop;
+	ktime_t start;
+	u64 duration;
+	unsigned long timeout;
+	int cpu, rc;
+
+	/* initialize objpool for syncrhonous testcase */
+	sop = g_ot_sync_ops[ctrl->mode].init();
+	if (!sop)
+		return -ENOMEM;
+
+	/* grab rwsem to block testing threads */
+	down_write(&g_ot_data.start);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		struct task_struct *work;
+
+		ot_init_cpu_item(item, ctrl, &sop->pool, ot_bulk_sync);
+
+		/* skip offline cpus */
+		if (!cpu_online(cpu))
+			continue;
+
+		work = kthread_create_on_node(ot_thread_worker, item,
+				cpu_to_node(cpu), "ot_worker_%d", cpu);
+		if (IS_ERR(work)) {
+			pr_err("failed to create thread for cpu %d\n", cpu);
+		} else {
+			kthread_bind(work, cpu);
+			wake_up_process(work);
+		}
+	}
+
+	/* wait a while to make sure all threads waiting at start line */
+	msleep(20);
+
+	/* in case no threads were created: memory insufficient ? */
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	// sched_set_fifo_low(current);
+
+	/* start objpool testing threads */
+	start = ktime_get();
+	up_write(&g_ot_data.start);
+
+	/* yeild cpu to worker threads for duration ms */
+	timeout = msecs_to_jiffies(ctrl->duration);
+	rc = schedule_timeout_interruptible(timeout);
+
+	/* tell workers threads to quit */
+	atomic_set_release(&g_ot_data.stop, 1);
+
+	/* wait all workers threads finish and quit */
+	wait_for_completion(&g_ot_data.wait);
+	duration = (u64) ktime_us_delta(ktime_get(), start);
+
+	/* cleanup objpool */
+	g_ot_sync_ops[ctrl->mode].fini(sop);
+
+	/* report testing summary and performance results */
+	ot_perf_report(ctrl, duration);
+
+	/* report memory allocation summary */
+	ot_mem_report(ctrl);
+
+	return rc;
+}
+
+/*
+ * asynchronous test cases: pool lifecycle controlled by refcount
+ */
+
+static void ot_fini_async_rcu(struct rcu_head *rcu)
+{
+	struct ot_context *sop = container_of(rcu, struct ot_context, rcu);
+	struct ot_node *on;
+
+	/* here all cpus are aware of the stop event: g_ot_data.stop = 1 */
+	WARN_ON(!atomic_read_acquire(&g_ot_data.stop));
+
+	do {
+		/* release all objects remained in objpool */
+		on = objpool_pop(&sop->pool);
+		if (on && !objpool_is_inslot(on, &sop->pool) &&
+			!objpool_is_inpool(on, &sop->pool)) {
+			/* private object managed by user */
+			WARN_ON(on->data != 0xDEADBEEF);
+			ot_kfree(on, sizeof(struct ot_node));
+		}
+
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&sop->refs)) {
+			objpool_fini(&sop->pool);
+			break;
+		}
+	} while (on);
+
+	complete(&g_ot_data.rcu);
+}
+
+static void ot_fini_async(struct ot_context *sop)
+{
+	/* make sure the stop event is acknowledged by all cores */
+	call_rcu(&sop->rcu, ot_fini_async_rcu);
+}
+
+static struct ot_context *ot_init_async_m0(void)
+{
+	struct ot_context *sop = NULL;
+	int max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	if (objpool_init(&sop->pool, max, max, sizeof(struct ot_node),
+			GFP_KERNEL, sop, ot_init_node, ot_objpool_release)) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON(max != sop->pool.oh_nobjs);
+	refcount_set(&sop->refs, max + 1);
+
+	return sop;
+}
+
+static struct ot_context *ot_init_async_m1(void)
+{
+	struct ot_context *sop = NULL;
+	unsigned long size;
+	int szobj, rc, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	size = sizeof(struct ot_node) * max;
+	sop->ptr = ot_vmalloc(size);
+	sop->size = size;
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	memset(sop->ptr, 0, size);
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		sizeof(struct ot_node), sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+
+	/* calculate total number of objects stored in ptr */
+	szobj = ALIGN(sizeof(struct ot_node), sizeof(void *));
+	WARN_ON(size / szobj != sop->pool.oh_nobjs);
+	refcount_set(&sop->refs, size / szobj + 1);
+
+	return sop;
+}
+
+static struct ot_context *ot_init_async_m2(void)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	int rc, i, nobjs = 0, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < max; i++) {
+		on = ot_kzalloc(sizeof(struct ot_node));
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+			nobjs++;
+		}
+	}
+	WARN_ON(nobjs != sop->pool.oh_nobjs);
+	refcount_set(&sop->refs, nobjs + 1);
+
+	return sop;
+}
+
+/* objpool manipulation for synchronous mode 3 (mixed mode) */
+static struct ot_context *ot_init_async_m3(void)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	unsigned long size;
+	int szobj, nobjs, rc, i, max = num_possible_cpus() << 4;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+
+	/* create and initialize objpool as empty (no objects) */
+	nobjs = num_possible_cpus() * 2;
+	rc = objpool_init(&sop->pool, nobjs, max, sizeof(struct ot_node),
+			GFP_KERNEL, sop, ot_init_node, ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	size = sizeof(struct ot_node) * num_possible_cpus() * 4;
+	sop->ptr = ot_vmalloc(size);
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	sop->size = size;
+	memset(sop->ptr, 0, size);
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		sizeof(struct ot_node), sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+
+	/* calculate total number of objects stored in ptr */
+	szobj = ALIGN(sizeof(struct ot_node), sizeof(void *));
+	nobjs += size / szobj;
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < num_possible_cpus() * 2; i++) {
+		on = ot_kzalloc(sizeof(struct ot_node));
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+			nobjs++;
+		}
+	}
+	WARN_ON(nobjs != sop->pool.oh_nobjs);
+	refcount_set(&sop->refs, nobjs + 1);
+
+	return sop;
+}
+
+struct {
+	struct ot_context * (*init)(void);
+	void (*fini)(struct ot_context *sop);
+} g_ot_async_ops[4] = {
+	{ot_init_async_m0, ot_fini_async},
+	{ot_init_async_m1, ot_fini_async},
+	{ot_init_async_m2, ot_fini_async},
+	{ot_init_async_m3, ot_fini_async},
+};
+
+static void ot_nod_recycle(struct ot_node *on, struct objpool_head *pool,
+			int release)
+{
+	struct ot_context *sop;
+
+	on->refs++;
+
+	if (!release) {
+		/* push object back to opjpool for reuse */
+		objpool_push(on, pool);
+		return;
+	}
+
+	sop = container_of(pool, struct ot_context, pool);
+	WARN_ON(sop != pool->oh_context);
+
+	if (objpool_is_inslot(on, pool)) {
+		/* object is alloced from percpu slots */
+	} else if (objpool_is_inpool(on, pool)) {
+		/* object is alloced from user-manged pool */
+	} else {
+		/* private object managed by user */
+		WARN_ON(on->data != 0xDEADBEEF);
+		ot_kfree(on, sizeof(struct ot_node));
+	}
+
+	/* unref objpool with nod removed forever */
+	if (refcount_dec_and_test(&sop->refs))
+		objpool_fini(pool);
+}
+
+static void ot_bulk_async(struct ot_item *item, int irq)
+{
+	struct ot_node *nods[OT_NR_MAX_BULK];
+	int i, stop;
+
+	for (i = 0; i < item->bulk[irq]; i++)
+		nods[i] = objpool_pop(item->pool);
+
+	if (!irq) {
+		if (item->delay || !(++(item->niters) & 0x7FFF))
+			msleep(item->delay);
+		get_cpu();
+	}
+
+	stop = atomic_read_acquire(&g_ot_data.stop);
+
+	/* drop all objects and deref objpool */
+	while (i-- > 0) {
+		struct ot_node *on = nods[i];
+
+		if (on) {
+			on->refs++;
+			ot_nod_recycle(on, item->pool, stop);
+			item->stat[irq].nhits++;
+		} else {
+			item->stat[irq].nmiss++;
+		}
+	}
+
+	if (!irq)
+		put_cpu();
+}
+
+static int ot_start_async(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop;
+	ktime_t start;
+	u64 duration;
+	unsigned long timeout;
+	int cpu, rc;
+
+	/* initialize objpool for syncrhonous testcase */
+	sop = g_ot_async_ops[ctrl->mode].init();
+	if (!sop)
+		return -ENOMEM;
+
+	/* grab rwsem to block testing threads */
+	down_write(&g_ot_data.start);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		struct task_struct *work;
+
+		ot_init_cpu_item(item, ctrl, &sop->pool, ot_bulk_async);
+
+		/* skip offline cpus */
+		if (!cpu_online(cpu))
+			continue;
+
+		work = kthread_create_on_node(ot_thread_worker, item,
+				cpu_to_node(cpu), "ot_worker_%d", cpu);
+		if (IS_ERR(work)) {
+			pr_err("failed to create thread for cpu %d\n", cpu);
+		} else {
+			kthread_bind(work, cpu);
+			wake_up_process(work);
+		}
+	}
+
+	/* wait a while to make sure all threads waiting at start line */
+	msleep(20);
+
+	/* in case no threads were created: memory insufficient ? */
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	/* start objpool testing threads */
+	start = ktime_get();
+	up_write(&g_ot_data.start);
+
+	/* yeild cpu to worker threads for duration ms */
+	timeout = msecs_to_jiffies(ctrl->duration);
+	rc = schedule_timeout_interruptible(timeout);
+
+	/* tell workers threads to quit */
+	atomic_set_release(&g_ot_data.stop, 1);
+
+	/* do async-finalization */
+	g_ot_async_ops[ctrl->mode].fini(sop);
+
+	/* wait all workers threads finish and quit */
+	wait_for_completion(&g_ot_data.wait);
+	duration = (u64) ktime_us_delta(ktime_get(), start);
+
+	/* assure rcu callback is triggered */
+	wait_for_completion(&g_ot_data.rcu);
+
+	/*
+	 * now we are sure that objpool is finalized either
+	 * by rcu callback or by worker threads
+	 */
+
+	/* report testing summary and performance results */
+	ot_perf_report(ctrl, duration);
+
+	/* report memory allocation summary */
+	ot_mem_report(ctrl);
+
+	return rc;
+}
+
+/*
+ * predefined testing cases:
+ *   4 synchronous cases / 4 overrun cases / 2 async cases
+ *
+ * mode: unsigned int, could be 0/1/2/3, see name
+ * duration: unsigned int, total test time in ms
+ * delay: unsigned int, delay (in ms) between each iteration
+ * bulk_normal: unsigned int, repeat times for thread worker
+ * bulk_irq: unsigned int, repeat times for irq consumer
+ * hrtimer: unsigned long, hrtimer intervnal in ms
+ * name: char *, tag for current test ot_item
+ */
+
+struct ot_ctrl g_ot_sync[] = {
+	{0, 1000, 0,  1,  0,  0, "sync: percpu objpool"},
+	{1, 1000, 0,  1,  0,  0, "sync: user objpool"},
+	{2, 1000, 0,  1,  0,  0, "sync: user objects"},
+	{3, 1000, 0,  1,  0,  0, "sync: mixed pools & objs"},
+};
+
+struct ot_ctrl g_ot_miss[] = {
+	{0, 1000, 0, 16,  0,  0, "sync overrun: percpu objpool"},
+	{1, 1000, 0, 16,  0,  0, "sync overrun: user objpool"},
+	{2, 1000, 0, 16,  0,  0, "sync overrun: user objects"},
+	{3, 1000, 0, 16,  0,  0, "sync overrun: mixed pools & objs"},
+};
+
+struct ot_ctrl g_ot_async[] = {
+	{0, 1000, 4,  8,  8,  6, "async: percpu objpool"},
+	{1, 1000, 4,  8,  8,  6, "async: user objpool"},
+	{2, 1000, 4,  8,  8,  6, "async: user objects"},
+	{3, 1000, 4,  8,  8,  6, "async: mixed pools & objs"},
+};
+
+static int __init ot_mod_init(void)
+{
+	int i;
+
+	ot_init_data(&g_ot_data);
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_sync); i++) {
+		if (ot_start_sync(&g_ot_sync[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_miss); i++) {
+		if (ot_start_sync(&g_ot_miss[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_async); i++) {
+		if (ot_start_async(&g_ot_async[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+out:
+	return -EAGAIN;
+}
+
+static void __exit ot_mod_exit(void)
+{
+}
+
+module_init(ot_mod_init);
+module_exit(ot_mod_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Matt Wu");
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v4] kprobes,lib: kretprobe scalability improvement
  2022-11-02  2:30 ` [PATCH v4] " wuqiang
@ 2022-11-02 21:33   ` Andrew Morton
  2022-11-03 16:46     ` wuqiang
  2022-11-03  2:51   ` Masami Hiramatsu
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Andrew Morton @ 2022-11-02 21:33 UTC (permalink / raw)
  To: wuqiang
  Cc: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel,
	mattwu, kernel test robot

On Wed,  2 Nov 2022 10:30:12 +0800 wuqiang <wuqiang.matt@bytedance.com> wrote:

> Tests of
> kretprobe throughput show the biggest ratio as 333.9x of the original
> freelist.

Seriously.

I'll add this for some runtime testing.

Are you able to identify other parts of the kernel which could use
(and benefit from) the new objpool?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4] kprobes,lib: kretprobe scalability improvement
  2022-11-02  2:30 ` [PATCH v4] " wuqiang
  2022-11-02 21:33   ` Andrew Morton
@ 2022-11-03  2:51   ` Masami Hiramatsu
  2022-11-03 16:45     ` wuqiang
  2022-11-06  5:34   ` [PATCH v5 0/4] lib,kprobes: " wuqiang
  2022-11-08  7:14   ` [PATCH v6 0/4] lib,kprobes: kretprobe scalability improvement wuqiang
  3 siblings, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2022-11-03  2:51 UTC (permalink / raw)
  To: wuqiang
  Cc: davem, anil.s.keshavamurthy, naveen.n.rao, rostedt, peterz, akpm,
	sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel, mattwu,
	kernel test robot

Hi wuqiang (or Matt?),

Thanks for updating the patch! I have some comments below,

On Wed,  2 Nov 2022 10:30:12 +0800
wuqiang <wuqiang.matt@bytedance.com> wrote:

> kretprobe is using freelist to manage return-instances, but freelist,
> as LIFO queue based on singly linked list, scales badly and reduces
> the overall throughput of kretprobed routines, especially for high
> contention scenarios.
> 
> Here's a typical throughput test of sys_flock (counts in 10 seconds,
> measured with perf stat -a -I 10000 -e syscalls:sys_enter_flock):
> 
> OS: Debian 10 X86_64, Linux 6.1rc2
> HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
> 
>       1X       2X       4X       6X       8X      12X      16X
> 34762430 36546920 17949900 13101899 12569595 12646601 14729195
>      24X      32X      48X      64X      72X      96X     128X
> 19263546 10102064  8985418 11936495 11493980  7127789  9330985
> 
> This patch implements a scalable, lock-less and numa-aware object pool,
> which brings near-linear scalability to kretprobed routines. Tests of
> kretprobe throughput show the biggest ratio as 333.9x of the original
> freelist. Here's the comparison:
> 
>                   1X         2X         4X         8X        16X
> freelist:   34762430   36546920   17949900   12569595   14729195
> objpool:    35627544   72182095  144068494  287564688  576903916
>                  32X        48X        64X        96X       128X
> freelist:   10102064    8985418   11936495    7127789    9330985
> objpool:  1158876372 1737828164 2324371724 2380310472 2463182819
> 
> Tests on 96-core ARM64 system output similarly, but with the biggest
> ratio up to 642.2x:
> 
> OS: Debian 10 AARCH64, Linux 6.1rc2
> HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s
> 
>                   1X         2X         4X         8X        16X
> freelist:   17498299   10887037   10224710    8499132    6421751
> objpool:    18715726   35549845   71615884  144258971  283707220
>                  24X        32X        48X        64X        96X
> freelist:    5339868    4819116    3593919    3121575    2687167
> objpool:   419830913  571609748  877456139 1143316315 1725668029
> 
> The object pool, leveraging percpu ring-array to mitigate hot spots
> of memory contention, could deliver near-linear scalability for high
> parallel scenarios. The ring-array is compactly managed in a single
> cacheline (64 bytes) to benefit from warmed L1 cache for most cases
> (<= 4 instances per core) and objects are managed in the continuous
> cachelines just after ring-array.
> 
> Changes since V3:
> 1) build warning: unused variable in fprobe_init_rethook
> Reported-by: kernel test robot <lkp@intel.com>
> 
> Changes since V2:
> 1) the percpu-extended version of the freelist replaced by new percpu-
>    ring-array. freelist has data-contention in freelist_node (refs and
>    next) even after node is removed from freelist and the node could
>    be polluted easily (with freelist_node defined in union)
> 2) routines split to objpool.h and objpool.c according to cold & hot
>    pathes, and the latter moved to lib, as suggested by Masami
> 3) test module (test_objpool.ko) added to lib for functional testings
> 
> Changes since V1:
> 1) reformat to a single patch as Masami Hiramatsu suggested
> 2) use __vmalloc_node to replace vmalloc_node for vmalloc
> 3) a few minor fixes: typo and coding-style issues

Recording change log is very good. But if it becomes too long,
you can put a URL of the previous series on LKML instead of
appending the change logs.
You can get the URL (permlink) by "lkml.kernel.org/r/<your-message-id>"

> 
> Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
> ---
>  include/linux/freelist.h |  129 -----
>  include/linux/kprobes.h  |    9 +-
>  include/linux/objpool.h  |  151 ++++++
>  include/linux/rethook.h  |   15 +-
>  kernel/kprobes.c         |   95 ++--
>  kernel/trace/fprobe.c    |   17 +-
>  kernel/trace/rethook.c   |   80 +--
>  lib/Kconfig.debug        |   11 +
>  lib/Makefile             |    4 +-
>  lib/objpool.c            |  480 ++++++++++++++++++
>  lib/test_objpool.c       | 1031 ++++++++++++++++++++++++++++++++++++++
>  11 files changed, 1772 insertions(+), 250 deletions(-)

Hmm, this does too much things in 1 patch.
Can you split this in below 5 patches? This also makes clear that who
needs to review which part.

- Add generic scalable objpool
- Add objpool test
- Use objpool in kretprobe
- Use objpool in fprobe and rethook
- Remove unused freelist

>  delete mode 100644 include/linux/freelist.h
>  create mode 100644 include/linux/objpool.h
>  create mode 100644 lib/objpool.c
>  create mode 100644 lib/test_objpool.c
> 
[...]
> +
> +struct objpool_slot {
> +	uint32_t                os_head;	/* head of ring array */

If all fields have "os_" prefix, it is meaningless.

> +	uint32_t                os_tail;	/* tail of ring array */
> +	uint32_t                os_size;	/* max item slots, pow of 2 */
> +	uint32_t                os_mask;	/* os_size - 1 */
> +/*
> + *	uint32_t                os_ages[];	// ring epoch id
> + *	void                   *os_ents[];	// objects array

"entries[]"

> + */
> +};
> +
> +/* caller-specified object initial callback to setup each object, only called once */
> +typedef int (*objpool_init_node_cb)(void *context, void *obj);
> +
> +/* caller-specified cleanup callback for private objects/pool/context */
> +typedef int (*objpool_release_cb)(void *context, void *ptr, uint32_t flags);
> +
> +/* called for object releasing: ptr points to an object */
> +#define OBJPOOL_FLAG_NODE        (0x00000001)
> +/* for user pool and context releasing, ptr could be NULL */
> +#define OBJPOOL_FLAG_POOL        (0x00001000)
> +/* the object or pool to be released is user-managed */
> +#define OBJPOOL_FLAG_USER        (0x00008000)
> +
> +/*
> + * objpool_head: object pooling metadata
> + */
> +
> +struct objpool_head {
> +	uint32_t                oh_objsz;	/* object & element size */
> +	uint32_t                oh_nobjs;	/* total objs (pre-allocated) */
> +	uint32_t                oh_nents;	/* max objects per cpuslot */
> +	uint32_t                oh_ncpus;	/* num of possible cpus */

If all fields have "oh_" prefix, it is meaningless.
Also, if there is no reason to be 32bit (like align the structure size
for cache, or pack the structure for streaming etc.) use appropriate types.

And please do not align the length of field name unnaturally. e.g.

size_t obj_size; /* size_t or unsigned int, I don't care. */
int nr_objs; /* I think just 'int' is enough because the value should be
               checked and limited under INT_MAX */
int max_entries;
unsigned int nr_cpus;

(BTW why we need to limit the nr_cpus here? we have num_possible_cpus())

> +	uint32_t                oh_in_user:1;	/* user-specified buffer */
> +	uint32_t                oh_in_slot:1;	/* objs alloced with slots */
> +	uint32_t                oh_vmalloc:1;	/* alloc from vmalloc zone */

Please use "bool" or "unsigned long flags" with flag bits.

> +	gfp_t                   oh_gfp;		/* k/vmalloc gfp flags */
> +	uint32_t                oh_sz_pool;	/* user pool size in byes */

size_t pool_size

> +	void                   *oh_pool;	/* user managed memory pool */
> +	struct objpool_slot   **oh_slots;	/* array of percpu slots */
> +	uint32_t               *oh_sz_slots;	/* size in bytes of slots */

size_t slot_size

> +	objpool_release_cb      oh_release;	/* resource cleanup callback */
> +	void                   *oh_context;	/* caller-provided context */
> +};

Thank you,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4] kprobes,lib: kretprobe scalability improvement
  2022-11-03  2:51   ` Masami Hiramatsu
@ 2022-11-03 16:45     ` wuqiang
  2022-11-04  1:28       ` Masami Hiramatsu
  0 siblings, 1 reply; 23+ messages in thread
From: wuqiang @ 2022-11-03 16:45 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: davem, anil.s.keshavamurthy, naveen.n.rao, rostedt, peterz, akpm,
	sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel, mattwu,
	kernel test robot

On 2022/11/3 10:51, Masami Hiramatsu (Google) wrote:
> Hi wuqiang (or Matt?),
> 
> Thanks for updating the patch! I have some comments below,

Thanks for your time :)

> On Wed,  2 Nov 2022 10:30:12 +0800
> wuqiang <wuqiang.matt@bytedance.com> wrote:
> 
...
>> Changes since V3:
>> 1) build warning: unused variable in fprobe_init_rethook
>> Reported-by: kernel test robot <lkp@intel.com>
>>
>> Changes since V2:
>> 1) the percpu-extended version of the freelist replaced by new percpu-
>>     ring-array. freelist has data-contention in freelist_node (refs and
>>     next) even after node is removed from freelist and the node could
>>     be polluted easily (with freelist_node defined in union)
>> 2) routines split to objpool.h and objpool.c according to cold & hot
>>     pathes, and the latter moved to lib, as suggested by Masami
>> 3) test module (test_objpool.ko) added to lib for functional testings
>>
>> Changes since V1:
>> 1) reformat to a single patch as Masami Hiramatsu suggested
>> 2) use __vmalloc_node to replace vmalloc_node for vmalloc
>> 3) a few minor fixes: typo and coding-style issues
> 
> Recording change log is very good. But if it becomes too long,
> you can put a URL of the previous series on LKML instead of
> appending the change logs.
> You can get the URL (permlink) by "lkml.kernel.org/r/<your-message-id>"

Got it.

>>
>> Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
>> ---
>>   include/linux/freelist.h |  129 -----
>>   include/linux/kprobes.h  |    9 +-
>>   include/linux/objpool.h  |  151 ++++++
>>   include/linux/rethook.h  |   15 +-
>>   kernel/kprobes.c         |   95 ++--
>>   kernel/trace/fprobe.c    |   17 +-
>>   kernel/trace/rethook.c   |   80 +--
>>   lib/Kconfig.debug        |   11 +
>>   lib/Makefile             |    4 +-
>>   lib/objpool.c            |  480 ++++++++++++++++++
>>   lib/test_objpool.c       | 1031 ++++++++++++++++++++++++++++++++++++++
>>   11 files changed, 1772 insertions(+), 250 deletions(-)
> 
> Hmm, this does too much things in 1 patch.
> Can you split this in below 5 patches? This also makes clear that who
> needs to review which part.

I was ever considering of splitting, but finally decided to mix them in a big 
one mostly because it's only for kretprobe improvement.

Next version I'll split to smaller patches. Thank you for the advice.

> 
> - Add generic scalable objpool
> - Add objpool test
> - Use objpool in kretprobe
> - Use objpool in fprobe and rethook
> - Remove unused freelist
> 
>>   delete mode 100644 include/linux/freelist.h
>>   create mode 100644 include/linux/objpool.h
>>   create mode 100644 lib/objpool.c
>>   create mode 100644 lib/test_objpool.c
>>
> [...]
>> +
>> +struct objpool_slot {
>> +	uint32_t                os_head;	/* head of ring array */
> 
> If all fields have "os_" prefix, it is meaningless.
> 
>> +	uint32_t                os_tail;	/* tail of ring array */
>> +	uint32_t                os_size;	/* max item slots, pow of 2 */
>> +	uint32_t                os_mask;	/* os_size - 1 */
>> +/*
>> + *	uint32_t                os_ages[];	// ring epoch id
>> + *	void                   *os_ents[];	// objects array
> 
> "entries[]"

I'll refine the comments here to better explain the memory layout.

> 
>> + */
>> +};
>> +
>> +/* caller-specified object initial callback to setup each object, only called once */
>> +typedef int (*objpool_init_node_cb)(void *context, void *obj);
>> +
>> +/* caller-specified cleanup callback for private objects/pool/context */
>> +typedef int (*objpool_release_cb)(void *context, void *ptr, uint32_t flags);
>> +
>> +/* called for object releasing: ptr points to an object */
>> +#define OBJPOOL_FLAG_NODE        (0x00000001)
>> +/* for user pool and context releasing, ptr could be NULL */
>> +#define OBJPOOL_FLAG_POOL        (0x00001000)
>> +/* the object or pool to be released is user-managed */
>> +#define OBJPOOL_FLAG_USER        (0x00008000)
>> +
>> +/*
>> + * objpool_head: object pooling metadata
>> + */
>> +
>> +struct objpool_head {
>> +	uint32_t                oh_objsz;	/* object & element size */
>> +	uint32_t                oh_nobjs;	/* total objs (pre-allocated) */
>> +	uint32_t                oh_nents;	/* max objects per cpuslot */
>> +	uint32_t                oh_ncpus;	/* num of possible cpus */
> 
> If all fields have "oh_" prefix, it is meaningless.
> Also, if there is no reason to be 32bit (like align the structure size
> for cache, or pack the structure for streaming etc.) use appropriate types.
> 
> And please do not align the length of field name unnaturally. e.g.

Kind of obsessive-compulsive symptom, haha :) The struct size of objpool_head 
doesn't matter. The size of objpool_slot does matter, as managed in a single 
cache-line.

> 
> size_t obj_size; /* size_t or unsigned int, I don't care. */
> int nr_objs; /* I think just 'int' is enough because the value should be
>                 checked and limited under INT_MAX */
> int max_entries;
> unsigned int nr_cpus;
> 
> (BTW why we need to limit the nr_cpus here? we have num_possible_cpus())

You are right that nr_cpus is unnecessary. num_possible_cpus() just keeps the 
same even with new hot-plugged cpus.

> 
>> +	uint32_t                oh_in_user:1;	/* user-specified buffer */
>> +	uint32_t                oh_in_slot:1;	/* objs alloced with slots */
>> +	uint32_t                oh_vmalloc:1;	/* alloc from vmalloc zone */
> 
> Please use "bool" or "unsigned long flags" with flag bits.
> 
>> +	gfp_t                   oh_gfp;		/* k/vmalloc gfp flags */
>> +	uint32_t                oh_sz_pool;	/* user pool size in byes */
> 
> size_t pool_size
> 
>> +	void                   *oh_pool;	/* user managed memory pool */
>> +	struct objpool_slot   **oh_slots;	/* array of percpu slots */
>> +	uint32_t               *oh_sz_slots;	/* size in bytes of slots */
> 
> size_t slot_size
> 

Will apply these changes in next version.

>> +	objpool_release_cb      oh_release;	/* resource cleanup callback */
>> +	void                   *oh_context;	/* caller-provided context */
>> +};
> 
> Thank you,
> 

Best regards,


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4] kprobes,lib: kretprobe scalability improvement
  2022-11-02 21:33   ` Andrew Morton
@ 2022-11-03 16:46     ` wuqiang
  0 siblings, 0 replies; 23+ messages in thread
From: wuqiang @ 2022-11-03 16:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel,
	mattwu, kernel test robot

On 2022/11/3 05:33, Andrew Morton wrote:
> On Wed,  2 Nov 2022 10:30:12 +0800 wuqiang <wuqiang.matt@bytedance.com> wrote:
> 
>> Tests of
>> kretprobe throughput show the biggest ratio as 333.9x of the original
>> freelist.
> 
> Seriously.
> 
> I'll add this for some runtime testing.

Thanks.

> 
> Are you able to identify other parts of the kernel which could use
> (and benefit from) the new objpool?

The scalability issue is caused by freelist. Currently kretprobe and rethook 
are the only use cases.

I'm working on the evaluation of bpf percpu-freelist, which scales well but 
uses raw_spinlock and needs local irq disabled.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4] kprobes,lib: kretprobe scalability improvement
  2022-11-03 16:45     ` wuqiang
@ 2022-11-04  1:28       ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2022-11-04  1:28 UTC (permalink / raw)
  To: wuqiang
  Cc: davem, anil.s.keshavamurthy, naveen.n.rao, rostedt, peterz, akpm,
	sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel, mattwu,
	kernel test robot

On Fri, 4 Nov 2022 00:45:19 +0800
wuqiang <wuqiang.matt@bytedance.com> wrote:

> On 2022/11/3 10:51, Masami Hiramatsu (Google) wrote:
> > Hi wuqiang (or Matt?),
> > 
> > Thanks for updating the patch! I have some comments below,
> 
> Thanks for your time :)
> 
> > On Wed,  2 Nov 2022 10:30:12 +0800
> > wuqiang <wuqiang.matt@bytedance.com> wrote:
> > 
> ...
> >> Changes since V3:
> >> 1) build warning: unused variable in fprobe_init_rethook
> >> Reported-by: kernel test robot <lkp@intel.com>
> >>
> >> Changes since V2:
> >> 1) the percpu-extended version of the freelist replaced by new percpu-
> >>     ring-array. freelist has data-contention in freelist_node (refs and
> >>     next) even after node is removed from freelist and the node could
> >>     be polluted easily (with freelist_node defined in union)
> >> 2) routines split to objpool.h and objpool.c according to cold & hot
> >>     pathes, and the latter moved to lib, as suggested by Masami
> >> 3) test module (test_objpool.ko) added to lib for functional testings
> >>
> >> Changes since V1:
> >> 1) reformat to a single patch as Masami Hiramatsu suggested
> >> 2) use __vmalloc_node to replace vmalloc_node for vmalloc
> >> 3) a few minor fixes: typo and coding-style issues
> > 
> > Recording change log is very good. But if it becomes too long,
> > you can put a URL of the previous series on LKML instead of
> > appending the change logs.
> > You can get the URL (permlink) by "lkml.kernel.org/r/<your-message-id>"
> 
> Got it.
> 
> >>
> >> Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
> >> ---
> >>   include/linux/freelist.h |  129 -----
> >>   include/linux/kprobes.h  |    9 +-
> >>   include/linux/objpool.h  |  151 ++++++
> >>   include/linux/rethook.h  |   15 +-
> >>   kernel/kprobes.c         |   95 ++--
> >>   kernel/trace/fprobe.c    |   17 +-
> >>   kernel/trace/rethook.c   |   80 +--
> >>   lib/Kconfig.debug        |   11 +
> >>   lib/Makefile             |    4 +-
> >>   lib/objpool.c            |  480 ++++++++++++++++++
> >>   lib/test_objpool.c       | 1031 ++++++++++++++++++++++++++++++++++++++
> >>   11 files changed, 1772 insertions(+), 250 deletions(-)
> > 
> > Hmm, this does too much things in 1 patch.
> > Can you split this in below 5 patches? This also makes clear that who
> > needs to review which part.
> 
> I was ever considering of splitting, but finally decided to mix them in a big 
> one mostly because it's only for kretprobe improvement.
> 
> Next version I'll split to smaller patches. Thank you for the advice.
> 
> > 
> > - Add generic scalable objpool
> > - Add objpool test
> > - Use objpool in kretprobe
> > - Use objpool in fprobe and rethook
> > - Remove unused freelist
> > 
> >>   delete mode 100644 include/linux/freelist.h
> >>   create mode 100644 include/linux/objpool.h
> >>   create mode 100644 lib/objpool.c
> >>   create mode 100644 lib/test_objpool.c
> >>
> > [...]
> >> +
> >> +struct objpool_slot {
> >> +	uint32_t                os_head;	/* head of ring array */
> > 
> > If all fields have "os_" prefix, it is meaningless.
> > 
> >> +	uint32_t                os_tail;	/* tail of ring array */
> >> +	uint32_t                os_size;	/* max item slots, pow of 2 */
> >> +	uint32_t                os_mask;	/* os_size - 1 */
> >> +/*
> >> + *	uint32_t                os_ages[];	// ring epoch id
> >> + *	void                   *os_ents[];	// objects array
> > 
> > "entries[]"
> 
> I'll refine the comments here to better explain the memory layout.
> 
> > 
> >> + */
> >> +};
> >> +
> >> +/* caller-specified object initial callback to setup each object, only called once */
> >> +typedef int (*objpool_init_node_cb)(void *context, void *obj);
> >> +
> >> +/* caller-specified cleanup callback for private objects/pool/context */
> >> +typedef int (*objpool_release_cb)(void *context, void *ptr, uint32_t flags);
> >> +
> >> +/* called for object releasing: ptr points to an object */
> >> +#define OBJPOOL_FLAG_NODE        (0x00000001)
> >> +/* for user pool and context releasing, ptr could be NULL */
> >> +#define OBJPOOL_FLAG_POOL        (0x00001000)
> >> +/* the object or pool to be released is user-managed */
> >> +#define OBJPOOL_FLAG_USER        (0x00008000)
> >> +
> >> +/*
> >> + * objpool_head: object pooling metadata
> >> + */
> >> +
> >> +struct objpool_head {
> >> +	uint32_t                oh_objsz;	/* object & element size */
> >> +	uint32_t                oh_nobjs;	/* total objs (pre-allocated) */
> >> +	uint32_t                oh_nents;	/* max objects per cpuslot */
> >> +	uint32_t                oh_ncpus;	/* num of possible cpus */
> > 
> > If all fields have "oh_" prefix, it is meaningless.
> > Also, if there is no reason to be 32bit (like align the structure size
> > for cache, or pack the structure for streaming etc.) use appropriate types.
> > 
> > And please do not align the length of field name unnaturally. e.g.
> 
> Kind of obsessive-compulsive symptom, haha :) The struct size of objpool_head 
> doesn't matter. The size of objpool_slot does matter, as managed in a single 
> cache-line.

Yeah, so objpool_slot is good to use uint32_t. You may also need __packed and
__cacheline_aligned for objpool_slot ;)

Thank you!

> 
> > 
> > size_t obj_size; /* size_t or unsigned int, I don't care. */
> > int nr_objs; /* I think just 'int' is enough because the value should be
> >                 checked and limited under INT_MAX */
> > int max_entries;
> > unsigned int nr_cpus;
> > 
> > (BTW why we need to limit the nr_cpus here? we have num_possible_cpus())
> 
> You are right that nr_cpus is unnecessary. num_possible_cpus() just keeps the 
> same even with new hot-plugged cpus.
> 
> > 
> >> +	uint32_t                oh_in_user:1;	/* user-specified buffer */
> >> +	uint32_t                oh_in_slot:1;	/* objs alloced with slots */
> >> +	uint32_t                oh_vmalloc:1;	/* alloc from vmalloc zone */
> > 
> > Please use "bool" or "unsigned long flags" with flag bits.
> > 
> >> +	gfp_t                   oh_gfp;		/* k/vmalloc gfp flags */
> >> +	uint32_t                oh_sz_pool;	/* user pool size in byes */
> > 
> > size_t pool_size
> > 
> >> +	void                   *oh_pool;	/* user managed memory pool */
> >> +	struct objpool_slot   **oh_slots;	/* array of percpu slots */
> >> +	uint32_t               *oh_sz_slots;	/* size in bytes of slots */
> > 
> > size_t slot_size
> > 
> 
> Will apply these changes in next version.
> 
> >> +	objpool_release_cb      oh_release;	/* resource cleanup callback */
> >> +	void                   *oh_context;	/* caller-provided context */
> >> +};
> > 
> > Thank you,
> > 
> 
> Best regards,
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v5 0/4] lib,kprobes: kretprobe scalability improvement
  2022-11-02  2:30 ` [PATCH v4] " wuqiang
  2022-11-02 21:33   ` Andrew Morton
  2022-11-03  2:51   ` Masami Hiramatsu
@ 2022-11-06  5:34   ` wuqiang
  2022-11-06  5:34     ` [PATCH v5 1/4] lib: objpool added: ring-array based lockless MPMC queue wuqiang
                       ` (3 more replies)
  2022-11-08  7:14   ` [PATCH v6 0/4] lib,kprobes: kretprobe scalability improvement wuqiang
  3 siblings, 4 replies; 23+ messages in thread
From: wuqiang @ 2022-11-06  5:34 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu

This patch series introduces a scalable and lockless ring-array based
object pool and replaces the original freelist (a LIFO queue based on
singly linked list) to improve the scalability of kretprobed routines.

Changes from v4:
  1) compiling failure with [-Werror=designated-init]
  2) compiling failure for sparc: prefetch() not defined
  3) comments & codes of objpool routines refined

v4 and more:
  https://lore.kernel.org/all/20221102023012.6362-1-wuqiang.matt@bytedance.com

---
 include/linux/freelist.h |  129 ------------
 include/linux/kprobes.h  |    9 +-
 include/linux/objpool.h  |  153 ++++++++++++++
 include/linux/rethook.h  |   15 +-
 kernel/kprobes.c         |   95 ++++-----
 kernel/trace/fprobe.c    |   17 +-
 kernel/trace/rethook.c   |   80 +++----
 lib/Kconfig.debug        |   11 +
 lib/Makefile             |    4 +-
 lib/objpool.c            |  487 +++++++++++++++++++++++++++++++++++++++++++
 lib/test_objpool.c       | 1052 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 files changed, 1802 insertions(+), 250 deletions(-)
 create mode 100644 include/linux/objpool.h
 create mode 100644 lib/objpool.c
 create mode 100644 lib/test_objpool.c
 delete mode 100644 include/linux/freelist.h

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v5 1/4] lib: objpool added: ring-array based lockless MPMC queue
  2022-11-06  5:34   ` [PATCH v5 0/4] lib,kprobes: " wuqiang
@ 2022-11-06  5:34     ` wuqiang
  2022-11-06  5:34     ` [PATCH v5 2/4] lib: objpool test module added wuqiang
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: wuqiang @ 2022-11-06  5:34 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu, wuqiang

The object pool is a scalable implementaion of high performance queue
for objects allocation and reclamation, such as kretprobe instances.

With leveraging per-cpu ring-array to mitigate the hot spots of memory
contention, it could deliver near-linear scalability for high parallel
scenarios. The ring-array is compactly managed in a single cache-line
to benefit from warmed L1 cache for most cases (<= 4 objects per-core).
The body of pre-allocated objects is stored in continuous cache-lines
just after the ring-array.

The object pool is interrupt safe. Both allocation and reclamation
(object pop and push operations) can be preemptible or interruptable.

It's best suited for following cases:
1) Memory allocation or reclamation are prohibited or too expensive
2) Consumers are of different priorities, such as irqs and threads

Limitations:
1) Maximum objects (capacity) is determined during pool initializing
2) The memory of objects won't be freed until the poll is finalized
3) Object allocation (pop) may fail after trying all cpu slots
4) Object reclamation (push) won't fail but may take long time to
   finish for imbalanced scenarios. You can try larger max_entries
   to mitigate, or ( >= CPUS * nr_objs) to avoid

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 include/linux/objpool.h | 153 +++++++++++++
 lib/Makefile            |   2 +-
 lib/objpool.c           | 487 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 641 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/objpool.h
 create mode 100644 lib/objpool.c

diff --git a/include/linux/objpool.h b/include/linux/objpool.h
new file mode 100644
index 000000000000..7899b054b50c
--- /dev/null
+++ b/include/linux/objpool.h
@@ -0,0 +1,153 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_OBJPOOL_H
+#define _LINUX_OBJPOOL_H
+
+#include <linux/types.h>
+
+/*
+ * objpool: ring-array based lockless MPMC queue
+ *
+ * Copyright: wuqiang.matt@bytedance.com
+ *
+ * The object pool is a scalable implementaion of high performance queue
+ * for objects allocation and reclamation, such as kretprobe instances.
+ *
+ * With leveraging per-cpu ring-array to mitigate the hot spots of memory
+ * contention, it could deliver near-linear scalability for high parallel
+ * scenarios. The ring-array is compactly managed in a single cache-line
+ * to benefit from warmed L1 cache for most cases (<= 4 objects per-core).
+ * The body of pre-allocated objects is stored in continuous cache-lines
+ * just after the ring-array.
+ *
+ * The object pool is interrupt safe. Both allocation and reclamation
+ * (object pop and push operations) can be preemptible or interruptable.
+ *
+ * It's best suited for following cases:
+ * 1) Memory allocation or reclamation are prohibited or too expensive
+ * 2) Consumers are of different priorities, such as irqs and threads
+ *
+ * Limitations:
+ * 1) Maximum objects (capacity) is determined during pool initializing
+ * 2) The memory of objects won't be freed until the poll is finalized
+ * 3) Object allocation (pop) may fail after trying all cpu slots
+ * 4) Object reclamation (push) won't fail but may take long time to
+ *    finish for imbalanced scenarios. You can try larger max_entries
+ *    to mitigate, or ( >= CPUS * nr_objs) to avoid
+ */
+
+/*
+ * objpool_slot: per-cpu ring array
+ *
+ * Represents a cpu-local array-based ring buffer, its size is specialized
+ * during initialization of object pool.
+ *
+ * The objpool_slot is allocated from local memory for NUMA system, and to
+ * be kept compact in a single cacheline. ages[] is stored just after the
+ * body of objpool_slot, and then entries[]. The Array of ages[] describes
+ * revision of each item, solely used to avoid ABA. And array of entries[]
+ * contains the pointers of objects.
+ *
+ * The default size of objpool_slot is a single cache-line, aka. 64 bytes.
+ *
+ * 64bit:
+ *        4      8      12     16        32                 64
+ * | head | tail | size | mask | ages[4] | ents[4]: (8 * 4) | objects
+ *
+ * 32bit:
+ *        4      8      12     16        32        48       64
+ * | head | tail | size | mask | ages[4] | ents[4] | unused | objects
+ *
+ */
+
+struct objpool_slot {
+	uint32_t                head;	/* head of ring array */
+	uint32_t                tail;	/* tail of ring array */
+	uint32_t                size;	/* array size, pow of 2 */
+	uint32_t                mask;	/* size - 1 */
+} __attribute__((packed));
+
+/* caller-specified object initial callback to setup each object, only called once */
+typedef int (*objpool_init_obj_cb)(void *context, void *obj);
+
+/* caller-specified cleanup callback for private objects/pool/context */
+typedef int (*objpool_release_cb)(void *context, void *ptr, uint32_t flags);
+
+/* called for object releasing: ptr points to an object */
+#define OBJPOOL_FLAG_NODE        (0x00000001)
+/* for user pool and context releasing, ptr could be NULL */
+#define OBJPOOL_FLAG_POOL        (0x00001000)
+/* the object or pool to be released is user-managed */
+#define OBJPOOL_FLAG_USER        (0x00008000)
+
+/*
+ * objpool_head: object pooling metadata
+ */
+
+struct objpool_head {
+	unsigned int            obj_size;	/* object & element size */
+	unsigned int            nr_objs;	/* total objs (to be pre-allocated) */
+	unsigned int            nr_cpus;	/* num of possible cpus */
+	unsigned int            capacity;	/* max objects per cpuslot */
+	unsigned long           flags;		/* flags for objpool management */
+	gfp_t                   gfp;		/* gfp flags for kmalloc & vmalloc */
+	unsigned int            pool_size;	/* user pool size in byes */
+	void                   *pool;		/* user managed memory pool */
+	struct objpool_slot   **cpu_slots;	/* array of percpu slots */
+	unsigned int           *slot_sizes;	/* size in bytes of slots */
+	objpool_release_cb      release;	/* resource cleanup callback */
+	void                   *context;	/* caller-provided context */
+};
+
+#define OBJPOOL_FROM_VMALLOC	(0x800000000)	/* objpool allocated from vmalloc area */
+#define OBJPOOL_HAVE_OBJECTS	(0x400000000)	/* objects allocated along with objpool */
+
+/* initialize object pool and pre-allocate objects */
+int objpool_init(struct objpool_head *head, unsigned int nr_objs,
+		 unsigned int max_objs, unsigned int object_size,
+		 gfp_t gfp, void *context, objpool_init_obj_cb objinit,
+		 objpool_release_cb release);
+
+/* add objects in batch from user provided pool */
+int objpool_populate(struct objpool_head *head, void *pool,
+		     unsigned int size, unsigned int object_size,
+		     void *context, objpool_init_obj_cb objinit);
+
+/* add pre-allocated object (managed by user) to objpool */
+int objpool_add(void *obj, struct objpool_head *head);
+
+/* allocate an object from objects pool */
+void *objpool_pop(struct objpool_head *head);
+
+/* reclaim an object to objects pool */
+int objpool_push(void *node, struct objpool_head *head);
+
+/* cleanup the whole object pool (objects including) */
+void objpool_fini(struct objpool_head *head);
+
+/* whether the object is pre-allocated with percpu slots */
+static inline int objpool_is_inslot(void *obj, struct objpool_head *head)
+{
+	void *slot;
+	int i;
+
+	if (!obj || !(head->flags & OBJPOOL_HAVE_OBJECTS))
+		return 0;
+
+	for (i = 0; i < head->nr_cpus; i++) {
+		slot = head->cpu_slots[i];
+		if (obj >= slot && obj < slot + head->slot_sizes[i])
+			return 1;
+	}
+
+	return 0;
+}
+
+/* whether the object is from user pool (batched adding) */
+static inline int objpool_is_inpool(void *obj, struct objpool_head *head)
+{
+	return (obj && head->pool && obj >= head->pool &&
+		obj < head->pool + head->pool_size);
+}
+
+#endif /* _LINUX_OBJPOOL_H */
diff --git a/lib/Makefile b/lib/Makefile
index 161d6a724ff7..e938703a321f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -34,7 +34,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
 	 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
 	 nmi_backtrace.o win_minmax.o memcat_p.o \
-	 buildid.o
+	 buildid.o objpool.o
 
 lib-$(CONFIG_PRINTK) += dump_stack.o
 lib-$(CONFIG_SMP) += cpumask.o
diff --git a/lib/objpool.c b/lib/objpool.c
new file mode 100644
index 000000000000..ecffa0795f3d
--- /dev/null
+++ b/lib/objpool.c
@@ -0,0 +1,487 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/objpool.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/atomic.h>
+#include <linux/prefetch.h>
+
+/*
+ * objpool: ring-array based lockless MPMC/FIFO queues
+ *
+ * Copyright: wuqiang.matt@bytedance.com
+ */
+
+/* compute the suitable num of objects to be managed by slot */
+static inline unsigned int __objpool_num_of_objs(unsigned int size)
+{
+	return rounddown_pow_of_two((size - sizeof(struct objpool_slot)) /
+			(sizeof(uint32_t) + sizeof(void *)));
+}
+
+#define SLOT_AGES(s) ((uint32_t *)((char *)(s) + sizeof(struct objpool_slot)))
+#define SLOT_ENTS(s) ((void **)((char *)(s) + sizeof(struct objpool_slot) + \
+			sizeof(uint32_t) * (s)->size))
+#define SLOT_OBJS(s) ((void *)((char *)(s) + sizeof(struct objpool_slot) + \
+			(sizeof(uint32_t) + sizeof(void *)) * (s)->size))
+
+/* allocate and initialize percpu slots */
+static inline int
+__objpool_init_percpu_slots(struct objpool_head *head, unsigned int nobjs,
+			void *context, objpool_init_obj_cb objinit)
+{
+	unsigned int i, j, n, size, objsz, nents = head->capacity;
+
+	/* aligned object size by sizeof(void *) */
+	objsz = ALIGN(head->obj_size, sizeof(void *));
+	/* shall we allocate objects along with objpool_slot */
+	if (objsz)
+		head->flags |= OBJPOOL_HAVE_OBJECTS;
+
+	for (i = 0; i < head->nr_cpus; i++) {
+		struct objpool_slot *os;
+
+		/* compute how many objects to be managed by this slot */
+		n = nobjs / head->nr_cpus;
+		if (i < (nobjs % head->nr_cpus))
+			n++;
+		size = sizeof(struct objpool_slot) + sizeof(void *) * nents +
+		       sizeof(uint32_t) * nents + objsz * n;
+
+		/* decide memory area for cpu-slot allocation */
+		if (!i && !(head->gfp & GFP_ATOMIC) && size > PAGE_SIZE / 2)
+			head->flags |= OBJPOOL_FROM_VMALLOC;
+
+		/* allocate percpu slot & objects from local memory */
+		if (head->flags & OBJPOOL_FROM_VMALLOC)
+			os = __vmalloc_node(size, sizeof(void *), head->gfp,
+				cpu_to_node(i), __builtin_return_address(0));
+		else
+			os = kmalloc_node(size, head->gfp, cpu_to_node(i));
+		if (!os)
+			return -ENOMEM;
+
+		/* initialize percpu slot for the i-th cpu */
+		memset(os, 0, size);
+		os->size = head->capacity;
+		os->mask = os->size - 1;
+		head->cpu_slots[i] = os;
+		head->slot_sizes[i] = size;
+
+		/*
+		 * start from 2nd round to avoid conflict of 1st item.
+		 * we assume that the head item is ready for retrieval
+		 * iff head is equal to ages[head & mask]. but ages is
+		 * initialized as 0, so in view of the caller of pop(),
+		 * the 1st item (0th) is always ready, but fact could
+		 * be: push() is stalled before the final update, thus
+		 * the item being inserted will be lost forever.
+		 */
+		os->head = os->tail = head->capacity;
+
+		if (!objsz)
+			continue;
+
+		for (j = 0; j < n; j++) {
+			uint32_t *ages = SLOT_AGES(os);
+			void **ents = SLOT_ENTS(os);
+			void *obj = SLOT_OBJS(os) + j * objsz;
+			uint32_t ie = os->tail & os->mask;
+
+			/* perform object initialization */
+			if (objinit) {
+				int rc = objinit(context, obj);
+				if (rc)
+					return rc;
+			}
+
+			/* add obj into the ring array */
+			ents[ie] = obj;
+			ages[ie] = os->tail;
+			os->tail++;
+			head->nr_objs++;
+		}
+	}
+
+	return 0;
+}
+
+/* cleanup all percpu slots of the object pool */
+static inline void __objpool_fini_percpu_slots(struct objpool_head *head)
+{
+	unsigned int i;
+
+	if (!head->cpu_slots)
+		return;
+
+	for (i = 0; i < head->nr_cpus; i++) {
+		if (!head->cpu_slots[i])
+			continue;
+		if (head->flags & OBJPOOL_FROM_VMALLOC)
+			vfree(head->cpu_slots[i]);
+		else
+			kfree(head->cpu_slots[i]);
+	}
+	kfree(head->cpu_slots);
+	head->cpu_slots = NULL;
+	head->slot_sizes = NULL;
+}
+
+/**
+ * objpool_init: initialize object pool and pre-allocate objects
+ *
+ * args:
+ * @head:    the object pool to be initialized, declared by caller
+ * @nr_objs: total objects to be pre-allocated by this object pool
+ * @max_objs: max entries (object pool capacity), use nr_objs if 0
+ * @object_size: size of an object, no objects pre-allocated if 0
+ * @gfp:     flags for memory allocation (via kmalloc or vmalloc)
+ * @context: user context for object initialization callback
+ * @objinit: object initialization callback for extra setting-up
+ * @release: cleanup callback for private objects/pool/context
+ *
+ * return:
+ *         0 for success, otherwise error code
+ *
+ * All pre-allocated objects are to be zeroed. Caller could do extra
+ * initialization in objinit callback. The objinit callback will be
+ * called once and only once after the slot allocation. Then objpool
+ * won't touch any content of the objects since then. It's caller's
+ * duty to perform reinitialization after object allocation (pop) or
+ * clearance before object reclamation (push) if required.
+ */
+int objpool_init(struct objpool_head *head, unsigned int nr_objs,
+		unsigned int max_objs, unsigned int object_size,
+		gfp_t gfp, void *context, objpool_init_obj_cb objinit,
+		objpool_release_cb release)
+{
+	unsigned int nents, ncpus = num_possible_cpus();
+	int rc;
+
+	/* calculate percpu slot size (rounded to pow of 2) */
+	if (max_objs < nr_objs)
+		max_objs = nr_objs;
+	nents = max_objs / ncpus;
+	if (nents < __objpool_num_of_objs(L1_CACHE_BYTES))
+		nents = __objpool_num_of_objs(L1_CACHE_BYTES);
+	nents = roundup_pow_of_two(nents);
+	while (nents * ncpus < nr_objs)
+		nents = nents << 1;
+
+	memset(head, 0, sizeof(struct objpool_head));
+	head->nr_cpus = ncpus;
+	head->obj_size = object_size;
+	head->capacity = nents;
+	head->gfp = gfp & ~__GFP_ZERO;
+	head->context = context;
+	head->release = release;
+
+	/* allocate array for percpu slots */
+	head->cpu_slots = kzalloc(head->nr_cpus * sizeof(void *) +
+			       head->nr_cpus * sizeof(uint32_t), head->gfp);
+	if (!head->cpu_slots)
+		return -ENOMEM;
+	head->slot_sizes = (uint32_t *)&head->cpu_slots[head->nr_cpus];
+
+	/* initialize per-cpu slots */
+	rc = __objpool_init_percpu_slots(head, nr_objs, context, objinit);
+	if (rc)
+		__objpool_fini_percpu_slots(head);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(objpool_init);
+
+/* adding object to slot tail, the given slot must NOT be full */
+static inline int __objpool_add_slot(void *obj, struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	uint32_t tail = atomic_inc_return((atomic_t *)&os->tail) - 1;
+
+	WRITE_ONCE(ents[tail & os->mask], obj);
+
+	/* order matters: obj must be updated before tail updating */
+	smp_store_release(&ages[tail & os->mask], tail);
+	return 0;
+}
+
+/* adding object to slot, abort if the slot was already full */
+static inline int __objpool_try_add_slot(void *obj, struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	uint32_t head, tail;
+
+	do {
+		/* perform memory loading for both head and tail */
+		head = READ_ONCE(os->head);
+		tail = READ_ONCE(os->tail);
+		/* just abort if slot is full */
+		if (tail >= head + os->size)
+			return -ENOENT;
+		/* try to extend tail by 1 using CAS to avoid races */
+		if (try_cmpxchg_acquire(&os->tail, &tail, tail + 1))
+			break;
+	} while (1);
+
+	/* the tail-th of slot is reserved for the given obj */
+	WRITE_ONCE(ents[tail & os->mask], obj);
+	/* update epoch id to make this object available for pop() */
+	smp_store_release(&ages[tail & os->mask], tail);
+	return 0;
+}
+
+/**
+ * objpool_populate: add objects from user provided pool in batch
+ *
+ * args:
+ * @head:  object pool
+ * @pool: user buffer for pre-allocated objects
+ * @size: size of user buffer
+ * @object_size: size of object & element
+ * @context: user context for objinit callback
+ * @objinit: object initialization callback
+ *
+ * return: 0 or error code
+ */
+int objpool_populate(struct objpool_head *head, void *pool,
+		unsigned int size, unsigned int object_size,
+		void *context, objpool_init_obj_cb objinit)
+{
+	unsigned int n = head->nr_objs, used = 0, i;
+
+	if (head->pool || !pool || size < object_size)
+		return -EINVAL;
+	if (head->obj_size && head->obj_size != object_size)
+		return -EINVAL;
+	if (head->context && context && head->context != context)
+		return -EINVAL;
+	if (head->nr_objs >= head->nr_cpus * head->capacity)
+		return -ENOENT;
+
+	WARN_ON_ONCE(((unsigned long)pool) & (sizeof(void *) - 1));
+	WARN_ON_ONCE(((uint32_t)object_size) & (sizeof(void *) - 1));
+
+	/* align object size by sizeof(void *) */
+	head->obj_size = object_size;
+	object_size = ALIGN(object_size, sizeof(void *));
+	if (object_size == 0)
+		return -EINVAL;
+
+	while (used + object_size <= size) {
+		void *obj = pool + used;
+
+		/* perform object initialization */
+		if (objinit) {
+			int rc = objinit(context, obj);
+			if (rc)
+				return rc;
+		}
+
+		/* insert obj to its corresponding objpool slot */
+		i = (n + used * head->nr_cpus/size) % head->nr_cpus;
+		if (!__objpool_try_add_slot(obj, head->cpu_slots[i]))
+			head->nr_objs++;
+
+		used += object_size;
+	}
+
+	if (!used)
+		return -ENOENT;
+
+	head->context = context;
+	head->pool = pool;
+	head->pool_size = size;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(objpool_populate);
+
+/**
+ * objpool_add: add pre-allocated object to objpool during pool
+ * initialization
+ *
+ * args:
+ * @obj:  object pointer to be added to objpool
+ * @head: object pool to be inserted into
+ *
+ * return:
+ *     0 or error code
+ *
+ * objpool_add_node doesn't handle race conditions, can only be
+ * called during objpool initialization
+ */
+int objpool_add(void *obj, struct objpool_head *head)
+{
+	unsigned int i, cpu;
+
+	if (!obj)
+		return -EINVAL;
+	if (head->nr_objs >= head->nr_cpus * head->capacity)
+		return -ENOENT;
+
+	cpu = head->nr_objs % head->nr_cpus;
+	for (i = 0; i < head->nr_cpus; i++) {
+		if (!__objpool_try_add_slot(obj, head->cpu_slots[cpu])) {
+			head->nr_objs++;
+			return 0;
+		}
+
+		if (++cpu >= head->nr_cpus)
+			cpu = 0;
+	}
+
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(objpool_add);
+
+/**
+ * objpool_push: reclaim the object and return back to objects pool
+ *
+ * args:
+ * @obj:  object pointer to be pushed to object pool
+ * @head: object pool
+ *
+ * return:
+ *     0 or error code: it fails only when objects pool are full
+ *
+ * objpool_push is non-blockable, and can be nested
+ */
+int objpool_push(void *obj, struct objpool_head *head)
+{
+	unsigned int cpu = raw_smp_processor_id() % head->nr_cpus;
+
+	do {
+		if (head->nr_objs > head->capacity) {
+			if (!__objpool_try_add_slot(obj, head->cpu_slots[cpu]))
+				return 0;
+		} else {
+			if (!__objpool_add_slot(obj, head->cpu_slots[cpu]))
+				return 0;
+		}
+		if (++cpu >= head->nr_cpus)
+			cpu = 0;
+	} while (1);
+
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(objpool_push);
+
+/* try to retrieve object from slot */
+static inline void *__objpool_try_get_slot(struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	/* do memory load of head to local head */
+	uint32_t head = smp_load_acquire(&os->head);
+
+	/* loop if slot isn't empty */
+	while (head != READ_ONCE(os->tail)) {
+		uint32_t id = head & os->mask, prev = head;
+
+		/* do prefetching of object ents */
+		prefetch(&ents[id]);
+
+		/*
+		 * check whether this item was ready for retrieval ? There's
+		 * possibility * in theory * we might retrieve wrong object,
+		 * in case ages[id] overflows when current task is sleeping,
+		 * but it will take very very long to overflow an uint32_t
+		 */
+		if (smp_load_acquire(&ages[id]) == head) {
+			/* node must have been udpated by push() */
+			void *node = READ_ONCE(ents[id]);
+			/* commit and move forward head of the slot */
+			if (try_cmpxchg_release(&os->head, &head, head + 1))
+				return node;
+		}
+
+		/* re-load head from memory continue trying */
+		head = READ_ONCE(os->head);
+		/*
+		 * head stays unchanged, so it's very likely current pop()
+		 * just preempted/interrupted an ongoing push() operation
+		 */
+		if (head == prev)
+			break;
+	}
+
+	return NULL;
+}
+
+/**
+ * objpool_pop: allocate an object from objects pool
+ *
+ * args:
+ * @oh:  object pool
+ *
+ * return:
+ *   object: NULL if failed (object pool is empty)
+ *
+ * objpool_pop can be nested, so can be used in any context.
+ */
+void *objpool_pop(struct objpool_head *head)
+{
+	unsigned int i, cpu;
+	void *obj = NULL;
+
+	cpu = raw_smp_processor_id() % head->nr_cpus;
+	for (i = 0; i < head->nr_cpus; i++) {
+		struct objpool_slot *slot = head->cpu_slots[cpu];
+		obj = __objpool_try_get_slot(slot);
+		if (obj)
+			break;
+		if (++cpu >= head->nr_cpus)
+			cpu = 0;
+	}
+
+	return obj;
+}
+EXPORT_SYMBOL_GPL(objpool_pop);
+
+/**
+ * objpool_fini: cleanup the whole object pool (releasing all objects)
+ *
+ * args:
+ * @head: object pool to be released
+ *
+ */
+void objpool_fini(struct objpool_head *head)
+{
+	uint32_t i, flags;
+
+	if (!head->cpu_slots)
+		return;
+
+	if (!head->release) {
+		__objpool_fini_percpu_slots(head);
+		return;
+	}
+
+	/* cleanup all objects remained in objpool */
+	for (i = 0; i < head->nr_cpus; i++) {
+		void *obj;
+		do {
+			flags = OBJPOOL_FLAG_NODE;
+			obj = __objpool_try_get_slot(head->cpu_slots[i]);
+			if (!obj)
+				break;
+			if (!objpool_is_inpool(obj, head) &&
+			    !objpool_is_inslot(obj, head)) {
+				flags |= OBJPOOL_FLAG_USER;
+			}
+			head->release(head->context, obj, flags);
+		} while (obj);
+	}
+
+	/* release percpu slots */
+	__objpool_fini_percpu_slots(head);
+
+	/* cleanup user private pool and related context */
+	flags = OBJPOOL_FLAG_POOL;
+	if (head->pool)
+		flags |= OBJPOOL_FLAG_USER;
+	head->release(head->context, head->pool, flags);
+}
+EXPORT_SYMBOL_GPL(objpool_fini);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 2/4] lib: objpool test module added
  2022-11-06  5:34   ` [PATCH v5 0/4] lib,kprobes: " wuqiang
  2022-11-06  5:34     ` [PATCH v5 1/4] lib: objpool added: ring-array based lockless MPMC queue wuqiang
@ 2022-11-06  5:34     ` wuqiang
  2022-11-06  5:34     ` [PATCH v5 3/4] kprobes: kretprobe scalability improvement with objpool wuqiang
  2022-11-06  5:34     ` [PATCH v5 4/4] kprobes: freelist.h removed wuqiang
  3 siblings, 0 replies; 23+ messages in thread
From: wuqiang @ 2022-11-06  5:34 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu, wuqiang

The test_objpool module (test_objpool) will run serveral testcases
for objpool stress and performance evaluation. Each testcase will
have all available cpu cores involved to create a situation of high
parallel and high contention.

As of now there are 3 groups and 3 * 6 testcases in total:

1) group 1: synchronous mode
   objpool is managed synchronously, that is, all objects are to be
   reclaimed before objpool finalization and the objpool owner makes
   sure of it. All threads on different cores run in the same pace.
2) group 2: synchronous + miss mode
   This test group is mainly for performance evaluation of missing
   cases when pre-allocated objects are less than the requsted.
3) group 3: asynchronous mode
   This case is just an emulation of kretprobe. The objpool owner
   has no control of the object after it's allocated. hrtimer irq
   is introduced to stress objpool with thread preemption.

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 lib/Kconfig.debug  |   11 +
 lib/Makefile       |    2 +
 lib/test_objpool.c | 1052 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1065 insertions(+)
 create mode 100644 lib/test_objpool.c

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3fc7abffc7aa..b12cc71754cf 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2737,6 +2737,17 @@ config TEST_CLOCKSOURCE_WATCHDOG
 
 	  If unsure, say N.
 
+config TEST_OBJPOOL
+	tristate "Test module for correctness and stress of objpool"
+	default n
+       depends on m
+	help
+	  This builds the "test_objpool" module that should be used for
+	  correctness verification and concurrent testings of objects
+	  allocation and reclamation.
+
+	  If unsure, say N.
+
 endif # RUNTIME_TESTING_MENU
 
 config ARCH_USE_MEMTEST
diff --git a/lib/Makefile b/lib/Makefile
index e938703a321f..4aa282fa0cfc 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -99,6 +99,8 @@ obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
 obj-$(CONFIG_TEST_REF_TRACKER) += test_ref_tracker.o
 CFLAGS_test_fprobe.o += $(CC_FLAGS_FTRACE)
 obj-$(CONFIG_FPROBE_SANITY_TEST) += test_fprobe.o
+obj-$(CONFIG_TEST_OBJPOOL) += test_objpool.o
+
 #
 # CFLAGS for compiling floating point code inside the kernel. x86/Makefile turns
 # off the generation of FPU/SSE* instructions for kernel proper but FPU_FLAGS
diff --git a/lib/test_objpool.c b/lib/test_objpool.c
new file mode 100644
index 000000000000..006b79160e1d
--- /dev/null
+++ b/lib/test_objpool.c
@@ -0,0 +1,1052 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Test module for lockless object pool
+ * (C) 2022 Matt Wu <wuqiang.matt@bytedance.com>
+ */
+
+#include <linux/version.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/sched.h>
+#include <linux/cpumask.h>
+#include <linux/completion.h>
+#include <linux/kthread.h>
+#include <linux/cpu.h>
+#include <linux/cpuset.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/delay.h>
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
+#include <linux/objpool.h>
+
+#define OT_NR_MAX_BULK (16)
+
+struct ot_ctrl {
+	unsigned int mode; /* test no */
+	unsigned int objsz; /* object size */
+	unsigned int duration; /* ms */
+	unsigned int delay; /* ms */
+	unsigned int bulk_normal;
+	unsigned int bulk_irq;
+	unsigned long hrtimer; /* ms */
+	const char *name;
+};
+
+struct ot_stat {
+	unsigned long nhits;
+	unsigned long nmiss;
+};
+
+struct ot_item {
+	struct objpool_head *pool; /* pool head */
+	struct ot_ctrl *ctrl; /* ctrl parameters */
+
+	void (*worker)(struct ot_item *item, int irq);
+
+	/* hrtimer control */
+	ktime_t hrtcycle;
+	struct hrtimer hrtimer;
+
+	int bulk[2]; /* for thread and irq */
+	int delay;
+	u32 niters;
+
+	/* results summary */
+	struct ot_stat stat[2]; /* thread and irq */
+
+	u64 duration;
+};
+
+struct ot_mem_stat {
+	atomic_long_t alloc;
+	atomic_long_t free;
+};
+
+struct ot_data {
+	struct rw_semaphore start;
+	struct completion wait;
+	struct completion rcu;
+	atomic_t nthreads ____cacheline_aligned_in_smp;
+	atomic_t stop ____cacheline_aligned_in_smp;
+	struct ot_mem_stat kmalloc;
+	struct ot_mem_stat vmalloc;
+} g_ot_data;
+
+/*
+ * memory leakage checking
+ */
+
+void *ot_kzalloc(long size)
+{
+	void *ptr = kzalloc(size, GFP_KERNEL);
+
+	if (ptr)
+		atomic_long_add(size, &g_ot_data.kmalloc.alloc);
+	return ptr;
+}
+
+void ot_kfree(void *ptr, long size)
+{
+	if (!ptr)
+		return;
+	atomic_long_add(size, &g_ot_data.kmalloc.free);
+	kfree(ptr);
+}
+
+void *ot_vmalloc(long size)
+{
+	void *ptr = vmalloc(size);
+
+	if (ptr)
+		atomic_long_add(size, &g_ot_data.vmalloc.alloc);
+	return ptr;
+}
+
+void ot_vfree(void *ptr, long size)
+{
+	if (!ptr)
+		return;
+	atomic_long_add(size, &g_ot_data.vmalloc.free);
+	vfree(ptr);
+}
+
+static void ot_mem_report(struct ot_ctrl *ctrl)
+{
+	long alloc, free;
+
+	pr_info("memory allocation summary for %s\n", ctrl->name);
+
+	alloc = atomic_long_read(&g_ot_data.kmalloc.alloc);
+	free = atomic_long_read(&g_ot_data.kmalloc.free);
+	pr_info("  kmalloc: %lu - %lu = %lu\n", alloc, free, alloc - free);
+
+	alloc = atomic_long_read(&g_ot_data.vmalloc.alloc);
+	free = atomic_long_read(&g_ot_data.vmalloc.free);
+	pr_info("  vmalloc: %lu - %lu = %lu\n", alloc, free, alloc - free);
+}
+
+/*
+ * general structs & routines
+ */
+
+struct ot_node {
+	void *owner;
+	unsigned long data;
+	unsigned long refs;
+	unsigned long payload[32];
+};
+
+struct ot_context {
+	struct objpool_head pool; /* objpool head */
+	struct ot_ctrl *ctrl; /* ctrl parameters */
+	void *ptr; /* user pool buffer */
+	unsigned long size; /* buffer size */
+	refcount_t refs;
+	struct rcu_head rcu;
+};
+
+static DEFINE_PER_CPU(struct ot_item, ot_pcup_items);
+
+static int ot_init_data(struct ot_data *data)
+{
+	memset(data, 0, sizeof(*data));
+	init_rwsem(&data->start);
+	init_completion(&data->wait);
+	init_completion(&data->rcu);
+	atomic_set(&data->nthreads, 1);
+
+	return 0;
+}
+
+static void ot_reset_data(struct ot_data *data)
+{
+	reinit_completion(&data->wait);
+	reinit_completion(&data->rcu);
+	atomic_set(&data->nthreads, 1);
+	atomic_set(&data->stop, 0);
+	memset(&data->kmalloc, 0, sizeof(data->kmalloc));
+	memset(&data->vmalloc, 0, sizeof(data->vmalloc));
+}
+
+static int ot_init_node(void *context, void *nod)
+{
+	struct ot_context *sop = context;
+	struct ot_node *on = nod;
+
+	on->owner = &sop->pool;
+	return 0;
+}
+
+static enum hrtimer_restart ot_hrtimer_handler(struct hrtimer *hrt)
+{
+	struct ot_item *item = container_of(hrt, struct ot_item, hrtimer);
+
+	if (atomic_read_acquire(&g_ot_data.stop))
+		return HRTIMER_NORESTART;
+
+	/* do bulk-testings for objects pop/push */
+	item->worker(item, 1);
+
+	hrtimer_forward(hrt, hrt->base->get_time(), item->hrtcycle);
+	return HRTIMER_RESTART;
+}
+
+static void ot_start_hrtimer(struct ot_item *item)
+{
+	if (!item->ctrl->hrtimer)
+		return;
+	hrtimer_start(&item->hrtimer, item->hrtcycle, HRTIMER_MODE_REL);
+}
+
+static void ot_stop_hrtimer(struct ot_item *item)
+{
+	if (!item->ctrl->hrtimer)
+		return;
+	hrtimer_cancel(&item->hrtimer);
+}
+
+static int ot_init_hrtimer(struct ot_item *item, unsigned long hrtimer)
+{
+	struct hrtimer *hrt = &item->hrtimer;
+
+	if (!hrtimer)
+		return -ENOENT;
+
+	item->hrtcycle = ktime_set(0, hrtimer * 1000000UL);
+	hrtimer_init(hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrt->function = ot_hrtimer_handler;
+	return 0;
+}
+
+static int ot_init_cpu_item(struct ot_item *item,
+			struct ot_ctrl *ctrl,
+			struct objpool_head *pool,
+			void (*worker)(struct ot_item *, int))
+{
+	memset(item, 0, sizeof(*item));
+	item->pool = pool;
+	item->ctrl = ctrl;
+	item->worker = worker;
+
+	item->bulk[0] = ctrl->bulk_normal;
+	item->bulk[1] = ctrl->bulk_irq;
+	item->delay = ctrl->delay;
+
+	/* initialize hrtimer */
+	ot_init_hrtimer(item, item->ctrl->hrtimer);
+	return 0;
+}
+
+static int ot_thread_worker(void *arg)
+{
+	struct ot_item *item = arg;
+	ktime_t start;
+
+	sched_set_normal(current, 50);
+
+	atomic_inc(&g_ot_data.nthreads);
+	down_read(&g_ot_data.start);
+	up_read(&g_ot_data.start);
+	start = ktime_get();
+	ot_start_hrtimer(item);
+	do {
+		if (atomic_read_acquire(&g_ot_data.stop))
+			break;
+		/* do bulk-testings for objects pop/push */
+		item->worker(item, 0);
+	} while (!kthread_should_stop());
+	ot_stop_hrtimer(item);
+	item->duration = (u64) ktime_us_delta(ktime_get(), start);
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	return 0;
+}
+
+static void ot_perf_report(struct ot_ctrl *ctrl, u64 duration)
+{
+	struct ot_stat total, normal = {0}, irq = {0};
+	int cpu, nthreads = 0;
+
+	pr_info("\n");
+	pr_info("Testing summary for %s\n", ctrl->name);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		if (!item->duration)
+			continue;
+		normal.nhits += item->stat[0].nhits;
+		normal.nmiss += item->stat[0].nmiss;
+		irq.nhits += item->stat[1].nhits;
+		irq.nmiss += item->stat[1].nmiss;
+		pr_info("CPU: %d  duration: %lluus\n", cpu, item->duration);
+		pr_info("\tthread:\t%16lu hits \t%16lu miss\n",
+			item->stat[0].nhits, item->stat[0].nmiss);
+		pr_info("\tirq:   \t%16lu hits \t%16lu miss\n",
+			item->stat[1].nhits, item->stat[1].nmiss);
+		pr_info("\ttotal: \t%16lu hits \t%16lu miss\n",
+			item->stat[0].nhits + item->stat[1].nhits,
+			item->stat[0].nmiss + item->stat[1].nmiss);
+		nthreads++;
+	}
+
+	total.nhits = normal.nhits + irq.nhits;
+	total.nmiss = normal.nmiss + irq.nmiss;
+
+	pr_info("ALL: \tnthreads: %d  duration: %lluus\n", nthreads, duration);
+	pr_info("SUM: \t%16lu hits \t%16lu miss\n",
+		total.nhits, total.nmiss);
+}
+
+/*
+ * synchronous test cases for objpool manipulation
+ */
+
+/* objpool manipulation for synchronous mode 0 (percpu objpool) */
+static struct ot_context *ot_init_sync_m0(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	int max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	if (objpool_init(&sop->pool, max, max, ctrl->objsz,
+			GFP_KERNEL, sop, ot_init_node, NULL)) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON(max != sop->pool.nr_objs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m0(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+	ot_kfree(sop, sizeof(*sop));
+}
+
+/* objpool manipulation for synchronous mode 1 (private pool) */
+static struct ot_context *ot_init_sync_m1(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	unsigned long size;
+	int rc, szobj, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	szobj = ALIGN(ctrl->objsz, sizeof(void *));
+	size = szobj * max;
+	sop->ptr = ot_vmalloc(size);
+	sop->size = size;
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	memset(sop->ptr, 0, size);
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL, NULL);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		ctrl->objsz, sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON((size / szobj) != sop->pool.nr_objs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m1(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+
+	ot_vfree(sop->ptr, sop->size);
+	ot_kfree(sop, sizeof(*sop));
+}
+
+/* objpool manipulation for synchronous mode 2 (private objects) */
+static int ot_objpool_release(void *context, void *ptr, uint32_t flags)
+{
+	struct ot_context *sop = context;
+
+	/* here we need release all user-allocated objects */
+	if ((flags & OBJPOOL_FLAG_NODE) && (flags & OBJPOOL_FLAG_USER)) {
+		struct ot_node *on = ptr;
+		WARN_ON(on->data != 0xDEADBEEF);
+		ot_kfree(on, sop->ctrl->objsz);
+	} else if (flags & OBJPOOL_FLAG_POOL) {
+		/* release user preallocated pool */
+		if (sop->ptr) {
+			WARN_ON(sop->ptr != ptr);
+			WARN_ON(!(flags & OBJPOOL_FLAG_USER));
+			ot_vfree(sop->ptr, sop->size);
+		}
+		/* do context cleaning if needed */
+		ot_kfree(sop, sizeof(*sop));
+	}
+
+	return 0;
+}
+
+static struct ot_context *ot_init_sync_m2(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	int rc, i, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < max; i++) {
+		on = ot_kzalloc(ctrl->objsz);
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+		}
+	}
+	WARN_ON(max != sop->pool.nr_objs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m2(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+}
+
+/* objpool manipulation for synchronous mode 3 (mixed mode) */
+static struct ot_context *ot_init_sync_m3(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	unsigned long size;
+	int rc, i, szobj, nobjs;
+	int max = num_possible_cpus() << 4;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	/* create and initialize objpool as empty (no objects) */
+	nobjs = num_possible_cpus() * 2;
+	rc = objpool_init(&sop->pool, nobjs, max, ctrl->objsz, GFP_KERNEL,
+			sop, ot_init_node, ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	szobj = ALIGN(ctrl->objsz, sizeof(void *));
+	size = szobj * num_possible_cpus() * 4;
+	sop->ptr = ot_vmalloc(size);
+	if (!sop->ptr) {
+		objpool_fini(&sop->pool);
+		return NULL;
+	}
+	sop->size = size;
+	memset(sop->ptr, 0, size);
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		ctrl->objsz, sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+	nobjs += size / szobj;
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < num_possible_cpus() * 2; i++) {
+		on = ot_kzalloc(ctrl->objsz);
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			if (!objpool_add(on, &sop->pool))
+				nobjs++;
+			else
+				ot_kfree(on, ctrl->objsz);
+		}
+	}
+	WARN_ON(nobjs != sop->pool.nr_objs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m3(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+}
+
+struct {
+	struct ot_context * (*init)(struct ot_ctrl *);
+	void (*fini)(struct ot_context *sop);
+} g_ot_sync_ops[4] = {
+	{.init = ot_init_sync_m0, .fini = ot_fini_sync_m0},
+	{.init = ot_init_sync_m1, .fini = ot_fini_sync_m1},
+	{.init = ot_init_sync_m2, .fini = ot_fini_sync_m2},
+	{.init = ot_init_sync_m3, .fini = ot_fini_sync_m3},
+};
+
+/*
+ * synchronous test cases: performance mode
+ */
+
+static void ot_bulk_sync(struct ot_item *item, int irq)
+{
+	struct ot_node *nods[OT_NR_MAX_BULK];
+	int i;
+
+	for (i = 0; i < item->bulk[irq]; i++)
+		nods[i] = objpool_pop(item->pool);
+
+	if (!irq && (item->delay || !(++(item->niters) & 0x7FFF)))
+		msleep(item->delay);
+
+	while (i-- > 0) {
+		struct ot_node *on = nods[i];
+		if (on) {
+			on->refs++;
+			objpool_push(on, item->pool);
+			item->stat[irq].nhits++;
+		} else {
+			item->stat[irq].nmiss++;
+		}
+	}
+}
+
+static int ot_start_sync(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop;
+	ktime_t start;
+	u64 duration;
+	unsigned long timeout;
+	int cpu, rc;
+
+	/* initialize objpool for syncrhonous testcase */
+	sop = g_ot_sync_ops[ctrl->mode].init(ctrl);
+	if (!sop)
+		return -ENOMEM;
+
+	/* grab rwsem to block testing threads */
+	down_write(&g_ot_data.start);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		struct task_struct *work;
+
+		ot_init_cpu_item(item, ctrl, &sop->pool, ot_bulk_sync);
+
+		/* skip offline cpus */
+		if (!cpu_online(cpu))
+			continue;
+
+		work = kthread_create_on_node(ot_thread_worker, item,
+				cpu_to_node(cpu), "ot_worker_%d", cpu);
+		if (IS_ERR(work)) {
+			pr_err("failed to create thread for cpu %d\n", cpu);
+		} else {
+			kthread_bind(work, cpu);
+			wake_up_process(work);
+		}
+	}
+
+	/* wait a while to make sure all threads waiting at start line */
+	msleep(20);
+
+	/* in case no threads were created: memory insufficient ? */
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	// sched_set_fifo_low(current);
+
+	/* start objpool testing threads */
+	start = ktime_get();
+	up_write(&g_ot_data.start);
+
+	/* yeild cpu to worker threads for duration ms */
+	timeout = msecs_to_jiffies(ctrl->duration);
+	rc = schedule_timeout_interruptible(timeout);
+
+	/* tell workers threads to quit */
+	atomic_set_release(&g_ot_data.stop, 1);
+
+	/* wait all workers threads finish and quit */
+	wait_for_completion(&g_ot_data.wait);
+	duration = (u64) ktime_us_delta(ktime_get(), start);
+
+	/* cleanup objpool */
+	g_ot_sync_ops[ctrl->mode].fini(sop);
+
+	/* report testing summary and performance results */
+	ot_perf_report(ctrl, duration);
+
+	/* report memory allocation summary */
+	ot_mem_report(ctrl);
+
+	return rc;
+}
+
+/*
+ * asynchronous test cases: pool lifecycle controlled by refcount
+ */
+
+static void ot_fini_async_rcu(struct rcu_head *rcu)
+{
+	struct ot_context *sop = container_of(rcu, struct ot_context, rcu);
+	struct ot_node *on;
+
+	/* here all cpus are aware of the stop event: g_ot_data.stop = 1 */
+	WARN_ON(!atomic_read_acquire(&g_ot_data.stop));
+
+	do {
+		/* release all objects remained in objpool */
+		on = objpool_pop(&sop->pool);
+		if (on && !objpool_is_inslot(on, &sop->pool) &&
+			!objpool_is_inpool(on, &sop->pool)) {
+			/* private object managed by user */
+			WARN_ON(on->data != 0xDEADBEEF);
+			ot_kfree(on, sop->ctrl->objsz);
+		}
+
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&sop->refs)) {
+			objpool_fini(&sop->pool);
+			break;
+		}
+	} while (on);
+
+	complete(&g_ot_data.rcu);
+}
+
+static void ot_fini_async(struct ot_context *sop)
+{
+	/* make sure the stop event is acknowledged by all cores */
+	call_rcu(&sop->rcu, ot_fini_async_rcu);
+}
+
+static struct ot_context *ot_init_async_m0(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	int max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	if (objpool_init(&sop->pool, max, max, ctrl->objsz, GFP_KERNEL,
+			sop, ot_init_node, ot_objpool_release)) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON(max != sop->pool.nr_objs);
+	refcount_set(&sop->refs, max + 1);
+
+	return sop;
+}
+
+static struct ot_context *ot_init_async_m1(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	unsigned long size;
+	int szobj, rc, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	szobj = ALIGN(ctrl->objsz, sizeof(void *));
+	size = szobj * max;
+	sop->ptr = ot_vmalloc(size);
+	sop->size = size;
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	memset(sop->ptr, 0, size);
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+			ctrl->objsz, sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+
+	/* calculate total number of objects stored in ptr */
+	WARN_ON(size / szobj != sop->pool.nr_objs);
+	refcount_set(&sop->refs, size / szobj + 1);
+
+	return sop;
+}
+
+static struct ot_context *ot_init_async_m2(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	int rc, i, nobjs = 0, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < max; i++) {
+		on = ot_kzalloc(ctrl->objsz);
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+			nobjs++;
+		}
+	}
+	WARN_ON(nobjs != sop->pool.nr_objs);
+	refcount_set(&sop->refs, nobjs + 1);
+
+	return sop;
+}
+
+/* objpool manipulation for synchronous mode 3 (mixed mode) */
+static struct ot_context *ot_init_async_m3(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	unsigned long size;
+	int szobj, nobjs, rc, i, max = num_possible_cpus() << 4;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	/* create and initialize objpool as empty (no objects) */
+	nobjs = num_possible_cpus() * 2;
+	rc = objpool_init(&sop->pool, nobjs, max, ctrl->objsz, GFP_KERNEL,
+			sop, ot_init_node, ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	szobj = ALIGN(ctrl->objsz, sizeof(void *));
+	size = szobj * num_possible_cpus() * 4;
+	sop->ptr = ot_vmalloc(size);
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	sop->size = size;
+	memset(sop->ptr, 0, size);
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+			ctrl->objsz, sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+
+	/* calculate total number of objects stored in ptr */
+	nobjs += size / szobj;
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < num_possible_cpus() * 2; i++) {
+		on = ot_kzalloc(ctrl->objsz);
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+			nobjs++;
+		}
+	}
+	WARN_ON(nobjs != sop->pool.nr_objs);
+	refcount_set(&sop->refs, nobjs + 1);
+
+	return sop;
+}
+
+struct {
+	struct ot_context * (*init)(struct ot_ctrl *);
+	void (*fini)(struct ot_context *sop);
+} g_ot_async_ops[4] = {
+	{.init = ot_init_async_m0, .fini = ot_fini_async},
+	{.init = ot_init_async_m1, .fini = ot_fini_async},
+	{.init = ot_init_async_m2, .fini = ot_fini_async},
+	{.init = ot_init_async_m3, .fini = ot_fini_async},
+};
+
+static void ot_nod_recycle(struct ot_node *on, struct objpool_head *pool,
+			int release)
+{
+	struct ot_context *sop;
+
+	on->refs++;
+
+	if (!release) {
+		/* push object back to opjpool for reuse */
+		objpool_push(on, pool);
+		return;
+	}
+
+	sop = container_of(pool, struct ot_context, pool);
+	WARN_ON(sop != pool->context);
+
+	if (objpool_is_inslot(on, pool)) {
+		/* object is alloced from percpu slots */
+	} else if (objpool_is_inpool(on, pool)) {
+		/* object is alloced from user-manged pool */
+	} else {
+		/* private object managed by user */
+		WARN_ON(on->data != 0xDEADBEEF);
+		ot_kfree(on, sop->ctrl->objsz);
+	}
+
+	/* unref objpool with nod removed forever */
+	if (refcount_dec_and_test(&sop->refs))
+		objpool_fini(pool);
+}
+
+static void ot_bulk_async(struct ot_item *item, int irq)
+{
+	struct ot_node *nods[OT_NR_MAX_BULK];
+	int i, stop;
+
+	for (i = 0; i < item->bulk[irq]; i++)
+		nods[i] = objpool_pop(item->pool);
+
+	if (!irq) {
+		if (item->delay || !(++(item->niters) & 0x7FFF))
+			msleep(item->delay);
+		get_cpu();
+	}
+
+	stop = atomic_read_acquire(&g_ot_data.stop);
+
+	/* drop all objects and deref objpool */
+	while (i-- > 0) {
+		struct ot_node *on = nods[i];
+
+		if (on) {
+			on->refs++;
+			ot_nod_recycle(on, item->pool, stop);
+			item->stat[irq].nhits++;
+		} else {
+			item->stat[irq].nmiss++;
+		}
+	}
+
+	if (!irq)
+		put_cpu();
+}
+
+static int ot_start_async(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop;
+	ktime_t start;
+	u64 duration;
+	unsigned long timeout;
+	int cpu, rc;
+
+	/* initialize objpool for syncrhonous testcase */
+	sop = g_ot_async_ops[ctrl->mode].init(ctrl);
+	if (!sop)
+		return -ENOMEM;
+
+	/* grab rwsem to block testing threads */
+	down_write(&g_ot_data.start);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		struct task_struct *work;
+
+		ot_init_cpu_item(item, ctrl, &sop->pool, ot_bulk_async);
+
+		/* skip offline cpus */
+		if (!cpu_online(cpu))
+			continue;
+
+		work = kthread_create_on_node(ot_thread_worker, item,
+				cpu_to_node(cpu), "ot_worker_%d", cpu);
+		if (IS_ERR(work)) {
+			pr_err("failed to create thread for cpu %d\n", cpu);
+		} else {
+			kthread_bind(work, cpu);
+			wake_up_process(work);
+		}
+	}
+
+	/* wait a while to make sure all threads waiting at start line */
+	msleep(20);
+
+	/* in case no threads were created: memory insufficient ? */
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	/* start objpool testing threads */
+	start = ktime_get();
+	up_write(&g_ot_data.start);
+
+	/* yeild cpu to worker threads for duration ms */
+	timeout = msecs_to_jiffies(ctrl->duration);
+	rc = schedule_timeout_interruptible(timeout);
+
+	/* tell workers threads to quit */
+	atomic_set_release(&g_ot_data.stop, 1);
+
+	/* do async-finalization */
+	g_ot_async_ops[ctrl->mode].fini(sop);
+
+	/* wait all workers threads finish and quit */
+	wait_for_completion(&g_ot_data.wait);
+	duration = (u64) ktime_us_delta(ktime_get(), start);
+
+	/* assure rcu callback is triggered */
+	wait_for_completion(&g_ot_data.rcu);
+
+	/*
+	 * now we are sure that objpool is finalized either
+	 * by rcu callback or by worker threads
+	 */
+
+	/* report testing summary and performance results */
+	ot_perf_report(ctrl, duration);
+
+	/* report memory allocation summary */
+	ot_mem_report(ctrl);
+
+	return rc;
+}
+
+/*
+ * predefined testing cases:
+ *   4 synchronous cases / 4 overrun cases / 2 async cases
+ *
+ * mode: unsigned int, could be 0/1/2/3, see name
+ * duration: unsigned int, total test time in ms
+ * delay: unsigned int, delay (in ms) between each iteration
+ * bulk_normal: unsigned int, repeat times for thread worker
+ * bulk_irq: unsigned int, repeat times for irq consumer
+ * hrtimer: unsigned long, hrtimer intervnal in ms
+ * name: char *, tag for current test ot_item
+ */
+
+#define NODE_COMPACT sizeof(struct ot_node)
+#define NODE_VMALLOC (512)
+
+struct ot_ctrl g_ot_sync[] = {
+	{0, NODE_COMPACT, 1000, 0,  1,  0,  0, "sync: percpu objpool"},
+	{0, NODE_VMALLOC, 1000, 0,  1,  0,  0, "sync: percpu objpool from vmalloc"},
+	{1, NODE_COMPACT, 1000, 0,  1,  0,  0, "sync: user objpool"},
+	{2, NODE_COMPACT, 1000, 0,  1,  0,  0, "sync: user objects"},
+	{3, NODE_COMPACT, 1000, 0,  1,  0,  0, "sync: mixed pools & objs"},
+	{3, NODE_VMALLOC, 1000, 0,  1,  0,  0, "sync: mixed pools & objs (vmalloc)"},
+};
+
+struct ot_ctrl g_ot_miss[] = {
+	{0, NODE_COMPACT, 1000, 0, 16,  0,  0, "sync overrun: percpu objpool"},
+	{0, NODE_VMALLOC, 1000, 0, 16,  0,  0, "sync overrun: percpu objpool from vmalloc"},
+	{1, NODE_COMPACT, 1000, 0, 16,  0,  0, "sync overrun: user objpool"},
+	{2, NODE_COMPACT, 1000, 0, 16,  0,  0, "sync overrun: user objects"},
+	{3, NODE_COMPACT, 1000, 0, 16,  0,  0, "sync overrun: mixed pools & objs"},
+	{3, NODE_VMALLOC, 1000, 0, 16,  0,  0, "sync overrun: mixed pools & objs (vmalloc)"},
+};
+
+struct ot_ctrl g_ot_async[] = {
+	{0, NODE_COMPACT, 1000, 4,  8,  8,  6, "async: percpu objpool"},
+	{0, NODE_VMALLOC, 1000, 4,  8,  8,  6, "async: percpu objpool from vmalloc"},
+	{1, NODE_COMPACT, 1000, 4,  8,  8,  6, "async: user objpool"},
+	{2, NODE_COMPACT, 1000, 4,  8,  8,  6, "async: user objects"},
+	{3, NODE_COMPACT, 1000, 4,  8,  8,  6, "async: mixed pools & objs"},
+	{3, NODE_VMALLOC, 1000, 4,  8,  8,  6, "async: mixed pools & objs (vmalloc)"},
+};
+
+static int __init ot_mod_init(void)
+{
+	int i;
+
+	ot_init_data(&g_ot_data);
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_sync); i++) {
+		if (ot_start_sync(&g_ot_sync[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_miss); i++) {
+		if (ot_start_sync(&g_ot_miss[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_async); i++) {
+		if (ot_start_async(&g_ot_async[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+out:
+	return -EAGAIN;
+}
+
+static void __exit ot_mod_exit(void)
+{
+}
+
+module_init(ot_mod_init);
+module_exit(ot_mod_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Matt Wu");
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 3/4] kprobes: kretprobe scalability improvement with objpool
  2022-11-06  5:34   ` [PATCH v5 0/4] lib,kprobes: " wuqiang
  2022-11-06  5:34     ` [PATCH v5 1/4] lib: objpool added: ring-array based lockless MPMC queue wuqiang
  2022-11-06  5:34     ` [PATCH v5 2/4] lib: objpool test module added wuqiang
@ 2022-11-06  5:34     ` wuqiang
  2022-11-06  5:34     ` [PATCH v5 4/4] kprobes: freelist.h removed wuqiang
  3 siblings, 0 replies; 23+ messages in thread
From: wuqiang @ 2022-11-06  5:34 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu, wuqiang

kretprobe is using freelist to manage return-instances, but freelist,
as LIFO queue based on singly linked list, scales badly and reduces
the overall throughput of kretprobed routines, especially for high
contention scenarios.

Here's a typical throughput test of sys_flock (counts in 10 seconds,
measured with perf stat -a -I 10000 -e syscalls:sys_enter_flock):

OS: Debian 10 X86_64, Linux 6.1rc2
HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s

      1X       2X       4X       6X       8X      12X      16X
34762430 36546920 17949900 13101899 12569595 12646601 14729195
     24X      32X      48X      64X      72X      96X     128X
19263546 10102064  8985418 11936495 11493980  7127789  9330985

This patch introduces objpool to kretprobe and rethook, with orginal
freelist replaced and brings near-linear scalability to kretprobed
routines. Tests of kretprobe throughput show the biggest ratio as
333.9x of the original freelist. Here's the comparison:

                  1X         2X         4X         8X        16X
freelist:   34762430   36546920   17949900   12569595   14729195
objpool:    35627544   72182095  144068494  287564688  576903916
                 32X        48X        64X        96X       128X
freelist:   10102064    8985418   11936495    7127789    9330985
objpool:  1158876372 1737828164 2324371724 2380310472 2463182819

Tests on 96-core ARM64 system output similarly, but with the biggest
ratio up to 642.2x:

OS: Debian 10 AARCH64, Linux 6.1rc2
HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s

                  1X         2X         4X         8X        16X
freelist:   17498299   10887037   10224710    8499132    6421751
objpool:    18715726   35549845   71615884  144258971  283707220
                 24X        32X        48X        64X        96X
freelist:    5339868    4819116    3593919    3121575    2687167
objpool:   419830913  571609748  877456139 1143316315 1725668029

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 include/linux/kprobes.h |  9 ++--
 include/linux/rethook.h | 15 +++----
 kernel/kprobes.c        | 95 +++++++++++++++++++----------------------
 kernel/trace/fprobe.c   | 17 +++-----
 kernel/trace/rethook.c  | 80 +++++++++++++++++-----------------
 5 files changed, 96 insertions(+), 120 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index a0b92be98984..f13f01e600c2 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -27,7 +27,7 @@
 #include <linux/mutex.h>
 #include <linux/ftrace.h>
 #include <linux/refcount.h>
-#include <linux/freelist.h>
+#include <linux/objpool.h>
 #include <linux/rethook.h>
 #include <asm/kprobes.h>
 
@@ -141,6 +141,7 @@ static inline bool kprobe_ftrace(struct kprobe *p)
  */
 struct kretprobe_holder {
 	struct kretprobe	*rp;
+	struct objpool_head	oh;
 	refcount_t		ref;
 };
 
@@ -154,7 +155,6 @@ struct kretprobe {
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
 	struct rethook *rh;
 #else
-	struct freelist_head freelist;
 	struct kretprobe_holder *rph;
 #endif
 };
@@ -165,10 +165,7 @@ struct kretprobe_instance {
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
 	struct rethook_node node;
 #else
-	union {
-		struct freelist_node freelist;
-		struct rcu_head rcu;
-	};
+	struct rcu_head rcu;
 	struct llist_node llist;
 	struct kretprobe_holder *rph;
 	kprobe_opcode_t *ret_addr;
diff --git a/include/linux/rethook.h b/include/linux/rethook.h
index c8ac1e5afcd1..278ec65e71fe 100644
--- a/include/linux/rethook.h
+++ b/include/linux/rethook.h
@@ -6,7 +6,7 @@
 #define _LINUX_RETHOOK_H
 
 #include <linux/compiler.h>
-#include <linux/freelist.h>
+#include <linux/objpool.h>
 #include <linux/kallsyms.h>
 #include <linux/llist.h>
 #include <linux/rcupdate.h>
@@ -30,14 +30,14 @@ typedef void (*rethook_handler_t) (struct rethook_node *, void *, struct pt_regs
 struct rethook {
 	void			*data;
 	rethook_handler_t	handler;
-	struct freelist_head	pool;
+	struct objpool_head	pool;
 	refcount_t		ref;
 	struct rcu_head		rcu;
 };
 
 /**
  * struct rethook_node - The rethook shadow-stack entry node.
- * @freelist: The freelist, linked to struct rethook::pool.
+ * @nod: The objpool node, linked to struct rethook::pool.
  * @rcu: The rcu_head for deferred freeing.
  * @llist: The llist, linked to a struct task_struct::rethooks.
  * @rethook: The pointer to the struct rethook.
@@ -48,19 +48,15 @@ struct rethook {
  * on each entry of the shadow stack.
  */
 struct rethook_node {
-	union {
-		struct freelist_node freelist;
-		struct rcu_head      rcu;
-	};
+	struct rcu_head		rcu;
 	struct llist_node	llist;
 	struct rethook		*rethook;
 	unsigned long		ret_addr;
 	unsigned long		frame;
 };
 
-struct rethook *rethook_alloc(void *data, rethook_handler_t handler);
+struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp, int size, int max);
 void rethook_free(struct rethook *rh);
-void rethook_add_node(struct rethook *rh, struct rethook_node *node);
 struct rethook_node *rethook_try_get(struct rethook *rh);
 void rethook_recycle(struct rethook_node *node);
 void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
@@ -97,4 +93,3 @@ void rethook_flush_task(struct task_struct *tk);
 #endif
 
 #endif
-
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 3220b0a2fb4a..e8526a0d29b6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1865,10 +1865,12 @@ static struct notifier_block kprobe_exceptions_nb = {
 static void free_rp_inst_rcu(struct rcu_head *head)
 {
 	struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
+	struct kretprobe_holder *rph = ri->rph;
 
-	if (refcount_dec_and_test(&ri->rph->ref))
-		kfree(ri->rph);
-	kfree(ri);
+	if (refcount_dec_and_test(&rph->ref)) {
+		objpool_fini(&rph->oh);
+		kfree(rph);
+	}
 }
 NOKPROBE_SYMBOL(free_rp_inst_rcu);
 
@@ -1877,7 +1879,7 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
 	struct kretprobe *rp = get_kretprobe(ri);
 
 	if (likely(rp))
-		freelist_add(&ri->freelist, &rp->freelist);
+		objpool_push(ri, &rp->rph->oh);
 	else
 		call_rcu(&ri->rcu, free_rp_inst_rcu);
 }
@@ -1914,23 +1916,19 @@ NOKPROBE_SYMBOL(kprobe_flush_task);
 
 static inline void free_rp_inst(struct kretprobe *rp)
 {
-	struct kretprobe_instance *ri;
-	struct freelist_node *node;
-	int count = 0;
-
-	node = rp->freelist.head;
-	while (node) {
-		ri = container_of(node, struct kretprobe_instance, freelist);
-		node = node->next;
-
-		kfree(ri);
-		count++;
-	}
+	struct kretprobe_holder *rph = rp->rph;
+	void *nod;
 
-	if (refcount_sub_and_test(count, &rp->rph->ref)) {
-		kfree(rp->rph);
-		rp->rph = NULL;
-	}
+	rp->rph = NULL;
+	do {
+		nod = objpool_pop(&rph->oh);
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&rph->ref)) {
+			objpool_fini(&rph->oh);
+			kfree(rph);
+			break;
+		}
+	} while (nod);
 }
 
 /* This assumes the 'tsk' is the current task or the is not running. */
@@ -2072,19 +2070,17 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
 static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 {
 	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
+	struct kretprobe_holder *rph = rp->rph;
 	struct kretprobe_instance *ri;
-	struct freelist_node *fn;
 
-	fn = freelist_try_get(&rp->freelist);
-	if (!fn) {
+	ri = objpool_pop(&rph->oh);
+	if (!ri) {
 		rp->nmissed++;
 		return 0;
 	}
 
-	ri = container_of(fn, struct kretprobe_instance, freelist);
-
 	if (rp->entry_handler && rp->entry_handler(ri, regs)) {
-		freelist_add(&ri->freelist, &rp->freelist);
+		objpool_push(ri, &rph->oh);
 		return 0;
 	}
 
@@ -2174,10 +2170,19 @@ int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long o
 	return 0;
 }
 
+#ifndef CONFIG_KRETPROBE_ON_RETHOOK
+static int kretprobe_init_inst(void *context, void *nod)
+{
+	struct kretprobe_instance *ri = nod;
+
+	ri->rph = context;
+	return 0;
+}
+#endif
+
 int register_kretprobe(struct kretprobe *rp)
 {
 	int ret;
-	struct kretprobe_instance *inst;
 	int i;
 	void *addr;
 
@@ -2215,20 +2220,12 @@ int register_kretprobe(struct kretprobe *rp)
 #endif
 	}
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
-	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler);
+	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler, GFP_KERNEL,
+				sizeof(struct kretprobe_instance) + rp->data_size,
+				rp->maxactive);
 	if (!rp->rh)
 		return -ENOMEM;
 
-	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
-		if (inst == NULL) {
-			rethook_free(rp->rh);
-			rp->rh = NULL;
-			return -ENOMEM;
-		}
-		rethook_add_node(rp->rh, &inst->node);
-	}
 	rp->nmissed = 0;
 	/* Establish function entry probe point */
 	ret = register_kprobe(&rp->kp);
@@ -2237,25 +2234,19 @@ int register_kretprobe(struct kretprobe *rp)
 		rp->rh = NULL;
 	}
 #else	/* !CONFIG_KRETPROBE_ON_RETHOOK */
-	rp->freelist.head = NULL;
 	rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
 	if (!rp->rph)
 		return -ENOMEM;
 
-	rp->rph->rp = rp;
-	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
-		if (inst == NULL) {
-			refcount_set(&rp->rph->ref, i);
-			free_rp_inst(rp);
-			return -ENOMEM;
-		}
-		inst->rph = rp->rph;
-		freelist_add(&inst->freelist, &rp->freelist);
+	if (objpool_init(&rp->rph->oh, rp->maxactive, rp->maxactive,
+			  rp->data_size + sizeof(struct kretprobe_instance),
+			  GFP_KERNEL, rp->rph, kretprobe_init_inst, NULL)) {
+		kfree(rp->rph);
+		rp->rph = NULL;
+		return -ENOMEM;
 	}
-	refcount_set(&rp->rph->ref, i);
-
+	refcount_set(&rp->rph->ref, rp->maxactive + 1);
+	rp->rph->rp = rp;
 	rp->nmissed = 0;
 	/* Establish function entry probe point */
 	ret = register_kprobe(&rp->kp);
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index aac63ca9c3d1..99b4ab0f6468 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -125,7 +125,7 @@ static void fprobe_init(struct fprobe *fp)
 
 static int fprobe_init_rethook(struct fprobe *fp, int num)
 {
-	int i, size;
+	int size;
 
 	if (num < 0)
 		return -EINVAL;
@@ -140,18 +140,11 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
 	if (size < 0)
 		return -E2BIG;
 
-	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler);
-	for (i = 0; i < size; i++) {
-		struct fprobe_rethook_node *node;
+	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, GFP_KERNEL,
+				sizeof(struct fprobe_rethook_node), size);
+	if (!fp->rethook)
+		return -ENOMEM;
 
-		node = kzalloc(sizeof(*node), GFP_KERNEL);
-		if (!node) {
-			rethook_free(fp->rethook);
-			fp->rethook = NULL;
-			return -ENOMEM;
-		}
-		rethook_add_node(fp->rethook, &node->node);
-	}
 	return 0;
 }
 
diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
index c69d82273ce7..01df98db2fbe 100644
--- a/kernel/trace/rethook.c
+++ b/kernel/trace/rethook.c
@@ -36,21 +36,17 @@ void rethook_flush_task(struct task_struct *tk)
 static void rethook_free_rcu(struct rcu_head *head)
 {
 	struct rethook *rh = container_of(head, struct rethook, rcu);
-	struct rethook_node *rhn;
-	struct freelist_node *node;
-	int count = 1;
+	struct rethook_node *nod;
 
-	node = rh->pool.head;
-	while (node) {
-		rhn = container_of(node, struct rethook_node, freelist);
-		node = node->next;
-		kfree(rhn);
-		count++;
-	}
-
-	/* The rh->ref is the number of pooled node + 1 */
-	if (refcount_sub_and_test(count, &rh->ref))
-		kfree(rh);
+	do {
+		nod = objpool_pop(&rh->pool);
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&rh->ref)) {
+			objpool_fini(&rh->pool);
+			kfree(rh);
+			break;
+		}
+	} while (nod);
 }
 
 /**
@@ -70,16 +66,28 @@ void rethook_free(struct rethook *rh)
 	call_rcu(&rh->rcu, rethook_free_rcu);
 }
 
+static int rethook_init_node(void *context, void *nod)
+{
+	struct rethook_node *node = nod;
+
+	node->rethook = context;
+	return 0;
+}
+
 /**
  * rethook_alloc() - Allocate struct rethook.
  * @data: a data to pass the @handler when hooking the return.
  * @handler: the return hook callback function.
+ * @gfp: default gfp for objpool allocation
+ * @size: rethook node size
+ * @max: number of rethook nodes to be preallocated
  *
  * Allocate and initialize a new rethook with @data and @handler.
  * Return NULL if memory allocation fails or @handler is NULL.
  * Note that @handler == NULL means this rethook is going to be freed.
  */
-struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
+struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp,
+			      int size, int max)
 {
 	struct rethook *rh = kzalloc(sizeof(struct rethook), GFP_KERNEL);
 
@@ -88,34 +96,26 @@ struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
 
 	rh->data = data;
 	rh->handler = handler;
-	rh->pool.head = NULL;
-	refcount_set(&rh->ref, 1);
 
+	/* initialize the objpool for rethook nodes */
+	if (objpool_init(&rh->pool, max, max, size, gfp, rh, rethook_init_node,
+			NULL)) {
+		kfree(rh);
+		return NULL;
+	}
+	refcount_set(&rh->ref, max + 1);
 	return rh;
 }
 
-/**
- * rethook_add_node() - Add a new node to the rethook.
- * @rh: the struct rethook.
- * @node: the struct rethook_node to be added.
- *
- * Add @node to @rh. User must allocate @node (as a part of user's
- * data structure.) The @node fields are initialized in this function.
- */
-void rethook_add_node(struct rethook *rh, struct rethook_node *node)
-{
-	node->rethook = rh;
-	freelist_add(&node->freelist, &rh->pool);
-	refcount_inc(&rh->ref);
-}
-
 static void free_rethook_node_rcu(struct rcu_head *head)
 {
 	struct rethook_node *node = container_of(head, struct rethook_node, rcu);
+	struct rethook *rh = node->rethook;
 
-	if (refcount_dec_and_test(&node->rethook->ref))
-		kfree(node->rethook);
-	kfree(node);
+	if (refcount_dec_and_test(&rh->ref)) {
+		objpool_fini(&rh->pool);
+		kfree(rh);
+	}
 }
 
 /**
@@ -130,7 +130,7 @@ void rethook_recycle(struct rethook_node *node)
 	lockdep_assert_preemption_disabled();
 
 	if (likely(READ_ONCE(node->rethook->handler)))
-		freelist_add(&node->freelist, &node->rethook->pool);
+		objpool_push(node, &node->rethook->pool);
 	else
 		call_rcu(&node->rcu, free_rethook_node_rcu);
 }
@@ -146,7 +146,7 @@ NOKPROBE_SYMBOL(rethook_recycle);
 struct rethook_node *rethook_try_get(struct rethook *rh)
 {
 	rethook_handler_t handler = READ_ONCE(rh->handler);
-	struct freelist_node *fn;
+	struct rethook_node *nod;
 
 	lockdep_assert_preemption_disabled();
 
@@ -163,11 +163,11 @@ struct rethook_node *rethook_try_get(struct rethook *rh)
 	if (unlikely(!rcu_is_watching()))
 		return NULL;
 
-	fn = freelist_try_get(&rh->pool);
-	if (!fn)
+	nod = (struct rethook_node *)objpool_pop(&rh->pool);
+	if (!nod)
 		return NULL;
 
-	return container_of(fn, struct rethook_node, freelist);
+	return nod;
 }
 NOKPROBE_SYMBOL(rethook_try_get);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 4/4] kprobes: freelist.h removed
  2022-11-06  5:34   ` [PATCH v5 0/4] lib,kprobes: " wuqiang
                       ` (2 preceding siblings ...)
  2022-11-06  5:34     ` [PATCH v5 3/4] kprobes: kretprobe scalability improvement with objpool wuqiang
@ 2022-11-06  5:34     ` wuqiang
  3 siblings, 0 replies; 23+ messages in thread
From: wuqiang @ 2022-11-06  5:34 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu, wuqiang

This patch will remove freelist.h from kernel source tree, since the
only use cases (kretprobe and rethook) are converted to objpool.

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 include/linux/freelist.h | 129 ---------------------------------------
 1 file changed, 129 deletions(-)
 delete mode 100644 include/linux/freelist.h

diff --git a/include/linux/freelist.h b/include/linux/freelist.h
deleted file mode 100644
index fc1842b96469..000000000000
--- a/include/linux/freelist.h
+++ /dev/null
@@ -1,129 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */
-#ifndef FREELIST_H
-#define FREELIST_H
-
-#include <linux/atomic.h>
-
-/*
- * Copyright: cameron@moodycamel.com
- *
- * A simple CAS-based lock-free free list. Not the fastest thing in the world
- * under heavy contention, but simple and correct (assuming nodes are never
- * freed until after the free list is destroyed), and fairly speedy under low
- * contention.
- *
- * Adapted from: https://moodycamel.com/blog/2014/solving-the-aba-problem-for-lock-free-free-lists
- */
-
-struct freelist_node {
-	atomic_t		refs;
-	struct freelist_node	*next;
-};
-
-struct freelist_head {
-	struct freelist_node	*head;
-};
-
-#define REFS_ON_FREELIST 0x80000000
-#define REFS_MASK	 0x7FFFFFFF
-
-static inline void __freelist_add(struct freelist_node *node, struct freelist_head *list)
-{
-	/*
-	 * Since the refcount is zero, and nobody can increase it once it's
-	 * zero (except us, and we run only one copy of this method per node at
-	 * a time, i.e. the single thread case), then we know we can safely
-	 * change the next pointer of the node; however, once the refcount is
-	 * back above zero, then other threads could increase it (happens under
-	 * heavy contention, when the refcount goes to zero in between a load
-	 * and a refcount increment of a node in try_get, then back up to
-	 * something non-zero, then the refcount increment is done by the other
-	 * thread) -- so if the CAS to add the node to the actual list fails,
-	 * decrese the refcount and leave the add operation to the next thread
-	 * who puts the refcount back to zero (which could be us, hence the
-	 * loop).
-	 */
-	struct freelist_node *head = READ_ONCE(list->head);
-
-	for (;;) {
-		WRITE_ONCE(node->next, head);
-		atomic_set_release(&node->refs, 1);
-
-		if (!try_cmpxchg_release(&list->head, &head, node)) {
-			/*
-			 * Hmm, the add failed, but we can only try again when
-			 * the refcount goes back to zero.
-			 */
-			if (atomic_fetch_add_release(REFS_ON_FREELIST - 1, &node->refs) == 1)
-				continue;
-		}
-		return;
-	}
-}
-
-static inline void freelist_add(struct freelist_node *node, struct freelist_head *list)
-{
-	/*
-	 * We know that the should-be-on-freelist bit is 0 at this point, so
-	 * it's safe to set it using a fetch_add.
-	 */
-	if (!atomic_fetch_add_release(REFS_ON_FREELIST, &node->refs)) {
-		/*
-		 * Oh look! We were the last ones referencing this node, and we
-		 * know we want to add it to the free list, so let's do it!
-		 */
-		__freelist_add(node, list);
-	}
-}
-
-static inline struct freelist_node *freelist_try_get(struct freelist_head *list)
-{
-	struct freelist_node *prev, *next, *head = smp_load_acquire(&list->head);
-	unsigned int refs;
-
-	while (head) {
-		prev = head;
-		refs = atomic_read(&head->refs);
-		if ((refs & REFS_MASK) == 0 ||
-		    !atomic_try_cmpxchg_acquire(&head->refs, &refs, refs+1)) {
-			head = smp_load_acquire(&list->head);
-			continue;
-		}
-
-		/*
-		 * Good, reference count has been incremented (it wasn't at
-		 * zero), which means we can read the next and not worry about
-		 * it changing between now and the time we do the CAS.
-		 */
-		next = READ_ONCE(head->next);
-		if (try_cmpxchg_acquire(&list->head, &head, next)) {
-			/*
-			 * Yay, got the node. This means it was on the list,
-			 * which means should-be-on-freelist must be false no
-			 * matter the refcount (because nobody else knows it's
-			 * been taken off yet, it can't have been put back on).
-			 */
-			WARN_ON_ONCE(atomic_read(&head->refs) & REFS_ON_FREELIST);
-
-			/*
-			 * Decrease refcount twice, once for our ref, and once
-			 * for the list's ref.
-			 */
-			atomic_fetch_add(-2, &head->refs);
-
-			return head;
-		}
-
-		/*
-		 * OK, the head must have changed on us, but we still need to decrement
-		 * the refcount we increased.
-		 */
-		refs = atomic_fetch_add(-1, &prev->refs);
-		if (refs == REFS_ON_FREELIST + 1)
-			__freelist_add(prev, list);
-	}
-
-	return NULL;
-}
-
-#endif /* FREELIST_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 0/4] lib,kprobes: kretprobe scalability improvement
  2022-11-02  2:30 ` [PATCH v4] " wuqiang
                     ` (2 preceding siblings ...)
  2022-11-06  5:34   ` [PATCH v5 0/4] lib,kprobes: " wuqiang
@ 2022-11-08  7:14   ` wuqiang
  2022-11-08  7:14     ` [PATCH v6 1/4] lib: objpool added: ring-array based lockless MPMC queue wuqiang
                       ` (3 more replies)
  3 siblings, 4 replies; 23+ messages in thread
From: wuqiang @ 2022-11-08  7:14 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu, wuqiang

This patch series introduces a scalable and lockless ring-array based
object pool and replaces the original freelist (a LIFO queue based on
singly linked list) to improve the scalability of kretprobed routines.

Changes from v5 (https://lore.kernel.org/lkml/20221106053441.103199-1-wuqiang.matt@bytedance.com/):
  1) PATCH 2/4: test_objpool: build warnings with [-Wmissing-prototypes]
  2) PATCH 3/4: fprobe.c: conflicts resolved for linux-6.1-rc4 merging

wuqiang (4):
  lib: objpool added: ring-array based lockless MPMC queue
  lib: objpool test module added
  kprobes: kretprobe scalability improvement with objpool
  kprobes: freelist.h removed

 include/linux/freelist.h |  129 -----
 include/linux/kprobes.h  |    9 +-
 include/linux/objpool.h  |  153 ++++++
 include/linux/rethook.h  |   15 +-
 kernel/kprobes.c         |   95 ++--
 kernel/trace/fprobe.c    |   17 +-
 kernel/trace/rethook.c   |   80 +--
 lib/Kconfig.debug        |   11 +
 lib/Makefile             |    4 +-
 lib/objpool.c            |  487 ++++++++++++++++++
 lib/test_objpool.c       | 1052 ++++++++++++++++++++++++++++++++++++++
 11 files changed, 1801 insertions(+), 251 deletions(-)
 delete mode 100644 include/linux/freelist.h
 create mode 100644 include/linux/objpool.h
 create mode 100644 lib/objpool.c
 create mode 100644 lib/test_objpool.c

-- 
2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v6 1/4] lib: objpool added: ring-array based lockless MPMC queue
  2022-11-08  7:14   ` [PATCH v6 0/4] lib,kprobes: kretprobe scalability improvement wuqiang
@ 2022-11-08  7:14     ` wuqiang
  2022-11-14 15:54       ` Masami Hiramatsu
  2022-11-08  7:14     ` [PATCH v6 2/4] lib: objpool test module added wuqiang
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: wuqiang @ 2022-11-08  7:14 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu, wuqiang

The object pool is a scalable implementaion of high performance queue
for objects allocation and reclamation, such as kretprobe instances.

With leveraging per-cpu ring-array to mitigate the hot spots of memory
contention, it could deliver near-linear scalability for high parallel
scenarios. The ring-array is compactly managed in a single cache-line
to benefit from warmed L1 cache for most cases (<= 4 objects per-core).
The body of pre-allocated objects is stored in continuous cache-lines
just after the ring-array.

The object pool is interrupt safe. Both allocation and reclamation
(object pop and push operations) can be preemptible or interruptable.

It's best suited for following cases:
1) Memory allocation or reclamation are prohibited or too expensive
2) Consumers are of different priorities, such as irqs and threads

Limitations:
1) Maximum objects (capacity) is determined during pool initializing
2) The memory of objects won't be freed until the poll is finalized
3) Object allocation (pop) may fail after trying all cpu slots
4) Object reclamation (push) won't fail but may take long time to
   finish for imbalanced scenarios. You can try larger max_entries
   to mitigate, or ( >= CPUS * nr_objs) to avoid

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 include/linux/objpool.h | 153 +++++++++++++
 lib/Makefile            |   2 +-
 lib/objpool.c           | 487 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 641 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/objpool.h
 create mode 100644 lib/objpool.c

diff --git a/include/linux/objpool.h b/include/linux/objpool.h
new file mode 100644
index 000000000000..7899b054b50c
--- /dev/null
+++ b/include/linux/objpool.h
@@ -0,0 +1,153 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_OBJPOOL_H
+#define _LINUX_OBJPOOL_H
+
+#include <linux/types.h>
+
+/*
+ * objpool: ring-array based lockless MPMC queue
+ *
+ * Copyright: wuqiang.matt@bytedance.com
+ *
+ * The object pool is a scalable implementaion of high performance queue
+ * for objects allocation and reclamation, such as kretprobe instances.
+ *
+ * With leveraging per-cpu ring-array to mitigate the hot spots of memory
+ * contention, it could deliver near-linear scalability for high parallel
+ * scenarios. The ring-array is compactly managed in a single cache-line
+ * to benefit from warmed L1 cache for most cases (<= 4 objects per-core).
+ * The body of pre-allocated objects is stored in continuous cache-lines
+ * just after the ring-array.
+ *
+ * The object pool is interrupt safe. Both allocation and reclamation
+ * (object pop and push operations) can be preemptible or interruptable.
+ *
+ * It's best suited for following cases:
+ * 1) Memory allocation or reclamation are prohibited or too expensive
+ * 2) Consumers are of different priorities, such as irqs and threads
+ *
+ * Limitations:
+ * 1) Maximum objects (capacity) is determined during pool initializing
+ * 2) The memory of objects won't be freed until the poll is finalized
+ * 3) Object allocation (pop) may fail after trying all cpu slots
+ * 4) Object reclamation (push) won't fail but may take long time to
+ *    finish for imbalanced scenarios. You can try larger max_entries
+ *    to mitigate, or ( >= CPUS * nr_objs) to avoid
+ */
+
+/*
+ * objpool_slot: per-cpu ring array
+ *
+ * Represents a cpu-local array-based ring buffer, its size is specialized
+ * during initialization of object pool.
+ *
+ * The objpool_slot is allocated from local memory for NUMA system, and to
+ * be kept compact in a single cacheline. ages[] is stored just after the
+ * body of objpool_slot, and then entries[]. The Array of ages[] describes
+ * revision of each item, solely used to avoid ABA. And array of entries[]
+ * contains the pointers of objects.
+ *
+ * The default size of objpool_slot is a single cache-line, aka. 64 bytes.
+ *
+ * 64bit:
+ *        4      8      12     16        32                 64
+ * | head | tail | size | mask | ages[4] | ents[4]: (8 * 4) | objects
+ *
+ * 32bit:
+ *        4      8      12     16        32        48       64
+ * | head | tail | size | mask | ages[4] | ents[4] | unused | objects
+ *
+ */
+
+struct objpool_slot {
+	uint32_t                head;	/* head of ring array */
+	uint32_t                tail;	/* tail of ring array */
+	uint32_t                size;	/* array size, pow of 2 */
+	uint32_t                mask;	/* size - 1 */
+} __attribute__((packed));
+
+/* caller-specified object initial callback to setup each object, only called once */
+typedef int (*objpool_init_obj_cb)(void *context, void *obj);
+
+/* caller-specified cleanup callback for private objects/pool/context */
+typedef int (*objpool_release_cb)(void *context, void *ptr, uint32_t flags);
+
+/* called for object releasing: ptr points to an object */
+#define OBJPOOL_FLAG_NODE        (0x00000001)
+/* for user pool and context releasing, ptr could be NULL */
+#define OBJPOOL_FLAG_POOL        (0x00001000)
+/* the object or pool to be released is user-managed */
+#define OBJPOOL_FLAG_USER        (0x00008000)
+
+/*
+ * objpool_head: object pooling metadata
+ */
+
+struct objpool_head {
+	unsigned int            obj_size;	/* object & element size */
+	unsigned int            nr_objs;	/* total objs (to be pre-allocated) */
+	unsigned int            nr_cpus;	/* num of possible cpus */
+	unsigned int            capacity;	/* max objects per cpuslot */
+	unsigned long           flags;		/* flags for objpool management */
+	gfp_t                   gfp;		/* gfp flags for kmalloc & vmalloc */
+	unsigned int            pool_size;	/* user pool size in byes */
+	void                   *pool;		/* user managed memory pool */
+	struct objpool_slot   **cpu_slots;	/* array of percpu slots */
+	unsigned int           *slot_sizes;	/* size in bytes of slots */
+	objpool_release_cb      release;	/* resource cleanup callback */
+	void                   *context;	/* caller-provided context */
+};
+
+#define OBJPOOL_FROM_VMALLOC	(0x800000000)	/* objpool allocated from vmalloc area */
+#define OBJPOOL_HAVE_OBJECTS	(0x400000000)	/* objects allocated along with objpool */
+
+/* initialize object pool and pre-allocate objects */
+int objpool_init(struct objpool_head *head, unsigned int nr_objs,
+		 unsigned int max_objs, unsigned int object_size,
+		 gfp_t gfp, void *context, objpool_init_obj_cb objinit,
+		 objpool_release_cb release);
+
+/* add objects in batch from user provided pool */
+int objpool_populate(struct objpool_head *head, void *pool,
+		     unsigned int size, unsigned int object_size,
+		     void *context, objpool_init_obj_cb objinit);
+
+/* add pre-allocated object (managed by user) to objpool */
+int objpool_add(void *obj, struct objpool_head *head);
+
+/* allocate an object from objects pool */
+void *objpool_pop(struct objpool_head *head);
+
+/* reclaim an object to objects pool */
+int objpool_push(void *node, struct objpool_head *head);
+
+/* cleanup the whole object pool (objects including) */
+void objpool_fini(struct objpool_head *head);
+
+/* whether the object is pre-allocated with percpu slots */
+static inline int objpool_is_inslot(void *obj, struct objpool_head *head)
+{
+	void *slot;
+	int i;
+
+	if (!obj || !(head->flags & OBJPOOL_HAVE_OBJECTS))
+		return 0;
+
+	for (i = 0; i < head->nr_cpus; i++) {
+		slot = head->cpu_slots[i];
+		if (obj >= slot && obj < slot + head->slot_sizes[i])
+			return 1;
+	}
+
+	return 0;
+}
+
+/* whether the object is from user pool (batched adding) */
+static inline int objpool_is_inpool(void *obj, struct objpool_head *head)
+{
+	return (obj && head->pool && obj >= head->pool &&
+		obj < head->pool + head->pool_size);
+}
+
+#endif /* _LINUX_OBJPOOL_H */
diff --git a/lib/Makefile b/lib/Makefile
index 161d6a724ff7..e938703a321f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -34,7 +34,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
 	 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
 	 nmi_backtrace.o win_minmax.o memcat_p.o \
-	 buildid.o
+	 buildid.o objpool.o
 
 lib-$(CONFIG_PRINTK) += dump_stack.o
 lib-$(CONFIG_SMP) += cpumask.o
diff --git a/lib/objpool.c b/lib/objpool.c
new file mode 100644
index 000000000000..ecffa0795f3d
--- /dev/null
+++ b/lib/objpool.c
@@ -0,0 +1,487 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/objpool.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/atomic.h>
+#include <linux/prefetch.h>
+
+/*
+ * objpool: ring-array based lockless MPMC/FIFO queues
+ *
+ * Copyright: wuqiang.matt@bytedance.com
+ */
+
+/* compute the suitable num of objects to be managed by slot */
+static inline unsigned int __objpool_num_of_objs(unsigned int size)
+{
+	return rounddown_pow_of_two((size - sizeof(struct objpool_slot)) /
+			(sizeof(uint32_t) + sizeof(void *)));
+}
+
+#define SLOT_AGES(s) ((uint32_t *)((char *)(s) + sizeof(struct objpool_slot)))
+#define SLOT_ENTS(s) ((void **)((char *)(s) + sizeof(struct objpool_slot) + \
+			sizeof(uint32_t) * (s)->size))
+#define SLOT_OBJS(s) ((void *)((char *)(s) + sizeof(struct objpool_slot) + \
+			(sizeof(uint32_t) + sizeof(void *)) * (s)->size))
+
+/* allocate and initialize percpu slots */
+static inline int
+__objpool_init_percpu_slots(struct objpool_head *head, unsigned int nobjs,
+			void *context, objpool_init_obj_cb objinit)
+{
+	unsigned int i, j, n, size, objsz, nents = head->capacity;
+
+	/* aligned object size by sizeof(void *) */
+	objsz = ALIGN(head->obj_size, sizeof(void *));
+	/* shall we allocate objects along with objpool_slot */
+	if (objsz)
+		head->flags |= OBJPOOL_HAVE_OBJECTS;
+
+	for (i = 0; i < head->nr_cpus; i++) {
+		struct objpool_slot *os;
+
+		/* compute how many objects to be managed by this slot */
+		n = nobjs / head->nr_cpus;
+		if (i < (nobjs % head->nr_cpus))
+			n++;
+		size = sizeof(struct objpool_slot) + sizeof(void *) * nents +
+		       sizeof(uint32_t) * nents + objsz * n;
+
+		/* decide memory area for cpu-slot allocation */
+		if (!i && !(head->gfp & GFP_ATOMIC) && size > PAGE_SIZE / 2)
+			head->flags |= OBJPOOL_FROM_VMALLOC;
+
+		/* allocate percpu slot & objects from local memory */
+		if (head->flags & OBJPOOL_FROM_VMALLOC)
+			os = __vmalloc_node(size, sizeof(void *), head->gfp,
+				cpu_to_node(i), __builtin_return_address(0));
+		else
+			os = kmalloc_node(size, head->gfp, cpu_to_node(i));
+		if (!os)
+			return -ENOMEM;
+
+		/* initialize percpu slot for the i-th cpu */
+		memset(os, 0, size);
+		os->size = head->capacity;
+		os->mask = os->size - 1;
+		head->cpu_slots[i] = os;
+		head->slot_sizes[i] = size;
+
+		/*
+		 * start from 2nd round to avoid conflict of 1st item.
+		 * we assume that the head item is ready for retrieval
+		 * iff head is equal to ages[head & mask]. but ages is
+		 * initialized as 0, so in view of the caller of pop(),
+		 * the 1st item (0th) is always ready, but fact could
+		 * be: push() is stalled before the final update, thus
+		 * the item being inserted will be lost forever.
+		 */
+		os->head = os->tail = head->capacity;
+
+		if (!objsz)
+			continue;
+
+		for (j = 0; j < n; j++) {
+			uint32_t *ages = SLOT_AGES(os);
+			void **ents = SLOT_ENTS(os);
+			void *obj = SLOT_OBJS(os) + j * objsz;
+			uint32_t ie = os->tail & os->mask;
+
+			/* perform object initialization */
+			if (objinit) {
+				int rc = objinit(context, obj);
+				if (rc)
+					return rc;
+			}
+
+			/* add obj into the ring array */
+			ents[ie] = obj;
+			ages[ie] = os->tail;
+			os->tail++;
+			head->nr_objs++;
+		}
+	}
+
+	return 0;
+}
+
+/* cleanup all percpu slots of the object pool */
+static inline void __objpool_fini_percpu_slots(struct objpool_head *head)
+{
+	unsigned int i;
+
+	if (!head->cpu_slots)
+		return;
+
+	for (i = 0; i < head->nr_cpus; i++) {
+		if (!head->cpu_slots[i])
+			continue;
+		if (head->flags & OBJPOOL_FROM_VMALLOC)
+			vfree(head->cpu_slots[i]);
+		else
+			kfree(head->cpu_slots[i]);
+	}
+	kfree(head->cpu_slots);
+	head->cpu_slots = NULL;
+	head->slot_sizes = NULL;
+}
+
+/**
+ * objpool_init: initialize object pool and pre-allocate objects
+ *
+ * args:
+ * @head:    the object pool to be initialized, declared by caller
+ * @nr_objs: total objects to be pre-allocated by this object pool
+ * @max_objs: max entries (object pool capacity), use nr_objs if 0
+ * @object_size: size of an object, no objects pre-allocated if 0
+ * @gfp:     flags for memory allocation (via kmalloc or vmalloc)
+ * @context: user context for object initialization callback
+ * @objinit: object initialization callback for extra setting-up
+ * @release: cleanup callback for private objects/pool/context
+ *
+ * return:
+ *         0 for success, otherwise error code
+ *
+ * All pre-allocated objects are to be zeroed. Caller could do extra
+ * initialization in objinit callback. The objinit callback will be
+ * called once and only once after the slot allocation. Then objpool
+ * won't touch any content of the objects since then. It's caller's
+ * duty to perform reinitialization after object allocation (pop) or
+ * clearance before object reclamation (push) if required.
+ */
+int objpool_init(struct objpool_head *head, unsigned int nr_objs,
+		unsigned int max_objs, unsigned int object_size,
+		gfp_t gfp, void *context, objpool_init_obj_cb objinit,
+		objpool_release_cb release)
+{
+	unsigned int nents, ncpus = num_possible_cpus();
+	int rc;
+
+	/* calculate percpu slot size (rounded to pow of 2) */
+	if (max_objs < nr_objs)
+		max_objs = nr_objs;
+	nents = max_objs / ncpus;
+	if (nents < __objpool_num_of_objs(L1_CACHE_BYTES))
+		nents = __objpool_num_of_objs(L1_CACHE_BYTES);
+	nents = roundup_pow_of_two(nents);
+	while (nents * ncpus < nr_objs)
+		nents = nents << 1;
+
+	memset(head, 0, sizeof(struct objpool_head));
+	head->nr_cpus = ncpus;
+	head->obj_size = object_size;
+	head->capacity = nents;
+	head->gfp = gfp & ~__GFP_ZERO;
+	head->context = context;
+	head->release = release;
+
+	/* allocate array for percpu slots */
+	head->cpu_slots = kzalloc(head->nr_cpus * sizeof(void *) +
+			       head->nr_cpus * sizeof(uint32_t), head->gfp);
+	if (!head->cpu_slots)
+		return -ENOMEM;
+	head->slot_sizes = (uint32_t *)&head->cpu_slots[head->nr_cpus];
+
+	/* initialize per-cpu slots */
+	rc = __objpool_init_percpu_slots(head, nr_objs, context, objinit);
+	if (rc)
+		__objpool_fini_percpu_slots(head);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(objpool_init);
+
+/* adding object to slot tail, the given slot must NOT be full */
+static inline int __objpool_add_slot(void *obj, struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	uint32_t tail = atomic_inc_return((atomic_t *)&os->tail) - 1;
+
+	WRITE_ONCE(ents[tail & os->mask], obj);
+
+	/* order matters: obj must be updated before tail updating */
+	smp_store_release(&ages[tail & os->mask], tail);
+	return 0;
+}
+
+/* adding object to slot, abort if the slot was already full */
+static inline int __objpool_try_add_slot(void *obj, struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	uint32_t head, tail;
+
+	do {
+		/* perform memory loading for both head and tail */
+		head = READ_ONCE(os->head);
+		tail = READ_ONCE(os->tail);
+		/* just abort if slot is full */
+		if (tail >= head + os->size)
+			return -ENOENT;
+		/* try to extend tail by 1 using CAS to avoid races */
+		if (try_cmpxchg_acquire(&os->tail, &tail, tail + 1))
+			break;
+	} while (1);
+
+	/* the tail-th of slot is reserved for the given obj */
+	WRITE_ONCE(ents[tail & os->mask], obj);
+	/* update epoch id to make this object available for pop() */
+	smp_store_release(&ages[tail & os->mask], tail);
+	return 0;
+}
+
+/**
+ * objpool_populate: add objects from user provided pool in batch
+ *
+ * args:
+ * @head:  object pool
+ * @pool: user buffer for pre-allocated objects
+ * @size: size of user buffer
+ * @object_size: size of object & element
+ * @context: user context for objinit callback
+ * @objinit: object initialization callback
+ *
+ * return: 0 or error code
+ */
+int objpool_populate(struct objpool_head *head, void *pool,
+		unsigned int size, unsigned int object_size,
+		void *context, objpool_init_obj_cb objinit)
+{
+	unsigned int n = head->nr_objs, used = 0, i;
+
+	if (head->pool || !pool || size < object_size)
+		return -EINVAL;
+	if (head->obj_size && head->obj_size != object_size)
+		return -EINVAL;
+	if (head->context && context && head->context != context)
+		return -EINVAL;
+	if (head->nr_objs >= head->nr_cpus * head->capacity)
+		return -ENOENT;
+
+	WARN_ON_ONCE(((unsigned long)pool) & (sizeof(void *) - 1));
+	WARN_ON_ONCE(((uint32_t)object_size) & (sizeof(void *) - 1));
+
+	/* align object size by sizeof(void *) */
+	head->obj_size = object_size;
+	object_size = ALIGN(object_size, sizeof(void *));
+	if (object_size == 0)
+		return -EINVAL;
+
+	while (used + object_size <= size) {
+		void *obj = pool + used;
+
+		/* perform object initialization */
+		if (objinit) {
+			int rc = objinit(context, obj);
+			if (rc)
+				return rc;
+		}
+
+		/* insert obj to its corresponding objpool slot */
+		i = (n + used * head->nr_cpus/size) % head->nr_cpus;
+		if (!__objpool_try_add_slot(obj, head->cpu_slots[i]))
+			head->nr_objs++;
+
+		used += object_size;
+	}
+
+	if (!used)
+		return -ENOENT;
+
+	head->context = context;
+	head->pool = pool;
+	head->pool_size = size;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(objpool_populate);
+
+/**
+ * objpool_add: add pre-allocated object to objpool during pool
+ * initialization
+ *
+ * args:
+ * @obj:  object pointer to be added to objpool
+ * @head: object pool to be inserted into
+ *
+ * return:
+ *     0 or error code
+ *
+ * objpool_add_node doesn't handle race conditions, can only be
+ * called during objpool initialization
+ */
+int objpool_add(void *obj, struct objpool_head *head)
+{
+	unsigned int i, cpu;
+
+	if (!obj)
+		return -EINVAL;
+	if (head->nr_objs >= head->nr_cpus * head->capacity)
+		return -ENOENT;
+
+	cpu = head->nr_objs % head->nr_cpus;
+	for (i = 0; i < head->nr_cpus; i++) {
+		if (!__objpool_try_add_slot(obj, head->cpu_slots[cpu])) {
+			head->nr_objs++;
+			return 0;
+		}
+
+		if (++cpu >= head->nr_cpus)
+			cpu = 0;
+	}
+
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(objpool_add);
+
+/**
+ * objpool_push: reclaim the object and return back to objects pool
+ *
+ * args:
+ * @obj:  object pointer to be pushed to object pool
+ * @head: object pool
+ *
+ * return:
+ *     0 or error code: it fails only when objects pool are full
+ *
+ * objpool_push is non-blockable, and can be nested
+ */
+int objpool_push(void *obj, struct objpool_head *head)
+{
+	unsigned int cpu = raw_smp_processor_id() % head->nr_cpus;
+
+	do {
+		if (head->nr_objs > head->capacity) {
+			if (!__objpool_try_add_slot(obj, head->cpu_slots[cpu]))
+				return 0;
+		} else {
+			if (!__objpool_add_slot(obj, head->cpu_slots[cpu]))
+				return 0;
+		}
+		if (++cpu >= head->nr_cpus)
+			cpu = 0;
+	} while (1);
+
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(objpool_push);
+
+/* try to retrieve object from slot */
+static inline void *__objpool_try_get_slot(struct objpool_slot *os)
+{
+	uint32_t *ages = SLOT_AGES(os);
+	void **ents = SLOT_ENTS(os);
+	/* do memory load of head to local head */
+	uint32_t head = smp_load_acquire(&os->head);
+
+	/* loop if slot isn't empty */
+	while (head != READ_ONCE(os->tail)) {
+		uint32_t id = head & os->mask, prev = head;
+
+		/* do prefetching of object ents */
+		prefetch(&ents[id]);
+
+		/*
+		 * check whether this item was ready for retrieval ? There's
+		 * possibility * in theory * we might retrieve wrong object,
+		 * in case ages[id] overflows when current task is sleeping,
+		 * but it will take very very long to overflow an uint32_t
+		 */
+		if (smp_load_acquire(&ages[id]) == head) {
+			/* node must have been udpated by push() */
+			void *node = READ_ONCE(ents[id]);
+			/* commit and move forward head of the slot */
+			if (try_cmpxchg_release(&os->head, &head, head + 1))
+				return node;
+		}
+
+		/* re-load head from memory continue trying */
+		head = READ_ONCE(os->head);
+		/*
+		 * head stays unchanged, so it's very likely current pop()
+		 * just preempted/interrupted an ongoing push() operation
+		 */
+		if (head == prev)
+			break;
+	}
+
+	return NULL;
+}
+
+/**
+ * objpool_pop: allocate an object from objects pool
+ *
+ * args:
+ * @oh:  object pool
+ *
+ * return:
+ *   object: NULL if failed (object pool is empty)
+ *
+ * objpool_pop can be nested, so can be used in any context.
+ */
+void *objpool_pop(struct objpool_head *head)
+{
+	unsigned int i, cpu;
+	void *obj = NULL;
+
+	cpu = raw_smp_processor_id() % head->nr_cpus;
+	for (i = 0; i < head->nr_cpus; i++) {
+		struct objpool_slot *slot = head->cpu_slots[cpu];
+		obj = __objpool_try_get_slot(slot);
+		if (obj)
+			break;
+		if (++cpu >= head->nr_cpus)
+			cpu = 0;
+	}
+
+	return obj;
+}
+EXPORT_SYMBOL_GPL(objpool_pop);
+
+/**
+ * objpool_fini: cleanup the whole object pool (releasing all objects)
+ *
+ * args:
+ * @head: object pool to be released
+ *
+ */
+void objpool_fini(struct objpool_head *head)
+{
+	uint32_t i, flags;
+
+	if (!head->cpu_slots)
+		return;
+
+	if (!head->release) {
+		__objpool_fini_percpu_slots(head);
+		return;
+	}
+
+	/* cleanup all objects remained in objpool */
+	for (i = 0; i < head->nr_cpus; i++) {
+		void *obj;
+		do {
+			flags = OBJPOOL_FLAG_NODE;
+			obj = __objpool_try_get_slot(head->cpu_slots[i]);
+			if (!obj)
+				break;
+			if (!objpool_is_inpool(obj, head) &&
+			    !objpool_is_inslot(obj, head)) {
+				flags |= OBJPOOL_FLAG_USER;
+			}
+			head->release(head->context, obj, flags);
+		} while (obj);
+	}
+
+	/* release percpu slots */
+	__objpool_fini_percpu_slots(head);
+
+	/* cleanup user private pool and related context */
+	flags = OBJPOOL_FLAG_POOL;
+	if (head->pool)
+		flags |= OBJPOOL_FLAG_USER;
+	head->release(head->context, head->pool, flags);
+}
+EXPORT_SYMBOL_GPL(objpool_fini);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 2/4] lib: objpool test module added
  2022-11-08  7:14   ` [PATCH v6 0/4] lib,kprobes: kretprobe scalability improvement wuqiang
  2022-11-08  7:14     ` [PATCH v6 1/4] lib: objpool added: ring-array based lockless MPMC queue wuqiang
@ 2022-11-08  7:14     ` wuqiang
  2022-11-08  7:14     ` [PATCH v6 3/4] kprobes: kretprobe scalability improvement with objpool wuqiang
  2022-11-08  7:14     ` [PATCH v6 4/4] kprobes: freelist.h removed wuqiang
  3 siblings, 0 replies; 23+ messages in thread
From: wuqiang @ 2022-11-08  7:14 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu, wuqiang

The test_objpool module (test_objpool) will run serveral testcases
for objpool stress and performance evaluation. Each testcase will
have all available cpu cores involved to create a situation of high
parallel and high contention.

As of now there are 3 groups and 3 * 6 testcases in total:

1) group 1: synchronous mode
   objpool is managed synchronously, that is, all objects are to be
   reclaimed before objpool finalization and the objpool owner makes
   sure of it. All threads on different cores run in the same pace.
2) group 2: synchronous + miss mode
   This test group is mainly for performance evaluation of missing
   cases when pre-allocated objects are less than the requsted.
3) group 3: asynchronous mode
   This case is just an emulation of kretprobe. The objpool owner
   has no control of the object after it's allocated. hrtimer irq
   is introduced to stress objpool with thread preemption.

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 lib/Kconfig.debug  |   11 +
 lib/Makefile       |    2 +
 lib/test_objpool.c | 1052 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1065 insertions(+)
 create mode 100644 lib/test_objpool.c

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 29280072dc0e..0749335d79db 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2738,6 +2738,17 @@ config TEST_CLOCKSOURCE_WATCHDOG
 
 	  If unsure, say N.
 
+config TEST_OBJPOOL
+	tristate "Test module for correctness and stress of objpool"
+	default n
+       depends on m
+	help
+	  This builds the "test_objpool" module that should be used for
+	  correctness verification and concurrent testings of objects
+	  allocation and reclamation.
+
+	  If unsure, say N.
+
 endif # RUNTIME_TESTING_MENU
 
 config ARCH_USE_MEMTEST
diff --git a/lib/Makefile b/lib/Makefile
index e938703a321f..4aa282fa0cfc 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -99,6 +99,8 @@ obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
 obj-$(CONFIG_TEST_REF_TRACKER) += test_ref_tracker.o
 CFLAGS_test_fprobe.o += $(CC_FLAGS_FTRACE)
 obj-$(CONFIG_FPROBE_SANITY_TEST) += test_fprobe.o
+obj-$(CONFIG_TEST_OBJPOOL) += test_objpool.o
+
 #
 # CFLAGS for compiling floating point code inside the kernel. x86/Makefile turns
 # off the generation of FPU/SSE* instructions for kernel proper but FPU_FLAGS
diff --git a/lib/test_objpool.c b/lib/test_objpool.c
new file mode 100644
index 000000000000..a4c1814ac3b7
--- /dev/null
+++ b/lib/test_objpool.c
@@ -0,0 +1,1052 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Test module for lockless object pool
+ * (C) 2022 Matt Wu <wuqiang.matt@bytedance.com>
+ */
+
+#include <linux/version.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/sched.h>
+#include <linux/cpumask.h>
+#include <linux/completion.h>
+#include <linux/kthread.h>
+#include <linux/cpu.h>
+#include <linux/cpuset.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/delay.h>
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
+#include <linux/objpool.h>
+
+#define OT_NR_MAX_BULK (16)
+
+struct ot_ctrl {
+	unsigned int mode; /* test no */
+	unsigned int objsz; /* object size */
+	unsigned int duration; /* ms */
+	unsigned int delay; /* ms */
+	unsigned int bulk_normal;
+	unsigned int bulk_irq;
+	unsigned long hrtimer; /* ms */
+	const char *name;
+};
+
+struct ot_stat {
+	unsigned long nhits;
+	unsigned long nmiss;
+};
+
+struct ot_item {
+	struct objpool_head *pool; /* pool head */
+	struct ot_ctrl *ctrl; /* ctrl parameters */
+
+	void (*worker)(struct ot_item *item, int irq);
+
+	/* hrtimer control */
+	ktime_t hrtcycle;
+	struct hrtimer hrtimer;
+
+	int bulk[2]; /* for thread and irq */
+	int delay;
+	u32 niters;
+
+	/* results summary */
+	struct ot_stat stat[2]; /* thread and irq */
+
+	u64 duration;
+};
+
+struct ot_mem_stat {
+	atomic_long_t alloc;
+	atomic_long_t free;
+};
+
+struct ot_data {
+	struct rw_semaphore start;
+	struct completion wait;
+	struct completion rcu;
+	atomic_t nthreads ____cacheline_aligned_in_smp;
+	atomic_t stop ____cacheline_aligned_in_smp;
+	struct ot_mem_stat kmalloc;
+	struct ot_mem_stat vmalloc;
+} g_ot_data;
+
+/*
+ * memory leakage checking
+ */
+
+static void *ot_kzalloc(long size)
+{
+	void *ptr = kzalloc(size, GFP_KERNEL);
+
+	if (ptr)
+		atomic_long_add(size, &g_ot_data.kmalloc.alloc);
+	return ptr;
+}
+
+static void ot_kfree(void *ptr, long size)
+{
+	if (!ptr)
+		return;
+	atomic_long_add(size, &g_ot_data.kmalloc.free);
+	kfree(ptr);
+}
+
+static void *ot_vmalloc(long size)
+{
+	void *ptr = vmalloc(size);
+
+	if (ptr)
+		atomic_long_add(size, &g_ot_data.vmalloc.alloc);
+	return ptr;
+}
+
+static void ot_vfree(void *ptr, long size)
+{
+	if (!ptr)
+		return;
+	atomic_long_add(size, &g_ot_data.vmalloc.free);
+	vfree(ptr);
+}
+
+static void ot_mem_report(struct ot_ctrl *ctrl)
+{
+	long alloc, free;
+
+	pr_info("memory allocation summary for %s\n", ctrl->name);
+
+	alloc = atomic_long_read(&g_ot_data.kmalloc.alloc);
+	free = atomic_long_read(&g_ot_data.kmalloc.free);
+	pr_info("  kmalloc: %lu - %lu = %lu\n", alloc, free, alloc - free);
+
+	alloc = atomic_long_read(&g_ot_data.vmalloc.alloc);
+	free = atomic_long_read(&g_ot_data.vmalloc.free);
+	pr_info("  vmalloc: %lu - %lu = %lu\n", alloc, free, alloc - free);
+}
+
+/*
+ * general structs & routines
+ */
+
+struct ot_node {
+	void *owner;
+	unsigned long data;
+	unsigned long refs;
+	unsigned long payload[32];
+};
+
+struct ot_context {
+	struct objpool_head pool; /* objpool head */
+	struct ot_ctrl *ctrl; /* ctrl parameters */
+	void *ptr; /* user pool buffer */
+	unsigned long size; /* buffer size */
+	refcount_t refs;
+	struct rcu_head rcu;
+};
+
+static DEFINE_PER_CPU(struct ot_item, ot_pcup_items);
+
+static int ot_init_data(struct ot_data *data)
+{
+	memset(data, 0, sizeof(*data));
+	init_rwsem(&data->start);
+	init_completion(&data->wait);
+	init_completion(&data->rcu);
+	atomic_set(&data->nthreads, 1);
+
+	return 0;
+}
+
+static void ot_reset_data(struct ot_data *data)
+{
+	reinit_completion(&data->wait);
+	reinit_completion(&data->rcu);
+	atomic_set(&data->nthreads, 1);
+	atomic_set(&data->stop, 0);
+	memset(&data->kmalloc, 0, sizeof(data->kmalloc));
+	memset(&data->vmalloc, 0, sizeof(data->vmalloc));
+}
+
+static int ot_init_node(void *context, void *nod)
+{
+	struct ot_context *sop = context;
+	struct ot_node *on = nod;
+
+	on->owner = &sop->pool;
+	return 0;
+}
+
+static enum hrtimer_restart ot_hrtimer_handler(struct hrtimer *hrt)
+{
+	struct ot_item *item = container_of(hrt, struct ot_item, hrtimer);
+
+	if (atomic_read_acquire(&g_ot_data.stop))
+		return HRTIMER_NORESTART;
+
+	/* do bulk-testings for objects pop/push */
+	item->worker(item, 1);
+
+	hrtimer_forward(hrt, hrt->base->get_time(), item->hrtcycle);
+	return HRTIMER_RESTART;
+}
+
+static void ot_start_hrtimer(struct ot_item *item)
+{
+	if (!item->ctrl->hrtimer)
+		return;
+	hrtimer_start(&item->hrtimer, item->hrtcycle, HRTIMER_MODE_REL);
+}
+
+static void ot_stop_hrtimer(struct ot_item *item)
+{
+	if (!item->ctrl->hrtimer)
+		return;
+	hrtimer_cancel(&item->hrtimer);
+}
+
+static int ot_init_hrtimer(struct ot_item *item, unsigned long hrtimer)
+{
+	struct hrtimer *hrt = &item->hrtimer;
+
+	if (!hrtimer)
+		return -ENOENT;
+
+	item->hrtcycle = ktime_set(0, hrtimer * 1000000UL);
+	hrtimer_init(hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrt->function = ot_hrtimer_handler;
+	return 0;
+}
+
+static int ot_init_cpu_item(struct ot_item *item,
+			struct ot_ctrl *ctrl,
+			struct objpool_head *pool,
+			void (*worker)(struct ot_item *, int))
+{
+	memset(item, 0, sizeof(*item));
+	item->pool = pool;
+	item->ctrl = ctrl;
+	item->worker = worker;
+
+	item->bulk[0] = ctrl->bulk_normal;
+	item->bulk[1] = ctrl->bulk_irq;
+	item->delay = ctrl->delay;
+
+	/* initialize hrtimer */
+	ot_init_hrtimer(item, item->ctrl->hrtimer);
+	return 0;
+}
+
+static int ot_thread_worker(void *arg)
+{
+	struct ot_item *item = arg;
+	ktime_t start;
+
+	sched_set_normal(current, 50);
+
+	atomic_inc(&g_ot_data.nthreads);
+	down_read(&g_ot_data.start);
+	up_read(&g_ot_data.start);
+	start = ktime_get();
+	ot_start_hrtimer(item);
+	do {
+		if (atomic_read_acquire(&g_ot_data.stop))
+			break;
+		/* do bulk-testings for objects pop/push */
+		item->worker(item, 0);
+	} while (!kthread_should_stop());
+	ot_stop_hrtimer(item);
+	item->duration = (u64) ktime_us_delta(ktime_get(), start);
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	return 0;
+}
+
+static void ot_perf_report(struct ot_ctrl *ctrl, u64 duration)
+{
+	struct ot_stat total, normal = {0}, irq = {0};
+	int cpu, nthreads = 0;
+
+	pr_info("\n");
+	pr_info("Testing summary for %s\n", ctrl->name);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		if (!item->duration)
+			continue;
+		normal.nhits += item->stat[0].nhits;
+		normal.nmiss += item->stat[0].nmiss;
+		irq.nhits += item->stat[1].nhits;
+		irq.nmiss += item->stat[1].nmiss;
+		pr_info("CPU: %d  duration: %lluus\n", cpu, item->duration);
+		pr_info("\tthread:\t%16lu hits \t%16lu miss\n",
+			item->stat[0].nhits, item->stat[0].nmiss);
+		pr_info("\tirq:   \t%16lu hits \t%16lu miss\n",
+			item->stat[1].nhits, item->stat[1].nmiss);
+		pr_info("\ttotal: \t%16lu hits \t%16lu miss\n",
+			item->stat[0].nhits + item->stat[1].nhits,
+			item->stat[0].nmiss + item->stat[1].nmiss);
+		nthreads++;
+	}
+
+	total.nhits = normal.nhits + irq.nhits;
+	total.nmiss = normal.nmiss + irq.nmiss;
+
+	pr_info("ALL: \tnthreads: %d  duration: %lluus\n", nthreads, duration);
+	pr_info("SUM: \t%16lu hits \t%16lu miss\n",
+		total.nhits, total.nmiss);
+}
+
+/*
+ * synchronous test cases for objpool manipulation
+ */
+
+/* objpool manipulation for synchronous mode 0 (percpu objpool) */
+static struct ot_context *ot_init_sync_m0(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	int max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	if (objpool_init(&sop->pool, max, max, ctrl->objsz,
+			GFP_KERNEL, sop, ot_init_node, NULL)) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON(max != sop->pool.nr_objs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m0(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+	ot_kfree(sop, sizeof(*sop));
+}
+
+/* objpool manipulation for synchronous mode 1 (private pool) */
+static struct ot_context *ot_init_sync_m1(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	unsigned long size;
+	int rc, szobj, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	szobj = ALIGN(ctrl->objsz, sizeof(void *));
+	size = szobj * max;
+	sop->ptr = ot_vmalloc(size);
+	sop->size = size;
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	memset(sop->ptr, 0, size);
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL, NULL);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		ctrl->objsz, sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON((size / szobj) != sop->pool.nr_objs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m1(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+
+	ot_vfree(sop->ptr, sop->size);
+	ot_kfree(sop, sizeof(*sop));
+}
+
+/* objpool manipulation for synchronous mode 2 (private objects) */
+static int ot_objpool_release(void *context, void *ptr, uint32_t flags)
+{
+	struct ot_context *sop = context;
+
+	/* here we need release all user-allocated objects */
+	if ((flags & OBJPOOL_FLAG_NODE) && (flags & OBJPOOL_FLAG_USER)) {
+		struct ot_node *on = ptr;
+		WARN_ON(on->data != 0xDEADBEEF);
+		ot_kfree(on, sop->ctrl->objsz);
+	} else if (flags & OBJPOOL_FLAG_POOL) {
+		/* release user preallocated pool */
+		if (sop->ptr) {
+			WARN_ON(sop->ptr != ptr);
+			WARN_ON(!(flags & OBJPOOL_FLAG_USER));
+			ot_vfree(sop->ptr, sop->size);
+		}
+		/* do context cleaning if needed */
+		ot_kfree(sop, sizeof(*sop));
+	}
+
+	return 0;
+}
+
+static struct ot_context *ot_init_sync_m2(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	int rc, i, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < max; i++) {
+		on = ot_kzalloc(ctrl->objsz);
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+		}
+	}
+	WARN_ON(max != sop->pool.nr_objs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m2(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+}
+
+/* objpool manipulation for synchronous mode 3 (mixed mode) */
+static struct ot_context *ot_init_sync_m3(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	unsigned long size;
+	int rc, i, szobj, nobjs;
+	int max = num_possible_cpus() << 4;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	/* create and initialize objpool as empty (no objects) */
+	nobjs = num_possible_cpus() * 2;
+	rc = objpool_init(&sop->pool, nobjs, max, ctrl->objsz, GFP_KERNEL,
+			sop, ot_init_node, ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	szobj = ALIGN(ctrl->objsz, sizeof(void *));
+	size = szobj * num_possible_cpus() * 4;
+	sop->ptr = ot_vmalloc(size);
+	if (!sop->ptr) {
+		objpool_fini(&sop->pool);
+		return NULL;
+	}
+	sop->size = size;
+	memset(sop->ptr, 0, size);
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+		ctrl->objsz, sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+	nobjs += size / szobj;
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < num_possible_cpus() * 2; i++) {
+		on = ot_kzalloc(ctrl->objsz);
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			if (!objpool_add(on, &sop->pool))
+				nobjs++;
+			else
+				ot_kfree(on, ctrl->objsz);
+		}
+	}
+	WARN_ON(nobjs != sop->pool.nr_objs);
+
+	return sop;
+}
+
+static void ot_fini_sync_m3(struct ot_context *sop)
+{
+	objpool_fini(&sop->pool);
+}
+
+struct {
+	struct ot_context * (*init)(struct ot_ctrl *);
+	void (*fini)(struct ot_context *sop);
+} g_ot_sync_ops[4] = {
+	{.init = ot_init_sync_m0, .fini = ot_fini_sync_m0},
+	{.init = ot_init_sync_m1, .fini = ot_fini_sync_m1},
+	{.init = ot_init_sync_m2, .fini = ot_fini_sync_m2},
+	{.init = ot_init_sync_m3, .fini = ot_fini_sync_m3},
+};
+
+/*
+ * synchronous test cases: performance mode
+ */
+
+static void ot_bulk_sync(struct ot_item *item, int irq)
+{
+	struct ot_node *nods[OT_NR_MAX_BULK];
+	int i;
+
+	for (i = 0; i < item->bulk[irq]; i++)
+		nods[i] = objpool_pop(item->pool);
+
+	if (!irq && (item->delay || !(++(item->niters) & 0x7FFF)))
+		msleep(item->delay);
+
+	while (i-- > 0) {
+		struct ot_node *on = nods[i];
+		if (on) {
+			on->refs++;
+			objpool_push(on, item->pool);
+			item->stat[irq].nhits++;
+		} else {
+			item->stat[irq].nmiss++;
+		}
+	}
+}
+
+static int ot_start_sync(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop;
+	ktime_t start;
+	u64 duration;
+	unsigned long timeout;
+	int cpu, rc;
+
+	/* initialize objpool for syncrhonous testcase */
+	sop = g_ot_sync_ops[ctrl->mode].init(ctrl);
+	if (!sop)
+		return -ENOMEM;
+
+	/* grab rwsem to block testing threads */
+	down_write(&g_ot_data.start);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		struct task_struct *work;
+
+		ot_init_cpu_item(item, ctrl, &sop->pool, ot_bulk_sync);
+
+		/* skip offline cpus */
+		if (!cpu_online(cpu))
+			continue;
+
+		work = kthread_create_on_node(ot_thread_worker, item,
+				cpu_to_node(cpu), "ot_worker_%d", cpu);
+		if (IS_ERR(work)) {
+			pr_err("failed to create thread for cpu %d\n", cpu);
+		} else {
+			kthread_bind(work, cpu);
+			wake_up_process(work);
+		}
+	}
+
+	/* wait a while to make sure all threads waiting at start line */
+	msleep(20);
+
+	/* in case no threads were created: memory insufficient ? */
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	// sched_set_fifo_low(current);
+
+	/* start objpool testing threads */
+	start = ktime_get();
+	up_write(&g_ot_data.start);
+
+	/* yeild cpu to worker threads for duration ms */
+	timeout = msecs_to_jiffies(ctrl->duration);
+	rc = schedule_timeout_interruptible(timeout);
+
+	/* tell workers threads to quit */
+	atomic_set_release(&g_ot_data.stop, 1);
+
+	/* wait all workers threads finish and quit */
+	wait_for_completion(&g_ot_data.wait);
+	duration = (u64) ktime_us_delta(ktime_get(), start);
+
+	/* cleanup objpool */
+	g_ot_sync_ops[ctrl->mode].fini(sop);
+
+	/* report testing summary and performance results */
+	ot_perf_report(ctrl, duration);
+
+	/* report memory allocation summary */
+	ot_mem_report(ctrl);
+
+	return rc;
+}
+
+/*
+ * asynchronous test cases: pool lifecycle controlled by refcount
+ */
+
+static void ot_fini_async_rcu(struct rcu_head *rcu)
+{
+	struct ot_context *sop = container_of(rcu, struct ot_context, rcu);
+	struct ot_node *on;
+
+	/* here all cpus are aware of the stop event: g_ot_data.stop = 1 */
+	WARN_ON(!atomic_read_acquire(&g_ot_data.stop));
+
+	do {
+		/* release all objects remained in objpool */
+		on = objpool_pop(&sop->pool);
+		if (on && !objpool_is_inslot(on, &sop->pool) &&
+			!objpool_is_inpool(on, &sop->pool)) {
+			/* private object managed by user */
+			WARN_ON(on->data != 0xDEADBEEF);
+			ot_kfree(on, sop->ctrl->objsz);
+		}
+
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&sop->refs)) {
+			objpool_fini(&sop->pool);
+			break;
+		}
+	} while (on);
+
+	complete(&g_ot_data.rcu);
+}
+
+static void ot_fini_async(struct ot_context *sop)
+{
+	/* make sure the stop event is acknowledged by all cores */
+	call_rcu(&sop->rcu, ot_fini_async_rcu);
+}
+
+static struct ot_context *ot_init_async_m0(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	int max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	if (objpool_init(&sop->pool, max, max, ctrl->objsz, GFP_KERNEL,
+			sop, ot_init_node, ot_objpool_release)) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	WARN_ON(max != sop->pool.nr_objs);
+	refcount_set(&sop->refs, max + 1);
+
+	return sop;
+}
+
+static struct ot_context *ot_init_async_m1(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	unsigned long size;
+	int szobj, rc, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	szobj = ALIGN(ctrl->objsz, sizeof(void *));
+	size = szobj * max;
+	sop->ptr = ot_vmalloc(size);
+	sop->size = size;
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	memset(sop->ptr, 0, size);
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+			ctrl->objsz, sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+
+	/* calculate total number of objects stored in ptr */
+	WARN_ON(size / szobj != sop->pool.nr_objs);
+	refcount_set(&sop->refs, size / szobj + 1);
+
+	return sop;
+}
+
+static struct ot_context *ot_init_async_m2(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	int rc, i, nobjs = 0, max = num_possible_cpus() << 3;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	/* create and initialize objpool as empty (no objects) */
+	rc = objpool_init(&sop->pool, 0, max, 0, GFP_KERNEL, sop, NULL,
+			ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < max; i++) {
+		on = ot_kzalloc(ctrl->objsz);
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+			nobjs++;
+		}
+	}
+	WARN_ON(nobjs != sop->pool.nr_objs);
+	refcount_set(&sop->refs, nobjs + 1);
+
+	return sop;
+}
+
+/* objpool manipulation for synchronous mode 3 (mixed mode) */
+static struct ot_context *ot_init_async_m3(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop = NULL;
+	struct ot_node *on;
+	unsigned long size;
+	int szobj, nobjs, rc, i, max = num_possible_cpus() << 4;
+
+	sop = (struct ot_context *)ot_kzalloc(sizeof(*sop));
+	if (!sop)
+		return NULL;
+	sop->ctrl = ctrl;
+
+	/* create and initialize objpool as empty (no objects) */
+	nobjs = num_possible_cpus() * 2;
+	rc = objpool_init(&sop->pool, nobjs, max, ctrl->objsz, GFP_KERNEL,
+			sop, ot_init_node, ot_objpool_release);
+	if (rc) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+
+	szobj = ALIGN(ctrl->objsz, sizeof(void *));
+	size = szobj * num_possible_cpus() * 4;
+	sop->ptr = ot_vmalloc(size);
+	if (!sop->ptr) {
+		ot_kfree(sop, sizeof(*sop));
+		return NULL;
+	}
+	sop->size = size;
+	memset(sop->ptr, 0, size);
+
+	/* populate given buffer to objpool */
+	rc = objpool_populate(&sop->pool, sop->ptr, size,
+			ctrl->objsz, sop, ot_init_node);
+	if (rc) {
+		objpool_fini(&sop->pool);
+		ot_vfree(sop->ptr, size);
+		return NULL;
+	}
+
+	/* calculate total number of objects stored in ptr */
+	nobjs += size / szobj;
+
+	/* allocate private objects and insert to objpool */
+	for (i = 0; i < num_possible_cpus() * 2; i++) {
+		on = ot_kzalloc(ctrl->objsz);
+		if (on) {
+			ot_init_node(sop, on);
+			on->data = 0xDEADBEEF;
+			objpool_add(on, &sop->pool);
+			nobjs++;
+		}
+	}
+	WARN_ON(nobjs != sop->pool.nr_objs);
+	refcount_set(&sop->refs, nobjs + 1);
+
+	return sop;
+}
+
+struct {
+	struct ot_context * (*init)(struct ot_ctrl *);
+	void (*fini)(struct ot_context *sop);
+} g_ot_async_ops[4] = {
+	{.init = ot_init_async_m0, .fini = ot_fini_async},
+	{.init = ot_init_async_m1, .fini = ot_fini_async},
+	{.init = ot_init_async_m2, .fini = ot_fini_async},
+	{.init = ot_init_async_m3, .fini = ot_fini_async},
+};
+
+static void ot_nod_recycle(struct ot_node *on, struct objpool_head *pool,
+			int release)
+{
+	struct ot_context *sop;
+
+	on->refs++;
+
+	if (!release) {
+		/* push object back to opjpool for reuse */
+		objpool_push(on, pool);
+		return;
+	}
+
+	sop = container_of(pool, struct ot_context, pool);
+	WARN_ON(sop != pool->context);
+
+	if (objpool_is_inslot(on, pool)) {
+		/* object is alloced from percpu slots */
+	} else if (objpool_is_inpool(on, pool)) {
+		/* object is alloced from user-manged pool */
+	} else {
+		/* private object managed by user */
+		WARN_ON(on->data != 0xDEADBEEF);
+		ot_kfree(on, sop->ctrl->objsz);
+	}
+
+	/* unref objpool with nod removed forever */
+	if (refcount_dec_and_test(&sop->refs))
+		objpool_fini(pool);
+}
+
+static void ot_bulk_async(struct ot_item *item, int irq)
+{
+	struct ot_node *nods[OT_NR_MAX_BULK];
+	int i, stop;
+
+	for (i = 0; i < item->bulk[irq]; i++)
+		nods[i] = objpool_pop(item->pool);
+
+	if (!irq) {
+		if (item->delay || !(++(item->niters) & 0x7FFF))
+			msleep(item->delay);
+		get_cpu();
+	}
+
+	stop = atomic_read_acquire(&g_ot_data.stop);
+
+	/* drop all objects and deref objpool */
+	while (i-- > 0) {
+		struct ot_node *on = nods[i];
+
+		if (on) {
+			on->refs++;
+			ot_nod_recycle(on, item->pool, stop);
+			item->stat[irq].nhits++;
+		} else {
+			item->stat[irq].nmiss++;
+		}
+	}
+
+	if (!irq)
+		put_cpu();
+}
+
+static int ot_start_async(struct ot_ctrl *ctrl)
+{
+	struct ot_context *sop;
+	ktime_t start;
+	u64 duration;
+	unsigned long timeout;
+	int cpu, rc;
+
+	/* initialize objpool for syncrhonous testcase */
+	sop = g_ot_async_ops[ctrl->mode].init(ctrl);
+	if (!sop)
+		return -ENOMEM;
+
+	/* grab rwsem to block testing threads */
+	down_write(&g_ot_data.start);
+
+	for_each_possible_cpu(cpu) {
+		struct ot_item *item = per_cpu_ptr(&ot_pcup_items, cpu);
+		struct task_struct *work;
+
+		ot_init_cpu_item(item, ctrl, &sop->pool, ot_bulk_async);
+
+		/* skip offline cpus */
+		if (!cpu_online(cpu))
+			continue;
+
+		work = kthread_create_on_node(ot_thread_worker, item,
+				cpu_to_node(cpu), "ot_worker_%d", cpu);
+		if (IS_ERR(work)) {
+			pr_err("failed to create thread for cpu %d\n", cpu);
+		} else {
+			kthread_bind(work, cpu);
+			wake_up_process(work);
+		}
+	}
+
+	/* wait a while to make sure all threads waiting at start line */
+	msleep(20);
+
+	/* in case no threads were created: memory insufficient ? */
+	if (atomic_dec_and_test(&g_ot_data.nthreads))
+		complete(&g_ot_data.wait);
+
+	/* start objpool testing threads */
+	start = ktime_get();
+	up_write(&g_ot_data.start);
+
+	/* yeild cpu to worker threads for duration ms */
+	timeout = msecs_to_jiffies(ctrl->duration);
+	rc = schedule_timeout_interruptible(timeout);
+
+	/* tell workers threads to quit */
+	atomic_set_release(&g_ot_data.stop, 1);
+
+	/* do async-finalization */
+	g_ot_async_ops[ctrl->mode].fini(sop);
+
+	/* wait all workers threads finish and quit */
+	wait_for_completion(&g_ot_data.wait);
+	duration = (u64) ktime_us_delta(ktime_get(), start);
+
+	/* assure rcu callback is triggered */
+	wait_for_completion(&g_ot_data.rcu);
+
+	/*
+	 * now we are sure that objpool is finalized either
+	 * by rcu callback or by worker threads
+	 */
+
+	/* report testing summary and performance results */
+	ot_perf_report(ctrl, duration);
+
+	/* report memory allocation summary */
+	ot_mem_report(ctrl);
+
+	return rc;
+}
+
+/*
+ * predefined testing cases:
+ *   4 synchronous cases / 4 overrun cases / 2 async cases
+ *
+ * mode: unsigned int, could be 0/1/2/3, see name
+ * duration: unsigned int, total test time in ms
+ * delay: unsigned int, delay (in ms) between each iteration
+ * bulk_normal: unsigned int, repeat times for thread worker
+ * bulk_irq: unsigned int, repeat times for irq consumer
+ * hrtimer: unsigned long, hrtimer intervnal in ms
+ * name: char *, tag for current test ot_item
+ */
+
+#define NODE_COMPACT sizeof(struct ot_node)
+#define NODE_VMALLOC (512)
+
+struct ot_ctrl g_ot_sync[] = {
+	{0, NODE_COMPACT, 1000, 0,  1,  0,  0, "sync: percpu objpool"},
+	{0, NODE_VMALLOC, 1000, 0,  1,  0,  0, "sync: percpu objpool from vmalloc"},
+	{1, NODE_COMPACT, 1000, 0,  1,  0,  0, "sync: user objpool"},
+	{2, NODE_COMPACT, 1000, 0,  1,  0,  0, "sync: user objects"},
+	{3, NODE_COMPACT, 1000, 0,  1,  0,  0, "sync: mixed pools & objs"},
+	{3, NODE_VMALLOC, 1000, 0,  1,  0,  0, "sync: mixed pools & objs (vmalloc)"},
+};
+
+struct ot_ctrl g_ot_miss[] = {
+	{0, NODE_COMPACT, 1000, 0, 16,  0,  0, "sync overrun: percpu objpool"},
+	{0, NODE_VMALLOC, 1000, 0, 16,  0,  0, "sync overrun: percpu objpool from vmalloc"},
+	{1, NODE_COMPACT, 1000, 0, 16,  0,  0, "sync overrun: user objpool"},
+	{2, NODE_COMPACT, 1000, 0, 16,  0,  0, "sync overrun: user objects"},
+	{3, NODE_COMPACT, 1000, 0, 16,  0,  0, "sync overrun: mixed pools & objs"},
+	{3, NODE_VMALLOC, 1000, 0, 16,  0,  0, "sync overrun: mixed pools & objs (vmalloc)"},
+};
+
+struct ot_ctrl g_ot_async[] = {
+	{0, NODE_COMPACT, 1000, 4,  8,  8,  6, "async: percpu objpool"},
+	{0, NODE_VMALLOC, 1000, 4,  8,  8,  6, "async: percpu objpool from vmalloc"},
+	{1, NODE_COMPACT, 1000, 4,  8,  8,  6, "async: user objpool"},
+	{2, NODE_COMPACT, 1000, 4,  8,  8,  6, "async: user objects"},
+	{3, NODE_COMPACT, 1000, 4,  8,  8,  6, "async: mixed pools & objs"},
+	{3, NODE_VMALLOC, 1000, 4,  8,  8,  6, "async: mixed pools & objs (vmalloc)"},
+};
+
+static int __init ot_mod_init(void)
+{
+	int i;
+
+	ot_init_data(&g_ot_data);
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_sync); i++) {
+		if (ot_start_sync(&g_ot_sync[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_miss); i++) {
+		if (ot_start_sync(&g_ot_miss[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(g_ot_async); i++) {
+		if (ot_start_async(&g_ot_async[i]))
+			goto out;
+		ot_reset_data(&g_ot_data);
+	}
+
+out:
+	return -EAGAIN;
+}
+
+static void __exit ot_mod_exit(void)
+{
+}
+
+module_init(ot_mod_init);
+module_exit(ot_mod_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Matt Wu");
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 3/4] kprobes: kretprobe scalability improvement with objpool
  2022-11-08  7:14   ` [PATCH v6 0/4] lib,kprobes: kretprobe scalability improvement wuqiang
  2022-11-08  7:14     ` [PATCH v6 1/4] lib: objpool added: ring-array based lockless MPMC queue wuqiang
  2022-11-08  7:14     ` [PATCH v6 2/4] lib: objpool test module added wuqiang
@ 2022-11-08  7:14     ` wuqiang
  2022-11-14 15:56       ` Masami Hiramatsu
  2022-11-14 15:56       ` Masami Hiramatsu
  2022-11-08  7:14     ` [PATCH v6 4/4] kprobes: freelist.h removed wuqiang
  3 siblings, 2 replies; 23+ messages in thread
From: wuqiang @ 2022-11-08  7:14 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu, wuqiang

kretprobe is using freelist to manage return-instances, but freelist,
as LIFO queue based on singly linked list, scales badly and reduces
the overall throughput of kretprobed routines, especially for high
contention scenarios.

Here's a typical throughput test of sys_flock (counts in 10 seconds,
measured with perf stat -a -I 10000 -e syscalls:sys_enter_flock):

OS: Debian 10 X86_64, Linux 6.1rc2
HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s

      1X       2X       4X       6X       8X      12X      16X
34762430 36546920 17949900 13101899 12569595 12646601 14729195
     24X      32X      48X      64X      72X      96X     128X
19263546 10102064  8985418 11936495 11493980  7127789  9330985

This patch introduces objpool to kretprobe and rethook, with orginal
freelist replaced and brings near-linear scalability to kretprobed
routines. Tests of kretprobe throughput show the biggest ratio as
333.9x of the original freelist. Here's the comparison:

                  1X         2X         4X         8X        16X
freelist:   34762430   36546920   17949900   12569595   14729195
objpool:    35627544   72182095  144068494  287564688  576903916
                 32X        48X        64X        96X       128X
freelist:   10102064    8985418   11936495    7127789    9330985
objpool:  1158876372 1737828164 2324371724 2380310472 2463182819

Tests on 96-core ARM64 system output similarly, but with the biggest
ratio up to 642.2x:

OS: Debian 10 AARCH64, Linux 6.1rc2
HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s

                  1X         2X         4X         8X        16X
freelist:   17498299   10887037   10224710    8499132    6421751
objpool:    18715726   35549845   71615884  144258971  283707220
                 24X        32X        48X        64X        96X
freelist:    5339868    4819116    3593919    3121575    2687167
objpool:   419830913  571609748  877456139 1143316315 1725668029

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 include/linux/kprobes.h |  9 ++--
 include/linux/rethook.h | 15 +++----
 kernel/kprobes.c        | 95 +++++++++++++++++++----------------------
 kernel/trace/fprobe.c   | 17 ++------
 kernel/trace/rethook.c  | 80 +++++++++++++++++-----------------
 5 files changed, 95 insertions(+), 121 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index a0b92be98984..f13f01e600c2 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -27,7 +27,7 @@
 #include <linux/mutex.h>
 #include <linux/ftrace.h>
 #include <linux/refcount.h>
-#include <linux/freelist.h>
+#include <linux/objpool.h>
 #include <linux/rethook.h>
 #include <asm/kprobes.h>
 
@@ -141,6 +141,7 @@ static inline bool kprobe_ftrace(struct kprobe *p)
  */
 struct kretprobe_holder {
 	struct kretprobe	*rp;
+	struct objpool_head	oh;
 	refcount_t		ref;
 };
 
@@ -154,7 +155,6 @@ struct kretprobe {
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
 	struct rethook *rh;
 #else
-	struct freelist_head freelist;
 	struct kretprobe_holder *rph;
 #endif
 };
@@ -165,10 +165,7 @@ struct kretprobe_instance {
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
 	struct rethook_node node;
 #else
-	union {
-		struct freelist_node freelist;
-		struct rcu_head rcu;
-	};
+	struct rcu_head rcu;
 	struct llist_node llist;
 	struct kretprobe_holder *rph;
 	kprobe_opcode_t *ret_addr;
diff --git a/include/linux/rethook.h b/include/linux/rethook.h
index c8ac1e5afcd1..278ec65e71fe 100644
--- a/include/linux/rethook.h
+++ b/include/linux/rethook.h
@@ -6,7 +6,7 @@
 #define _LINUX_RETHOOK_H
 
 #include <linux/compiler.h>
-#include <linux/freelist.h>
+#include <linux/objpool.h>
 #include <linux/kallsyms.h>
 #include <linux/llist.h>
 #include <linux/rcupdate.h>
@@ -30,14 +30,14 @@ typedef void (*rethook_handler_t) (struct rethook_node *, void *, struct pt_regs
 struct rethook {
 	void			*data;
 	rethook_handler_t	handler;
-	struct freelist_head	pool;
+	struct objpool_head	pool;
 	refcount_t		ref;
 	struct rcu_head		rcu;
 };
 
 /**
  * struct rethook_node - The rethook shadow-stack entry node.
- * @freelist: The freelist, linked to struct rethook::pool.
+ * @nod: The objpool node, linked to struct rethook::pool.
  * @rcu: The rcu_head for deferred freeing.
  * @llist: The llist, linked to a struct task_struct::rethooks.
  * @rethook: The pointer to the struct rethook.
@@ -48,19 +48,15 @@ struct rethook {
  * on each entry of the shadow stack.
  */
 struct rethook_node {
-	union {
-		struct freelist_node freelist;
-		struct rcu_head      rcu;
-	};
+	struct rcu_head		rcu;
 	struct llist_node	llist;
 	struct rethook		*rethook;
 	unsigned long		ret_addr;
 	unsigned long		frame;
 };
 
-struct rethook *rethook_alloc(void *data, rethook_handler_t handler);
+struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp, int size, int max);
 void rethook_free(struct rethook *rh);
-void rethook_add_node(struct rethook *rh, struct rethook_node *node);
 struct rethook_node *rethook_try_get(struct rethook *rh);
 void rethook_recycle(struct rethook_node *node);
 void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
@@ -97,4 +93,3 @@ void rethook_flush_task(struct task_struct *tk);
 #endif
 
 #endif
-
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index cd9f5a66a690..a8b202f87e2d 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1865,10 +1865,12 @@ static struct notifier_block kprobe_exceptions_nb = {
 static void free_rp_inst_rcu(struct rcu_head *head)
 {
 	struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
+	struct kretprobe_holder *rph = ri->rph;
 
-	if (refcount_dec_and_test(&ri->rph->ref))
-		kfree(ri->rph);
-	kfree(ri);
+	if (refcount_dec_and_test(&rph->ref)) {
+		objpool_fini(&rph->oh);
+		kfree(rph);
+	}
 }
 NOKPROBE_SYMBOL(free_rp_inst_rcu);
 
@@ -1877,7 +1879,7 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
 	struct kretprobe *rp = get_kretprobe(ri);
 
 	if (likely(rp))
-		freelist_add(&ri->freelist, &rp->freelist);
+		objpool_push(ri, &rp->rph->oh);
 	else
 		call_rcu(&ri->rcu, free_rp_inst_rcu);
 }
@@ -1914,23 +1916,19 @@ NOKPROBE_SYMBOL(kprobe_flush_task);
 
 static inline void free_rp_inst(struct kretprobe *rp)
 {
-	struct kretprobe_instance *ri;
-	struct freelist_node *node;
-	int count = 0;
-
-	node = rp->freelist.head;
-	while (node) {
-		ri = container_of(node, struct kretprobe_instance, freelist);
-		node = node->next;
-
-		kfree(ri);
-		count++;
-	}
+	struct kretprobe_holder *rph = rp->rph;
+	void *nod;
 
-	if (refcount_sub_and_test(count, &rp->rph->ref)) {
-		kfree(rp->rph);
-		rp->rph = NULL;
-	}
+	rp->rph = NULL;
+	do {
+		nod = objpool_pop(&rph->oh);
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&rph->ref)) {
+			objpool_fini(&rph->oh);
+			kfree(rph);
+			break;
+		}
+	} while (nod);
 }
 
 /* This assumes the 'tsk' is the current task or the is not running. */
@@ -2072,19 +2070,17 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
 static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 {
 	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
+	struct kretprobe_holder *rph = rp->rph;
 	struct kretprobe_instance *ri;
-	struct freelist_node *fn;
 
-	fn = freelist_try_get(&rp->freelist);
-	if (!fn) {
+	ri = objpool_pop(&rph->oh);
+	if (!ri) {
 		rp->nmissed++;
 		return 0;
 	}
 
-	ri = container_of(fn, struct kretprobe_instance, freelist);
-
 	if (rp->entry_handler && rp->entry_handler(ri, regs)) {
-		freelist_add(&ri->freelist, &rp->freelist);
+		objpool_push(ri, &rph->oh);
 		return 0;
 	}
 
@@ -2174,10 +2170,19 @@ int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long o
 	return 0;
 }
 
+#ifndef CONFIG_KRETPROBE_ON_RETHOOK
+static int kretprobe_init_inst(void *context, void *nod)
+{
+	struct kretprobe_instance *ri = nod;
+
+	ri->rph = context;
+	return 0;
+}
+#endif
+
 int register_kretprobe(struct kretprobe *rp)
 {
 	int ret;
-	struct kretprobe_instance *inst;
 	int i;
 	void *addr;
 
@@ -2215,20 +2220,12 @@ int register_kretprobe(struct kretprobe *rp)
 #endif
 	}
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
-	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler);
+	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler, GFP_KERNEL,
+				sizeof(struct kretprobe_instance) + rp->data_size,
+				rp->maxactive);
 	if (!rp->rh)
 		return -ENOMEM;
 
-	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
-		if (inst == NULL) {
-			rethook_free(rp->rh);
-			rp->rh = NULL;
-			return -ENOMEM;
-		}
-		rethook_add_node(rp->rh, &inst->node);
-	}
 	rp->nmissed = 0;
 	/* Establish function entry probe point */
 	ret = register_kprobe(&rp->kp);
@@ -2237,25 +2234,19 @@ int register_kretprobe(struct kretprobe *rp)
 		rp->rh = NULL;
 	}
 #else	/* !CONFIG_KRETPROBE_ON_RETHOOK */
-	rp->freelist.head = NULL;
 	rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
 	if (!rp->rph)
 		return -ENOMEM;
 
-	rp->rph->rp = rp;
-	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
-		if (inst == NULL) {
-			refcount_set(&rp->rph->ref, i);
-			free_rp_inst(rp);
-			return -ENOMEM;
-		}
-		inst->rph = rp->rph;
-		freelist_add(&inst->freelist, &rp->freelist);
+	if (objpool_init(&rp->rph->oh, rp->maxactive, rp->maxactive,
+			  rp->data_size + sizeof(struct kretprobe_instance),
+			  GFP_KERNEL, rp->rph, kretprobe_init_inst, NULL)) {
+		kfree(rp->rph);
+		rp->rph = NULL;
+		return -ENOMEM;
 	}
-	refcount_set(&rp->rph->ref, i);
-
+	refcount_set(&rp->rph->ref, rp->maxactive + 1);
+	rp->rph->rp = rp;
 	rp->nmissed = 0;
 	/* Establish function entry probe point */
 	ret = register_kprobe(&rp->kp);
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index e8143e368074..9a002bfbd216 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -125,7 +125,7 @@ static void fprobe_init(struct fprobe *fp)
 
 static int fprobe_init_rethook(struct fprobe *fp, int num)
 {
-	int i, size;
+	int size;
 
 	if (num < 0)
 		return -EINVAL;
@@ -140,20 +140,11 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
 	if (size < 0)
 		return -E2BIG;
 
-	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler);
+	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, GFP_KERNEL,
+				sizeof(struct fprobe_rethook_node), size);
 	if (!fp->rethook)
 		return -ENOMEM;
-	for (i = 0; i < size; i++) {
-		struct fprobe_rethook_node *node;
-
-		node = kzalloc(sizeof(*node), GFP_KERNEL);
-		if (!node) {
-			rethook_free(fp->rethook);
-			fp->rethook = NULL;
-			return -ENOMEM;
-		}
-		rethook_add_node(fp->rethook, &node->node);
-	}
+
 	return 0;
 }
 
diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
index c69d82273ce7..01df98db2fbe 100644
--- a/kernel/trace/rethook.c
+++ b/kernel/trace/rethook.c
@@ -36,21 +36,17 @@ void rethook_flush_task(struct task_struct *tk)
 static void rethook_free_rcu(struct rcu_head *head)
 {
 	struct rethook *rh = container_of(head, struct rethook, rcu);
-	struct rethook_node *rhn;
-	struct freelist_node *node;
-	int count = 1;
+	struct rethook_node *nod;
 
-	node = rh->pool.head;
-	while (node) {
-		rhn = container_of(node, struct rethook_node, freelist);
-		node = node->next;
-		kfree(rhn);
-		count++;
-	}
-
-	/* The rh->ref is the number of pooled node + 1 */
-	if (refcount_sub_and_test(count, &rh->ref))
-		kfree(rh);
+	do {
+		nod = objpool_pop(&rh->pool);
+		/* deref anyway since we've one extra ref grabbed */
+		if (refcount_dec_and_test(&rh->ref)) {
+			objpool_fini(&rh->pool);
+			kfree(rh);
+			break;
+		}
+	} while (nod);
 }
 
 /**
@@ -70,16 +66,28 @@ void rethook_free(struct rethook *rh)
 	call_rcu(&rh->rcu, rethook_free_rcu);
 }
 
+static int rethook_init_node(void *context, void *nod)
+{
+	struct rethook_node *node = nod;
+
+	node->rethook = context;
+	return 0;
+}
+
 /**
  * rethook_alloc() - Allocate struct rethook.
  * @data: a data to pass the @handler when hooking the return.
  * @handler: the return hook callback function.
+ * @gfp: default gfp for objpool allocation
+ * @size: rethook node size
+ * @max: number of rethook nodes to be preallocated
  *
  * Allocate and initialize a new rethook with @data and @handler.
  * Return NULL if memory allocation fails or @handler is NULL.
  * Note that @handler == NULL means this rethook is going to be freed.
  */
-struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
+struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp,
+			      int size, int max)
 {
 	struct rethook *rh = kzalloc(sizeof(struct rethook), GFP_KERNEL);
 
@@ -88,34 +96,26 @@ struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
 
 	rh->data = data;
 	rh->handler = handler;
-	rh->pool.head = NULL;
-	refcount_set(&rh->ref, 1);
 
+	/* initialize the objpool for rethook nodes */
+	if (objpool_init(&rh->pool, max, max, size, gfp, rh, rethook_init_node,
+			NULL)) {
+		kfree(rh);
+		return NULL;
+	}
+	refcount_set(&rh->ref, max + 1);
 	return rh;
 }
 
-/**
- * rethook_add_node() - Add a new node to the rethook.
- * @rh: the struct rethook.
- * @node: the struct rethook_node to be added.
- *
- * Add @node to @rh. User must allocate @node (as a part of user's
- * data structure.) The @node fields are initialized in this function.
- */
-void rethook_add_node(struct rethook *rh, struct rethook_node *node)
-{
-	node->rethook = rh;
-	freelist_add(&node->freelist, &rh->pool);
-	refcount_inc(&rh->ref);
-}
-
 static void free_rethook_node_rcu(struct rcu_head *head)
 {
 	struct rethook_node *node = container_of(head, struct rethook_node, rcu);
+	struct rethook *rh = node->rethook;
 
-	if (refcount_dec_and_test(&node->rethook->ref))
-		kfree(node->rethook);
-	kfree(node);
+	if (refcount_dec_and_test(&rh->ref)) {
+		objpool_fini(&rh->pool);
+		kfree(rh);
+	}
 }
 
 /**
@@ -130,7 +130,7 @@ void rethook_recycle(struct rethook_node *node)
 	lockdep_assert_preemption_disabled();
 
 	if (likely(READ_ONCE(node->rethook->handler)))
-		freelist_add(&node->freelist, &node->rethook->pool);
+		objpool_push(node, &node->rethook->pool);
 	else
 		call_rcu(&node->rcu, free_rethook_node_rcu);
 }
@@ -146,7 +146,7 @@ NOKPROBE_SYMBOL(rethook_recycle);
 struct rethook_node *rethook_try_get(struct rethook *rh)
 {
 	rethook_handler_t handler = READ_ONCE(rh->handler);
-	struct freelist_node *fn;
+	struct rethook_node *nod;
 
 	lockdep_assert_preemption_disabled();
 
@@ -163,11 +163,11 @@ struct rethook_node *rethook_try_get(struct rethook *rh)
 	if (unlikely(!rcu_is_watching()))
 		return NULL;
 
-	fn = freelist_try_get(&rh->pool);
-	if (!fn)
+	nod = (struct rethook_node *)objpool_pop(&rh->pool);
+	if (!nod)
 		return NULL;
 
-	return container_of(fn, struct rethook_node, freelist);
+	return nod;
 }
 NOKPROBE_SYMBOL(rethook_try_get);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 4/4] kprobes: freelist.h removed
  2022-11-08  7:14   ` [PATCH v6 0/4] lib,kprobes: kretprobe scalability improvement wuqiang
                       ` (2 preceding siblings ...)
  2022-11-08  7:14     ` [PATCH v6 3/4] kprobes: kretprobe scalability improvement with objpool wuqiang
@ 2022-11-08  7:14     ` wuqiang
  3 siblings, 0 replies; 23+ messages in thread
From: wuqiang @ 2022-11-08  7:14 UTC (permalink / raw)
  To: mhiramat, davem, anil.s.keshavamurthy, naveen.n.rao, rostedt,
	peterz, akpm, sander, ebiggers, dan.j.williams, jpoimboe
  Cc: linux-kernel, lkp, mattwu, wuqiang

This patch will remove freelist.h from kernel source tree, since the
only use cases (kretprobe and rethook) are converted to objpool.

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
---
 include/linux/freelist.h | 129 ---------------------------------------
 1 file changed, 129 deletions(-)
 delete mode 100644 include/linux/freelist.h

diff --git a/include/linux/freelist.h b/include/linux/freelist.h
deleted file mode 100644
index fc1842b96469..000000000000
--- a/include/linux/freelist.h
+++ /dev/null
@@ -1,129 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */
-#ifndef FREELIST_H
-#define FREELIST_H
-
-#include <linux/atomic.h>
-
-/*
- * Copyright: cameron@moodycamel.com
- *
- * A simple CAS-based lock-free free list. Not the fastest thing in the world
- * under heavy contention, but simple and correct (assuming nodes are never
- * freed until after the free list is destroyed), and fairly speedy under low
- * contention.
- *
- * Adapted from: https://moodycamel.com/blog/2014/solving-the-aba-problem-for-lock-free-free-lists
- */
-
-struct freelist_node {
-	atomic_t		refs;
-	struct freelist_node	*next;
-};
-
-struct freelist_head {
-	struct freelist_node	*head;
-};
-
-#define REFS_ON_FREELIST 0x80000000
-#define REFS_MASK	 0x7FFFFFFF
-
-static inline void __freelist_add(struct freelist_node *node, struct freelist_head *list)
-{
-	/*
-	 * Since the refcount is zero, and nobody can increase it once it's
-	 * zero (except us, and we run only one copy of this method per node at
-	 * a time, i.e. the single thread case), then we know we can safely
-	 * change the next pointer of the node; however, once the refcount is
-	 * back above zero, then other threads could increase it (happens under
-	 * heavy contention, when the refcount goes to zero in between a load
-	 * and a refcount increment of a node in try_get, then back up to
-	 * something non-zero, then the refcount increment is done by the other
-	 * thread) -- so if the CAS to add the node to the actual list fails,
-	 * decrese the refcount and leave the add operation to the next thread
-	 * who puts the refcount back to zero (which could be us, hence the
-	 * loop).
-	 */
-	struct freelist_node *head = READ_ONCE(list->head);
-
-	for (;;) {
-		WRITE_ONCE(node->next, head);
-		atomic_set_release(&node->refs, 1);
-
-		if (!try_cmpxchg_release(&list->head, &head, node)) {
-			/*
-			 * Hmm, the add failed, but we can only try again when
-			 * the refcount goes back to zero.
-			 */
-			if (atomic_fetch_add_release(REFS_ON_FREELIST - 1, &node->refs) == 1)
-				continue;
-		}
-		return;
-	}
-}
-
-static inline void freelist_add(struct freelist_node *node, struct freelist_head *list)
-{
-	/*
-	 * We know that the should-be-on-freelist bit is 0 at this point, so
-	 * it's safe to set it using a fetch_add.
-	 */
-	if (!atomic_fetch_add_release(REFS_ON_FREELIST, &node->refs)) {
-		/*
-		 * Oh look! We were the last ones referencing this node, and we
-		 * know we want to add it to the free list, so let's do it!
-		 */
-		__freelist_add(node, list);
-	}
-}
-
-static inline struct freelist_node *freelist_try_get(struct freelist_head *list)
-{
-	struct freelist_node *prev, *next, *head = smp_load_acquire(&list->head);
-	unsigned int refs;
-
-	while (head) {
-		prev = head;
-		refs = atomic_read(&head->refs);
-		if ((refs & REFS_MASK) == 0 ||
-		    !atomic_try_cmpxchg_acquire(&head->refs, &refs, refs+1)) {
-			head = smp_load_acquire(&list->head);
-			continue;
-		}
-
-		/*
-		 * Good, reference count has been incremented (it wasn't at
-		 * zero), which means we can read the next and not worry about
-		 * it changing between now and the time we do the CAS.
-		 */
-		next = READ_ONCE(head->next);
-		if (try_cmpxchg_acquire(&list->head, &head, next)) {
-			/*
-			 * Yay, got the node. This means it was on the list,
-			 * which means should-be-on-freelist must be false no
-			 * matter the refcount (because nobody else knows it's
-			 * been taken off yet, it can't have been put back on).
-			 */
-			WARN_ON_ONCE(atomic_read(&head->refs) & REFS_ON_FREELIST);
-
-			/*
-			 * Decrease refcount twice, once for our ref, and once
-			 * for the list's ref.
-			 */
-			atomic_fetch_add(-2, &head->refs);
-
-			return head;
-		}
-
-		/*
-		 * OK, the head must have changed on us, but we still need to decrement
-		 * the refcount we increased.
-		 */
-		refs = atomic_fetch_add(-1, &prev->refs);
-		if (refs == REFS_ON_FREELIST + 1)
-			__freelist_add(prev, list);
-	}
-
-	return NULL;
-}
-
-#endif /* FREELIST_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 1/4] lib: objpool added: ring-array based lockless MPMC queue
  2022-11-08  7:14     ` [PATCH v6 1/4] lib: objpool added: ring-array based lockless MPMC queue wuqiang
@ 2022-11-14 15:54       ` Masami Hiramatsu
  2022-11-16 10:42         ` wuqiang
  0 siblings, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2022-11-14 15:54 UTC (permalink / raw)
  To: wuqiang
  Cc: davem, anil.s.keshavamurthy, naveen.n.rao, rostedt, peterz, akpm,
	sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel, lkp,
	mattwu

On Tue,  8 Nov 2022 15:14:40 +0800
wuqiang <wuqiang.matt@bytedance.com> wrote:

> The object pool is a scalable implementaion of high performance queue
> for objects allocation and reclamation, such as kretprobe instances.
> 
> With leveraging per-cpu ring-array to mitigate the hot spots of memory
> contention, it could deliver near-linear scalability for high parallel
> scenarios. The ring-array is compactly managed in a single cache-line
> to benefit from warmed L1 cache for most cases (<= 4 objects per-core).
> The body of pre-allocated objects is stored in continuous cache-lines
> just after the ring-array.
> 
> The object pool is interrupt safe. Both allocation and reclamation
> (object pop and push operations) can be preemptible or interruptable.
> 
> It's best suited for following cases:
> 1) Memory allocation or reclamation are prohibited or too expensive
> 2) Consumers are of different priorities, such as irqs and threads
> 
> Limitations:
> 1) Maximum objects (capacity) is determined during pool initializing
> 2) The memory of objects won't be freed until the poll is finalized
> 3) Object allocation (pop) may fail after trying all cpu slots
> 4) Object reclamation (push) won't fail but may take long time to
>    finish for imbalanced scenarios. You can try larger max_entries
>    to mitigate, or ( >= CPUS * nr_objs) to avoid
> 
> Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
> ---
>  include/linux/objpool.h | 153 +++++++++++++
>  lib/Makefile            |   2 +-
>  lib/objpool.c           | 487 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 641 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/objpool.h
>  create mode 100644 lib/objpool.c
> 
> diff --git a/include/linux/objpool.h b/include/linux/objpool.h
> new file mode 100644
> index 000000000000..7899b054b50c
> --- /dev/null
> +++ b/include/linux/objpool.h
> @@ -0,0 +1,153 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _LINUX_OBJPOOL_H
> +#define _LINUX_OBJPOOL_H
> +
> +#include <linux/types.h>
> +
> +/*
> + * objpool: ring-array based lockless MPMC queue
> + *
> + * Copyright: wuqiang.matt@bytedance.com
> + *
> + * The object pool is a scalable implementaion of high performance queue
> + * for objects allocation and reclamation, such as kretprobe instances.
> + *
> + * With leveraging per-cpu ring-array to mitigate the hot spots of memory
> + * contention, it could deliver near-linear scalability for high parallel
> + * scenarios. The ring-array is compactly managed in a single cache-line
> + * to benefit from warmed L1 cache for most cases (<= 4 objects per-core).
> + * The body of pre-allocated objects is stored in continuous cache-lines
> + * just after the ring-array.
> + *
> + * The object pool is interrupt safe. Both allocation and reclamation
> + * (object pop and push operations) can be preemptible or interruptable.
> + *
> + * It's best suited for following cases:
> + * 1) Memory allocation or reclamation are prohibited or too expensive
> + * 2) Consumers are of different priorities, such as irqs and threads
> + *
> + * Limitations:
> + * 1) Maximum objects (capacity) is determined during pool initializing
> + * 2) The memory of objects won't be freed until the poll is finalized
> + * 3) Object allocation (pop) may fail after trying all cpu slots
> + * 4) Object reclamation (push) won't fail but may take long time to
> + *    finish for imbalanced scenarios. You can try larger max_entries
> + *    to mitigate, or ( >= CPUS * nr_objs) to avoid
> + */
> +
> +/*
> + * objpool_slot: per-cpu ring array
> + *
> + * Represents a cpu-local array-based ring buffer, its size is specialized
> + * during initialization of object pool.
> + *
> + * The objpool_slot is allocated from local memory for NUMA system, and to
> + * be kept compact in a single cacheline. ages[] is stored just after the
> + * body of objpool_slot, and then entries[]. The Array of ages[] describes
> + * revision of each item, solely used to avoid ABA. And array of entries[]
> + * contains the pointers of objects.
> + *
> + * The default size of objpool_slot is a single cache-line, aka. 64 bytes.
> + *
> + * 64bit:
> + *        4      8      12     16        32                 64
> + * | head | tail | size | mask | ages[4] | ents[4]: (8 * 4) | objects
> + *
> + * 32bit:
> + *        4      8      12     16        32        48       64
> + * | head | tail | size | mask | ages[4] | ents[4] | unused | objects
> + *
> + */
> +
> +struct objpool_slot {
> +	uint32_t                head;	/* head of ring array */
> +	uint32_t                tail;	/* tail of ring array */
> +	uint32_t                size;	/* array size, pow of 2 */
> +	uint32_t                mask;	/* size - 1 */
> +} __attribute__((packed));
> +
> +/* caller-specified object initial callback to setup each object, only called once */
> +typedef int (*objpool_init_obj_cb)(void *context, void *obj);

It seems a bit confused that this "initialize object" callback
don't have the @obj as the first argument.

> +
> +/* caller-specified cleanup callback for private objects/pool/context */
> +typedef int (*objpool_release_cb)(void *context, void *ptr, uint32_t flags);

Do you have any use-case for this release callback?
If not, until actual use-case comes up, I recommend you to defer
implementing it.

> +
> +/* called for object releasing: ptr points to an object */
> +#define OBJPOOL_FLAG_NODE        (0x00000001)
> +/* for user pool and context releasing, ptr could be NULL */
> +#define OBJPOOL_FLAG_POOL        (0x00001000)
> +/* the object or pool to be released is user-managed */
> +#define OBJPOOL_FLAG_USER        (0x00008000)

Ditto.

> +
> +/*
> + * objpool_head: object pooling metadata
> + */
> +
> +struct objpool_head {
> +	unsigned int            obj_size;	/* object & element size */
> +	unsigned int            nr_objs;	/* total objs (to be pre-allocated) */
> +	unsigned int            nr_cpus;	/* num of possible cpus */
> +	unsigned int            capacity;	/* max objects per cpuslot */
> +	unsigned long           flags;		/* flags for objpool management */
> +	gfp_t                   gfp;		/* gfp flags for kmalloc & vmalloc */
> +	unsigned int            pool_size;	/* user pool size in byes */
> +	void                   *pool;		/* user managed memory pool */
> +	struct objpool_slot   **cpu_slots;	/* array of percpu slots */
> +	unsigned int           *slot_sizes;	/* size in bytes of slots */
> +	objpool_release_cb      release;	/* resource cleanup callback */
> +	void                   *context;	/* caller-provided context */
> +};
> +
> +#define OBJPOOL_FROM_VMALLOC	(0x800000000)	/* objpool allocated from vmalloc area */
> +#define OBJPOOL_HAVE_OBJECTS	(0x400000000)	/* objects allocated along with objpool */

This also doesn't need at this moment. Please start from simple
design for review.

> +
> +/* initialize object pool and pre-allocate objects */
> +int objpool_init(struct objpool_head *head, unsigned int nr_objs,
> +		 unsigned int max_objs, unsigned int object_size,
> +		 gfp_t gfp, void *context, objpool_init_obj_cb objinit,
> +		 objpool_release_cb release);
> +
> +/* add objects in batch from user provided pool */
> +int objpool_populate(struct objpool_head *head, void *pool,
> +		     unsigned int size, unsigned int object_size,
> +		     void *context, objpool_init_obj_cb objinit);
> +
> +/* add pre-allocated object (managed by user) to objpool */
> +int objpool_add(void *obj, struct objpool_head *head);
> +
> +/* allocate an object from objects pool */
> +void *objpool_pop(struct objpool_head *head);
> +
> +/* reclaim an object to objects pool */
> +int objpool_push(void *node, struct objpool_head *head);
> +
> +/* cleanup the whole object pool (objects including) */
> +void objpool_fini(struct objpool_head *head);
> +
> +/* whether the object is pre-allocated with percpu slots */
> +static inline int objpool_is_inslot(void *obj, struct objpool_head *head)
> +{
> +	void *slot;
> +	int i;
> +
> +	if (!obj || !(head->flags & OBJPOOL_HAVE_OBJECTS))
> +		return 0;
> +
> +	for (i = 0; i < head->nr_cpus; i++) {
> +		slot = head->cpu_slots[i];
> +		if (obj >= slot && obj < slot + head->slot_sizes[i])
> +			return 1;
> +	}
> +
> +	return 0;
> +}

Ditto.

It is too complicated to mix the internal allocated objects
and external ones. This will expose the implementation of the
objpool (users must understand they have to free the object
only outside of slot)

You can add it afterwards if it is really needed :)

> +
> +/* whether the object is from user pool (batched adding) */
> +static inline int objpool_is_inpool(void *obj, struct objpool_head *head)
> +{
> +	return (obj && head->pool && obj >= head->pool &&
> +		obj < head->pool + head->pool_size);
> +}
> +
> +#endif /* _LINUX_OBJPOOL_H */
> diff --git a/lib/Makefile b/lib/Makefile
> index 161d6a724ff7..e938703a321f 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -34,7 +34,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
>  	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
>  	 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
>  	 nmi_backtrace.o win_minmax.o memcat_p.o \
> -	 buildid.o
> +	 buildid.o objpool.o
>  
>  lib-$(CONFIG_PRINTK) += dump_stack.o
>  lib-$(CONFIG_SMP) += cpumask.o
> diff --git a/lib/objpool.c b/lib/objpool.c
> new file mode 100644
> index 000000000000..ecffa0795f3d
> --- /dev/null
> +++ b/lib/objpool.c
> @@ -0,0 +1,487 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/objpool.h>
> +#include <linux/slab.h>
> +#include <linux/vmalloc.h>
> +#include <linux/atomic.h>
> +#include <linux/prefetch.h>
> +
> +/*
> + * objpool: ring-array based lockless MPMC/FIFO queues
> + *
> + * Copyright: wuqiang.matt@bytedance.com
> + */
> +
> +/* compute the suitable num of objects to be managed by slot */
> +static inline unsigned int __objpool_num_of_objs(unsigned int size)
> +{
> +	return rounddown_pow_of_two((size - sizeof(struct objpool_slot)) /
> +			(sizeof(uint32_t) + sizeof(void *)));
> +}
> +
> +#define SLOT_AGES(s) ((uint32_t *)((char *)(s) + sizeof(struct objpool_slot)))
> +#define SLOT_ENTS(s) ((void **)((char *)(s) + sizeof(struct objpool_slot) + \
> +			sizeof(uint32_t) * (s)->size))
> +#define SLOT_OBJS(s) ((void *)((char *)(s) + sizeof(struct objpool_slot) + \
> +			(sizeof(uint32_t) + sizeof(void *)) * (s)->size))
> +
> +/* allocate and initialize percpu slots */
> +static inline int
> +__objpool_init_percpu_slots(struct objpool_head *head, unsigned int nobjs,
> +			void *context, objpool_init_obj_cb objinit)
> +{
> +	unsigned int i, j, n, size, objsz, nents = head->capacity;
> +
> +	/* aligned object size by sizeof(void *) */
> +	objsz = ALIGN(head->obj_size, sizeof(void *));
> +	/* shall we allocate objects along with objpool_slot */
> +	if (objsz)
> +		head->flags |= OBJPOOL_HAVE_OBJECTS;
> +
> +	for (i = 0; i < head->nr_cpus; i++) {
> +		struct objpool_slot *os;
> +
> +		/* compute how many objects to be managed by this slot */
> +		n = nobjs / head->nr_cpus;
> +		if (i < (nobjs % head->nr_cpus))
> +			n++;
> +		size = sizeof(struct objpool_slot) + sizeof(void *) * nents +
> +		       sizeof(uint32_t) * nents + objsz * n;
> +
> +		/* decide memory area for cpu-slot allocation */
> +		if (!i && !(head->gfp & GFP_ATOMIC) && size > PAGE_SIZE / 2)
> +			head->flags |= OBJPOOL_FROM_VMALLOC;
> +
> +		/* allocate percpu slot & objects from local memory */
> +		if (head->flags & OBJPOOL_FROM_VMALLOC)
> +			os = __vmalloc_node(size, sizeof(void *), head->gfp,
> +				cpu_to_node(i), __builtin_return_address(0));
> +		else
> +			os = kmalloc_node(size, head->gfp, cpu_to_node(i));
> +		if (!os)
> +			return -ENOMEM;
> +
> +		/* initialize percpu slot for the i-th cpu */
> +		memset(os, 0, size);
> +		os->size = head->capacity;
> +		os->mask = os->size - 1;
> +		head->cpu_slots[i] = os;
> +		head->slot_sizes[i] = size;
> +
> +		/*
> +		 * start from 2nd round to avoid conflict of 1st item.
> +		 * we assume that the head item is ready for retrieval
> +		 * iff head is equal to ages[head & mask]. but ages is
> +		 * initialized as 0, so in view of the caller of pop(),
> +		 * the 1st item (0th) is always ready, but fact could
> +		 * be: push() is stalled before the final update, thus
> +		 * the item being inserted will be lost forever.
> +		 */
> +		os->head = os->tail = head->capacity;
> +
> +		if (!objsz)
> +			continue;
> +
> +		for (j = 0; j < n; j++) {
> +			uint32_t *ages = SLOT_AGES(os);
> +			void **ents = SLOT_ENTS(os);
> +			void *obj = SLOT_OBJS(os) + j * objsz;
> +			uint32_t ie = os->tail & os->mask;
> +
> +			/* perform object initialization */
> +			if (objinit) {
> +				int rc = objinit(context, obj);
> +				if (rc)
> +					return rc;
> +			}
> +
> +			/* add obj into the ring array */
> +			ents[ie] = obj;
> +			ages[ie] = os->tail;
> +			os->tail++;
> +			head->nr_objs++;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/* cleanup all percpu slots of the object pool */
> +static inline void __objpool_fini_percpu_slots(struct objpool_head *head)
> +{
> +	unsigned int i;
> +
> +	if (!head->cpu_slots)
> +		return;
> +
> +	for (i = 0; i < head->nr_cpus; i++) {
> +		if (!head->cpu_slots[i])
> +			continue;
> +		if (head->flags & OBJPOOL_FROM_VMALLOC)
> +			vfree(head->cpu_slots[i]);
> +		else
> +			kfree(head->cpu_slots[i]);
> +	}
> +	kfree(head->cpu_slots);
> +	head->cpu_slots = NULL;
> +	head->slot_sizes = NULL;
> +}
> +
> +/**
> + * objpool_init: initialize object pool and pre-allocate objects
> + *
> + * args:
> + * @head:    the object pool to be initialized, declared by caller
> + * @nr_objs: total objects to be pre-allocated by this object pool
> + * @max_objs: max entries (object pool capacity), use nr_objs if 0
> + * @object_size: size of an object, no objects pre-allocated if 0
> + * @gfp:     flags for memory allocation (via kmalloc or vmalloc)
> + * @context: user context for object initialization callback
> + * @objinit: object initialization callback for extra setting-up
> + * @release: cleanup callback for private objects/pool/context
> + *
> + * return:
> + *         0 for success, otherwise error code
> + *
> + * All pre-allocated objects are to be zeroed. Caller could do extra
> + * initialization in objinit callback. The objinit callback will be
> + * called once and only once after the slot allocation. Then objpool
> + * won't touch any content of the objects since then. It's caller's
> + * duty to perform reinitialization after object allocation (pop) or
> + * clearance before object reclamation (push) if required.
> + */
> +int objpool_init(struct objpool_head *head, unsigned int nr_objs,
> +		unsigned int max_objs, unsigned int object_size,
> +		gfp_t gfp, void *context, objpool_init_obj_cb objinit,
> +		objpool_release_cb release)
> +{
> +	unsigned int nents, ncpus = num_possible_cpus();
> +	int rc;
> +
> +	/* calculate percpu slot size (rounded to pow of 2) */
> +	if (max_objs < nr_objs)

This should be an error case.

	if (!max_objs)

> +		max_objs = nr_objs;

	else if (max_objs < nr_objs)
		return -EINVAL;

But to simplify that, I think it should use only nr_objs.
I mean, if we can pass the @objinit, there seems no reason to
have both nr_objs and max_objs.

> +	nents = max_objs / ncpus;
> +	if (nents < __objpool_num_of_objs(L1_CACHE_BYTES))
> +		nents = __objpool_num_of_objs(L1_CACHE_BYTES);
> +	nents = roundup_pow_of_two(nents);
> +	while (nents * ncpus < nr_objs)
> +		nents = nents << 1;
> +
> +	memset(head, 0, sizeof(struct objpool_head));
> +	head->nr_cpus = ncpus;
> +	head->obj_size = object_size;
> +	head->capacity = nents;
> +	head->gfp = gfp & ~__GFP_ZERO;
> +	head->context = context;
> +	head->release = release;
> +
> +	/* allocate array for percpu slots */
> +	head->cpu_slots = kzalloc(head->nr_cpus * sizeof(void *) +
> +			       head->nr_cpus * sizeof(uint32_t), head->gfp);
> +	if (!head->cpu_slots)
> +		return -ENOMEM;
> +	head->slot_sizes = (uint32_t *)&head->cpu_slots[head->nr_cpus];
> +
> +	/* initialize per-cpu slots */
> +	rc = __objpool_init_percpu_slots(head, nr_objs, context, objinit);
> +	if (rc)
> +		__objpool_fini_percpu_slots(head);
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(objpool_init);
> +
> +/* adding object to slot tail, the given slot must NOT be full */
> +static inline int __objpool_add_slot(void *obj, struct objpool_slot *os)
> +{
> +	uint32_t *ages = SLOT_AGES(os);
> +	void **ents = SLOT_ENTS(os);
> +	uint32_t tail = atomic_inc_return((atomic_t *)&os->tail) - 1;
> +
> +	WRITE_ONCE(ents[tail & os->mask], obj);
> +
> +	/* order matters: obj must be updated before tail updating */
> +	smp_store_release(&ages[tail & os->mask], tail);
> +	return 0;
> +}
> +
> +/* adding object to slot, abort if the slot was already full */
> +static inline int __objpool_try_add_slot(void *obj, struct objpool_slot *os)
> +{
> +	uint32_t *ages = SLOT_AGES(os);
> +	void **ents = SLOT_ENTS(os);
> +	uint32_t head, tail;
> +
> +	do {
> +		/* perform memory loading for both head and tail */
> +		head = READ_ONCE(os->head);
> +		tail = READ_ONCE(os->tail);
> +		/* just abort if slot is full */
> +		if (tail >= head + os->size)
> +			return -ENOENT;
> +		/* try to extend tail by 1 using CAS to avoid races */
> +		if (try_cmpxchg_acquire(&os->tail, &tail, tail + 1))
> +			break;
> +	} while (1);
> +
> +	/* the tail-th of slot is reserved for the given obj */
> +	WRITE_ONCE(ents[tail & os->mask], obj);
> +	/* update epoch id to make this object available for pop() */
> +	smp_store_release(&ages[tail & os->mask], tail);
> +	return 0;
> +}
> +
> +/**
> + * objpool_populate: add objects from user provided pool in batch
> + *
> + * args:
> + * @head:  object pool
> + * @pool: user buffer for pre-allocated objects
> + * @size: size of user buffer
> + * @object_size: size of object & element
> + * @context: user context for objinit callback
> + * @objinit: object initialization callback
> + *
> + * return: 0 or error code
> + */
> +int objpool_populate(struct objpool_head *head, void *pool,
> +		unsigned int size, unsigned int object_size,
> +		void *context, objpool_init_obj_cb objinit)
> +{
> +	unsigned int n = head->nr_objs, used = 0, i;
> +
> +	if (head->pool || !pool || size < object_size)
> +		return -EINVAL;
> +	if (head->obj_size && head->obj_size != object_size)
> +		return -EINVAL;
> +	if (head->context && context && head->context != context)
> +		return -EINVAL;
> +	if (head->nr_objs >= head->nr_cpus * head->capacity)
> +		return -ENOENT;
> +
> +	WARN_ON_ONCE(((unsigned long)pool) & (sizeof(void *) - 1));
> +	WARN_ON_ONCE(((uint32_t)object_size) & (sizeof(void *) - 1));
> +
> +	/* align object size by sizeof(void *) */
> +	head->obj_size = object_size;
> +	object_size = ALIGN(object_size, sizeof(void *));
> +	if (object_size == 0)
> +		return -EINVAL;
> +
> +	while (used + object_size <= size) {
> +		void *obj = pool + used;
> +
> +		/* perform object initialization */
> +		if (objinit) {
> +			int rc = objinit(context, obj);
> +			if (rc)
> +				return rc;
> +		}
> +
> +		/* insert obj to its corresponding objpool slot */
> +		i = (n + used * head->nr_cpus/size) % head->nr_cpus;
> +		if (!__objpool_try_add_slot(obj, head->cpu_slots[i]))
> +			head->nr_objs++;
> +
> +		used += object_size;
> +	}
> +
> +	if (!used)
> +		return -ENOENT;
> +
> +	head->context = context;
> +	head->pool = pool;
> +	head->pool_size = size;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(objpool_populate);
> +
> +/**
> + * objpool_add: add pre-allocated object to objpool during pool
> + * initialization
> + *
> + * args:
> + * @obj:  object pointer to be added to objpool
> + * @head: object pool to be inserted into
> + *
> + * return:
> + *     0 or error code
> + *
> + * objpool_add_node doesn't handle race conditions, can only be
> + * called during objpool initialization
> + */
> +int objpool_add(void *obj, struct objpool_head *head)
> +{
> +	unsigned int i, cpu;
> +
> +	if (!obj)
> +		return -EINVAL;
> +	if (head->nr_objs >= head->nr_cpus * head->capacity)
> +		return -ENOENT;
> +
> +	cpu = head->nr_objs % head->nr_cpus;
> +	for (i = 0; i < head->nr_cpus; i++) {
> +		if (!__objpool_try_add_slot(obj, head->cpu_slots[cpu])) {
> +			head->nr_objs++;
> +			return 0;
> +		}
> +
> +		if (++cpu >= head->nr_cpus)
> +			cpu = 0;
> +	}
> +
> +	return -ENOENT;
> +}
> +EXPORT_SYMBOL_GPL(objpool_add);
> +
> +/**
> + * objpool_push: reclaim the object and return back to objects pool
> + *
> + * args:
> + * @obj:  object pointer to be pushed to object pool
> + * @head: object pool
> + *
> + * return:
> + *     0 or error code: it fails only when objects pool are full
> + *
> + * objpool_push is non-blockable, and can be nested
> + */
> +int objpool_push(void *obj, struct objpool_head *head)
> +{
> +	unsigned int cpu = raw_smp_processor_id() % head->nr_cpus;
> +
> +	do {
> +		if (head->nr_objs > head->capacity) {
> +			if (!__objpool_try_add_slot(obj, head->cpu_slots[cpu]))
> +				return 0;
> +		} else {
> +			if (!__objpool_add_slot(obj, head->cpu_slots[cpu]))
> +				return 0;
> +		}
> +		if (++cpu >= head->nr_cpus)
> +			cpu = 0;
> +	} while (1);
> +
> +	return -ENOENT;
> +}
> +EXPORT_SYMBOL_GPL(objpool_push);
> +
> +/* try to retrieve object from slot */
> +static inline void *__objpool_try_get_slot(struct objpool_slot *os)
> +{
> +	uint32_t *ages = SLOT_AGES(os);
> +	void **ents = SLOT_ENTS(os);
> +	/* do memory load of head to local head */
> +	uint32_t head = smp_load_acquire(&os->head);
> +
> +	/* loop if slot isn't empty */
> +	while (head != READ_ONCE(os->tail)) {
> +		uint32_t id = head & os->mask, prev = head;
> +
> +		/* do prefetching of object ents */
> +		prefetch(&ents[id]);
> +
> +		/*
> +		 * check whether this item was ready for retrieval ? There's
> +		 * possibility * in theory * we might retrieve wrong object,
> +		 * in case ages[id] overflows when current task is sleeping,
> +		 * but it will take very very long to overflow an uint32_t
> +		 */
> +		if (smp_load_acquire(&ages[id]) == head) {
> +			/* node must have been udpated by push() */
> +			void *node = READ_ONCE(ents[id]);
> +			/* commit and move forward head of the slot */
> +			if (try_cmpxchg_release(&os->head, &head, head + 1))
> +				return node;
> +		}
> +
> +		/* re-load head from memory continue trying */
> +		head = READ_ONCE(os->head);
> +		/*
> +		 * head stays unchanged, so it's very likely current pop()
> +		 * just preempted/interrupted an ongoing push() operation
> +		 */
> +		if (head == prev)
> +			break;
> +	}
> +
> +	return NULL;
> +}
> +
> +/**
> + * objpool_pop: allocate an object from objects pool
> + *
> + * args:
> + * @oh:  object pool
> + *
> + * return:
> + *   object: NULL if failed (object pool is empty)
> + *
> + * objpool_pop can be nested, so can be used in any context.
> + */
> +void *objpool_pop(struct objpool_head *head)
> +{
> +	unsigned int i, cpu;
> +	void *obj = NULL;
> +
> +	cpu = raw_smp_processor_id() % head->nr_cpus;

(Not sure, do we really need this?)

Thank you,

> +	for (i = 0; i < head->nr_cpus; i++) {
> +		struct objpool_slot *slot = head->cpu_slots[cpu];
> +		obj = __objpool_try_get_slot(slot);
> +		if (obj)
> +			break;
> +		if (++cpu >= head->nr_cpus)
> +			cpu = 0;
> +	}
> +
> +	return obj;
> +}
> +EXPORT_SYMBOL_GPL(objpool_pop);
> +
> +/**
> + * objpool_fini: cleanup the whole object pool (releasing all objects)
> + *
> + * args:
> + * @head: object pool to be released
> + *
> + */
> +void objpool_fini(struct objpool_head *head)
> +{
> +	uint32_t i, flags;
> +
> +	if (!head->cpu_slots)
> +		return;
> +
> +	if (!head->release) {
> +		__objpool_fini_percpu_slots(head);
> +		return;
> +	}
> +
> +	/* cleanup all objects remained in objpool */
> +	for (i = 0; i < head->nr_cpus; i++) {
> +		void *obj;
> +		do {
> +			flags = OBJPOOL_FLAG_NODE;
> +			obj = __objpool_try_get_slot(head->cpu_slots[i]);
> +			if (!obj)
> +				break;
> +			if (!objpool_is_inpool(obj, head) &&
> +			    !objpool_is_inslot(obj, head)) {
> +				flags |= OBJPOOL_FLAG_USER;
> +			}
> +			head->release(head->context, obj, flags);
> +		} while (obj);
> +	}
> +
> +	/* release percpu slots */
> +	__objpool_fini_percpu_slots(head);
> +
> +	/* cleanup user private pool and related context */
> +	flags = OBJPOOL_FLAG_POOL;
> +	if (head->pool)
> +		flags |= OBJPOOL_FLAG_USER;
> +	head->release(head->context, head->pool, flags);
> +}
> +EXPORT_SYMBOL_GPL(objpool_fini);
> -- 
> 2.34.1
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 3/4] kprobes: kretprobe scalability improvement with objpool
  2022-11-08  7:14     ` [PATCH v6 3/4] kprobes: kretprobe scalability improvement with objpool wuqiang
@ 2022-11-14 15:56       ` Masami Hiramatsu
  2022-11-14 15:56       ` Masami Hiramatsu
  1 sibling, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2022-11-14 15:56 UTC (permalink / raw)
  To: wuqiang
  Cc: davem, anil.s.keshavamurthy, naveen.n.rao, rostedt, peterz, akpm,
	sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel, lkp,
	mattwu

Hi Wuqiang,

On Tue,  8 Nov 2022 15:14:42 +0800
wuqiang <wuqiang.matt@bytedance.com> wrote:

> kretprobe is using freelist to manage return-instances, but freelist,
> as LIFO queue based on singly linked list, scales badly and reduces
> the overall throughput of kretprobed routines, especially for high
> contention scenarios.
> 
> Here's a typical throughput test of sys_flock (counts in 10 seconds,
> measured with perf stat -a -I 10000 -e syscalls:sys_enter_flock):
> 
> OS: Debian 10 X86_64, Linux 6.1rc2
> HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
> 
>       1X       2X       4X       6X       8X      12X      16X
> 34762430 36546920 17949900 13101899 12569595 12646601 14729195
>      24X      32X      48X      64X      72X      96X     128X
> 19263546 10102064  8985418 11936495 11493980  7127789  9330985
> 
> This patch introduces objpool to kretprobe and rethook, with orginal
> freelist replaced and brings near-linear scalability to kretprobed
> routines. Tests of kretprobe throughput show the biggest ratio as
> 333.9x of the original freelist. Here's the comparison:
> 
>                   1X         2X         4X         8X        16X
> freelist:   34762430   36546920   17949900   12569595   14729195
> objpool:    35627544   72182095  144068494  287564688  576903916
>                  32X        48X        64X        96X       128X
> freelist:   10102064    8985418   11936495    7127789    9330985
> objpool:  1158876372 1737828164 2324371724 2380310472 2463182819
> 
> Tests on 96-core ARM64 system output similarly, but with the biggest
> ratio up to 642.2x:
> 
> OS: Debian 10 AARCH64, Linux 6.1rc2
> HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s
> 
>                   1X         2X         4X         8X        16X
> freelist:   17498299   10887037   10224710    8499132    6421751
> objpool:    18715726   35549845   71615884  144258971  283707220
>                  24X        32X        48X        64X        96X
> freelist:    5339868    4819116    3593919    3121575    2687167
> objpool:   419830913  571609748  877456139 1143316315 1725668029
> 
> Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
> ---
>  include/linux/kprobes.h |  9 ++--
>  include/linux/rethook.h | 15 +++----
>  kernel/kprobes.c        | 95 +++++++++++++++++++----------------------
>  kernel/trace/fprobe.c   | 17 ++------
>  kernel/trace/rethook.c  | 80 +++++++++++++++++-----------------
>  5 files changed, 95 insertions(+), 121 deletions(-)
> 
> diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> index a0b92be98984..f13f01e600c2 100644
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -27,7 +27,7 @@
>  #include <linux/mutex.h>
>  #include <linux/ftrace.h>
>  #include <linux/refcount.h>
> -#include <linux/freelist.h>
> +#include <linux/objpool.h>
>  #include <linux/rethook.h>
>  #include <asm/kprobes.h>
>  
> @@ -141,6 +141,7 @@ static inline bool kprobe_ftrace(struct kprobe *p)
>   */
>  struct kretprobe_holder {
>  	struct kretprobe	*rp;
> +	struct objpool_head	oh;

Could you rename it to `pool` as rethook does?

>  	refcount_t		ref;
>  };
>  
> @@ -154,7 +155,6 @@ struct kretprobe {
>  #ifdef CONFIG_KRETPROBE_ON_RETHOOK
>  	struct rethook *rh;
>  #else
> -	struct freelist_head freelist;

OK, this is natural to move the head to kretprobe_holder, because
objpool must call its fini() function when all objects are freed.

>  	struct kretprobe_holder *rph;
>  #endif
>  };
> @@ -165,10 +165,7 @@ struct kretprobe_instance {
>  #ifdef CONFIG_KRETPROBE_ON_RETHOOK
>  	struct rethook_node node;
>  #else
> -	union {
> -		struct freelist_node freelist;
> -		struct rcu_head rcu;
> -	};
> +	struct rcu_head rcu;
>  	struct llist_node llist;
>  	struct kretprobe_holder *rph;
>  	kprobe_opcode_t *ret_addr;
> diff --git a/include/linux/rethook.h b/include/linux/rethook.h
> index c8ac1e5afcd1..278ec65e71fe 100644
> --- a/include/linux/rethook.h
> +++ b/include/linux/rethook.h
> @@ -6,7 +6,7 @@
>  #define _LINUX_RETHOOK_H
>  
>  #include <linux/compiler.h>
> -#include <linux/freelist.h>
> +#include <linux/objpool.h>
>  #include <linux/kallsyms.h>
>  #include <linux/llist.h>
>  #include <linux/rcupdate.h>
> @@ -30,14 +30,14 @@ typedef void (*rethook_handler_t) (struct rethook_node *, void *, struct pt_regs
>  struct rethook {
>  	void			*data;
>  	rethook_handler_t	handler;
> -	struct freelist_head	pool;
> +	struct objpool_head	pool;

On the other hand, the rethook already consolidated that "holder"
feature to itself, so you don't need to move, just replace the
freelist_head with objpool_head. LGTM.

>  	refcount_t		ref;
>  	struct rcu_head		rcu;
>  };
>  
>  /**
>   * struct rethook_node - The rethook shadow-stack entry node.
> - * @freelist: The freelist, linked to struct rethook::pool.
> + * @nod: The objpool node, linked to struct rethook::pool.

I don't see @nod in rethook_node. maybe a typo?

>   * @rcu: The rcu_head for deferred freeing.
>   * @llist: The llist, linked to a struct task_struct::rethooks.
>   * @rethook: The pointer to the struct rethook.
> @@ -48,19 +48,15 @@ struct rethook {
>   * on each entry of the shadow stack.
>   */
>  struct rethook_node {
> -	union {
> -		struct freelist_node freelist;
> -		struct rcu_head      rcu;
> -	};
> +	struct rcu_head		rcu;
>  	struct llist_node	llist;
>  	struct rethook		*rethook;
>  	unsigned long		ret_addr;
>  	unsigned long		frame;
>  };
>  
> -struct rethook *rethook_alloc(void *data, rethook_handler_t handler);
> +struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp, int size, int max);
>  void rethook_free(struct rethook *rh);
> -void rethook_add_node(struct rethook *rh, struct rethook_node *node);
>  struct rethook_node *rethook_try_get(struct rethook *rh);
>  void rethook_recycle(struct rethook_node *node);
>  void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
> @@ -97,4 +93,3 @@ void rethook_flush_task(struct task_struct *tk);
>  #endif
>  
>  #endif
> -
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index cd9f5a66a690..a8b202f87e2d 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1865,10 +1865,12 @@ static struct notifier_block kprobe_exceptions_nb = {
>  static void free_rp_inst_rcu(struct rcu_head *head)
>  {
>  	struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
> +	struct kretprobe_holder *rph = ri->rph;
>  
> -	if (refcount_dec_and_test(&ri->rph->ref))
> -		kfree(ri->rph);
> -	kfree(ri);
> +	if (refcount_dec_and_test(&rph->ref)) {
> +		objpool_fini(&rph->oh);
> +		kfree(rph);
> +	}
>  }
>  NOKPROBE_SYMBOL(free_rp_inst_rcu);
>  
> @@ -1877,7 +1879,7 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
>  	struct kretprobe *rp = get_kretprobe(ri);
>  
>  	if (likely(rp))
> -		freelist_add(&ri->freelist, &rp->freelist);
> +		objpool_push(ri, &rp->rph->oh);
>  	else
>  		call_rcu(&ri->rcu, free_rp_inst_rcu);
>  }
> @@ -1914,23 +1916,19 @@ NOKPROBE_SYMBOL(kprobe_flush_task);
>  
>  static inline void free_rp_inst(struct kretprobe *rp)
>  {
> -	struct kretprobe_instance *ri;
> -	struct freelist_node *node;
> -	int count = 0;
> -
> -	node = rp->freelist.head;
> -	while (node) {
> -		ri = container_of(node, struct kretprobe_instance, freelist);
> -		node = node->next;
> -
> -		kfree(ri);
> -		count++;
> -	}
> +	struct kretprobe_holder *rph = rp->rph;
> +	void *nod;
>  
> -	if (refcount_sub_and_test(count, &rp->rph->ref)) {
> -		kfree(rp->rph);
> -		rp->rph = NULL;
> -	}
> +	rp->rph = NULL;
> +	do {
> +		nod = objpool_pop(&rph->oh);
> +		/* deref anyway since we've one extra ref grabbed */
> +		if (refcount_dec_and_test(&rph->ref)) {
> +			objpool_fini(&rph->oh);
> +			kfree(rph);
> +			break;
> +		}
> +	} while (nod);
>  }
>  
>  /* This assumes the 'tsk' is the current task or the is not running. */
> @@ -2072,19 +2070,17 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
>  static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
>  {
>  	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
> +	struct kretprobe_holder *rph = rp->rph;
>  	struct kretprobe_instance *ri;
> -	struct freelist_node *fn;
>  
> -	fn = freelist_try_get(&rp->freelist);
> -	if (!fn) {
> +	ri = objpool_pop(&rph->oh);
> +	if (!ri) {
>  		rp->nmissed++;
>  		return 0;
>  	}
>  
> -	ri = container_of(fn, struct kretprobe_instance, freelist);
> -
>  	if (rp->entry_handler && rp->entry_handler(ri, regs)) {
> -		freelist_add(&ri->freelist, &rp->freelist);
> +		objpool_push(ri, &rph->oh);
>  		return 0;
>  	}
>  
> @@ -2174,10 +2170,19 @@ int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long o
>  	return 0;
>  }
>  
> +#ifndef CONFIG_KRETPROBE_ON_RETHOOK
> +static int kretprobe_init_inst(void *context, void *nod)
> +{
> +	struct kretprobe_instance *ri = nod;
> +
> +	ri->rph = context;
> +	return 0;
> +}
> +#endif
> +
>  int register_kretprobe(struct kretprobe *rp)
>  {
>  	int ret;
> -	struct kretprobe_instance *inst;
>  	int i;
>  	void *addr;
>  
> @@ -2215,20 +2220,12 @@ int register_kretprobe(struct kretprobe *rp)
>  #endif
>  	}
>  #ifdef CONFIG_KRETPROBE_ON_RETHOOK
> -	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler);
> +	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler, GFP_KERNEL,
> +				sizeof(struct kretprobe_instance) + rp->data_size,
> +				rp->maxactive);
>  	if (!rp->rh)
>  		return -ENOMEM;
>  
> -	for (i = 0; i < rp->maxactive; i++) {
> -		inst = kzalloc(sizeof(struct kretprobe_instance) +
> -			       rp->data_size, GFP_KERNEL);
> -		if (inst == NULL) {
> -			rethook_free(rp->rh);
> -			rp->rh = NULL;
> -			return -ENOMEM;
> -		}
> -		rethook_add_node(rp->rh, &inst->node);
> -	}
>  	rp->nmissed = 0;
>  	/* Establish function entry probe point */
>  	ret = register_kprobe(&rp->kp);
> @@ -2237,25 +2234,19 @@ int register_kretprobe(struct kretprobe *rp)
>  		rp->rh = NULL;
>  	}
>  #else	/* !CONFIG_KRETPROBE_ON_RETHOOK */
> -	rp->freelist.head = NULL;
>  	rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
>  	if (!rp->rph)
>  		return -ENOMEM;
>  
> -	rp->rph->rp = rp;
> -	for (i = 0; i < rp->maxactive; i++) {
> -		inst = kzalloc(sizeof(struct kretprobe_instance) +
> -			       rp->data_size, GFP_KERNEL);
> -		if (inst == NULL) {
> -			refcount_set(&rp->rph->ref, i);
> -			free_rp_inst(rp);
> -			return -ENOMEM;
> -		}
> -		inst->rph = rp->rph;
> -		freelist_add(&inst->freelist, &rp->freelist);
> +	if (objpool_init(&rp->rph->oh, rp->maxactive, rp->maxactive,
> +			  rp->data_size + sizeof(struct kretprobe_instance),
> +			  GFP_KERNEL, rp->rph, kretprobe_init_inst, NULL)) {
> +		kfree(rp->rph);
> +		rp->rph = NULL;
> +		return -ENOMEM;
>  	}
> -	refcount_set(&rp->rph->ref, i);
> -
> +	refcount_set(&rp->rph->ref, rp->maxactive + 1);
> +	rp->rph->rp = rp;
>  	rp->nmissed = 0;
>  	/* Establish function entry probe point */
>  	ret = register_kprobe(&rp->kp);
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index e8143e368074..9a002bfbd216 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -125,7 +125,7 @@ static void fprobe_init(struct fprobe *fp)
>  
>  static int fprobe_init_rethook(struct fprobe *fp, int num)
>  {
> -	int i, size;
> +	int size;
>  
>  	if (num < 0)
>  		return -EINVAL;
> @@ -140,20 +140,11 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
>  	if (size < 0)
>  		return -E2BIG;
>  
> -	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler);
> +	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, GFP_KERNEL,
> +				sizeof(struct fprobe_rethook_node), size);
>  	if (!fp->rethook)
>  		return -ENOMEM;
> -	for (i = 0; i < size; i++) {
> -		struct fprobe_rethook_node *node;
> -
> -		node = kzalloc(sizeof(*node), GFP_KERNEL);
> -		if (!node) {
> -			rethook_free(fp->rethook);
> -			fp->rethook = NULL;
> -			return -ENOMEM;
> -		}
> -		rethook_add_node(fp->rethook, &node->node);
> -	}
> +
>  	return 0;
>  }
>  
> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
> index c69d82273ce7..01df98db2fbe 100644
> --- a/kernel/trace/rethook.c
> +++ b/kernel/trace/rethook.c
> @@ -36,21 +36,17 @@ void rethook_flush_task(struct task_struct *tk)
>  static void rethook_free_rcu(struct rcu_head *head)
>  {
>  	struct rethook *rh = container_of(head, struct rethook, rcu);
> -	struct rethook_node *rhn;
> -	struct freelist_node *node;
> -	int count = 1;
> +	struct rethook_node *nod;
>  
> -	node = rh->pool.head;
> -	while (node) {
> -		rhn = container_of(node, struct rethook_node, freelist);
> -		node = node->next;
> -		kfree(rhn);
> -		count++;
> -	}
> -
> -	/* The rh->ref is the number of pooled node + 1 */
> -	if (refcount_sub_and_test(count, &rh->ref))
> -		kfree(rh);
> +	do {
> +		nod = objpool_pop(&rh->pool);
> +		/* deref anyway since we've one extra ref grabbed */
> +		if (refcount_dec_and_test(&rh->ref)) {
> +			objpool_fini(&rh->pool);
> +			kfree(rh);
> +			break;
> +		}
> +	} while (nod);
>  }
>  
>  /**
> @@ -70,16 +66,28 @@ void rethook_free(struct rethook *rh)
>  	call_rcu(&rh->rcu, rethook_free_rcu);
>  }
>  
> +static int rethook_init_node(void *context, void *nod)
> +{
> +	struct rethook_node *node = nod;
> +
> +	node->rethook = context;
> +	return 0;
> +}
> +
>  /**
>   * rethook_alloc() - Allocate struct rethook.
>   * @data: a data to pass the @handler when hooking the return.
>   * @handler: the return hook callback function.
> + * @gfp: default gfp for objpool allocation
> + * @size: rethook node size

@size should be @data_size, which is the size of additional data bytes
of rethook_node.

> + * @max: number of rethook nodes to be preallocated
>   *
>   * Allocate and initialize a new rethook with @data and @handler.
>   * Return NULL if memory allocation fails or @handler is NULL.
>   * Note that @handler == NULL means this rethook is going to be freed.
>   */
> -struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
> +struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp,
> +			      int size, int max)

This doesn't need @gfp. Just use GFP_KERNEL from rethook layer.

>  {
>  	struct rethook *rh = kzalloc(sizeof(struct rethook), GFP_KERNEL);

And here, 

	if (data_size < 0 || max < 0)
		return -EINVAL;

>  
> @@ -88,34 +96,26 @@ struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
>  
>  	rh->data = data;
>  	rh->handler = handler;
> -	rh->pool.head = NULL;
> -	refcount_set(&rh->ref, 1);
>  
> +	/* initialize the objpool for rethook nodes */
> +	if (objpool_init(&rh->pool, max, max, size, gfp, rh, rethook_init_node,
> +			NULL)) {
> +		kfree(rh);
> +		return NULL;
> +	}
> +	refcount_set(&rh->ref, max + 1);
>  	return rh;
>  }
>  
> -/**
> - * rethook_add_node() - Add a new node to the rethook.
> - * @rh: the struct rethook.
> - * @node: the struct rethook_node to be added.
> - *
> - * Add @node to @rh. User must allocate @node (as a part of user's
> - * data structure.) The @node fields are initialized in this function.
> - */
> -void rethook_add_node(struct rethook *rh, struct rethook_node *node)
> -{
> -	node->rethook = rh;
> -	freelist_add(&node->freelist, &rh->pool);
> -	refcount_inc(&rh->ref);
> -}
> -
>  static void free_rethook_node_rcu(struct rcu_head *head)
>  {
>  	struct rethook_node *node = container_of(head, struct rethook_node, rcu);
> +	struct rethook *rh = node->rethook;
>  
> -	if (refcount_dec_and_test(&node->rethook->ref))
> -		kfree(node->rethook);
> -	kfree(node);
> +	if (refcount_dec_and_test(&rh->ref)) {
> +		objpool_fini(&rh->pool);
> +		kfree(rh);
> +	}
>  }
>  
>  /**
> @@ -130,7 +130,7 @@ void rethook_recycle(struct rethook_node *node)
>  	lockdep_assert_preemption_disabled();
>  
>  	if (likely(READ_ONCE(node->rethook->handler)))
> -		freelist_add(&node->freelist, &node->rethook->pool);
> +		objpool_push(node, &node->rethook->pool);
>  	else
>  		call_rcu(&node->rcu, free_rethook_node_rcu);
>  }
> @@ -146,7 +146,7 @@ NOKPROBE_SYMBOL(rethook_recycle);
>  struct rethook_node *rethook_try_get(struct rethook *rh)
>  {
>  	rethook_handler_t handler = READ_ONCE(rh->handler);
> -	struct freelist_node *fn;
> +	struct rethook_node *nod;
>  
>  	lockdep_assert_preemption_disabled();
>  
> @@ -163,11 +163,11 @@ struct rethook_node *rethook_try_get(struct rethook *rh)
>  	if (unlikely(!rcu_is_watching()))
>  		return NULL;
>  
> -	fn = freelist_try_get(&rh->pool);
> -	if (!fn)
> +	nod = (struct rethook_node *)objpool_pop(&rh->pool);
> +	if (!nod)
>  		return NULL;
>  
> -	return container_of(fn, struct rethook_node, freelist);
> +	return nod;

You can just write this part as

	return (struct rethook_node *)objpool_pop(&rh->pool);

Thank you,

>  }
>  NOKPROBE_SYMBOL(rethook_try_get);
>  
> -- 
> 2.34.1
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 3/4] kprobes: kretprobe scalability improvement with objpool
  2022-11-08  7:14     ` [PATCH v6 3/4] kprobes: kretprobe scalability improvement with objpool wuqiang
  2022-11-14 15:56       ` Masami Hiramatsu
@ 2022-11-14 15:56       ` Masami Hiramatsu
  2022-11-16 10:49         ` wuqiang
  1 sibling, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2022-11-14 15:56 UTC (permalink / raw)
  To: wuqiang
  Cc: davem, anil.s.keshavamurthy, naveen.n.rao, rostedt, peterz, akpm,
	sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel, lkp,
	mattwu

Hi Wuqiang,

On Tue,  8 Nov 2022 15:14:42 +0800
wuqiang <wuqiang.matt@bytedance.com> wrote:

> kretprobe is using freelist to manage return-instances, but freelist,
> as LIFO queue based on singly linked list, scales badly and reduces
> the overall throughput of kretprobed routines, especially for high
> contention scenarios.
> 
> Here's a typical throughput test of sys_flock (counts in 10 seconds,
> measured with perf stat -a -I 10000 -e syscalls:sys_enter_flock):
> 
> OS: Debian 10 X86_64, Linux 6.1rc2
> HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
> 
>       1X       2X       4X       6X       8X      12X      16X
> 34762430 36546920 17949900 13101899 12569595 12646601 14729195
>      24X      32X      48X      64X      72X      96X     128X
> 19263546 10102064  8985418 11936495 11493980  7127789  9330985
> 
> This patch introduces objpool to kretprobe and rethook, with orginal
> freelist replaced and brings near-linear scalability to kretprobed
> routines. Tests of kretprobe throughput show the biggest ratio as
> 333.9x of the original freelist. Here's the comparison:
> 
>                   1X         2X         4X         8X        16X
> freelist:   34762430   36546920   17949900   12569595   14729195
> objpool:    35627544   72182095  144068494  287564688  576903916
>                  32X        48X        64X        96X       128X
> freelist:   10102064    8985418   11936495    7127789    9330985
> objpool:  1158876372 1737828164 2324371724 2380310472 2463182819
> 
> Tests on 96-core ARM64 system output similarly, but with the biggest
> ratio up to 642.2x:
> 
> OS: Debian 10 AARCH64, Linux 6.1rc2
> HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s
> 
>                   1X         2X         4X         8X        16X
> freelist:   17498299   10887037   10224710    8499132    6421751
> objpool:    18715726   35549845   71615884  144258971  283707220
>                  24X        32X        48X        64X        96X
> freelist:    5339868    4819116    3593919    3121575    2687167
> objpool:   419830913  571609748  877456139 1143316315 1725668029
> 
> Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
> ---
>  include/linux/kprobes.h |  9 ++--
>  include/linux/rethook.h | 15 +++----
>  kernel/kprobes.c        | 95 +++++++++++++++++++----------------------
>  kernel/trace/fprobe.c   | 17 ++------
>  kernel/trace/rethook.c  | 80 +++++++++++++++++-----------------
>  5 files changed, 95 insertions(+), 121 deletions(-)
> 
> diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> index a0b92be98984..f13f01e600c2 100644
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -27,7 +27,7 @@
>  #include <linux/mutex.h>
>  #include <linux/ftrace.h>
>  #include <linux/refcount.h>
> -#include <linux/freelist.h>
> +#include <linux/objpool.h>
>  #include <linux/rethook.h>
>  #include <asm/kprobes.h>
>  
> @@ -141,6 +141,7 @@ static inline bool kprobe_ftrace(struct kprobe *p)
>   */
>  struct kretprobe_holder {
>  	struct kretprobe	*rp;
> +	struct objpool_head	oh;

Could you rename it to `pool` as rethook does?

>  	refcount_t		ref;
>  };
>  
> @@ -154,7 +155,6 @@ struct kretprobe {
>  #ifdef CONFIG_KRETPROBE_ON_RETHOOK
>  	struct rethook *rh;
>  #else
> -	struct freelist_head freelist;

OK, this is natural to move the head to kretprobe_holder, because
objpool must call its fini() function when all objects are freed.

>  	struct kretprobe_holder *rph;
>  #endif
>  };
> @@ -165,10 +165,7 @@ struct kretprobe_instance {
>  #ifdef CONFIG_KRETPROBE_ON_RETHOOK
>  	struct rethook_node node;
>  #else
> -	union {
> -		struct freelist_node freelist;
> -		struct rcu_head rcu;
> -	};
> +	struct rcu_head rcu;
>  	struct llist_node llist;
>  	struct kretprobe_holder *rph;
>  	kprobe_opcode_t *ret_addr;
> diff --git a/include/linux/rethook.h b/include/linux/rethook.h
> index c8ac1e5afcd1..278ec65e71fe 100644
> --- a/include/linux/rethook.h
> +++ b/include/linux/rethook.h
> @@ -6,7 +6,7 @@
>  #define _LINUX_RETHOOK_H
>  
>  #include <linux/compiler.h>
> -#include <linux/freelist.h>
> +#include <linux/objpool.h>
>  #include <linux/kallsyms.h>
>  #include <linux/llist.h>
>  #include <linux/rcupdate.h>
> @@ -30,14 +30,14 @@ typedef void (*rethook_handler_t) (struct rethook_node *, void *, struct pt_regs
>  struct rethook {
>  	void			*data;
>  	rethook_handler_t	handler;
> -	struct freelist_head	pool;
> +	struct objpool_head	pool;

On the other hand, the rethook already consolidated that "holder"
feature to itself, so you don't need to move, just replace the
freelist_head with objpool_head. LGTM.

>  	refcount_t		ref;
>  	struct rcu_head		rcu;
>  };
>  
>  /**
>   * struct rethook_node - The rethook shadow-stack entry node.
> - * @freelist: The freelist, linked to struct rethook::pool.
> + * @nod: The objpool node, linked to struct rethook::pool.

I don't see @nod in rethook_node. maybe a typo?

>   * @rcu: The rcu_head for deferred freeing.
>   * @llist: The llist, linked to a struct task_struct::rethooks.
>   * @rethook: The pointer to the struct rethook.
> @@ -48,19 +48,15 @@ struct rethook {
>   * on each entry of the shadow stack.
>   */
>  struct rethook_node {
> -	union {
> -		struct freelist_node freelist;
> -		struct rcu_head      rcu;
> -	};
> +	struct rcu_head		rcu;
>  	struct llist_node	llist;
>  	struct rethook		*rethook;
>  	unsigned long		ret_addr;
>  	unsigned long		frame;
>  };
>  
> -struct rethook *rethook_alloc(void *data, rethook_handler_t handler);
> +struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp, int size, int max);
>  void rethook_free(struct rethook *rh);
> -void rethook_add_node(struct rethook *rh, struct rethook_node *node);
>  struct rethook_node *rethook_try_get(struct rethook *rh);
>  void rethook_recycle(struct rethook_node *node);
>  void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
> @@ -97,4 +93,3 @@ void rethook_flush_task(struct task_struct *tk);
>  #endif
>  
>  #endif
> -
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index cd9f5a66a690..a8b202f87e2d 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1865,10 +1865,12 @@ static struct notifier_block kprobe_exceptions_nb = {
>  static void free_rp_inst_rcu(struct rcu_head *head)
>  {
>  	struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
> +	struct kretprobe_holder *rph = ri->rph;
>  
> -	if (refcount_dec_and_test(&ri->rph->ref))
> -		kfree(ri->rph);
> -	kfree(ri);
> +	if (refcount_dec_and_test(&rph->ref)) {
> +		objpool_fini(&rph->oh);
> +		kfree(rph);
> +	}
>  }
>  NOKPROBE_SYMBOL(free_rp_inst_rcu);
>  
> @@ -1877,7 +1879,7 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
>  	struct kretprobe *rp = get_kretprobe(ri);
>  
>  	if (likely(rp))
> -		freelist_add(&ri->freelist, &rp->freelist);
> +		objpool_push(ri, &rp->rph->oh);
>  	else
>  		call_rcu(&ri->rcu, free_rp_inst_rcu);
>  }
> @@ -1914,23 +1916,19 @@ NOKPROBE_SYMBOL(kprobe_flush_task);
>  
>  static inline void free_rp_inst(struct kretprobe *rp)
>  {
> -	struct kretprobe_instance *ri;
> -	struct freelist_node *node;
> -	int count = 0;
> -
> -	node = rp->freelist.head;
> -	while (node) {
> -		ri = container_of(node, struct kretprobe_instance, freelist);
> -		node = node->next;
> -
> -		kfree(ri);
> -		count++;
> -	}
> +	struct kretprobe_holder *rph = rp->rph;
> +	void *nod;
>  
> -	if (refcount_sub_and_test(count, &rp->rph->ref)) {
> -		kfree(rp->rph);
> -		rp->rph = NULL;
> -	}
> +	rp->rph = NULL;
> +	do {
> +		nod = objpool_pop(&rph->oh);
> +		/* deref anyway since we've one extra ref grabbed */
> +		if (refcount_dec_and_test(&rph->ref)) {
> +			objpool_fini(&rph->oh);
> +			kfree(rph);
> +			break;
> +		}
> +	} while (nod);
>  }
>  
>  /* This assumes the 'tsk' is the current task or the is not running. */
> @@ -2072,19 +2070,17 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
>  static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
>  {
>  	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
> +	struct kretprobe_holder *rph = rp->rph;
>  	struct kretprobe_instance *ri;
> -	struct freelist_node *fn;
>  
> -	fn = freelist_try_get(&rp->freelist);
> -	if (!fn) {
> +	ri = objpool_pop(&rph->oh);
> +	if (!ri) {
>  		rp->nmissed++;
>  		return 0;
>  	}
>  
> -	ri = container_of(fn, struct kretprobe_instance, freelist);
> -
>  	if (rp->entry_handler && rp->entry_handler(ri, regs)) {
> -		freelist_add(&ri->freelist, &rp->freelist);
> +		objpool_push(ri, &rph->oh);
>  		return 0;
>  	}
>  
> @@ -2174,10 +2170,19 @@ int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long o
>  	return 0;
>  }
>  
> +#ifndef CONFIG_KRETPROBE_ON_RETHOOK
> +static int kretprobe_init_inst(void *context, void *nod)
> +{
> +	struct kretprobe_instance *ri = nod;
> +
> +	ri->rph = context;
> +	return 0;
> +}
> +#endif
> +
>  int register_kretprobe(struct kretprobe *rp)
>  {
>  	int ret;
> -	struct kretprobe_instance *inst;
>  	int i;
>  	void *addr;
>  
> @@ -2215,20 +2220,12 @@ int register_kretprobe(struct kretprobe *rp)
>  #endif
>  	}
>  #ifdef CONFIG_KRETPROBE_ON_RETHOOK
> -	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler);
> +	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler, GFP_KERNEL,
> +				sizeof(struct kretprobe_instance) + rp->data_size,
> +				rp->maxactive);
>  	if (!rp->rh)
>  		return -ENOMEM;
>  
> -	for (i = 0; i < rp->maxactive; i++) {
> -		inst = kzalloc(sizeof(struct kretprobe_instance) +
> -			       rp->data_size, GFP_KERNEL);
> -		if (inst == NULL) {
> -			rethook_free(rp->rh);
> -			rp->rh = NULL;
> -			return -ENOMEM;
> -		}
> -		rethook_add_node(rp->rh, &inst->node);
> -	}
>  	rp->nmissed = 0;
>  	/* Establish function entry probe point */
>  	ret = register_kprobe(&rp->kp);
> @@ -2237,25 +2234,19 @@ int register_kretprobe(struct kretprobe *rp)
>  		rp->rh = NULL;
>  	}
>  #else	/* !CONFIG_KRETPROBE_ON_RETHOOK */
> -	rp->freelist.head = NULL;
>  	rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
>  	if (!rp->rph)
>  		return -ENOMEM;
>  
> -	rp->rph->rp = rp;
> -	for (i = 0; i < rp->maxactive; i++) {
> -		inst = kzalloc(sizeof(struct kretprobe_instance) +
> -			       rp->data_size, GFP_KERNEL);
> -		if (inst == NULL) {
> -			refcount_set(&rp->rph->ref, i);
> -			free_rp_inst(rp);
> -			return -ENOMEM;
> -		}
> -		inst->rph = rp->rph;
> -		freelist_add(&inst->freelist, &rp->freelist);
> +	if (objpool_init(&rp->rph->oh, rp->maxactive, rp->maxactive,
> +			  rp->data_size + sizeof(struct kretprobe_instance),
> +			  GFP_KERNEL, rp->rph, kretprobe_init_inst, NULL)) {
> +		kfree(rp->rph);
> +		rp->rph = NULL;
> +		return -ENOMEM;
>  	}
> -	refcount_set(&rp->rph->ref, i);
> -
> +	refcount_set(&rp->rph->ref, rp->maxactive + 1);
> +	rp->rph->rp = rp;
>  	rp->nmissed = 0;
>  	/* Establish function entry probe point */
>  	ret = register_kprobe(&rp->kp);
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index e8143e368074..9a002bfbd216 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -125,7 +125,7 @@ static void fprobe_init(struct fprobe *fp)
>  
>  static int fprobe_init_rethook(struct fprobe *fp, int num)
>  {
> -	int i, size;
> +	int size;
>  
>  	if (num < 0)
>  		return -EINVAL;
> @@ -140,20 +140,11 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
>  	if (size < 0)
>  		return -E2BIG;
>  
> -	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler);
> +	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, GFP_KERNEL,
> +				sizeof(struct fprobe_rethook_node), size);
>  	if (!fp->rethook)
>  		return -ENOMEM;
> -	for (i = 0; i < size; i++) {
> -		struct fprobe_rethook_node *node;
> -
> -		node = kzalloc(sizeof(*node), GFP_KERNEL);
> -		if (!node) {
> -			rethook_free(fp->rethook);
> -			fp->rethook = NULL;
> -			return -ENOMEM;
> -		}
> -		rethook_add_node(fp->rethook, &node->node);
> -	}
> +
>  	return 0;
>  }
>  
> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
> index c69d82273ce7..01df98db2fbe 100644
> --- a/kernel/trace/rethook.c
> +++ b/kernel/trace/rethook.c
> @@ -36,21 +36,17 @@ void rethook_flush_task(struct task_struct *tk)
>  static void rethook_free_rcu(struct rcu_head *head)
>  {
>  	struct rethook *rh = container_of(head, struct rethook, rcu);
> -	struct rethook_node *rhn;
> -	struct freelist_node *node;
> -	int count = 1;
> +	struct rethook_node *nod;
>  
> -	node = rh->pool.head;
> -	while (node) {
> -		rhn = container_of(node, struct rethook_node, freelist);
> -		node = node->next;
> -		kfree(rhn);
> -		count++;
> -	}
> -
> -	/* The rh->ref is the number of pooled node + 1 */
> -	if (refcount_sub_and_test(count, &rh->ref))
> -		kfree(rh);
> +	do {
> +		nod = objpool_pop(&rh->pool);
> +		/* deref anyway since we've one extra ref grabbed */
> +		if (refcount_dec_and_test(&rh->ref)) {
> +			objpool_fini(&rh->pool);
> +			kfree(rh);
> +			break;
> +		}
> +	} while (nod);
>  }
>  
>  /**
> @@ -70,16 +66,28 @@ void rethook_free(struct rethook *rh)
>  	call_rcu(&rh->rcu, rethook_free_rcu);
>  }
>  
> +static int rethook_init_node(void *context, void *nod)
> +{
> +	struct rethook_node *node = nod;
> +
> +	node->rethook = context;
> +	return 0;
> +}
> +
>  /**
>   * rethook_alloc() - Allocate struct rethook.
>   * @data: a data to pass the @handler when hooking the return.
>   * @handler: the return hook callback function.
> + * @gfp: default gfp for objpool allocation
> + * @size: rethook node size

@size should be @data_size, which is the size of additional data bytes
of rethook_node.

> + * @max: number of rethook nodes to be preallocated
>   *
>   * Allocate and initialize a new rethook with @data and @handler.
>   * Return NULL if memory allocation fails or @handler is NULL.
>   * Note that @handler == NULL means this rethook is going to be freed.
>   */
> -struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
> +struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp,
> +			      int size, int max)

This doesn't need @gfp. Just use GFP_KERNEL from rethook layer.

>  {
>  	struct rethook *rh = kzalloc(sizeof(struct rethook), GFP_KERNEL);

And here, 

	if (data_size < 0 || max < 0)
		return -EINVAL;

>  
> @@ -88,34 +96,26 @@ struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
>  
>  	rh->data = data;
>  	rh->handler = handler;
> -	rh->pool.head = NULL;
> -	refcount_set(&rh->ref, 1);
>  
> +	/* initialize the objpool for rethook nodes */
> +	if (objpool_init(&rh->pool, max, max, size, gfp, rh, rethook_init_node,
> +			NULL)) {
> +		kfree(rh);
> +		return NULL;
> +	}
> +	refcount_set(&rh->ref, max + 1);
>  	return rh;
>  }
>  
> -/**
> - * rethook_add_node() - Add a new node to the rethook.
> - * @rh: the struct rethook.
> - * @node: the struct rethook_node to be added.
> - *
> - * Add @node to @rh. User must allocate @node (as a part of user's
> - * data structure.) The @node fields are initialized in this function.
> - */
> -void rethook_add_node(struct rethook *rh, struct rethook_node *node)
> -{
> -	node->rethook = rh;
> -	freelist_add(&node->freelist, &rh->pool);
> -	refcount_inc(&rh->ref);
> -}
> -
>  static void free_rethook_node_rcu(struct rcu_head *head)
>  {
>  	struct rethook_node *node = container_of(head, struct rethook_node, rcu);
> +	struct rethook *rh = node->rethook;
>  
> -	if (refcount_dec_and_test(&node->rethook->ref))
> -		kfree(node->rethook);
> -	kfree(node);
> +	if (refcount_dec_and_test(&rh->ref)) {
> +		objpool_fini(&rh->pool);
> +		kfree(rh);
> +	}
>  }
>  
>  /**
> @@ -130,7 +130,7 @@ void rethook_recycle(struct rethook_node *node)
>  	lockdep_assert_preemption_disabled();
>  
>  	if (likely(READ_ONCE(node->rethook->handler)))
> -		freelist_add(&node->freelist, &node->rethook->pool);
> +		objpool_push(node, &node->rethook->pool);
>  	else
>  		call_rcu(&node->rcu, free_rethook_node_rcu);
>  }
> @@ -146,7 +146,7 @@ NOKPROBE_SYMBOL(rethook_recycle);
>  struct rethook_node *rethook_try_get(struct rethook *rh)
>  {
>  	rethook_handler_t handler = READ_ONCE(rh->handler);
> -	struct freelist_node *fn;
> +	struct rethook_node *nod;
>  
>  	lockdep_assert_preemption_disabled();
>  
> @@ -163,11 +163,11 @@ struct rethook_node *rethook_try_get(struct rethook *rh)
>  	if (unlikely(!rcu_is_watching()))
>  		return NULL;
>  
> -	fn = freelist_try_get(&rh->pool);
> -	if (!fn)
> +	nod = (struct rethook_node *)objpool_pop(&rh->pool);
> +	if (!nod)
>  		return NULL;
>  
> -	return container_of(fn, struct rethook_node, freelist);
> +	return nod;

You can just write this part as

	return (struct rethook_node *)objpool_pop(&rh->pool);

Thank you,

>  }
>  NOKPROBE_SYMBOL(rethook_try_get);
>  
> -- 
> 2.34.1
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 1/4] lib: objpool added: ring-array based lockless MPMC queue
  2022-11-14 15:54       ` Masami Hiramatsu
@ 2022-11-16 10:42         ` wuqiang
  0 siblings, 0 replies; 23+ messages in thread
From: wuqiang @ 2022-11-16 10:42 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: davem, anil.s.keshavamurthy, naveen.n.rao, rostedt, peterz, akpm,
	sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel, lkp,
	mattwu

On 2022/11/14 23:54, Masami Hiramatsu (Google) wrote:
> On Tue,  8 Nov 2022 15:14:40 +0800
> wuqiang <wuqiang.matt@bytedance.com> wrote:
> 
>> The object pool is a scalable implementaion of high performance queue
>> for objects allocation and reclamation, such as kretprobe instances.
>>
>> With leveraging per-cpu ring-array to mitigate the hot spots of memory
>> contention, it could deliver near-linear scalability for high parallel
>> scenarios. The ring-array is compactly managed in a single cache-line
>> to benefit from warmed L1 cache for most cases (<= 4 objects per-core).
>> The body of pre-allocated objects is stored in continuous cache-lines
>> just after the ring-array.
>>
>> The object pool is interrupt safe. Both allocation and reclamation
>> (object pop and push operations) can be preemptible or interruptable.
>>
>> It's best suited for following cases:
>> 1) Memory allocation or reclamation are prohibited or too expensive
>> 2) Consumers are of different priorities, such as irqs and threads
>>
>> Limitations:
>> 1) Maximum objects (capacity) is determined during pool initializing
>> 2) The memory of objects won't be freed until the poll is finalized
>> 3) Object allocation (pop) may fail after trying all cpu slots
>> 4) Object reclamation (push) won't fail but may take long time to
>>     finish for imbalanced scenarios. You can try larger max_entries
>>     to mitigate, or ( >= CPUS * nr_objs) to avoid
>>
>> Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
>> ---
>>   include/linux/objpool.h | 153 +++++++++++++
>>   lib/Makefile            |   2 +-
>>   lib/objpool.c           | 487 ++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 641 insertions(+), 1 deletion(-)
>>   create mode 100644 include/linux/objpool.h
>>   create mode 100644 lib/objpool.c
>>
>> diff --git a/include/linux/objpool.h b/include/linux/objpool.h
>> new file mode 100644
>> index 000000000000..7899b054b50c
>> --- /dev/null
>> +++ b/include/linux/objpool.h
>> @@ -0,0 +1,153 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#ifndef _LINUX_OBJPOOL_H
>> +#define _LINUX_OBJPOOL_H
>> +
>> +#include <linux/types.h>
>> +
>> +/*
>> + * objpool: ring-array based lockless MPMC queue
>> + *
>> + * Copyright: wuqiang.matt@bytedance.com
>> + *
>> + * The object pool is a scalable implementaion of high performance queue
>> + * for objects allocation and reclamation, such as kretprobe instances.
>> + *
>> + * With leveraging per-cpu ring-array to mitigate the hot spots of memory
>> + * contention, it could deliver near-linear scalability for high parallel
>> + * scenarios. The ring-array is compactly managed in a single cache-line
>> + * to benefit from warmed L1 cache for most cases (<= 4 objects per-core).
>> + * The body of pre-allocated objects is stored in continuous cache-lines
>> + * just after the ring-array.
>> + *
>> + * The object pool is interrupt safe. Both allocation and reclamation
>> + * (object pop and push operations) can be preemptible or interruptable.
>> + *
>> + * It's best suited for following cases:
>> + * 1) Memory allocation or reclamation are prohibited or too expensive
>> + * 2) Consumers are of different priorities, such as irqs and threads
>> + *
>> + * Limitations:
>> + * 1) Maximum objects (capacity) is determined during pool initializing
>> + * 2) The memory of objects won't be freed until the poll is finalized
>> + * 3) Object allocation (pop) may fail after trying all cpu slots
>> + * 4) Object reclamation (push) won't fail but may take long time to
>> + *    finish for imbalanced scenarios. You can try larger max_entries
>> + *    to mitigate, or ( >= CPUS * nr_objs) to avoid
>> + */
>> +
>> +/*
>> + * objpool_slot: per-cpu ring array
>> + *
>> + * Represents a cpu-local array-based ring buffer, its size is specialized
>> + * during initialization of object pool.
>> + *
>> + * The objpool_slot is allocated from local memory for NUMA system, and to
>> + * be kept compact in a single cacheline. ages[] is stored just after the
>> + * body of objpool_slot, and then entries[]. The Array of ages[] describes
>> + * revision of each item, solely used to avoid ABA. And array of entries[]
>> + * contains the pointers of objects.
>> + *
>> + * The default size of objpool_slot is a single cache-line, aka. 64 bytes.
>> + *
>> + * 64bit:
>> + *        4      8      12     16        32                 64
>> + * | head | tail | size | mask | ages[4] | ents[4]: (8 * 4) | objects
>> + *
>> + * 32bit:
>> + *        4      8      12     16        32        48       64
>> + * | head | tail | size | mask | ages[4] | ents[4] | unused | objects
>> + *
>> + */
>> +
>> +struct objpool_slot {
>> +	uint32_t                head;	/* head of ring array */
>> +	uint32_t                tail;	/* tail of ring array */
>> +	uint32_t                size;	/* array size, pow of 2 */
>> +	uint32_t                mask;	/* size - 1 */
>> +} __attribute__((packed));
>> +
>> +/* caller-specified object initial callback to setup each object, only called once */
>> +typedef int (*objpool_init_obj_cb)(void *context, void *obj);
> 
> It seems a bit confused that this "initialize object" callback
> don't have the @obj as the first argument.

Sure, will update in next version.

>> +
>> +/* caller-specified cleanup callback for private objects/pool/context */
>> +typedef int (*objpool_release_cb)(void *context, void *ptr, uint32_t flags);
> 
> Do you have any use-case for this release callback?
> If not, until actual use-case comes up, I recommend you to defer
> implementing it.

No actual use-case for now, since both kretprobe and rethook use internal
objects. It's mainly for user-managed objects and asynchronous finilization.

I'll reconsider your advice. Thanks.

>> +
>> +/* called for object releasing: ptr points to an object */
>> +#define OBJPOOL_FLAG_NODE        (0x00000001)
>> +/* for user pool and context releasing, ptr could be NULL */
>> +#define OBJPOOL_FLAG_POOL        (0x00001000)
>> +/* the object or pool to be released is user-managed */
>> +#define OBJPOOL_FLAG_USER        (0x00008000)
> 
> Ditto.
> 
>> +
>> +/*
>> + * objpool_head: object pooling metadata
>> + */
>> +
>> +struct objpool_head {
>> +	unsigned int            obj_size;	/* object & element size */
>> +	unsigned int            nr_objs;	/* total objs (to be pre-allocated) */
>> +	unsigned int            nr_cpus;	/* num of possible cpus */
>> +	unsigned int            capacity;	/* max objects per cpuslot */
>> +	unsigned long           flags;		/* flags for objpool management */
>> +	gfp_t                   gfp;		/* gfp flags for kmalloc & vmalloc */
>> +	unsigned int            pool_size;	/* user pool size in byes */
>> +	void                   *pool;		/* user managed memory pool */
>> +	struct objpool_slot   **cpu_slots;	/* array of percpu slots */
>> +	unsigned int           *slot_sizes;	/* size in bytes of slots */
>> +	objpool_release_cb      release;	/* resource cleanup callback */
>> +	void                   *context;	/* caller-provided context */
>> +};
>> +
>> +#define OBJPOOL_FROM_VMALLOC	(0x800000000)	/* objpool allocated from vmalloc area */
>> +#define OBJPOOL_HAVE_OBJECTS	(0x400000000)	/* objects allocated along with objpool */
> 
> This also doesn't need at this moment. Please start from simple
> design for review.
> 
>> +
>> +/* initialize object pool and pre-allocate objects */
>> +int objpool_init(struct objpool_head *head, unsigned int nr_objs,
>> +		 unsigned int max_objs, unsigned int object_size,
>> +		 gfp_t gfp, void *context, objpool_init_obj_cb objinit,
>> +		 objpool_release_cb release);
>> +
>> +/* add objects in batch from user provided pool */
>> +int objpool_populate(struct objpool_head *head, void *pool,
>> +		     unsigned int size, unsigned int object_size,
>> +		     void *context, objpool_init_obj_cb objinit);
>> +
>> +/* add pre-allocated object (managed by user) to objpool */
>> +int objpool_add(void *obj, struct objpool_head *head);
>> +
>> +/* allocate an object from objects pool */
>> +void *objpool_pop(struct objpool_head *head);
>> +
>> +/* reclaim an object to objects pool */
>> +int objpool_push(void *node, struct objpool_head *head);
>> +
>> +/* cleanup the whole object pool (objects including) */
>> +void objpool_fini(struct objpool_head *head);
>> +
>> +/* whether the object is pre-allocated with percpu slots */
>> +static inline int objpool_is_inslot(void *obj, struct objpool_head *head)
>> +{
>> +	void *slot;
>> +	int i;
>> +
>> +	if (!obj || !(head->flags & OBJPOOL_HAVE_OBJECTS))
>> +		return 0;
>> +
>> +	for (i = 0; i < head->nr_cpus; i++) {
>> +		slot = head->cpu_slots[i];
>> +		if (obj >= slot && obj < slot + head->slot_sizes[i])
>> +			return 1;
>> +	}
>> +
>> +	return 0;
>> +}
> 
> Ditto.
> 
> It is too complicated to mix the internal allocated objects
> and external ones. This will expose the implementation of the
> objpool (users must understand they have to free the object
> only outside of slot)

Mixing is NOT recommended and normally it should be one of internal / user
oool / external. But as a general lib of objpool, mixing is supported just
because we don't know what use-cases real users would face.

> You can add it afterwards if it is really needed :)

Sure, I'll rethink of it. For kretprobe and rethook, a performance-oriented
version (with internal objects) should be simplely enough and pretty compact.

>> +
>> +/* whether the object is from user pool (batched adding) */
>> +static inline int objpool_is_inpool(void *obj, struct objpool_head *head)
>> +{
>> +	return (obj && head->pool && obj >= head->pool &&
>> +		obj < head->pool + head->pool_size);
>> +}
>> +
>> +#endif /* _LINUX_OBJPOOL_H */
>> diff --git a/lib/Makefile b/lib/Makefile
>> index 161d6a724ff7..e938703a321f 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -34,7 +34,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
>>   	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
>>   	 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
>>   	 nmi_backtrace.o win_minmax.o memcat_p.o \
>> -	 buildid.o
>> +	 buildid.o objpool.o
>>   
>>   lib-$(CONFIG_PRINTK) += dump_stack.o
>>   lib-$(CONFIG_SMP) += cpumask.o
>> diff --git a/lib/objpool.c b/lib/objpool.c
>> new file mode 100644
>> index 000000000000..ecffa0795f3d
>> --- /dev/null
>> +++ b/lib/objpool.c
>> @@ -0,0 +1,487 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +#include <linux/objpool.h>
>> +#include <linux/slab.h>
>> +#include <linux/vmalloc.h>
>> +#include <linux/atomic.h>
>> +#include <linux/prefetch.h>
>> +
>> +/*
>> + * objpool: ring-array based lockless MPMC/FIFO queues
>> + *
>> + * Copyright: wuqiang.matt@bytedance.com
>> + */
>> +
>> +/* compute the suitable num of objects to be managed by slot */
>> +static inline unsigned int __objpool_num_of_objs(unsigned int size)
>> +{
>> +	return rounddown_pow_of_two((size - sizeof(struct objpool_slot)) /
>> +			(sizeof(uint32_t) + sizeof(void *)));
>> +}
>> +
>> +#define SLOT_AGES(s) ((uint32_t *)((char *)(s) + sizeof(struct objpool_slot)))
>> +#define SLOT_ENTS(s) ((void **)((char *)(s) + sizeof(struct objpool_slot) + \
>> +			sizeof(uint32_t) * (s)->size))
>> +#define SLOT_OBJS(s) ((void *)((char *)(s) + sizeof(struct objpool_slot) + \
>> +			(sizeof(uint32_t) + sizeof(void *)) * (s)->size))
>> +
>> +/* allocate and initialize percpu slots */
>> +static inline int
>> +__objpool_init_percpu_slots(struct objpool_head *head, unsigned int nobjs,
>> +			void *context, objpool_init_obj_cb objinit)
>> +{
>> +	unsigned int i, j, n, size, objsz, nents = head->capacity;
>> +
>> +	/* aligned object size by sizeof(void *) */
>> +	objsz = ALIGN(head->obj_size, sizeof(void *));
>> +	/* shall we allocate objects along with objpool_slot */
>> +	if (objsz)
>> +		head->flags |= OBJPOOL_HAVE_OBJECTS;
>> +
>> +	for (i = 0; i < head->nr_cpus; i++) {
>> +		struct objpool_slot *os;
>> +
>> +		/* compute how many objects to be managed by this slot */
>> +		n = nobjs / head->nr_cpus;
>> +		if (i < (nobjs % head->nr_cpus))
>> +			n++;
>> +		size = sizeof(struct objpool_slot) + sizeof(void *) * nents +
>> +		       sizeof(uint32_t) * nents + objsz * n;
>> +
>> +		/* decide memory area for cpu-slot allocation */
>> +		if (!i && !(head->gfp & GFP_ATOMIC) && size > PAGE_SIZE / 2)
>> +			head->flags |= OBJPOOL_FROM_VMALLOC;
>> +
>> +		/* allocate percpu slot & objects from local memory */
>> +		if (head->flags & OBJPOOL_FROM_VMALLOC)
>> +			os = __vmalloc_node(size, sizeof(void *), head->gfp,
>> +				cpu_to_node(i), __builtin_return_address(0));
>> +		else
>> +			os = kmalloc_node(size, head->gfp, cpu_to_node(i));
>> +		if (!os)
>> +			return -ENOMEM;
>> +
>> +		/* initialize percpu slot for the i-th cpu */
>> +		memset(os, 0, size);
>> +		os->size = head->capacity;
>> +		os->mask = os->size - 1;
>> +		head->cpu_slots[i] = os;
>> +		head->slot_sizes[i] = size;
>> +
>> +		/*
>> +		 * start from 2nd round to avoid conflict of 1st item.
>> +		 * we assume that the head item is ready for retrieval
>> +		 * iff head is equal to ages[head & mask]. but ages is
>> +		 * initialized as 0, so in view of the caller of pop(),
>> +		 * the 1st item (0th) is always ready, but fact could
>> +		 * be: push() is stalled before the final update, thus
>> +		 * the item being inserted will be lost forever.
>> +		 */
>> +		os->head = os->tail = head->capacity;
>> +
>> +		if (!objsz)
>> +			continue;
>> +
>> +		for (j = 0; j < n; j++) {
>> +			uint32_t *ages = SLOT_AGES(os);
>> +			void **ents = SLOT_ENTS(os);
>> +			void *obj = SLOT_OBJS(os) + j * objsz;
>> +			uint32_t ie = os->tail & os->mask;
>> +
>> +			/* perform object initialization */
>> +			if (objinit) {
>> +				int rc = objinit(context, obj);
>> +				if (rc)
>> +					return rc;
>> +			}
>> +
>> +			/* add obj into the ring array */
>> +			ents[ie] = obj;
>> +			ages[ie] = os->tail;
>> +			os->tail++;
>> +			head->nr_objs++;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +/* cleanup all percpu slots of the object pool */
>> +static inline void __objpool_fini_percpu_slots(struct objpool_head *head)
>> +{
>> +	unsigned int i;
>> +
>> +	if (!head->cpu_slots)
>> +		return;
>> +
>> +	for (i = 0; i < head->nr_cpus; i++) {
>> +		if (!head->cpu_slots[i])
>> +			continue;
>> +		if (head->flags & OBJPOOL_FROM_VMALLOC)
>> +			vfree(head->cpu_slots[i]);
>> +		else
>> +			kfree(head->cpu_slots[i]);
>> +	}
>> +	kfree(head->cpu_slots);
>> +	head->cpu_slots = NULL;
>> +	head->slot_sizes = NULL;
>> +}
>> +
>> +/**
>> + * objpool_init: initialize object pool and pre-allocate objects
>> + *
>> + * args:
>> + * @head:    the object pool to be initialized, declared by caller
>> + * @nr_objs: total objects to be pre-allocated by this object pool
>> + * @max_objs: max entries (object pool capacity), use nr_objs if 0
>> + * @object_size: size of an object, no objects pre-allocated if 0
>> + * @gfp:     flags for memory allocation (via kmalloc or vmalloc)
>> + * @context: user context for object initialization callback
>> + * @objinit: object initialization callback for extra setting-up
>> + * @release: cleanup callback for private objects/pool/context
>> + *
>> + * return:
>> + *         0 for success, otherwise error code
>> + *
>> + * All pre-allocated objects are to be zeroed. Caller could do extra
>> + * initialization in objinit callback. The objinit callback will be
>> + * called once and only once after the slot allocation. Then objpool
>> + * won't touch any content of the objects since then. It's caller's
>> + * duty to perform reinitialization after object allocation (pop) or
>> + * clearance before object reclamation (push) if required.
>> + */
>> +int objpool_init(struct objpool_head *head, unsigned int nr_objs,
>> +		unsigned int max_objs, unsigned int object_size,
>> +		gfp_t gfp, void *context, objpool_init_obj_cb objinit,
>> +		objpool_release_cb release)
>> +{
>> +	unsigned int nents, ncpus = num_possible_cpus();
>> +	int rc;
>> +
>> +	/* calculate percpu slot size (rounded to pow of 2) */
>> +	if (max_objs < nr_objs)
> 
> This should be an error case.
> 
> 	if (!max_objs)
> 
>> +		max_objs = nr_objs;
> 
> 	else if (max_objs < nr_objs)
> 		return -EINVAL;

Got it.

> But to simplify that, I think it should use only nr_objs.
> I mean, if we can pass the @objinit, there seems no reason to
> have both nr_objs and max_objs.

I kept both them just to give user the flexibility to best meet the
requirements. For cases that objects are in imbalanced distribution
among CPUs, max_objs should be bigger enough (nr_cpus * nr_objs) to
deliver good performance, with a cost of a little more memory.

>> +	nents = max_objs / ncpus;
>> +	if (nents < __objpool_num_of_objs(L1_CACHE_BYTES))
>> +		nents = __objpool_num_of_objs(L1_CACHE_BYTES);
>> +	nents = roundup_pow_of_two(nents);
>> +	while (nents * ncpus < nr_objs)
>> +		nents = nents << 1;
>> +
>> +	memset(head, 0, sizeof(struct objpool_head));
>> +	head->nr_cpus = ncpus;
>> +	head->obj_size = object_size;
>> +	head->capacity = nents;
>> +	head->gfp = gfp & ~__GFP_ZERO;
>> +	head->context = context;
>> +	head->release = release;
>> +
>> +	/* allocate array for percpu slots */
>> +	head->cpu_slots = kzalloc(head->nr_cpus * sizeof(void *) +
>> +			       head->nr_cpus * sizeof(uint32_t), head->gfp);
>> +	if (!head->cpu_slots)
>> +		return -ENOMEM;
>> +	head->slot_sizes = (uint32_t *)&head->cpu_slots[head->nr_cpus];
>> +
>> +	/* initialize per-cpu slots */
>> +	rc = __objpool_init_percpu_slots(head, nr_objs, context, objinit);
>> +	if (rc)
>> +		__objpool_fini_percpu_slots(head);
>> +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_GPL(objpool_init);
>> +
>> +/* adding object to slot tail, the given slot must NOT be full */
>> +static inline int __objpool_add_slot(void *obj, struct objpool_slot *os)
>> +{
>> +	uint32_t *ages = SLOT_AGES(os);
>> +	void **ents = SLOT_ENTS(os);
>> +	uint32_t tail = atomic_inc_return((atomic_t *)&os->tail) - 1;
>> +
>> +	WRITE_ONCE(ents[tail & os->mask], obj);
>> +
>> +	/* order matters: obj must be updated before tail updating */
>> +	smp_store_release(&ages[tail & os->mask], tail);
>> +	return 0;
>> +}
>> +
>> +/* adding object to slot, abort if the slot was already full */
>> +static inline int __objpool_try_add_slot(void *obj, struct objpool_slot *os)
>> +{
>> +	uint32_t *ages = SLOT_AGES(os);
>> +	void **ents = SLOT_ENTS(os);
>> +	uint32_t head, tail;
>> +
>> +	do {
>> +		/* perform memory loading for both head and tail */
>> +		head = READ_ONCE(os->head);
>> +		tail = READ_ONCE(os->tail);
>> +		/* just abort if slot is full */
>> +		if (tail >= head + os->size)
>> +			return -ENOENT;
>> +		/* try to extend tail by 1 using CAS to avoid races */
>> +		if (try_cmpxchg_acquire(&os->tail, &tail, tail + 1))
>> +			break;
>> +	} while (1);
>> +
>> +	/* the tail-th of slot is reserved for the given obj */
>> +	WRITE_ONCE(ents[tail & os->mask], obj);
>> +	/* update epoch id to make this object available for pop() */
>> +	smp_store_release(&ages[tail & os->mask], tail);
>> +	return 0;
>> +}
>> +
>> +/**
>> + * objpool_populate: add objects from user provided pool in batch
>> + *
>> + * args:
>> + * @head:  object pool
>> + * @pool: user buffer for pre-allocated objects
>> + * @size: size of user buffer
>> + * @object_size: size of object & element
>> + * @context: user context for objinit callback
>> + * @objinit: object initialization callback
>> + *
>> + * return: 0 or error code
>> + */
>> +int objpool_populate(struct objpool_head *head, void *pool,
>> +		unsigned int size, unsigned int object_size,
>> +		void *context, objpool_init_obj_cb objinit)
>> +{
>> +	unsigned int n = head->nr_objs, used = 0, i;
>> +
>> +	if (head->pool || !pool || size < object_size)
>> +		return -EINVAL;
>> +	if (head->obj_size && head->obj_size != object_size)
>> +		return -EINVAL;
>> +	if (head->context && context && head->context != context)
>> +		return -EINVAL;
>> +	if (head->nr_objs >= head->nr_cpus * head->capacity)
>> +		return -ENOENT;
>> +
>> +	WARN_ON_ONCE(((unsigned long)pool) & (sizeof(void *) - 1));
>> +	WARN_ON_ONCE(((uint32_t)object_size) & (sizeof(void *) - 1));
>> +
>> +	/* align object size by sizeof(void *) */
>> +	head->obj_size = object_size;
>> +	object_size = ALIGN(object_size, sizeof(void *));
>> +	if (object_size == 0)
>> +		return -EINVAL;
>> +
>> +	while (used + object_size <= size) {
>> +		void *obj = pool + used;
>> +
>> +		/* perform object initialization */
>> +		if (objinit) {
>> +			int rc = objinit(context, obj);
>> +			if (rc)
>> +				return rc;
>> +		}
>> +
>> +		/* insert obj to its corresponding objpool slot */
>> +		i = (n + used * head->nr_cpus/size) % head->nr_cpus;
>> +		if (!__objpool_try_add_slot(obj, head->cpu_slots[i]))
>> +			head->nr_objs++;
>> +
>> +		used += object_size;
>> +	}
>> +
>> +	if (!used)
>> +		return -ENOENT;
>> +
>> +	head->context = context;
>> +	head->pool = pool;
>> +	head->pool_size = size;
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(objpool_populate);
>> +
>> +/**
>> + * objpool_add: add pre-allocated object to objpool during pool
>> + * initialization
>> + *
>> + * args:
>> + * @obj:  object pointer to be added to objpool
>> + * @head: object pool to be inserted into
>> + *
>> + * return:
>> + *     0 or error code
>> + *
>> + * objpool_add_node doesn't handle race conditions, can only be
>> + * called during objpool initialization
>> + */
>> +int objpool_add(void *obj, struct objpool_head *head)
>> +{
>> +	unsigned int i, cpu;
>> +
>> +	if (!obj)
>> +		return -EINVAL;
>> +	if (head->nr_objs >= head->nr_cpus * head->capacity)
>> +		return -ENOENT;
>> +
>> +	cpu = head->nr_objs % head->nr_cpus;
>> +	for (i = 0; i < head->nr_cpus; i++) {
>> +		if (!__objpool_try_add_slot(obj, head->cpu_slots[cpu])) {
>> +			head->nr_objs++;
>> +			return 0;
>> +		}
>> +
>> +		if (++cpu >= head->nr_cpus)
>> +			cpu = 0;
>> +	}
>> +
>> +	return -ENOENT;
>> +}
>> +EXPORT_SYMBOL_GPL(objpool_add);
>> +
>> +/**
>> + * objpool_push: reclaim the object and return back to objects pool
>> + *
>> + * args:
>> + * @obj:  object pointer to be pushed to object pool
>> + * @head: object pool
>> + *
>> + * return:
>> + *     0 or error code: it fails only when objects pool are full
>> + *
>> + * objpool_push is non-blockable, and can be nested
>> + */
>> +int objpool_push(void *obj, struct objpool_head *head)
>> +{
>> +	unsigned int cpu = raw_smp_processor_id() % head->nr_cpus;
>> +
>> +	do {
>> +		if (head->nr_objs > head->capacity) {
>> +			if (!__objpool_try_add_slot(obj, head->cpu_slots[cpu]))
>> +				return 0;
>> +		} else {
>> +			if (!__objpool_add_slot(obj, head->cpu_slots[cpu]))
>> +				return 0;
>> +		}
>> +		if (++cpu >= head->nr_cpus)
>> +			cpu = 0;
>> +	} while (1);
>> +
>> +	return -ENOENT;
>> +}
>> +EXPORT_SYMBOL_GPL(objpool_push);
>> +
>> +/* try to retrieve object from slot */
>> +static inline void *__objpool_try_get_slot(struct objpool_slot *os)
>> +{
>> +	uint32_t *ages = SLOT_AGES(os);
>> +	void **ents = SLOT_ENTS(os);
>> +	/* do memory load of head to local head */
>> +	uint32_t head = smp_load_acquire(&os->head);
>> +
>> +	/* loop if slot isn't empty */
>> +	while (head != READ_ONCE(os->tail)) {
>> +		uint32_t id = head & os->mask, prev = head;
>> +
>> +		/* do prefetching of object ents */
>> +		prefetch(&ents[id]);
>> +
>> +		/*
>> +		 * check whether this item was ready for retrieval ? There's
>> +		 * possibility * in theory * we might retrieve wrong object,
>> +		 * in case ages[id] overflows when current task is sleeping,
>> +		 * but it will take very very long to overflow an uint32_t
>> +		 */
>> +		if (smp_load_acquire(&ages[id]) == head) {
>> +			/* node must have been udpated by push() */
>> +			void *node = READ_ONCE(ents[id]);
>> +			/* commit and move forward head of the slot */
>> +			if (try_cmpxchg_release(&os->head, &head, head + 1))
>> +				return node;
>> +		}
>> +
>> +		/* re-load head from memory continue trying */
>> +		head = READ_ONCE(os->head);
>> +		/*
>> +		 * head stays unchanged, so it's very likely current pop()
>> +		 * just preempted/interrupted an ongoing push() operation
>> +		 */
>> +		if (head == prev)
>> +			break;
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +/**
>> + * objpool_pop: allocate an object from objects pool
>> + *
>> + * args:
>> + * @oh:  object pool
>> + *
>> + * return:
>> + *   object: NULL if failed (object pool is empty)
>> + *
>> + * objpool_pop can be nested, so can be used in any context.
>> + */
>> +void *objpool_pop(struct objpool_head *head)
>> +{
>> +	unsigned int i, cpu;
>> +	void *obj = NULL;
>> +
>> +	cpu = raw_smp_processor_id() % head->nr_cpus;
> 
> (Not sure, do we really need this?)

V6 (this version) needs it, since it's using num_possible_cpus() to manage
the array of cpu_slots.

Last week I finished an improved version, which takes account of holes in
cpu_possible_mask. The testings are going well so far.

> Thank you,

Your advices would be greatly appreciated. Thank you for your time.

> 
>> +	for (i = 0; i < head->nr_cpus; i++) {
>> +		struct objpool_slot *slot = head->cpu_slots[cpu];
>> +		obj = __objpool_try_get_slot(slot);
>> +		if (obj)
>> +			break;
>> +		if (++cpu >= head->nr_cpus)
>> +			cpu = 0;
>> +	}
>> +
>> +	return obj;
>> +}
>> +EXPORT_SYMBOL_GPL(objpool_pop);
>> +
>> +/**
>> + * objpool_fini: cleanup the whole object pool (releasing all objects)
>> + *
>> + * args:
>> + * @head: object pool to be released
>> + *
>> + */
>> +void objpool_fini(struct objpool_head *head)
>> +{
>> +	uint32_t i, flags;
>> +
>> +	if (!head->cpu_slots)
>> +		return;
>> +
>> +	if (!head->release) {
>> +		__objpool_fini_percpu_slots(head);
>> +		return;
>> +	}
>> +
>> +	/* cleanup all objects remained in objpool */
>> +	for (i = 0; i < head->nr_cpus; i++) {
>> +		void *obj;
>> +		do {
>> +			flags = OBJPOOL_FLAG_NODE;
>> +			obj = __objpool_try_get_slot(head->cpu_slots[i]);
>> +			if (!obj)
>> +				break;
>> +			if (!objpool_is_inpool(obj, head) &&
>> +			    !objpool_is_inslot(obj, head)) {
>> +				flags |= OBJPOOL_FLAG_USER;
>> +			}
>> +			head->release(head->context, obj, flags);
>> +		} while (obj);
>> +	}
>> +
>> +	/* release percpu slots */
>> +	__objpool_fini_percpu_slots(head);
>> +
>> +	/* cleanup user private pool and related context */
>> +	flags = OBJPOOL_FLAG_POOL;
>> +	if (head->pool)
>> +		flags |= OBJPOOL_FLAG_USER;
>> +	head->release(head->context, head->pool, flags);
>> +}
>> +EXPORT_SYMBOL_GPL(objpool_fini);
>> -- 
>> 2.34.1
>>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 3/4] kprobes: kretprobe scalability improvement with objpool
  2022-11-14 15:56       ` Masami Hiramatsu
@ 2022-11-16 10:49         ` wuqiang
  0 siblings, 0 replies; 23+ messages in thread
From: wuqiang @ 2022-11-16 10:49 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: davem, anil.s.keshavamurthy, naveen.n.rao, rostedt, peterz, akpm,
	sander, ebiggers, dan.j.williams, jpoimboe, linux-kernel, lkp,
	mattwu

On 2022/11/14 23:56, Masami Hiramatsu (Google) wrote:
> Hi Wuqiang,
> 
> On Tue,  8 Nov 2022 15:14:42 +0800
> wuqiang <wuqiang.matt@bytedance.com> wrote:
> 
>> kretprobe is using freelist to manage return-instances, but freelist,
>> as LIFO queue based on singly linked list, scales badly and reduces
>> the overall throughput of kretprobed routines, especially for high
>> contention scenarios.
>>
>> Here's a typical throughput test of sys_flock (counts in 10 seconds,
>> measured with perf stat -a -I 10000 -e syscalls:sys_enter_flock):
>>
>> OS: Debian 10 X86_64, Linux 6.1rc2
>> HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
>>
>>        1X       2X       4X       6X       8X      12X      16X
>> 34762430 36546920 17949900 13101899 12569595 12646601 14729195
>>       24X      32X      48X      64X      72X      96X     128X
>> 19263546 10102064  8985418 11936495 11493980  7127789  9330985
>>
>> This patch introduces objpool to kretprobe and rethook, with orginal
>> freelist replaced and brings near-linear scalability to kretprobed
>> routines. Tests of kretprobe throughput show the biggest ratio as
>> 333.9x of the original freelist. Here's the comparison:
>>
>>                    1X         2X         4X         8X        16X
>> freelist:   34762430   36546920   17949900   12569595   14729195
>> objpool:    35627544   72182095  144068494  287564688  576903916
>>                   32X        48X        64X        96X       128X
>> freelist:   10102064    8985418   11936495    7127789    9330985
>> objpool:  1158876372 1737828164 2324371724 2380310472 2463182819
>>
>> Tests on 96-core ARM64 system output similarly, but with the biggest
>> ratio up to 642.2x:
>>
>> OS: Debian 10 AARCH64, Linux 6.1rc2
>> HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s
>>
>>                    1X         2X         4X         8X        16X
>> freelist:   17498299   10887037   10224710    8499132    6421751
>> objpool:    18715726   35549845   71615884  144258971  283707220
>>                   24X        32X        48X        64X        96X
>> freelist:    5339868    4819116    3593919    3121575    2687167
>> objpool:   419830913  571609748  877456139 1143316315 1725668029
>>
>> Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
>> ---
>>   include/linux/kprobes.h |  9 ++--
>>   include/linux/rethook.h | 15 +++----
>>   kernel/kprobes.c        | 95 +++++++++++++++++++----------------------
>>   kernel/trace/fprobe.c   | 17 ++------
>>   kernel/trace/rethook.c  | 80 +++++++++++++++++-----------------
>>   5 files changed, 95 insertions(+), 121 deletions(-)
>>
>> diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
>> index a0b92be98984..f13f01e600c2 100644
>> --- a/include/linux/kprobes.h
>> +++ b/include/linux/kprobes.h
>> @@ -27,7 +27,7 @@
>>   #include <linux/mutex.h>
>>   #include <linux/ftrace.h>
>>   #include <linux/refcount.h>
>> -#include <linux/freelist.h>
>> +#include <linux/objpool.h>
>>   #include <linux/rethook.h>
>>   #include <asm/kprobes.h>
>>   
>> @@ -141,6 +141,7 @@ static inline bool kprobe_ftrace(struct kprobe *p)
>>    */
>>   struct kretprobe_holder {
>>   	struct kretprobe	*rp;
>> +	struct objpool_head	oh;
> 
> Could you rename it to `pool` as rethook does?

Sure. Will udpate in next version.

>>   	refcount_t		ref;
>>   };
>>   
>> @@ -154,7 +155,6 @@ struct kretprobe {
>>   #ifdef CONFIG_KRETPROBE_ON_RETHOOK
>>   	struct rethook *rh;
>>   #else
>> -	struct freelist_head freelist;
> 
> OK, this is natural to move the head to kretprobe_holder, because
> objpool must call its fini() function when all objects are freed.
> 
>>   	struct kretprobe_holder *rph;
>>   #endif
>>   };
>> @@ -165,10 +165,7 @@ struct kretprobe_instance {
>>   #ifdef CONFIG_KRETPROBE_ON_RETHOOK
>>   	struct rethook_node node;
>>   #else
>> -	union {
>> -		struct freelist_node freelist;
>> -		struct rcu_head rcu;
>> -	};
>> +	struct rcu_head rcu;
>>   	struct llist_node llist;
>>   	struct kretprobe_holder *rph;
>>   	kprobe_opcode_t *ret_addr;
>> diff --git a/include/linux/rethook.h b/include/linux/rethook.h
>> index c8ac1e5afcd1..278ec65e71fe 100644
>> --- a/include/linux/rethook.h
>> +++ b/include/linux/rethook.h
>> @@ -6,7 +6,7 @@
>>   #define _LINUX_RETHOOK_H
>>   
>>   #include <linux/compiler.h>
>> -#include <linux/freelist.h>
>> +#include <linux/objpool.h>
>>   #include <linux/kallsyms.h>
>>   #include <linux/llist.h>
>>   #include <linux/rcupdate.h>
>> @@ -30,14 +30,14 @@ typedef void (*rethook_handler_t) (struct rethook_node *, void *, struct pt_regs
>>   struct rethook {
>>   	void			*data;
>>   	rethook_handler_t	handler;
>> -	struct freelist_head	pool;
>> +	struct objpool_head	pool;
> 
> On the other hand, the rethook already consolidated that "holder"
> feature to itself, so you don't need to move, just replace the
> freelist_head with objpool_head. LGTM.
> 
>>   	refcount_t		ref;
>>   	struct rcu_head		rcu;
>>   };
>>   
>>   /**
>>    * struct rethook_node - The rethook shadow-stack entry node.
>> - * @freelist: The freelist, linked to struct rethook::pool.
>> + * @nod: The objpool node, linked to struct rethook::pool.
> 
> I don't see @nod in rethook_node. maybe a typo?

My mistake. Forget to remove it when migrating to ring-array based objpool
from the freelist version.

>>    * @rcu: The rcu_head for deferred freeing.
>>    * @llist: The llist, linked to a struct task_struct::rethooks.
>>    * @rethook: The pointer to the struct rethook.
>> @@ -48,19 +48,15 @@ struct rethook {
>>    * on each entry of the shadow stack.
>>    */
>>   struct rethook_node {
>> -	union {
>> -		struct freelist_node freelist;
>> -		struct rcu_head      rcu;
>> -	};
>> +	struct rcu_head		rcu;
>>   	struct llist_node	llist;
>>   	struct rethook		*rethook;
>>   	unsigned long		ret_addr;
>>   	unsigned long		frame;
>>   };
>>   
>> -struct rethook *rethook_alloc(void *data, rethook_handler_t handler);
>> +struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp, int size, int max);
>>   void rethook_free(struct rethook *rh);
>> -void rethook_add_node(struct rethook *rh, struct rethook_node *node);
>>   struct rethook_node *rethook_try_get(struct rethook *rh);
>>   void rethook_recycle(struct rethook_node *node);
>>   void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
>> @@ -97,4 +93,3 @@ void rethook_flush_task(struct task_struct *tk);
>>   #endif
>>   
>>   #endif
>> -
>> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
>> index cd9f5a66a690..a8b202f87e2d 100644
>> --- a/kernel/kprobes.c
>> +++ b/kernel/kprobes.c
>> @@ -1865,10 +1865,12 @@ static struct notifier_block kprobe_exceptions_nb = {
>>   static void free_rp_inst_rcu(struct rcu_head *head)
>>   {
>>   	struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
>> +	struct kretprobe_holder *rph = ri->rph;
>>   
>> -	if (refcount_dec_and_test(&ri->rph->ref))
>> -		kfree(ri->rph);
>> -	kfree(ri);
>> +	if (refcount_dec_and_test(&rph->ref)) {
>> +		objpool_fini(&rph->oh);
>> +		kfree(rph);
>> +	}
>>   }
>>   NOKPROBE_SYMBOL(free_rp_inst_rcu);
>>   
>> @@ -1877,7 +1879,7 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
>>   	struct kretprobe *rp = get_kretprobe(ri);
>>   
>>   	if (likely(rp))
>> -		freelist_add(&ri->freelist, &rp->freelist);
>> +		objpool_push(ri, &rp->rph->oh);
>>   	else
>>   		call_rcu(&ri->rcu, free_rp_inst_rcu);
>>   }
>> @@ -1914,23 +1916,19 @@ NOKPROBE_SYMBOL(kprobe_flush_task);
>>   
>>   static inline void free_rp_inst(struct kretprobe *rp)
>>   {
>> -	struct kretprobe_instance *ri;
>> -	struct freelist_node *node;
>> -	int count = 0;
>> -
>> -	node = rp->freelist.head;
>> -	while (node) {
>> -		ri = container_of(node, struct kretprobe_instance, freelist);
>> -		node = node->next;
>> -
>> -		kfree(ri);
>> -		count++;
>> -	}
>> +	struct kretprobe_holder *rph = rp->rph;
>> +	void *nod;
>>   
>> -	if (refcount_sub_and_test(count, &rp->rph->ref)) {
>> -		kfree(rp->rph);
>> -		rp->rph = NULL;
>> -	}
>> +	rp->rph = NULL;
>> +	do {
>> +		nod = objpool_pop(&rph->oh);
>> +		/* deref anyway since we've one extra ref grabbed */
>> +		if (refcount_dec_and_test(&rph->ref)) {
>> +			objpool_fini(&rph->oh);
>> +			kfree(rph);
>> +			break;
>> +		}
>> +	} while (nod);
>>   }
>>   
>>   /* This assumes the 'tsk' is the current task or the is not running. */
>> @@ -2072,19 +2070,17 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
>>   static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
>>   {
>>   	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
>> +	struct kretprobe_holder *rph = rp->rph;
>>   	struct kretprobe_instance *ri;
>> -	struct freelist_node *fn;
>>   
>> -	fn = freelist_try_get(&rp->freelist);
>> -	if (!fn) {
>> +	ri = objpool_pop(&rph->oh);
>> +	if (!ri) {
>>   		rp->nmissed++;
>>   		return 0;
>>   	}
>>   
>> -	ri = container_of(fn, struct kretprobe_instance, freelist);
>> -
>>   	if (rp->entry_handler && rp->entry_handler(ri, regs)) {
>> -		freelist_add(&ri->freelist, &rp->freelist);
>> +		objpool_push(ri, &rph->oh);
>>   		return 0;
>>   	}
>>   
>> @@ -2174,10 +2170,19 @@ int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long o
>>   	return 0;
>>   }
>>   
>> +#ifndef CONFIG_KRETPROBE_ON_RETHOOK
>> +static int kretprobe_init_inst(void *context, void *nod)
>> +{
>> +	struct kretprobe_instance *ri = nod;
>> +
>> +	ri->rph = context;
>> +	return 0;
>> +}
>> +#endif
>> +
>>   int register_kretprobe(struct kretprobe *rp)
>>   {
>>   	int ret;
>> -	struct kretprobe_instance *inst;
>>   	int i;
>>   	void *addr;
>>   
>> @@ -2215,20 +2220,12 @@ int register_kretprobe(struct kretprobe *rp)
>>   #endif
>>   	}
>>   #ifdef CONFIG_KRETPROBE_ON_RETHOOK
>> -	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler);
>> +	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler, GFP_KERNEL,
>> +				sizeof(struct kretprobe_instance) + rp->data_size,
>> +				rp->maxactive);
>>   	if (!rp->rh)
>>   		return -ENOMEM;
>>   
>> -	for (i = 0; i < rp->maxactive; i++) {
>> -		inst = kzalloc(sizeof(struct kretprobe_instance) +
>> -			       rp->data_size, GFP_KERNEL);
>> -		if (inst == NULL) {
>> -			rethook_free(rp->rh);
>> -			rp->rh = NULL;
>> -			return -ENOMEM;
>> -		}
>> -		rethook_add_node(rp->rh, &inst->node);
>> -	}
>>   	rp->nmissed = 0;
>>   	/* Establish function entry probe point */
>>   	ret = register_kprobe(&rp->kp);
>> @@ -2237,25 +2234,19 @@ int register_kretprobe(struct kretprobe *rp)
>>   		rp->rh = NULL;
>>   	}
>>   #else	/* !CONFIG_KRETPROBE_ON_RETHOOK */
>> -	rp->freelist.head = NULL;
>>   	rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
>>   	if (!rp->rph)
>>   		return -ENOMEM;
>>   
>> -	rp->rph->rp = rp;
>> -	for (i = 0; i < rp->maxactive; i++) {
>> -		inst = kzalloc(sizeof(struct kretprobe_instance) +
>> -			       rp->data_size, GFP_KERNEL);
>> -		if (inst == NULL) {
>> -			refcount_set(&rp->rph->ref, i);
>> -			free_rp_inst(rp);
>> -			return -ENOMEM;
>> -		}
>> -		inst->rph = rp->rph;
>> -		freelist_add(&inst->freelist, &rp->freelist);
>> +	if (objpool_init(&rp->rph->oh, rp->maxactive, rp->maxactive,
>> +			  rp->data_size + sizeof(struct kretprobe_instance),
>> +			  GFP_KERNEL, rp->rph, kretprobe_init_inst, NULL)) {
>> +		kfree(rp->rph);
>> +		rp->rph = NULL;
>> +		return -ENOMEM;
>>   	}
>> -	refcount_set(&rp->rph->ref, i);
>> -
>> +	refcount_set(&rp->rph->ref, rp->maxactive + 1);
>> +	rp->rph->rp = rp;
>>   	rp->nmissed = 0;
>>   	/* Establish function entry probe point */
>>   	ret = register_kprobe(&rp->kp);
>> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
>> index e8143e368074..9a002bfbd216 100644
>> --- a/kernel/trace/fprobe.c
>> +++ b/kernel/trace/fprobe.c
>> @@ -125,7 +125,7 @@ static void fprobe_init(struct fprobe *fp)
>>   
>>   static int fprobe_init_rethook(struct fprobe *fp, int num)
>>   {
>> -	int i, size;
>> +	int size;
>>   
>>   	if (num < 0)
>>   		return -EINVAL;
>> @@ -140,20 +140,11 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
>>   	if (size < 0)
>>   		return -E2BIG;
>>   
>> -	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler);
>> +	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, GFP_KERNEL,
>> +				sizeof(struct fprobe_rethook_node), size);
>>   	if (!fp->rethook)
>>   		return -ENOMEM;
>> -	for (i = 0; i < size; i++) {
>> -		struct fprobe_rethook_node *node;
>> -
>> -		node = kzalloc(sizeof(*node), GFP_KERNEL);
>> -		if (!node) {
>> -			rethook_free(fp->rethook);
>> -			fp->rethook = NULL;
>> -			return -ENOMEM;
>> -		}
>> -		rethook_add_node(fp->rethook, &node->node);
>> -	}
>> +
>>   	return 0;
>>   }
>>   
>> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
>> index c69d82273ce7..01df98db2fbe 100644
>> --- a/kernel/trace/rethook.c
>> +++ b/kernel/trace/rethook.c
>> @@ -36,21 +36,17 @@ void rethook_flush_task(struct task_struct *tk)
>>   static void rethook_free_rcu(struct rcu_head *head)
>>   {
>>   	struct rethook *rh = container_of(head, struct rethook, rcu);
>> -	struct rethook_node *rhn;
>> -	struct freelist_node *node;
>> -	int count = 1;
>> +	struct rethook_node *nod;
>>   
>> -	node = rh->pool.head;
>> -	while (node) {
>> -		rhn = container_of(node, struct rethook_node, freelist);
>> -		node = node->next;
>> -		kfree(rhn);
>> -		count++;
>> -	}
>> -
>> -	/* The rh->ref is the number of pooled node + 1 */
>> -	if (refcount_sub_and_test(count, &rh->ref))
>> -		kfree(rh);
>> +	do {
>> +		nod = objpool_pop(&rh->pool);
>> +		/* deref anyway since we've one extra ref grabbed */
>> +		if (refcount_dec_and_test(&rh->ref)) {
>> +			objpool_fini(&rh->pool);
>> +			kfree(rh);
>> +			break;
>> +		}
>> +	} while (nod);
>>   }
>>   
>>   /**
>> @@ -70,16 +66,28 @@ void rethook_free(struct rethook *rh)
>>   	call_rcu(&rh->rcu, rethook_free_rcu);
>>   }
>>   
>> +static int rethook_init_node(void *context, void *nod)
>> +{
>> +	struct rethook_node *node = nod;
>> +
>> +	node->rethook = context;
>> +	return 0;
>> +}
>> +
>>   /**
>>    * rethook_alloc() - Allocate struct rethook.
>>    * @data: a data to pass the @handler when hooking the return.
>>    * @handler: the return hook callback function.
>> + * @gfp: default gfp for objpool allocation
>> + * @size: rethook node size
> 
> @size should be @data_size, which is the size of additional data bytes
> of rethook_node.

Got it.

>> + * @max: number of rethook nodes to be preallocated
>>    *
>>    * Allocate and initialize a new rethook with @data and @handler.
>>    * Return NULL if memory allocation fails or @handler is NULL.
>>    * Note that @handler == NULL means this rethook is going to be freed.
>>    */
>> -struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
>> +struct rethook *rethook_alloc(void *data, rethook_handler_t handler, gfp_t gfp,
>> +			      int size, int max)
> 
> This doesn't need @gfp. Just use GFP_KERNEL from rethook layer.

Ok.

>>   {
>>   	struct rethook *rh = kzalloc(sizeof(struct rethook), GFP_KERNEL);
> 
> And here,
> 
> 	if (data_size < 0 || max < 0)
> 		return -EINVAL;
> 

I will update it in next version.

>>   
>> @@ -88,34 +96,26 @@ struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
>>   
>>   	rh->data = data;
>>   	rh->handler = handler;
>> -	rh->pool.head = NULL;
>> -	refcount_set(&rh->ref, 1);
>>   
>> +	/* initialize the objpool for rethook nodes */
>> +	if (objpool_init(&rh->pool, max, max, size, gfp, rh, rethook_init_node,
>> +			NULL)) {
>> +		kfree(rh);
>> +		return NULL;
>> +	}
>> +	refcount_set(&rh->ref, max + 1);
>>   	return rh;
>>   }
>>   
>> -/**
>> - * rethook_add_node() - Add a new node to the rethook.
>> - * @rh: the struct rethook.
>> - * @node: the struct rethook_node to be added.
>> - *
>> - * Add @node to @rh. User must allocate @node (as a part of user's
>> - * data structure.) The @node fields are initialized in this function.
>> - */
>> -void rethook_add_node(struct rethook *rh, struct rethook_node *node)
>> -{
>> -	node->rethook = rh;
>> -	freelist_add(&node->freelist, &rh->pool);
>> -	refcount_inc(&rh->ref);
>> -}
>> -
>>   static void free_rethook_node_rcu(struct rcu_head *head)
>>   {
>>   	struct rethook_node *node = container_of(head, struct rethook_node, rcu);
>> +	struct rethook *rh = node->rethook;
>>   
>> -	if (refcount_dec_and_test(&node->rethook->ref))
>> -		kfree(node->rethook);
>> -	kfree(node);
>> +	if (refcount_dec_and_test(&rh->ref)) {
>> +		objpool_fini(&rh->pool);
>> +		kfree(rh);
>> +	}
>>   }
>>   
>>   /**
>> @@ -130,7 +130,7 @@ void rethook_recycle(struct rethook_node *node)
>>   	lockdep_assert_preemption_disabled();
>>   
>>   	if (likely(READ_ONCE(node->rethook->handler)))
>> -		freelist_add(&node->freelist, &node->rethook->pool);
>> +		objpool_push(node, &node->rethook->pool);
>>   	else
>>   		call_rcu(&node->rcu, free_rethook_node_rcu);
>>   }
>> @@ -146,7 +146,7 @@ NOKPROBE_SYMBOL(rethook_recycle);
>>   struct rethook_node *rethook_try_get(struct rethook *rh)
>>   {
>>   	rethook_handler_t handler = READ_ONCE(rh->handler);
>> -	struct freelist_node *fn;
>> +	struct rethook_node *nod;
>>   
>>   	lockdep_assert_preemption_disabled();
>>   
>> @@ -163,11 +163,11 @@ struct rethook_node *rethook_try_get(struct rethook *rh)
>>   	if (unlikely(!rcu_is_watching()))
>>   		return NULL;
>>   
>> -	fn = freelist_try_get(&rh->pool);
>> -	if (!fn)
>> +	nod = (struct rethook_node *)objpool_pop(&rh->pool);
>> +	if (!nod)
>>   		return NULL;
>>   
>> -	return container_of(fn, struct rethook_node, freelist);
>> +	return nod;
> 
> You can just write this part as
> 
> 	return (struct rethook_node *)objpool_pop(&rh->pool);

Yes, sure.

> Thank you,
> 

Best regards,

>>   }
>>   NOKPROBE_SYMBOL(rethook_try_get);
>>   
>> -- 
>> 2.34.1
>>


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-11-16 11:02 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-01  1:43 [PATCH v3] kprobes,lib: kretprobe scalability improvement wuqiang
2022-11-01 12:27 ` kernel test robot
2022-11-02  2:30 ` [PATCH v4] " wuqiang
2022-11-02 21:33   ` Andrew Morton
2022-11-03 16:46     ` wuqiang
2022-11-03  2:51   ` Masami Hiramatsu
2022-11-03 16:45     ` wuqiang
2022-11-04  1:28       ` Masami Hiramatsu
2022-11-06  5:34   ` [PATCH v5 0/4] lib,kprobes: " wuqiang
2022-11-06  5:34     ` [PATCH v5 1/4] lib: objpool added: ring-array based lockless MPMC queue wuqiang
2022-11-06  5:34     ` [PATCH v5 2/4] lib: objpool test module added wuqiang
2022-11-06  5:34     ` [PATCH v5 3/4] kprobes: kretprobe scalability improvement with objpool wuqiang
2022-11-06  5:34     ` [PATCH v5 4/4] kprobes: freelist.h removed wuqiang
2022-11-08  7:14   ` [PATCH v6 0/4] lib,kprobes: kretprobe scalability improvement wuqiang
2022-11-08  7:14     ` [PATCH v6 1/4] lib: objpool added: ring-array based lockless MPMC queue wuqiang
2022-11-14 15:54       ` Masami Hiramatsu
2022-11-16 10:42         ` wuqiang
2022-11-08  7:14     ` [PATCH v6 2/4] lib: objpool test module added wuqiang
2022-11-08  7:14     ` [PATCH v6 3/4] kprobes: kretprobe scalability improvement with objpool wuqiang
2022-11-14 15:56       ` Masami Hiramatsu
2022-11-14 15:56       ` Masami Hiramatsu
2022-11-16 10:49         ` wuqiang
2022-11-08  7:14     ` [PATCH v6 4/4] kprobes: freelist.h removed wuqiang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.