rcu.vger.kernel.org archive mirror
* [PATCH 0/7] Headless support in the kvfree_rcu()
@ 2020-03-23 11:36 Uladzislau Rezki (Sony)
  2020-03-23 11:36 ` [PATCH 1/7] rcu/tree: simplify KFREE_BULK_MAX_ENTR macro Uladzislau Rezki (Sony)
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Uladzislau Rezki (Sony) @ 2020-03-23 11:36 UTC (permalink / raw)
  To: LKML, Paul E . McKenney, Joel Fernandes
  Cc: RCU, Andrew Morton, Uladzislau Rezki, Steven Rostedt,
	Oleksiy Avramchenko

Motivation.
We have had some discussions about the possibility of
kvfree()-ing an object that does not contain any rcu_head
inside its allocated memory, basically to have a simple
interface like:

<snip>
    void *ptr = kvmalloc(some_bytes, GFP_KERNEL);
    if (ptr)
        kvfree_rcu(ptr);
<snip>

For example, please have a look at the recent ext4 topic:
    https://lkml.org/lkml/2020/2/19/1372

Due to the lack of the interface in question, an ext4-specific
workaround was introduced to kvfree() after a grace period:

<snip>
void ext4_kvfree_array_rcu(void *to_free)
{
	struct ext4_rcu_ptr *ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);

	if (ptr) {
		ptr->ptr = to_free;
		call_rcu(&ptr->rcu, ext4_rcu_ptr_callback);
		return;
	}
	synchronize_rcu();
	kvfree(to_free);
}
<snip>

Also, as Joel Fernandes mentioned and proposed, we can have a
head-less kfree_rcu() variant as well. With this series the
infrastructure is actually capable of doing that, but one
thing is missing: the kfree_rcu() macro can be called with
two arguments only, so it should be updated in the same way
as has been done for kvfree_rcu().

In that case we could update many places in the kernel where
people do not embed an rcu_head into their structures for
some reason and instead do:

<snip>
    synchronize_rcu();
    kfree(p);
<snip>

<snip>
urezki@pc636:~/data/ssd/coding/linux-rcu$ find ./ -name "*.c" | xargs grep -C 1 -rn "synchronize_rcu" | grep kfree
./arch/x86/mm/mmio-mod.c-314-           kfree(found_trace);
./kernel/module.c-3910- kfree(mod->args);
./kernel/trace/ftrace.c-5078-                   kfree(direct);
./kernel/trace/ftrace.c-5155-                   kfree(direct);
./kernel/trace/trace_probe.c-1087-      kfree(link);
./fs/nfs/sysfs.c-113-           kfree(old);
./fs/ext4/super.c-1701- kfree(old_qname);
./net/ipv4/gre.mod.c-36-        { 0xfc3fcca2, "kfree_skb" },
./net/core/sysctl_net_core.c-143-                               kfree(cur);
./drivers/crypto/nx/nx-842-pseries.c-1010-      kfree(old_devdata);
./drivers/misc/vmw_vmci/vmci_context.c-692-             kfree(notifier);
./drivers/misc/vmw_vmci/vmci_event.c-213-       kfree(s);
./drivers/infiniband/core/device.c:2162:                         * synchronize_rcu before the netdev is kfreed, so we
./drivers/infiniband/hw/hfi1/sdma.c-1337-       kfree(dd->per_sdma);
./drivers/net/ethernet/myricom/myri10ge/myri10ge.c-3582-        kfree(mgp->ss);
./drivers/net/ethernet/myricom/myri10ge/myri10ge.mod.c-156-     { 0x37a0cba, "kfree" },
./drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c:286:       synchronize_rcu(); /* before kfree(flow) */
./drivers/net/ethernet/mellanox/mlxsw/core.c-1504-      kfree(rxl_item);
./drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c-6648- kfree(adapter->mbox_log);
./drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c-6650- kfree(adapter);
./drivers/block/drbd/drbd_receiver.c-3804-      kfree(old_net_conf);
./drivers/block/drbd/drbd_receiver.c-4176-                      kfree(old_disk_conf);
./drivers/block/drbd/drbd_state.c-2074-         kfree(old_conf);
./drivers/block/drbd/drbd_nl.c-1689-    kfree(old_disk_conf);
./drivers/block/drbd/drbd_nl.c-2522-    kfree(old_net_conf);
./drivers/block/drbd/drbd_nl.c-2935-            kfree(old_disk_conf);
./drivers/mfd/dln2.c-178-               kfree(i);
./drivers/staging/fwserial/fwserial.c-2122-     kfree(peer);
<snip>
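
With the interface proposed here, each of the call sites
listed above could, purely as an illustration, be converted
to a single call, assuming the object is freed from a context
that can sleep:

<snip>
    /* before */
    synchronize_rcu();
    kfree(p);

    /* after (hypothetical conversion) */
    kvfree_rcu(p);
<snip>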

Uladzislau Rezki (Sony) (7):
  rcu/tree: simplify KFREE_BULK_MAX_ENTR macro
  rcu/tree: maintain separate array for vmalloc ptrs
  rcu/tree: introduce expedited_drain flag
  rcu/tree: support reclaim for head-less object
  rcu/tiny: move kvfree_call_rcu() out of header
  rcu/tiny: support reclaim for head-less object
  rcu: support headless variant in the kvfree_rcu()

 include/linux/rcupdate.h |  38 ++++-
 include/linux/rcutiny.h  |   6 +-
 kernel/rcu/tiny.c        | 161 +++++++++++++++++++++
 kernel/rcu/tree.c        | 294 ++++++++++++++++++++++++++++-----------
 4 files changed, 407 insertions(+), 92 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/7] rcu/tree: simplify KFREE_BULK_MAX_ENTR macro
  2020-03-23 11:36 [PATCH 0/7] Headless support in the kvfree_rcu() Uladzislau Rezki (Sony)
@ 2020-03-23 11:36 ` Uladzislau Rezki (Sony)
  2020-03-23 11:36 ` [PATCH 2/7] rcu/tree: maintain separate array for vmalloc ptrs Uladzislau Rezki (Sony)
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Uladzislau Rezki (Sony) @ 2020-03-23 11:36 UTC (permalink / raw)
  To: LKML, Paul E . McKenney, Joel Fernandes
  Cc: RCU, Andrew Morton, Uladzislau Rezki, Steven Rostedt,
	Oleksiy Avramchenko, Boqun Feng

We can simplify the KFREE_BULK_MAX_ENTR macro and get rid of
the magic numbers which were used to make the structure
exactly one page in size.

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e7963270e7f6..49ba1ff50af5 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2740,13 +2740,6 @@ EXPORT_SYMBOL_GPL(call_rcu);
 #define KFREE_DRAIN_JIFFIES (HZ / 50)
 #define KFREE_N_BATCHES 2
 
-/*
- * This macro defines how many entries the "records" array
- * will contain. It is based on the fact that the size of
- * kfree_rcu_bulk_data structure becomes exactly one page.
- */
-#define KFREE_BULK_MAX_ENTR ((PAGE_SIZE / sizeof(void *)) - 3)
-
 /**
  * struct kfree_rcu_bulk_data - single block to store kfree_rcu() pointers
  * @nr_records: Number of active pointers in the array
@@ -2760,6 +2753,14 @@ struct kfree_rcu_bulk_data {
 	struct kfree_rcu_bulk_data *next;
 };
 
+/*
+ * This macro defines how many entries the "records" array
+ * will contain. It is based on the fact that the size of
+ * kfree_rcu_bulk_data structure becomes exactly one page.
+ */
+#define KFREE_BULK_MAX_ENTR \
+	((PAGE_SIZE - sizeof(struct kfree_rcu_bulk_data)) / sizeof(void *))
+
 /**
  * struct kfree_rcu_cpu_work - single batch of kfree_rcu() requests
  * @rcu_work: Let queue_rcu_work() invoke workqueue handler after grace period
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/7] rcu/tree: maintain separate array for vmalloc ptrs
  2020-03-23 11:36 [PATCH 0/7] Headless support in the kvfree_rcu() Uladzislau Rezki (Sony)
  2020-03-23 11:36 ` [PATCH 1/7] rcu/tree: simplify KFREE_BULK_MAX_ENTR macro Uladzislau Rezki (Sony)
@ 2020-03-23 11:36 ` Uladzislau Rezki (Sony)
  2020-03-23 11:36 ` [PATCH 3/7] rcu/tree: introduce expedited_drain flag Uladzislau Rezki (Sony)
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Uladzislau Rezki (Sony) @ 2020-03-23 11:36 UTC (permalink / raw)
  To: LKML, Paul E . McKenney, Joel Fernandes
  Cc: RCU, Andrew Morton, Uladzislau Rezki, Steven Rostedt,
	Oleksiy Avramchenko

To do so, an array of the common kvfree_rcu_bulk_data
structure is used. It consists of two elements: index 0
corresponds to SLAB pointers, whereas vmalloc pointers are
accessed via index 1.

The reason for not mixing the pointers is to have an easy
way to distinguish them.

It is also a preparation patch for head-less object support.
When an object is head-less we cannot queue it onto any list;
instead its pointer is placed directly into an array.
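
For illustration, the channel selection introduced below
boils down to roughly the following (condensed sketch, not
literal code from the patch):

    /* 0 - SLAB channel, 1 - vmalloc channel. */
    int idx = is_vmalloc_addr(ptr) ? 1 : 0;

    /* The pointer then goes into the per-CPU bulk array. */
    krcp->bkvhead[idx]->records[krcp->bkvhead[idx]->nr_records++] = ptr;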

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c | 179 ++++++++++++++++++++++++++++------------------
 1 file changed, 108 insertions(+), 71 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 49ba1ff50af5..20d08eca7006 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2741,38 +2741,36 @@ EXPORT_SYMBOL_GPL(call_rcu);
 #define KFREE_N_BATCHES 2
 
 /**
- * struct kfree_rcu_bulk_data - single block to store kfree_rcu() pointers
+ * struct kvfree_rcu_bulk_data - single block to store kvfree() pointers
  * @nr_records: Number of active pointers in the array
- * @records: Array of the kfree_rcu() pointers
  * @next: Next bulk object in the block chain
- * @head_free_debug: For debug, when CONFIG_DEBUG_OBJECTS_RCU_HEAD is set
+ * @records: Array of the SLAB pointers
  */
-struct kfree_rcu_bulk_data {
+struct kvfree_rcu_bulk_data {
 	unsigned long nr_records;
-	void *records[KFREE_BULK_MAX_ENTR];
-	struct kfree_rcu_bulk_data *next;
+	struct kvfree_rcu_bulk_data *next;
+	void *records[];
 };
 
 /*
  * This macro defines how many entries the "records" array
  * will contain. It is based on the fact that the size of
- * kfree_rcu_bulk_data structure becomes exactly one page.
+ * kvfree_rcu_bulk_data become exactly one page.
  */
-#define KFREE_BULK_MAX_ENTR \
-	((PAGE_SIZE - sizeof(struct kfree_rcu_bulk_data)) / sizeof(void *))
+#define KVFREE_BULK_MAX_ENTR \
+	((PAGE_SIZE - sizeof(struct kvfree_rcu_bulk_data)) / sizeof(void *))
 
 /**
  * struct kfree_rcu_cpu_work - single batch of kfree_rcu() requests
  * @rcu_work: Let queue_rcu_work() invoke workqueue handler after grace period
  * @head_free: List of kfree_rcu() objects waiting for a grace period
- * @bhead_free: Bulk-List of kfree_rcu() objects waiting for a grace period
+ * @bkvhead_free: Bulk-List of kfree_rcu() objects waiting for a grace period
  * @krcp: Pointer to @kfree_rcu_cpu structure
  */
-
 struct kfree_rcu_cpu_work {
 	struct rcu_work rcu_work;
 	struct rcu_head *head_free;
-	struct kfree_rcu_bulk_data *bhead_free;
+	struct kvfree_rcu_bulk_data *bkvhead_free[2];
 	struct kfree_rcu_cpu *krcp;
 };
 
@@ -2794,8 +2792,9 @@ struct kfree_rcu_cpu_work {
  */
 struct kfree_rcu_cpu {
 	struct rcu_head *head;
-	struct kfree_rcu_bulk_data *bhead;
-	struct kfree_rcu_bulk_data *bcached;
+	struct kvfree_rcu_bulk_data *bkvhead[2];
+	struct kvfree_rcu_bulk_data *bkvcache[2];
+
 	struct kfree_rcu_cpu_work krw_arr[KFREE_N_BATCHES];
 	spinlock_t lock;
 	struct delayed_work monitor_work;
@@ -2808,7 +2807,7 @@ struct kfree_rcu_cpu {
 static DEFINE_PER_CPU(struct kfree_rcu_cpu, krc);
 
 static __always_inline void
-debug_rcu_bhead_unqueue(struct kfree_rcu_bulk_data *bhead)
+debug_rcu_bhead_unqueue(struct kvfree_rcu_bulk_data *bhead)
 {
 #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
 	for (int i = 0; i < bhead->nr_records; i++)
@@ -2823,45 +2822,77 @@ debug_rcu_bhead_unqueue(struct kfree_rcu_bulk_data *bhead)
 static void kfree_rcu_work(struct work_struct *work)
 {
 	unsigned long flags;
+	struct kvfree_rcu_bulk_data *bkhead, *bknext;
+	struct kvfree_rcu_bulk_data *bvhead, *bvnext;
 	struct rcu_head *head, *next;
-	struct kfree_rcu_bulk_data *bhead, *bnext;
 	struct kfree_rcu_cpu *krcp;
 	struct kfree_rcu_cpu_work *krwp;
+	int i;
 
 	krwp = container_of(to_rcu_work(work),
-			    struct kfree_rcu_cpu_work, rcu_work);
+				struct kfree_rcu_cpu_work, rcu_work);
+
 	krcp = krwp->krcp;
 	spin_lock_irqsave(&krcp->lock, flags);
+	/* Channel 1. */
+	bkhead = krwp->bkvhead_free[0];
+	krwp->bkvhead_free[0] = NULL;
+
+	/* Channel 2. */
+	bvhead = krwp->bkvhead_free[1];
+	krwp->bkvhead_free[1] = NULL;
+
+	/* Channel 3. */
 	head = krwp->head_free;
 	krwp->head_free = NULL;
-	bhead = krwp->bhead_free;
-	krwp->bhead_free = NULL;
 	spin_unlock_irqrestore(&krcp->lock, flags);
 
-	/* "bhead" is now private, so traverse locklessly. */
-	for (; bhead; bhead = bnext) {
-		bnext = bhead->next;
+	/* kmalloc()/kfree() channel. */
+	for (; bkhead; bkhead = bknext) {
+		bknext = bkhead->next;
 
-		debug_rcu_bhead_unqueue(bhead);
+		debug_rcu_bhead_unqueue(bkhead);
 
 		rcu_lock_acquire(&rcu_callback_map);
 		trace_rcu_invoke_kfree_bulk_callback(rcu_state.name,
-			bhead->nr_records, bhead->records);
+			bkhead->nr_records, bkhead->records);
+
+		kfree_bulk(bkhead->nr_records, bkhead->records);
+		rcu_lock_release(&rcu_callback_map);
+
+		if (cmpxchg(&krcp->bkvcache[0], NULL, bkhead))
+			free_page((unsigned long) bkhead);
+
+		cond_resched_tasks_rcu_qs();
+	}
+
+	/* vmalloc()/vfree() channel. */
+	for (; bvhead; bvhead = bvnext) {
+		bvnext = bvhead->next;
+
+		debug_rcu_bhead_unqueue(bvhead);
 
-		kfree_bulk(bhead->nr_records, bhead->records);
+		rcu_lock_acquire(&rcu_callback_map);
+		for (i = 0; i < bvhead->nr_records; i++) {
+			trace_rcu_invoke_kvfree_callback(rcu_state.name,
+				(struct rcu_head *) bvhead->records[i], 0);
+			vfree(bvhead->records[i]);
+		}
 		rcu_lock_release(&rcu_callback_map);
 
-		if (cmpxchg(&krcp->bcached, NULL, bhead))
-			free_page((unsigned long) bhead);
+		if (cmpxchg(&krcp->bkvcache[1], NULL, bvhead))
+			free_page((unsigned long) bvhead);
 
 		cond_resched_tasks_rcu_qs();
 	}
 
 	/*
-	 * We can end up here either with 1) vmalloc() pointers or 2) were low
-	 * on memory and could not allocate a bulk array. It can happen under
-	 * low memory condition when an allocation gets failed, so the "bulk"
-	 * path can not be temporarly used.
+	 * This path covers emergency case only due to high
+	 * memory pressure also means low memory condition,
+	 * when we could not allocate a bulk array.
+	 *
+	 * Under that condition an object is queued to the
+	 * list instead.
 	 */
 	for (; head; head = next) {
 		unsigned long offset = (unsigned long)head->func;
@@ -2898,21 +2929,34 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
 		krwp = &(krcp->krw_arr[i]);
 
 		/*
-		 * Try to detach bhead or head and attach it over any
+		 * Try to detach bkvhead or head and attach it over any
 		 * available corresponding free channel. It can be that
 		 * a previous RCU batch is in progress, it means that
 		 * immediately to queue another one is not possible so
 		 * return false to tell caller to retry.
 		 */
-		if ((krcp->bhead && !krwp->bhead_free) ||
+		if ((krcp->bkvhead[0] && !krwp->bkvhead_free[0]) ||
+			(krcp->bkvhead[1] && !krwp->bkvhead_free[1]) ||
 				(krcp->head && !krwp->head_free)) {
-			/* Channel 1. */
-			if (!krwp->bhead_free) {
-				krwp->bhead_free = krcp->bhead;
-				krcp->bhead = NULL;
+			/*
+			 * Channel 1 corresponds to SLAB ptrs.
+			 */
+			if (!krwp->bkvhead_free[0]) {
+				krwp->bkvhead_free[0] = krcp->bkvhead[0];
+				krcp->bkvhead[0] = NULL;
+			}
+
+			/*
+			 * Channel 2 corresponds to vmalloc ptrs.
+			 */
+			if (!krwp->bkvhead_free[1]) {
+				krwp->bkvhead_free[1] = krcp->bkvhead[1];
+				krcp->bkvhead[1] = NULL;
 			}
 
-			/* Channel 2. */
+			/*
+			 * Channel 3 corresponds to emergency path.
+			 */
 			if (!krwp->head_free) {
 				krwp->head_free = krcp->head;
 				krcp->head = NULL;
@@ -2921,10 +2965,11 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
 			WRITE_ONCE(krcp->count, 0);
 
 			/*
-			 * One work is per one batch, so there are two "free channels",
-			 * "bhead_free" and "head_free" the batch can handle. It can be
-			 * that the work is in the pending state when two channels have
-			 * been detached following each other, one by one.
+			 * One work is per one batch, so there are three
+			 * "free channels", the batch can handle. It can
+			 * be that the work is in the pending state when
+			 * channels have been detached following by each
+			 * other.
 			 */
 			queue_rcu_work(system_wq, &krwp->rcu_work);
 			queued = true;
@@ -2969,26 +3014,25 @@ static void kfree_rcu_monitor(struct work_struct *work)
 }
 
 static inline bool
-kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
-	struct rcu_head *head, rcu_callback_t func)
+kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
 {
-	struct kfree_rcu_bulk_data *bnode;
+	struct kvfree_rcu_bulk_data *bnode;
+	int idx;
 
 	if (unlikely(!krcp->initialized))
 		return false;
 
 	lockdep_assert_held(&krcp->lock);
+	idx = !is_vmalloc_addr(ptr) ? 0:1;
 
 	/* Check if a new block is required. */
-	if (!krcp->bhead ||
-			krcp->bhead->nr_records == KFREE_BULK_MAX_ENTR) {
-		bnode = xchg(&krcp->bcached, NULL);
-		if (!bnode) {
-			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
-
-			bnode = (struct kfree_rcu_bulk_data *)
+	if (!krcp->bkvhead[idx] ||
+			krcp->bkvhead[idx]->nr_records ==
+				KVFREE_BULK_MAX_ENTR) {
+		bnode = xchg(&krcp->bkvcache[idx], NULL);
+		if (!bnode)
+			bnode = (struct kvfree_rcu_bulk_data *)
 				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
-		}
 
 		/* Switch to emergency path. */
 		if (unlikely(!bnode))
@@ -2996,30 +3040,30 @@ kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
 
 		/* Initialize the new block. */
 		bnode->nr_records = 0;
-		bnode->next = krcp->bhead;
+		bnode->next = krcp->bkvhead[idx];
 
 		/* Attach it to the head. */
-		krcp->bhead = bnode;
+		krcp->bkvhead[idx] = bnode;
 	}
 
 	/* Finally insert. */
-	krcp->bhead->records[krcp->bhead->nr_records++] =
-		(void *) head - (unsigned long) func;
+	krcp->bkvhead[idx]->records
+		[krcp->bkvhead[idx]->nr_records++] = ptr;
 
 	return true;
 }
 
 /*
- * Queue a request for lazy invocation of kfree_bulk()/kvfree() after a grace
- * period. Please note there are two paths are maintained, one is the main one
- * that uses kfree_bulk() interface and second one is emergency one, that is
- * used only when the main path can not be maintained temporary, due to memory
- * pressure.
+ * Queue a request for lazy invocation of appropriate free routine after a
+ * grace period. Please note there are three paths are maintained, two are the
+ * main ones that use array of pointers interface and third one is emergency
+ * one, that is used only when the main path can not be maintained temporary,
+ * due to memory pressure.
  *
  * Each kvfree_call_rcu() request is added to a batch. The batch will be drained
  * every KFREE_DRAIN_JIFFIES number of jiffies. All the objects in the batch will
  * be free'd in workqueue context. This allows us to: batch requests together to
- * reduce the number of grace periods during heavy kfree_rcu() load.
+ * reduce the number of grace periods during heavy kfree_rcu()/kvfree_rcu() load.
  */
 void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 {
@@ -3043,17 +3087,10 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 	}
 
 	/*
-	 * We do not queue vmalloc pointers into array,
-	 * instead they are just queued to the list. We
-	 * do it because of:
-	 *    a) to distinguish kmalloc()/vmalloc() ptrs;
-	 *    b) there is no vmalloc_bulk() interface.
-	 *
 	 * Under high memory pressure GFP_NOWAIT can fail,
 	 * in that case the emergency path is maintained.
 	 */
-	if (is_vmalloc_addr(ptr) ||
-	    !kfree_call_rcu_add_ptr_to_bulk(krcp, head, func)) {
+	if (!kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr)) {
 		head->func = func;
 		head->next = krcp->head;
 		krcp->head = head;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/7] rcu/tree: introduce expedited_drain flag
  2020-03-23 11:36 [PATCH 0/7] Headless support in the kvfree_rcu() Uladzislau Rezki (Sony)
  2020-03-23 11:36 ` [PATCH 1/7] rcu/tree: simplify KFREE_BULK_MAX_ENTR macro Uladzislau Rezki (Sony)
  2020-03-23 11:36 ` [PATCH 2/7] rcu/tree: maintain separate array for vmalloc ptrs Uladzislau Rezki (Sony)
@ 2020-03-23 11:36 ` Uladzislau Rezki (Sony)
  2020-03-23 11:36 ` [PATCH 4/7] rcu/tree: support reclaim for head-less object Uladzislau Rezki (Sony)
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Uladzislau Rezki (Sony) @ 2020-03-23 11:36 UTC (permalink / raw)
  To: LKML, Paul E . McKenney, Joel Fernandes
  Cc: RCU, Andrew Morton, Uladzislau Rezki, Steven Rostedt,
	Oleksiy Avramchenko

The flag is set to true when the bulk array cannot be
maintained, which happens under low-memory conditions and
memory pressure.

In that case the drain work is scheduled right away instead
of after KFREE_DRAIN_JIFFIES, which tends to speed up the
reclaim path. On the other hand, there are no data showing
the difference yet.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 20d08eca7006..869a72e25d38 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3061,14 +3061,16 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
  * due to memory pressure.
  *
  * Each kvfree_call_rcu() request is added to a batch. The batch will be drained
- * every KFREE_DRAIN_JIFFIES number of jiffies. All the objects in the batch will
- * be free'd in workqueue context. This allows us to: batch requests together to
- * reduce the number of grace periods during heavy kfree_rcu()/kvfree_rcu() load.
+ * every KFREE_DRAIN_JIFFIES number of jiffies or can be scheduled right away if
+ * a low memory is detected. All the objects in the batch will be free'd in
+ * workqueue context. This allows us to: batch requests together to reduce the
+ * number of grace periods during heavy kfree_rcu()/kvfree_rcu() load.
  */
 void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 {
 	unsigned long flags;
 	struct kfree_rcu_cpu *krcp;
+	bool expedited_drain = false;
 	void *ptr;
 
 	local_irq_save(flags);	// For safely calling this_cpu_ptr().
@@ -3094,6 +3096,14 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 		head->func = func;
 		head->next = krcp->head;
 		krcp->head = head;
+
+		/*
+		 * There was an issue to place the pointer directly
+		 * into array, due to memory pressure. Initiate an
+		 * expedited drain to accelerate lazy invocation of
+		 * appropriate free calls.
+		 */
+		expedited_drain = true;
 	}
 
 	WRITE_ONCE(krcp->count, krcp->count + 1);
@@ -3102,7 +3112,9 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
 	    !krcp->monitor_todo) {
 		krcp->monitor_todo = true;
-		schedule_delayed_work(&krcp->monitor_work, KFREE_DRAIN_JIFFIES);
+
+		schedule_delayed_work(&krcp->monitor_work,
+			expedited_drain ? 0:KFREE_DRAIN_JIFFIES);
 	}
 
 unlock_return:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/7] rcu/tree: support reclaim for head-less object
  2020-03-23 11:36 [PATCH 0/7] Headless support in the kvfree_rcu() Uladzislau Rezki (Sony)
                   ` (2 preceding siblings ...)
  2020-03-23 11:36 ` [PATCH 3/7] rcu/tree: introduce expedited_drain flag Uladzislau Rezki (Sony)
@ 2020-03-23 11:36 ` Uladzislau Rezki (Sony)
  2020-03-29 22:56   ` Joel Fernandes
  2020-03-23 11:36 ` [PATCH 5/7] rcu/tiny: move kvfree_call_rcu() out of header Uladzislau Rezki (Sony)
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Uladzislau Rezki (Sony) @ 2020-03-23 11:36 UTC (permalink / raw)
  To: LKML, Paul E . McKenney, Joel Fernandes
  Cc: RCU, Andrew Morton, Uladzislau Rezki, Steven Rostedt,
	Oleksiy Avramchenko

Update kvfree_call_rcu() with head-less support, meaning an
object without any rcu_head structure can be reclaimed after
a grace period.

To store the pointers, two chain-arrays are maintained: one
for SLAB and another one for vmalloc. Both types of objects
(the head-less variant and the regular one) are placed there
based on their type.

It can happen that maintaining the arrays becomes impossible
due to high memory pressure; for that reason there is an
emergency path. In that case objects with an rcu_head inside
are simply queued onto a one-way list, and later on that list
is drained.

As for the head-less variant: such objects do not have any
rcu_head helper inside, so one is attached dynamically. As a
result an object consists of a back-pointer and a regular
rcu_head. Such objects are tagged, so the emergency path can
detect their type and free the back-pointer as well as the
dynamically attached wrapper.
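
Roughly, the dynamically attached wrapper looks as follows
(an illustrative sketch only):

    +--------------------------+------------------------------+
    | back-pointer             | struct rcu_head              |
    | (original object's addr) | (queued onto emergency list) |
    +--------------------------+------------------------------+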

Even though this approach requires dynamic memory, it needs
only sizeof(unsigned long *) + sizeof(struct rcu_head) bytes,
so SLAB is used to obtain it. Finally, if attaching the
rcu_head and queuing fail, the current context has to follow
the might_sleep() annotation, so the steps below can be
applied:
   a) wait until a grace period has elapsed;
   b) inline the kvfree() call directly.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c | 94 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 86 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 869a72e25d38..5a64c92feafc 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2892,19 +2892,34 @@ static void kfree_rcu_work(struct work_struct *work)
 	 * when we could not allocate a bulk array.
 	 *
 	 * Under that condition an object is queued to the
-	 * list instead.
+	 * list instead. Please note that head-less objects
+	 * have dynamically attached rcu_head, so they also
+	 * contain a back-pointer that has to be freed.
 	 */
 	for (; head; head = next) {
 		unsigned long offset = (unsigned long)head->func;
-		void *ptr = (void *)head - offset;
+		bool headless;
+		void *ptr;
 
 		next = head->next;
+
+		/* We tag the headless object, if so adjust offset. */
+		headless = (((unsigned long) head - offset) & BIT(0));
+		if (headless)
+			offset -= 1;
+
+		ptr = (void *) head - offset;
 		debug_rcu_head_unqueue((struct rcu_head *)ptr);
+
 		rcu_lock_acquire(&rcu_callback_map);
 		trace_rcu_invoke_kvfree_callback(rcu_state.name, head, offset);
 
-		if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset)))
+		if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset))) {
+			if (headless)
+				kvfree((void *) *((unsigned long *) ptr));
+
 			kvfree(ptr);
+		}
 
 		rcu_lock_release(&rcu_callback_map);
 		cond_resched_tasks_rcu_qs();
@@ -3053,6 +3068,25 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
 	return true;
 }
 
+static inline struct rcu_head *
+attach_rcu_head_to_object(void *obj)
+{
+	unsigned long *ptr;
+
+	ptr = kmalloc(sizeof(unsigned long *) +
+			sizeof(struct rcu_head), GFP_NOWAIT | __GFP_NOWARN);
+
+	if (!ptr)
+		ptr = kmalloc(sizeof(unsigned long *) +
+				sizeof(struct rcu_head), GFP_ATOMIC | __GFP_NOWARN);
+
+	if (!ptr)
+		return NULL;
+
+	ptr[0] = (unsigned long) obj;
+	return ((struct rcu_head *) ++ptr);
+}
+
 /*
  * Queue a request for lazy invocation of appropriate free routine after a
  * grace period. Please note there are three paths are maintained, two are the
@@ -3071,20 +3105,37 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 	unsigned long flags;
 	struct kfree_rcu_cpu *krcp;
 	bool expedited_drain = false;
+	bool success;
 	void *ptr;
 
+	if (head) {
+		ptr = (void *) head - (unsigned long) func;
+	} else {
+		/*
+		 * Please note there is a limitation for the head-less
+		 * variant, that is why there is a clear rule for such
+		 * objects:
+		 *
+		 * use it from might_sleep() context only. For other
+		 * places please embed an rcu_head to your structures.
+		 */
+		might_sleep();
+		ptr = (unsigned long *) func;
+	}
+
 	local_irq_save(flags);	// For safely calling this_cpu_ptr().
 	krcp = this_cpu_ptr(&krc);
 	if (krcp->initialized)
 		spin_lock(&krcp->lock);
 
-	ptr = (void *)head - (unsigned long)func;
-
 	// Queue the object but don't yet schedule the batch.
 	if (debug_rcu_head_queue(ptr)) {
 		// Probable double kfree_rcu(), just leak.
 		WARN_ONCE(1, "%s(): Double-freed call. rcu_head %p\n",
 			  __func__, head);
+
+		/* Mark as success and leave. */
+		success = true;
 		goto unlock_return;
 	}
 
@@ -3092,7 +3143,22 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 	 * Under high memory pressure GFP_NOWAIT can fail,
 	 * in that case the emergency path is maintained.
 	 */
-	if (!kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr)) {
+	success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr);
+	if (!success) {
+		/* Is headless object? */
+		if (head == NULL) {
+			head = attach_rcu_head_to_object(ptr);
+			if (head == NULL)
+				goto unlock_return;
+
+			/*
+			 * Tag the headless object. Such objects have a back-pointer
+			 * to the original allocated memory, that has to be freed as
+			 * well as dynamically attached wrapper/head.
+			 */
+			func = (rcu_callback_t) (sizeof(unsigned long *) + 1);
+		}
+
 		head->func = func;
 		head->next = krcp->head;
 		krcp->head = head;
@@ -3104,15 +3170,15 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 		 * appropriate free calls.
 		 */
 		expedited_drain = true;
+		success = true;
 	}
 
 	WRITE_ONCE(krcp->count, krcp->count + 1);
 
 	// Set timer to drain after KFREE_DRAIN_JIFFIES.
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
-	    !krcp->monitor_todo) {
+			!krcp->monitor_todo) {
 		krcp->monitor_todo = true;
-
 		schedule_delayed_work(&krcp->monitor_work,
 			expedited_drain ? 0:KFREE_DRAIN_JIFFIES);
 	}
@@ -3121,6 +3187,18 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 	if (krcp->initialized)
 		spin_unlock(&krcp->lock);
 	local_irq_restore(flags);
+
+	/*
+	 * High memory pressure, so inline kvfree() after
+	 * synchronize_rcu(). We can do it from might_sleep()
+	 * context only, so the current CPU can pass the QS
+	 * state.
+	 */
+	if (!success) {
+		debug_rcu_head_unqueue(ptr);
+		synchronize_rcu();
+		kvfree(ptr);
+	}
 }
 EXPORT_SYMBOL_GPL(kvfree_call_rcu);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/7] rcu/tiny: move kvfree_call_rcu() out of header
  2020-03-23 11:36 [PATCH 0/7] Headless support in the kvfree_rcu() Uladzislau Rezki (Sony)
                   ` (3 preceding siblings ...)
  2020-03-23 11:36 ` [PATCH 4/7] rcu/tree: support reclaim for head-less object Uladzislau Rezki (Sony)
@ 2020-03-23 11:36 ` Uladzislau Rezki (Sony)
  2020-03-23 11:36 ` [PATCH 6/7] rcu/tiny: support reclaim for head-less object Uladzislau Rezki (Sony)
  2020-03-23 11:36 ` [PATCH 7/7] rcu: support headless variant in the kvfree_rcu() Uladzislau Rezki (Sony)
  6 siblings, 0 replies; 12+ messages in thread
From: Uladzislau Rezki (Sony) @ 2020-03-23 11:36 UTC (permalink / raw)
  To: LKML, Paul E . McKenney, Joel Fernandes
  Cc: RCU, Andrew Morton, Uladzislau Rezki, Steven Rostedt,
	Oleksiy Avramchenko

Move the inlined kvfree_call_rcu() function out of the
header file. This step is a preparation for head-less
support.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/rcutiny.h | 6 +-----
 kernel/rcu/tiny.c       | 6 ++++++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 4cae3dd77173..3a879a2e0268 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -34,11 +34,7 @@ static inline void synchronize_rcu_expedited(void)
 	synchronize_rcu();
 }
 
-static inline void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
-{
-	call_rcu(head, func);
-}
-
+void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
 void rcu_qs(void);
 
 static inline void rcu_softirq_qs(void)
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index aa897c3f2e92..508c82faa45c 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -177,6 +177,12 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 }
 EXPORT_SYMBOL_GPL(call_rcu);
 
+void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
+{
+	call_rcu(head, func);
+}
+EXPORT_SYMBOL_GPL(kvfree_call_rcu);
+
 void __init rcu_init(void)
 {
 	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/7] rcu/tiny: support reclaim for head-less object
  2020-03-23 11:36 [PATCH 0/7] Headless support in the kvfree_rcu() Uladzislau Rezki (Sony)
                   ` (4 preceding siblings ...)
  2020-03-23 11:36 ` [PATCH 5/7] rcu/tiny: move kvfree_call_rcu() out of header Uladzislau Rezki (Sony)
@ 2020-03-23 11:36 ` Uladzislau Rezki (Sony)
  2020-03-30  0:56   ` Joel Fernandes
  2020-03-23 11:36 ` [PATCH 7/7] rcu: support headless variant in the kvfree_rcu() Uladzislau Rezki (Sony)
  6 siblings, 1 reply; 12+ messages in thread
From: Uladzislau Rezki (Sony) @ 2020-03-23 11:36 UTC (permalink / raw)
  To: LKML, Paul E . McKenney, Joel Fernandes
  Cc: RCU, Andrew Morton, Uladzislau Rezki, Steven Rostedt,
	Oleksiy Avramchenko

Make the kvfree_call_rcu() function support head-less
freeing. As with tree-RCU, pointers are stored in an array
for this purpose; SLAB and vmalloc pointers are mixed and
coexist together.

Under high memory pressure maintaining the arrays can become
impossible. Objects with an rcu_head are then released via
call_rcu(). For the head-less variant, the kvfree() call is
inlined directly, i.e. we do the same as for tree-RCU:
    a) wait until a grace period has elapsed;
    b) inline the kvfree() call directly.

Thus the current context has to follow the might_sleep()
annotation. Also please note that for tiny-RCU the very fact
that synchronize_rcu() can be called implies a quiescent
state, therefore (a) does nothing.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tiny.c | 157 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 156 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index 508c82faa45c..b1c31a935db9 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -40,6 +40,29 @@ static struct rcu_ctrlblk rcu_ctrlblk = {
 	.curtail	= &rcu_ctrlblk.rcucblist,
 };
 
+/* Can be common with tree-RCU. */
+#define KVFREE_DRAIN_JIFFIES (HZ / 50)
+
+/* Can be common with tree-RCU. */
+struct kvfree_rcu_bulk_data {
+	unsigned long nr_records;
+	struct kvfree_rcu_bulk_data *next;
+	void *records[];
+};
+
+/* Can be common with tree-RCU. */
+#define KVFREE_BULK_MAX_ENTR \
+	((PAGE_SIZE - sizeof(struct kvfree_rcu_bulk_data)) / sizeof(void *))
+
+static struct kvfree_rcu_bulk_data *kvhead;
+static struct kvfree_rcu_bulk_data *kvhead_free;
+static struct kvfree_rcu_bulk_data *kvcache;
+
+static DEFINE_STATIC_KEY_FALSE(rcu_init_done);
+static struct delayed_work monitor_work;
+static struct rcu_work rcu_work;
+static bool monitor_todo;
+
 void rcu_barrier(void)
 {
 	wait_rcu_gp(call_rcu);
@@ -177,9 +200,137 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 }
 EXPORT_SYMBOL_GPL(call_rcu);
 
+static inline bool
+kvfree_call_rcu_add_ptr_to_bulk(void *ptr)
+{
+	struct kvfree_rcu_bulk_data *bnode;
+
+	if (!kvhead || kvhead->nr_records == KVFREE_BULK_MAX_ENTR) {
+		bnode = xchg(&kvcache, NULL);
+		if (!bnode)
+			bnode = (struct kvfree_rcu_bulk_data *)
+				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
+
+		if (unlikely(!bnode))
+			return false;
+
+		/* Initialize the new block. */
+		bnode->nr_records = 0;
+		bnode->next = kvhead;
+
+		/* Attach it to the bvhead. */
+		kvhead = bnode;
+	}
+
+	/* Done. */
+	kvhead->records[kvhead->nr_records++] = ptr;
+	return true;
+}
+
+static void
+kvfree_rcu_work(struct work_struct *work)
+{
+	struct kvfree_rcu_bulk_data *kvhead_tofree, *next;
+	unsigned long flags;
+	int i;
+
+	local_irq_save(flags);
+	kvhead_tofree = kvhead_free;
+	kvhead_free = NULL;
+	local_irq_restore(flags);
+
+	/* Reclaim process. */
+	for (; kvhead_tofree; kvhead_tofree = next) {
+		next = kvhead_tofree->next;
+
+		for (i = 0; i < kvhead_tofree->nr_records; i++) {
+			debug_rcu_head_unqueue((struct rcu_head *)
+				kvhead_tofree->records[i]);
+			kvfree(kvhead_tofree->records[i]);
+		}
+
+		if (cmpxchg(&kvcache, NULL, kvhead_tofree))
+			free_page((unsigned long) kvhead_tofree);
+	}
+}
+
+static inline bool
+queue_kvfree_rcu_work(void)
+{
+	/* Check if the free channel is available. */
+	if (kvhead_free)
+		return false;
+
+	kvhead_free = kvhead;
+	kvhead = NULL;
+
+	/*
+	 * Queue the job for memory reclaim after GP.
+	 */
+	queue_rcu_work(system_wq, &rcu_work);
+	return true;
+}
+
+static void kvfree_rcu_monitor(struct work_struct *work)
+{
+	unsigned long flags;
+	bool queued;
+
+	local_irq_save(flags);
+	queued = queue_kvfree_rcu_work();
+	if (queued)
+		/* Success. */
+		monitor_todo = false;
+	local_irq_restore(flags);
+
+	/*
+	 * If previous RCU reclaim process is still in progress,
+	 * schedule the work one more time to try again later.
+	 */
+	if (monitor_todo)
+		schedule_delayed_work(&monitor_work,
+			KVFREE_DRAIN_JIFFIES);
+}
+
 void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 {
-	call_rcu(head, func);
+	unsigned long flags;
+	bool success;
+	void *ptr;
+
+	if (head) {
+		ptr = (void *) head - (unsigned long) func;
+	} else {
+		might_sleep();
+		ptr = (void *) func;
+	}
+
+	if (debug_rcu_head_queue(ptr)) {
+		/* Probable double free, just leak. */
+		WARN_ONCE(1, "%s(): Double-freed call. rcu_head %p\n",
+			  __func__, head);
+		return;
+	}
+
+	local_irq_save(flags);
+	success = kvfree_call_rcu_add_ptr_to_bulk(ptr);
+	if (static_branch_likely(&rcu_init_done)) {
+		if (success && !monitor_todo) {
+			monitor_todo = true;
+			schedule_delayed_work(&monitor_work,
+				KVFREE_DRAIN_JIFFIES);
+		}
+	}
+	local_irq_restore(flags);
+
+	if (!success) {
+		if (!head) {
+			synchronize_rcu();
+			kvfree(ptr);
+		} else {
+			call_rcu(head, func);
+		}
+	}
 }
 EXPORT_SYMBOL_GPL(kvfree_call_rcu);
 
@@ -188,4 +339,8 @@ void __init rcu_init(void)
 	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
 	rcu_early_boot_tests();
 	srcu_init();
+
+	INIT_DELAYED_WORK(&monitor_work, kvfree_rcu_monitor);
+	INIT_RCU_WORK(&rcu_work, kvfree_rcu_work);
+	static_branch_enable(&rcu_init_done);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/7] rcu: support headless variant in the kvfree_rcu()
  2020-03-23 11:36 [PATCH 0/7] Headless support in the kvfree_rcu() Uladzislau Rezki (Sony)
                   ` (5 preceding siblings ...)
  2020-03-23 11:36 ` [PATCH 6/7] rcu/tiny: support reclaim for head-less object Uladzislau Rezki (Sony)
@ 2020-03-23 11:36 ` Uladzislau Rezki (Sony)
  6 siblings, 0 replies; 12+ messages in thread
From: Uladzislau Rezki (Sony) @ 2020-03-23 11:36 UTC (permalink / raw)
  To: LKML, Paul E . McKenney, Joel Fernandes
  Cc: RCU, Andrew Morton, Uladzislau Rezki, Steven Rostedt,
	Oleksiy Avramchenko

Make it possible to pass one or two arguments to the
kvfree_rcu() macro, corresponding to the head-less case or
not, so it becomes a bit more versatile.

As a result we obtain two ways of using the macro; below are
two examples:

a) kvfree_rcu(ptr, rhf);
    struct X {
        struct rcu_head rhf;
        unsigned char data[100];
    };

    void *ptr = kvmalloc(sizeof(struct X), GFP_KERNEL);
    if (ptr)
        kvfree_rcu(ptr, rhf);

b) kvfree_rcu(ptr);
    void *ptr = kvmalloc(some_bytes, GFP_KERNEL);
    if (ptr)
        kvfree_rcu(ptr);

The last one, which we name the head-less variant, needs only
one argument, meaning it does not require any rcu_head to be
present within the type of ptr.

There is a restriction: the (b) context has to fall under the
might_sleep() annotation. To check that, please activate the
CONFIG_DEBUG_ATOMIC_SLEEP option in your kernel.
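
For example, a sketch of a misuse that this option would flag
(the lock name is made up):

    spin_lock(&some_lock);
    kvfree_rcu(ptr);        /* single-argument form calls might_sleep() */
    spin_unlock(&some_lock);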

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/rcupdate.h | 38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index efd3dc13c36d..51ebf2461d51 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -844,12 +844,42 @@ do {									\
 
 /**
  * kvfree_rcu() - kvfree an object after a grace period.
- * @ptr:	pointer to kvfree
- * @rhf:	the name of the struct rcu_head within the type of @ptr.
  *
- * Same as kfree_rcu(), just simple alias.
+ * This macro consists of one or two arguments and it is
+ * based on whether an object is head-less or not. If it
+ * has a head then a semantic stays the same as it used
+ * to be before:
+ *
+ *     kvfree_rcu(ptr, rhf);
+ *
+ * where @ptr is a pointer to kvfree(), @rhf is the name
+ * of the rcu_head structure within the type of @ptr.
+ *
+ * When it comes to head-less variant, only one argument
+ * is passed and that is just a pointer which has to be
+ * freed after a grace period. Therefore the semantic is
+ *
+ *     kvfree_rcu(ptr);
+ *
+ * where @ptr is a pointer to kvfree().
+ *
+ * Please note, head-less way of freeing is permitted to
+ * use from a context that has to follow might_sleep()
+ * annotation. Otherwise, please switch and embed the
+ * rcu_head structure within the type of @ptr.
  */
-#define kvfree_rcu(ptr, rhf) kfree_rcu(ptr, rhf)
+#define kvfree_rcu(...) KVFREE_GET_MACRO(__VA_ARGS__,		\
+	kvfree_rcu_arg_2, kvfree_rcu_arg_1)(__VA_ARGS__)
+
+#define KVFREE_GET_MACRO(_1, _2, NAME, ...) NAME
+#define kvfree_rcu_arg_2(ptr, rhf) kfree_rcu(ptr, rhf)
+#define kvfree_rcu_arg_1(ptr)					\
+do {								\
+	typeof(ptr) ___p = (ptr);				\
+								\
+	if (___p)						\
+		kvfree_call_rcu(NULL, (rcu_callback_t) (___p));	\
+} while (0)
 
 /*
  * Place this after a lock-acquisition primitive to guarantee that
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/7] rcu/tree: support reclaim for head-less object
  2020-03-23 11:36 ` [PATCH 4/7] rcu/tree: support reclaim for head-less object Uladzislau Rezki (Sony)
@ 2020-03-29 22:56   ` Joel Fernandes
  2020-03-30 12:48     ` Uladzislau Rezki
  0 siblings, 1 reply; 12+ messages in thread
From: Joel Fernandes @ 2020-03-29 22:56 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony)
  Cc: LKML, Paul E . McKenney, RCU, Andrew Morton, Steven Rostedt,
	Oleksiy Avramchenko

On Mon, Mar 23, 2020 at 12:36:18PM +0100, Uladzislau Rezki (Sony) wrote:
> Update kvfree_call_rcu() with head-less support, meaning an
> object without any rcu_head structure can be reclaimed after
> a grace period.
> 
> To store the pointers, two chain-arrays are maintained: one
> for SLAB and another one for vmalloc. Both types of objects
> (the head-less variant and the regular one) are placed there
> based on their type.
> 
> It can happen that maintaining the arrays becomes impossible
> due to high memory pressure; for that reason there is an
> emergency path. In that case objects with an rcu_head inside
> are simply queued onto a one-way list, and later on that list
> is drained.
> 
> As for the head-less variant: such objects do not have any
> rcu_head helper inside, so one is attached dynamically. As a
> result an object consists of a back-pointer and a regular
> rcu_head. Such objects are tagged, so the emergency path can
> detect their type and free the back-pointer as well as the
> dynamically attached wrapper.
> 
> Even though this approach requires dynamic memory, it needs
> only sizeof(unsigned long *) + sizeof(struct rcu_head) bytes,
> so SLAB is used to obtain it. Finally, if attaching the
> rcu_head and queuing fail, the current context has to follow
> the might_sleep() annotation, so the steps below can be
> applied:
>    a) wait until a grace period has elapsed;
>    b) inline the kvfree() call directly.

Very nice work, Vlad! And beautifully split! Makes it easy to review! One
comment below but otherwise patches 1-4 look good to me, I will look at
others as well now. I have some patches on top of the series, mostly little
clean ups which I will send together with yours.

> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  kernel/rcu/tree.c | 94 +++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 86 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 869a72e25d38..5a64c92feafc 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2892,19 +2892,34 @@ static void kfree_rcu_work(struct work_struct *work)
>  	 * when we could not allocate a bulk array.
>  	 *
>  	 * Under that condition an object is queued to the
> -	 * list instead.
> +	 * list instead. Please note that head-less objects
> +	 * have dynamically attached rcu_head, so they also
> +	 * contain a back-pointer that has to be freed.
>  	 */
>  	for (; head; head = next) {
>  		unsigned long offset = (unsigned long)head->func;
> -		void *ptr = (void *)head - offset;
> +		bool headless;
> +		void *ptr;
>  
>  		next = head->next;
> +
> +		/* We tag the headless object, if so adjust offset. */
> +		headless = (((unsigned long) head - offset) & BIT(0));
> +		if (headless)
> +			offset -= 1;

Just to be sure, can vmalloc() ever allocate an object at an odd valued
memory address? I'm not fully sure looking at vmalloc code whether this is
the case.

As per the tagging, allocated objects have to be aligned to at least a 2-byte
boundary for the pointer's BIT(0) to be available. If that's not the case, we
need to add a warning to the code at a bare minimum.

Another approach, which I think is better, is to add the tag to the offset
itself. So if the offset is > LONG_MAX / 2 or something like that, then
assume it is headless, and override offset to sizeof(unsigned long *) in that
case. Then you would arrive at the correct pointer for the wrapper. That
would take care of the case where, in the future, either SLAB or vmalloc()
ends up returning pointers that are only byte-aligned.
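
Roughly something like this (completely untested, the tag value and names are
made up):

	/* Enqueue side, head-less case: */
	func = (rcu_callback_t) (HEADLESS_OFFSET_TAG + sizeof(unsigned long *));

	/* Reclaim side: */
	offset = (unsigned long) head->func;
	headless = (offset > HEADLESS_OFFSET_TAG);
	if (headless)
		offset = sizeof(unsigned long *);
	ptr = (void *) head - offset;

where HEADLESS_OFFSET_TAG could be something on the order of LONG_MAX / 2.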

Thoughts?

thanks,

 - Joel


> +
> +		ptr = (void *) head - offset;
>  		debug_rcu_head_unqueue((struct rcu_head *)ptr);
> +
>  		rcu_lock_acquire(&rcu_callback_map);
>  		trace_rcu_invoke_kvfree_callback(rcu_state.name, head, offset);
>  
> -		if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset)))
> +		if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset))) {
> +			if (headless)
> +				kvfree((void *) *((unsigned long *) ptr));
> +
>  			kvfree(ptr);
> +		}
>  
>  		rcu_lock_release(&rcu_callback_map);
>  		cond_resched_tasks_rcu_qs();
> @@ -3053,6 +3068,25 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
>  	return true;
>  }
>  
> +static inline struct rcu_head *
> +attach_rcu_head_to_object(void *obj)
> +{
> +	unsigned long *ptr;
> +
> +	ptr = kmalloc(sizeof(unsigned long *) +
> +			sizeof(struct rcu_head), GFP_NOWAIT | __GFP_NOWARN);
> +
> +	if (!ptr)
> +		ptr = kmalloc(sizeof(unsigned long *) +
> +				sizeof(struct rcu_head), GFP_ATOMIC | __GFP_NOWARN);
> +
> +	if (!ptr)
> +		return NULL;
> +
> +	ptr[0] = (unsigned long) obj;
> +	return ((struct rcu_head *) ++ptr);
> +}
> +
>  /*
>   * Queue a request for lazy invocation of appropriate free routine after a
>   * grace period. Please note there are three paths are maintained, two are the
> @@ -3071,20 +3105,37 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	unsigned long flags;
>  	struct kfree_rcu_cpu *krcp;
>  	bool expedited_drain = false;
> +	bool success;
>  	void *ptr;
>  
> +	if (head) {
> +		ptr = (void *) head - (unsigned long) func;
> +	} else {
> +		/*
> +		 * Please note there is a limitation for the head-less
> +		 * variant, that is why there is a clear rule for such
> +		 * objects:
> +		 *
> +		 * use it from might_sleep() context only. For other
> +		 * places please embed an rcu_head to your structures.
> +		 */
> +		might_sleep();
> +		ptr = (unsigned long *) func;
> +	}
> +
>  	local_irq_save(flags);	// For safely calling this_cpu_ptr().
>  	krcp = this_cpu_ptr(&krc);
>  	if (krcp->initialized)
>  		spin_lock(&krcp->lock);
>  
> -	ptr = (void *)head - (unsigned long)func;
> -
>  	// Queue the object but don't yet schedule the batch.
>  	if (debug_rcu_head_queue(ptr)) {
>  		// Probable double kfree_rcu(), just leak.
>  		WARN_ONCE(1, "%s(): Double-freed call. rcu_head %p\n",
>  			  __func__, head);
> +
> +		/* Mark as success and leave. */
> +		success = true;
>  		goto unlock_return;
>  	}
>  
> @@ -3092,7 +3143,22 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	 * Under high memory pressure GFP_NOWAIT can fail,
>  	 * in that case the emergency path is maintained.
>  	 */
> -	if (!kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr)) {
> +	success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr);
> +	if (!success) {
> +		/* Is headless object? */
> +		if (head == NULL) {
> +			head = attach_rcu_head_to_object(ptr);
> +			if (head == NULL)
> +				goto unlock_return;
> +
> +			/*
> +			 * Tag the headless object. Such objects have a back-pointer
> +			 * to the original allocated memory, that has to be freed as
> +			 * well as dynamically attached wrapper/head.
> +			 */
> +			func = (rcu_callback_t) (sizeof(unsigned long *) + 1);
> +		}
> +
>  		head->func = func;
>  		head->next = krcp->head;
>  		krcp->head = head;
> @@ -3104,15 +3170,15 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  		 * appropriate free calls.
>  		 */
>  		expedited_drain = true;
> +		success = true;
>  	}
>  
>  	WRITE_ONCE(krcp->count, krcp->count + 1);
>  
>  	// Set timer to drain after KFREE_DRAIN_JIFFIES.
>  	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
> -	    !krcp->monitor_todo) {
> +			!krcp->monitor_todo) {
>  		krcp->monitor_todo = true;
> -
>  		schedule_delayed_work(&krcp->monitor_work,
>  			expedited_drain ? 0:KFREE_DRAIN_JIFFIES);
>  	}
> @@ -3121,6 +3187,18 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	if (krcp->initialized)
>  		spin_unlock(&krcp->lock);
>  	local_irq_restore(flags);
> +
> +	/*
> +	 * High memory pressure, so inline kvfree() after
> +	 * synchronize_rcu(). We can do it from might_sleep()
> +	 * context only, so the current CPU can pass the QS
> +	 * state.
> +	 */
> +	if (!success) {
> +		debug_rcu_head_unqueue(ptr);
> +		synchronize_rcu();
> +		kvfree(ptr);
> +	}
>  }
>  EXPORT_SYMBOL_GPL(kvfree_call_rcu);
>  
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 6/7] rcu/tiny: support reclaim for head-less object
  2020-03-23 11:36 ` [PATCH 6/7] rcu/tiny: support reclaim for head-less object Uladzislau Rezki (Sony)
@ 2020-03-30  0:56   ` Joel Fernandes
  2020-03-30 14:42     ` Uladzislau Rezki
  0 siblings, 1 reply; 12+ messages in thread
From: Joel Fernandes @ 2020-03-30  0:56 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony)
  Cc: LKML, Paul E . McKenney, RCU, Andrew Morton, Steven Rostedt,
	Oleksiy Avramchenko

On Mon, Mar 23, 2020 at 12:36:20PM +0100, Uladzislau Rezki (Sony) wrote:
> Make the kvfree_call_rcu() function support head-less
> freeing. As with tree-RCU, pointers are stored in an array
> for this purpose; SLAB and vmalloc pointers are mixed and
> coexist together.
> 
> Under high memory pressure maintaining the arrays can become
> impossible. Objects with an rcu_head are then released via
> call_rcu(). For the head-less variant, the kvfree() call is
> inlined directly, i.e. we do the same as for tree-RCU:
>     a) wait until a grace period has elapsed;
>     b) inline the kvfree() call directly.
> 
> Thus the current context has to follow the might_sleep()
> annotation. Also please note that for tiny-RCU the very fact
> that synchronize_rcu() can be called implies a quiescent
> state, therefore (a) does nothing.

Hmm, Ok. So on -tiny, if there's any allocation failure ever, we immediately
revert to call_rcu(). I guess we could also create a regular (non-array)
queue for objects with an rcu_head and queue it on that (since it does not
need allocation) in case of array allocation failure, however that may not be
worth it. So this LGTM. Thanks!

For entire series:
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
(I will submit a follow-up to fix the tagging, please let me submit Vlad's
entire series with some patches on top -- I also did a bit of wordsmithing in
the commit messages of this series).

Loved the might_sleep() idea btw. I suppose if an atomic context wants to do
kvfree_rcu(), then we could also have kfree_rcu() defer the kvfree_rcu() to
execute from a workqueue. Thoughts? We could then allow poor insomniacs to
call this API :)

thanks,

 - Joel


> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  kernel/rcu/tiny.c | 157 +++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 156 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
> index 508c82faa45c..b1c31a935db9 100644
> --- a/kernel/rcu/tiny.c
> +++ b/kernel/rcu/tiny.c
> @@ -40,6 +40,29 @@ static struct rcu_ctrlblk rcu_ctrlblk = {
>  	.curtail	= &rcu_ctrlblk.rcucblist,
>  };
>  
> +/* Can be common with tree-RCU. */
> +#define KVFREE_DRAIN_JIFFIES (HZ / 50)
> +
> +/* Can be common with tree-RCU. */
> +struct kvfree_rcu_bulk_data {
> +	unsigned long nr_records;
> +	struct kvfree_rcu_bulk_data *next;
> +	void *records[];
> +};
> +
> +/* Can be common with tree-RCU. */
> +#define KVFREE_BULK_MAX_ENTR \
> +	((PAGE_SIZE - sizeof(struct kvfree_rcu_bulk_data)) / sizeof(void *))
> +
> +static struct kvfree_rcu_bulk_data *kvhead;
> +static struct kvfree_rcu_bulk_data *kvhead_free;
> +static struct kvfree_rcu_bulk_data *kvcache;
> +
> +static DEFINE_STATIC_KEY_FALSE(rcu_init_done);
> +static struct delayed_work monitor_work;
> +static struct rcu_work rcu_work;
> +static bool monitor_todo;
> +
>  void rcu_barrier(void)
>  {
>  	wait_rcu_gp(call_rcu);
> @@ -177,9 +200,137 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
>  }
>  EXPORT_SYMBOL_GPL(call_rcu);
>  
> +static inline bool
> +kvfree_call_rcu_add_ptr_to_bulk(void *ptr)
> +{
> +	struct kvfree_rcu_bulk_data *bnode;
> +
> +	if (!kvhead || kvhead->nr_records == KVFREE_BULK_MAX_ENTR) {
> +		bnode = xchg(&kvcache, NULL);
> +		if (!bnode)
> +			bnode = (struct kvfree_rcu_bulk_data *)
> +				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
> +
> +		if (unlikely(!bnode))
> +			return false;
> +
> +		/* Initialize the new block. */
> +		bnode->nr_records = 0;
> +		bnode->next = kvhead;
> +
> +		/* Attach it to the bvhead. */
> +		kvhead = bnode;
> +	}
> +
> +	/* Done. */
> +	kvhead->records[kvhead->nr_records++] = ptr;
> +	return true;
> +}
> +
> +static void
> +kvfree_rcu_work(struct work_struct *work)
> +{
> +	struct kvfree_rcu_bulk_data *kvhead_tofree, *next;
> +	unsigned long flags;
> +	int i;
> +
> +	local_irq_save(flags);
> +	kvhead_tofree = kvhead_free;
> +	kvhead_free = NULL;
> +	local_irq_restore(flags);
> +
> +	/* Reclaim process. */
> +	for (; kvhead_tofree; kvhead_tofree = next) {
> +		next = kvhead_tofree->next;
> +
> +		for (i = 0; i < kvhead_tofree->nr_records; i++) {
> +			debug_rcu_head_unqueue((struct rcu_head *)
> +				kvhead_tofree->records[i]);
> +			kvfree(kvhead_tofree->records[i]);
> +		}
> +
> +		if (cmpxchg(&kvcache, NULL, kvhead_tofree))
> +			free_page((unsigned long) kvhead_tofree);
> +	}
> +}
> +
> +static inline bool
> +queue_kvfree_rcu_work(void)
> +{
> +	/* Check if the free channel is available. */
> +	if (kvhead_free)
> +		return false;
> +
> +	kvhead_free = kvhead;
> +	kvhead = NULL;
> +
> +	/*
> +	 * Queue the job for memory reclaim after GP.
> +	 */
> +	queue_rcu_work(system_wq, &rcu_work);
> +	return true;
> +}
> +
> +static void kvfree_rcu_monitor(struct work_struct *work)
> +{
> +	unsigned long flags;
> +	bool queued;
> +
> +	local_irq_save(flags);
> +	queued = queue_kvfree_rcu_work();
> +	if (queued)
> +		/* Success. */
> +		monitor_todo = false;
> +	local_irq_restore(flags);
> +
> +	/*
> +	 * If previous RCU reclaim process is still in progress,
> +	 * schedule the work one more time to try again later.
> +	 */
> +	if (monitor_todo)
> +		schedule_delayed_work(&monitor_work,
> +			KVFREE_DRAIN_JIFFIES);
> +}
> +
>  void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  {
> -	call_rcu(head, func);
> +	unsigned long flags;
> +	bool success;
> +	void *ptr;
> +
> +	if (head) {
> +		ptr = (void *) head - (unsigned long) func;
> +	} else {
> +		might_sleep();
> +		ptr = (void *) func;
> +	}
> +
> +	if (debug_rcu_head_queue(ptr)) {
> +		/* Probable double free, just leak. */
> +		WARN_ONCE(1, "%s(): Double-freed call. rcu_head %p\n",
> +			  __func__, head);
> +		return;
> +	}
> +
> +	local_irq_save(flags);
> +	success = kvfree_call_rcu_add_ptr_to_bulk(ptr);
> +	if (static_branch_likely(&rcu_init_done)) {
> +		if (success && !monitor_todo) {
> +			monitor_todo = true;
> +			schedule_delayed_work(&monitor_work,
> +				KVFREE_DRAIN_JIFFIES);
> +		}
> +	}
> +	local_irq_restore(flags);
> +
> +	if (!success) {
> +		if (!head) {
> +			synchronize_rcu();
> +			kvfree(ptr);
> +		} else {
> +			call_rcu(head, func);
> +		}
> +	}
>  }
>  EXPORT_SYMBOL_GPL(kvfree_call_rcu);
>  
> @@ -188,4 +339,8 @@ void __init rcu_init(void)
>  	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
>  	rcu_early_boot_tests();
>  	srcu_init();
> +
> +	INIT_DELAYED_WORK(&monitor_work, kvfree_rcu_monitor);
> +	INIT_RCU_WORK(&rcu_work, kvfree_rcu_work);
> +	static_branch_enable(&rcu_init_done);
>  }
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/7] rcu/tree: support reclaim for head-less object
  2020-03-29 22:56   ` Joel Fernandes
@ 2020-03-30 12:48     ` Uladzislau Rezki
  0 siblings, 0 replies; 12+ messages in thread
From: Uladzislau Rezki @ 2020-03-30 12:48 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Uladzislau Rezki (Sony),
	LKML, Paul E . McKenney, RCU, Andrew Morton, Steven Rostedt,
	Oleksiy Avramchenko

> 
> Very nice work, Vlad! And beautifully split! Makes it easy to review! One
> comment below but otherwise patches 1-4 look good to me, I will look at
> others as well now. I have some patches on top of the series, mostly little
> clean ups which I will send together with yours.
> 
Thanks, Joel!


> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > ---
> >  kernel/rcu/tree.c | 94 +++++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 86 insertions(+), 8 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 869a72e25d38..5a64c92feafc 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2892,19 +2892,34 @@ static void kfree_rcu_work(struct work_struct *work)
> >  	 * when we could not allocate a bulk array.
> >  	 *
> >  	 * Under that condition an object is queued to the
> > -	 * list instead.
> > +	 * list instead. Please note that head-less objects
> > +	 * have dynamically attached rcu_head, so they also
> > +	 * contain a back-pointer that has to be freed.
> >  	 */
> >  	for (; head; head = next) {
> >  		unsigned long offset = (unsigned long)head->func;
> > -		void *ptr = (void *)head - offset;
> > +		bool headless;
> > +		void *ptr;
> >  
> >  		next = head->next;
> > +
> > +		/* We tag the headless object, if so adjust offset. */
> > +		headless = (((unsigned long) head - offset) & BIT(0));
> > +		if (headless)
> > +			offset -= 1;
> 
> Just to be sure, can vmalloc() ever allocate an object at an odd valued
> memory address? I'm not fully sure looking at vmalloc code whether this is
> the case.
>
No. It must be PAGE_ALIGNED(addr). 

> 
> As per the tagging, allocated objects have to be aligned to at least a 2-byte
> boundary for the pointer's BIT(0) to be available. If that's not the case, we
> need to add a warning to the code at a bare minimum.
> 
> Another approach, which I think is better, is to add the tag to the offset
> itself. So if the offset is > LONG_MAX / 2 or something like that, then
> assume it is headless, and override offset to sizeof(unsigned long *) in that
> case. Then you would arrive at the correct pointer for the wrapper. That
> would take care of the case where, in the future, either SLAB or vmalloc()
> ends up returning pointers that are only byte-aligned.
> 
> Thoughts?
> 
I saw your https://lkml.org/lkml/2020/3/29/480. I think it is OK,
and we can go your way. There is an advantage in that it is
tolerant to any alignment :)

Thanks!

--
Vlad Rezki

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 6/7] rcu/tiny: support reclaim for head-less object
  2020-03-30  0:56   ` Joel Fernandes
@ 2020-03-30 14:42     ` Uladzislau Rezki
  0 siblings, 0 replies; 12+ messages in thread
From: Uladzislau Rezki @ 2020-03-30 14:42 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Uladzislau Rezki (Sony),
	LKML, Paul E . McKenney, RCU, Andrew Morton, Steven Rostedt,
	Oleksiy Avramchenko

> 
> Hmm, Ok. So on -tiny, if there's any allocation failure ever, we immediately
> revert to call_rcu(). I guess we could also create a regular (non-array)
> queue for objects with an rcu_head and queue it on that (since it does not
> need allocation) in case of array allocation failure, however that may not be
> worth it. So this LGTM. Thanks!
> 
> For entire series:
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> (I will submit a follow-up to fix the tagging, please let me submit Vlad's
> entire series with some patches on top -- I also did a bit of wordsmithing in
> the commit messages of this series).
>
Thank you, Joel, for your review and help!

> 
> Loved the might_sleep() idea btw. I suppose if an atomic context wants to do
> kvfree_rcu(), then we could also have kfree_rcu() defer the kvfree_rcu() to
> execute from a workqueue. Thoughts? We could then allow poor insomniacs to
> call this API :)
> 
Not sure if I understand you correctly. Could you please <snip> some code
for illustration?

As far as I understand, it would then have to be done synchronously. We can
defer and queue some work to do it in workqueue context, but I am not sure
how to process the next incoming request, i.e. busy-wait until we manage to
push a new ptr to free? In that case it would not work if there is only
one CPU available.

Thanks!

--
Vlad Rezki

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-03-30 14:42 UTC | newest]

Thread overview: 12+ messages
-- links below jump to the message on this page --
2020-03-23 11:36 [PATCH 0/7] Headless support in the kvfree_rcu() Uladzislau Rezki (Sony)
2020-03-23 11:36 ` [PATCH 1/7] rcu/tree: simplify KFREE_BULK_MAX_ENTR macro Uladzislau Rezki (Sony)
2020-03-23 11:36 ` [PATCH 2/7] rcu/tree: maintain separate array for vmalloc ptrs Uladzislau Rezki (Sony)
2020-03-23 11:36 ` [PATCH 3/7] rcu/tree: introduce expedited_drain flag Uladzislau Rezki (Sony)
2020-03-23 11:36 ` [PATCH 4/7] rcu/tree: support reclaim for head-less object Uladzislau Rezki (Sony)
2020-03-29 22:56   ` Joel Fernandes
2020-03-30 12:48     ` Uladzislau Rezki
2020-03-23 11:36 ` [PATCH 5/7] rcu/tiny: move kvfree_call_rcu() out of header Uladzislau Rezki (Sony)
2020-03-23 11:36 ` [PATCH 6/7] rcu/tiny: support reclaim for head-less object Uladzislau Rezki (Sony)
2020-03-30  0:56   ` Joel Fernandes
2020-03-30 14:42     ` Uladzislau Rezki
2020-03-23 11:36 ` [PATCH 7/7] rcu: support headless variant in the kvfree_rcu() Uladzislau Rezki (Sony)
