* [PATCH RFC] eventpoll: optimize epoll_ctl by reusing eppoll_entry allocations
@ 2024-03-17 22:20 Ivan Trofimov
  2024-03-18  4:59 ` Eric Biggers
  0 siblings, 1 reply; 3+ messages in thread
From: Ivan Trofimov @ 2024-03-17 22:20 UTC
  To: brauner; +Cc: linux-fsdevel

Instead of unconditionally allocating and deallocating pwq objects,
try to reuse them by storing the entry in the eventpoll struct
at deallocation request, and consuming that entry at allocation request.
This way every EPOLL_CTL_ADD operation immediately following an
EPOLL_CTL_DEL operation effectively cancels out its pwq allocation
with the preceding deallocation.

With this patch applied, I'm observing a ~13% overall speedup when
benchmarking the following scenario:
1. epoll_ctl(..., EPOLL_CTL_ADD, ...)
2. epoll_ctl(..., EPOLL_CTL_DEL, ...)
which should be a pretty common one for either applications dealing with
a lot of short-lived connections or applications doing a DEL + ADD dance
per level-triggered FD readiness.
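
For reference, the benchmark boils down to a loop like this (userspace
sketch: fd setup and error handling omitted, names illustrative); it is
exactly the ADD + DEL pattern that hits the one-entry slot on every
iteration:

#include <sys/epoll.h>

/* Hammer the ADD + DEL pair on a single descriptor. */
static void bench_add_del(int epfd, int fd, long iters)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };

	for (long i = 0; i < iters; i++) {
		epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
		epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
	}
}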

This optimization comes with a sizeof(void *) + sizeof(struct eppoll_entry)
per-epoll-instance memory cost, which amounts to 72 bytes on 64-bit.
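
For the curious, the 72 bytes break down as follows (field sizes
assuming 64-bit, annotated against the existing struct eppoll_entry
layout in fs/eventpoll.c):

struct eppoll_entry {			/* 64 bytes total */
	struct eppoll_entry *next;	/*  8 */
	struct epitem *base;		/*  8 */
	wait_queue_entry_t wait;	/* 40 */
	wait_queue_head_t *whead;	/*  8 */
};

plus the new 8-byte pwq_slot pointer added to struct eventpoll.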

Signed-off-by: Ivan Trofimov <i.trofimow@yandex.ru>
---
This is my first ever attempt at submitting a patch for the kernel,
so please don't hesitate to point out any mistakes I'm making in the process.

 fs/eventpoll.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 882b89edc..c4094124c 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -219,6 +219,9 @@ struct eventpoll {
 	u64 gen;
 	struct hlist_head refs;
 
+	/* a single-item cache used to reuse eppoll_entry allocations */
+	struct eppoll_entry *pwq_slot;
+
 	/*
 	 * usage count, used together with epitem->dying to
 	 * orchestrate the disposal of this struct
@@ -648,6 +651,36 @@ static void ep_remove_wait_queue(struct eppoll_entry *pwq)
 	rcu_read_unlock();
 }
 
+/*
+ * This functions either consumes the pwq_slot, or allocates a new
+ * eppoll_entry if the slot is empty.
+ * Must be called with "mtx" held.
+ */
+static struct eppoll_entry *ep_alloc_pwq(struct eventpoll *ep)
+{
+	struct eppoll_entry *pwq = ep->pwq_slot;
+
+	if (pwq) {
+		ep->pwq_slot = NULL;
+		return pwq;
+	}
+	return kmem_cache_alloc(pwq_cache, GFP_KERNEL);
+}
+
+/*
+ * This function either fills the pwq_slot with the eppoll_entry,
+ * or deallocates the entry if the slot is already filled.
+ * Must be called with "mtx" held.
+ */
+static void ep_free_pwq(struct eventpoll *ep, struct eppoll_entry *pwq)
+{
+	if (!ep->pwq_slot) {
+		ep->pwq_slot = pwq;
+		return;
+	}
+	kmem_cache_free(pwq_cache, pwq);
+}
+
 /*
  * This function unregisters poll callbacks from the associated file
  * descriptor.  Must be called with "mtx" held.
@@ -660,7 +693,7 @@ static void ep_unregister_pollwait(struct eventpoll *ep, struct epitem *epi)
 	while ((pwq = *p) != NULL) {
 		*p = pwq->next;
 		ep_remove_wait_queue(pwq);
-		kmem_cache_free(pwq_cache, pwq);
+		ep_free_pwq(ep, pwq);
 	}
 }
 
@@ -789,6 +822,8 @@ static void ep_free(struct eventpoll *ep)
 	mutex_destroy(&ep->mtx);
 	free_uid(ep->user);
 	wakeup_source_unregister(ep->ws);
+	if (ep->pwq_slot)
+		kmem_cache_free(pwq_cache, ep->pwq_slot);
 	kfree(ep);
 }
 
@@ -1384,7 +1419,7 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
 	if (unlikely(!epi))	// an earlier allocation has failed
 		return;
 
-	pwq = kmem_cache_alloc(pwq_cache, GFP_KERNEL);
+	pwq = ep_alloc_pwq(epi->ep);
 	if (unlikely(!pwq)) {
 		epq->epi = NULL;
 		return;
-- 
2.34.1



* Re: [PATCH RFC] eventpoll: optimize epoll_ctl by reusing eppoll_entry allocations
  2024-03-17 22:20 [PATCH RFC] eventpoll: optimize epoll_ctl by reusing eppoll_entry allocations Ivan Trofimov
@ 2024-03-18  4:59 ` Eric Biggers
  2024-03-18  9:57   ` Ivan Trofimov
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Biggers @ 2024-03-18  4:59 UTC
  To: Ivan Trofimov; +Cc: brauner, linux-fsdevel

On Mon, Mar 18, 2024 at 01:20:04AM +0300, Ivan Trofimov wrote:
> Instead of unconditionally allocating and deallocating pwq objects,
> try to reuse them by storing the entry in the eventpoll struct
> at deallocation request, and consuming that entry at allocation request.
> This way every EPOLL_CTL_ADD operation immediately following an
> EPOLL_CTL_DEL operation effectively cancels out its pwq allocation
> with the preceding deallocation.
> 
> With this patch applied, I'm observing a ~13% overall speedup when
> benchmarking the following scenario:
> 1. epoll_ctl(..., EPOLL_CTL_ADD, ...)
> 2. epoll_ctl(..., EPOLL_CTL_DEL, ...)
> which should be a pretty common one for either applications dealing with
> a lot of short-lived connections or applications doing a DEL + ADD dance
> per level-triggered FD readiness.
> 
> This optimization comes with a sizeof(void *) + sizeof(struct eppoll_entry)
> per-epoll-instance memory cost, which amounts to 72 bytes on 64-bit.
> 
> Signed-off-by: Ivan Trofimov <i.trofimow@yandex.ru>
> ---
> This is my first ever attempt at submitting a patch for the kernel,
> so please don't hesitate to point out any mistakes I'm making in the process.
> 
>  fs/eventpoll.c | 39 +++++++++++++++++++++++++++++++++++++--
>  1 file changed, 37 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 882b89edc..c4094124c 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -219,6 +219,9 @@ struct eventpoll {
>  	u64 gen;
>  	struct hlist_head refs;
>  
> +	/* a single-item cache used to reuse eppoll_entry allocations */
> +	struct eppoll_entry *pwq_slot;
> +
>  	/*
>  	 * usage count, used together with epitem->dying to
>  	 * orchestrate the disposal of this struct
> @@ -648,6 +651,36 @@ static void ep_remove_wait_queue(struct eppoll_entry *pwq)
>  	rcu_read_unlock();
>  }
>  
> +/*
> + * This functions either consumes the pwq_slot, or allocates a new

"This function"

> @@ -789,6 +822,8 @@ static void ep_free(struct eventpoll *ep)
>  	mutex_destroy(&ep->mtx);
>  	free_uid(ep->user);
>  	wakeup_source_unregister(ep->ws);
> +	if (ep->pwq_slot)
> +		kmem_cache_free(pwq_cache, ep->pwq_slot);

A NULL check isn't necessary before kmem_cache_free().
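
I.e. the hunk can unconditionally do:

	kmem_cache_free(pwq_cache, ep->pwq_slot);

since freeing a NULL pointer there is a no-op.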

> @@ -1384,7 +1419,7 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
>  	if (unlikely(!epi))	// an earlier allocation has failed
>  		return;
>  
> -	pwq = kmem_cache_alloc(pwq_cache, GFP_KERNEL);
> +	pwq = ep_alloc_pwq(epi->ep);

What guarantees that ep->mtx is held here?

- Eric


* Re: [PATCH RFC] eventpoll: optimize epoll_ctl by reusing eppoll_entry allocations
  2024-03-18  4:59 ` Eric Biggers
@ 2024-03-18  9:57   ` Ivan Trofimov
  0 siblings, 0 replies; 3+ messages in thread
From: Ivan Trofimov @ 2024-03-18  9:57 UTC
  To: Eric Biggers; +Cc: brauner, linux-fsdevel

>> +/*
>> + * This functions either consumes the pwq_slot, or allocates a new
>
> "This function"

Thanks, will fix.

>> +	if (ep->pwq_slot)
>> +		kmem_cache_free(pwq_cache, ep->pwq_slot);
>
> A NULL check isn't necessary before kmem_cache_free().

Thanks, noted, will fix.

>> @@ -1384,7 +1419,7 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
>>  	if (unlikely(!epi))	// an earlier allocation has failed
>>  		return;
>>
>> -	pwq = kmem_cache_alloc(pwq_cache, GFP_KERNEL);
>> +	pwq = ep_alloc_pwq(epi->ep);
>
> What guarantees that ep->mtx is held here?
The "ep_ptable_queue_proc" is invoked as a poll callback when
"ep_insert" attaches the item added to the poll hooks,
"ep_insert" is called with the "ep->mtx" held.

