* [PATCH] epoll: trim epitem by one cache line on x86_64
From: Eric Wong @ 2013-03-04 11:29 UTC
  To: Davide Libenzi; +Cc: Al Viro, Andrew Morton, linux-fsdevel, linux-kernel

It is common for epoll users to have thousands of epitems, so saving a
cache line on every allocation leads to large memory savings.

Since epitem allocations are cache-aligned, reducing sizeof(struct
epitem) from 136 bytes to 128 bytes will allow it to squeeze under a
cache line boundary on x86_64.
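
To make the saving concrete, here is a standalone userspace sketch
(illustration only, assuming an LP64 ABI such as x86_64; not part of
the patch) showing how packing removes the tail padding of the
embedded epoll_filefd and drops its alignment requirement:

	#include <stdio.h>

	struct filefd_plain {		/* mirrors struct epoll_filefd */
		void *file;		/* stand-in for struct file * */
		int fd;
	};				/* sizeof == 16: 4 bytes of tail padding */

	struct filefd_packed {
		void *file;
		int fd;
	} __attribute__((packed));	/* sizeof == 12, alignment == 1 */

	int main(void)
	{
		printf("plain:  %zu\n", sizeof(struct filefd_plain));	/* 16 */
		printf("packed: %zu\n", sizeof(struct filefd_packed));	/* 12 */
		return 0;
	}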

From /sys/kernel/slab/eventpoll_epi, I see the following changes on my
x86_64 Core2 Duo (which has 64-byte cache alignment):

	object_size  :  192 => 128
	objs_per_slab:   21 =>  32
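
(As an aside, holes and padding of this sort can also be inspected
with pahole from the dwarves package on a kernel built with
CONFIG_DEBUG_INFO, e.g.:

	pahole -C epitem fs/eventpoll.o

...just a cross-check, not something the patch relies on.)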

I have no access to other 64-bit machines, so I am limiting this to
x86_64 by using the existing EPOLL_PACKED macro (which expands to
__attribute__((packed)) only on x86_64) rather than a bare
__attribute__((packed)).

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 fs/eventpoll.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index cfc4b16..06f3d0e 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -107,7 +107,7 @@
 struct epoll_filefd {
 	struct file *file;
 	int fd;
-};
+} EPOLL_PACKED;
 
 /*
  * Structure used to track possible nested calls, for too deep recursions
-- 
Eric Wong



* Re: [PATCH] epoll: trim epitem by one cache line on x86_64
From: Andrew Morton @ 2013-03-06 22:37 UTC
  To: Eric Wong; +Cc: Davide Libenzi, Al Viro, linux-fsdevel, linux-kernel

On Mon, 4 Mar 2013 11:29:41 +0000 Eric Wong <normalperson@yhbt.net> wrote:

> It is common for epoll users to have thousands of epitems, so saving a
> cache line on every allocation leads to large memory savings.
> 
> Since epitem allocations are cache-aligned, reducing sizeof(struct
> epitem) from 136 bytes to 128 bytes will allow it to squeeze under a
> cache line boundary on x86_64.
> 
> From /sys/kernel/slab/eventpoll_epi, I see the following changes on my
> x86_64 Core2 Duo (which has 64-byte cache alignment):
> 
> 	object_size  :  192 => 128
> 	objs_per_slab:   21 =>  32
> 
> I have no access to other 64-bit machines, so I am limiting this to
> x86_64 by using the existing EPOLL_PACKED macro (which expands to
> __attribute__((packed)) only on x86_64) rather than a bare
> __attribute__((packed)).
> 
> ...
>
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -107,7 +107,7 @@
>  struct epoll_filefd {
>  	struct file *file;
>  	int fd;
> -};
> +} EPOLL_PACKED;
>  
>  /*
>   * Structure used to track possible nested calls, for too deep recursions

Yes, I see the same numbers on my gcc, x86_64 allmodconfig.

It's going to be hard to maintain this - someone will change something
sometime and break it.  I suppose we could add a runtime check if we
cared enough.  Adding a big fat comment to struct epitem might help.

I don't see much additional room to be saved.  We could probably remove
epitem.nwait, but that wouldn't actually save anything because nwait
nestles with ffd.fd.
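
As a concrete illustration of that nestling, here is a userspace
sketch (hypothetical member names, not the real epitem layout):

	/*
	 * nwait occupies bytes 12..15, sharing an 8-byte word with the
	 * tail of the packed 12-byte ffd; the next pointer-aligned
	 * member starts at byte 16 with or without it, so the struct
	 * is 24 bytes either way.
	 */
	struct demo {
		struct { void *file; int fd; }
			__attribute__((packed)) ffd;	/* bytes  0..11 */
		int nwait;				/* bytes 12..15 */
		void *next;				/* bytes 16..23 */
	};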

I tested your patch on powerpc and it reduced sizeof(epitem) from 136
to 128 for that arch as well, so I suggest we run with

--- a/fs/eventpoll.c~epoll-trim-epitem-by-one-cache-line-on-x86_64-fix
+++ a/fs/eventpoll.c
@@ -105,7 +105,7 @@
 struct epoll_filefd {
 	struct file *file;
 	int fd;
-} EPOLL_PACKED;
+} __packed;
 
 /*
  * Structure used to track possible nested calls, for too deep recursions
_



* Re: [PATCH] epoll: trim epitem by one cache line on x86_64
From: Eric Wong @ 2013-03-07 10:32 UTC
  To: Andrew Morton; +Cc: Davide Libenzi, Al Viro, linux-fsdevel, linux-kernel

Andrew Morton <akpm@linux-foundation.org> wrote:
> It's going to be hard to maintain this - someone will change something
> sometime and break it.  I suppose we could add a runtime check if we
> cared enough.  Adding a big fat comment to struct epitem might help.

Thanks for looking at this patch.  I'll send a patch with a comment
about keeping epitem size in check.  Also, would adding (with comments):

	BUILD_BUG_ON(sizeof(struct epitem) > 128);

...be too heavy-handed?  I used that in my testing.  I'll also check
for sizeof(void *) <= 8, in case 128-bit machines appear...
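
For reference, one classic implementation of such a compile-time
assertion (a simplified stand-in; the kernel's real BUILD_BUG_ON in
<linux/bug.h> has grown fancier diagnostics over time):

	/*
	 * A true compile-time-constant condition yields a negative
	 * array size, which fails the build; a false one compiles
	 * away to nothing.
	 */
	#define MY_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))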

> I don't see much additional room to be saved.  We could probably remove
> epitem.nwait, but that wouldn't actually save anything because nwait
> nestles with ffd.fd.

If we remove nwait, we can move the epoll_event member up and have
event.events tucked into the slot nwait used to occupy; a sketch of
that layout follows below.  I have more and more code depending on
epoll, so I'll be around to comment on future epoll changes as they
come up.
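
A rough userspace sketch of that reordering (hypothetical, simplified
members with x86_64 layout assumed; not the real struct epitem):

	/*
	 * With nwait gone, a packed 12-byte epoll_event placed right
	 * after the packed 12-byte ffd ends on an 8-byte boundary, so
	 * its 'events' word fills the bytes nwait used to occupy.
	 */
	struct demo {
		struct { void *file; int fd; }
			__attribute__((packed)) ffd;	/* bytes  0..11 */
		struct { unsigned events; unsigned long long data; }
			__attribute__((packed)) event;	/* bytes 12..23 */
		void *next;				/* bytes 24..31 */
	};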

> I tested your patch on powerpc and it reduced sizeof(epitem) from 136
> to 128 for that arch as well, so I suggest we run with
> 
> --- a/fs/eventpoll.c~epoll-trim-epitem-by-one-cache-line-on-x86_64-fix
> +++ a/fs/eventpoll.c
> @@ -105,7 +105,7 @@
>  struct epoll_filefd {
>  	struct file *file;
>  	int fd;
> -} EPOLL_PACKED;
> +} __packed;

Thanks for testing on ppc.  Looks good to me.  For what it's worth:
Acked-by: Eric Wong <normalperson@yhbt.net>


* Re: [PATCH] epoll: trim epitem by one cache line on x86_64
From: Andrew Morton @ 2013-03-08 23:54 UTC
  To: Eric Wong; +Cc: Davide Libenzi, Al Viro, linux-fsdevel, linux-kernel

On Thu, 7 Mar 2013 10:32:40 +0000 Eric Wong <normalperson@yhbt.net> wrote:

> Andrew Morton <akpm@linux-foundation.org> wrote:
> > It's going to be hard to maintain this - someone will change something
> > sometime and break it.  I suppose we could add a runtime check if we
> > cared enough.  Adding a big fat comment to struct epitem might help.
> 
> Thanks for looking at this patch.  I'll send a patch with a comment
> about keeping epitem size in check.  Also, would adding (with comments):
> 
> 	BUILD_BUG_ON(sizeof(struct epitem) > 128);
> 
> ...be too heavy-handed?  I used that in my testing.  I'll also check
> for sizeof(void *) <= 8, in case 128-bit machines appear...

I guess such a check might avoid accidents in the future.  If it
becomes a problem, we can always delete it.



* [PATCH] epoll: comment + BUILD_BUG_ON to prevent epitem bloat
From: Eric Wong @ 2013-03-09  0:41 UTC
  To: Andrew Morton; +Cc: Davide Libenzi, Al Viro, linux-fsdevel, linux-kernel

This will prevent us from accidentally introducing a memory bloat
regression here in the future.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
---
  Andrew Morton <akpm@linux-foundation.org> wrote:
  > On Thu, 7 Mar 2013 10:32:40 +0000 Eric Wong <normalperson@yhbt.net> wrote:
  > 
  > > Andrew Morton <akpm@linux-foundation.org> wrote:
  > > > It's going to be hard to maintain this - someone will change something
  > > > sometime and break it.  I suppose we could add a runtime check if we
  > > > cared enough.  Adding a big fat comment to struct epitem might help.
  > > 
  > > Thanks for looking at this patch.  I'll send a patch with a comment
  > > about keeping epitem size in check.  Also, would adding (with comments):
  > > 
  > > 	BUILD_BUG_ON(sizeof(struct epitem) > 128);
  > > 
  > > ...be too heavy-handed?  I used that in my testing.  I'll also check
  > > for sizeof(void *) <= 8, in case 128-bit machines appear...
  > 
  > I guess such a check might avoid accidents in the future.  If it
  > becomes a problem, we can always delete it.

 fs/eventpoll.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 9fec183..55028da 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -128,6 +128,8 @@ struct nested_calls {
 /*
  * Each file descriptor added to the eventpoll interface will
  * have an entry of this type linked to the "rbr" RB tree.
+ * Avoid increasing the size of this struct, there can be many thousands
+ * of these on a server and we do not want this to take another cache line.
  */
 struct epitem {
 	/* RB tree node used to link this structure to the eventpoll RB tree */
@@ -1964,6 +1966,12 @@ static int __init eventpoll_init(void)
 	/* Initialize the structure used to perform file's f_op->poll() calls */
 	ep_nested_calls_init(&poll_readywalk_ncalls);
 
+	/*
+	 * We can have many thousands of epitems, so prevent this from
+	 * using an extra cache line on 64-bit (and smaller) CPUs
+	 */
+	BUILD_BUG_ON(sizeof(void *) <= 8 && sizeof(struct epitem) > 128);
+
 	/* Allocates slab cache used to allocate "struct epitem" items */
 	epi_cache = kmem_cache_create("eventpoll_epi", sizeof(struct epitem),
 			0, SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);
-- 
Eric Wong

