All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] VFS name lookup permission checking cleanup
@ 2009-09-07 21:01 Linus Torvalds
  2009-09-07 21:02 ` [PATCH 1/8] Do not call 'ima_path_check()' for each path component Linus Torvalds
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Linus Torvalds @ 2009-09-07 21:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List, Al Viro, Linux Filesystem Mailing List
  Cc: Eric Paris, Mimi Zohar, James Morris


This is a series of eight trivial patches that I'd like people to take a 
look at, because I am hoping to eventually do multiple path component 
lookups in one go without taking the per-dentry lock or incrementing (and 
then decrementing) the per-dentry atomic count for each component.

The aim would be to try to avoid getting that annoying cacheline ping-pong 
on the common top-level dentries that everybody looks up (ie root and home 
directories, /usr, /usr/bin etc).

Right now I have some simple (but real) loads that show the contention on 
dentry->d_lock to be a roughly 3% performance hit on a single-socket 
nehalem, and I assume it can be much worse on multi-socket machines.

And the thing is, it should be entirely possible to do everything but the 
last component lookup with just a single read_seqbegin()/read_seqretry() 
around the whole lookup. Yes, the last component is special and absolutely 
needs locking and counting - but the last component also doesn't tend to 
be shared, so locking it is fine.

Now, I may never actually get there, but when looking at it, the biggest 
problem is actually not so much the path lookup itself, as the security 
tests that are done for each path component. And it should be noted that 
in order for a lockless seq-lock only lookup make sense, any such 
operations would have to be totally lock-free too. They certainly can't 
take mutexes etc, but right now they do.

Those security tests fall into two categories:

 - actual security layer callouts: ima_path_check().

   This one looks totally pointless. Path component lookup is a horribly 
   timing-critical path, and we will only do a successful lookup on a 
   directory (inode needs to have a ->lookup operation), yet in the middle 
   of that is a call to "ima_path_check()".

   Now, it looks like ima_path_check() is very much designed to only check 
   the _final_ path anyway, and was never meant to be used to check the 
   directories we hit on the way. In fact, the whole function starts with

	if (!ima_initialized || !S_ISREG(inode->i_mode))
		return 0;

   so it's totally pointless to do that thing on a directory where 
   that !S_ISREG() test will trigger.

   So just remove it. IMA should never have put that check in there to 
   begin with, it's just way too performance-sensitive.

 - the real filesystem permission checks. 

   We used to do the common case entirely in the VFS layer, but these days 
   the common case includes POSIX ACL checking, and as a result, the 
   trivial short-circuit code in the VFS layer almost never triggers in
   practice, and we call down to the low-level filesystem for each 
   component. 

   We can't fix that by just removing the call, but what we _can_ do is to 
   at least avoid the silly calling back-and-forth: most filesystems will 
   just call back to the VFS layer to do the "generic_permission()" with 
   their own ACL-checking routines.

   That way we can flatten the call-chain out a bit, and avoid one 
   unnecessary indirect call in that timing-critical region. And 
   eventually, if we make the whole ACL caching thing be something that we 
   do at a VFS layer (Al Viro already worked on _some_ of that), we'll be 
   able to avoid the calls entirely when we can see the cached ACL 
   pointers directly.

So this series of 8 patches do all these preliminary things. As shown by 
the diffstat below, it actually reduces the lines of code (mainly by just 
removing the silly per-filesystem wrappers around "generic_permission()") 
and it also makes it a _lot_ clearer what actually gets called in that 
whole 'exec_permission_lite()' function that we use to check the 
permission of a pathname lookup.

Comments?  Especially from the IMA people (first patch) and from generic 
VFS, security and low-level FS people (the 'Simplify exec_permission_lite' 
series, and then the check_acl + per-filesystem changes).

Al?

I'm looking to merge these shortly after 2.6.31 is released, but comments 
welcome.

			Linus

----
Linus Torvalds (8):
  Do not call 'ima_path_check()' for each path component
  Simplify exec_permission_lite() logic
  Simplify exec_permission_lite() further
  Simplify exec_permission_lite(), part 3
  Make 'check_acl()' a first-class filesystem op
  shmfs: use 'check_acl' instead of 'permission'
  ext[234]: move over to 'check_acl' permission model
  jffs2/jfs/xfs: switch over to 'check_acl' rather than 'permission()'

 fs/ext2/acl.c               |    8 +----
 fs/ext2/acl.h               |    4 +-
 fs/ext2/file.c              |    2 +-
 fs/ext2/namei.c             |    4 +-
 fs/ext3/acl.c               |    8 +----
 fs/ext3/acl.h               |    4 +-
 fs/ext3/file.c              |    2 +-
 fs/ext3/namei.c             |    4 +-
 fs/ext4/acl.c               |    8 +----
 fs/ext4/acl.h               |    4 +-
 fs/ext4/file.c              |    2 +-
 fs/ext4/namei.c             |    4 +-
 fs/jffs2/acl.c              |    7 +---
 fs/jffs2/acl.h              |    4 +-
 fs/jffs2/dir.c              |    2 +-
 fs/jffs2/file.c             |    2 +-
 fs/jffs2/symlink.c          |    2 +-
 fs/jfs/acl.c                |    7 +---
 fs/jfs/file.c               |    2 +-
 fs/jfs/jfs_acl.h            |    2 +-
 fs/jfs/namei.c              |    2 +-
 fs/namei.c                  |   82 +++++++++++++++++++++---------------------
 fs/xfs/linux-2.6/xfs_iops.c |   16 ++------
 include/linux/fs.h          |    1 +
 include/linux/shmem_fs.h    |    2 +-
 mm/shmem.c                  |    6 ++--
 mm/shmem_acl.c              |   11 +-----
 27 files changed, 79 insertions(+), 123 deletions(-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2009-09-09 17:11 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-07 21:01 [PATCH 0/8] VFS name lookup permission checking cleanup Linus Torvalds
2009-09-07 21:02 ` [PATCH 1/8] Do not call 'ima_path_check()' for each path component Linus Torvalds
2009-09-07 21:02   ` [PATCH 2/8] Simplify exec_permission_lite() logic Linus Torvalds
2009-09-07 21:03     ` [PATCH 3/8] Simplify exec_permission_lite() further Linus Torvalds
2009-09-07 21:03       ` [PATCH 4/8] Simplify exec_permission_lite(), part 3 Linus Torvalds
2009-09-07 21:04         ` [PATCH 5/8] Make 'check_acl()' a first-class filesystem op Linus Torvalds
2009-09-07 21:04           ` [PATCH 6/8] shmfs: use 'check_acl' instead of 'permission' Linus Torvalds
2009-09-07 21:05             ` [PATCH 7/8] ext[234]: move over to 'check_acl' permission model Linus Torvalds
2009-09-07 21:05               ` [PATCH 8/8] jffs2/jfs/xfs: switch over to 'check_acl' rather than 'permission()' Linus Torvalds
2009-09-08 18:34                 ` Christoph Hellwig
2009-09-09 17:09                   ` Linus Torvalds
2009-09-07 22:23             ` [PATCH 6/8] shmfs: use 'check_acl' instead of 'permission' James Morris
2009-09-08  0:03               ` James Morris
2009-09-08 18:05             ` Hugh Dickins
2009-09-07 22:20           ` [PATCH 5/8] Make 'check_acl()' a first-class filesystem op James Morris
2009-09-07 22:18         ` [PATCH 4/8] Simplify exec_permission_lite(), part 3 James Morris
2009-09-07 22:15       ` [PATCH 3/8] Simplify exec_permission_lite() further James Morris
2009-09-08 14:40       ` Jamie Lokier
2009-09-08 14:58         ` Matthew Wilcox
2009-09-08 15:02         ` Linus Torvalds
2009-09-07 22:12     ` [PATCH 2/8] Simplify exec_permission_lite() logic James Morris
2009-09-08  1:50   ` [PATCH 1/8] Do not call 'ima_path_check()' for each path component Christoph Hellwig
2009-09-08 14:01 ` [PATCH 0/8] VFS name lookup permission checking cleanup Serge E. Hallyn
2009-09-08 17:52 ` Mimi Zohar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.