From: Mateusz Guzik <mjguzik@gmail.com>
To: viro@zeniv.linux.org.uk
Cc: serge@hallyn.com, torvalds@linux-foundation.org,
    paul@paul-moore.com, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org,
    linux-security-module@vger.kernel.org,
    Mateusz Guzik <mjguzik@gmail.com>
Subject: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible
Date: Wed, 25 Jan 2023 16:55:57 +0100
Message-ID: <20230125155557.37816-2-mjguzik@gmail.com>
In-Reply-To: <20230125155557.37816-1-mjguzik@gmail.com>
access(2) remains commonly used, for example on exec:
access("/etc/ld.so.preload", R_OK)
or when running gcc: strace -c gcc empty.c
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0        42        26 access
It ends up in do_faccessat without the AT_EACCESS flag, which in turn
results in allocation of new creds in order to modify fsuid/fsgid and
caps. This is very expensive both single-threaded and, most notably,
multi-threaded, with numerous structures getting refed and then unrefed
again shortly after, when the temporary creds are destroyed.
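To illustrate the two paths from userspace, here is a minimal sketch.
The helper name access_real_vs_effective is made up for this example
and is not part of the patch. It relies only on the documented behavior
that access(path, mode) is equivalent to faccessat(AT_FDCWD, path,
mode, 0), which checks against the real uid/gid, while AT_EACCESS asks
for the effective uid/gid.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: probe `path` twice, once against the real IDs and
 * once against the effective IDs. Returns 1 if both calls agree.
 */
int access_real_vs_effective(const char *path, int mode,
			     int *real_rc, int *eff_rc)
{
	assert(path != NULL && real_rc != NULL && eff_rc != NULL);

	/* Same as access(path, mode): the check uses the real uid/gid. */
	*real_rc = faccessat(AT_FDCWD, path, mode, 0);

	/* AT_EACCESS: the check uses the effective uid/gid instead. */
	*eff_rc = faccessat(AT_FDCWD, path, mode, AT_EACCESS);

	return *real_rc == *eff_rc;
}
```

For a process that is neither setuid nor setgid and holds no extra
capabilities, both calls are expected to give the same answer, which is
the common case this patch targets.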
Turns out for typical consumers the resulting creds would be identical
to the current ones, and this can be checked upfront, avoiding the hard
work.
An access benchmark plugged into will-it-scale running on Cascade Lake
shows:
test     proc   before     after
access1     1  1310582   2908735 (+121%)   # distinct files
access1    24  4716491  63822173 (+1253%)  # distinct files
access2    24  2378041   5370335 (+125%)   # same file
The above benchmarks are not integrated into will-it-scale, but can be
found in a pull request:
https://github.com/antonblanchard/will-it-scale/pull/36/files
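For a quick single-process sanity check without will-it-scale, a loop
along these lines can be timed. This is only a rough sketch and not the
testcase from the pull request above: time_access_calls is a made-up
helper, and it times a fixed number of calls rather than counting
operations per second across processes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue `iters` access(2) calls on `path` and return
 * the elapsed time in nanoseconds, or -1 if the clock cannot be read.
 */
long long time_access_calls(const char *path, long iters)
{
	struct timespec start, end;

	assert(path != NULL && iters > 0);

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1;

	for (long i = 0; i < iters; i++)
		(void)access(path, R_OK);	/* result ignored, only the cost matters */

	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
		return -1;

	return (end.tv_sec - start.tv_sec) * 1000000000LL +
	       (end.tv_nsec - start.tv_nsec);
}
```

Dividing the iteration count by the elapsed time gives a calls-per-second
figure that can be compared before and after the patch on the same
machine.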
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
v3:
- add a comment warning about changing access_override_creds
v2:
- fix current->cred usage warning reported by the kernel test robot
Link: https://lore.kernel.org/all/202301150709.9EC6UKBT-lkp@intel.com/
---
 fs/open.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/fs/open.c b/fs/open.c
index 82c1a28b3308..2afed058250c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -367,7 +367,37 @@ COMPAT_SYSCALL_DEFINE6(fallocate, int, fd, int, mode, compat_arg_u64_dual(offset
  * access() needs to use the real uid/gid, not the effective uid/gid.
  * We do this by temporarily clearing all FS-related capabilities and
  * switching the fsuid/fsgid around to the real ones.
+ *
+ * Creating new credentials is expensive, so we try to skip doing it,
+ * which we can if the result would match what we already got.
  */
+static bool access_need_override_creds(int flags)
+{
+	const struct cred *cred;
+
+	if (flags & AT_EACCESS)
+		return false;
+
+	cred = current_cred();
+	if (!uid_eq(cred->fsuid, cred->uid) ||
+	    !gid_eq(cred->fsgid, cred->gid))
+		return true;
+
+	if (!issecure(SECURE_NO_SETUID_FIXUP)) {
+		kuid_t root_uid = make_kuid(cred->user_ns, 0);
+		if (!uid_eq(cred->uid, root_uid)) {
+			if (!cap_isclear(cred->cap_effective))
+				return true;
+		} else {
+			if (!cap_isidentical(cred->cap_effective,
+			    cred->cap_permitted))
+				return true;
+		}
+	}
+
+	return false;
+}
+
 static const struct cred *access_override_creds(void)
 {
 	const struct cred *old_cred;
@@ -377,6 +407,12 @@ static const struct cred *access_override_creds(void)
 	if (!override_cred)
 		return NULL;

+	/*
+	 * XXX access_need_override_creds performs checks in hopes of skipping
+	 * this work. Make sure it stays in sync if making any changes in this
+	 * routine.
+	 */
+
 	override_cred->fsuid = override_cred->uid;
 	override_cred->fsgid = override_cred->gid;

@@ -436,7 +472,7 @@ static long do_faccessat(int dfd, const char __user *filename, int mode, int fla
 	if (flags & AT_EMPTY_PATH)
 		lookup_flags |= LOOKUP_EMPTY;

-	if (!(flags & AT_EACCESS)) {
+	if (access_need_override_creds(flags)) {
 		old_cred = access_override_creds();
 		if (!old_cred)
 			return -ENOMEM;
-- 
2.39.0