linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mateusz Guzik <mjguzik@gmail.com>
To: Paul Moore <paul@paul-moore.com>
Cc: viro@zeniv.linux.org.uk, serge@hallyn.com,
	torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org
Subject: Re: [PATCH v2 2/2] vfs: avoid duplicating creds in faccessat if possible
Date: Tue, 24 Jan 2023 11:16:29 +0100	[thread overview]
Message-ID: <CAGudoHEWQJKMS=pL9Ate4COshgQaC-fjQ2RN3LiYmdS=0MVruA@mail.gmail.com> (raw)
In-Reply-To: <CAHC9VhTnpWKnKRu3wFTNfub_qdcDePdEXYZWOpvpqL0fcfS_Uw@mail.gmail.com>

On 1/23/23, Paul Moore <paul@paul-moore.com> wrote:
> On Fri, Jan 20, 2023 at 7:50 PM Mateusz Guzik <mjguzik@gmail.com> wrote:
>> On 1/20/23, Paul Moore <paul@paul-moore.com> wrote:
>> > On Mon, Jan 16, 2023 at 4:21 PM Mateusz Guzik <mjguzik@gmail.com>
>> > wrote:
>> >>
>> >> access(2) remains commonly used, for example on exec:
>> >> access("/etc/ld.so.preload", R_OK)
>> >>
>> >> or when running gcc: strace -c gcc empty.c
>> >> % time     seconds  usecs/call     calls    errors syscall
>> >> ------ ----------- ----------- --------- --------- ----------------
>> >>   0.00    0.000000           0        42        26 access
>> >>
>> >> It falls down to do_faccessat without the AT_EACCESS flag, which in
>> >> turn
>> >> results in allocation of new creds in order to modify fsuid/fsgid and
>> >> caps. This is a very expensive process single-threaded and most
>> >> notably
>> >> multi-threaded, with numerous structures getting refed and unrefed on
>> >> imminent new cred destruction.
>> >>
>> >> Turns out for typical consumers the resulting creds would be identical
>> >> and this can be checked upfront, avoiding the hard work.
>> >>
>> >> An access benchmark plugged into will-it-scale running on Cascade Lake
>> >> shows:
>> >> test    proc    before  after
>> >> access1 1       1310582 2908735  (+121%)  # distinct files
>> >> access1 24      4716491 63822173 (+1353%) # distinct files
>> >> access2 24      2378041 5370335  (+125%)  # same file
>> >
>> > Out of curiosity, do you have any measurements of the impact this
>> > patch has on the AT_EACCESS case when the creds do need to be
>> > modified?
>>
>> I could not be arsed to bench that. I'm not saying there is literally 0
>> impact, but it should not be high and the massive win in the case I
>> patched imho justifies it.
>
> That's one way to respond to an honest question asking if you've done
> any tests on the other side of the change.  I agree the impact should
> be less than the advantage you've shown, but sometimes it's nice to
> see these things.
>

So reading this now I do think it was worded in quite a poor manner, so
apologies for that.

Wording aside, I don't know whether this is just a passing remark or
are you genuinely concerned about the other case.

If you are, I noted there is an immediately achievable speed up by
eliminating the get/put ref cycle on creds coming from override_creds +
put_cred to backpedal from it. This should be enough to cover it, but
there are cosmetic problems around it I don't want to flame over.

Say override_creds_noref gets added doing the usual work, except for
get_new_cred.

Then override_creds would be:
        validate_creds(new);
        get_new_cred((struct cred *)new);
        override_creds_noref(new);

But override_creds_noref would retain validate_creds new/old and the
above would repeat it which would preferably be avoided. Not a problem
if it is deemed ok to get_new_cred without validate_creds.

>> These funcs are literally next to each other, I don't think that is easy
>> to miss. I concede a comment in access_override_creds to take a look at
>> access_need_override_creds would not hurt, but I don't know if a resend
>> to add it is justified.
>
> Perhaps it's because I have to deal with a fair amount of code getting
> changed in one place and not another, but I would think that a comment
> would be the least one could do here and would justify a respin.
>

I'm not going to *insist* on not adding that comment.

Would this work for you?

diff --git a/fs/open.c b/fs/open.c
index 3c068a38044c..756177b94b04 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -407,6 +407,11 @@ static const struct cred *access_override_creds(void)
        if (!override_cred)
                return NULL;

+       /*
+        * XXX access_need_override_creds performs checks in hopes of
+        * skipping this work. Make sure it stays in sync if making any
+        * changes here.
+        */
        override_cred->fsuid = override_cred->uid;
        override_cred->fsgid = override_cred->gid;

if not, can you phrase it however you see fit for me to copy-paste?

> In my opinion a generalized shallow copy approach has more value than
> a one-off solution that has the potential to fall out of sync and
> cause a problem in the future (I recognize that you disagree on the
> likelihood of this happening).
>

To reiterate my stance, the posted patch is trivial to reason about
and it provides a marked improvement for the most commonly seen case.
It comes with some extra branches for the less common case, which I
don't consider to be a big deal.

From the quick toor I took around kernel/cred.c I think the cred code
is messy and it would be best to sort it out before doing anything
fancy. I have no interest in doing the clean up.

The shallow copy idea I outlined above looks very simple, but I can't
help the feeling there are surprises there, so I'm reluctant to roll
with it as is.

More importantly I can't show any workload which runs into the other
case, thus if someone asks me to justify the complexity I will have
nothing, which is mostly why I did not go for it.

That said, if powers to be declare this is the way forward, I can
spend some time getting it done.

> Ultimately it's a call for the VFS folks as they are responsible for
> the access() code.
>

Well let's wait and see. :)

-- 
Mateusz Guzik <mjguzik gmail.com>

  reply	other threads:[~2023-01-24 10:16 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-16 21:21 [PATCH v2 1/2] capability: add cap_isidentical Mateusz Guzik
2023-01-16 21:21 ` [PATCH v2 2/2] vfs: avoid duplicating creds in faccessat if possible Mateusz Guzik
2023-01-20 20:45   ` Paul Moore
2023-01-21  0:50     ` Mateusz Guzik
2023-01-21  1:18       ` Mateusz Guzik
2023-01-23 21:29       ` Paul Moore
2023-01-24 10:16         ` Mateusz Guzik [this message]
2023-01-24 17:00           ` Paul Moore
2023-01-24 17:14             ` Linus Torvalds
2023-01-24 18:42               ` Mateusz Guzik
2023-01-24 20:37                 ` Linus Torvalds
2023-01-24 21:33               ` Paul Moore
2023-01-25 15:00                 ` Mateusz Guzik
2023-01-25 16:17                   ` Mateusz Guzik
2023-01-25 17:11                     ` Paul Moore
2023-01-25 17:07                   ` Paul Moore

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGudoHEWQJKMS=pL9Ate4COshgQaC-fjQ2RN3LiYmdS=0MVruA@mail.gmail.com' \
    --to=mjguzik@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=paul@paul-moore.com \
    --cc=serge@hallyn.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).