On 29.01.20 at 11:17, Pavel Begunkov wrote:
> On 29/01/2020 03:54, Jens Axboe wrote:
>> On 1/28/20 5:24 PM, Jens Axboe wrote:
>>> On 1/28/20 5:21 PM, Pavel Begunkov wrote:
>>>> On 29/01/2020 03:20, Jens Axboe wrote:
>>>>> On 1/28/20 5:10 PM, Pavel Begunkov wrote:
>>>>>>>>> Checked out ("don't use static creds/mm assignments")
>>>>>>>>>
>>>>>>>>> 1. do we miscount cred refs? We grab one in get_current_cred() for each async
>>>>>>>>> request, but if (worker->creds != work->creds) it will never be put.
>>>>>>>>
>>>>>>>> Yeah I think you're right, that needs a bit of fixing up.
>>>>>>>
>>>>>>
>>>>>> Hmm, it seems it leaks it unconditionally, as it grabs in a ref in
>>>>>> override_creds().
>>>>>>
>>>>>
>>>>> We grab one there, and an extra one. Then we drop one of them inline,
>>>>> and the other in __io_req_aux_free().
>>>>>
>>>> Yeah, with the last patch it should make it even
>>>
>>> OK good we agree on that. I should probably pull back that bit to the
>>> original patch to avoid having a hole in there...
>>
>> Done
>>
>
> ("io_uring/io-wq: don't use static creds/mm assignments") and ("io_uring:
> support using a registered personality for commands") looks good now.
>
> Reviewed-by: Pavel Begunkov

I'm very happy with the design, thanks!
That's exactly what I had in mind :-)

It would also work with IORING_SETUP_SQPOLL, correct?

However, I think there are a few things to improve/simplify.

> https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.6/io_uring-vfs&id=a26d26412e1e1783473f9dc8f030c3af3d54b1a6

In fs/io_uring.c, mmgrab() and get_current_cred() are used together in two
places, so why is put_cred() called in __io_req_aux_free() while mmdrop() is
called from io_put_work()? I think both should be called in io_put_work();
that makes the code much easier to understand.

My guess is that you chose __io_req_aux_free() for put_cred() because of
the following patches, but I'll explain on the other commit why it's not needed.
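
To illustrate the symmetry argued for above, here is a minimal user-space
sketch, not kernel code. The toy_* types and functions are hypothetical
stand-ins that model only the reference counts: both references are taken in
one place (as mmgrab() and get_current_cred() are) and both are dropped in one
place (as io_put_work() would do under this suggestion).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and cred; only the refcount is modelled. */
struct toy_mm   { int refs; };
struct toy_cred { int refs; };

struct toy_work {
	struct toy_mm   *mm;
	struct toy_cred *creds;
};

/* Acquire side: both references are taken together. */
static void toy_grab_work(struct toy_work *work, struct toy_mm *mm,
			  struct toy_cred *cred)
{
	mm->refs++;		/* models mmgrab() */
	work->mm = mm;
	cred->refs++;		/* models get_current_cred() */
	work->creds = cred;
}

/* Release side: both references are dropped in the same function. */
static void toy_put_work(struct toy_work *work)
{
	if (work->mm) {
		work->mm->refs--;	/* models mmdrop() */
		work->mm = NULL;
	}
	if (work->creds) {
		work->creds->refs--;	/* models put_cred() */
		work->creds = NULL;
	}
}

/* Returns 1 if every reference taken by grab is returned by put. */
static int toy_refs_balanced(void)
{
	struct toy_mm mm = { .refs = 1 };
	struct toy_cred cred = { .refs = 1 };
	struct toy_work work = { NULL, NULL };

	toy_grab_work(&work, &mm, &cred);
	if (mm.refs != 2 || cred.refs != 2)
		return 0;
	toy_put_work(&work);
	return mm.refs == 1 && cred.refs == 1;
}
```

With acquire and release paired like this, a reader can check the balance by
looking at two functions instead of tracing the request teardown path.
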

> https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.6/io_uring-vfs&id=d9db233adf034bd7855ba06190525e10a05868be

A minor one would be starting with 1 instead of 0 and using
idr_alloc_cyclic() in order to avoid immediate reuse of ids.
That way we could include the id in the tracing message, and
0 would mean the current creds were used.

> +static int io_remove_personalities(int id, void *p, void *data)
> +{
> +	struct io_ring_ctx *ctx = data;
> +
> +	idr_remove(&ctx->personality_idr, id);

Here we need something like:
put_cred((const struct cred *)p);

> +	return 0;
> +}

The io_uring_register() calls would look like this, correct?

 id = io_uring_register(ring_fd, IORING_REGISTER_PERSONALITY, NULL, 0);
 io_uring_register(ring_fd, IORING_UNREGISTER_PERSONALITY, NULL, id);

> https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.6/io_uring-vfs&id=eec9e69e0ad9ad364e1b6a5dfc52ad576afee235

> +
> +	if (sqe_flags & IOSQE_PERSONALITY) {
> +		int id = READ_ONCE(sqe->personality);
> +
> +		req->work.creds = idr_find(&ctx->personality_idr, id);
> +		if (unlikely(!req->work.creds)) {
> +			ret = -EINVAL;
> +			goto err_req;
> +		}
> +		get_cred(req->work.creds);
> +		old_creds = override_creds(req->work.creds);
> +	}
> +

Here we could use a helper variable
const struct cred *personality_creds;
and leave req->work.creds as NULL.
That means we can avoid the explicit get_cred() call
and can skip the following hunk too:

> @@ -3977,7 +3977,8 @@ static int io_req_defer_prep(struct io_kiocb *req,
>  			mmgrab(current->mm);
>  			req->work.mm = current->mm;
>  		}
> -	req->work.creds = get_current_cred();
> +	if (!req->work.creds)
> +		req->work.creds = get_current_cred();
>
>  	switch (req->opcode) {
>  	case IORING_OP_NOP:

The override_creds(personality_creds) call has already changed current->cred,
so get_current_cred() will just pick it up, as in the default case.

This would make the patch much simpler and allow put_cred() to be
in io_put_work() instead of __io_req_aux_free(), as explained above.

metze
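
The reference flow proposed above can be checked with a small user-space
model. This is only a sketch under stated assumptions: the toy_* names are
hypothetical, override_creds() is modelled as taking a reference on the new
creds (as described earlier in this thread), and revert_creds() as dropping
it. It is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cred; only the refcount is modelled. */
struct toy_cred { int refs; };

/* Models current->cred. */
static struct toy_cred *toy_current;

static struct toy_cred *toy_get_cred(struct toy_cred *c) { c->refs++; return c; }
static void toy_put_cred(struct toy_cred *c) { c->refs--; }

/* Assumption: override takes a ref on the new creds and returns the old ones. */
static struct toy_cred *toy_override_creds(struct toy_cred *new_creds)
{
	struct toy_cred *old = toy_current;

	toy_current = toy_get_cred(new_creds);
	return old;
}

/* Assumption: revert restores the old creds and drops the override's ref. */
static void toy_revert_creds(struct toy_cred *old)
{
	struct toy_cred *overridden = toy_current;

	toy_current = old;
	toy_put_cred(overridden);
}

/* Models get_current_cred(): a new reference on whatever is current. */
static struct toy_cred *toy_get_current_cred(void)
{
	return toy_get_cred(toy_current);
}

/* Returns 1 if only the registration reference remains after one request. */
static int toy_submit_with_personality(void)
{
	struct toy_cred task_cred = { .refs = 1 };
	struct toy_cred personality = { .refs = 1 };	/* ref held by the idr */
	struct toy_cred *work_creds = NULL;		/* stays NULL until defer prep */
	struct toy_cred *old;
	int ok;

	toy_current = &task_cred;
	old = toy_override_creds(&personality);	/* no explicit get_cred() needed */
	work_creds = toy_get_current_cred();	/* unconditional, as in the default case */
	toy_revert_creds(old);
	ok = (work_creds == &personality);
	toy_put_cred(work_creds);		/* what io_put_work() would do */
	ok = ok && personality.refs == 1 && task_cred.refs == 1 &&
	     toy_current == &task_cred;
	toy_current = NULL;			/* do not keep a pointer to a local */
	return ok;
}
```

In this model the request ends up holding exactly one reference on the
personality creds, taken by the unmodified get_current_cred() path, and a
single put on the work item releases it.
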