From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pavel Begunkov
Date: Tue, 22 Sep 2020 06:30:09 +0000
Subject: Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
Message-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
List-Id:
References: <563138b5-7073-74bc-f0c5-b2bad6277e87@gmail.com> <486c92d0-0f2e-bd61-1ab8-302524af5e08@gmail.com>
In-Reply-To:
To: Andy Lutomirski
Cc: Arnd Bergmann, Christoph Hellwig, Al Viro, Andrew Morton, Jens Axboe,
 David Howells, linux-arm-kernel, X86 ML, LKML, "open list:MIPS",
 Parisc List, linuxppc-dev, linux-s390, sparclinux, linux-block,
 Linux SCSI List, Linux FS Devel, linux-aio, io-uring@vger.kernel.org,
 linux-arch, Linux-MM, Network Development, keyrings@vger.kernel.org,
 LSM List

On 22/09/2020 03:58, Andy Lutomirski wrote:
> On Mon, Sep 21, 2020 at 5:24 PM Pavel Begunkov wrote:
>>>>>>> Ah, so reading /dev/input/event* would suffer from the same issue,
>>>>>>> and that one would in fact be broken by your patch in the hypothetical
>>>>>>> case that someone tried to use io_uring to read /dev/input/event on x32...
>>>>>>>
>>>>>>> For reference, I checked the socket timestamp handling that has a
>>>>>>> number of corner cases with time32/time64 formats in compat mode,
>>>>>>> but none of those appear to be affected by the problem.
>>>>>>>
>>>>>>>> Aside from the potentially nasty use of per-task variables, one thing
>>>>>>>> I don't like about PF_FORCE_COMPAT is that it's one-way. If we're
>>>>>>>> going to have a generic mechanism for this, shouldn't we allow a full
>>>>>>>> override of the syscall arch instead of just allowing forcing compat
>>>>>>>> so that a compat syscall can do a non-compat operation?
>>>>>>>
>>>>>>> The only reason it's needed here is that the caller is in a kernel
>>>>>>> thread rather than a system call. Are there any possible scenarios
>>>>>>> where one would actually need the opposite?
>>>>>>>
>>>>>>
>>>>>> I can certainly imagine needing to force x32 mode from a kernel thread.
>>>>>>
>>>>>> As for the other direction: what exactly are the desired bitness/arch semantics of io_uring? Is the operation bitness chosen by the io_uring creation or by the io_uring_enter() bitness?
>>>>>
>>>>> It's rather the second one. Even though AFAIR it wasn't discussed
>>>>> specifically, that's how it works now (_partially_).
>>>>
>>>> Double checked -- I'm wrong, that's the former one. Most of it is based
>>>> on a flag that was set at creation.
>>>>
>>>
>>> Could we get away with making io_uring_enter() return -EINVAL (or
>>> maybe -ENOTTY?) if you try to do it with bitness that doesn't match
>>> the io_uring? And disable SQPOLL in compat mode?
>>
>> Something like below. If PF_FORCE_COMPAT or any other solution
>> doesn't land by then, I'll take a look at whether other io_uring
>> syscalls need similar checks, etc.
>>
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 0458f02d4ca8..aab20785fa9a 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -8671,6 +8671,10 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
>>  	if (ctx->flags & IORING_SETUP_R_DISABLED)
>>  		goto out;
>>
>> +	ret = -EINVAL;
>> +	if (ctx->compat != in_compat_syscall())
>> +		goto out;
>> +
>
> This seems entirely reasonable to me. Sharing an io_uring ring
> between programs with different ABIs seems a bit nutty.
>
>>  	/*
>>  	 * For SQ polling, the thread will do all submissions and completions.
>>  	 * Just return the requested submit count, and wake the thread if
>> @@ -9006,6 +9010,10 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p,
>>  	if (ret)
>>  		goto err;
>>
>> +	ret = -EINVAL;
>> +	if (ctx->compat)
>> +		goto err;
>> +
>
> I may be looking at a different kernel than you, but aren't you
> preventing creating an io_uring regardless of whether SQPOLL is
> requested?

I diffed a not-saved file on a sleepy head, thanks for noticing.
As you said, there should be an SQPOLL check.

...
if (ctx->compat && (p->flags & IORING_SETUP_SQPOLL))
	goto err;

-- 
Pavel Begunkov
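
For readers following the thread, here is a minimal userspace sketch of the two checks as discussed: reject SQPOLL at creation only for compat rings, and reject io_uring_enter() when the caller's ABI does not match the ring's. It is a model, not kernel code. struct ring_model, MODEL_SETUP_SQPOLL, and the model_* helpers are hypothetical names made up for illustration, and the caller_compat argument stands in for what in_compat_syscall() would report in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the IORING_SETUP_SQPOLL setup flag. */
#define MODEL_SETUP_SQPOLL (1U << 1)

/* Hypothetical model of the two ring fields the checks look at. */
struct ring_model {
	bool compat;        /* ABI of the task that created the ring */
	unsigned int flags; /* setup flags requested at creation */
};

/* Creation-time check: only the compat + SQPOLL combination is refused. */
static int model_create_check(const struct ring_model *ring)
{
	if (ring->compat && (ring->flags & MODEL_SETUP_SQPOLL))
		return -EINVAL;
	return 0;
}

/* Enter-time check: the caller's ABI must match the ABI of the ring. */
static int model_enter_check(const struct ring_model *ring, bool caller_compat)
{
	if (ring->compat != caller_compat)
		return -EINVAL;
	return 0;
}

/* Small self-check that walks through the cases raised in the thread. */
static void model_selftest(void)
{
	struct ring_model native_sqpoll = { .compat = false, .flags = MODEL_SETUP_SQPOLL };
	struct ring_model compat_plain = { .compat = true, .flags = 0 };
	struct ring_model compat_sqpoll = { .compat = true, .flags = MODEL_SETUP_SQPOLL };

	assert(model_create_check(&native_sqpoll) == 0);
	assert(model_create_check(&compat_plain) == 0); /* compat alone is fine */
	assert(model_create_check(&compat_sqpoll) == -EINVAL);

	assert(model_enter_check(&compat_plain, true) == 0);
	assert(model_enter_check(&compat_plain, false) == -EINVAL);
}
```

The compat_plain case is the one Andy pointed out: with the first version of the diff it would have been rejected at creation, whereas with the SQPOLL condition added it is accepted.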