linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v3 RESEND] fcntl: Add 32bit filesystem mode
@ 2020-10-12 22:06 Linus Walleij
  2020-10-13  0:08 ` Eric Blake
  2020-10-13  9:22 ` Dave Martin
  0 siblings, 2 replies; 6+ messages in thread
From: Linus Walleij @ 2020-10-12 22:06 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: linux-ext4, linux-fsdevel, linux-api, qemu-devel, Linus Walleij,
	Florian Weimer, Peter Maydell, Andy Lutomirski

It was brought to my attention that this bug from 2018 was
still unresolved: 32 bit emulators like QEMU were given
64 bit hashes when running 32 bit emulation on 64 bit systems.

This adds a flag to the fcntl() F_GETFD and F_SETFD operations
to set the underlying filesystem into 32bit mode even if the
file handle was opened using 64bit mode without the compat
syscalls.

Programs that need the 32 bit file system behavior need to
issue a fcntl() system call such as in this example:

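  #include <dirent.h>
  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>

  #define FD_32BIT_MODE 2

  int main(int argc, char** argv) {
    DIR* dir;
    int err;
    int fd;

    dir = opendir("/boot");
    if (!dir) {
      printf("opendir() failed! errno=%d: %s\n", errno, strerror(errno));
      return 1;
    }
    fd = dirfd(dir);
    /* Ask the filesystem for 32bit semantics on this directory */
    err = fcntl(fd, F_SETFD, FD_32BIT_MODE);
    if (err) {
      printf("fcntl() failed! err=%d\n", err);
      return 1;
    }
    printf("dir=%p\n", dir);
    printf("readdir(dir)=%p\n", readdir(dir));
    printf("errno=%d: %s\n", errno, strerror(errno));
    return 0;
  }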

This can be pretty hard to test since C libraries and Linux
userspace security extensions aggressively filter the parameters
that are passed down to the actual system calls.

Cc: Florian Weimer <fw@deneb.enyo.de>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Theodore Ts'o <tytso@mit.edu>
Link: https://bugs.launchpad.net/qemu/+bug/1805913
Link: https://lore.kernel.org/lkml/87bm56vqg4.fsf@mid.deneb.enyo.de/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205957
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
ChangeLog v3->v3 RESEND 1:
- Resending during the v5.10 merge window to get attention.
ChangeLog v2->v3:
- Realized that I also have to clear the flag correspondingly
  if someone asks for !FD_32BIT_MODE after setting it the
  first time.
ChangeLog v1->v2:
- Use a new flag FD_32BIT_MODE to F_GETFD and F_SETFD
  instead of a new fcntl operation, since there is already a
  fcntl operation to set random flags.
- Sorry for taking forever to respin this patch :(
---
 fs/fcntl.c                       | 7 +++++++
 include/uapi/asm-generic/fcntl.h | 8 ++++++++
 2 files changed, 15 insertions(+)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 19ac5baad50f..6c32edc4099a 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -335,10 +335,17 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 		break;
 	case F_GETFD:
 		err = get_close_on_exec(fd) ? FD_CLOEXEC : 0;
+		/* Report 32bit file system mode */
+		if (filp->f_mode & FMODE_32BITHASH)
+			err |= FD_32BIT_MODE;
 		break;
 	case F_SETFD:
 		err = 0;
 		set_close_on_exec(fd, arg & FD_CLOEXEC);
+		if (arg & FD_32BIT_MODE)
+			filp->f_mode |= FMODE_32BITHASH;
+		else
+			filp->f_mode &= ~FMODE_32BITHASH;
 		break;
 	case F_GETFL:
 		err = filp->f_flags;
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 9dc0bf0c5a6e..edd3573cb7ef 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -160,6 +160,14 @@ struct f_owner_ex {
 
 /* for F_[GET|SET]FL */
 #define FD_CLOEXEC	1	/* actually anything with low bit set goes */
+/*
+ * This instructs the kernel to provide 32bit semantics (such as hashes) from
+ * the file system layer, when running a userland that depends on 32bit
+ * semantics on a kernel that supports 64bit userland, but does not use the
+ * compat syscalls for e.g. open(), so that the kernel would otherwise assume
+ * that the userland process is capable of dealing with 64bit semantics.
+ */
+#define FD_32BIT_MODE	2
 
 /* for posix fcntl() and lockf() */
 #ifndef F_RDLCK
-- 
2.26.2



* Re: [PATCH v3 RESEND] fcntl: Add 32bit filesystem mode
  2020-10-12 22:06 [PATCH v3 RESEND] fcntl: Add 32bit filesystem mode Linus Walleij
@ 2020-10-13  0:08 ` Eric Blake
  2020-10-13  9:22 ` Dave Martin
  1 sibling, 0 replies; 6+ messages in thread
From: Eric Blake @ 2020-10-13  0:08 UTC (permalink / raw)
  To: Linus Walleij, Theodore Ts'o, Andreas Dilger
  Cc: Peter Maydell, linux-api, qemu-devel, Florian Weimer,
	Andy Lutomirski, linux-fsdevel, linux-ext4

On 10/12/20 5:06 PM, Linus Walleij wrote:
> It was brought to my attention that this bug from 2018 was
> still unresolved: 32 bit emulators like QEMU were given
> 64 bit hashes when running 32 bit emulation on 64 bit systems.
> 
> This adds a flag to the fcntl() F_GETFD and F_SETFD operations
> to set the underlying filesystem into 32bit mode even if the
> file handle was opened using 64bit mode without the compat
> syscalls.
> 
> Programs that need the 32 bit file system behavior need to
> issue a fcntl() system call such as in this example:
> 
>    #define FD_32BIT_MODE 2
> 
>    int main(int argc, char** argv) {
>      DIR* dir;
>      int err;
>      int fd;
> 
>      dir = opendir("/boot");
>      fd = dirfd(dir);
>      err = fcntl(fd, F_SETFD, FD_32BIT_MODE);

This is a blind set, and wipes out FD_CLOEXEC. Better would be to do a 
proper demonstration of the read-modify-write with F_GETFD that portable 
programs will have to use in practice.
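
A minimal sketch of what such a read-modify-write could look like is
given below. The helper name fd_add_flags() is hypothetical, and
FD_32BIT_MODE only exists in the patch under review, so it is defined
locally with the value proposed there. On a kernel without the patch
the extra bit is expected to be ignored by F_SETFD, while FD_CLOEXEC is
still preserved.

```c
#include <fcntl.h>
#include <unistd.h>

/* Assumption: value taken from the proposed patch; this flag is not
 * defined by released kernel or libc headers. */
#ifndef FD_32BIT_MODE
#define FD_32BIT_MODE 2
#endif

/*
 * Hypothetical helper: add `flags` to the descriptor flags of `fd`
 * without wiping out flags that are already set (e.g. FD_CLOEXEC).
 * Returns 0 on success, -1 on error with errno set by fcntl().
 */
static int fd_add_flags(int fd, int flags)
{
	int cur = fcntl(fd, F_GETFD);	/* read the current flags */

	if (cur == -1)
		return -1;
	/* modify and write back, keeping everything that was set */
	return fcntl(fd, F_SETFD, cur | flags) == -1 ? -1 : 0;
}
```

In the example from the commit message, the blind fcntl(fd, F_SETFD,
FD_32BIT_MODE) call would then become fd_add_flags(fd, FD_32BIT_MODE).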

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



* Re: [PATCH v3 RESEND] fcntl: Add 32bit filesystem mode
  2020-10-12 22:06 [PATCH v3 RESEND] fcntl: Add 32bit filesystem mode Linus Walleij
  2020-10-13  0:08 ` Eric Blake
@ 2020-10-13  9:22 ` Dave Martin
  2020-11-17 23:38   ` Linus Walleij
  1 sibling, 1 reply; 6+ messages in thread
From: Dave Martin @ 2020-10-13  9:22 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, linux-fsdevel,
	linux-api, qemu-devel, Florian Weimer, Peter Maydell,
	Andy Lutomirski

On Tue, Oct 13, 2020 at 12:06:20AM +0200, Linus Walleij wrote:
> It was brought to my attention that this bug from 2018 was
> still unresolved: 32 bit emulators like QEMU were given
> 64 bit hashes when running 32 bit emulation on 64 bit systems.
> 
> This adds a flag to the fcntl() F_GETFD and F_SETFD operations
> to set the underlying filesystem into 32bit mode even if the
> file handle was opened using 64bit mode without the compat
> syscalls.
> 
> Programs that need the 32 bit file system behavior need to
> issue a fcntl() system call such as in this example:
> 
>   #define FD_32BIT_MODE 2
> 
>   int main(int argc, char** argv) {
>     DIR* dir;
>     int err;
>     int fd;
> 
>     dir = opendir("/boot");
>     fd = dirfd(dir);
>     err = fcntl(fd, F_SETFD, FD_32BIT_MODE);
>     if (err) {
>       printf("fcntl() failed! err=%d\n", err);
>       return 1;
>     }
>     printf("dir=%p\n", dir);
>     printf("readdir(dir)=%p\n", readdir(dir));
>     printf("errno=%d: %s\n", errno, strerror(errno));
>     return 0;
>   }
> 
> This can be pretty hard to test since C libraries and linux
> userspace security extensions aggressively filter the parameters
> that are passed down and allowed to commit into actual system
> calls.
> 
> Cc: Florian Weimer <fw@deneb.enyo.de>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Suggested-by: Theodore Ts'o <tytso@mit.edu>
> Link: https://bugs.launchpad.net/qemu/+bug/1805913
> Link: https://lore.kernel.org/lkml/87bm56vqg4.fsf@mid.deneb.enyo.de/
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205957
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
> ChangeLog v3->v3 RESEND 1:
> - Resending during the v5.10 merge window to get attention.
> ChangeLog v2->v3:
> - Realized that I also have to clear the flag correspondingly
>   if someone ask for !FD_32BIT_MODE after setting it the
>   first time.
> ChangeLog v1->v2:
> - Use a new flag FD_32BIT_MODE to F_GETFD and F_SETFD
>   instead of a new fcntl operation, there is already a fcntl
>   operation to set random flags.
> - Sorry for taking forever to respin this patch :(
> ---
>  fs/fcntl.c                       | 7 +++++++
>  include/uapi/asm-generic/fcntl.h | 8 ++++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/fs/fcntl.c b/fs/fcntl.c
> index 19ac5baad50f..6c32edc4099a 100644
> --- a/fs/fcntl.c
> +++ b/fs/fcntl.c
> @@ -335,10 +335,17 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
>  		break;
>  	case F_GETFD:
>  		err = get_close_on_exec(fd) ? FD_CLOEXEC : 0;
> +		/* Report 32bit file system mode */
> +		if (filp->f_mode & FMODE_32BITHASH)
> +			err |= FD_32BIT_MODE;
>  		break;
>  	case F_SETFD:
>  		err = 0;
>  		set_close_on_exec(fd, arg & FD_CLOEXEC);
> +		if (arg & FD_32BIT_MODE)
> +			filp->f_mode |= FMODE_32BITHASH;
> +		else
> +			filp->f_mode &= ~FMODE_32BITHASH;

This seems inconsistent?  F_SETFD is for setting flags on a file
descriptor.  Won't setting a flag on filp here instead cause the
behaviour to change for all file descriptors across the system that are
open on this struct file?  Compare set_close_on_exec().

I don't see any discussion on whether this should be an F_SETFL or an
F_SETFD, though I see F_SETFD was Ted's suggestion originally.
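
To illustrate the distinction with flags that exist today, here is a
small sketch using dup(). The two helper names are hypothetical. A
descriptor flag such as FD_CLOEXEC lives with the individual file
descriptor, while a file status flag such as O_NONBLOCK lives with the
open file description (struct file) and is therefore seen through every
descriptor that refers to it.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set FD_CLOEXEC on `fd` via F_SETFD and report
 * whether a dup()ed descriptor is unaffected.
 * Returns 1 if the flag is per-descriptor, 0 if not, -1 on error.
 * Side effect: FD_CLOEXEC stays set on `fd`.
 */
static int fd_flag_is_per_descriptor(int fd)
{
	int copy = dup(fd);	/* the new descriptor starts without FD_CLOEXEC */
	int seen;

	if (copy == -1)
		return -1;
	if (fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
		close(copy);
		return -1;
	}
	seen = fcntl(copy, F_GETFD);
	close(copy);
	if (seen == -1)
		return -1;
	return !(seen & FD_CLOEXEC);
}

/*
 * Hypothetical helper: set O_NONBLOCK on `fd` via F_SETFL and report
 * whether a dup()ed descriptor observes the change.
 * Returns 1 if the flag is shared, 0 if not, -1 on error.
 * Side effect: O_NONBLOCK stays set on the open file description.
 */
static int status_flag_is_shared(int fd)
{
	int copy = dup(fd);	/* shares the open file description with fd */
	int cur, seen;

	if (copy == -1)
		return -1;
	cur = fcntl(fd, F_GETFL);
	if (cur == -1 || fcntl(fd, F_SETFL, cur | O_NONBLOCK) == -1) {
		close(copy);
		return -1;
	}
	seen = fcntl(copy, F_GETFL);
	close(copy);
	if (seen == -1)
		return -1;
	return !!(seen & O_NONBLOCK);
}
```

With the patch as posted, FD_32BIT_MODE is set through F_SETFD but
stored in filp->f_mode, so it would behave like the second helper
rather than the first.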

[...]

Cheers
---Dave


* Re: [PATCH v3 RESEND] fcntl: Add 32bit filesystem mode
  2020-10-13  9:22 ` Dave Martin
@ 2020-11-17 23:38   ` Linus Walleij
  2020-11-18  9:00     ` Arnd Bergmann
  2021-11-15 10:56     ` Peter Maydell
  0 siblings, 2 replies; 6+ messages in thread
From: Linus Walleij @ 2020-11-17 23:38 UTC (permalink / raw)
  To: Dave Martin
  Cc: Theodore Ts'o, Andreas Dilger, Ext4 Developers List,
	linux-fsdevel, Linux API, QEMU Developers, Florian Weimer,
	Peter Maydell, Andy Lutomirski

On Tue, Oct 13, 2020 at 11:22 AM Dave Martin <Dave.Martin@arm.com> wrote:

> >       case F_SETFD:
> >               err = 0;
> >               set_close_on_exec(fd, arg & FD_CLOEXEC);
> > +             if (arg & FD_32BIT_MODE)
> > +                     filp->f_mode |= FMODE_32BITHASH;
> > +             else
> > +                     filp->f_mode &= ~FMODE_32BITHASH;
>
> This seems inconsistent?  F_SETFD is for setting flags on a file
> descriptor.  Won't setting a flag on filp here instead cause the
> behaviour to change for all file descriptors across the system that are
> open on this struct file?  Compare set_close_on_exec().
>
> I don't see any discussion on whether this should be an F_SETFL or an
> F_SETFD, though I see F_SETFD was Ted's suggestion originally.

I cannot honestly say I know the semantic difference.

I would ask the QEMU people how a user program would expect
the flag to behave.

Yours,
Linus Walleij


* Re: [PATCH v3 RESEND] fcntl: Add 32bit filesystem mode
  2020-11-17 23:38   ` Linus Walleij
@ 2020-11-18  9:00     ` Arnd Bergmann
  2021-11-15 10:56     ` Peter Maydell
  1 sibling, 0 replies; 6+ messages in thread
From: Arnd Bergmann @ 2020-11-18  9:00 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Dave Martin, Theodore Ts'o, Andreas Dilger,
	Ext4 Developers List, linux-fsdevel, Linux API, QEMU Developers,
	Florian Weimer, Peter Maydell, Andy Lutomirski

On Wed, Nov 18, 2020 at 12:38 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Tue, Oct 13, 2020 at 11:22 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> > >       case F_SETFD:
> > >               err = 0;
> > >               set_close_on_exec(fd, arg & FD_CLOEXEC);
> > > +             if (arg & FD_32BIT_MODE)
> > > +                     filp->f_mode |= FMODE_32BITHASH;
> > > +             else
> > > +                     filp->f_mode &= ~FMODE_32BITHASH;
> >
> > This seems inconsistent?  F_SETFD is for setting flags on a file
> > descriptor.  Won't setting a flag on filp here instead cause the
> > behaviour to change for all file descriptors across the system that are
> > open on this struct file?  Compare set_close_on_exec().
> >
> > I don't see any discussion on whether this should be an F_SETFL or an
> > F_SETFD, though I see F_SETFD was Ted's suggestion originally.
>
> I cannot honestly say I know the semantic difference.
>
> I would ask the QEMU people how a user program would expect
> the flag to behave.

I agree it should either use F_SETFD to set a bit in the fdtable structure
like set_close_on_exec() or it should use F_SETFL to set a bit in
filp->f_mode.

It appears the reason FMODE_32BITHASH is part of filp->f_mode
is that the only user today is nfsd, which does not have a file
descriptor but only has a struct file. Similarly, the only code that
understands the difference (ext4_readdir()) has no reference to
the file descriptor.

If this becomes an O_DIR32BITHASH flag for F_SETFL,
I suppose it should also be supported by openat2().

       Arnd


* Re: [PATCH v3 RESEND] fcntl: Add 32bit filesystem mode
  2020-11-17 23:38   ` Linus Walleij
  2020-11-18  9:00     ` Arnd Bergmann
@ 2021-11-15 10:56     ` Peter Maydell
  1 sibling, 0 replies; 6+ messages in thread
From: Peter Maydell @ 2021-11-15 10:56 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Dave Martin, Theodore Ts'o, Andreas Dilger,
	Ext4 Developers List, linux-fsdevel, Linux API, QEMU Developers,
	Florian Weimer, Andy Lutomirski

On Tue, 17 Nov 2020 at 23:38, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Tue, Oct 13, 2020 at 11:22 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> > >       case F_SETFD:
> > >               err = 0;
> > >               set_close_on_exec(fd, arg & FD_CLOEXEC);
> > > +             if (arg & FD_32BIT_MODE)
> > > +                     filp->f_mode |= FMODE_32BITHASH;
> > > +             else
> > > +                     filp->f_mode &= ~FMODE_32BITHASH;
> >
> > This seems inconsistent?  F_SETFD is for setting flags on a file
> > descriptor.  Won't setting a flag on filp here instead cause the
> > behaviour to change for all file descriptors across the system that are
> > open on this struct file?  Compare set_close_on_exec().
> >
> > I don't see any discussion on whether this should be an F_SETFL or an
> > F_SETFD, though I see F_SETFD was Ted's suggestion originally.
>
> I cannot honestly say I know the semantic difference.
>
> I would ask the QEMU people how a user program would expect
> the flag to behave.

Apologies for the very late response -- I hadn't noticed that
this thread had stalled out waiting for an answer to this,
and was only reminded of it recently when another QEMU user
ran into the problem that this kernel patch is trying to resolve.

If I understand the distinction here correctly, I think
QEMU wouldn't care about it in practice. We want the "32 bit readdir
offsets" behaviour on all file descriptors that correspond
to where we're emulating "the guest opened this file descriptor".
We don't want (but probably won't notice if we get) that behaviour
on file descriptors that QEMU has opened for its own purposes.
But we'll never open a file descriptor for the guest and then
dup it into one for QEMU's purposes. (I guess there might be
some weird unlikely-to-happen edge cases where an emulated
guest binary opens an fd for a directory and then passes it
via exec to a host binary: but even there I expect the host
binary wouldn't notice it was getting 32-bit hashes.)

But overall I think that the more natural behaviour would be that
it is per-file-descriptor.

-- PMM
