All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Maydell <peter.maydell@linaro.org>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: Linus Walleij <linus.walleij@linaro.org>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Andy Lutomirski <luto@kernel.org>
Subject: Re: [PATCH] fcntl: Add 32bit filesystem mode
Date: Mon, 20 Apr 2020 12:38:44 +0100	[thread overview]
Message-ID: <CAFEAcA-No3Z95+UQJZWTxDesd-z_Y5XnyHs6NMpzDo3RVOHQ4w@mail.gmail.com> (raw)
In-Reply-To: <87v9luwgc6.fsf@mid.deneb.enyo.de>

On Mon, 20 Apr 2020 at 12:23, Florian Weimer <fw@deneb.enyo.de> wrote:
>
> * Peter Maydell:
>
> > We open fd 3 to read '.'; we issue the new fcntl, which
> > succeeds. Then there's some unrelated stuff operating on
> > stdout. Then we do a getdents64(), but the d_off values
> > we get back are still 64 bits. The guest binary doesn't
> > like those, so it fails. My expectation was that we would
> > get back d_off values here that were in the 32 bit range.
>
> What's your file system?
>
> I think not all of them have 32-bit hashes (some of them probably
> can't, particularly in the network-based file system case).

Whoops, good point. I was testing this via lkvm, so it's
actually using a 9p filesystem... I'll see if I can figure
out how to test with an ext3 fs, which I think is the one
we most care about. It would be nice if the flag was
supported by other fses too, of course.

Appended is the QEMU patch I tested with.

thanks
-- PMM

From 73471e01733dd1d998ff3cd41edebb4c78793193 Mon Sep 17 00:00:00 2001
From: Peter Maydell <peter.maydell@linaro.org>
Date: Mon, 20 Apr 2020 11:54:22 +0100
Subject: [RFC] linux-user: Use new F_SET_FILE_32BIT_FS fcntl for 32-bit guests

If the guest is 32 bit then there is a potential problem if the
host gives us back a 64-bit sized value that we can't fit into
the ABI the guest requires. This is a theoretical issue for many
syscalls, but a real issue for directory reads where the host
is using ext3 or ext4. There the 'offset' values retured via
the getdents syscall are hashes, and on a 64-bit system they
will always fill the full 64 bits.

Use the F_SET_FILE_32BIT_FS fcntl to tell the kernel to stick
to 32-bit sized hashes for fds used by the guest.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
RFC patch because it depends on the kernel patch to provide
F_SET_FILE_32BIT_FS, which is still under discussion. All this
patch does is call the fcntl for every fd the guest opens.

 linux-user/syscall.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 674f70e70a5..8966d4881bd 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -884,6 +884,28 @@ static inline int host_to_target_sock_type(int host_type)
     return target_type;
 }

+/*
+ * If the guest is using a 32 bit ABI then we should try to ask the kernel
+ * to provide 32-bit offsets in getdents syscalls, as otherwise some
+ * filesystems will return 64-bit hash values which we can't fit into
+ * the field sizes the guest ABI mandates.
+ */
+#ifndef F_SET_FILE_32BIT_FS
+#define F_SET_FILE_32BIT_FS (1024 + 15)
+#endif
+
+static inline void request_32bit_fs(int fd)
+{
+#if HOST_LONG_BITS > TARGET_ABI_BITS
+    /*
+     * Ignore errors, which are likely due to the host kernel being too
+     * old to support this fcntl. We'll try anyway, which might or might
+     * not work, depending on the guest code and on the host filesystem.
+     */
+    fcntl(fd, F_SET_FILE_32BIT_FS);
+#endif
+}
+
 static abi_ulong target_brk;
 static abi_ulong target_original_brk;
 static abi_ulong brk_page;
@@ -7704,6 +7726,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
                                   target_to_host_bitmask(arg2,
fcntl_flags_tbl),
                                   arg3));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -7714,6 +7737,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
                                   target_to_host_bitmask(arg3,
fcntl_flags_tbl),
                                   arg4));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg2, 0);
         return ret;
 #if defined(TARGET_NR_name_to_handle_at) && defined(CONFIG_OPEN_BY_HANDLE)
@@ -7725,6 +7749,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
     case TARGET_NR_open_by_handle_at:
         ret = do_open_by_handle_at(arg1, arg2, arg3);
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         return ret;
 #endif
     case TARGET_NR_close:
@@ -7769,6 +7794,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
             return -TARGET_EFAULT;
         ret = get_errno(creat(p, arg2));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -12393,6 +12419,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
         }
         ret = get_errno(memfd_create(p, arg2));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg1, 0);
         return ret;
 #endif
-- 
2.20.1

WARNING: multiple messages have this Message-ID (diff)
From: Peter Maydell <peter.maydell@linaro.org>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: Theodore Ts'o <tytso@mit.edu>,
	Linux API <linux-api@vger.kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Andy Lutomirski <luto@kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH] fcntl: Add 32bit filesystem mode
Date: Mon, 20 Apr 2020 12:38:44 +0100	[thread overview]
Message-ID: <CAFEAcA-No3Z95+UQJZWTxDesd-z_Y5XnyHs6NMpzDo3RVOHQ4w@mail.gmail.com> (raw)
In-Reply-To: <87v9luwgc6.fsf@mid.deneb.enyo.de>

On Mon, 20 Apr 2020 at 12:23, Florian Weimer <fw@deneb.enyo.de> wrote:
>
> * Peter Maydell:
>
> > We open fd 3 to read '.'; we issue the new fcntl, which
> > succeeds. Then there's some unrelated stuff operating on
> > stdout. Then we do a getdents64(), but the d_off values
> > we get back are still 64 bits. The guest binary doesn't
> > like those, so it fails. My expectation was that we would
> > get back d_off values here that were in the 32 bit range.
>
> What's your file system?
>
> I think not all of them have 32-bit hashes (some of them probably
> can't, particularly in the network-based file system case).

Whoops, good point. I was testing this via lkvm, so it's
actually using a 9p filesystem... I'll see if I can figure
out how to test with an ext3 fs, which I think is the one
we most care about. It would be nice if the flag was
supported by other fses too, of course.

Appended is the QEMU patch I tested with.

thanks
-- PMM

From 73471e01733dd1d998ff3cd41edebb4c78793193 Mon Sep 17 00:00:00 2001
From: Peter Maydell <peter.maydell@linaro.org>
Date: Mon, 20 Apr 2020 11:54:22 +0100
Subject: [RFC] linux-user: Use new F_SET_FILE_32BIT_FS fcntl for 32-bit guests

If the guest is 32 bit then there is a potential problem if the
host gives us back a 64-bit sized value that we can't fit into
the ABI the guest requires. This is a theoretical issue for many
syscalls, but a real issue for directory reads where the host
is using ext3 or ext4. There the 'offset' values retured via
the getdents syscall are hashes, and on a 64-bit system they
will always fill the full 64 bits.

Use the F_SET_FILE_32BIT_FS fcntl to tell the kernel to stick
to 32-bit sized hashes for fds used by the guest.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
RFC patch because it depends on the kernel patch to provide
F_SET_FILE_32BIT_FS, which is still under discussion. All this
patch does is call the fcntl for every fd the guest opens.

 linux-user/syscall.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 674f70e70a5..8966d4881bd 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -884,6 +884,28 @@ static inline int host_to_target_sock_type(int host_type)
     return target_type;
 }

+/*
+ * If the guest is using a 32 bit ABI then we should try to ask the kernel
+ * to provide 32-bit offsets in getdents syscalls, as otherwise some
+ * filesystems will return 64-bit hash values which we can't fit into
+ * the field sizes the guest ABI mandates.
+ */
+#ifndef F_SET_FILE_32BIT_FS
+#define F_SET_FILE_32BIT_FS (1024 + 15)
+#endif
+
+static inline void request_32bit_fs(int fd)
+{
+#if HOST_LONG_BITS > TARGET_ABI_BITS
+    /*
+     * Ignore errors, which are likely due to the host kernel being too
+     * old to support this fcntl. We'll try anyway, which might or might
+     * not work, depending on the guest code and on the host filesystem.
+     */
+    fcntl(fd, F_SET_FILE_32BIT_FS);
+#endif
+}
+
 static abi_ulong target_brk;
 static abi_ulong target_original_brk;
 static abi_ulong brk_page;
@@ -7704,6 +7726,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
                                   target_to_host_bitmask(arg2,
fcntl_flags_tbl),
                                   arg3));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -7714,6 +7737,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
                                   target_to_host_bitmask(arg3,
fcntl_flags_tbl),
                                   arg4));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg2, 0);
         return ret;
 #if defined(TARGET_NR_name_to_handle_at) && defined(CONFIG_OPEN_BY_HANDLE)
@@ -7725,6 +7749,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
     case TARGET_NR_open_by_handle_at:
         ret = do_open_by_handle_at(arg1, arg2, arg3);
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         return ret;
 #endif
     case TARGET_NR_close:
@@ -7769,6 +7794,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
             return -TARGET_EFAULT;
         ret = get_errno(creat(p, arg2));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -12393,6 +12419,7 @@ static abi_long do_syscall1(void *cpu_env, int
num, abi_long arg1,
         }
         ret = get_errno(memfd_create(p, arg2));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg1, 0);
         return ret;
 #endif
-- 
2.20.1


  reply	other threads:[~2020-04-20 11:38 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-31 13:35 [PATCH] fcntl: Add 32bit filesystem mode Linus Walleij
2020-03-31 13:35 ` Linus Walleij
2020-04-20 11:19 ` Peter Maydell
2020-04-20 11:19   ` Peter Maydell
2020-04-20 11:23   ` Florian Weimer
2020-04-20 11:23     ` Florian Weimer
2020-04-20 11:38     ` Peter Maydell [this message]
2020-04-20 11:38       ` Peter Maydell
2020-04-20 14:16       ` Peter Maydell
2020-04-20 14:16         ` Peter Maydell
2020-04-20 23:51       ` Andreas Dilger
2020-04-20 23:51         ` Andreas Dilger
2020-04-21 13:02         ` Peter Maydell
2020-04-21 13:02           ` Peter Maydell
2020-04-20 15:13 ` Theodore Y. Ts'o
2020-04-20 15:13   ` Theodore Y. Ts'o
2020-04-20 15:23   ` Eric Blake
2020-04-20 15:29     ` Peter Maydell
2020-04-20 15:29       ` Peter Maydell
2020-04-20 17:01       ` Theodore Y. Ts'o
2020-04-20 17:01         ` Theodore Y. Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFEAcA-No3Z95+UQJZWTxDesd-z_Y5XnyHs6NMpzDo3RVOHQ4w@mail.gmail.com \
    --to=peter.maydell@linaro.org \
    --cc=adilger.kernel@dilger.ca \
    --cc=fw@deneb.enyo.de \
    --cc=linus.walleij@linaro.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=qemu-devel@nongnu.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.