linux-api.vger.kernel.org archive mirror
From: Peter Maydell <peter.maydell@linaro.org>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: Linus Walleij <linus.walleij@linaro.org>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Andy Lutomirski <luto@kernel.org>
Subject: Re: [PATCH] fcntl: Add 32bit filesystem mode
Date: Mon, 20 Apr 2020 12:38:44 +0100
Message-ID: <CAFEAcA-No3Z95+UQJZWTxDesd-z_Y5XnyHs6NMpzDo3RVOHQ4w@mail.gmail.com>
In-Reply-To: <87v9luwgc6.fsf@mid.deneb.enyo.de>

On Mon, 20 Apr 2020 at 12:23, Florian Weimer <fw@deneb.enyo.de> wrote:
>
> * Peter Maydell:
>
> > We open fd 3 to read '.'; we issue the new fcntl, which
> > succeeds. Then there's some unrelated stuff operating on
> > stdout. Then we do a getdents64(), but the d_off values
> > we get back are still 64 bits. The guest binary doesn't
> > like those, so it fails. My expectation was that we would
> > get back d_off values here that were in the 32 bit range.
>
> What's your file system?
>
> I think not all of them have 32-bit hashes (some of them probably
> can't, particularly in the network-based file system case).

Whoops, good point. I was testing this via lkvm, so it's
actually using a 9p filesystem... I'll see if I can figure
out how to test with an ext3 fs, which I think is the one
we most care about. It would be nice if the flag was
supported by other fses too, of course.
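
For anyone wanting to repeat this check on a different filesystem, here is
a minimal sketch (not part of the patch or of QEMU) that reads a directory
with the raw getdents64 syscall and counts entries whose d_off falls
outside the signed 32-bit range. The helper name count_wide_offsets and
the choice of INT32_MAX as the limit are assumptions made for
illustration; the record layout follows the getdents64(2) man page.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout returned by the getdents64 syscall (see getdents64(2)). */
struct linux_dirent64 {
    uint64_t d_ino;
    int64_t d_off;
    unsigned short d_reclen;
    unsigned char d_type;
    char d_name[];
};

/*
 * Hypothetical helper: count the entries under 'path' whose d_off does
 * not fit in a signed 32-bit value. Returns -1 if the directory cannot
 * be opened or read.
 */
static long count_wide_offsets(const char *path)
{
    _Alignas(8) char buf[4096];
    long wide = 0;
    int fd = open(path, O_RDONLY | O_DIRECTORY);

    if (fd < 0) {
        return -1;
    }
    for (;;) {
        long n = syscall(SYS_getdents64, fd, buf, sizeof(buf));

        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0) {
            break; /* end of directory */
        }
        for (long pos = 0; pos < n; ) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);

            if (d->d_off < 0 || d->d_off > INT32_MAX) {
                wide++;
            }
            pos += d->d_reclen;
        }
    }
    close(fd);
    return wide;
}
```

Calling count_wide_offsets(".") from a small main() on the filesystem
under test should show whether any offsets exceed the 32-bit range: a
result of 0 means every d_off fits, a positive result is the number of
entries that do not.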

Appended is the QEMU patch I tested with.

thanks
-- PMM

From 73471e01733dd1d998ff3cd41edebb4c78793193 Mon Sep 17 00:00:00 2001
From: Peter Maydell <peter.maydell@linaro.org>
Date: Mon, 20 Apr 2020 11:54:22 +0100
Subject: [RFC] linux-user: Use new F_SET_FILE_32BIT_FS fcntl for 32-bit guests

If the guest is 32 bit then there is a potential problem if the
host gives us back a 64-bit sized value that we can't fit into
the ABI the guest requires. This is a theoretical issue for many
syscalls, but a real issue for directory reads where the host
is using ext3 or ext4. There the 'offset' values returned via
the getdents syscall are hashes, and on a 64-bit system they
will always fill the full 64 bits.

Use the F_SET_FILE_32BIT_FS fcntl to tell the kernel to stick
to 32-bit sized hashes for fds used by the guest.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
RFC patch because it depends on the kernel patch to provide
F_SET_FILE_32BIT_FS, which is still under discussion. All this
patch does is call the fcntl for every fd the guest opens.
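
To illustrate the failure this is trying to avoid, here is a minimal
sketch of what narrowing a host directory offset into a 32-bit field
involves. narrow_dir_offset is a hypothetical helper and is not code
from QEMU or from this patch. Reporting EOVERFLOW for values that do
not fit is an assumption about how such a conversion would usually
signal the problem.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): store a host directory offset
 * in a 32-bit guest field. Returns 0 on success, or -EOVERFLOW if the
 * value does not fit, leaving *guest_off untouched.
 */
static int narrow_dir_offset(int64_t host_off, int32_t *guest_off)
{
    if (host_off < 0 || host_off > INT32_MAX) {
        return -EOVERFLOW;
    }
    *guest_off = (int32_t)host_off;
    return 0;
}
```

With full 64-bit hashes as offsets, the overflow branch is the one that
is taken; if the kernel keeps the hashes within 32 bits for the fd, the
conversion succeeds.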

 linux-user/syscall.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 674f70e70a5..8966d4881bd 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -884,6 +884,28 @@ static inline int host_to_target_sock_type(int host_type)
     return target_type;
 }

+/*
+ * If the guest is using a 32 bit ABI then we should try to ask the kernel
+ * to provide 32-bit offsets in getdents syscalls, as otherwise some
+ * filesystems will return 64-bit hash values which we can't fit into
+ * the field sizes the guest ABI mandates.
+ */
+#ifndef F_SET_FILE_32BIT_FS
+#define F_SET_FILE_32BIT_FS (1024 + 15)
+#endif
+
+static inline void request_32bit_fs(int fd)
+{
+#if HOST_LONG_BITS > TARGET_ABI_BITS
+    /*
+     * Ignore errors, which are likely due to the host kernel being too
+     * old to support this fcntl. We'll try anyway, which might or might
+     * not work, depending on the guest code and on the host filesystem.
+     */
+    fcntl(fd, F_SET_FILE_32BIT_FS);
+#endif
+}
+
 static abi_ulong target_brk;
 static abi_ulong target_original_brk;
 static abi_ulong brk_page;
@@ -7704,6 +7726,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
                                   target_to_host_bitmask(arg2, fcntl_flags_tbl),
                                   arg3));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -7714,6 +7737,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
                                   target_to_host_bitmask(arg3, fcntl_flags_tbl),
                                   arg4));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg2, 0);
         return ret;
 #if defined(TARGET_NR_name_to_handle_at) && defined(CONFIG_OPEN_BY_HANDLE)
@@ -7725,6 +7749,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
     case TARGET_NR_open_by_handle_at:
         ret = do_open_by_handle_at(arg1, arg2, arg3);
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         return ret;
 #endif
     case TARGET_NR_close:
@@ -7769,6 +7794,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
             return -TARGET_EFAULT;
         ret = get_errno(creat(p, arg2));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -12393,6 +12419,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
         }
         ret = get_errno(memfd_create(p, arg2));
         fd_trans_unregister(ret);
+        request_32bit_fs(ret);
         unlock_user(p, arg1, 0);
         return ret;
 #endif
-- 
2.20.1

Thread overview: 11+ messages
2020-03-31 13:35 [PATCH] fcntl: Add 32bit filesystem mode Linus Walleij
2020-04-20 11:19 ` Peter Maydell
2020-04-20 11:23   ` Florian Weimer
2020-04-20 11:38     ` Peter Maydell [this message]
2020-04-20 14:16       ` Peter Maydell
2020-04-20 23:51       ` Andreas Dilger
2020-04-21 13:02         ` Peter Maydell
2020-04-20 15:13 ` Theodore Y. Ts'o
2020-04-20 15:23   ` Eric Blake
2020-04-20 15:29     ` Peter Maydell
2020-04-20 17:01       ` Theodore Y. Ts'o