All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Maydell <peter.maydell@linaro.org>
To: Linus Walleij <linus.walleij@linaro.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Florian Weimer <fw@deneb.enyo.de>,
	Andy Lutomirski <luto@kernel.org>
Subject: Re: [PATCH] fcntl: Add 32bit filesystem mode
Date: Mon, 20 Apr 2020 12:19:17 +0100	[thread overview]
Message-ID: <CAFEAcA9Gep1HN+7WJHencp9g2uUBLhagxdgjHf-16AOdP5oOjg@mail.gmail.com> (raw)
In-Reply-To: <20200331133536.3328-1-linus.walleij@linaro.org>

On Tue, 31 Mar 2020 at 14:37, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> It was brought to my attention that this bug from 2018 was
> still unresolved: 32 bit emulators like QEMU were given
> 64 bit hashes when running 32 bit emulation on 64 bit systems.
>
> This adds a fcntl() operation to set the underlying filesystem
> into 32bit mode even if the file hanle was opened using 64bit
> mode without the compat syscalls.
>
> Programs that need the 32 bit file system behavior need to
> issue a fcntl() system call such as in this example:
>
>   #define F_SET_FILE_32BIT_FS (1024 + 15)
>
>   int main(int argc, char** argv) {
>     DIR* dir;
>     int err;
>     int fd;
>
>     dir = opendir("/boot");
>     fd = dirfd(dir);
>     err = fcntl(fd, F_SET_FILE_32BIT_FS);
>     if (err) {
>       printf("fcntl() failed! err=%d\n", err);
>       return 1;
>     }
>     printf("dir=%p\n", dir);
>     printf("readdir(dir)=%p\n", readdir(dir));
>     printf("errno=%d: %s\n", errno, strerror(errno));
>     return 0;
>   }

I gave this a try with a modified QEMU, but it doesn't seem
to fix the problem. Here's the relevant chunk of the strace
output from stracing a QEMU that's running a 32-bit guest
binary that issues a getdents64 and fails (it's the 'readdir-bug'
test case from the launchpad bug):

openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
fcntl(3, 0x40f /* F_??? */, 0x3)        = 0
fstat(3, {st_dev=makedev(0, 16), st_ino=4637237, st_mode=S_IFDIR|0755,
st_nlink=12, st_uid=1000, st_gid=1000, st_blksize=8192, st_blocks=8,
st_size=4096, st_atime=1587380917 /*
2020-04-20T11:08:37.756174607+0000 */, st_atime_nsec=756174607,
st_mtime=1587380910 /* 2020-04-20T11:08:30.356230179+0000 */,
st_mtime_nsec=356230179, st_ctime=1587380910 /*
2020-04-20T11:08:30.356230179+0000 */, st_ctime_nsec=356230179}) = 0
fstat(1, {st_dev=makedev(0, 2), st_ino=9017, st_mode=S_IFCHR|0600,
st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0,
st_rdev=makedev(5, 1), st_atime=1587381196 /* 2020-04-20T11:13:16+0000
*/, st_atime_nsec=0, st_mtime=1587381196 /* 2020-04-20T11:13:16+0000
*/, st_mtime_nsec=0, st_ctime=1587381042 /*
2020-04-20T11:10:42.484981152+0000 */, st_ctime_nsec=484981152}) = 0
ioctl(1, TCGETS, {c_iflags=0x2502, c_oflags=0x5, c_cflags=0xcbd,
c_lflags=0x8a3b, c_line=0,
c_cc="\x03\x1c\x7f\x15\x04\x00\x01\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"})
= 0
write(1, "dir=0x76128\n", 12)           = 12
getdents64(3, [{d_ino=1, d_off=273341893525646730, d_reclen=24,
d_type=DT_DIR, d_name=".."}, {d_ino=4637239, d_off=849308795555391993,
d_reclen=24, d_type=DT_DIR, d_name="etc"}, {d_ino=4587984,
d_off=1620709961571101518, d_reclen=24, d_type=DT_LNK, d_name="usr"},
{d_ino=4637238, d_off=2787937917159437645, d_reclen=24, d_type=DT_DIR,
d_name="dev"}, {d_ino=4637244, d_off=3015508490233103491, d_reclen=24,
d_type=DT_DIR, d_name="sys"}, {d_ino=4587608,
d_off=3551089360661460833, d_reclen=24, d_type=DT_LNK, d_name="lib"},
{d_ino=4637246, d_off=3857320197951442970, d_reclen=24, d_type=DT_DIR,
d_name="var"}, {d_ino=4637242, d_off=4103122318823701457, d_reclen=24,
d_type=DT_DIR, d_name="proc"}, {d_ino=4587541,
d_off=4252201186220906002, d_reclen=24, d_type=DT_LNK, d_name="bin"},
{d_ino=4637245, d_off=4386533378951587638, d_reclen=24, d_type=DT_DIR,
d_name="tmp"}, {d_ino=4637241, d_off=4883206313583644962, d_reclen=24,
d_type=DT_DIR, d_name="host"}, {d_ino=4637237,
d_off=4941119754928488586, d_reclen=24, d_type=DT_DIR, d_name="."},
{d_ino=4637243, d_off=5301154723342888169, d_reclen=24, d_type=DT_DIR,
d_name="root"}, {d_ino=4587838, d_off=6989908915879243400,
d_reclen=32, d_type=DT_LNK, d_name="lib64"}, {d_ino=4587679,
d_off=7356513223657690979, d_reclen=32, d_type=DT_REG,
d_name="strace.log"}, {d_ino=4587847, d_off=7810090083157553519,
d_reclen=24, d_type=DT_LNK, d_name="sbin"}, {d_ino=4637240,
d_off=8254997891991845677, d_reclen=24, d_type=DT_DIR, d_name="home"},
{d_ino=4637248, d_off=9223372036854775807, d_reclen=24, d_type=DT_DIR,
d_name="virt"}], 32768) = 448
write(1, "readdir(dir)=(nil)\n", 19)    = 19
write(1, "errno=75: Value too large for de"..., 48) = 48
exit_group(0)                           = ?


We open fd 3 to read '.'; we issue the new fcntl, which
succeeds. Then there's some unrelated stuff operating on
stdout. Then we do a getdents64(), but the d_off values
we get back are still 64 bits. The guest binary doesn't
like those, so it fails. My expectation was that we would
get back d_off values here that were in the 32 bit range.

(To be clear, the guest binary here is doing a getdents64(),
which QEMU translates into a host getdents64().)

thanks
-- PMM

WARNING: multiple messages have this Message-ID (diff)
From: Peter Maydell <peter.maydell@linaro.org>
To: Linus Walleij <linus.walleij@linaro.org>
Cc: Theodore Ts'o <tytso@mit.edu>,
	Linux API <linux-api@vger.kernel.org>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Florian Weimer <fw@deneb.enyo.de>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Andy Lutomirski <luto@kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH] fcntl: Add 32bit filesystem mode
Date: Mon, 20 Apr 2020 12:19:17 +0100	[thread overview]
Message-ID: <CAFEAcA9Gep1HN+7WJHencp9g2uUBLhagxdgjHf-16AOdP5oOjg@mail.gmail.com> (raw)
In-Reply-To: <20200331133536.3328-1-linus.walleij@linaro.org>

On Tue, 31 Mar 2020 at 14:37, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> It was brought to my attention that this bug from 2018 was
> still unresolved: 32 bit emulators like QEMU were given
> 64 bit hashes when running 32 bit emulation on 64 bit systems.
>
> This adds a fcntl() operation to set the underlying filesystem
> into 32bit mode even if the file hanle was opened using 64bit
> mode without the compat syscalls.
>
> Programs that need the 32 bit file system behavior need to
> issue a fcntl() system call such as in this example:
>
>   #define F_SET_FILE_32BIT_FS (1024 + 15)
>
>   int main(int argc, char** argv) {
>     DIR* dir;
>     int err;
>     int fd;
>
>     dir = opendir("/boot");
>     fd = dirfd(dir);
>     err = fcntl(fd, F_SET_FILE_32BIT_FS);
>     if (err) {
>       printf("fcntl() failed! err=%d\n", err);
>       return 1;
>     }
>     printf("dir=%p\n", dir);
>     printf("readdir(dir)=%p\n", readdir(dir));
>     printf("errno=%d: %s\n", errno, strerror(errno));
>     return 0;
>   }

I gave this a try with a modified QEMU, but it doesn't seem
to fix the problem. Here's the relevant chunk of the strace
output from stracing a QEMU that's running a 32-bit guest
binary that issues a getdents64 and fails (it's the 'readdir-bug'
test case from the launchpad bug):

openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
fcntl(3, 0x40f /* F_??? */, 0x3)        = 0
fstat(3, {st_dev=makedev(0, 16), st_ino=4637237, st_mode=S_IFDIR|0755,
st_nlink=12, st_uid=1000, st_gid=1000, st_blksize=8192, st_blocks=8,
st_size=4096, st_atime=1587380917 /*
2020-04-20T11:08:37.756174607+0000 */, st_atime_nsec=756174607,
st_mtime=1587380910 /* 2020-04-20T11:08:30.356230179+0000 */,
st_mtime_nsec=356230179, st_ctime=1587380910 /*
2020-04-20T11:08:30.356230179+0000 */, st_ctime_nsec=356230179}) = 0
fstat(1, {st_dev=makedev(0, 2), st_ino=9017, st_mode=S_IFCHR|0600,
st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0,
st_rdev=makedev(5, 1), st_atime=1587381196 /* 2020-04-20T11:13:16+0000
*/, st_atime_nsec=0, st_mtime=1587381196 /* 2020-04-20T11:13:16+0000
*/, st_mtime_nsec=0, st_ctime=1587381042 /*
2020-04-20T11:10:42.484981152+0000 */, st_ctime_nsec=484981152}) = 0
ioctl(1, TCGETS, {c_iflags=0x2502, c_oflags=0x5, c_cflags=0xcbd,
c_lflags=0x8a3b, c_line=0,
c_cc="\x03\x1c\x7f\x15\x04\x00\x01\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"})
= 0
write(1, "dir=0x76128\n", 12)           = 12
getdents64(3, [{d_ino=1, d_off=273341893525646730, d_reclen=24,
d_type=DT_DIR, d_name=".."}, {d_ino=4637239, d_off=849308795555391993,
d_reclen=24, d_type=DT_DIR, d_name="etc"}, {d_ino=4587984,
d_off=1620709961571101518, d_reclen=24, d_type=DT_LNK, d_name="usr"},
{d_ino=4637238, d_off=2787937917159437645, d_reclen=24, d_type=DT_DIR,
d_name="dev"}, {d_ino=4637244, d_off=3015508490233103491, d_reclen=24,
d_type=DT_DIR, d_name="sys"}, {d_ino=4587608,
d_off=3551089360661460833, d_reclen=24, d_type=DT_LNK, d_name="lib"},
{d_ino=4637246, d_off=3857320197951442970, d_reclen=24, d_type=DT_DIR,
d_name="var"}, {d_ino=4637242, d_off=4103122318823701457, d_reclen=24,
d_type=DT_DIR, d_name="proc"}, {d_ino=4587541,
d_off=4252201186220906002, d_reclen=24, d_type=DT_LNK, d_name="bin"},
{d_ino=4637245, d_off=4386533378951587638, d_reclen=24, d_type=DT_DIR,
d_name="tmp"}, {d_ino=4637241, d_off=4883206313583644962, d_reclen=24,
d_type=DT_DIR, d_name="host"}, {d_ino=4637237,
d_off=4941119754928488586, d_reclen=24, d_type=DT_DIR, d_name="."},
{d_ino=4637243, d_off=5301154723342888169, d_reclen=24, d_type=DT_DIR,
d_name="root"}, {d_ino=4587838, d_off=6989908915879243400,
d_reclen=32, d_type=DT_LNK, d_name="lib64"}, {d_ino=4587679,
d_off=7356513223657690979, d_reclen=32, d_type=DT_REG,
d_name="strace.log"}, {d_ino=4587847, d_off=7810090083157553519,
d_reclen=24, d_type=DT_LNK, d_name="sbin"}, {d_ino=4637240,
d_off=8254997891991845677, d_reclen=24, d_type=DT_DIR, d_name="home"},
{d_ino=4637248, d_off=9223372036854775807, d_reclen=24, d_type=DT_DIR,
d_name="virt"}], 32768) = 448
write(1, "readdir(dir)=(nil)\n", 19)    = 19
write(1, "errno=75: Value too large for de"..., 48) = 48
exit_group(0)                           = ?


We open fd 3 to read '.'; we issue the new fcntl, which
succeeds. Then there's some unrelated stuff operating on
stdout. Then we do a getdents64(), but the d_off values
we get back are still 64 bits. The guest binary doesn't
like those, so it fails. My expectation was that we would
get back d_off values here that were in the 32 bit range.

(To be clear, the guest binary here is doing a getdents64(),
which QEMU translates into a host getdents64().)

thanks
-- PMM


  reply	other threads:[~2020-04-20 11:19 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-31 13:35 [PATCH] fcntl: Add 32bit filesystem mode Linus Walleij
2020-03-31 13:35 ` Linus Walleij
2020-04-20 11:19 ` Peter Maydell [this message]
2020-04-20 11:19   ` Peter Maydell
2020-04-20 11:23   ` Florian Weimer
2020-04-20 11:23     ` Florian Weimer
2020-04-20 11:38     ` Peter Maydell
2020-04-20 11:38       ` Peter Maydell
2020-04-20 14:16       ` Peter Maydell
2020-04-20 14:16         ` Peter Maydell
2020-04-20 23:51       ` Andreas Dilger
2020-04-20 23:51         ` Andreas Dilger
2020-04-21 13:02         ` Peter Maydell
2020-04-21 13:02           ` Peter Maydell
2020-04-20 15:13 ` Theodore Y. Ts'o
2020-04-20 15:13   ` Theodore Y. Ts'o
2020-04-20 15:23   ` Eric Blake
2020-04-20 15:29     ` Peter Maydell
2020-04-20 15:29       ` Peter Maydell
2020-04-20 17:01       ` Theodore Y. Ts'o
2020-04-20 17:01         ` Theodore Y. Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFEAcA9Gep1HN+7WJHencp9g2uUBLhagxdgjHf-16AOdP5oOjg@mail.gmail.com \
    --to=peter.maydell@linaro.org \
    --cc=adilger.kernel@dilger.ca \
    --cc=fw@deneb.enyo.de \
    --cc=linus.walleij@linaro.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=qemu-devel@nongnu.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.