From: Christian Schoenebeck <qemu_oss@crudebyte.com>
To: qemu-devel@nongnu.org
Cc: Keno Fischer <keno@juliacomputing.com>,
Michael Roitzsch <reactorcontrol@icloud.com>,
Will Cohen <wwcohen@gmail.com>, Greg Kurz <groug@kaod.org>,
qemu-stable@nongnu.org, Akihiko Odaki <akihiko.odaki@gmail.com>
Subject: Re: [PATCH v2 2/5] 9pfs: fix qemu_mknodat(S_IFSOCK) on macOS
Date: Sun, 24 Apr 2022 20:45:21 +0200
Message-ID: <3849551.ofAv5PygDX@silver>
In-Reply-To: <eafd4bbf-dbff-323a-179f-8f29905701e1@gmail.com>

On Saturday, 23 April 2022 06:33:50 CEST Akihiko Odaki wrote:
> On 2022/04/22 23:06, Christian Schoenebeck wrote:
> > On Friday, 22 April 2022 04:43:40 CEST Akihiko Odaki wrote:
> >> On 2022/04/22 0:07, Christian Schoenebeck wrote:
> >>> mknod() on macOS does not support creating sockets, so divert to
> >>> call sequence socket(), bind() and chmod() respectively if S_IFSOCK
> >>> was passed with mode argument.
> >>>
> >>> Link: https://lore.kernel.org/qemu-devel/17933734.zYzKuhC07K@silver/
> >>> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> >>> Reviewed-by: Will Cohen <wwcohen@gmail.com>
> >>> ---
> >>>
> >>> hw/9pfs/9p-util-darwin.c | 27 ++++++++++++++++++++++++++-
> >>> 1 file changed, 26 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/hw/9pfs/9p-util-darwin.c b/hw/9pfs/9p-util-darwin.c
> >>> index e24d09763a..39308f2a45 100644
> >>> --- a/hw/9pfs/9p-util-darwin.c
> >>> +++ b/hw/9pfs/9p-util-darwin.c
> >>> @@ -74,6 +74,27 @@ int fsetxattrat_nofollow(int dirfd, const char
> >>> *filename, const char *name,>
> >>>
> >>> */
> >>>
> >>> #if defined CONFIG_PTHREAD_FCHDIR_NP
> >>>
> >>> +static int create_socket_file_at_cwd(const char *filename, mode_t mode)
> >>> {
> >>> + int fd, err;
> >>> + struct sockaddr_un addr = {
> >>> + .sun_family = AF_UNIX
> >>> + };
> >>> +
> >>> + fd = socket(PF_UNIX, SOCK_DGRAM, 0);
> >>> + if (fd == -1) {
> >>> + return fd;
> >>> + }
> >>> + snprintf(addr.sun_path, sizeof(addr.sun_path), "./%s", filename);
> >>
> >> It would result in an incorrect path if the path does not fit in
> >> addr.sun_path. It should report an explicit error instead.
> >
> > Looking at its header file, 'sun_path' is indeed defined on macOS with an
> > oddly small size of only 104 bytes. So yes, I should explicitly handle
> > that error case.
> >
> > I'll post a v3.
> >
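
For illustration, the explicit error handling could look roughly like the
following. This is only a sketch, not the v3 code: the helper name
fill_sun_path() is hypothetical, and ENAMETOOLONG is an assumed choice of
errno for the "does not fit" case.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper (not part of the patch): build "./<filename>" in
 * addr->sun_path and fail with ENAMETOOLONG instead of silently truncating.
 */
static int fill_sun_path(struct sockaddr_un *addr, const char *filename)
{
    int len = snprintf(addr->sun_path, sizeof(addr->sun_path),
                       "./%s", filename);

    /*
     * snprintf() returns the length the full string would have had,
     * excluding the terminating NUL, so a value >= buffer size means
     * the result was truncated.
     */
    if (len < 0 || (size_t) len >= sizeof(addr->sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return 0;
}
```

Doing this check before socket() would also avoid having to close a file
descriptor on that error path.
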
> >>> + err = bind(fd, (struct sockaddr *) &addr, sizeof(addr));
> >>> + if (err == -1) {
> >>> + goto out;
> >>
> >> You may close(fd) as soon as bind() returns (before checking the
> >> returned value) and eliminate goto.
> >
> > Yeah, I thought about that alternative, but found it a bit ugly, and
> > probably also counter-productive in case this function gets extended
> > with more error paths in the future. Not that I would insist on the
> > current solution though.
>
> I'm happy with the explanation. Thanks.
>
> >>> + }
> >>> + err = chmod(addr.sun_path, mode);
> >>
> >> I'm not sure if it is fine to have a time window between bind() and
> >> chmod(). Do you have some rationale?
> >
> > Good question. QEMU's 9p server is multi-threaded; all 9p requests come in
> > serialized and the 9p server controller portion (9p.c) is only running on
> > QEMU main thread, but the actual filesystem driver calls are then
> > dispatched to QEMU worker threads and therefore running concurrently at
> > this point:
> >
> > https://wiki.qemu.org/Documentation/9p#Threads_and_Coroutines
> >
> > Similar situation on Linux 9p client side: it handles access to a mounted
> > 9p filesystem concurrently, requests are then serialized by 9p driver on
> > Linux and sent over wire to 9p server (host).
> >
> > So yes, that short time window might have implications. But could
> > that be exploited on macOS hosts in practice?
> >
> > The socket file would have mode srwxr-xr-x for a short moment.
> >
> > For security_model=mapped* this should not be a problem.
> >
> > For security_model=none|passthrough, in theory, maybe? But how likely is
> > that? If you are using a Linux client for instance, trying to brute-force
> > opening the socket file, the client would send several 9p commands
> > (Twalk, Tgetattr, Topen, probably more). The time window of the two
> > commands above should be much smaller than that and I would expect one of
> > the 9p commands to error out in between.
> >
> > What would be a viable approach to avoid this issue on macOS?
>
> A naive brute-force approach is unlikely to succeed in exploiting this.
> The more concerning scenario is that the attacker uses knowledge of the
> underlying macOS implementation to cause resource contention and widen
> the window. Whether exploitation is viable depends on how much time you
> spend digging into XNU.
>
> However, I'm also not sure if it really *has* a race condition. Looking
> at v9fs_co_mknod(), it sequentially calls s->ops->mknod() and
> s->ops->lstat(). It also results in an entity called "path name based
> fid" in the code, which inherently cannot identify a file when it is
> renamed or recreated.
>
> If there is some rationale for why that is safe, it may also apply to the
> sequence of bind() and chmod(). Can anyone explain the sequence of
> s->ops->mknod() and s->ops->lstat() or path name based fid in general?

You are talking about the 9p server's controller level: unfortunately, I
don't see anything that would prevent a concurrent open() during this
bind() ... chmod() time window.
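
To make that window concrete, here is a small standalone sketch (not QEMU
code) that records the permission bits of the socket file right after bind()
and again after chmod(). The helper name bind_then_chmod() and its output
parameters are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical demo helper: create a socket file at 'path' via bind(), then
 * chmod() it to 'mode'. The permission bits observed right after bind() and
 * after chmod() are stored in *transient and *final, respectively.
 * Returns 0 on success, -1 on error.
 */
static int bind_then_chmod(const char *path, mode_t mode,
                           mode_t *transient, mode_t *final)
{
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    struct stat st;
    int fd, len, err = -1;

    len = snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", path);
    if (len < 0 || (size_t) len >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(PF_UNIX, SOCK_DGRAM, 0);
    if (fd == -1) {
        return -1;
    }
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        goto out;
    }
    /* Window opens: the file exists, its mode is governed by the umask. */
    if (lstat(path, &st) == -1) {
        goto out;
    }
    *transient = st.st_mode & 07777;
    if (chmod(path, mode & 07777) == -1) {
        goto out;
    }
    /* Window closes: the requested permissions are now in place. */
    if (lstat(path, &st) == -1) {
        goto out;
    }
    *final = st.st_mode & 07777;
    err = 0;
out:
    close(fd);
    return err;
}
```

Any open() that lands between the first lstat() and the chmod() in this
sketch would be checked against the transient permissions rather than the
requested ones.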

The argument 'fidp' passed to v9fs_co_mknod() refers to the directory in
which the new device file shall be created. So 'fidp' is not the device file
here, nor is 'fidp' modified by this function.

v9fs_co_mknod() is entered by the 9p server on the QEMU main thread. At the
beginning of the function it first acquires a read lock on a (per 9p export)
global coroutine mutex:

    v9fs_path_read_lock(s);

and holds this lock until returning from v9fs_co_mknod(). But that's just a
read lock. v9fs_co_open() also only takes a read lock, so the two can run
concurrently.

Then v9fs_co_run_in_worker({...}) is called to dispatch the entire code
block (think of it as an Obj-C "block") inside this construct (actually a
macro) to a QEMU worker thread. So an arbitrary background thread would then
call the fs driver functions:

    s->ops->mknod()
    v9fs_name_to_path()
    s->ops->lstat()

and then at the end of the code block the background thread would dispatch
back to the QEMU main thread. So by the time we reach:

    v9fs_path_unlock(s);

we are already back on the QEMU main thread, hence the unlock happens on the
main thread, and we finally leave v9fs_co_mknod().

The important thing to understand is that while the

    v9fs_co_run_in_worker({...})

code block is executed on a QEMU worker thread, the QEMU main thread (9p
server controller portion, i.e. 9p.c) is *not* sleeping; rather, it
continues to process other client requests (if any) in the meantime. In
other words, v9fs_co_run_in_worker() behaves neither exactly like Apple's
GCD dispatch_async() nor like dispatch_sync(), as GCD is not coroutine based.

So the 9p server might pull a pending 'Topen' client request from the input
FIFO in the meantime and likewise dispatch it to a worker thread, etc. Hence
a concurrent open() is possible in theory, but I find it quite unlikely to
succeed in practice: the open() call on the guest is translated by the Linux
client into a series of synchronous 9p requests on the path passed to open(),
and a round trip for each 9p message takes something on the order of ~0.3 ms.
That is quite large compared to the time window I would expect between
bind() ... chmod().

Does this answer your questions?

> Regards,
> Akihiko Odaki
>
> >>> +out:
> >>> + close(fd);
> >>> + return err;
> >>> +}
> >>> +
> >>>
> >>> int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t
> >>> dev)
> >>> {
> >>>
> >>> int preserved_errno, err;
> >>>
> >>> @@ -93,7 +114,11 @@ int qemu_mknodat(int dirfd, const char *filename,
> >>> mode_t mode, dev_t dev)>
> >>>
> >>> if (pthread_fchdir_np(dirfd) < 0) {
> >>>
> >>> return -1;
> >>>
> >>> }
> >>>
> >>> - err = mknod(filename, mode, dev);
> >>> + if (S_ISSOCK(mode)) {
> >>> + err = create_socket_file_at_cwd(filename, mode);
> >>> + } else {
> >>> + err = mknod(filename, mode, dev);
> >>> + }
> >>>
> >>> preserved_errno = errno;
> >>> /* Stop using the thread-local cwd */
> >>> pthread_fchdir_np(-1);