* [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows
@ 2023-02-20 10:07 Bin Meng
2023-02-20 10:08 ` [PATCH v5 01/16] hw/9pfs: Add missing definitions " Bin Meng
` (17 more replies)
0 siblings, 18 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:07 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel
At present there is no Windows support for the 9p file system.
This series adds initial Windows support for the 9p file system.
The 'local' file system backend driver is supported on Windows,
including open, read, write, close, rename, remove, etc.
All security models are supported. The mapped (mapped-xattr)
security model is implemented using NTFS Alternate Data Streams
(ADS), so the 9p export path must be on an NTFS partition.
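
To illustrate the naming scheme (a sketch only, not code taken from this
series): an NTFS alternate data stream is addressed as "<file>:<stream>",
so an attribute stored for C:\export\a.txt lives in
C:\export\a.txt:<attribute name>. The helper name ads_path() and the
attribute name in the usage note are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "<file>:<stream>" into buf.
 * Returns 0 on success, or -1 if the result (including the
 * terminating NUL) does not fit into buflen bytes.
 */
int ads_path(char *buf, size_t buflen, const char *file, const char *stream)
{
    int n;

    assert(buf != NULL && file != NULL && stream != NULL);

    /* snprintf() never writes more than buflen bytes */
    n = snprintf(buf, buflen, "%s:%s", file, stream);
    if (n < 0 || (size_t)n >= buflen) {
        return -1;
    }
    return 0;
}
```

For instance, ads_path(buf, sizeof(buf), "C:\\export\\a.txt", "user.demo")
fills buf with C:\export\a.txt:user.demo. Using snprintf() keeps the bounds
check and the copy in a single step.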
The 'synth' driver is adapted for Windows too, so that we can now
run qtests on Windows for 9p related regression testing.
Example command line to test:
"-fsdev local,path=c:\msys64,security_model=mapped,id=p9 -device virtio-9p-pci,fsdev=p9,mount_tag=p9fs"
Changes in v5:
- rework Windows specific xxxdir() APIs implementation
Bin Meng (2):
hw/9pfs: Update helper qemu_stat_rdev()
hw/9pfs: Add a helper qemu_stat_blksize()
Guohuai Shi (14):
hw/9pfs: Add missing definitions for Windows
hw/9pfs: Implement Windows specific utilities functions for 9pfs
hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper
hw/9pfs: Implement Windows specific xxxdir() APIs
hw/9pfs: Update the local fs driver to support Windows
hw/9pfs: Support getting current directory offset for Windows
hw/9pfs: Disable unsupported flags and features for Windows
hw/9pfs: Update v9fs_set_fd_limit() for Windows
hw/9pfs: Add Linux error number definition
hw/9pfs: Translate Windows errno to Linux value
fsdev: Disable proxy fs driver on Windows
hw/9pfs: Update synth fs driver for Windows
tests/qtest: virtio-9p-test: Adapt the case for win32
meson.build: Turn on virtfs for Windows
meson.build | 10 +-
fsdev/file-op-9p.h | 33 +
hw/9pfs/9p-linux-errno.h | 151 +++
hw/9pfs/9p-local.h | 8 +
hw/9pfs/9p-util.h | 139 ++-
hw/9pfs/9p.h | 43 +
tests/qtest/libqos/virtio-9p-client.h | 7 +
fsdev/qemu-fsdev.c | 2 +
hw/9pfs/9p-local.c | 269 ++++-
hw/9pfs/9p-synth.c | 5 +-
hw/9pfs/9p-util-win32.c | 1452 +++++++++++++++++++++++++
hw/9pfs/9p.c | 90 +-
hw/9pfs/codir.c | 2 +-
fsdev/meson.build | 1 +
hw/9pfs/meson.build | 8 +-
15 files changed, 2155 insertions(+), 65 deletions(-)
create mode 100644 hw/9pfs/9p-linux-errno.h
create mode 100644 hw/9pfs/9p-util-win32.c
--
2.25.1
* [PATCH v5 01/16] hw/9pfs: Add missing definitions for Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 02/16] hw/9pfs: Implement Windows specific utilities functions for 9pfs Bin Meng
` (16 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
Some definitions currently used by the 9pfs code are only available
on POSIX platforms. Let's add our own ones in preparation for adding
9pfs support for Windows.
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
fsdev/file-op-9p.h | 33 +++++++++++++++++++++++++++++++++
hw/9pfs/9p.h | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 76 insertions(+)
diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 4997677460..7d9a736b66 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -27,6 +27,39 @@
# include <sys/mount.h>
#endif
+#ifdef CONFIG_WIN32
+
+/* POSIX structure not defined in Windows */
+
+typedef uint32_t uid_t;
+typedef uint32_t gid_t;
+
+/* from http://man7.org/linux/man-pages/man2/statfs.2.html */
+typedef uint32_t __fsword_t;
+typedef uint32_t fsblkcnt_t;
+typedef uint32_t fsfilcnt_t;
+
+/* from linux/include/uapi/asm-generic/posix_types.h */
+typedef struct {
+ long __val[2];
+} fsid_t;
+
+struct statfs {
+ __fsword_t f_type;
+ __fsword_t f_bsize;
+ fsblkcnt_t f_blocks;
+ fsblkcnt_t f_bfree;
+ fsblkcnt_t f_bavail;
+ fsfilcnt_t f_files;
+ fsfilcnt_t f_ffree;
+ fsid_t f_fsid;
+ __fsword_t f_namelen;
+ __fsword_t f_frsize;
+ __fsword_t f_flags;
+};
+
+#endif /* CONFIG_WIN32 */
+
#define SM_LOCAL_MODE_BITS 0600
#define SM_LOCAL_DIR_MODE_BITS 0700
diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 2fce4140d1..ada9f14ebc 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -3,13 +3,56 @@
#include <dirent.h>
#include <utime.h>
+#ifndef CONFIG_WIN32
#include <sys/resource.h>
+#endif
#include "fsdev/file-op-9p.h"
#include "fsdev/9p-iov-marshal.h"
#include "qemu/thread.h"
#include "qemu/coroutine.h"
#include "qemu/qht.h"
+#ifdef CONFIG_WIN32
+
+/* Windows does not provide such a macro, typically it is 255 */
+#define NAME_MAX 255
+
+/* macros required for build, values do not matter */
+#define AT_SYMLINK_NOFOLLOW 0x100 /* Do not follow symbolic links */
+#define AT_REMOVEDIR 0x200 /* Remove directory instead of file */
+#define O_DIRECTORY 02000000
+
+#define makedev(major, minor) \
+ ((dev_t)((((major) & 0xfff) << 8) | ((minor) & 0xff)))
+#define major(dev) ((unsigned int)(((dev) >> 8) & 0xfff))
+#define minor(dev) ((unsigned int)(((dev) & 0xff)))
+
+/*
+ * Currently Windows/MinGW does not provide the following flag macros,
+ * so define them here for the 9p code.
+ *
+ * Once Windows/MinGW provides them, remove the defines to prevent conflicts.
+ */
+
+#ifndef S_IFLNK
+#define S_IFLNK 0xA000
+#define S_ISLNK(mode) (((mode) & S_IFMT) == S_IFLNK)
+#endif /* S_IFLNK */
+
+#ifndef S_ISUID
+#define S_ISUID 0x0800
+#endif
+
+#ifndef S_ISGID
+#define S_ISGID 0x0400
+#endif
+
+#ifndef S_ISVTX
+#define S_ISVTX 0x0200
+#endif
+
+#endif /* CONFIG_WIN32 */
+
enum {
P9_TLERROR = 6,
P9_RLERROR,
--
2.25.1
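
A note on the fallback makedev()/major()/minor() macros in this patch: they
keep 12 bits of major and 8 bits of minor, so larger numbers are silently
truncated. The sketch below uses the same bit layout under illustrative
names (P9_MAKEDEV and friends, and p9_dev_roundtrips(), are not part of the
patch) to check whether a pair survives encoding and decoding.

```c
#include <stdint.h>

/* Same bit layout as the fallback macros above, under illustrative names */
#define P9_MAKEDEV(major, minor) \
    ((uint32_t)((((major) & 0xfff) << 8) | ((minor) & 0xff)))
#define P9_MAJOR(dev) ((unsigned int)(((dev) >> 8) & 0xfff))
#define P9_MINOR(dev) ((unsigned int)((dev) & 0xff))

/* Returns 1 if (major, minor) is unchanged after encode + decode, else 0 */
int p9_dev_roundtrips(unsigned int major, unsigned int minor)
{
    uint32_t dev = P9_MAKEDEV(major, minor);

    return P9_MAJOR(dev) == major && P9_MINOR(dev) == minor;
}
```

With this layout (8, 1) encodes to 0x801 and round-trips, while a minor of
256 or a major of 0x1000 does not.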
* [PATCH v5 02/16] hw/9pfs: Implement Windows specific utilities functions for 9pfs
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
2023-02-20 10:08 ` [PATCH v5 01/16] hw/9pfs: Add missing definitions " Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 03/16] hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper Bin Meng
` (15 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
The Windows POSIX API and the MinGW library do not provide the NO_FOLLOW
flag, and do not allow opening a directory by POSIX open(). As a
result, the xxx_at() functions cannot work directly. However, we
can provide Windows handle based functions to emulate the xxx_at()
functions (e.g.: openat_win32, utimensat_win32, etc.).
NTFS ADS (Alternate Data Streams) is used to emulate 9pfs extended
attributes on Windows. Symbolic links are only supported when the
security model is "mapped-xattr" or "mapped-file".
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
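
For reviewers unfamiliar with ADS enumeration: as the comments in
copy_ads_name() describe, stream names are reported in the form
":name:$DATA". The following is a minimal, portable sketch of extracting
the name part. It is not the code from this patch, and stream_name() is a
hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the name part of ":name:$DATA" into out.
 * Returns the number of bytes written including the NUL, 0 for the
 * unnamed default stream ("::$DATA"), or -1 on malformed input or
 * when out is too small.
 */
long stream_name(char *out, size_t outlen, const char *full)
{
    const char *start, *end;
    size_t len;

    assert(out != NULL && full != NULL);

    if (full[0] != ':') {
        return -1;
    }
    start = full + 1;
    end = strchr(start, ':');
    if (end == NULL) {
        return -1;
    }

    len = (size_t)(end - start);
    if (len == 0) {
        return 0;               /* default stream, nothing to copy */
    }
    if (len + 1 > outlen) {
        return -1;              /* no room for name plus NUL */
    }

    memcpy(out, start, len);
    out[len] = '\0';
    return (long)(len + 1);
}
```

Returning the byte count including the NUL lets a caller pack several names
back to back into one buffer, which is the layout a listxattr()-style
interface expects.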
hw/9pfs/9p-local.h | 7 +
hw/9pfs/9p-util.h | 32 +-
hw/9pfs/9p-local.c | 4 -
hw/9pfs/9p-util-win32.c | 979 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 1017 insertions(+), 5 deletions(-)
create mode 100644 hw/9pfs/9p-util-win32.c
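
The pseudo inode number built in fstatat_win32() depends on the width of
ino_t. A portable sketch of that folding logic follows, with the width
passed in explicitly so all three cases can be exercised on any host. The
function name fold_file_index() is an assumption made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Fold the two 32-bit halves of a Windows file index into an inode
 * number that fits into an ino_t of `width` bytes.
 */
uint64_t fold_file_index(uint32_t low, uint32_t high, size_t width)
{
    if (width == sizeof(uint64_t)) {
        /* full 64-bit file ID */
        return (uint64_t)low | ((uint64_t)high << 32);
    }
    if (width == sizeof(uint16_t)) {
        /* XOR the two 16-bit halves of the low word */
        return (uint16_t)(low ^ (low >> 16));
    }
    /* otherwise keep the low 32 bits only */
    return low;
}
```

Note that the narrower variants are lossy, so two different files may end
up with the same pseudo inode number.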
diff --git a/hw/9pfs/9p-local.h b/hw/9pfs/9p-local.h
index 32c72749d9..77e7f57f89 100644
--- a/hw/9pfs/9p-local.h
+++ b/hw/9pfs/9p-local.h
@@ -13,6 +13,13 @@
#ifndef QEMU_9P_LOCAL_H
#define QEMU_9P_LOCAL_H
+typedef struct {
+ int mountfd;
+#ifdef CONFIG_WIN32
+ char *root_path;
+#endif
+} LocalData;
+
int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags,
mode_t mode);
int local_opendir_nofollow(FsContext *fs_ctx, const char *path);
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index c314cf381d..90420a7578 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -88,18 +88,46 @@ static inline int errno_to_dotl(int err) {
return err;
}
-#ifdef CONFIG_DARWIN
+#if defined(CONFIG_DARWIN)
#define qemu_fgetxattr(...) fgetxattr(__VA_ARGS__, 0, 0)
+#elif defined(CONFIG_WIN32)
+#define qemu_fgetxattr fgetxattr_win32
#else
#define qemu_fgetxattr fgetxattr
#endif
+#ifdef CONFIG_WIN32
+#define qemu_openat openat_win32
+#define qemu_fstatat fstatat_win32
+#define qemu_mkdirat mkdirat_win32
+#define qemu_renameat renameat_win32
+#define qemu_utimensat utimensat_win32
+#define qemu_unlinkat unlinkat_win32
+#else
#define qemu_openat openat
#define qemu_fstatat fstatat
#define qemu_mkdirat mkdirat
#define qemu_renameat renameat
#define qemu_utimensat utimensat
#define qemu_unlinkat unlinkat
+#endif
+
+#ifdef CONFIG_WIN32
+char *get_full_path_win32(HANDLE hDir, const char *name);
+ssize_t fgetxattr_win32(int fd, const char *name, void *value, size_t size);
+int openat_win32(int dirfd, const char *pathname, int flags, mode_t mode);
+int fstatat_win32(int dirfd, const char *pathname,
+ struct stat *statbuf, int flags);
+int mkdirat_win32(int dirfd, const char *pathname, mode_t mode);
+int renameat_win32(int olddirfd, const char *oldpath,
+ int newdirfd, const char *newpath);
+int utimensat_win32(int dirfd, const char *pathname,
+ const struct timespec times[2], int flags);
+int unlinkat_win32(int dirfd, const char *pathname, int flags);
+int statfs_win32(const char *root_path, struct statfs *stbuf);
+int openat_dir(int dirfd, const char *name);
+int openat_file(int dirfd, const char *name, int flags, mode_t mode);
+#endif
static inline void close_preserve_errno(int fd)
{
@@ -108,6 +136,7 @@ static inline void close_preserve_errno(int fd)
errno = serrno;
}
+#ifndef CONFIG_WIN32
static inline int openat_dir(int dirfd, const char *name)
{
return qemu_openat(dirfd, name,
@@ -154,6 +183,7 @@ again:
errno = serrno;
return fd;
}
+#endif
ssize_t fgetxattrat_nofollow(int dirfd, const char *path, const char *name,
void *value, size_t size);
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 9d07620235..b6102c9e5a 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -53,10 +53,6 @@
#define BTRFS_SUPER_MAGIC 0x9123683E
#endif
-typedef struct {
- int mountfd;
-} LocalData;
-
int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags,
mode_t mode)
{
diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c
new file mode 100644
index 0000000000..a99d579a06
--- /dev/null
+++ b/hw/9pfs/9p-util-win32.c
@@ -0,0 +1,979 @@
+/*
+ * 9p utilities (Windows Implementation)
+ *
+ * Copyright (c) 2022 Wind River Systems, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+/*
+ * This file contains Windows only functions for 9pfs.
+ *
+ * For 9pfs Windows host, the following features are different from Linux host:
+ *
+ * 1. The Windows POSIX API does not provide the NO_FOLLOW flag, so MinGW
+ * cannot detect whether a path is a symbolic link. Windows also does not
+ * provide a POSIX compatible readlink(). Supporting symbolic links in 9pfs
+ * on Windows may cause security issues, so symbolic link support is
+ * disabled completely for security model "none" or "passthrough".
+ *
+ * 2. Windows file system does not support extended attributes directly. 9pfs
+ * for Windows uses NTFS ADS (Alternate Data Streams) to emulate extended
+ * attributes.
+ *
+ * 3. statfs() is not available on Windows. qemu_statfs() is used to emulate it.
+ *
+ * 4. On Windows trying to open a directory with the open() API will fail.
+ * This is because Windows does not allow opening directory in normal usage.
+ *
+ * As a result of this, all xxx_at() functions won't work directly on
+ * Windows, e.g.: openat(), unlinkat(), etc.
+ *
+ * As xxx_at() can prevent the parent directory from being modified on a
+ * Linux host, to support this and prevent security issues, all xxx_at()
+ * APIs are replaced by xxxat_win32() functions, e.g. openat_win32().
+ *
+ * Windows does not support opendir(), so the directory fd is created by
+ * CreateFile() and converted to an fd by _open_osfhandle(). Keeping the fd
+ * open locks and protects the directory (it cannot be modified or replaced).
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "9p.h"
+#include "9p-util.h"
+#include "9p-local.h"
+
+#include <windows.h>
+#include <dirent.h>
+
+#define V9FS_MAGIC 0x53465039 /* string "9PFS" */
+
+/*
+ * win32_error_to_posix - convert Win32 error to POSIX error number
+ *
+ * This function converts Win32 error to POSIX error number.
+ * e.g. ERROR_FILE_NOT_FOUND and ERROR_PATH_NOT_FOUND will be translated to
+ * ENOENT.
+ */
+static int win32_error_to_posix(DWORD win32err)
+{
+ switch (win32err) {
+ case ERROR_FILE_NOT_FOUND: return ENOENT;
+ case ERROR_PATH_NOT_FOUND: return ENOENT;
+ case ERROR_INVALID_DRIVE: return ENODEV;
+ case ERROR_TOO_MANY_OPEN_FILES: return EMFILE;
+ case ERROR_ACCESS_DENIED: return EACCES;
+ case ERROR_INVALID_HANDLE: return EBADF;
+ case ERROR_NOT_ENOUGH_MEMORY: return ENOMEM;
+ case ERROR_FILE_EXISTS: return EEXIST;
+ case ERROR_DISK_FULL: return ENOSPC;
+ }
+ return EIO;
+}
+
+/*
+ * build_ads_name - construct Windows ADS name
+ *
+ * This function constructs Windows NTFS ADS (Alternate Data Streams) name
+ * to <namebuf>.
+ */
+static int build_ads_name(char *namebuf, size_t namebuf_len,
+ const char *filename, const char *ads_name)
+{
+ size_t total_size;
+
+ total_size = strlen(filename) + strlen(ads_name) + 2;
+ if (total_size > namebuf_len) {
+ return -1;
+ }
+
+ /*
+ * NTFS ADS (Alternate Data Streams) name format: filename:ads_name
+ * e.g.: D:\1.txt:my_ads_name
+ */
+
+ strcpy(namebuf, filename);
+ strcat(namebuf, ":");
+ strcat(namebuf, ads_name);
+
+ return 0;
+}
+
+/*
+ * copy_ads_name - copy ADS name from buffer returned by FindNextStreamW()
+ *
+ * This function removes string "$DATA" in ADS name string returned by
+ * FindNextStreamW(), and copies the real ADS name to <namebuf>.
+ */
+static ssize_t copy_ads_name(char *namebuf, size_t namebuf_len,
+ char *full_ads_name)
+{
+ char *p1, *p2;
+
+ /*
+ * NTFS ADS (Alternate Data Streams) name from enumerate data format:
+ * :ads_name:$DATA, e.g.: :my_ads_name:$DATA
+ *
+ * ADS name from FindNextStreamW() always has ":$DATA" string at the end.
+ *
+ * This function copies ADS name to namebuf.
+ */
+
+ p1 = strchr(full_ads_name, ':');
+ if (p1 == NULL) {
+ return -1;
+ }
+
+ p2 = strchr(p1 + 1, ':');
+ if (p2 == NULL) {
+ return -1;
+ }
+
+ /* skip empty ads name */
+ if (p2 - p1 == 1) {
+ return 0;
+ }
+
+ if (p2 - p1 + 1 > namebuf_len) {
+ return -1;
+ }
+
+ memcpy(namebuf, p1 + 1, p2 - p1 - 1);
+ namebuf[p2 - p1 - 1] = '\0';
+
+ return p2 - p1;
+}
+
+/*
+ * get_full_path_win32 - get full file name based on a handle
+ *
+ * This function gets the full file name based on the handle <hDir> of a
+ * file or directory. If <name> is not NULL, it is appended as a sub-path.
+ *
+ * The caller needs to free the file name string after use.
+ */
+char *get_full_path_win32(HANDLE hDir, const char *name)
+{
+ g_autofree char *full_file_name = NULL;
+ DWORD total_size;
+ DWORD name_size;
+
+ if (hDir == INVALID_HANDLE_VALUE) {
+ return NULL;
+ }
+
+ full_file_name = g_malloc0(NAME_MAX);
+
+ /* get parent directory full file name */
+ name_size = GetFinalPathNameByHandle(hDir, full_file_name,
+ NAME_MAX - 1, FILE_NAME_NORMALIZED);
+ if (name_size == 0 || name_size > NAME_MAX - 1) {
+ return NULL;
+ }
+
+ /* full path returned is the "\\?\" syntax, remove the lead string */
+ memmove(full_file_name, full_file_name + 4, NAME_MAX - 4);
+
+ if (name != NULL) {
+ total_size = strlen(full_file_name) + strlen(name) + 2;
+
+ if (total_size > NAME_MAX) {
+ return NULL;
+ }
+
+ /* build sub-directory file name */
+ strcat(full_file_name, "\\");
+ strcat(full_file_name, name);
+ }
+
+ return g_steal_pointer(&full_file_name);
+}
+
+/*
+ * fgetxattr_win32 - get extended attribute by fd
+ *
+ * This function gets extended attribute by <fd>. <fd> will be translated to
+ * Windows handle.
+ *
+ * This function emulates extended attribute by NTFS ADS.
+ */
+ssize_t fgetxattr_win32(int fd, const char *name, void *value, size_t size)
+{
+    g_autofree char *full_file_name = NULL;
+    char ads_file_name[NAME_MAX + 1] = {0};
+    DWORD dwBytesRead;
+    HANDLE hStream;
+    HANDLE hFile;
+
+    hFile = (HANDLE)_get_osfhandle(fd);
+
+    full_file_name = get_full_path_win32(hFile, NULL);
+    if (full_file_name == NULL) {
+        errno = EIO;
+        return -1;
+    }
+
+    if (build_ads_name(ads_file_name, NAME_MAX, full_file_name, name) < 0) {
+        errno = EIO;
+        return -1;
+    }
+
+    hStream = CreateFile(ads_file_name, GENERIC_READ, FILE_SHARE_READ, NULL,
+                         OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+    if (hStream == INVALID_HANDLE_VALUE) {
+        /* a missing stream means the attribute does not exist */
+        errno = (GetLastError() == ERROR_FILE_NOT_FOUND) ? ENODATA : EIO;
+        return -1;
+    }
+
+    if (ReadFile(hStream, value, size, &dwBytesRead, NULL) == FALSE) {
+        errno = EIO;
+        CloseHandle(hStream);
+        return -1;
+    }
+
+    CloseHandle(hStream);
+
+    return dwBytesRead;
+}
+
+/*
+ * openat_win32 - emulate openat()
+ *
+ * This function emulates openat().
+ *
+ * This function needs a handle to get the full file name, so it has to
+ * convert the fd to a handle with _get_osfhandle().
+ *
+ * For symbolic link access:
+ * 1. Parent directory handle <dirfd> should not be a symbolic link because
+ * it is opened by openat_dir(), which refuses to open a link to a
+ * directory.
+ * 2. Link flag in <mode> is not set because Windows does not have this flag.
+ * Creating a new symbolic link will be denied.
+ * 3. This function checks file symbolic link attribute after open.
+ *
+ * So native symbolic link will not be accessed by 9p client.
+ */
+int openat_win32(int dirfd, const char *pathname, int flags, mode_t mode)
+{
+    g_autofree char *full_file_name1 = NULL;
+    g_autofree char *full_file_name2 = NULL;
+    HANDLE hFile;
+    HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+    int fd;
+
+    full_file_name1 = get_full_path_win32(hDir, pathname);
+    if (full_file_name1 == NULL) {
+        return -1;
+    }
+
+    fd = open(full_file_name1, flags, mode);
+    if (fd >= 0) {
+        DWORD attribute;
+        hFile = (HANDLE)_get_osfhandle(fd);
+        full_file_name2 = get_full_path_win32(hFile, NULL);
+        attribute = GetFileAttributes(full_file_name2);
+
+        /* check if it is a symbolic link */
+        if ((attribute == INVALID_FILE_ATTRIBUTES)
+            || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
+            close(fd);
+            errno = EACCES;
+            fd = -1;
+        }
+    }
+
+    return fd;
+}
+
+/*
+ * fstatat_win32 - emulate fstatat()
+ *
+ * This function emulates fstatat().
+ *
+ * Access to a symbolic link will be denied to prevent security issues.
+ */
+int fstatat_win32(int dirfd, const char *pathname,
+ struct stat *statbuf, int flags)
+{
+ g_autofree char *full_file_name = NULL;
+ HANDLE hFile = INVALID_HANDLE_VALUE;
+ HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+ BY_HANDLE_FILE_INFORMATION file_info;
+ DWORD attribute;
+ int err = 0;
+ int ret = -1;
+ ino_t st_ino;
+ int is_symlink = 0;
+
+ full_file_name = get_full_path_win32(hDir, pathname);
+ if (full_file_name == NULL) {
+ return ret;
+ }
+
+ /* open file to lock it */
+ hFile = CreateFile(full_file_name, GENERIC_READ,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS
+ | FILE_FLAG_OPEN_REPARSE_POINT,
+ NULL);
+
+ if (hFile == INVALID_HANDLE_VALUE) {
+ err = win32_error_to_posix(GetLastError());
+ goto out;
+ }
+
+ attribute = GetFileAttributes(full_file_name);
+
+ if (attribute == INVALID_FILE_ATTRIBUTES) {
+ err = EACCES;
+ goto out;
+ }
+
+ /* check if it is a symbolic link */
+ if ((attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
+ is_symlink = 1;
+ }
+
+ ret = stat(full_file_name, statbuf);
+
+ if (GetFileInformationByHandle(hFile, &file_info) == 0) {
+ err = win32_error_to_posix(GetLastError());
+ goto out;
+ }
+
+ /*
+ * Windows (NTFS) file ID is a 64-bit ID:
+ * 16-bit sequence ID + 48 bit segment number
+ *
+ * But currently, ino_t defined in Windows header file is only 16-bit,
+ * and it is not patched by MinGW. So we build a pseudo inode number
+ * by the low 32-bit segment number when ino_t is only 16-bit.
+ */
+ if (sizeof(st_ino) == sizeof(uint64_t)) {
+ st_ino = (ino_t)((uint64_t)file_info.nFileIndexLow
+ | (((uint64_t)file_info.nFileIndexHigh) << 32));
+ } else if (sizeof(st_ino) == sizeof(uint16_t)) {
+ st_ino = (ino_t)(((uint16_t)file_info.nFileIndexLow)
+ ^ ((uint16_t)(file_info.nFileIndexLow >> 16)));
+ } else {
+ st_ino = (ino_t)file_info.nFileIndexLow;
+ }
+
+ statbuf->st_ino = st_ino;
+
+ if (is_symlink == 1) {
+ /* force to set mode to 0, to prevent symlink access */
+ statbuf->st_mode = 0;
+
+ /* hide information */
+ statbuf->st_atime = 0;
+ statbuf->st_mtime = 0;
+ statbuf->st_ctime = 0;
+ statbuf->st_size = 0;
+ }
+
+out:
+ if (hFile != INVALID_HANDLE_VALUE) {
+ CloseHandle(hFile);
+ }
+
+ if (err != 0) {
+ errno = err;
+ }
+ return ret;
+}
+
+/*
+ * mkdirat_win32 - emulate mkdirat()
+ *
+ * This function emulates mkdirat().
+ *
+ * This function needs a handle to get the full file name, so it has to
+ * convert the fd to a handle with _get_osfhandle().
+ */
+int mkdirat_win32(int dirfd, const char *pathname, mode_t mode)
+{
+ g_autofree char *full_file_name = NULL;
+ int ret = -1;
+ HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+
+ full_file_name = get_full_path_win32(hDir, pathname);
+ if (full_file_name == NULL) {
+ return ret;
+ }
+
+ ret = mkdir(full_file_name);
+
+ return ret;
+}
+
+/*
+ * renameat_win32 - emulate renameat()
+ *
+ * This function emulates renameat().
+ *
+ * This function needs a handle to get the full file name, so it has to
+ * convert the fd to a handle with _get_osfhandle().
+ *
+ * Access to a symbolic link will be denied to prevent security issues.
+ */
+int renameat_win32(int olddirfd, const char *oldpath,
+                   int newdirfd, const char *newpath)
+{
+    g_autofree char *full_old_name = NULL;
+    g_autofree char *full_new_name = NULL;
+    HANDLE hFile;
+    HANDLE hOldDir = (HANDLE)_get_osfhandle(olddirfd);
+    HANDLE hNewDir = (HANDLE)_get_osfhandle(newdirfd);
+    DWORD attribute;
+    int err = 0;
+    int ret = -1;
+
+    full_old_name = get_full_path_win32(hOldDir, oldpath);
+    full_new_name = get_full_path_win32(hNewDir, newpath);
+    if (full_old_name == NULL || full_new_name == NULL) {
+        return ret;
+    }
+
+    /* open file to lock it */
+    hFile = CreateFile(full_old_name, GENERIC_READ,
+                       FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+                       NULL,
+                       OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+
+    if (hFile == INVALID_HANDLE_VALUE) {
+        err = win32_error_to_posix(GetLastError());
+        goto out;
+    }
+
+    attribute = GetFileAttributes(full_old_name);
+
+    /* check if it is a symbolic link */
+    if ((attribute == INVALID_FILE_ATTRIBUTES)
+        || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
+        CloseHandle(hFile);
+        err = EACCES;
+        goto out;
+    }
+    CloseHandle(hFile);
+
+    ret = rename(full_old_name, full_new_name);
+out:
+    if (err != 0) {
+        errno = err;
+    }
+    return ret;
+}
+
+/*
+ * utimensat_win32 - emulate utimensat()
+ *
+ * This function emulates utimensat().
+ *
+ * This function needs a handle to get the full file name, so it has to
+ * convert the fd to a handle with _get_osfhandle().
+ *
+ * Access to a symbolic link will be denied to prevent security issues.
+ */
+int utimensat_win32(int dirfd, const char *pathname,
+ const struct timespec times[2], int flags)
+{
+ g_autofree char *full_file_name = NULL;
+ HANDLE hFile = INVALID_HANDLE_VALUE;
+ HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+ DWORD attribute;
+ struct utimbuf tm;
+ int err = 0;
+ int ret = -1;
+
+ full_file_name = get_full_path_win32(hDir, pathname);
+ if (full_file_name == NULL) {
+ return ret;
+ }
+
+ /* open file to lock it */
+ hFile = CreateFile(full_file_name, GENERIC_READ,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS
+ | FILE_FLAG_OPEN_REPARSE_POINT,
+ NULL);
+
+ if (hFile == INVALID_HANDLE_VALUE) {
+ err = win32_error_to_posix(GetLastError());
+ goto out;
+ }
+
+ attribute = GetFileAttributes(full_file_name);
+
+ /* check if it is a symbolic link */
+ if ((attribute == INVALID_FILE_ATTRIBUTES)
+ || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
+ err = EACCES;
+ goto out;
+ }
+
+ tm.actime = times[0].tv_sec;
+ tm.modtime = times[1].tv_sec;
+
+ ret = utime(full_file_name, &tm);
+
+out:
+ if (hFile != INVALID_HANDLE_VALUE) {
+ CloseHandle(hFile);
+ }
+
+ if (err != 0) {
+ errno = err;
+ }
+ return ret;
+}
+
+/*
+ * unlinkat_win32 - emulate unlinkat()
+ *
+ * This function emulates unlinkat().
+ *
+ * This function needs a handle to get the full file name, so it has to
+ * convert the fd to a handle with _get_osfhandle().
+ *
+ * Access to a symbolic link will be denied to prevent security issues.
+ */
+
+int unlinkat_win32(int dirfd, const char *pathname, int flags)
+{
+    g_autofree char *full_file_name = NULL;
+    HANDLE hFile;
+    HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+    DWORD attribute;
+    int err = 0;
+    int ret = -1;
+
+    full_file_name = get_full_path_win32(hDir, pathname);
+    if (full_file_name == NULL) {
+        return ret;
+    }
+
+    /*
+     * Open the file to prevent others from modifying it. The
+     * FILE_SHARE_DELETE flag allows removing a file while it is still open.
+     */
+    hFile = CreateFile(full_file_name, GENERIC_READ,
+                       FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+                       NULL,
+                       OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+
+    if (hFile == INVALID_HANDLE_VALUE) {
+        err = win32_error_to_posix(GetLastError());
+        goto out;
+    }
+
+    attribute = GetFileAttributes(full_file_name);
+
+    /* check if it is a symbolic link */
+    if ((attribute == INVALID_FILE_ATTRIBUTES)
+        || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
+        err = EACCES;
+        goto out;
+    }
+
+    if (flags == AT_REMOVEDIR) { /* remove directory */
+        if ((attribute & FILE_ATTRIBUTE_DIRECTORY) == 0) {
+            err = ENOTDIR;
+            goto out;
+        }
+        ret = rmdir(full_file_name);
+    } else { /* remove regular file */
+        if ((attribute & FILE_ATTRIBUTE_DIRECTORY) != 0) {
+            err = EISDIR;
+            goto out;
+        }
+        ret = remove(full_file_name);
+    }
+out:
+    /* after the last handle is closed, the file will be removed */
+    if (hFile != INVALID_HANDLE_VALUE) {
+        CloseHandle(hFile);
+    }
+    if (err != 0) {
+        errno = err;
+    }
+    return ret;
+}
+
+/*
+ * statfs_win32 - statfs() on Windows
+ *
+ * This function emulates statfs() on Windows host.
+ */
+int statfs_win32(const char *path, struct statfs *stbuf)
+{
+ char RealPath[4] = { 0 };
+ unsigned long SectorsPerCluster;
+ unsigned long BytesPerSector;
+ unsigned long NumberOfFreeClusters;
+ unsigned long TotalNumberOfClusters;
+
+ /* only need first 3 bytes, e.g. "C:\ABC", only need "C:\" */
+ memcpy(RealPath, path, 3);
+
+ if (GetDiskFreeSpace(RealPath, &SectorsPerCluster, &BytesPerSector,
+ &NumberOfFreeClusters, &TotalNumberOfClusters) == 0) {
+ errno = EIO;
+ return -1;
+ }
+
+ stbuf->f_type = V9FS_MAGIC;
+ stbuf->f_bsize =
+ (__fsword_t)SectorsPerCluster * (__fsword_t)BytesPerSector;
+ stbuf->f_blocks = (fsblkcnt_t)TotalNumberOfClusters;
+ stbuf->f_bfree = (fsblkcnt_t)NumberOfFreeClusters;
+ stbuf->f_bavail = (fsblkcnt_t)NumberOfFreeClusters;
+ stbuf->f_files = -1;
+ stbuf->f_ffree = -1;
+ stbuf->f_namelen = NAME_MAX;
+ stbuf->f_frsize = 0;
+ stbuf->f_flags = 0;
+
+ return 0;
+}
+
+/*
+ * openat_dir - emulate openat_dir()
+ *
+ * This function emulates openat_dir().
+ *
+ * Access to a symbolic link will be denied to prevent security issues.
+ */
+int openat_dir(int dirfd, const char *name)
+{
+ g_autofree char *full_file_name = NULL;
+ HANDLE hSubDir;
+ HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+ DWORD attribute;
+
+ full_file_name = get_full_path_win32(hDir, name);
+ if (full_file_name == NULL) {
+ return -1;
+ }
+
+ attribute = GetFileAttributes(full_file_name);
+ if (attribute == INVALID_FILE_ATTRIBUTES) {
+ return -1;
+ }
+
+ /* check if it is a directory */
+ if ((attribute & FILE_ATTRIBUTE_DIRECTORY) == 0) {
+ return -1;
+ }
+
+ /* do not allow opening a symbolic link */
+ if ((attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
+ return -1;
+ }
+
+ /* open it */
+ hSubDir = CreateFile(full_file_name, GENERIC_READ,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+ return _open_osfhandle((intptr_t)hSubDir, _O_RDONLY);
+}
+
+
+int openat_file(int dirfd, const char *name, int flags, mode_t mode)
+{
+ return openat_win32(dirfd, name, flags | _O_BINARY, mode);
+}
+
+/*
+ * fgetxattrat_nofollow - get extended attribute
+ *
+ * This function gets extended attribute from file <path> in the directory
+ * specified by <dirfd>. The extended attribute name is specified by <name>
+ * and the returned value will be put in <value>.
+ *
+ * This function emulates extended attribute by NTFS ADS.
+ */
+ssize_t fgetxattrat_nofollow(int dirfd, const char *path,
+                             const char *name, void *value, size_t size)
+{
+    g_autofree char *full_file_name = NULL;
+    char ads_file_name[NAME_MAX + 1] = { 0 };
+    DWORD dwBytesRead;
+    HANDLE hStream;
+    HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+
+    full_file_name = get_full_path_win32(hDir, path);
+    if (full_file_name == NULL) {
+        errno = EIO;
+        return -1;
+    }
+
+    if (build_ads_name(ads_file_name, NAME_MAX, full_file_name, name) < 0) {
+        errno = EIO;
+        return -1;
+    }
+
+    hStream = CreateFile(ads_file_name, GENERIC_READ, FILE_SHARE_READ, NULL,
+                         OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+    if (hStream == INVALID_HANDLE_VALUE) {
+        /* a missing stream means the attribute does not exist */
+        errno = (GetLastError() == ERROR_FILE_NOT_FOUND) ? ENODATA : EIO;
+        return -1;
+    }
+
+    if (ReadFile(hStream, value, size, &dwBytesRead, NULL) == FALSE) {
+        errno = EIO;
+        CloseHandle(hStream);
+        return -1;
+    }
+
+    CloseHandle(hStream);
+
+    return dwBytesRead;
+}
+
+/*
+ * fsetxattrat_nofollow - set extended attribute
+ *
+ * This function sets extended attribute to file <path> in the directory
+ * specified by <dirfd>.
+ *
+ * This function emulates extended attribute by NTFS ADS.
+ */
+
+int fsetxattrat_nofollow(int dirfd, const char *path, const char *name,
+ void *value, size_t size, int flags)
+{
+ g_autofree char *full_file_name = NULL;
+ char ads_file_name[NAME_MAX + 1] = { 0 };
+ DWORD dwBytesWrite;
+ HANDLE hStream;
+ HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+
+ full_file_name = get_full_path_win32(hDir, path);
+ if (full_file_name == NULL) {
+ errno = EIO;
+ return -1;
+ }
+
+ if (build_ads_name(ads_file_name, NAME_MAX, full_file_name, name) < 0) {
+ errno = EIO;
+ return -1;
+ }
+
+ hStream = CreateFile(ads_file_name, GENERIC_WRITE, FILE_SHARE_READ, NULL,
+ CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
+ if (hStream == INVALID_HANDLE_VALUE) {
+ errno = EIO;
+ return -1;
+ }
+
+ if (WriteFile(hStream, value, size, &dwBytesWrite, NULL) == FALSE) {
+ errno = EIO;
+ CloseHandle(hStream);
+ return -1;
+ }
+
+ CloseHandle(hStream);
+
+ return 0;
+}
+
+/*
+ * flistxattrat_nofollow - list extended attribute
+ *
+ * This function gets extended attribute lists from file <filename> in the
+ * directory specified by <dirfd>. Lists returned will be put in <list>.
+ *
+ * This function emulates extended attribute by NTFS ADS.
+ */
+ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
+ char *list, size_t size)
+{
+ g_autofree char *full_file_name = NULL;
+ WCHAR WideCharStr[NAME_MAX + 1] = { 0 };
+ char full_ads_name[NAME_MAX + 1];
+ WIN32_FIND_STREAM_DATA fsd;
+ BOOL bFindNext;
+ char *list_ptr = list;
+ size_t list_left_size = size;
+ HANDLE hFind;
+ HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+ int ret;
+
+ full_file_name = get_full_path_win32(hDir, filename);
+ if (full_file_name == NULL) {
+ errno = EIO;
+ return -1;
+ }
+
+ /*
+ * The ADS enumeration function only has a WCHAR version, so we need to
+ * convert filename from UTF-8 to a wide-character string.
+ */
+ ret = MultiByteToWideChar(CP_UTF8, 0, full_file_name,
+ strlen(full_file_name), WideCharStr, NAME_MAX);
+ if (ret == 0) {
+ errno = EIO;
+ return -1;
+ }
+
+ hFind = FindFirstStreamW(WideCharStr, FindStreamInfoStandard, &fsd, 0);
+ if (hFind == INVALID_HANDLE_VALUE) {
+ errno = ENODATA;
+ return -1;
+ }
+
+ do {
+ memset(full_ads_name, 0, sizeof(full_ads_name));
+
+ /*
+ * The ADS enumeration functions only have a WCHAR version, so we need
+ * to convert cStreamName to a UTF-8 string.
+ */
+ ret = WideCharToMultiByte(CP_UTF8, 0,
+ fsd.cStreamName, wcslen(fsd.cStreamName) + 1,
+ full_ads_name, sizeof(full_ads_name) - 1,
+ NULL, NULL);
+ if (ret == 0) {
+ if (GetLastError() == ERROR_INSUFFICIENT_BUFFER) {
+ errno = ERANGE;
+ } else {
+ errno = EIO;
+ }
+ /* search handles from FindFirstStreamW() are closed by FindClose() */
+ FindClose(hFind);
+ return -1;
+ }
+
+ ret = copy_ads_name(list_ptr, list_left_size, full_ads_name);
+ if (ret < 0) {
+ errno = ERANGE;
+ FindClose(hFind);
+ return -1;
+ }
+
+ list_ptr = list_ptr + ret;
+ list_left_size = list_left_size - ret;
+
+ bFindNext = FindNextStreamW(hFind, &fsd);
+ } while (bFindNext);
+
+ FindClose(hFind);
+
+ return size - list_left_size;
+}
+
+/*
+ * fremovexattrat_nofollow - remove extended attribute
+ *
+ * This function removes an extended attribute from file <filename> in the
+ * directory specified by <dirfd>.
+ *
+ * This function emulates extended attribute by NTFS ADS.
+ */
+ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
+ const char *name)
+{
+ g_autofree char *full_file_name = NULL;
+ char ads_file_name[NAME_MAX + 1] = { 0 };
+ HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+
+ full_file_name = get_full_path_win32(hDir, filename);
+ if (full_file_name == NULL) {
+ errno = EIO;
+ return -1;
+ }
+
+ if (build_ads_name(ads_file_name, NAME_MAX, full_file_name, name) < 0) {
+ errno = EIO;
+ return -1;
+ }
+
+ /* DeleteFile() returns zero on failure */
+ if (DeleteFile(ads_file_name) == 0) {
+ if (GetLastError() == ERROR_FILE_NOT_FOUND) {
+ errno = ENODATA;
+ } else {
+ errno = EIO;
+ }
+ return -1;
+ }
+
+ return 0;
+}
+
+/*
+ * local_opendir_nofollow - open a Windows directory
+ *
+ * This function returns a fd of the directory specified by
+ * <dirpath> based on 9pfs mount point.
+ *
+ * The Windows POSIX API does not support opening a directory with open().
+ * Only a handle to the directory can be opened, using CreateFile().
+ * This function converts the handle to a fd with _open_osfhandle().
+ *
+ * This function checks the resolved path of <dirpath>. If the resolved
+ * path is not in the scope of root directory (e.g. by symbolic link), then
+ * this function will fail to prevent any security issues.
+ */
+int local_opendir_nofollow(FsContext *fs_ctx, const char *dirpath)
+{
+ g_autofree char *full_file_name = NULL;
+ LocalData *data = fs_ctx->private;
+ HANDLE hDir;
+ int dirfd;
+
+ dirfd = openat_dir(data->mountfd, dirpath);
+ if (dirfd == -1) {
+ return -1;
+ }
+ hDir = (HANDLE)_get_osfhandle(dirfd);
+
+ full_file_name = get_full_path_win32(hDir, NULL);
+ if (full_file_name == NULL) {
+ close(dirfd);
+ return -1;
+ }
+
+ /*
+ * Check if the resolved path is in the root directory scope:
+ * data->root_path and full_file_name are full paths with symbolic
+ * links resolved, so data->root_path must be a prefix of
+ * full_file_name. If not, the guest OS is trying to open a file outside
+ * the scope of the mount point. This operation should be denied.
+ */
+ if (strncmp(full_file_name, data->root_path,
+ strlen(data->root_path)) != 0) {
+ close(dirfd);
+ return -1;
+ }
+
+ return dirfd;
+}
+
+/*
+ * qemu_mknodat - mknodat emulate function
+ *
+ * This function emulates mknodat on Windows. It only works when security
+ * model is mapped or mapped-xattr.
+ */
+int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev)
+{
+ if (S_ISREG(mode) || !(mode & S_IFMT)) {
+ int fd = openat_file(dirfd, filename, O_CREAT, mode);
+ if (fd == -1) {
+ return -1;
+ }
+ close_preserve_errno(fd);
+ return 0;
+ }
+
+ error_report_once("Unsupported operation for mknodat");
+ errno = ENOTSUP;
+ return -1;
+}
--
2.25.1
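The root scope check in local_opendir_nofollow() above compares only a string prefix. A minimal sketch of a stricter variant follows, assuming both paths are absolute, already resolved, and use '\\' as the separator. The helper name path_is_within_root() is hypothetical and not part of this series, and the comparison is byte-wise, as in the patch, so it does not account for Windows case-insensitivity.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: return true only when
 * <path> is <root> itself or lies below it. Both are assumed to be
 * absolute, already-resolved paths using '\\' as the separator.
 */
static bool path_is_within_root(const char *path, const char *root)
{
    size_t root_len = strlen(root);

    assert(root_len > 0);

    /* strncmp() stops at the NUL of a shorter <path> */
    if (strncmp(path, root, root_len) != 0) {
        return false;
    }

    /* a root such as "c:\" already ends in a separator */
    if (root[root_len - 1] == '\\') {
        return true;
    }

    /* otherwise the match must end at a path component boundary */
    return path[root_len] == '\0' || path[root_len] == '\\';
}
```

With the boundary test, a sibling directory whose name merely starts with the root name (for example "c:\msys64-evil" against the root "c:\msys64") is rejected, which a plain prefix comparison would accept.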
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 03/16] hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
2023-02-20 10:08 ` [PATCH v5 01/16] hw/9pfs: Add missing definitions " Bin Meng
2023-02-20 10:08 ` [PATCH v5 02/16] hw/9pfs: Implement Windows specific utilities functions for 9pfs Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-03-06 9:31 ` Philippe Mathieu-Daudé
2023-02-20 10:08 ` [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs Bin Meng
` (14 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
The xxxdir() APIs are not safe on a Windows host. To prepare for Windows
support, replace the direct calls to the xxxdir() APIs with wrappers.
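The wrapper pattern can be sketched in isolation as follows. This is only an illustration for a POSIX host: the sketch_* macro names and count_entries() are hypothetical and do not appear in this series, which uses qemu_* names in 9p-util.h.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/*
 * Hypothetical wrapper names for illustration only. On a POSIX host they
 * map straight to libc; a Windows build would map them to its own
 * implementation instead.
 */
#define sketch_opendir   opendir
#define sketch_readdir   readdir
#define sketch_closedir  closedir

/* Count the entries of <path> using only the wrappers; -1 on error. */
static int count_entries(const char *path)
{
    DIR *stream;
    int count = 0;

    assert(path != NULL);

    stream = sketch_opendir(path);
    if (stream == NULL) {
        return -1;
    }
    while (sketch_readdir(stream) != NULL) {
        count++;
    }
    sketch_closedir(stream);
    return count;
}
```

Callers only ever see the wrapper names, so switching the backend for one host is a change to the macro definitions rather than to every call site.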
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
hw/9pfs/9p-util.h | 14 ++++++++++++++
hw/9pfs/9p-local.c | 12 ++++++------
2 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index 90420a7578..0f159fb4ce 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -103,6 +103,13 @@ static inline int errno_to_dotl(int err) {
#define qemu_renameat renameat_win32
#define qemu_utimensat utimensat_win32
#define qemu_unlinkat unlinkat_win32
+
+#define qemu_opendir opendir_win32
+#define qemu_closedir closedir_win32
+#define qemu_readdir readdir_win32
+#define qeme_rewinddir rewinddir_win32
+#define qemu_seekdir seekdir_win32
+#define qemu_telldir telldir_win32
#else
#define qemu_openat openat
#define qemu_fstatat fstatat
@@ -110,6 +117,13 @@ static inline int errno_to_dotl(int err) {
#define qemu_renameat renameat
#define qemu_utimensat utimensat
#define qemu_unlinkat unlinkat
+
+#define qemu_opendir opendir
+#define qemu_closedir closedir
+#define qemu_readdir readdir
+#define qeme_rewinddir rewinddir
+#define qemu_seekdir seekdir
+#define qemu_telldir telldir
#endif
#ifdef CONFIG_WIN32
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index b6102c9e5a..4385f18da2 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -495,7 +495,7 @@ static int local_close(FsContext *ctx, V9fsFidOpenState *fs)
static int local_closedir(FsContext *ctx, V9fsFidOpenState *fs)
{
- return closedir(fs->dir.stream);
+ return qemu_closedir(fs->dir.stream);
}
static int local_open(FsContext *ctx, V9fsPath *fs_path,
@@ -533,12 +533,12 @@ static int local_opendir(FsContext *ctx,
static void local_rewinddir(FsContext *ctx, V9fsFidOpenState *fs)
{
- rewinddir(fs->dir.stream);
+ qeme_rewinddir(fs->dir.stream);
}
static off_t local_telldir(FsContext *ctx, V9fsFidOpenState *fs)
{
- return telldir(fs->dir.stream);
+ return qemu_telldir(fs->dir.stream);
}
static bool local_is_mapped_file_metadata(FsContext *fs_ctx, const char *name)
@@ -552,13 +552,13 @@ static struct dirent *local_readdir(FsContext *ctx, V9fsFidOpenState *fs)
struct dirent *entry;
again:
- entry = readdir(fs->dir.stream);
+ entry = qemu_readdir(fs->dir.stream);
if (!entry) {
return NULL;
}
#ifdef CONFIG_DARWIN
int off;
- off = telldir(fs->dir.stream);
+ off = qemu_telldir(fs->dir.stream);
/* If telldir fails, fail the entire readdir call */
if (off < 0) {
return NULL;
@@ -581,7 +581,7 @@ again:
static void local_seekdir(FsContext *ctx, V9fsFidOpenState *fs, off_t off)
{
- seekdir(fs->dir.stream, off);
+ qemu_seekdir(fs->dir.stream, off);
}
static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState *fs,
--
2.25.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (2 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 03/16] hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-03-14 16:05 ` Christian Schoenebeck
2023-02-20 10:08 ` [PATCH v5 05/16] hw/9pfs: Update the local fs driver to support Windows Bin Meng
` (13 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
This commit implements Windows specific xxxdir() APIs for safe
directory access.
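The idea of the implementation below is to take a snapshot of file IDs when the directory is opened, sort it, and let seek/tell operate on an index into that snapshot. A minimal, host-independent sketch of that bookkeeping, under the assumption that IDs are plain 64-bit integers, could look like this. All names here (dir_snapshot, snapshot_*) are hypothetical; unlike seekdir_win32(), the sketch clamps every out-of-range position to the end instead of reporting EINVAL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical snapshot of a directory: IDs captured once at open time. */
struct dir_snapshot {
    uint64_t *ids;      /* sorted copy of the file IDs */
    uint32_t total;     /* number of IDs in the snapshot */
    uint32_t offset;    /* index of the next entry to return */
};

static int id_compare(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x > y) - (x < y);
}

/* Copy <count> IDs into the snapshot and sort the copy; -1 on failure. */
static int snapshot_init(struct dir_snapshot *s, const uint64_t *ids,
                         uint32_t count)
{
    assert(s != NULL);

    s->ids = malloc(sizeof(uint64_t) * (count ? count : 1));
    if (s->ids == NULL) {
        return -1;
    }
    if (count > 0) {
        memcpy(s->ids, ids, sizeof(uint64_t) * count);
        qsort(s->ids, count, sizeof(uint64_t), id_compare);
    }
    s->total = count;
    s->offset = 0;
    return 0;
}

/* Store the next ID in <id> and return 1, or return 0 at the end. */
static int snapshot_next(struct dir_snapshot *s, uint64_t *id)
{
    if (s->offset >= s->total) {
        return 0;
    }
    *id = s->ids[s->offset++];
    return 1;
}

/* Out-of-range positions are clamped to the end of the snapshot. */
static void snapshot_seek(struct dir_snapshot *s, long pos)
{
    if (pos < 0 || pos >= (long)s->total) {
        s->offset = s->total;
    } else {
        s->offset = (uint32_t)pos;
    }
}

static long snapshot_tell(const struct dir_snapshot *s)
{
    return (long)s->offset;
}

static void snapshot_free(struct dir_snapshot *s)
{
    free(s->ids);
    s->ids = NULL;
}
```

Because the order is fixed at open time, a position returned by tell stays meaningful for a later seek even if the directory changes in between; entries deleted meanwhile simply have to be skipped when their ID can no longer be resolved.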
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
hw/9pfs/9p-util.h | 6 +
hw/9pfs/9p-util-win32.c | 443 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 449 insertions(+)
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index 0f159fb4ce..c1c251fbd1 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -141,6 +141,12 @@ int unlinkat_win32(int dirfd, const char *pathname, int flags);
int statfs_win32(const char *root_path, struct statfs *stbuf);
int openat_dir(int dirfd, const char *name);
int openat_file(int dirfd, const char *name, int flags, mode_t mode);
+DIR *opendir_win32(const char *full_file_name);
+int closedir_win32(DIR *pDir);
+struct dirent *readdir_win32(DIR *pDir);
+void rewinddir_win32(DIR *pDir);
+void seekdir_win32(DIR *pDir, long pos);
+long telldir_win32(DIR *pDir);
#endif
static inline void close_preserve_errno(int fd)
diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c
index a99d579a06..e9408f3c45 100644
--- a/hw/9pfs/9p-util-win32.c
+++ b/hw/9pfs/9p-util-win32.c
@@ -37,6 +37,16 @@
* Windows does not support opendir, the directory fd is created by
* CreateFile and convert to fd by _open_osfhandle(). Keep the fd open will
* lock and protect the directory (can not be modified or replaced)
+ *
+ * 5. Neither Windows native APIs, nor MinGW provide a POSIX compatible API for
+ * acquiring directory entries in a safe way. Calling those APIs (native
+ * _findfirst() and _findnext() or MinGW's readdir(), seekdir() and
+ * telldir()) directly can lead to an inconsistent state if directory is
+ * modified in between, e.g. the same directory appearing more than once
+ * in output, or directories not appearing at all in output even though they
+ * were neither newly created nor deleted. POSIX does not define what happens
+ * with deleted or newly created directories in between, but it guarantees a
+ * consistent state.
*/
#include "qemu/osdep.h"
@@ -51,6 +61,25 @@
#define V9FS_MAGIC 0x53465039 /* string "9PFS" */
+/*
+ * Neither MinGW nor Windows provides a safe way to seek in a directory
+ * while another thread is modifying the same directory.
+ *
+ * This structure stores a sorted list of file IDs to ensure that directory
+ * seeks are consistent.
+ */
+struct dir_win32 {
+ struct dirent dd_dir;
+ uint32_t offset;
+ uint32_t total_entries;
+ HANDLE hDir;
+ uint32_t dir_name_len;
+ uint64_t dot_id;
+ uint64_t dot_dot_id;
+ uint64_t *file_id_list;
+ char dd_name[1];
+};
+
/*
* win32_error_to_posix - convert Win32 error to POSIX error number
*
@@ -977,3 +1006,417 @@ int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev)
errno = ENOTSUP;
return -1;
}
+
+static int file_id_compare(const void *id_ptr1, const void *id_ptr2)
+{
+ uint64_t id[2];
+
+ id[0] = *(uint64_t *)id_ptr1;
+ id[1] = *(uint64_t *)id_ptr2;
+
+ if (id[0] > id[1]) {
+ return 1;
+ } else if (id[0] < id[1]) {
+ return -1;
+ } else {
+ return 0;
+ }
+}
+
+static int get_next_entry(struct dir_win32 *stream)
+{
+ HANDLE hDirEntry = INVALID_HANDLE_VALUE;
+ g_autofree char *entry_name = NULL;
+ char *entry_start;
+ FILE_ID_DESCRIPTOR fid;
+ DWORD attribute;
+
+ if (stream->file_id_list[stream->offset] == stream->dot_id) {
+ strcpy(stream->dd_dir.d_name, ".");
+ return 0;
+ }
+
+ if (stream->file_id_list[stream->offset] == stream->dot_dot_id) {
+ strcpy(stream->dd_dir.d_name, "..");
+ return 0;
+ }
+
+ fid.dwSize = sizeof(fid);
+ fid.Type = FileIdType;
+
+ fid.FileId.QuadPart = stream->file_id_list[stream->offset];
+
+ hDirEntry = OpenFileById(stream->hDir, &fid, GENERIC_READ,
+ FILE_SHARE_READ | FILE_SHARE_WRITE
+ | FILE_SHARE_DELETE,
+ NULL,
+ FILE_FLAG_BACKUP_SEMANTICS
+ | FILE_FLAG_OPEN_REPARSE_POINT);
+
+ if (hDirEntry == INVALID_HANDLE_VALUE) {
+ /*
+ * Failed to open it; it may have been deleted.
+ * Try the next id.
+ */
+ return -1;
+ }
+
+ entry_name = get_full_path_win32(hDirEntry, NULL);
+
+ CloseHandle(hDirEntry);
+
+ if (entry_name == NULL) {
+ return -1;
+ }
+
+ attribute = GetFileAttributes(entry_name);
+
+ /* symlink is not allowed */
+ if (attribute == INVALID_FILE_ATTRIBUTES
+ || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
+ return -1;
+ }
+
+ if (memcmp(entry_name, stream->dd_name, stream->dir_name_len) != 0) {
+ /*
+ * The full entry file name should start with the parent directory
+ * name, except for dot and dot_dot (already handled above).
+ * If not, this entry should not be returned.
+ */
+ return -1;
+ }
+
+ entry_start = entry_name + stream->dir_name_len;
+
+ /* skip slash */
+ while (*entry_start == '\\') {
+ entry_start++;
+ }
+
+ if (strchr(entry_start, '\\') != NULL) {
+ return -1;
+ }
+
+ if (strlen(entry_start) == 0
+ || strlen(entry_start) + 1 > sizeof(stream->dd_dir.d_name)) {
+ return -1;
+ }
+ strcpy(stream->dd_dir.d_name, entry_start);
+
+ return 0;
+}
+
+/*
+ * opendir_win32 - open a directory
+ *
+ * This function opens a directory and caches all directory entries.
+ */
+DIR *opendir_win32(const char *full_file_name)
+{
+ HANDLE hDir = INVALID_HANDLE_VALUE;
+ HANDLE hDirEntry = INVALID_HANDLE_VALUE;
+ char *full_dir_entry = NULL;
+ DWORD attribute;
+ intptr_t dd_handle = -1;
+ struct _finddata_t dd_data;
+ uint64_t file_id;
+ uint64_t *file_id_list = NULL;
+ BY_HANDLE_FILE_INFORMATION FileInfo;
+ struct dir_win32 *stream = NULL;
+ int err = 0;
+ int find_status;
+ int sort_first_two_entry = 0;
+ uint32_t list_count = 16;
+ uint32_t index = 0;
+
+ /* open directory to prevent it being removed */
+
+ hDir = CreateFile(full_file_name, GENERIC_READ,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
+ NULL);
+
+ if (hDir == INVALID_HANDLE_VALUE) {
+ err = win32_error_to_posix(GetLastError());
+ goto out;
+ }
+
+ attribute = GetFileAttributes(full_file_name);
+
+ /* symlinks are not allowed */
+ if (attribute == INVALID_FILE_ATTRIBUTES
+ || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
+ err = EACCES;
+ goto out;
+ }
+
+ /* check if it is a directory */
+ if ((attribute & FILE_ATTRIBUTE_DIRECTORY) == 0) {
+ err = ENOTDIR;
+ goto out;
+ }
+
+ file_id_list = g_malloc0(sizeof(uint64_t) * list_count);
+
+ /*
+ * findfirst() needs suffix format name like "\dir1\dir2\*",
+ * allocate more buffer to store suffix.
+ */
+ stream = g_malloc0(sizeof(struct dir_win32) + strlen(full_file_name) + 3);
+
+ strcpy(stream->dd_name, full_file_name);
+ strcat(stream->dd_name, "\\*");
+
+ stream->hDir = hDir;
+ stream->dir_name_len = strlen(full_file_name);
+
+ dd_handle = _findfirst(stream->dd_name, &dd_data);
+
+ if (dd_handle == -1) {
+ err = errno;
+ goto out;
+ }
+
+ /* read all entries into the file id list */
+ do {
+ full_dir_entry = get_full_path_win32(hDir, dd_data.name);
+
+ if (full_dir_entry == NULL) {
+ err = ENOMEM;
+ goto out;
+ }
+
+ /*
+ * Open every entry and get its file information.
+ *
+ * Skip symbolic links while reading the directory.
+ */
+ hDirEntry = CreateFile(full_dir_entry,
+ GENERIC_READ,
+ FILE_SHARE_READ | FILE_SHARE_WRITE
+ | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS
+ | FILE_FLAG_OPEN_REPARSE_POINT, NULL);
+
+ if (hDirEntry != INVALID_HANDLE_VALUE) {
+ if (GetFileInformationByHandle(hDirEntry,
+ &FileInfo) == TRUE) {
+ attribute = FileInfo.dwFileAttributes;
+
+ /* only save valid entries */
+ if ((attribute & FILE_ATTRIBUTE_REPARSE_POINT) == 0) {
+ if (index >= list_count) {
+ list_count = list_count + 16;
+ file_id_list = g_realloc(file_id_list,
+ sizeof(uint64_t)
+ * list_count);
+ }
+ file_id = (uint64_t)FileInfo.nFileIndexLow
+ + (((uint64_t)FileInfo.nFileIndexHigh) << 32);
+
+
+ file_id_list[index] = file_id;
+
+ if (strcmp(dd_data.name, ".") == 0) {
+ stream->dot_id = file_id_list[index];
+ if (index != 0) {
+ sort_first_two_entry = 1;
+ }
+ } else if (strcmp(dd_data.name, "..") == 0) {
+ stream->dot_dot_id = file_id_list[index];
+ if (index != 1) {
+ sort_first_two_entry = 1;
+ }
+ }
+ index++;
+ }
+ }
+ CloseHandle(hDirEntry);
+ }
+ g_free(full_dir_entry);
+ find_status = _findnext(dd_handle, &dd_data);
+ } while (find_status == 0);
+
+ if (errno == ENOENT) {
+ /* No more matching files could be found, clean errno */
+ errno = 0;
+ } else {
+ err = errno;
+ goto out;
+ }
+
+ stream->total_entries = index;
+ stream->file_id_list = file_id_list;
+
+ if (sort_first_two_entry == 0 && index >= 2) {
+ /*
+ * If the first two entries are "." and "..", do not sort them.
+ *
+ * A guest OS may assume that the first two entries are always "." and
+ * "..", so reordering them could confuse the guest's directory display.
+ */
+ qsort(&file_id_list[2], index - 2, sizeof(file_id), file_id_compare);
+ } else {
+ qsort(&file_id_list[0], index, sizeof(file_id), file_id_compare);
+ }
+
+out:
+ if (err != 0) {
+ errno = err;
+ /* release everything, including on failures before stream exists */
+ g_free(file_id_list);
+ if (hDir != INVALID_HANDLE_VALUE) {
+ CloseHandle(hDir);
+ }
+ g_free(stream);
+ stream = NULL;
+ }
+
+ if (dd_handle != -1) {
+ _findclose(dd_handle);
+ }
+
+ return (DIR *)stream;
+}
+
+/*
+ * closedir_win32 - close a directory
+ *
+ * This function closes the directory and frees all cached resources.
+ */
+int closedir_win32(DIR *pDir)
+{
+ struct dir_win32 *stream = (struct dir_win32 *)pDir;
+
+ if (stream == NULL) {
+ errno = EBADF;
+ return -1;
+ }
+
+ /* free all resources */
+ CloseHandle(stream->hDir);
+
+ g_free(stream->file_id_list);
+
+ g_free(stream);
+
+ return 0;
+}
+
+/*
+ * readdir_win32 - read a directory
+ *
+ * This function reads a directory entry from cached entry list.
+ */
+struct dirent *readdir_win32(DIR *pDir)
+{
+ struct dir_win32 *stream = (struct dir_win32 *)pDir;
+
+ if (stream == NULL) {
+ errno = EBADF;
+ return NULL;
+ }
+
+retry:
+
+ if (stream->offset >= stream->total_entries) {
+ /* reached the end, return NULL without setting errno */
+ return NULL;
+ }
+
+ if (get_next_entry(stream) != 0) {
+ stream->offset++;
+ goto retry;
+ }
+
+ /* Windows does not provide inode number */
+ stream->dd_dir.d_ino = 0;
+ stream->dd_dir.d_reclen = 0;
+ stream->dd_dir.d_namlen = strlen(stream->dd_dir.d_name);
+
+ stream->offset++;
+
+ return &stream->dd_dir;
+}
+
+/*
+ * rewinddir_win32 - reset directory stream
+ *
+ * This function resets the position of the directory stream to the
+ * beginning of the directory.
+ */
+void rewinddir_win32(DIR *pDir)
+{
+ struct dir_win32 *stream = (struct dir_win32 *)pDir;
+
+ if (stream == NULL) {
+ errno = EBADF;
+ return;
+ }
+
+ stream->offset = 0;
+
+ return;
+}
+
+/*
+ * seekdir_win32 - set the position of the next readdir() call in the directory
+ *
+ * This function sets the position in the directory stream from which the
+ * next readdir() call will start.
+ */
+void seekdir_win32(DIR *pDir, long pos)
+{
+ struct dir_win32 *stream = (struct dir_win32 *)pDir;
+
+ if (stream == NULL) {
+ errno = EBADF;
+ return;
+ }
+
+ if (pos < -1) {
+ errno = EINVAL;
+ return;
+ }
+
+ if (pos == -1 || pos >= (long)stream->total_entries) {
+ /* seek to the end */
+ stream->offset = stream->total_entries;
+ return;
+ }
+
+ if (pos - (long)stream->offset == 0) {
+ /* no need to seek */
+ return;
+ }
+
+ stream->offset = pos;
+
+ return;
+}
+
+/*
+ * telldir_win32 - return current location in directory
+ *
+ * This function returns current location in directory.
+ */
+long telldir_win32(DIR *pDir)
+{
+ struct dir_win32 *stream = (struct dir_win32 *)pDir;
+
+ if (stream == NULL) {
+ errno = EBADF;
+ return -1;
+ }
+
+ if (stream->offset > stream->total_entries) {
+ return -1;
+ }
+
+ return (long)stream->offset;
+}
--
2.25.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 05/16] hw/9pfs: Update the local fs driver to support Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (3 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 06/16] hw/9pfs: Support getting current directory offset for Windows Bin Meng
` (12 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
Update the 9p 'local' file system driver to support Windows,
including open, read, write, close, rename, remove, etc.
All security models are supported. The mapped (mapped-xattr)
security model is implemented using NTFS Alternate Data Stream
(ADS) so the 9p export path shall be on an NTFS partition.
Symbolic link and hard link are not supported when security
model is "passthrough" or "none", because Windows NTFS does
not fully support them with POSIX compatibility. Symbolic
link is enabled when security model is "mapped-file" or
"mapped-xattr".
inode remap is always enabled because Windows file system
does not provide a compatible inode number.
mknod() is not supported because Windows does not support it.
chown() and chmod() are not supported when 9pfs is configured
with security model 'none' or 'passthrough' because a Windows
host does not support this type of request.
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
hw/9pfs/9p-local.h | 1 +
hw/9pfs/9p-local.c | 253 +++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 246 insertions(+), 8 deletions(-)
diff --git a/hw/9pfs/9p-local.h b/hw/9pfs/9p-local.h
index 77e7f57f89..5905923881 100644
--- a/hw/9pfs/9p-local.h
+++ b/hw/9pfs/9p-local.h
@@ -17,6 +17,7 @@ typedef struct {
int mountfd;
#ifdef CONFIG_WIN32
char *root_path;
+ DWORD block_size;
#endif
} LocalData;
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 4385f18da2..d308a88759 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -21,11 +21,13 @@
#include "9p-xattr.h"
#include "9p-util.h"
#include "fsdev/qemu-fsdev.h" /* local_ops */
+#ifndef CONFIG_WIN32
#include <arpa/inet.h>
#include <pwd.h>
#include <grp.h>
#include <sys/socket.h>
#include <sys/un.h>
+#endif
#include "qemu/xattr.h"
#include "qapi/error.h"
#include "qemu/cutils.h"
@@ -38,7 +40,9 @@
#include <linux/magic.h>
#endif
#endif
+#ifndef CONFIG_WIN32
#include <sys/ioctl.h>
+#endif
#ifndef XFS_SUPER_MAGIC
#define XFS_SUPER_MAGIC 0x58465342
@@ -90,10 +94,12 @@ int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags,
return fd;
}
+#ifndef CONFIG_WIN32
int local_opendir_nofollow(FsContext *fs_ctx, const char *path)
{
return local_open_nofollow(fs_ctx, path, O_DIRECTORY | O_RDONLY, 0);
}
+#endif
static void renameat_preserve_errno(int odirfd, const char *opath, int ndirfd,
const char *npath)
@@ -236,7 +242,7 @@ static int local_set_mapped_file_attrat(int dirfd, const char *name,
int ret;
char buf[ATTR_MAX];
int uid = -1, gid = -1, mode = -1, rdev = -1;
- int map_dirfd = -1, map_fd;
+ int map_dirfd = -1;
bool is_root = !strcmp(name, ".");
if (is_root) {
@@ -300,10 +306,12 @@ update_map_file:
return -1;
}
- map_fd = fileno(fp);
+#ifndef CONFIG_WIN32
+ int map_fd = fileno(fp);
assert(map_fd != -1);
ret = fchmod(map_fd, 0600);
assert(ret == 0);
+#endif
if (credp->fc_uid != -1) {
uid = credp->fc_uid;
@@ -335,6 +343,7 @@ update_map_file:
return 0;
}
+#ifndef CONFIG_WIN32
static int fchmodat_nofollow(int dirfd, const char *name, mode_t mode)
{
struct stat stbuf;
@@ -396,6 +405,7 @@ static int fchmodat_nofollow(int dirfd, const char *name, mode_t mode)
close_preserve_errno(fd);
return ret;
}
+#endif
static int local_set_xattrat(int dirfd, const char *path, FsCred *credp)
{
@@ -436,6 +446,7 @@ static int local_set_xattrat(int dirfd, const char *path, FsCred *credp)
return 0;
}
+#ifndef CONFIG_WIN32
static int local_set_cred_passthrough(FsContext *fs_ctx, int dirfd,
const char *name, FsCred *credp)
{
@@ -452,6 +463,7 @@ static int local_set_cred_passthrough(FsContext *fs_ctx, int dirfd,
return fchmodat_nofollow(dirfd, name, credp->fc_mode & 07777);
}
+#endif
static ssize_t local_readlink(FsContext *fs_ctx, V9fsPath *fs_path,
char *buf, size_t bufsz)
@@ -470,6 +482,12 @@ static ssize_t local_readlink(FsContext *fs_ctx, V9fsPath *fs_path,
close_preserve_errno(fd);
} else if ((fs_ctx->export_flags & V9FS_SM_PASSTHROUGH) ||
(fs_ctx->export_flags & V9FS_SM_NONE)) {
+#ifdef CONFIG_WIN32
+ errno = ENOTSUP;
+ error_report_once("readlink is not available on Windows host when "
+ "security_model is \"none\" or \"passthrough\"");
+ tsize = -1;
+#else
char *dirpath = g_path_get_dirname(fs_path->data);
char *name = g_path_get_basename(fs_path->data);
int dirfd;
@@ -484,6 +502,7 @@ static ssize_t local_readlink(FsContext *fs_ctx, V9fsPath *fs_path,
out:
g_free(name);
g_free(dirpath);
+#endif
}
return tsize;
}
@@ -522,9 +541,31 @@ static int local_opendir(FsContext *ctx,
return -1;
}
+#ifdef CONFIG_WIN32
+ char *full_file_name;
+
+ HANDLE hDir = (HANDLE)_get_osfhandle(dirfd);
+
+ full_file_name = get_full_path_win32(hDir, NULL);
+
+ close(dirfd);
+
+ if (full_file_name == NULL) {
+ return -1;
+ }
+ stream = qemu_opendir(full_file_name);
+ g_free(full_file_name);
+#else
stream = fdopendir(dirfd);
+#endif
+
if (!stream) {
+ /*
+ * On Windows, dirfd has already been closed above, so it only needs to
+ * be closed here on other hosts.
+ */
+#ifndef CONFIG_WIN32
close(dirfd);
+#endif
return -1;
}
fs->dir.stream = stream;
@@ -567,13 +608,17 @@ again:
#endif
if (ctx->export_flags & V9FS_SM_MAPPED) {
+#ifndef CONFIG_WIN32
entry->d_type = DT_UNKNOWN;
+#endif
} else if (ctx->export_flags & V9FS_SM_MAPPED_FILE) {
if (local_is_mapped_file_metadata(ctx, entry->d_name)) {
/* skip the meta data */
goto again;
}
+#ifndef CONFIG_WIN32
entry->d_type = DT_UNKNOWN;
+#endif
}
return entry;
@@ -647,7 +692,14 @@ static int local_chmod(FsContext *fs_ctx, V9fsPath *fs_path, FsCred *credp)
ret = local_set_mapped_file_attrat(dirfd, name, credp);
} else if (fs_ctx->export_flags & V9FS_SM_PASSTHROUGH ||
fs_ctx->export_flags & V9FS_SM_NONE) {
+#ifdef CONFIG_WIN32
+ errno = ENOTSUP;
+ error_report_once("chmod is not available on Windows host when "
+ "security_model is \"none\" or \"passthrough\"");
+ ret = -1;
+#else
ret = fchmodat_nofollow(dirfd, name, credp->fc_mode);
+#endif
}
close_preserve_errno(dirfd);
@@ -691,6 +743,12 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath *dir_path,
}
} else if (fs_ctx->export_flags & V9FS_SM_PASSTHROUGH ||
fs_ctx->export_flags & V9FS_SM_NONE) {
+#ifdef CONFIG_WIN32
+ errno = ENOTSUP;
+ error_report_once("mknod is not available on Windows host when "
+ "security_model is \"none\" or \"passthrough\"");
+ goto out;
+#else
err = qemu_mknodat(dirfd, name, credp->fc_mode, credp->fc_rdev);
if (err == -1) {
goto out;
@@ -699,6 +757,7 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath *dir_path,
if (err == -1) {
goto err_end;
}
+#endif
}
goto out;
@@ -748,10 +807,12 @@ static int local_mkdir(FsContext *fs_ctx, V9fsPath *dir_path,
if (err == -1) {
goto out;
}
+#ifndef CONFIG_WIN32
err = local_set_cred_passthrough(fs_ctx, dirfd, name, credp);
if (err == -1) {
goto err_end;
}
+#endif
}
goto out;
@@ -768,7 +829,12 @@ static int local_fstat(FsContext *fs_ctx, int fid_type,
int err, fd;
if (fid_type == P9_FID_DIR) {
+#ifdef CONFIG_WIN32
+ errno = ENOTSUP;
+ return -1; /* Windows does not allow opening a directory by open() */
+#else
fd = dirfd(fs->dir.stream);
+#endif
} else {
fd = fs->fd;
}
@@ -820,10 +886,10 @@ static int local_open2(FsContext *fs_ctx, V9fsPath *dir_path, const char *name,
return -1;
}
- /*
- * Mark all the open to not follow symlinks
- */
+#ifndef CONFIG_WIN32
+ /* Mark all the open to not follow symlinks */
flags |= O_NOFOLLOW;
+#endif
dirfd = local_opendir_nofollow(fs_ctx, dir_path->data);
if (dirfd == -1) {
@@ -853,10 +919,12 @@ static int local_open2(FsContext *fs_ctx, V9fsPath *dir_path, const char *name,
if (fd == -1) {
goto out;
}
+#ifndef CONFIG_WIN32
err = local_set_cred_passthrough(fs_ctx, dirfd, name, credp);
if (err == -1) {
goto err_end;
}
+#endif
}
err = fd;
fs->fd = fd;
@@ -921,6 +989,21 @@ static int local_symlink(FsContext *fs_ctx, const char *oldpath,
}
} else if (fs_ctx->export_flags & V9FS_SM_PASSTHROUGH ||
fs_ctx->export_flags & V9FS_SM_NONE) {
+#ifdef CONFIG_WIN32
+ /*
+ * Creating a symbolic link on Windows requires administrator privilege.
+ * And Windows does not provide any interface like readlink().
+ * All symbolic links on Windows are always absolute paths.
+ * They are not 100% compatible with POSIX symbolic links.
+ *
+ * For the above reasons, symbolic links with the "passthrough" or
+ * "none" security model are disabled on a Windows host.
+ */
+ errno = ENOTSUP;
+ error_report_once("symlink is not available on Windows host when "
+ "security_model is \"none\" or \"passthrough\"");
+ goto out;
+#else
err = symlinkat(oldpath, dirfd, name);
if (err) {
goto out;
@@ -938,6 +1021,7 @@ static int local_symlink(FsContext *fs_ctx, const char *oldpath,
err = 0;
}
}
+#endif
}
goto out;
@@ -951,6 +1035,11 @@ out:
static int local_link(FsContext *ctx, V9fsPath *oldpath,
V9fsPath *dirpath, const char *name)
{
+#ifdef CONFIG_WIN32
+ errno = ENOTSUP;
+ error_report_once("link is not available on Windows host");
+ return -1;
+#else
char *odirpath = g_path_get_dirname(oldpath->data);
char *oname = g_path_get_basename(oldpath->data);
int ret = -1;
@@ -1020,6 +1109,7 @@ out:
g_free(oname);
g_free(odirpath);
return ret;
+#endif
}
static int local_truncate(FsContext *ctx, V9fsPath *fs_path, off_t size)
@@ -1050,8 +1140,15 @@ static int local_chown(FsContext *fs_ctx, V9fsPath *fs_path, FsCred *credp)
if ((credp->fc_uid == -1 && credp->fc_gid == -1) ||
(fs_ctx->export_flags & V9FS_SM_PASSTHROUGH) ||
(fs_ctx->export_flags & V9FS_SM_NONE)) {
+#ifdef CONFIG_WIN32
+ errno = ENOTSUP;
+ error_report_once("chown is not available on Windows host when "
+ "security_model is \"none\" or \"passthrough\"");
+ ret = -1;
+#else
ret = fchownat(dirfd, name, credp->fc_uid, credp->fc_gid,
AT_SYMLINK_NOFOLLOW);
+#endif
} else if (fs_ctx->export_flags & V9FS_SM_MAPPED) {
ret = local_set_xattrat(dirfd, name, credp);
} else if (fs_ctx->export_flags & V9FS_SM_MAPPED_FILE) {
@@ -1163,6 +1260,12 @@ out:
static int local_fsync(FsContext *ctx, int fid_type,
V9fsFidOpenState *fs, int datasync)
{
+#ifdef CONFIG_WIN32
+ if (fid_type != P9_FID_DIR) {
+ return _commit(fs->fd);
+ }
+ return 0;
+#else
int fd;
if (fid_type == P9_FID_DIR) {
@@ -1176,11 +1279,14 @@ static int local_fsync(FsContext *ctx, int fid_type,
} else {
return fsync(fd);
}
+#endif
}
static int local_statfs(FsContext *s, V9fsPath *fs_path, struct statfs *stbuf)
{
- int fd, ret;
+ int ret;
+#ifndef CONFIG_WIN32
+ int fd;
fd = local_open_nofollow(s, fs_path->data, O_RDONLY, 0);
if (fd == -1) {
@@ -1188,39 +1294,65 @@ static int local_statfs(FsContext *s, V9fsPath *fs_path, struct statfs *stbuf)
}
ret = fstatfs(fd, stbuf);
close_preserve_errno(fd);
+#else
+ LocalData *data = (LocalData *)s->private;
+
+ ret = statfs_win32(data->root_path, stbuf);
+ if (ret == 0) {
+ /* use context address as fsid */
+ memcpy(&stbuf->f_fsid, &s, sizeof(intptr_t));
+ }
+#endif
+
return ret;
}
static ssize_t local_lgetxattr(FsContext *ctx, V9fsPath *fs_path,
const char *name, void *value, size_t size)
{
+#ifdef CONFIG_WIN32
+ return -1;
+#else
char *path = fs_path->data;
return v9fs_get_xattr(ctx, path, name, value, size);
+#endif
}
static ssize_t local_llistxattr(FsContext *ctx, V9fsPath *fs_path,
void *value, size_t size)
{
+#ifdef CONFIG_WIN32
+ return -1;
+#else
char *path = fs_path->data;
return v9fs_list_xattr(ctx, path, value, size);
+#endif
}
static int local_lsetxattr(FsContext *ctx, V9fsPath *fs_path, const char *name,
void *value, size_t size, int flags)
{
+#ifdef CONFIG_WIN32
+ return -1;
+#else
char *path = fs_path->data;
return v9fs_set_xattr(ctx, path, name, value, size, flags);
+#endif
}
static int local_lremovexattr(FsContext *ctx, V9fsPath *fs_path,
const char *name)
{
+#ifdef CONFIG_WIN32
+ return -1;
+#else
char *path = fs_path->data;
return v9fs_remove_xattr(ctx, path, name);
+#endif
}
static int local_name_to_path(FsContext *ctx, V9fsPath *dir_path,
@@ -1383,6 +1515,7 @@ static int local_unlinkat(FsContext *ctx, V9fsPath *dir,
return ret;
}
+#ifndef CONFIG_WIN32
#ifdef FS_IOC_GETVERSION
static int local_ioc_getversion(FsContext *ctx, V9fsPath *path,
mode_t st_mode, uint64_t *st_gen)
@@ -1432,11 +1565,90 @@ static int local_ioc_getversion_init(FsContext *ctx, LocalData *data, Error **er
#endif
return 0;
}
+#endif
-static int local_init(FsContext *ctx, Error **errp)
+#ifdef CONFIG_WIN32
+static int init_win32_root_directory(FsContext *ctx, LocalData *data,
+ Error **errp)
{
- LocalData *data = g_malloc(sizeof(*data));
+ HANDLE hRoot;
+ char *root_path;
+ DWORD SectorsPerCluster;
+ DWORD BytesPerSector;
+ DWORD NumberOfFreeClusters;
+ DWORD TotalNumberOfClusters;
+ char disk_root[4] = { 0 };
+
+ hRoot = CreateFile(ctx->fs_root, GENERIC_READ,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+ if (hRoot == INVALID_HANDLE_VALUE) {
+ error_setg_errno(errp, EINVAL, "cannot open %s", ctx->fs_root);
+ return -1;
+ }
+
+ if ((ctx->export_flags & V9FS_SM_MAPPED) != 0) {
+ wchar_t fs_name[MAX_PATH + 1] = {0};
+ wchar_t ntfs_name[5] = {'N', 'T', 'F', 'S'};
+
+ /* Get file system type name */
+ if (GetVolumeInformationByHandleW(hRoot, NULL, 0, NULL, NULL, NULL,
+ fs_name, MAX_PATH + 1) == 0) {
+ error_setg_errno(errp, EINVAL,
+ "cannot get file system information");
+ CloseHandle(hRoot);
+ return -1;
+ }
+
+ /*
+ * security_model=mapped(-xattr) requires a filesystem on Windows that
+ * supports Alternate Data Streams (ADS). NTFS is one of them, and is
+ * probably the most popular on Windows. It is fair enough to assume
+ * that Windows users use NTFS for the mapped security model.
+ */
+ if (wcscmp(fs_name, ntfs_name) != 0) {
+ CloseHandle(hRoot);
+ error_setg_errno(errp, EINVAL, "require NTFS file system");
+ return -1;
+ }
+ }
+
+ root_path = get_full_path_win32(hRoot, NULL);
+ if (root_path == NULL) {
+ CloseHandle(hRoot);
+ error_setg_errno(errp, EINVAL, "cannot get full root path");
+ return -1;
+ }
+
+ /* copy the first 3 characters for the root directory */
+ memcpy(disk_root, root_path, 3);
+ if (GetDiskFreeSpace(disk_root, &SectorsPerCluster, &BytesPerSector,
+ &NumberOfFreeClusters, &TotalNumberOfClusters) == 0) {
+ CloseHandle(hRoot);
+ error_setg_errno(errp, EINVAL, "cannot get file system block size");
+ return -1;
+ }
+
+ /*
+ * Holding the root handle prevents anyone else from deleting or
+ * replacing the root directory at runtime.
+ */
+
+ data->mountfd = _open_osfhandle((intptr_t)hRoot, _O_RDONLY);
+ data->root_path = root_path;
+ data->block_size = SectorsPerCluster * BytesPerSector;
+
+ return 0;
+}
+
+#endif
+
+static int local_init(FsContext *ctx, Error **errp)
+{
+ LocalData *data = g_malloc0(sizeof(*data));
+#ifndef CONFIG_WIN32
data->mountfd = open(ctx->fs_root, O_DIRECTORY | O_RDONLY);
if (data->mountfd == -1) {
error_setg_errno(errp, errno, "failed to open '%s'", ctx->fs_root);
@@ -1447,7 +1659,17 @@ static int local_init(FsContext *ctx, Error **errp)
close(data->mountfd);
goto err;
}
+#else
+ if (init_win32_root_directory(ctx, data, errp) != 0) {
+ goto err;
+ }
+ /*
+ * Always enable inode remapping since Windows file systems do not
+ * provide inode numbers.
+ */
+ ctx->export_flags |= V9FS_REMAP_INODES;
+#endif
if (ctx->export_flags & V9FS_SM_PASSTHROUGH) {
ctx->xops = passthrough_xattr_ops;
} else if (ctx->export_flags & V9FS_SM_MAPPED) {
@@ -1467,6 +1689,16 @@ static int local_init(FsContext *ctx, Error **errp)
return 0;
err:
+#ifdef CONFIG_WIN32
+ if (data->root_path != NULL) {
+ g_free(data->root_path);
+ }
+#endif
+
+ if (data->mountfd != -1) {
+ close(data->mountfd);
+ }
+
g_free(data);
return -1;
}
@@ -1479,6 +1711,11 @@ static void local_cleanup(FsContext *ctx)
return;
}
+#ifdef CONFIG_WIN32
+ if (data->root_path != NULL) {
+ g_free(data->root_path);
+ }
+#endif
close(data->mountfd);
g_free(data);
}
--
2.25.1
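init_win32_root_directory() in the patch above copies the first three characters of the resolved root path and passes them to GetDiskFreeSpace() as the drive root. A minimal, portable sketch of that extraction step follows. The helper name demo_drive_root() and its validation of the "X:\" shape are assumptions made for illustration; the patch itself copies the three characters unconditionally.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): copy the "X:\" prefix of a
 * full Windows path into out, which must hold at least 4 bytes.
 * Returns 0 on success, -1 if the path does not start with a drive letter.
 */
static int demo_drive_root(const char *full_path, char out[4])
{
    assert(full_path != NULL && out != NULL);

    /* Reject strings that are too short, and UNC-style paths. */
    if (strlen(full_path) < 3 ||
        !isalpha((unsigned char)full_path[0]) ||
        full_path[1] != ':' ||
        (full_path[2] != '\\' && full_path[2] != '/')) {
        return -1;
    }

    /* Keep only the drive letter, the colon and the separator. */
    memcpy(out, full_path, 3);
    out[3] = '\0';
    return 0;
}
```

A caller would hand the resulting 4-byte buffer to the disk query and treat -1 as "no drive letter available", for instance when the export path is a network share.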
* [PATCH v5 06/16] hw/9pfs: Support getting current directory offset for Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (4 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 05/16] hw/9pfs: Update the local fs driver to support Windows Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 07/16] hw/9pfs: Update helper qemu_stat_rdev() Bin Meng
` (11 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
On Windows, 'struct dirent' does not carry the current directory offset.
Update qemu_dirent_off() to support Windows.
While we are here, add a build time check to error out if a new
host does not implement this helper.
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
hw/9pfs/9p-util.h | 16 +++++++++++++---
hw/9pfs/9p-util-win32.c | 5 +++++
hw/9pfs/9p.c | 4 ++--
hw/9pfs/codir.c | 2 +-
4 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index c1c251fbd1..91f70a4c38 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -19,6 +19,10 @@
#define O_PATH_9P_UTIL 0
#endif
+/* forward declaration */
+union V9fsFidOpenState;
+struct V9fsState;
+
#if !defined(CONFIG_LINUX)
/*
@@ -147,6 +151,7 @@ struct dirent *readdir_win32(DIR *pDir);
void rewinddir_win32(DIR *pDir);
void seekdir_win32(DIR *pDir, long pos);
long telldir_win32(DIR *pDir);
+off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs);
#endif
static inline void close_preserve_errno(int fd)
@@ -220,12 +225,17 @@ ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
* so ensure it is manually injected earlier and call here when
* needed.
*/
-static inline off_t qemu_dirent_off(struct dirent *dent)
+static inline off_t qemu_dirent_off(struct dirent *dent, struct V9fsState *s,
+ union V9fsFidOpenState *fs)
{
-#ifdef CONFIG_DARWIN
+#if defined(CONFIG_DARWIN)
return dent->d_seekoff;
-#else
+#elif defined(CONFIG_LINUX)
return dent->d_off;
+#elif defined(CONFIG_WIN32)
+ return qemu_dirent_off_win32(s, fs);
+#else
+#error Missing qemu_dirent_off() implementation for this host system
#endif
}
diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c
index e9408f3c45..37d98a3e63 100644
--- a/hw/9pfs/9p-util-win32.c
+++ b/hw/9pfs/9p-util-win32.c
@@ -1420,3 +1420,8 @@ long telldir_win32(DIR *pDir)
return (long)stream->offset;
}
+
+off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs)
+{
+ return s->ops->telldir(&s->ctx, fs);
+}
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 9621ec1341..1b252c6eaf 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -2334,7 +2334,7 @@ static int coroutine_fn v9fs_do_readdir_with_stat(V9fsPDU *pdu,
count += len;
v9fs_stat_free(&v9stat);
v9fs_path_free(&path);
- saved_dir_pos = qemu_dirent_off(dent);
+ saved_dir_pos = qemu_dirent_off(dent, pdu->s, &fidp->fs);
}
v9fs_readdir_unlock(&fidp->fs.dir);
@@ -2535,7 +2535,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
qid.version = 0;
}
- off = qemu_dirent_off(dent);
+ off = qemu_dirent_off(dent, pdu->s, &fidp->fs);
v9fs_string_init(&name);
v9fs_string_sprintf(&name, "%s", dent->d_name);
diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c
index 7ba63be489..6d96e2d72b 100644
--- a/hw/9pfs/codir.c
+++ b/hw/9pfs/codir.c
@@ -167,7 +167,7 @@ static int do_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
}
size += len;
- saved_dir_pos = qemu_dirent_off(dent);
+ saved_dir_pos = qemu_dirent_off(dent, s, &fidp->fs);
}
/* restore (last) saved position */
--
2.25.1
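The patch above obtains the directory offset from the stream (through the backend's telldir) rather than from the entry, because the Windows 'struct dirent' has no offset field. The idea can be sketched with a small in-memory directory stream. All names here (struct demo_dir, demo_readdir(), demo_telldir()) are hypothetical and only loosely modelled on telldir_win32(), which returns the stream's stored offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory directory stream; names are illustrative only. */
struct demo_dir {
    const char *const *names; /* entry names */
    size_t count;             /* number of entries */
    size_t offset;            /* index of the next entry to return */
};

/* Return the next entry name, or NULL at the end of the directory. */
static const char *demo_readdir(struct demo_dir *dir)
{
    assert(dir != NULL);
    if (dir->offset >= dir->count) {
        return NULL;
    }
    return dir->names[dir->offset++];
}

/* Report the stream position, since entries carry no offset of their own. */
static long demo_telldir(const struct demo_dir *dir)
{
    assert(dir != NULL);
    return (long)dir->offset;
}
```

Calling demo_telldir() right after demo_readdir() yields the position of the entry that follows the one just read, which is the value a caller would save in order to resume the listing later.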
* [PATCH v5 07/16] hw/9pfs: Update helper qemu_stat_rdev()
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (5 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 06/16] hw/9pfs: Support getting current directory offset for Windows Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 08/16] hw/9pfs: Add a helper qemu_stat_blksize() Bin Meng
` (10 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
As the Windows host does not have a stat->st_rdev field, we use the
first 3 characters of the root path to build a device ID.
Co-developed-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
hw/9pfs/9p-util.h | 22 +++++++++++++++++++---
hw/9pfs/9p-util-win32.c | 18 ++++++++++++++++++
hw/9pfs/9p.c | 5 +++--
3 files changed, 40 insertions(+), 5 deletions(-)
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index 91f70a4c38..1fb54d0b97 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -22,8 +22,9 @@
/* forward declaration */
union V9fsFidOpenState;
struct V9fsState;
+struct FsContext;
-#if !defined(CONFIG_LINUX)
+#ifdef CONFIG_DARWIN
/*
* Generates a Linux device number (a.k.a. dev_t) for given device major
@@ -55,10 +56,12 @@ static inline uint64_t makedev_dotl(uint32_t dev_major, uint32_t dev_minor)
*/
static inline uint64_t host_dev_to_dotl_dev(dev_t dev)
{
-#ifdef CONFIG_LINUX
+#if defined(CONFIG_LINUX) || defined(CONFIG_WIN32)
return dev;
-#else
+#elif defined(CONFIG_DARWIN)
return makedev_dotl(major(dev), minor(dev));
+#else
+#error Missing host_dev_to_dotl_dev() implementation for this host system
#endif
}
@@ -152,6 +155,7 @@ void rewinddir_win32(DIR *pDir);
void seekdir_win32(DIR *pDir, long pos);
long telldir_win32(DIR *pDir);
off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs);
+uint64_t qemu_stat_rdev_win32(struct FsContext *fs_ctx);
#endif
static inline void close_preserve_errno(int fd)
@@ -269,6 +273,18 @@ static inline struct dirent *qemu_dirent_dup(struct dirent *dent)
return g_memdup(dent, sz);
}
+static inline uint64_t qemu_stat_rdev(const struct stat *stbuf,
+ struct FsContext *fs_ctx)
+{
+#if defined(CONFIG_LINUX) || defined(CONFIG_DARWIN)
+ return stbuf->st_rdev;
+#elif defined(CONFIG_WIN32)
+ return qemu_stat_rdev_win32(fs_ctx);
+#else
+#error Missing qemu_stat_rdev() implementation for this host system
+#endif
+}
+
/*
* As long as mknodat is not available on macOS, this workaround
* using pthread_fchdir_np is needed. qemu_mknodat is defined in
diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c
index 37d98a3e63..61bb572261 100644
--- a/hw/9pfs/9p-util-win32.c
+++ b/hw/9pfs/9p-util-win32.c
@@ -1425,3 +1425,21 @@ off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs)
{
return s->ops->telldir(&s->ctx, fs);
}
+
+uint64_t qemu_stat_rdev_win32(struct FsContext *fs_ctx)
+{
+ uint64_t rdev = 0;
+ LocalData *data = fs_ctx->private;
+
+ /*
+ * As the Windows host does not have a stat->st_rdev field, we use the
+ * first 3 characters of the root path to build a device ID.
+ *
+ * (A Windows root path always starts with a drive letter like "C:\")
+ */
+ if (data) {
+ memcpy(&rdev, data->root_path, 3);
+ }
+
+ return rdev;
+}
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 1b252c6eaf..ead727a12b 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1264,7 +1264,8 @@ static int coroutine_fn stat_to_v9stat(V9fsPDU *pdu, V9fsPath *path,
} else if (v9stat->mode & P9_STAT_MODE_DEVICE) {
v9fs_string_sprintf(&v9stat->extension, "%c %u %u",
S_ISCHR(stbuf->st_mode) ? 'c' : 'b',
- major(stbuf->st_rdev), minor(stbuf->st_rdev));
+ major(qemu_stat_rdev(stbuf, &pdu->s->ctx)),
+ minor(qemu_stat_rdev(stbuf, &pdu->s->ctx)));
} else if (S_ISDIR(stbuf->st_mode) || S_ISREG(stbuf->st_mode)) {
v9fs_string_sprintf(&v9stat->extension, "%s %lu",
"HARDLINKCOUNT", (unsigned long)stbuf->st_nlink);
@@ -1344,7 +1345,7 @@ static int stat_to_v9stat_dotl(V9fsPDU *pdu, const struct stat *stbuf,
v9lstat->st_nlink = stbuf->st_nlink;
v9lstat->st_uid = stbuf->st_uid;
v9lstat->st_gid = stbuf->st_gid;
- v9lstat->st_rdev = host_dev_to_dotl_dev(stbuf->st_rdev);
+ v9lstat->st_rdev = host_dev_to_dotl_dev(rdev);
v9lstat->st_size = stbuf->st_size;
v9lstat->st_blksize = stat_to_iounit(pdu, stbuf);
v9lstat->st_blocks = stbuf->st_blocks;
--
2.25.1
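qemu_stat_rdev_win32() above builds the device ID by copying three bytes of the root path into a zeroed 64-bit integer. The following sketch reproduces that packing in portable C. The function name demo_rdev_from_root() and the length check are assumptions added for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the approach above: pack the first three bytes
 * of a root path (e.g. "C:\") into an otherwise zeroed 64-bit id.
 */
static uint64_t demo_rdev_from_root(const char *root_path)
{
    uint64_t rdev = 0;

    assert(root_path != NULL);
    /* Only copy when three characters are actually present. */
    if (strlen(root_path) >= 3) {
        memcpy(&rdev, root_path, 3);
    }
    return rdev;
}
```

Two paths on the same drive map to the same ID and paths on different drives map to different IDs. Because the bytes are copied as-is, the numeric value depends on the host byte order; on a little-endian host "C:\" becomes 0x5C3A43.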
* [PATCH v5 08/16] hw/9pfs: Add a helper qemu_stat_blksize()
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (6 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 07/16] hw/9pfs: Update helper qemu_stat_rdev() Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 09/16] hw/9pfs: Disable unsupported flags and features for Windows Bin Meng
` (9 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
As the Windows host does not have a stat->st_blksize field, we use the
one we calculated in init_win32_root_directory().
Add a helper qemu_stat_blksize() and use it to avoid direct access to
stat->st_blksize.
Co-developed-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
hw/9pfs/9p-util.h | 13 +++++++++++++
hw/9pfs/9p-util-win32.c | 7 +++++++
hw/9pfs/9p.c | 13 ++++++++++++-
3 files changed, 32 insertions(+), 1 deletion(-)
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index 1fb54d0b97..ea8c116059 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -156,6 +156,7 @@ void seekdir_win32(DIR *pDir, long pos);
long telldir_win32(DIR *pDir);
off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs);
uint64_t qemu_stat_rdev_win32(struct FsContext *fs_ctx);
+uint64_t qemu_stat_blksize_win32(struct FsContext *fs_ctx);
#endif
static inline void close_preserve_errno(int fd)
@@ -285,6 +286,18 @@ static inline uint64_t qemu_stat_rdev(const struct stat *stbuf,
#endif
}
+static inline uint64_t qemu_stat_blksize(const struct stat *stbuf,
+ struct FsContext *fs_ctx)
+{
+#if defined(CONFIG_LINUX) || defined(CONFIG_DARWIN)
+ return stbuf->st_blksize;
+#elif defined(CONFIG_WIN32)
+ return qemu_stat_blksize_win32(fs_ctx);
+#else
+#error Missing qemu_stat_blksize() implementation for this host system
+#endif
+}
+
/*
* As long as mknodat is not available on macOS, this workaround
* using pthread_fchdir_np is needed. qemu_mknodat is defined in
diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c
index 61bb572261..ce7c5f7847 100644
--- a/hw/9pfs/9p-util-win32.c
+++ b/hw/9pfs/9p-util-win32.c
@@ -1443,3 +1443,10 @@ uint64_t qemu_stat_rdev_win32(struct FsContext *fs_ctx)
return rdev;
}
+
+uint64_t qemu_stat_blksize_win32(struct FsContext *fs_ctx)
+{
+ LocalData *data = fs_ctx->private;
+
+ return data ? (uint64_t)data->block_size : 0;
+}
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index ead727a12b..8858d7574c 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1333,12 +1333,14 @@ static int32_t blksize_to_iounit(const V9fsPDU *pdu, int32_t blksize)
static int32_t stat_to_iounit(const V9fsPDU *pdu, const struct stat *stbuf)
{
- return blksize_to_iounit(pdu, stbuf->st_blksize);
+ return blksize_to_iounit(pdu, qemu_stat_blksize(stbuf, &pdu->s->ctx));
}
static int stat_to_v9stat_dotl(V9fsPDU *pdu, const struct stat *stbuf,
V9fsStatDotl *v9lstat)
{
+ dev_t rdev = qemu_stat_rdev(stbuf, &pdu->s->ctx);
+
memset(v9lstat, 0, sizeof(*v9lstat));
v9lstat->st_mode = stbuf->st_mode;
@@ -1348,7 +1350,16 @@ static int stat_to_v9stat_dotl(V9fsPDU *pdu, const struct stat *stbuf,
v9lstat->st_rdev = host_dev_to_dotl_dev(rdev);
v9lstat->st_size = stbuf->st_size;
v9lstat->st_blksize = stat_to_iounit(pdu, stbuf);
+#if defined(CONFIG_LINUX) || defined(CONFIG_DARWIN)
v9lstat->st_blocks = stbuf->st_blocks;
+#elif defined(CONFIG_WIN32)
+ if (v9lstat->st_blksize == 0) {
+ v9lstat->st_blocks = 0;
+ } else {
+ v9lstat->st_blocks = ROUND_UP(v9lstat->st_size / v9lstat->st_blksize,
+ v9lstat->st_blksize);
+ }
+#endif
v9lstat->st_atime_sec = stbuf->st_atime;
v9lstat->st_mtime_sec = stbuf->st_mtime;
v9lstat->st_ctime_sec = stbuf->st_ctime;
--
2.25.1
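The st_blocks arithmetic in the Windows branch of the patch above can be traced with a small sketch. demo_round_up() is a stand-in for QEMU's ROUND_UP() and is only assumed to give the same result for the inputs shown. demo_blocks_512() is not in the patch; it is included for comparison, on the assumption that a Linux guest expects st_blocks in 512-byte units.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next multiple of d (d must be non-zero). */
static uint64_t demo_round_up(uint64_t n, uint64_t d)
{
    assert(d != 0);
    return ((n + d - 1) / d) * d;
}

/* Mirrors the expression in the Windows branch of the hunk above. */
static uint64_t demo_blocks_as_patched(uint64_t size, uint64_t blksize)
{
    if (blksize == 0) {
        return 0;
    }
    return demo_round_up(size / blksize, blksize);
}

/* For comparison: number of 512-byte units needed to hold size bytes. */
static uint64_t demo_blocks_512(uint64_t size)
{
    return (size + 511) / 512;
}
```

For a 10000-byte file and a block size of 4096, the patched expression evaluates to 4096, while the 512-byte count is 20. Whether the first value is what the guest should see is a question for review; the sketch only makes the arithmetic visible.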
* [PATCH v5 09/16] hw/9pfs: Disable unsupported flags and features for Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (7 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 08/16] hw/9pfs: Add a helper qemu_stat_blksize() Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 10/16] hw/9pfs: Update v9fs_set_fd_limit() " Bin Meng
` (8 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
Some flags and features are not supported on Windows, like mknod,
readlink, file mode, etc. Update the code for Windows.
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
hw/9pfs/9p.c | 45 ++++++++++++++++++++++++++++++++++++++-------
1 file changed, 38 insertions(+), 7 deletions(-)
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 8858d7574c..768f20f2ac 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -37,6 +37,11 @@
#include "qemu/xxhash.h"
#include <math.h>
+#ifdef CONFIG_WIN32
+#define UTIME_NOW ((1l << 30) - 1l)
+#define UTIME_OMIT ((1l << 30) - 2l)
+#endif
+
int open_fd_hw;
int total_open_fd;
static int open_fd_rc;
@@ -130,13 +135,17 @@ static int dotl_to_open_flags(int flags)
DotlOpenflagMap dotl_oflag_map[] = {
{ P9_DOTL_CREATE, O_CREAT },
{ P9_DOTL_EXCL, O_EXCL },
+#ifndef CONFIG_WIN32
{ P9_DOTL_NOCTTY , O_NOCTTY },
+#endif
{ P9_DOTL_TRUNC, O_TRUNC },
{ P9_DOTL_APPEND, O_APPEND },
+#ifndef CONFIG_WIN32
{ P9_DOTL_NONBLOCK, O_NONBLOCK } ,
{ P9_DOTL_DSYNC, O_DSYNC },
{ P9_DOTL_FASYNC, FASYNC },
-#ifndef CONFIG_DARWIN
+#endif
+#if !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32)
{ P9_DOTL_NOATIME, O_NOATIME },
/*
* On Darwin, we could map to F_NOCACHE, which is
@@ -149,8 +158,10 @@ static int dotl_to_open_flags(int flags)
#endif
{ P9_DOTL_LARGEFILE, O_LARGEFILE },
{ P9_DOTL_DIRECTORY, O_DIRECTORY },
+#ifndef CONFIG_WIN32
{ P9_DOTL_NOFOLLOW, O_NOFOLLOW },
{ P9_DOTL_SYNC, O_SYNC },
+#endif
};
for (i = 0; i < ARRAY_SIZE(dotl_oflag_map); i++) {
@@ -177,8 +188,11 @@ static int get_dotl_openflags(V9fsState *s, int oflags)
* Filter the client open flags
*/
flags = dotl_to_open_flags(oflags);
- flags &= ~(O_NOCTTY | O_ASYNC | O_CREAT);
-#ifndef CONFIG_DARWIN
+ flags &= ~(O_CREAT);
+#ifndef CONFIG_WIN32
+ flags &= ~(O_NOCTTY | O_ASYNC);
+#endif
+#if !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32)
/*
* Ignore direct disk access hint until the server supports it.
*/
@@ -1115,12 +1129,14 @@ static mode_t v9mode_to_mode(uint32_t mode, V9fsString *extension)
if (mode & P9_STAT_MODE_SYMLINK) {
ret |= S_IFLNK;
}
+#ifndef CONFIG_WIN32
if (mode & P9_STAT_MODE_SOCKET) {
ret |= S_IFSOCK;
}
if (mode & P9_STAT_MODE_NAMED_PIPE) {
ret |= S_IFIFO;
}
+#endif
if (mode & P9_STAT_MODE_DEVICE) {
if (extension->size && extension->data[0] == 'c') {
ret |= S_IFCHR;
@@ -1201,6 +1217,7 @@ static uint32_t stat_to_v9mode(const struct stat *stbuf)
mode |= P9_STAT_MODE_SYMLINK;
}
+#ifndef CONFIG_WIN32
if (S_ISSOCK(stbuf->st_mode)) {
mode |= P9_STAT_MODE_SOCKET;
}
@@ -1208,6 +1225,7 @@ static uint32_t stat_to_v9mode(const struct stat *stbuf)
if (S_ISFIFO(stbuf->st_mode)) {
mode |= P9_STAT_MODE_NAMED_PIPE;
}
+#endif
if (S_ISBLK(stbuf->st_mode) || S_ISCHR(stbuf->st_mode)) {
mode |= P9_STAT_MODE_DEVICE;
@@ -1367,7 +1385,8 @@ static int stat_to_v9stat_dotl(V9fsPDU *pdu, const struct stat *stbuf,
v9lstat->st_atime_nsec = stbuf->st_atimespec.tv_nsec;
v9lstat->st_mtime_nsec = stbuf->st_mtimespec.tv_nsec;
v9lstat->st_ctime_nsec = stbuf->st_ctimespec.tv_nsec;
-#else
+#endif
+#ifdef CONFIG_LINUX
v9lstat->st_atime_nsec = stbuf->st_atim.tv_nsec;
v9lstat->st_mtime_nsec = stbuf->st_mtim.tv_nsec;
v9lstat->st_ctime_nsec = stbuf->st_ctim.tv_nsec;
@@ -2490,6 +2509,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
struct dirent *dent;
struct stat *st;
struct V9fsDirEnt *entries = NULL;
+ unsigned char d_type = 0;
/*
* inode remapping requires the device id, which in turn might be
@@ -2551,10 +2571,13 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
v9fs_string_init(&name);
v9fs_string_sprintf(&name, "%s", dent->d_name);
+#ifndef CONFIG_WIN32
+ d_type = dent->d_type;
+#endif
/* 11 = 7 + 4 (7 = start offset, 4 = space for storing count) */
len = pdu_marshal(pdu, 11 + count, "Qqbs",
&qid, off,
- dent->d_type, &name);
+ d_type, &name);
v9fs_string_free(&name);
@@ -2910,8 +2933,12 @@ static void coroutine_fn v9fs_create(void *opaque)
v9fs_path_copy(&fidp->path, &path);
v9fs_path_unlock(s);
} else if (perm & P9_STAT_MODE_SOCKET) {
+#ifndef CONFIG_WIN32
err = v9fs_co_mknod(pdu, fidp, &name, fidp->uid, -1,
0, S_IFSOCK | (perm & 0777), &stbuf);
+#else
+ err = -ENOTSUP;
+#endif
if (err < 0) {
goto out;
}
@@ -3981,7 +4008,7 @@ out_nofid:
#if defined(CONFIG_LINUX)
/* Currently, only Linux has XATTR_SIZE_MAX */
#define P9_XATTR_SIZE_MAX XATTR_SIZE_MAX
-#elif defined(CONFIG_DARWIN)
+#elif defined(CONFIG_DARWIN) || defined(CONFIG_WIN32)
/*
* Darwin doesn't seem to define a maximum xattr size in its user
* space header, so manually configure it across platforms as 64k.
@@ -3998,6 +4025,8 @@ out_nofid:
static void coroutine_fn v9fs_xattrcreate(void *opaque)
{
+ V9fsPDU *pdu = opaque;
+#ifndef CONFIG_WIN32
int flags, rflags = 0;
int32_t fid;
uint64_t size;
@@ -4006,7 +4035,6 @@ static void coroutine_fn v9fs_xattrcreate(void *opaque)
size_t offset = 7;
V9fsFidState *file_fidp;
V9fsFidState *xattr_fidp;
- V9fsPDU *pdu = opaque;
v9fs_string_init(&name);
err = pdu_unmarshal(pdu, offset, "dsqd", &fid, &name, &size, &flags);
@@ -4059,6 +4087,9 @@ out_put_fid:
out_nofid:
pdu_complete(pdu, err);
v9fs_string_free(&name);
+#else
+ pdu_complete(pdu, -1);
+#endif
}
static void coroutine_fn v9fs_readlink(void *opaque)
--
2.25.1
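The flag-table change in the patch above compiles out entries for open flags that the Windows host lacks. A reduced sketch of the same pattern follows. The DEMO_* protocol bits are made-up values (the real P9_DOTL_* constants are defined in QEMU), and the sketch guards on the flag macro itself with #ifdef rather than on CONFIG_WIN32 as the patch does, so that it builds on any host.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* Illustrative protocol bits; these are not the real P9_DOTL_* values. */
enum {
    DEMO_CREATE = 1 << 0,
    DEMO_EXCL   = 1 << 1,
    DEMO_NOCTTY = 1 << 2,
    DEMO_TRUNC  = 1 << 3,
    DEMO_APPEND = 1 << 4,
};

struct demo_flag_map {
    int proto_flag;
    int host_flag;
};

/* Translate protocol bits to host open() flags, skipping unsupported ones. */
static int demo_to_host_flags(int proto_flags)
{
    static const struct demo_flag_map map[] = {
        { DEMO_CREATE, O_CREAT },
        { DEMO_EXCL,   O_EXCL },
#ifdef O_NOCTTY /* absent on some hosts, so the entry is compiled out */
        { DEMO_NOCTTY, O_NOCTTY },
#endif
        { DEMO_TRUNC,  O_TRUNC },
        { DEMO_APPEND, O_APPEND },
    };
    int host_flags = 0;
    size_t i;

    assert(proto_flags >= 0); /* negative values are not valid flag sets */
    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (proto_flags & map[i].proto_flag) {
            host_flags |= map[i].host_flag;
        }
    }
    return host_flags;
}
```

A protocol bit whose table entry is compiled out is silently dropped, which matches the effect of the #ifndef CONFIG_WIN32 blocks in the hunk: the client may request the flag, but the host never sees it.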
* [PATCH v5 10/16] hw/9pfs: Update v9fs_set_fd_limit() for Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (8 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 09/16] hw/9pfs: Disable unsupported flags and features for Windows Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 11/16] hw/9pfs: Add Linux error number definition Bin Meng
` (7 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
Use _getmaxstdio() to set the fd limit on Windows.
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
hw/9pfs/9p.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 768f20f2ac..6b2977f637 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -4394,11 +4394,28 @@ void v9fs_reset(V9fsState *s)
static void __attribute__((__constructor__)) v9fs_set_fd_limit(void)
{
+ int rlim_cur;
+ int ret;
+
+#ifndef CONFIG_WIN32
struct rlimit rlim;
- if (getrlimit(RLIMIT_NOFILE, &rlim) < 0) {
+ ret = getrlimit(RLIMIT_NOFILE, &rlim);
+ rlim_cur = rlim.rlim_cur;
+#else
+ /*
+ * On a Windows host, _getmaxstdio() returns the maximum number of
+ * simultaneously open files at the stdio level. It *may* be smaller than
+ * the number of files that can be opened with open() or CreateFile().
+ */
+ ret = _getmaxstdio();
+ rlim_cur = ret;
+#endif
+
+ if (ret < 0) {
error_report("Failed to get the resource limit");
exit(1);
}
- open_fd_hw = rlim.rlim_cur - MIN(400, rlim.rlim_cur / 3);
- open_fd_rc = rlim.rlim_cur / 2;
+
+ open_fd_hw = rlim_cur - MIN(400, rlim_cur / 3);
+ open_fd_rc = rlim_cur / 2;
}
--
2.25.1
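The two thresholds that v9fs_set_fd_limit() derives from the limit are plain integer arithmetic, so they can be checked in isolation. The sketch below repeats the formulas from the patch above in a pure function. The struct and function names are hypothetical, and 512 is used as a sample input on the assumption that it is a typical _getmaxstdio() result.

```c
#include <assert.h>

#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))

struct demo_fd_limits {
    int high_water; /* corresponds to open_fd_hw */
    int reclaim;    /* corresponds to open_fd_rc */
};

/* Derive both thresholds from the maximum number of open files. */
static struct demo_fd_limits demo_compute_fd_limits(int max_fds)
{
    struct demo_fd_limits l;

    assert(max_fds >= 0);
    l.high_water = max_fds - DEMO_MIN(400, max_fds / 3);
    l.reclaim = max_fds / 2;
    return l;
}
```

With a limit of 512 the thresholds come out as 342 and 256; with 4096 they are 3696 and 2048, because the MIN() term is capped at 400 for large limits.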
* [PATCH v5 11/16] hw/9pfs: Add Linux error number definition
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (9 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 10/16] hw/9pfs: Update v9fs_set_fd_limit() " Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 12/16] hw/9pfs: Translate Windows errno to Linux value Bin Meng
` (6 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
When using 9p2000.L protocol, the errno should use the Linux errno.
Currently magic numbers with comments are used. Replace these with
macros for future expansion.
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
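As a hedged illustration of the translation described in the commit message above, the sketch below maps a few host errno values to the Linux numbers listed in 9p-linux-errno.h. The DEMO_* macros and demo_errno_to_linux() are hypothetical names. The preprocessor guards are an addition so that the example also compiles on hosts where ENOATTR is undefined or where ENOTSUP and EOPNOTSUPP share one value; the patch does not need them because its switch is only built for Darwin.

```c
#include <assert.h>
#include <errno.h>

/* Linux errno values, as listed in 9p-linux-errno.h. */
#define DEMO_L_ENAMETOOLONG 36
#define DEMO_L_ENOTEMPTY    39
#define DEMO_L_ELOOP        40
#define DEMO_L_ENODATA      61
#define DEMO_L_EOPNOTSUPP   95

/* Translate a host errno to its Linux value; unknown values pass through. */
static int demo_errno_to_linux(int err)
{
    assert(err >= 0);
    switch (err) {
    case ENAMETOOLONG: return DEMO_L_ENAMETOOLONG;
    case ENOTEMPTY:    return DEMO_L_ENOTEMPTY;
    case ELOOP:        return DEMO_L_ELOOP;
#ifdef ENOATTR /* not defined by every host's <errno.h> */
    case ENOATTR:      return DEMO_L_ENODATA;
#endif
#if ENOTSUP != EOPNOTSUPP /* avoid a duplicate case label where they match */
    case ENOTSUP:      return DEMO_L_EOPNOTSUPP;
#endif
    case EOPNOTSUPP:   return DEMO_L_EOPNOTSUPP;
    default:           return err;
    }
}
```

Named constants make each case self-documenting, which is the motivation the commit message gives for replacing the magic numbers.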
hw/9pfs/9p-linux-errno.h | 151 +++++++++++++++++++++++++++++++++++++++
hw/9pfs/9p-util.h | 24 +++----
2 files changed, 162 insertions(+), 13 deletions(-)
create mode 100644 hw/9pfs/9p-linux-errno.h
diff --git a/hw/9pfs/9p-linux-errno.h b/hw/9pfs/9p-linux-errno.h
new file mode 100644
index 0000000000..56c37fa293
--- /dev/null
+++ b/hw/9pfs/9p-linux-errno.h
@@ -0,0 +1,151 @@
+/*
+ * 9p Linux errno translation definition
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include <errno.h>
+
+#ifndef QEMU_9P_LINUX_ERRNO_H
+#define QEMU_9P_LINUX_ERRNO_H
+
+/*
+ * This file contains the Linux errno definitions to translate errnos set by
+ * the 9P server (running on non-Linux hosts) to a corresponding errno value.
+ *
+ * This list should be periodically reviewed and updated; particularly for
+ * errnos that might be set as a result of a file system operation.
+ */
+
+#define L_EPERM 1 /* Operation not permitted */
+#define L_ENOENT 2 /* No such file or directory */
+#define L_ESRCH 3 /* No such process */
+#define L_EINTR 4 /* Interrupted system call */
+#define L_EIO 5 /* I/O error */
+#define L_ENXIO 6 /* No such device or address */
+#define L_E2BIG 7 /* Argument list too long */
+#define L_ENOEXEC 8 /* Exec format error */
+#define L_EBADF 9 /* Bad file number */
+#define L_ECHILD 10 /* No child processes */
+#define L_EAGAIN 11 /* Try again */
+#define L_ENOMEM 12 /* Out of memory */
+#define L_EACCES 13 /* Permission denied */
+#define L_EFAULT 14 /* Bad address */
+#define L_ENOTBLK 15 /* Block device required */
+#define L_EBUSY 16 /* Device or resource busy */
+#define L_EEXIST 17 /* File exists */
+#define L_EXDEV 18 /* Cross-device link */
+#define L_ENODEV 19 /* No such device */
+#define L_ENOTDIR 20 /* Not a directory */
+#define L_EISDIR 21 /* Is a directory */
+#define L_EINVAL 22 /* Invalid argument */
+#define L_ENFILE 23 /* File table overflow */
+#define L_EMFILE 24 /* Too many open files */
+#define L_ENOTTY 25 /* Not a typewriter */
+#define L_ETXTBSY 26 /* Text file busy */
+#define L_EFBIG 27 /* File too large */
+#define L_ENOSPC 28 /* No space left on device */
+#define L_ESPIPE 29 /* Illegal seek */
+#define L_EROFS 30 /* Read-only file system */
+#define L_EMLINK 31 /* Too many links */
+#define L_EPIPE 32 /* Broken pipe */
+#define L_EDOM 33 /* Math argument out of domain of func */
+#define L_ERANGE 34 /* Math result not representable */
+#define L_EDEADLK 35 /* Resource deadlock would occur */
+#define L_ENAMETOOLONG 36 /* File name too long */
+#define L_ENOLCK 37 /* No record locks available */
+#define L_ENOSYS 38 /* Function not implemented */
+#define L_ENOTEMPTY 39 /* Directory not empty */
+#define L_ELOOP 40 /* Too many symbolic links encountered */
+#define L_ENOMSG 42 /* No message of desired type */
+#define L_EIDRM 43 /* Identifier removed */
+#define L_ECHRNG 44 /* Channel number out of range */
+#define L_EL2NSYNC 45 /* Level 2 not synchronized */
+#define L_EL3HLT 46 /* Level 3 halted */
+#define L_EL3RST 47 /* Level 3 reset */
+#define L_ELNRNG 48 /* Link number out of range */
+#define L_EUNATCH 49 /* Protocol driver not attached */
+#define L_ENOCSI 50 /* No CSI structure available */
+#define L_EL2HLT 51 /* Level 2 halted */
+#define L_EBADE 52 /* Invalid exchange */
+#define L_EBADR 53 /* Invalid request descriptor */
+#define L_EXFULL 54 /* Exchange full */
+#define L_ENOANO 55 /* No anode */
+#define L_EBADRQC 56 /* Invalid request code */
+#define L_EBADSLT 57 /* Invalid slot */
+#define L_EBFONT 58 /* Bad font file format */
+#define L_ENOSTR 59 /* Device not a stream */
+#define L_ENODATA 61 /* No data available */
+#define L_ETIME 62 /* Timer expired */
+#define L_ENOSR 63 /* Out of streams resources */
+#define L_ENONET 64 /* Machine is not on the network */
+#define L_ENOPKG 65 /* Package not installed */
+#define L_EREMOTE 66 /* Object is remote */
+#define L_ENOLINK 67 /* Link has been severed */
+#define L_EADV 68 /* Advertise error */
+#define L_ESRMNT 69 /* Srmount error */
+#define L_ECOMM 70 /* Communication error on send */
+#define L_EPROTO 71 /* Protocol error */
+#define L_EMULTIHOP 72 /* Multihop attempted */
+#define L_EDOTDOT 73 /* RFS specific error */
+#define L_EBADMSG 74 /* Not a data message */
+#define L_EOVERFLOW 75 /* Value too large for defined data type */
+#define L_ENOTUNIQ 76 /* Name not unique on network */
+#define L_EBADFD 77 /* File descriptor in bad state */
+#define L_EREMCHG 78 /* Remote address changed */
+#define L_ELIBACC 79 /* Can not access a needed shared library */
+#define L_ELIBBAD 80 /* Accessing a corrupted shared library */
+#define L_ELIBSCN 81 /* .lib section in a.out corrupted */
+#define L_ELIBMAX 82 /* Attempting to link in too many shared libs */
+#define L_ELIBEXEC 83 /* Cannot exec a shared library directly */
+#define L_EILSEQ 84 /* Illegal byte sequence */
+#define L_ERESTART 85 /* Interrupted system call should be restarted */
+#define L_ESTRPIPE 86 /* Streams pipe error */
+#define L_EUSERS 87 /* Too many users */
+#define L_ENOTSOCK 88 /* Socket operation on non-socket */
+#define L_EDESTADDRREQ 89 /* Destination address required */
+#define L_EMSGSIZE 90 /* Message too long */
+#define L_EPROTOTYPE 91 /* Protocol wrong type for socket */
+#define L_ENOPROTOOPT 92 /* Protocol not available */
+#define L_EPROTONOSUPPORT 93 /* Protocol not supported */
+#define L_ESOCKTNOSUPPORT 94 /* Socket type not supported */
+#define L_EOPNOTSUPP 95 /* Operation not supported on transport endpoint */
+#define L_EPFNOSUPPORT 96 /* Protocol family not supported */
+#define L_EAFNOSUPPORT 97 /* Address family not supported by protocol */
+#define L_EADDRINUSE 98 /* Address already in use */
+#define L_EADDRNOTAVAIL 99 /* Cannot assign requested address */
+#define L_ENETDOWN 100 /* Network is down */
+#define L_ENETUNREACH 101 /* Network is unreachable */
+#define L_ENETRESET 102 /* Network dropped connection because of reset */
+#define L_ECONNABORTED 103 /* Software caused connection abort */
+#define L_ECONNRESET 104 /* Connection reset by peer */
+#define L_ENOBUFS 105 /* No buffer space available */
+#define L_EISCONN 106 /* Transport endpoint is already connected */
+#define L_ENOTCONN 107 /* Transport endpoint is not connected */
+#define L_ESHUTDOWN 108 /* Cannot send after transport endpoint shutdown */
+#define L_ETOOMANYREFS 109 /* Too many references: cannot splice */
+#define L_ETIMEDOUT 110 /* Connection timed out */
+#define L_ECONNREFUSED 111 /* Connection refused */
+#define L_EHOSTDOWN 112 /* Host is down */
+#define L_EHOSTUNREACH 113 /* No route to host */
+#define L_EALREADY 114 /* Operation already in progress */
+#define L_EINPROGRESS 115 /* Operation now in progress */
+#define L_ESTALE 116 /* Stale NFS file handle */
+#define L_EUCLEAN 117 /* Structure needs cleaning */
+#define L_ENOTNAM 118 /* Not a XENIX named type file */
+#define L_ENAVAIL 119 /* No XENIX semaphores available */
+#define L_EISNAM 120 /* Is a named type file */
+#define L_EREMOTEIO 121 /* Remote I/O error */
+#define L_EDQUOT 122 /* Quota exceeded */
+#define L_ENOMEDIUM 123 /* No medium found */
+#define L_EMEDIUMTYPE 124 /* Wrong medium type */
+#define L_ECANCELED 125 /* Operation Canceled */
+#define L_ENOKEY 126 /* Required key not available */
+#define L_EKEYEXPIRED 127 /* Key has expired */
+#define L_EKEYREVOKED 128 /* Key has been revoked */
+#define L_EKEYREJECTED 129 /* Key was rejected by service */
+#define L_EOWNERDEAD 130 /* Owner died */
+#define L_ENOTRECOVERABLE 131 /* State not recoverable */
+
+#endif /* QEMU_9P_LINUX_ERRNO_H */
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index ea8c116059..778352b8ec 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -65,8 +65,11 @@ static inline uint64_t host_dev_to_dotl_dev(dev_t dev)
#endif
}
+#include "9p-linux-errno.h"
+
/* Translates errno from host -> Linux if needed */
-static inline int errno_to_dotl(int err) {
+static inline int errno_to_dotl(int err)
+{
#if defined(CONFIG_LINUX)
/* nothing to translate (Linux -> Linux) */
#elif defined(CONFIG_DARWIN)
@@ -76,18 +79,13 @@ static inline int errno_to_dotl(int err) {
* FIXME: Only most important errnos translated here yet, this should be
* extended to as many errnos being translated as possible in future.
*/
- if (err == ENAMETOOLONG) {
- err = 36; /* ==ENAMETOOLONG on Linux */
- } else if (err == ENOTEMPTY) {
- err = 39; /* ==ENOTEMPTY on Linux */
- } else if (err == ELOOP) {
- err = 40; /* ==ELOOP on Linux */
- } else if (err == ENOATTR) {
- err = 61; /* ==ENODATA on Linux */
- } else if (err == ENOTSUP) {
- err = 95; /* ==EOPNOTSUPP on Linux */
- } else if (err == EOPNOTSUPP) {
- err = 95; /* ==EOPNOTSUPP on Linux */
+ switch (err) {
+ case ENAMETOOLONG: return L_ENAMETOOLONG;
+ case ENOTEMPTY: return L_ENOTEMPTY;
+ case ELOOP: return L_ELOOP;
+ case ENOATTR: return L_ENODATA;
+ case ENOTSUP: return L_EOPNOTSUPP;
+ case EOPNOTSUPP: return L_EOPNOTSUPP;
}
#else
#error Missing errno translation to Linux for this host system
--
2.25.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 12/16] hw/9pfs: Translate Windows errno to Linux value
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (10 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 11/16] hw/9pfs: Add Linux error number definition Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 13/16] fsdev: Disable proxy fs driver on Windows Bin Meng
` (5 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
Some Windows error numbers have different values from their Linux
counterparts. For example, ENOTEMPTY is defined as 39 on Linux but as
41 on Windows. As a result, deleting a directory from a Linux guest
running on QEMU on a Windows host fails with:
# rmdir tmp
rmdir: 'tmp': Unknown error 41
This commit translates error numbers from Windows to Linux values, so
that the Linux guest receives the error numbers it expects when running
on QEMU on a Windows host.
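As a rough illustration of the idea (not the code from this patch), the
translation can be sketched as a switch that maps the host's errno
constants to fixed Linux values. The LINUX_* names and the function
host_errno_to_linux() below are hypothetical and exist only for this
sketch; the numeric values 36, 39 and 40 are the Linux values already
quoted in 9p-util.h.

```c
#include <errno.h>

/*
 * Fixed Linux errno values. The LINUX_* names are hypothetical and used
 * only in this sketch; the series itself uses L_* macros from
 * 9p-linux-errno.h.
 */
enum {
    LINUX_ENAMETOOLONG = 36,
    LINUX_ENOTEMPTY    = 39,
    LINUX_ELOOP        = 40,
};

/*
 * Map a host errno to the value a Linux guest expects. Only a few
 * errnos are handled; anything else is passed through unchanged.
 */
static int host_errno_to_linux(int err)
{
    switch (err) {
    case ENAMETOOLONG: return LINUX_ENAMETOOLONG;
    case ENOTEMPTY:    return LINUX_ENOTEMPTY;
#ifdef ELOOP
    case ELOOP:        return LINUX_ELOOP;
#endif
    default:           return err; /* untranslated: assume same value */
    }
}
```

With such a mapping, a host-side ENOTEMPTY (41 on Windows) would reach
the guest as 39, which the guest's rmdir can report properly.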
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
hw/9pfs/9p-util.h | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index 778352b8ec..824ac81ad3 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -72,9 +72,9 @@ static inline int errno_to_dotl(int err)
{
#if defined(CONFIG_LINUX)
/* nothing to translate (Linux -> Linux) */
-#elif defined(CONFIG_DARWIN)
+#elif defined(CONFIG_DARWIN) || defined(CONFIG_WIN32)
/*
- * translation mandatory for macOS hosts
+ * translation mandatory for different hosts
*
* FIXME: Only most important errnos translated here yet, this should be
* extended to as many errnos being translated as possible in future.
@@ -83,9 +83,17 @@ static inline int errno_to_dotl(int err)
case ENAMETOOLONG: return L_ENAMETOOLONG;
case ENOTEMPTY: return L_ENOTEMPTY;
case ELOOP: return L_ELOOP;
+#ifdef CONFIG_DARWIN
case ENOATTR: return L_ENODATA;
case ENOTSUP: return L_EOPNOTSUPP;
case EOPNOTSUPP: return L_EOPNOTSUPP;
+#endif
+#ifdef CONFIG_WIN32
+ case EDEADLK: return L_EDEADLK;
+ case ENOLCK: return L_ENOLCK;
+ case ENOSYS: return L_ENOSYS;
+ case EILSEQ: return L_EILSEQ;
+#endif
}
#else
#error Missing errno translation to Linux for this host system
--
2.25.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 13/16] fsdev: Disable proxy fs driver on Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (11 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 12/16] hw/9pfs: Translate Windows errno to Linux value Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-03-06 9:28 ` Philippe Mathieu-Daudé
2023-02-20 10:08 ` [PATCH v5 14/16] hw/9pfs: Update synth fs driver for Windows Bin Meng
` (4 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
We don't plan to support 'proxy' file system driver for 9pfs on
Windows. Disable it for Windows build.
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
fsdev/qemu-fsdev.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
index 3da64e9f72..58e0710fbb 100644
--- a/fsdev/qemu-fsdev.c
+++ b/fsdev/qemu-fsdev.c
@@ -89,6 +89,7 @@ static FsDriverTable FsDrivers[] = {
NULL
},
},
+#ifndef CONFIG_WIN32
{
.name = "proxy",
.ops = &proxy_ops,
@@ -100,6 +101,7 @@ static FsDriverTable FsDrivers[] = {
NULL
},
},
+#endif
};
static int validate_opt(void *opaque, const char *name, const char *value,
--
2.25.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 14/16] hw/9pfs: Update synth fs driver for Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (12 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 13/16] fsdev: Disable proxy fs driver on Windows Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 15/16] tests/qtest: virtio-9p-test: Adapt the case for win32 Bin Meng
` (3 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel
Cc: Guohuai Shi, Philippe Mathieu-Daudé
From: Guohuai Shi <guohuai.shi@windriver.com>
Adapt the synth fs driver for Windows in preparation for running qtest
9p testing on Windows.
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
hw/9pfs/9p-synth.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
index f62c40b639..b1a362a689 100644
--- a/hw/9pfs/9p-synth.c
+++ b/hw/9pfs/9p-synth.c
@@ -146,8 +146,10 @@ static void synth_fill_statbuf(V9fsSynthNode *node, struct stat *stbuf)
stbuf->st_gid = 0;
stbuf->st_rdev = 0;
stbuf->st_size = 0;
+#ifndef CONFIG_WIN32
stbuf->st_blksize = 0;
stbuf->st_blocks = 0;
+#endif
stbuf->st_atime = 0;
stbuf->st_mtime = 0;
stbuf->st_ctime = 0;
@@ -230,7 +232,8 @@ static void synth_direntry(V9fsSynthNode *node,
entry->d_ino = node->attr->inode;
#ifdef CONFIG_DARWIN
entry->d_seekoff = off + 1;
-#else
+#endif
+#ifdef CONFIG_LINUX
entry->d_off = off + 1;
#endif
}
--
2.25.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 15/16] tests/qtest: virtio-9p-test: Adapt the case for win32
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (13 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 14/16] hw/9pfs: Update synth fs driver for Windows Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 16/16] meson.build: Turn on virtfs for Windows Bin Meng
` (2 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel
Cc: Guohuai Shi, Xuzhou Cheng, Thomas Huth
From: Guohuai Shi <guohuai.shi@windriver.com>
Windows does not provide the getuid() API. Let's create a local
one and return a fixed value 0 as the uid for testing.
Co-developed-by: Xuzhou Cheng <xuzhou.cheng@windriver.com>
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
---
tests/qtest/libqos/virtio-9p-client.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/tests/qtest/libqos/virtio-9p-client.h b/tests/qtest/libqos/virtio-9p-client.h
index 78228eb97d..a5c0107580 100644
--- a/tests/qtest/libqos/virtio-9p-client.h
+++ b/tests/qtest/libqos/virtio-9p-client.h
@@ -491,4 +491,11 @@ void v9fs_rlink(P9Req *req);
TunlinkatRes v9fs_tunlinkat(TunlinkatOpt);
void v9fs_runlinkat(P9Req *req);
+#ifdef CONFIG_WIN32
+static inline uint32_t getuid(void)
+{
+ return 0;
+}
+#endif
+
#endif
--
2.25.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 16/16] meson.build: Turn on virtfs for Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (14 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 15/16] tests/qtest: virtio-9p-test: Adapt the case for win32 Bin Meng
@ 2023-02-20 10:08 ` Bin Meng
2023-03-13 12:53 ` Christian Schoenebeck
2023-03-06 6:04 ` [PATCH v5 00/16] hw/9pfs: Add 9pfs support " Bin Meng
2023-03-06 14:15 ` Christian Schoenebeck
17 siblings, 1 reply; 32+ messages in thread
From: Bin Meng @ 2023-02-20 10:08 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
From: Guohuai Shi <guohuai.shi@windriver.com>
Enable virtfs configuration option for Windows host.
Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
meson.build | 10 +++++-----
fsdev/meson.build | 1 +
hw/9pfs/meson.build | 8 +++++---
3 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/meson.build b/meson.build
index a76c855312..9ddf254e78 100644
--- a/meson.build
+++ b/meson.build
@@ -1755,16 +1755,16 @@ dbus_display = get_option('dbus_display') \
.allowed()
have_virtfs = get_option('virtfs') \
- .require(targetos == 'linux' or targetos == 'darwin',
- error_message: 'virtio-9p (virtfs) requires Linux or macOS') \
- .require(targetos == 'linux' or cc.has_function('pthread_fchdir_np'),
+ .require(targetos == 'linux' or targetos == 'darwin' or targetos == 'windows',
+ error_message: 'virtio-9p (virtfs) requires Linux or macOS or Windows') \
+ .require(targetos == 'linux' or targetos == 'windows' or cc.has_function('pthread_fchdir_np'),
error_message: 'virtio-9p (virtfs) on macOS requires the presence of pthread_fchdir_np') \
- .require(targetos == 'darwin' or (libattr.found() and libcap_ng.found()),
+ .require(targetos == 'darwin' or targetos == 'windows' or (libattr.found() and libcap_ng.found()),
error_message: 'virtio-9p (virtfs) on Linux requires libcap-ng-devel and libattr-devel') \
.disable_auto_if(not have_tools and not have_system) \
.allowed()
-have_virtfs_proxy_helper = targetos != 'darwin' and have_virtfs and have_tools
+have_virtfs_proxy_helper = targetos != 'darwin' and targetos != 'windows' and have_virtfs and have_tools
if get_option('block_drv_ro_whitelist') == ''
config_host_data.set('CONFIG_BDRV_RO_WHITELIST', '')
diff --git a/fsdev/meson.build b/fsdev/meson.build
index b632b66348..2aad081aef 100644
--- a/fsdev/meson.build
+++ b/fsdev/meson.build
@@ -8,6 +8,7 @@ fsdev_ss.add(when: ['CONFIG_FSDEV_9P'], if_true: files(
), if_false: files('qemu-fsdev-dummy.c'))
softmmu_ss.add_all(when: 'CONFIG_LINUX', if_true: fsdev_ss)
softmmu_ss.add_all(when: 'CONFIG_DARWIN', if_true: fsdev_ss)
+softmmu_ss.add_all(when: 'CONFIG_WIN32', if_true: fsdev_ss)
if have_virtfs_proxy_helper
executable('virtfs-proxy-helper',
diff --git a/hw/9pfs/meson.build b/hw/9pfs/meson.build
index 12443b6ad5..aaa50e71f7 100644
--- a/hw/9pfs/meson.build
+++ b/hw/9pfs/meson.build
@@ -2,7 +2,6 @@ fs_ss = ss.source_set()
fs_ss.add(files(
'9p-local.c',
'9p-posix-acl.c',
- '9p-proxy.c',
'9p-synth.c',
'9p-xattr-user.c',
'9p-xattr.c',
@@ -13,8 +12,11 @@ fs_ss.add(files(
'coth.c',
'coxattr.c',
))
-fs_ss.add(when: 'CONFIG_LINUX', if_true: files('9p-util-linux.c'))
-fs_ss.add(when: 'CONFIG_DARWIN', if_true: files('9p-util-darwin.c'))
+fs_ss.add(when: 'CONFIG_LINUX', if_true: files('9p-proxy.c',
+ '9p-util-linux.c'))
+fs_ss.add(when: 'CONFIG_DARWIN', if_true: files('9p-proxy.c',
+ '9p-util-darwin.c'))
+fs_ss.add(when: 'CONFIG_WIN32', if_true: files('9p-util-win32.c'))
fs_ss.add(when: 'CONFIG_XEN', if_true: files('xen-9p-backend.c'))
softmmu_ss.add_all(when: 'CONFIG_FSDEV_9P', if_true: fs_ss)
--
2.25.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (15 preceding siblings ...)
2023-02-20 10:08 ` [PATCH v5 16/16] meson.build: Turn on virtfs for Windows Bin Meng
@ 2023-03-06 6:04 ` Bin Meng
2023-03-06 14:15 ` Christian Schoenebeck
17 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-03-06 6:04 UTC (permalink / raw)
To: Bin Meng; +Cc: Christian Schoenebeck, Greg Kurz, qemu-devel
On Mon, Feb 20, 2023 at 6:10 PM Bin Meng <bin.meng@windriver.com> wrote:
>
> At present there is no Windows support for 9p file system.
> This series adds initial Windows support for 9p file system.
>
> 'local' file system backend driver is supported on Windows,
> including open, read, write, close, rename, remove, etc.
> All security models are supported. The mapped (mapped-xattr)
> security model is implemented using NTFS Alternate Data Stream
> (ADS) so the 9p export path shall be on an NTFS partition.
>
> 'synth' driver is adapted for Windows too so that we can now
> run qtests on Windows for 9p related regression testing.
>
> Example command line to test:
> "-fsdev local,path=c:\msys64,security_model=mapped,id=p9 -device virtio-9p-pci,fsdev=p9,mount_tag=p9fs"
>
> Changes in v5:
> - rework Windows specific xxxdir() APIs implementation
>
> Bin Meng (2):
> hw/9pfs: Update helper qemu_stat_rdev()
> hw/9pfs: Add a helper qemu_stat_blksize()
>
> Guohuai Shi (14):
> hw/9pfs: Add missing definitions for Windows
> hw/9pfs: Implement Windows specific utilities functions for 9pfs
> hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper
> hw/9pfs: Implement Windows specific xxxdir() APIs
> hw/9pfs: Update the local fs driver to support Windows
> hw/9pfs: Support getting current directory offset for Windows
> hw/9pfs: Disable unsupported flags and features for Windows
> hw/9pfs: Update v9fs_set_fd_limit() for Windows
> hw/9pfs: Add Linux error number definition
> hw/9pfs: Translate Windows errno to Linux value
> fsdev: Disable proxy fs driver on Windows
> hw/9pfs: Update synth fs driver for Windows
> tests/qtest: virtio-9p-test: Adapt the case for win32
> meson.build: Turn on virtfs for Windows
Ping?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 13/16] fsdev: Disable proxy fs driver on Windows
2023-02-20 10:08 ` [PATCH v5 13/16] fsdev: Disable proxy fs driver on Windows Bin Meng
@ 2023-03-06 9:28 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 32+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-06 9:28 UTC (permalink / raw)
To: Bin Meng, Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
On 20/2/23 11:08, Bin Meng wrote:
> From: Guohuai Shi <guohuai.shi@windriver.com>
>
> We don't plan to support 'proxy' file system driver for 9pfs on
> Windows. Disable it for Windows build.
>
> Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> ---
>
> fsdev/qemu-fsdev.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
> index 3da64e9f72..58e0710fbb 100644
> --- a/fsdev/qemu-fsdev.c
> +++ b/fsdev/qemu-fsdev.c
> @@ -89,6 +89,7 @@ static FsDriverTable FsDrivers[] = {
> NULL
> },
> },
> +#ifndef CONFIG_WIN32
> {
> .name = "proxy",
> .ops = &proxy_ops,
> @@ -100,6 +101,7 @@ static FsDriverTable FsDrivers[] = {
> NULL
> },
> },
> +#endif
> };
Probably the meson changes moving '9p-proxy.c' in hw/9pfs/meson.build
(patch 16) belong here.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 03/16] hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper
2023-02-20 10:08 ` [PATCH v5 03/16] hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper Bin Meng
@ 2023-03-06 9:31 ` Philippe Mathieu-Daudé
2023-03-06 9:35 ` Bin Meng
0 siblings, 1 reply; 32+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-06 9:31 UTC (permalink / raw)
To: Bin Meng, Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Guohuai Shi
On 20/2/23 11:08, Bin Meng wrote:
> From: Guohuai Shi <guohuai.shi@windriver.com>
>
> xxxdir() APIs are not safe on Windows host. For future extension to
> Windows, let's replace the direct call to xxxdir() APIs with a wrapper.
>
> Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> ---
>
> hw/9pfs/9p-util.h | 14 ++++++++++++++
> hw/9pfs/9p-local.c | 12 ++++++------
> 2 files changed, 20 insertions(+), 6 deletions(-)
> +#define qemu_opendir opendir_win32
> +#define qemu_closedir closedir_win32
> +#define qemu_readdir readdir_win32
> +#define qeme_rewinddir rewinddir_win32
Typo :)
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 03/16] hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper
2023-03-06 9:31 ` Philippe Mathieu-Daudé
@ 2023-03-06 9:35 ` Bin Meng
0 siblings, 0 replies; 32+ messages in thread
From: Bin Meng @ 2023-03-06 9:35 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: Bin Meng, Christian Schoenebeck, Greg Kurz, qemu-devel, Guohuai Shi
On Mon, Mar 6, 2023 at 5:32 PM Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>
> On 20/2/23 11:08, Bin Meng wrote:
> > From: Guohuai Shi <guohuai.shi@windriver.com>
> >
> > xxxdir() APIs are not safe on Windows host. For future extension to
> > Windows, let's replace the direct call to xxxdir() APIs with a wrapper.
> >
> > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
> > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > ---
> >
> > hw/9pfs/9p-util.h | 14 ++++++++++++++
> > hw/9pfs/9p-local.c | 12 ++++++------
> > 2 files changed, 20 insertions(+), 6 deletions(-)
>
>
> > +#define qemu_opendir opendir_win32
> > +#define qemu_closedir closedir_win32
> > +#define qemu_readdir readdir_win32
> > +#define qeme_rewinddir rewinddir_win32
>
> Typo :)
>
Ouch! Thanks Philippe :)
Regards,
Bin
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
` (16 preceding siblings ...)
2023-03-06 6:04 ` [PATCH v5 00/16] hw/9pfs: Add 9pfs support " Bin Meng
@ 2023-03-06 14:15 ` Christian Schoenebeck
2023-03-06 14:30 ` Philippe Mathieu-Daudé
2023-03-06 14:56 ` Bin Meng
17 siblings, 2 replies; 32+ messages in thread
From: Christian Schoenebeck @ 2023-03-06 14:15 UTC (permalink / raw)
To: Greg Kurz, qemu-devel; +Cc: Bin Meng
On Monday, February 20, 2023 11:07:59 AM CET Bin Meng wrote:
> At present there is no Windows support for 9p file system.
> This series adds initial Windows support for 9p file system.
>
> 'local' file system backend driver is supported on Windows,
> including open, read, write, close, rename, remove, etc.
> All security models are supported. The mapped (mapped-xattr)
> security model is implemented using NTFS Alternate Data Stream
> (ADS) so the 9p export path shall be on an NTFS partition.
>
> 'synth' driver is adapted for Windows too so that we can now
> run qtests on Windows for 9p related regression testing.
>
> Example command line to test:
> "-fsdev local,path=c:\msys64,security_model=mapped,id=p9 -device
virtio-9p-pci,fsdev=p9,mount_tag=p9fs"
>
> Changes in v5:
> - rework Windows specific xxxdir() APIs implementation
I didn't have the chance to look at this v5 yet.
In general it would help for review to point out in the cover letter which
patch(es) have changed, what decisions you have made and why.
In this case I guess that's patch 4.
Best regards,
Christian Schoenebeck
> Bin Meng (2):
> hw/9pfs: Update helper qemu_stat_rdev()
> hw/9pfs: Add a helper qemu_stat_blksize()
>
> Guohuai Shi (14):
> hw/9pfs: Add missing definitions for Windows
> hw/9pfs: Implement Windows specific utilities functions for 9pfs
> hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper
> hw/9pfs: Implement Windows specific xxxdir() APIs
> hw/9pfs: Update the local fs driver to support Windows
> hw/9pfs: Support getting current directory offset for Windows
> hw/9pfs: Disable unsupported flags and features for Windows
> hw/9pfs: Update v9fs_set_fd_limit() for Windows
> hw/9pfs: Add Linux error number definition
> hw/9pfs: Translate Windows errno to Linux value
> fsdev: Disable proxy fs driver on Windows
> hw/9pfs: Update synth fs driver for Windows
> tests/qtest: virtio-9p-test: Adapt the case for win32
> meson.build: Turn on virtfs for Windows
>
> meson.build | 10 +-
> fsdev/file-op-9p.h | 33 +
> hw/9pfs/9p-linux-errno.h | 151 +++
> hw/9pfs/9p-local.h | 8 +
> hw/9pfs/9p-util.h | 139 ++-
> hw/9pfs/9p.h | 43 +
> tests/qtest/libqos/virtio-9p-client.h | 7 +
> fsdev/qemu-fsdev.c | 2 +
> hw/9pfs/9p-local.c | 269 ++++-
> hw/9pfs/9p-synth.c | 5 +-
> hw/9pfs/9p-util-win32.c | 1452 +++++++++++++++++++++++++
> hw/9pfs/9p.c | 90 +-
> hw/9pfs/codir.c | 2 +-
> fsdev/meson.build | 1 +
> hw/9pfs/meson.build | 8 +-
> 15 files changed, 2155 insertions(+), 65 deletions(-)
> create mode 100644 hw/9pfs/9p-linux-errno.h
> create mode 100644 hw/9pfs/9p-util-win32.c
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows
2023-03-06 14:15 ` Christian Schoenebeck
@ 2023-03-06 14:30 ` Philippe Mathieu-Daudé
2023-03-06 14:56 ` Bin Meng
1 sibling, 0 replies; 32+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-06 14:30 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Bin Meng
On 6/3/23 15:15, Christian Schoenebeck wrote:
> On Monday, February 20, 2023 11:07:59 AM CET Bin Meng wrote:
>> At present there is no Windows support for 9p file system.
>> This series adds initial Windows support for 9p file system.
>>
>> 'local' file system backend driver is supported on Windows,
>> including open, read, write, close, rename, remove, etc.
>> All security models are supported. The mapped (mapped-xattr)
>> security model is implemented using NTFS Alternate Data Stream
>> (ADS) so the 9p export path shall be on an NTFS partition.
>>
>> 'synth' driver is adapted for Windows too so that we can now
>> run qtests on Windows for 9p related regression testing.
>>
>> Example command line to test:
>> "-fsdev local,path=c:\msys64,security_model=mapped,id=p9 -device
> virtio-9p-pci,fsdev=p9,mount_tag=p9fs"
>>
>> Changes in v5:
>> - rework Windows specific xxxdir() APIs implementation
>
> I didn't have the chance to look at this v5 yet.
>
> In general it would help for review to point out in the cover letter which
> patch(es) have changed, what decisions you have made and why.
>
> In this case I guess that's patch 4.
FWIW, overall this LGTM, but I'm not confident enough with Windows to
add a R-b tag.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows
2023-03-06 14:15 ` Christian Schoenebeck
2023-03-06 14:30 ` Philippe Mathieu-Daudé
@ 2023-03-06 14:56 ` Bin Meng
2023-03-07 12:44 ` Christian Schoenebeck
1 sibling, 1 reply; 32+ messages in thread
From: Bin Meng @ 2023-03-06 14:56 UTC (permalink / raw)
To: Christian Schoenebeck; +Cc: Greg Kurz, qemu-devel, Bin Meng
On Mon, Mar 6, 2023 at 10:15 PM Christian Schoenebeck
<qemu_oss@crudebyte.com> wrote:
>
> On Monday, February 20, 2023 11:07:59 AM CET Bin Meng wrote:
> > At present there is no Windows support for 9p file system.
> > This series adds initial Windows support for 9p file system.
> >
> > 'local' file system backend driver is supported on Windows,
> > including open, read, write, close, rename, remove, etc.
> > All security models are supported. The mapped (mapped-xattr)
> > security model is implemented using NTFS Alternate Data Stream
> > (ADS) so the 9p export path shall be on an NTFS partition.
> >
> > 'synth' driver is adapted for Windows too so that we can now
> > run qtests on Windows for 9p related regression testing.
> >
> > Example command line to test:
> > "-fsdev local,path=c:\msys64,security_model=mapped,id=p9 -device
> virtio-9p-pci,fsdev=p9,mount_tag=p9fs"
> >
> > Changes in v5:
> > - rework Windows specific xxxdir() APIs implementation
>
> I didn't have the chance to look at this v5 yet.
>
> In general it would help for review to point out in the cover letter which
> patch(es) have changed, what decisions you have made and why.
>
> In this case I guess that's patch 4.
>
Yes, it's patch 4, and v5 is reworked following your comments
regarding patch 4 of v4.
Regards,
Bin
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows
2023-03-06 14:56 ` Bin Meng
@ 2023-03-07 12:44 ` Christian Schoenebeck
0 siblings, 0 replies; 32+ messages in thread
From: Christian Schoenebeck @ 2023-03-07 12:44 UTC (permalink / raw)
To: qemu-devel; +Cc: Greg Kurz, qemu-devel, Bin Meng, Bin Meng
On Monday, March 6, 2023 3:56:49 PM CET Bin Meng wrote:
> On Mon, Mar 6, 2023 at 10:15 PM Christian Schoenebeck
> <qemu_oss@crudebyte.com> wrote:
> >
> > On Monday, February 20, 2023 11:07:59 AM CET Bin Meng wrote:
> > > At present there is no Windows support for 9p file system.
> > > This series adds initial Windows support for 9p file system.
> > >
> > > 'local' file system backend driver is supported on Windows,
> > > including open, read, write, close, rename, remove, etc.
> > > All security models are supported. The mapped (mapped-xattr)
> > > security model is implemented using NTFS Alternate Data Stream
> > > (ADS) so the 9p export path shall be on an NTFS partition.
> > >
> > > 'synth' driver is adapted for Windows too so that we can now
> > > run qtests on Windows for 9p related regression testing.
> > >
> > > Example command line to test:
> > > "-fsdev local,path=c:\msys64,security_model=mapped,id=p9 -device
> > virtio-9p-pci,fsdev=p9,mount_tag=p9fs"
> > >
> > > Changes in v5:
> > > - rework Windows specific xxxdir() APIs implementation
> >
> > I didn't have the chance to look at this v5 yet.
> >
> > In general it would help for review to point out in the cover letter which
> > patch(es) have changed, what decisions you have made and why.
> >
> > In this case I guess that's patch 4.
> >
>
> Yes, it's patch 4, and v5 is reworked following your comments
> regarding patch 4 of v4.
:) The point was we only discussed suboptimal individual options (each one
with pros and cons), not one compelling solution.
Never mind, I'll look at your code changes.
Best regards,
Christian Schoenebeck
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 16/16] meson.build: Turn on virtfs for Windows
2023-02-20 10:08 ` [PATCH v5 16/16] meson.build: Turn on virtfs for Windows Bin Meng
@ 2023-03-13 12:53 ` Christian Schoenebeck
0 siblings, 0 replies; 32+ messages in thread
From: Christian Schoenebeck @ 2023-03-13 12:53 UTC (permalink / raw)
To: Greg Kurz, qemu-devel; +Cc: Guohuai Shi, Bin Meng
On Monday, February 20, 2023 11:08:15 AM CET Bin Meng wrote:
> From: Guohuai Shi <guohuai.shi@windriver.com>
>
> Enable virtfs configuration option for Windows host.
>
> Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> ---
>
> meson.build | 10 +++++-----
> fsdev/meson.build | 1 +
> hw/9pfs/meson.build | 8 +++++---
> 3 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/meson.build b/meson.build
> index a76c855312..9ddf254e78 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1755,16 +1755,16 @@ dbus_display = get_option('dbus_display') \
> .allowed()
>
> have_virtfs = get_option('virtfs') \
> - .require(targetos == 'linux' or targetos == 'darwin',
> - error_message: 'virtio-9p (virtfs) requires Linux or macOS') \
> - .require(targetos == 'linux' or cc.has_function('pthread_fchdir_np'),
> + .require(targetos == 'linux' or targetos == 'darwin' or targetos == 'windows',
> + error_message: 'virtio-9p (virtfs) requires Linux or macOS or Windows') \
> + .require(targetos == 'linux' or targetos == 'windows' or cc.has_function('pthread_fchdir_np'),
> error_message: 'virtio-9p (virtfs) on macOS requires the presence of pthread_fchdir_np') \
> - .require(targetos == 'darwin' or (libattr.found() and libcap_ng.found()),
> + .require(targetos == 'darwin' or targetos == 'windows' or (libattr.found() and libcap_ng.found()),
> error_message: 'virtio-9p (virtfs) on Linux requires libcap-ng-devel and libattr-devel') \
> .disable_auto_if(not have_tools and not have_system) \
> .allowed()
>
> -have_virtfs_proxy_helper = targetos != 'darwin' and have_virtfs and have_tools
> +have_virtfs_proxy_helper = targetos != 'darwin' and targetos != 'windows' and have_virtfs and have_tools
>
> if get_option('block_drv_ro_whitelist') == ''
> config_host_data.set('CONFIG_BDRV_RO_WHITELIST', '')
> diff --git a/fsdev/meson.build b/fsdev/meson.build
> index b632b66348..2aad081aef 100644
> --- a/fsdev/meson.build
> +++ b/fsdev/meson.build
> @@ -8,6 +8,7 @@ fsdev_ss.add(when: ['CONFIG_FSDEV_9P'], if_true: files(
> ), if_false: files('qemu-fsdev-dummy.c'))
> softmmu_ss.add_all(when: 'CONFIG_LINUX', if_true: fsdev_ss)
> softmmu_ss.add_all(when: 'CONFIG_DARWIN', if_true: fsdev_ss)
> +softmmu_ss.add_all(when: 'CONFIG_WIN32', if_true: fsdev_ss)
>
> if have_virtfs_proxy_helper
> executable('virtfs-proxy-helper',
> diff --git a/hw/9pfs/meson.build b/hw/9pfs/meson.build
> index 12443b6ad5..aaa50e71f7 100644
> --- a/hw/9pfs/meson.build
> +++ b/hw/9pfs/meson.build
> @@ -2,7 +2,6 @@ fs_ss = ss.source_set()
> fs_ss.add(files(
> '9p-local.c',
> '9p-posix-acl.c',
> - '9p-proxy.c',
> '9p-synth.c',
> '9p-xattr-user.c',
> '9p-xattr.c',
> @@ -13,8 +12,11 @@ fs_ss.add(files(
> 'coth.c',
> 'coxattr.c',
> ))
> -fs_ss.add(when: 'CONFIG_LINUX', if_true: files('9p-util-linux.c'))
> -fs_ss.add(when: 'CONFIG_DARWIN', if_true: files('9p-util-darwin.c'))
> +fs_ss.add(when: 'CONFIG_LINUX', if_true: files('9p-proxy.c',
> + '9p-util-linux.c'))
> +fs_ss.add(when: 'CONFIG_DARWIN', if_true: files('9p-proxy.c',
> + '9p-util-darwin.c'))
> +fs_ss.add(when: 'CONFIG_WIN32', if_true: files('9p-util-win32.c'))
> fs_ss.add(when: 'CONFIG_XEN', if_true: files('xen-9p-backend.c'))
This no longer applies on master because CONFIG_XEN has been renamed to
CONFIG_XEN_BUS.
> softmmu_ss.add_all(when: 'CONFIG_FSDEV_9P', if_true: fs_ss)
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
2023-02-20 10:08 ` [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs Bin Meng
@ 2023-03-14 16:05 ` Christian Schoenebeck
2023-03-15 19:05 ` Shi, Guohuai
0 siblings, 1 reply; 32+ messages in thread
From: Christian Schoenebeck @ 2023-03-14 16:05 UTC (permalink / raw)
To: Greg Kurz, qemu-devel; +Cc: Guohuai Shi, Bin Meng
On Monday, February 20, 2023 11:08:03 AM CET Bin Meng wrote:
> From: Guohuai Shi <guohuai.shi@windriver.com>
>
> This commit implements Windows specific xxxdir() APIs for safety
> directory access.
That comment is seriously too short for this patch.
1. You should describe the behaviour implementation that you have chosen and
why you have chosen it.
2. Like already said in the previous version of the patch, you should place a
link to the discussion we had on this issue.
> Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> ---
>
> hw/9pfs/9p-util.h | 6 +
> hw/9pfs/9p-util-win32.c | 443 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 449 insertions(+)
>
> diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> index 0f159fb4ce..c1c251fbd1 100644
> --- a/hw/9pfs/9p-util.h
> +++ b/hw/9pfs/9p-util.h
> @@ -141,6 +141,12 @@ int unlinkat_win32(int dirfd, const char *pathname, int flags);
> int statfs_win32(const char *root_path, struct statfs *stbuf);
> int openat_dir(int dirfd, const char *name);
> int openat_file(int dirfd, const char *name, int flags, mode_t mode);
> +DIR *opendir_win32(const char *full_file_name);
> +int closedir_win32(DIR *pDir);
> +struct dirent *readdir_win32(DIR *pDir);
> +void rewinddir_win32(DIR *pDir);
> +void seekdir_win32(DIR *pDir, long pos);
> +long telldir_win32(DIR *pDir);
> #endif
>
> static inline void close_preserve_errno(int fd)
> diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c
> index a99d579a06..e9408f3c45 100644
> --- a/hw/9pfs/9p-util-win32.c
> +++ b/hw/9pfs/9p-util-win32.c
> @@ -37,6 +37,16 @@
> * Windows does not support opendir, the directory fd is created by
> * CreateFile and convert to fd by _open_osfhandle(). Keep the fd open will
> * lock and protect the directory (can not be modified or replaced)
> + *
> + * 5. Neither Windows native APIs, nor MinGW provide a POSIX compatible API for
> + * acquiring directory entries in a safe way. Calling those APIs (native
> + * _findfirst() and _findnext() or MinGW's readdir(), seekdir() and
> + * telldir()) directly can lead to an inconsistent state if directory is
> + * modified in between, e.g. the same directory appearing more than once
> + * in output, or directories not appearing at all in output even though they
> + * were neither newly created nor deleted. POSIX does not define what happens
> + * with deleted or newly created directories in between, but it guarantees a
> + * consistent state.
> */
>
> #include "qemu/osdep.h"
> @@ -51,6 +61,25 @@
>
> #define V9FS_MAGIC 0x53465039 /* string "9PFS" */
>
> +/*
> + * MinGW and Windows does not provide a safe way to seek directory while other
> + * thread is modifying the same directory.
> + *
> + * This structure is used to store sorted file id and ensure directory seek
> + * consistency.
> + */
> +struct dir_win32 {
> + struct dirent dd_dir;
> + uint32_t offset;
> + uint32_t total_entries;
> + HANDLE hDir;
> + uint32_t dir_name_len;
> + uint64_t dot_id;
> + uint64_t dot_dot_id;
> + uint64_t *file_id_list;
> + char dd_name[1];
> +};
> +
> /*
> * win32_error_to_posix - convert Win32 error to POSIX error number
> *
> @@ -977,3 +1006,417 @@ int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev)
> errno = ENOTSUP;
> return -1;
> }
> +
> +static int file_id_compare(const void *id_ptr1, const void *id_ptr2)
> +{
> + uint64_t id[2];
> +
> + id[0] = *(uint64_t *)id_ptr1;
> + id[1] = *(uint64_t *)id_ptr2;
> +
> + if (id[0] > id[1]) {
> + return 1;
> + } else if (id[0] < id[1]) {
> + return -1;
> + } else {
> + return 0;
> + }
> +}
> +
> +static int get_next_entry(struct dir_win32 *stream)
> +{
> + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> + char *entry_name;
> + char *entry_start;
> + FILE_ID_DESCRIPTOR fid;
> + DWORD attribute;
> +
> + if (stream->file_id_list[stream->offset] == stream->dot_id) {
> + strcpy(stream->dd_dir.d_name, ".");
> + return 0;
> + }
> +
> + if (stream->file_id_list[stream->offset] == stream->dot_dot_id) {
> + strcpy(stream->dd_dir.d_name, "..");
> + return 0;
> + }
> +
> + fid.dwSize = sizeof(fid);
> + fid.Type = FileIdType;
> +
> + fid.FileId.QuadPart = stream->file_id_list[stream->offset];
> +
> + hDirEntry = OpenFileById(stream->hDir, &fid, GENERIC_READ,
> + FILE_SHARE_READ | FILE_SHARE_WRITE
> + | FILE_SHARE_DELETE,
> + NULL,
> + FILE_FLAG_BACKUP_SEMANTICS
> + | FILE_FLAG_OPEN_REPARSE_POINT);
What's the purpose of FILE_FLAG_OPEN_REPARSE_POINT here? As it's apparently
not obvious, please add a comment.
> +
> + if (hDirEntry == INVALID_HANDLE_VALUE) {
> + /*
> + * Not open it successfully, it may be deleted.
Wrong English. "Open failed, it may have been deleted in the meantime.".
> + * Try next id.
> + */
> + return -1;
> + }
> +
> + entry_name = get_full_path_win32(hDirEntry, NULL);
> +
> + CloseHandle(hDirEntry);
> +
> + if (entry_name == NULL) {
> + return -1;
> + }
> +
> + attribute = GetFileAttributes(entry_name);
> +
> + /* symlink is not allowed */
> + if (attribute == INVALID_FILE_ATTRIBUTES
> + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> + return -1;
Wouldn't it make sense to call warn_report_once() here to let the user know
that he has some symlinks that are never delivered to guest?
> + }
> +
> + if (memcmp(entry_name, stream->dd_name, stream->dir_name_len) != 0) {
No, that's unsafe. You want to use something like strncmp() instead.
> + /*
> + * The full entry file name should be a part of parent directory name,
> + * except dot and dot_dot (is already handled).
> + * If not, this entry should not be returned.
> + */
> + return -1;
> + }
> +
> + entry_start = entry_name + stream->dir_name_len;
s/entry_start/entry_basename/ ?
> +
> + /* skip slash */
> + while (*entry_start == '\\') {
> + entry_start++;
> + }
> +
> + if (strchr(entry_start, '\\') != NULL) {
> + return -1;
> + }
> +
> + if (strlen(entry_start) == 0
> + || strlen(entry_start) + 1 > sizeof(stream->dd_dir.d_name)) {
> + return -1;
> + }
> + strcpy(stream->dd_dir.d_name, entry_start);
g_path_get_basename() ? :)
> +
> + return 0;
> +}
> +
> +/*
> + * opendir_win32 - open a directory
> + *
> + * This function opens a directory and caches all directory entries.
It just caches all file IDs, doesn't it?
> + */
> +DIR *opendir_win32(const char *full_file_name)
> +{
> + HANDLE hDir = INVALID_HANDLE_VALUE;
> + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> + char *full_dir_entry = NULL;
> + DWORD attribute;
> + intptr_t dd_handle = -1;
> + struct _finddata_t dd_data;
> + uint64_t file_id;
> + uint64_t *file_id_list = NULL;
> + BY_HANDLE_FILE_INFORMATION FileInfo;
FileInfo is the variable name, not a struct name, so no upper case for it
please.
> + struct dir_win32 *stream = NULL;
> + int err = 0;
> + int find_status;
> + int sort_first_two_entry = 0;
> + uint32_t list_count = 16;
Magic number 16?
> + uint32_t index = 0;
> +
> + /* open directory to prevent it being removed */
> +
> + hDir = CreateFile(full_file_name, GENERIC_READ,
> + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
> + NULL,
> + OPEN_EXISTING,
> + FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
> + NULL);
> +
> + if (hDir == INVALID_HANDLE_VALUE) {
> + err = win32_error_to_posix(GetLastError());
> + goto out;
> + }
> +
> + attribute = GetFileAttributes(full_file_name);
> +
> + /* symlink is not allow */
> + if (attribute == INVALID_FILE_ATTRIBUTES
> + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> + err = EACCES;
> + goto out;
> + }
> +
> + /* check if it is a directory */
> + if ((attribute & FILE_ATTRIBUTE_DIRECTORY) == 0) {
> + err = ENOTDIR;
> + goto out;
> + }
> +
> + file_id_list = g_malloc0(sizeof(uint64_t) * list_count);
> +
> + /*
> + * findfirst() needs suffix format name like "\dir1\dir2\*",
> + * allocate more buffer to store suffix.
> + */
> + stream = g_malloc0(sizeof(struct dir_win32) + strlen(full_file_name) + 3);
Not that I would care much, but +2 would be correct here, as you declared the
struct with one character already, so it is not a classic (zero size) flex
array:
struct dir_win32 {
...
char dd_name[1];
};
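To illustrate the size arithmetic, here is a minimal sketch. The struct
name_holder and the helper make_pattern() are hypothetical names that are not
part of the patch, and plain calloc() stands in for g_malloc0(). Because the
trailing array is declared with one element, sizeof() already covers the
terminating NUL, so only the two characters of the "\*" suffix have to be
added:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct dir_win32, reduced to the relevant part. */
struct name_holder {
    unsigned dir_name_len;
    char name[1];               /* one byte: room for the terminating NUL */
};

/* Build "<dir>\*" in a single allocation; returns NULL on allocation failure. */
static struct name_holder *make_pattern(const char *dir)
{
    size_t dir_len = strlen(dir);
    /* sizeof(*h) pays for the NUL, so add only 2 bytes for the "\*" suffix */
    struct name_holder *h = calloc(1, sizeof(*h) + dir_len + 2);

    if (h == NULL) {
        return NULL;
    }
    memcpy(h->name, dir, dir_len);
    memcpy(h->name + dir_len, "\\*", 3);    /* '\\', '*' and the NUL */
    h->dir_name_len = (unsigned)dir_len;
    return h;
}
```

With a C99 flexible array member (char name[];) the same allocation would need
+3 instead, which is probably where the original number came from.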
> +
> + strcpy(stream->dd_name, full_file_name);
> + strcat(stream->dd_name, "\\*");
> +
> + stream->hDir = hDir;
> + stream->dir_name_len = strlen(full_file_name);
> +
> + dd_handle = _findfirst(stream->dd_name, &dd_data);
> +
> + if (dd_handle == -1) {
> + err = errno;
> + goto out;
> + }
> +
> + /* read all entries to link list */
"read all entries as a linked list"
However there is no linked list here. It seems to be an array.
> + do {
> + full_dir_entry = get_full_path_win32(hDir, dd_data.name);
> +
> + if (full_dir_entry == NULL) {
> + err = ENOMEM;
> + break;
> + }
> +
> + /*
> + * Open every entry and get the file informations.
> + *
> + * Skip symbolic links during reading directory.
> + */
> + hDirEntry = CreateFile(full_dir_entry,
> + GENERIC_READ,
> + FILE_SHARE_READ | FILE_SHARE_WRITE
> + | FILE_SHARE_DELETE,
> + NULL,
> + OPEN_EXISTING,
> + FILE_FLAG_BACKUP_SEMANTICS
> + | FILE_FLAG_OPEN_REPARSE_POINT, NULL);
> +
> + if (hDirEntry != INVALID_HANDLE_VALUE) {
> + if (GetFileInformationByHandle(hDirEntry,
> + &FileInfo) == TRUE) {
> + attribute = FileInfo.dwFileAttributes;
> +
> + /* only save validate entries */
> + if ((attribute & FILE_ATTRIBUTE_REPARSE_POINT) == 0) {
> + if (index >= list_count) {
> + list_count = list_count + 16;
Magic number 16 again.
> + file_id_list = g_realloc(file_id_list,
> + sizeof(uint64_t)
> + * list_count);
OK, so here we are finally at the point where you chose the overall behaviour
for this that we discussed before.
So you are constantly appending 16 entry chunks to the end of the array,
periodically reallocate the entire array, and potentially end up with one
giant dense array with *all* file IDs of the directory.
That's not really what I had in mind, as it still has the potential to easily
crash QEMU if there are large directories on host. Theoretically a Windows
directory might then consume up to 16 GB of RAM for looking up only one single
directory.
So is this the implementation that you said was very slow, or did you test a
different one? Remember, my original idea (as starting point for Windows) was
to only cache *one* file ID (the last being looked up). That's it. Not a list
of file IDs.
> + }
> + file_id = (uint64_t)FileInfo.nFileIndexLow
> + + (((uint64_t)FileInfo.nFileIndexHigh) << 32);
> +
> +
> + file_id_list[index] = file_id;
> +
> + if (strcmp(dd_data.name, ".") == 0) {
> + stream->dot_id = file_id_list[index];
> + if (index != 0) {
> + sort_first_two_entry = 1;
> + }
> + } else if (strcmp(dd_data.name, "..") == 0) {
> + stream->dot_dot_id = file_id_list[index];
> + if (index != 1) {
> + sort_first_two_entry = 1;
> + }
> + }
> + index++;
> + }
> + }
> + CloseHandle(hDirEntry);
> + }
> + g_free(full_dir_entry);
> + find_status = _findnext(dd_handle, &dd_data);
> + } while (find_status == 0);
> +
> + if (errno == ENOENT) {
> + /* No more matching files could be found, clean errno */
> + errno = 0;
> + } else {
> + err = errno;
> + goto out;
> + }
> +
> + stream->total_entries = index;
> + stream->file_id_list = file_id_list;
> +
> + if (sort_first_two_entry == 0) {
> + /*
> + * If the first two entry is "." and "..", then do not sort them.
> + *
> + * If the guest OS always considers first two entries are "." and "..",
> + * sort the two entries may cause confused display in guest OS.
> + */
> + qsort(&file_id_list[2], index - 2, sizeof(file_id), file_id_compare);
> + } else {
> + qsort(&file_id_list[0], index, sizeof(file_id), file_id_compare);
> + }
Were there cases where you did not get "." and ".." ?
> +
> +out:
> + if (err != 0) {
> + errno = err;
> + if (stream != NULL) {
> + if (file_id_list != NULL) {
> + g_free(file_id_list);
> + }
> + CloseHandle(hDir);
> + g_free(stream);
> + stream = NULL;
> + }
> + }
> +
> + if (dd_handle != -1) {
> + _findclose(dd_handle);
> + }
> +
> + return (DIR *)stream;
> +}
> +
> +/*
> + * closedir_win32 - close a directory
> + *
> + * This function closes directory and free all cached resources.
> + */
> +int closedir_win32(DIR *pDir)
> +{
> + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> +
> + if (stream == NULL) {
> + errno = EBADF;
> + return -1;
> + }
> +
> + /* free all resources */
> + CloseHandle(stream->hDir);
> +
> + g_free(stream->file_id_list);
> +
> + g_free(stream);
> +
> + return 0;
> +}
> +
> +/*
> + * readdir_win32 - read a directory
> + *
> + * This function reads a directory entry from cached entry list.
> + */
> +struct dirent *readdir_win32(DIR *pDir)
> +{
> + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> +
> + if (stream == NULL) {
> + errno = EBADF;
> + return NULL;
> + }
> +
> +retry:
> +
> + if (stream->offset >= stream->total_entries) {
> + /* reach to the end, return NULL without set errno */
> + return NULL;
> + }
> +
> + if (get_next_entry(stream) != 0) {
> + stream->offset++;
> + goto retry;
> + }
> +
> + /* Windows does not provide inode number */
> + stream->dd_dir.d_ino = 0;
> + stream->dd_dir.d_reclen = 0;
> + stream->dd_dir.d_namlen = strlen(stream->dd_dir.d_name);
> +
> + stream->offset++;
> +
> + return &stream->dd_dir;
> +}
> +
> +/*
> + * rewinddir_win32 - reset directory stream
> + *
> + * This function resets the position of the directory stream to the
> + * beginning of the directory.
> + */
> +void rewinddir_win32(DIR *pDir)
> +{
> + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> +
> + if (stream == NULL) {
> + errno = EBADF;
> + return;
> + }
> +
> + stream->offset = 0;
> +
> + return;
> +}
> +
> +/*
> + * seekdir_win32 - set the position of the next readdir() call in the directory
> + *
> + * This function sets the position of the next readdir() call in the directory
> + * from which the next readdir() call will start.
> + */
> +void seekdir_win32(DIR *pDir, long pos)
> +{
> + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> +
> + if (stream == NULL) {
> + errno = EBADF;
> + return;
> + }
> +
> + if (pos < -1) {
> + errno = EINVAL;
> + return;
> + }
> +
> + if (pos == -1 || pos >= (long)stream->total_entries) {
> + /* seek to the end */
> + stream->offset = stream->total_entries;
> + return;
> + }
> +
> + if (pos - (long)stream->offset == 0) {
> + /* no need to seek */
> + return;
> + }
> +
> + stream->offset = pos;
> +
> + return;
> +}
> +
> +/*
> + * telldir_win32 - return current location in directory
> + *
> + * This function returns current location in directory.
> + */
> +long telldir_win32(DIR *pDir)
> +{
> + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> +
> + if (stream == NULL) {
> + errno = EBADF;
> + return -1;
> + }
> +
> + if (stream->offset > stream->total_entries) {
> + return -1;
> + }
> +
> + return (long)stream->offset;
> +}
>
* RE: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
2023-03-14 16:05 ` Christian Schoenebeck
@ 2023-03-15 19:05 ` Shi, Guohuai
2023-03-16 11:05 ` Christian Schoenebeck
0 siblings, 1 reply; 32+ messages in thread
From: Shi, Guohuai @ 2023-03-15 19:05 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Meng, Bin
> -----Original Message-----
> From: Christian Schoenebeck <qemu_oss@crudebyte.com>
> Sent: Wednesday, March 15, 2023 00:06
> To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org
> Cc: Shi, Guohuai <Guohuai.Shi@windriver.com>; Meng, Bin
> <Bin.Meng@windriver.com>
> Subject: Re: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir()
> APIs
>
> On Monday, February 20, 2023 11:08:03 AM CET Bin Meng wrote:
> > From: Guohuai Shi <guohuai.shi@windriver.com>
> >
> > This commit implements Windows specific xxxdir() APIs for safety
> > directory access.
>
> That comment is seriously too short for this patch.
>
> 1. You should describe the behaviour implementation that you have chosen and
> why you have chosen it.
>
> 2. Like already said in the previous version of the patch, you should place a
> link to the discussion we had on this issue.
>
> > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
> > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > ---
> >
> > hw/9pfs/9p-util.h | 6 +
> > hw/9pfs/9p-util-win32.c | 443 ++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 449 insertions(+)
> >
> > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> > index 0f159fb4ce..c1c251fbd1 100644
> > --- a/hw/9pfs/9p-util.h
> > +++ b/hw/9pfs/9p-util.h
> > @@ -141,6 +141,12 @@ int unlinkat_win32(int dirfd, const char *pathname, int flags);
> > int statfs_win32(const char *root_path, struct statfs *stbuf);
> > int openat_dir(int dirfd, const char *name);
> > int openat_file(int dirfd, const char *name, int flags, mode_t mode);
> > +DIR *opendir_win32(const char *full_file_name);
> > +int closedir_win32(DIR *pDir);
> > +struct dirent *readdir_win32(DIR *pDir);
> > +void rewinddir_win32(DIR *pDir);
> > +void seekdir_win32(DIR *pDir, long pos);
> > +long telldir_win32(DIR *pDir);
> > #endif
> >
> > static inline void close_preserve_errno(int fd)
> > diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c
> > index a99d579a06..e9408f3c45 100644
> > --- a/hw/9pfs/9p-util-win32.c
> > +++ b/hw/9pfs/9p-util-win32.c
> > @@ -37,6 +37,16 @@
> > * Windows does not support opendir, the directory fd is created by
> > * CreateFile and convert to fd by _open_osfhandle(). Keep the fd open will
> > * lock and protect the directory (can not be modified or replaced)
> > + *
> > + * 5. Neither Windows native APIs, nor MinGW provide a POSIX compatible API for
> > + * acquiring directory entries in a safe way. Calling those APIs (native
> > + * _findfirst() and _findnext() or MinGW's readdir(), seekdir() and
> > + * telldir()) directly can lead to an inconsistent state if directory is
> > + * modified in between, e.g. the same directory appearing more than once
> > + * in output, or directories not appearing at all in output even though they
> > + * were neither newly created nor deleted. POSIX does not define what happens
> > + * with deleted or newly created directories in between, but it guarantees a
> > + * consistent state.
> > */
> >
> > #include "qemu/osdep.h"
> > @@ -51,6 +61,25 @@
> >
> > #define V9FS_MAGIC 0x53465039 /* string "9PFS" */
> >
> > +/*
> > + * MinGW and Windows does not provide a safe way to seek directory while other
> > + * thread is modifying the same directory.
> > + *
> > + * This structure is used to store sorted file id and ensure directory seek
> > + * consistency.
> > + */
> > +struct dir_win32 {
> > + struct dirent dd_dir;
> > + uint32_t offset;
> > + uint32_t total_entries;
> > + HANDLE hDir;
> > + uint32_t dir_name_len;
> > + uint64_t dot_id;
> > + uint64_t dot_dot_id;
> > + uint64_t *file_id_list;
> > + char dd_name[1];
> > +};
> > +
> > /*
> > * win32_error_to_posix - convert Win32 error to POSIX error number
> > *
> > @@ -977,3 +1006,417 @@ int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev)
> > errno = ENOTSUP;
> > return -1;
> > }
> > +
> > +static int file_id_compare(const void *id_ptr1, const void *id_ptr2)
> > +{
> > + uint64_t id[2];
> > +
> > + id[0] = *(uint64_t *)id_ptr1;
> > + id[1] = *(uint64_t *)id_ptr2;
> > +
> > + if (id[0] > id[1]) {
> > + return 1;
> > + } else if (id[0] < id[1]) {
> > + return -1;
> > + } else {
> > + return 0;
> > + }
> > +}
> > +
> > +static int get_next_entry(struct dir_win32 *stream)
> > +{
> > + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> > + char *entry_name;
> > + char *entry_start;
> > + FILE_ID_DESCRIPTOR fid;
> > + DWORD attribute;
> > +
> > + if (stream->file_id_list[stream->offset] == stream->dot_id) {
> > + strcpy(stream->dd_dir.d_name, ".");
> > + return 0;
> > + }
> > +
> > + if (stream->file_id_list[stream->offset] == stream->dot_dot_id) {
> > + strcpy(stream->dd_dir.d_name, "..");
> > + return 0;
> > + }
> > +
> > + fid.dwSize = sizeof(fid);
> > + fid.Type = FileIdType;
> > +
> > + fid.FileId.QuadPart = stream->file_id_list[stream->offset];
> > +
> > + hDirEntry = OpenFileById(stream->hDir, &fid, GENERIC_READ,
> > + FILE_SHARE_READ | FILE_SHARE_WRITE
> > + | FILE_SHARE_DELETE,
> > + NULL,
> > + FILE_FLAG_BACKUP_SEMANTICS
> > + | FILE_FLAG_OPEN_REPARSE_POINT);
>
> What's the purpose of FILE_FLAG_OPEN_REPARSE_POINT here? As it's apparently
> not obvious, please add a comment.
>
Without this flag, if the file ID refers to a symbolic link, Windows does not
open the symbolic link itself, but opens the target file instead.
This flag is similar to the O_NOFOLLOW flag.
> > +
> > + if (hDirEntry == INVALID_HANDLE_VALUE) {
> > + /*
> > + * Not open it successfully, it may be deleted.
>
> Wrong English. "Open failed, it may have been deleted in the meantime.".
>
> > + * Try next id.
> > + */
> > + return -1;
> > + }
> > +
> > + entry_name = get_full_path_win32(hDirEntry, NULL);
> > +
> > + CloseHandle(hDirEntry);
> > +
> > + if (entry_name == NULL) {
> > + return -1;
> > + }
> > +
> > + attribute = GetFileAttributes(entry_name);
> > +
> > + /* symlink is not allowed */
> > + if (attribute == INVALID_FILE_ATTRIBUTES
> > + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> > + return -1;
>
> Wouldn't it make sense to call warn_report_once() here to let the user know
> that he has some symlinks that are never delivered to guest?
OK, Got it.
>
> > + }
> > +
> > + if (memcmp(entry_name, stream->dd_name, stream->dir_name_len) != 0) {
>
> No, that's unsafe. You want to use something like strncmp() instead.
>
> > + /*
> > + * The full entry file name should be a part of parent directory name,
> > + * except dot and dot_dot (is already handled).
> > + * If not, this entry should not be returned.
> > + */
> > + return -1;
> > + }
> > +
> > + entry_start = entry_name + stream->dir_name_len;
>
> s/entry_start/entry_basename/ ?
>
> > +
> > + /* skip slash */
> > + while (*entry_start == '\\') {
> > + entry_start++;
> > + }
> > +
> > + if (strchr(entry_start, '\\') != NULL) {
> > + return -1;
> > + }
> > +
> > + if (strlen(entry_start) == 0
> > + || strlen(entry_start) + 1 > sizeof(stream->dd_dir.d_name)) {
> > + return -1;
> > + }
> > + strcpy(stream->dd_dir.d_name, entry_start);
>
> g_path_get_basename() ? :)
Regarding the three comments above:
This code is not good and should be fixed.
The code is meant to filter out the following cases:
- The parent directory path is not a prefix of the entry's full path:
  Parent: C:\123\456, entry: C:\123, C:\
- The entry contains more than one name component below the parent:
  Parent: C:\123\456, entry: C:\123\456\789\abc
- The entry name is empty, or it is too long for the name buffer.
I will refactor this part.
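The three cases above can be sketched roughly as follows. This is only an
illustration under stated assumptions: direct_child_name() is a hypothetical
helper and not the refactored code, it compares case-sensitively (Windows
paths are normally case-insensitive), and it treats only the backslash as a
separator. It uses strncmp() so the comparison stops at the end of a shorter
entry path.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the base name inside 'entry' if 'entry' is a direct
 * child of 'parent', or NULL if the entry has to be filtered out.
 */
static const char *direct_child_name(const char *parent, const char *entry,
                                     size_t name_buf_size)
{
    size_t parent_len = strlen(parent);
    const char *name;

    /* case 1: the parent path must be a prefix of the entry path */
    if (parent_len == 0 || strncmp(entry, parent, parent_len) != 0) {
        return NULL;
    }
    name = entry + parent_len;

    /* the prefix must end at a separator, so "C:\123\4567" is rejected */
    if (parent[parent_len - 1] != '\\' && *name != '\\') {
        return NULL;
    }
    while (*name == '\\') {
        name++;
    }

    /* case 2: more than one name component below the parent */
    if (strchr(name, '\\') != NULL) {
        return NULL;
    }

    /* case 3: empty name, or name (plus NUL) does not fit the buffer */
    if (*name == '\0' || strlen(name) + 1 > name_buf_size) {
        return NULL;
    }
    return name;
}
```

For example, with parent C:\123\456 the entry C:\123\456\789 yields "789",
while C:\123, C:\ and C:\123\456\789\abc all yield NULL.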
>
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * opendir_win32 - open a directory
> > + *
> > + * This function opens a directory and caches all directory entries.
>
> It just caches all file IDs, doesn't it?
>
Will fix it
> > + */
> > +DIR *opendir_win32(const char *full_file_name)
> > +{
> > + HANDLE hDir = INVALID_HANDLE_VALUE;
> > + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> > + char *full_dir_entry = NULL;
> > + DWORD attribute;
> > + intptr_t dd_handle = -1;
> > + struct _finddata_t dd_data;
> > + uint64_t file_id;
> > + uint64_t *file_id_list = NULL;
> > + BY_HANDLE_FILE_INFORMATION FileInfo;
>
> FileInfo is the variable name, not a struct name, so no upper case for it
> please.
Will fix it.
>
> > + struct dir_win32 *stream = NULL;
> > + int err = 0;
> > + int find_status;
> > + int sort_first_two_entry = 0;
> > + uint32_t list_count = 16;
>
> Magic number 16?
Will change it to a macro.
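Something along these lines, as a rough sketch only. The macro name
FILE_ID_LIST_CHUNK, struct id_list and id_list_append() are placeholders
chosen for illustration, and realloc() stands in for g_realloc():

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name: number of file IDs added per reallocation. */
#define FILE_ID_LIST_CHUNK 16

struct id_list {
    uint64_t *ids;
    uint32_t count;
    uint32_t capacity;
};

/* Append one ID, growing the array by one chunk when it is full.
 * Returns 0 on success, -1 if the reallocation fails. */
static int id_list_append(struct id_list *list, uint64_t id)
{
    if (list->count >= list->capacity) {
        uint32_t new_capacity = list->capacity + FILE_ID_LIST_CHUNK;
        uint64_t *p = realloc(list->ids, sizeof(*p) * new_capacity);

        if (p == NULL) {
            return -1;
        }
        list->ids = p;
        list->capacity = new_capacity;
    }
    list->ids[list->count++] = id;
    return 0;
}
```

The constant is then used for both the initial size and the growth step, so
the two occurrences of 16 cannot drift apart.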
>
> > + uint32_t index = 0;
> > +
> > + /* open directory to prevent it being removed */
> > +
> > + hDir = CreateFile(full_file_name, GENERIC_READ,
> > + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
> > + NULL,
> > + OPEN_EXISTING,
> > + FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
> > + NULL);
> > +
> > + if (hDir == INVALID_HANDLE_VALUE) {
> > + err = win32_error_to_posix(GetLastError());
> > + goto out;
> > + }
> > +
> > + attribute = GetFileAttributes(full_file_name);
> > +
> > + /* symlink is not allow */
> > + if (attribute == INVALID_FILE_ATTRIBUTES
> > + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> > + err = EACCES;
> > + goto out;
> > + }
> > +
> > + /* check if it is a directory */
> > + if ((attribute & FILE_ATTRIBUTE_DIRECTORY) == 0) {
> > + err = ENOTDIR;
> > + goto out;
> > + }
> > +
> > + file_id_list = g_malloc0(sizeof(uint64_t) * list_count);
> > +
> > + /*
> > + * findfirst() needs suffix format name like "\dir1\dir2\*",
> > + * allocate more buffer to store suffix.
> > + */
> > + stream = g_malloc0(sizeof(struct dir_win32) + strlen(full_file_name) + 3);
>
> Not that I would care much, but +2 would be correct here, as you declared the
> struct with one character already, so it is not a classic (zero size) flex
> array:
>
> struct dir_win32 {
> ...
> char dd_name[1];
> };
>
Will fix it.
> > +
> > + strcpy(stream->dd_name, full_file_name);
> > + strcat(stream->dd_name, "\\*");
> > +
> > + stream->hDir = hDir;
> > + stream->dir_name_len = strlen(full_file_name);
> > +
> > + dd_handle = _findfirst(stream->dd_name, &dd_data);
> > +
> > + if (dd_handle == -1) {
> > + err = errno;
> > + goto out;
> > + }
> > +
> > + /* read all entries to link list */
>
> "read all entries as a linked list"
>
> However there is no linked list here. It seems to be an array.
Will fix it.
>
> > + do {
> > + full_dir_entry = get_full_path_win32(hDir, dd_data.name);
> > +
> > + if (full_dir_entry == NULL) {
> > + err = ENOMEM;
> > + break;
> > + }
> > +
> > + /*
> > + * Open every entry and get the file informations.
> > + *
> > + * Skip symbolic links during reading directory.
> > + */
> > + hDirEntry = CreateFile(full_dir_entry,
> > + GENERIC_READ,
> > + FILE_SHARE_READ | FILE_SHARE_WRITE
> > + | FILE_SHARE_DELETE,
> > + NULL,
> > + OPEN_EXISTING,
> > + FILE_FLAG_BACKUP_SEMANTICS
> > + | FILE_FLAG_OPEN_REPARSE_POINT, NULL);
> > +
> > + if (hDirEntry != INVALID_HANDLE_VALUE) {
> > + if (GetFileInformationByHandle(hDirEntry,
> > + &FileInfo) == TRUE) {
> > + attribute = FileInfo.dwFileAttributes;
> > +
> > + /* only save validate entries */
> > + if ((attribute & FILE_ATTRIBUTE_REPARSE_POINT) == 0) {
> > + if (index >= list_count) {
> > + list_count = list_count + 16;
>
> Magic number 16 again.
>
> > + file_id_list = g_realloc(file_id_list,
> > + sizeof(uint64_t)
> > + * list_count);
>
> OK, so here we are finally at the point where you chose the overall behaviour
> for this that we discussed before.
>
> So you are constantly appending 16 entry chunks to the end of the array,
> periodically reallocate the entire array, and potentially end up with one
> giant dense array with *all* file IDs of the directory.
>
> That's not really what I had in mind, as it still has the potential to easily
> crash QEMU if there are large directories on host. Theoretically a Windows
> directory might then consume up to 16 GB of RAM for looking up only one
> single directory.
>
> So is this the implementation that you said was very slow, or did you test a
> different one? Remember, my original idea (as starting point for Windows)
> was to only cache *one* file ID (the last being looked up). That's it. Not a
> list of file IDs.
If only one file ID is cached, then every read directory operation has to scan
the whole directory to find the next ID larger than the last cached one.
I provided some performance test results in the last patch:
Test: read a directory with 100, 1000 and 10000 entries.
#1, file name cache solution, the time cost is: 2, 9, 44 (in ms).
#2, file ID cache solution, the time cost is: 3, 438, 4338 (in ms). This is the current solution.
#3, single file ID cache solution (I just tested it): 4, 4788, more than one minute.
I think caching only one file ID is not a good idea, as its performance would be very bad.
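To show where the cost of the single-ID approach comes from, here is a small
model of it. It is a sketch under assumptions, not the code that was
measured: the ids array stands in for whatever the host returns when the
directory is enumerated, and next_id_after() and walk_with_single_id() are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scan all n entries and find the smallest ID strictly greater than 'last'
 * (or the smallest ID overall if have_last is 0).
 * Returns 1 and stores the result in *next, or 0 if there is no such ID.
 */
static int next_id_after(const uint64_t *ids, size_t n, int have_last,
                         uint64_t last, uint64_t *next)
{
    int found = 0;

    for (size_t i = 0; i < n; i++) {
        if (have_last && ids[i] <= last) {
            continue;
        }
        if (!found || ids[i] < *next) {
            *next = ids[i];
            found = 1;
        }
    }
    return found;
}

/*
 * Walk the directory while remembering only the last returned ID.
 * Every step is a full scan, so n entries cost about n * n comparisons.
 * Stores the IDs in ascending order in 'out' and returns how many were found.
 */
static size_t walk_with_single_id(const uint64_t *ids, size_t n, uint64_t *out)
{
    uint64_t last = 0;
    uint64_t next;
    int have_last = 0;
    size_t visited = 0;

    while (next_id_after(ids, n, have_last, last, &next)) {
        out[visited++] = next;
        last = next;
        have_last = 1;
    }
    return visited;
}
```

In the real implementation each of those comparisons also needs the entry to
be opened to obtain its file ID, which makes every full scan much more
expensive than in this in-memory model.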
>
> > + }
> > + file_id = (uint64_t)FileInfo.nFileIndexLow
> > + + (((uint64_t)FileInfo.nFileIndexHigh) << 32);
> > +
> > +
> > + file_id_list[index] = file_id;
> > +
> > + if (strcmp(dd_data.name, ".") == 0) {
> > + stream->dot_id = file_id_list[index];
> > + if (index != 0) {
> > + sort_first_two_entry = 1;
> > + }
> > + } else if (strcmp(dd_data.name, "..") == 0) {
> > + stream->dot_dot_id = file_id_list[index];
> > + if (index != 1) {
> > + sort_first_two_entry = 1;
> > + }
> > + }
> > + index++;
> > + }
> > + }
> > + CloseHandle(hDirEntry);
> > + }
> > + g_free(full_dir_entry);
> > + find_status = _findnext(dd_handle, &dd_data);
> > + } while (find_status == 0);
> > +
> > + if (errno == ENOENT) {
> > + /* No more matching files could be found, clean errno */
> > + errno = 0;
> > + } else {
> > + err = errno;
> > + goto out;
> > + }
> > +
> > + stream->total_entries = index;
> > + stream->file_id_list = file_id_list;
> > +
> > + if (sort_first_two_entry == 0) {
> > + /*
> > + * If the first two entry is "." and "..", then do not sort them.
> > + *
> > + * If the guest OS always considers first two entries are "." and "..",
> > + * sort the two entries may cause confused display in guest OS.
> > + */
> > + qsort(&file_id_list[2], index - 2, sizeof(file_id), file_id_compare);
> > + } else {
> > + qsort(&file_id_list[0], index, sizeof(file_id), file_id_compare);
> > + }
>
> Were there cases where you did not get "." and ".." ?
NTFS always provides "." and "..".
I could add more checks here to guard against this risk.
>
> > +
> > +out:
> > + if (err != 0) {
> > + errno = err;
> > + if (stream != NULL) {
> > + if (file_id_list != NULL) {
> > + g_free(file_id_list);
> > + }
> > + CloseHandle(hDir);
> > + g_free(stream);
> > + stream = NULL;
> > + }
> > + }
> > +
> > + if (dd_handle != -1) {
> > + _findclose(dd_handle);
> > + }
> > +
> > + return (DIR *)stream;
> > +}
> > +
> > +/*
> > + * closedir_win32 - close a directory
> > + *
> > + * This function closes directory and free all cached resources.
> > + */
> > +int closedir_win32(DIR *pDir)
> > +{
> > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > +
> > + if (stream == NULL) {
> > + errno = EBADF;
> > + return -1;
> > + }
> > +
> > + /* free all resources */
> > + CloseHandle(stream->hDir);
> > +
> > + g_free(stream->file_id_list);
> > +
> > + g_free(stream);
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * readdir_win32 - read a directory
> > + *
> > + * This function reads a directory entry from cached entry list.
> > + */
> > +struct dirent *readdir_win32(DIR *pDir)
> > +{
> > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > +
> > + if (stream == NULL) {
> > + errno = EBADF;
> > + return NULL;
> > + }
> > +
> > +retry:
> > +
> > + if (stream->offset >= stream->total_entries) {
> > + /* reach to the end, return NULL without set errno */
> > + return NULL;
> > + }
> > +
> > + if (get_next_entry(stream) != 0) {
> > + stream->offset++;
> > + goto retry;
> > + }
> > +
> > + /* Windows does not provide inode number */
> > + stream->dd_dir.d_ino = 0;
> > + stream->dd_dir.d_reclen = 0;
> > + stream->dd_dir.d_namlen = strlen(stream->dd_dir.d_name);
> > +
> > + stream->offset++;
> > +
> > + return &stream->dd_dir;
> > +}
> > +
> > +/*
> > + * rewinddir_win32 - reset directory stream
> > + *
> > + * This function resets the position of the directory stream to the
> > + * beginning of the directory.
> > + */
> > +void rewinddir_win32(DIR *pDir)
> > +{
> > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > +
> > + if (stream == NULL) {
> > + errno = EBADF;
> > + return;
> > + }
> > +
> > + stream->offset = 0;
> > +
> > + return;
> > +}
> > +
> > +/*
> > + * seekdir_win32 - set the position of the next readdir() call in the directory
> > + *
> > + * This function sets the position of the next readdir() call in the directory
> > + * from which the next readdir() call will start.
> > + */
> > +void seekdir_win32(DIR *pDir, long pos)
> > +{
> > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > +
> > + if (stream == NULL) {
> > + errno = EBADF;
> > + return;
> > + }
> > +
> > + if (pos < -1) {
> > + errno = EINVAL;
> > + return;
> > + }
> > +
> > + if (pos == -1 || pos >= (long)stream->total_entries) {
> > + /* seek to the end */
> > + stream->offset = stream->total_entries;
> > + return;
> > + }
> > +
> > + if (pos - (long)stream->offset == 0) {
> > + /* no need to seek */
> > + return;
> > + }
> > +
> > + stream->offset = pos;
> > +
> > + return;
> > +}
> > +
> > +/*
> > + * telldir_win32 - return current location in directory
> > + *
> > + * This function returns current location in directory.
> > + */
> > +long telldir_win32(DIR *pDir)
> > +{
> > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > +
> > + if (stream == NULL) {
> > + errno = EBADF;
> > + return -1;
> > + }
> > +
> > + if (stream->offset > stream->total_entries) {
> > + return -1;
> > + }
> > +
> > + return (long)stream->offset;
> > +}
> >
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
2023-03-15 19:05 ` Shi, Guohuai
@ 2023-03-16 11:05 ` Christian Schoenebeck
2023-03-16 17:28 ` Shi, Guohuai
0 siblings, 1 reply; 32+ messages in thread
From: Christian Schoenebeck @ 2023-03-16 11:05 UTC (permalink / raw)
To: Greg Kurz, qemu-devel; +Cc: Meng, Bin, Shi, Guohuai
On Wednesday, March 15, 2023 8:05:34 PM CET Shi, Guohuai wrote:
>
> > -----Original Message-----
> > From: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > Sent: Wednesday, March 15, 2023 00:06
> > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org
> > Cc: Shi, Guohuai <Guohuai.Shi@windriver.com>; Meng, Bin
> > <Bin.Meng@windriver.com>
> > Subject: Re: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir()
> > APIs
> >
> > On Monday, February 20, 2023 11:08:03 AM CET Bin Meng wrote:
> > > From: Guohuai Shi <guohuai.shi@windriver.com>
> > >
> > > This commit implements Windows specific xxxdir() APIs for safety
> > > directory access.
> >
> > That comment is seriously too short for this patch.
> >
> > 1. You should describe the behaviour implementation that you have chosen and
> > why you have chosen it.
> >
> > 2. Like already said in the previous version of the patch, you should place a
> > link to the discussion we had on this issue.
> >
> > > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
> > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > ---
> > >
> > > hw/9pfs/9p-util.h | 6 +
> > > hw/9pfs/9p-util-win32.c | 443
> > > ++++++++++++++++++++++++++++++++++++++++
> > > 2 files changed, 449 insertions(+)
> > >
> > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index
> > > 0f159fb4ce..c1c251fbd1 100644
> > > --- a/hw/9pfs/9p-util.h
> > > +++ b/hw/9pfs/9p-util.h
> > > @@ -141,6 +141,12 @@ int unlinkat_win32(int dirfd, const char
> > > *pathname, int flags); int statfs_win32(const char *root_path, struct
> > > statfs *stbuf); int openat_dir(int dirfd, const char *name); int
> > > openat_file(int dirfd, const char *name, int flags, mode_t mode);
> > > +DIR *opendir_win32(const char *full_file_name); int
> > > +closedir_win32(DIR *pDir); struct dirent *readdir_win32(DIR *pDir);
> > > +void rewinddir_win32(DIR *pDir); void seekdir_win32(DIR *pDir, long
> > > +pos); long telldir_win32(DIR *pDir);
> > > #endif
> > >
> > > static inline void close_preserve_errno(int fd) diff --git
> > > a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index
> > > a99d579a06..e9408f3c45 100644
> > > --- a/hw/9pfs/9p-util-win32.c
> > > +++ b/hw/9pfs/9p-util-win32.c
> > > @@ -37,6 +37,16 @@
> > > * Windows does not support opendir, the directory fd is created by
> > > * CreateFile and convert to fd by _open_osfhandle(). Keep the fd open
> > will
> > > * lock and protect the directory (can not be modified or replaced)
> > > + *
> > > + * 5. Neither Windows native APIs, nor MinGW provide a POSIX compatible
> > API for
> > > + * acquiring directory entries in a safe way. Calling those APIs
> > (native
> > > + * _findfirst() and _findnext() or MinGW's readdir(), seekdir() and
> > > + * telldir()) directly can lead to an inconsistent state if directory
> > is
> > > + * modified in between, e.g. the same directory appearing more than
> > once
> > > + * in output, or directories not appearing at all in output even though
> > they
> > > + * were neither newly created nor deleted. POSIX does not define what
> > happens
> > > + * with deleted or newly created directories in between, but it
> > guarantees a
> > > + * consistent state.
> > > */
> > >
> > > #include "qemu/osdep.h"
> > > @@ -51,6 +61,25 @@
> > >
> > > #define V9FS_MAGIC 0x53465039 /* string "9PFS" */
> > >
> > > +/*
> > > + * MinGW and Windows does not provide a safe way to seek directory
> > > +while other
> > > + * thread is modifying the same directory.
> > > + *
> > > + * This structure is used to store sorted file id and ensure
> > > +directory seek
> > > + * consistency.
> > > + */
> > > +struct dir_win32 {
> > > + struct dirent dd_dir;
> > > + uint32_t offset;
> > > + uint32_t total_entries;
> > > + HANDLE hDir;
> > > + uint32_t dir_name_len;
> > > + uint64_t dot_id;
> > > + uint64_t dot_dot_id;
> > > + uint64_t *file_id_list;
> > > + char dd_name[1];
> > > +};
> > > +
> > > /*
> > > * win32_error_to_posix - convert Win32 error to POSIX error number
> > > *
> > > @@ -977,3 +1006,417 @@ int qemu_mknodat(int dirfd, const char *filename,
> > mode_t mode, dev_t dev)
> > > errno = ENOTSUP;
> > > return -1;
> > > }
> > > +
> > > +static int file_id_compare(const void *id_ptr1, const void *id_ptr2)
> > > +{
> > > + uint64_t id[2];
> > > +
> > > + id[0] = *(uint64_t *)id_ptr1;
> > > + id[1] = *(uint64_t *)id_ptr2;
> > > +
> > > + if (id[0] > id[1]) {
> > > + return 1;
> > > + } else if (id[0] < id[1]) {
> > > + return -1;
> > > + } else {
> > > + return 0;
> > > + }
> > > +}
> > > +
> > > +static int get_next_entry(struct dir_win32 *stream) {
> > > + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> > > + char *entry_name;
> > > + char *entry_start;
> > > + FILE_ID_DESCRIPTOR fid;
> > > + DWORD attribute;
> > > +
> > > + if (stream->file_id_list[stream->offset] == stream->dot_id) {
> > > + strcpy(stream->dd_dir.d_name, ".");
> > > + return 0;
> > > + }
> > > +
> > > + if (stream->file_id_list[stream->offset] == stream->dot_dot_id) {
> > > + strcpy(stream->dd_dir.d_name, "..");
> > > + return 0;
> > > + }
> > > +
> > > + fid.dwSize = sizeof(fid);
> > > + fid.Type = FileIdType;
> > > +
> > > + fid.FileId.QuadPart = stream->file_id_list[stream->offset];
> > > +
> > > + hDirEntry = OpenFileById(stream->hDir, &fid, GENERIC_READ,
> > > + FILE_SHARE_READ | FILE_SHARE_WRITE
> > > + | FILE_SHARE_DELETE,
> > > + NULL,
> > > + FILE_FLAG_BACKUP_SEMANTICS
> > > + | FILE_FLAG_OPEN_REPARSE_POINT);
> >
> > What's the purpose of FILE_FLAG_OPEN_REPARSE_POINT here? As it's apparently
> > not obvious, please add a comment.
> >
>
> If this flag is not used and the file ID refers to a symbolic link, Windows will not open the symbolic link itself, but the target file.
> This flag is similar to the O_NOFOLLOW flag.
OK, got it, thanks! But please add a comment in code that describes this.
> > > +
> > > + if (hDirEntry == INVALID_HANDLE_VALUE) {
> > > + /*
> > > + * Not open it successfully, it may be deleted.
> >
> > Wrong English. "Open failed, it may have been deleted in the meantime.".
> >
> > > + * Try next id.
> > > + */
> > > + return -1;
> > > + }
> > > +
> > > + entry_name = get_full_path_win32(hDirEntry, NULL);
> > > +
> > > + CloseHandle(hDirEntry);
> > > +
> > > + if (entry_name == NULL) {
> > > + return -1;
> > > + }
> > > +
> > > + attribute = GetFileAttributes(entry_name);
> > > +
> > > + /* symlink is not allowed */
> > > + if (attribute == INVALID_FILE_ATTRIBUTES
> > > + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> > > + return -1;
> >
> > Wouldn't it make sense to call warn_report_once() here to let the user know
> > that he has some symlinks that are never delivered to guest?
>
> OK, Got it.
>
> >
> > > + }
> > > +
> > > + if (memcmp(entry_name, stream->dd_name, stream->dir_name_len) !=
> > > + 0) {
> >
> > No, that's unsafe. You want to use something like strncmp() instead.
> >
> > > + /*
> > > + * The full entry file name should be a part of parent directory
> > name,
> > > + * except dot and dot_dot (is already handled).
> > > + * If not, this entry should not be returned.
> > > + */
> > > + return -1;
> > > + }
> > > +
> > > + entry_start = entry_name + stream->dir_name_len;
> >
> > s/entry_start/entry_basename/ ?
> >
> > > +
> > > + /* skip slash */
> > > + while (*entry_start == '\\') {
> > > + entry_start++;
> > > + }
> > > +
> > > + if (strchr(entry_start, '\\') != NULL) {
> > > + return -1;
> > > + }
> > > +
> > > + if (strlen(entry_start) == 0
> > > + || strlen(entry_start) + 1 > sizeof(stream->dd_dir.d_name)) {
> > > + return -1;
> > > + }
> > > + strcpy(stream->dd_dir.d_name, entry_start);
> >
> > g_path_get_basename() ? :)
>
> For the above three comments:
> This code is not good and should be fixed.
> The code wants to filter out the following cases:
> The parent directory path is not a part of the entry's full path:
> Parent: C:\123\456, entry: C:\123, C:\
> The entry contains more than one name component:
> Parent: C:\123\456, entry: C:\123\456\789\abc
> The entry name is zero length or too long for the name buffer
>
> I will refactor this part.
In general: writing parsing code yourself is extremely error prone. That's why
it makes sense to use existing functions from glib, etc.
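As an illustration of the three cases listed above, here is a minimal sketch of a direct-child check. It is not taken from the patch: direct_child_name() and its max_name_len parameter are hypothetical, and it assumes both paths are absolute, already normalized, and use '\' as the only separator. It sticks to standard C to stay self-contained; in QEMU a glib helper such as g_path_get_basename() would probably be preferable for the basename part.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): if `entry` names a direct
 * child of directory `parent`, return a pointer to the child's name
 * inside `entry`; otherwise return NULL.
 */
const char *direct_child_name(const char *parent, const char *entry,
                              size_t max_name_len)
{
    size_t parent_len;
    const char *name;

    assert(parent != NULL && entry != NULL);
    parent_len = strlen(parent);

    /* drop one trailing separator so "C:\" and "C:" are treated alike */
    if (parent_len > 0 && parent[parent_len - 1] == '\\') {
        parent_len--;
    }

    /* strncmp() stops at the NUL of a shorter entry, memcmp() does not */
    if (strncmp(entry, parent, parent_len) != 0) {
        return NULL;
    }

    /* the prefix must end on a component boundary */
    if (entry[parent_len] != '\\') {
        return NULL;
    }
    name = entry + parent_len + 1;

    /* reject empty names, nested paths, and names that do not fit */
    if (*name == '\0' || strchr(name, '\\') != NULL ||
        strlen(name) + 1 > max_name_len) {
        return NULL;
    }
    return name;
}
```

The boundary check matters: without it, "C:\123\4567" would be accepted as a child of "C:\123\456" just because the prefix bytes match.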
> >
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +/*
> > > + * opendir_win32 - open a directory
> > > + *
> > > + * This function opens a directory and caches all directory entries.
> >
> > It just caches all file IDs, doesn't it?
> >
>
> Will fix it
>
> > > + */
> > > +DIR *opendir_win32(const char *full_file_name) {
> > > + HANDLE hDir = INVALID_HANDLE_VALUE;
> > > + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> > > + char *full_dir_entry = NULL;
> > > + DWORD attribute;
> > > + intptr_t dd_handle = -1;
> > > + struct _finddata_t dd_data;
> > > + uint64_t file_id;
> > > + uint64_t *file_id_list = NULL;
> > > + BY_HANDLE_FILE_INFORMATION FileInfo;
> >
> > FileInfo is the variable name, not a struct name, so no upper case for it
> > please.
>
> Will fix it.
> >
> > > + struct dir_win32 *stream = NULL;
> > > + int err = 0;
> > > + int find_status;
> > > + int sort_first_two_entry = 0;
> > > + uint32_t list_count = 16;
> >
> > Magic number 16?
>
> Will change it to a macro.
> >
> > > + uint32_t index = 0;
> > > +
> > > + /* open directory to prevent it being removed */
> > > +
> > > + hDir = CreateFile(full_file_name, GENERIC_READ,
> > > + FILE_SHARE_READ | FILE_SHARE_WRITE |
> > FILE_SHARE_DELETE,
> > > + NULL,
> > > + OPEN_EXISTING,
> > > + FILE_FLAG_BACKUP_SEMANTICS |
> > FILE_FLAG_OPEN_REPARSE_POINT,
> > > + NULL);
> > > +
> > > + if (hDir == INVALID_HANDLE_VALUE) {
> > > + err = win32_error_to_posix(GetLastError());
> > > + goto out;
> > > + }
> > > +
> > > + attribute = GetFileAttributes(full_file_name);
> > > +
> > > + /* symlink is not allow */
> > > + if (attribute == INVALID_FILE_ATTRIBUTES
> > > + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> > > + err = EACCES;
> > > + goto out;
> > > + }
> > > +
> > > + /* check if it is a directory */
> > > + if ((attribute & FILE_ATTRIBUTE_DIRECTORY) == 0) {
> > > + err = ENOTDIR;
> > > + goto out;
> > > + }
> > > +
> > > + file_id_list = g_malloc0(sizeof(uint64_t) * list_count);
> > > +
> > > + /*
> > > + * findfirst() needs suffix format name like "\dir1\dir2\*",
> > > + * allocate more buffer to store suffix.
> > > + */
> > > + stream = g_malloc0(sizeof(struct dir_win32) +
> > > + strlen(full_file_name) + 3);
> >
> > Not that I would care much, but +2 would be correct here, as you declared the
> > struct with one character already, so it is not a classic (zero size) flex
> > array:
> >
> > struct dir_win32 {
> > ...
> > char dd_name[1];
> > };
> >
> Will fix it.
>
> > > +
> > > + strcpy(stream->dd_name, full_file_name);
> > > + strcat(stream->dd_name, "\\*");
> > > +
> > > + stream->hDir = hDir;
> > > + stream->dir_name_len = strlen(full_file_name);
> > > +
> > > + dd_handle = _findfirst(stream->dd_name, &dd_data);
> > > +
> > > + if (dd_handle == -1) {
> > > + err = errno;
> > > + goto out;
> > > + }
> > > +
> > > + /* read all entries to link list */
> >
> > "read all entries as a linked list"
> >
> > However there is no linked list here. It seems to be an array.
>
> Will fix it.
> >
> > > + do {
> > > + full_dir_entry = get_full_path_win32(hDir, dd_data.name);
> > > +
> > > + if (full_dir_entry == NULL) {
> > > + err = ENOMEM;
> > > + break;
> > > + }
> > > +
> > > + /*
> > > + * Open every entry and get the file informations.
> > > + *
> > > + * Skip symbolic links during reading directory.
> > > + */
> > > + hDirEntry = CreateFile(full_dir_entry,
> > > + GENERIC_READ,
> > > + FILE_SHARE_READ | FILE_SHARE_WRITE
> > > + | FILE_SHARE_DELETE,
> > > + NULL,
> > > + OPEN_EXISTING,
> > > + FILE_FLAG_BACKUP_SEMANTICS
> > > + | FILE_FLAG_OPEN_REPARSE_POINT, NULL);
> > > +
> > > + if (hDirEntry != INVALID_HANDLE_VALUE) {
> > > + if (GetFileInformationByHandle(hDirEntry,
> > > + &FileInfo) == TRUE) {
> > > + attribute = FileInfo.dwFileAttributes;
> > > +
> > > + /* only save validate entries */
> > > + if ((attribute & FILE_ATTRIBUTE_REPARSE_POINT) == 0) {
> > > + if (index >= list_count) {
> > > + list_count = list_count + 16;
> >
> > Magic number 16 again.
> >
> > > + file_id_list = g_realloc(file_id_list,
> > > + sizeof(uint64_t)
> > > + * list_count);
> >
> > OK, so here we are finally at the point where you chose the overall behaviour
> > for this that we discussed before.
> >
> > So you are constantly appending 16 entry chunks to the end of the array,
> > periodically reallocate the entire array, and potentially end up with one
> > giant dense array with *all* file IDs of the directory.
> >
> > That's not really what I had in mind, as it still has the potential to easily
> > crash QEMU if there are large directories on host. Theoretically a Windows
> > directory might then consume up to 16 GB of RAM for looking up only one
> > single directory.
> >
> > So is this the implementation that you said was very slow, or did you test a
> > different one? Remember, my original idea (as starting point for Windows)
> > was to only cache *one* file ID (the last being looked up). That's it. Not a
> > list of file IDs.
>
> If we cache only one file ID, then every read-directory operation has to
> scan the whole directory to find the next ID larger than the last cached one.
>
> I provided some performance tests with the last patch:
> Reading a directory with 100, 1000, and 10000 entries:
> #1, file name cache solution, the time cost is: 2, 9, 44 (in ms).
> #2, file ID cache solution, the time cost is: 3, 438, 4338 (in ms). This is the current solution.
> #3, cache-one-ID solution, which I just tested: 4, 4788, more than one minute (in ms).
>
> I think caching one file ID is not a good idea; the performance would be very bad.
Yes, the performance would be lousy, but at least we would have a basis that
just works^TM. Correct behaviour always comes before performance. And from
there you could add additional patches on top to address performance
improvements. Because the point is: your implementation is also suboptimal,
and more importantly: prone to crashes like we discussed before.
Regarding performance: for instance you are re-allocating an entire dense
buffer on every 16 new entries. That will slow down things extremely. Please
use a container from glib, because these are handling resize operations more
smoothly for you out of the box, i.e. typically by doubling the container
capacity instead of re-allocating frequently with small chunks like you did.
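To make the resize argument concrete, here is a minimal sketch of the doubling strategy in plain C. The struct id_list type and its functions are hypothetical names used only for illustration; they are not part of the patch, and a glib container such as GArray would hide this bookkeeping entirely. The reallocs counter exists only to show how rarely the buffer has to grow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable list of file IDs; names are illustrative only. */
struct id_list {
    uint64_t *ids;
    size_t len;
    size_t cap;
    size_t reallocs;    /* number of capacity changes, for illustration */
};

/* Append one ID, doubling the capacity when full. Returns 0 or -1. */
int id_list_append(struct id_list *l, uint64_t id)
{
    assert(l != NULL);

    if (l->len == l->cap) {
        size_t new_cap = l->cap ? l->cap * 2 : 16;
        uint64_t *p = realloc(l->ids, new_cap * sizeof(*p));

        if (p == NULL) {
            return -1;  /* the old buffer is still valid and owned by l */
        }
        l->ids = p;
        l->cap = new_cap;
        l->reallocs++;
    }
    l->ids[l->len++] = id;
    return 0;
}

void id_list_free(struct id_list *l)
{
    assert(l != NULL);
    free(l->ids);
    l->ids = NULL;
    l->len = l->cap = 0;
}
```

With this scheme, 10000 entries need 11 capacity changes (16, 32, ... 16384), whereas growing by a fixed 16 entries needs more than 600. It does not address the memory concern, though: the list still holds every file ID of the directory.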
However I am still not convinced that allocating a huge dense buffer with
*all* file IDs of a directory makes sense.
In the long term it would make sense to do it like other implementations:
store a snapshot of the directory temporarily on disk. That way it would not
matter how huge the directory is. But that's a complex implementation, so not
something that I would do in this series already.
In the short/mid term I think we could simply make a mix of your solution and
the one-ID solution that I suggested: keeping a maximum of e.g. 1k file IDs in
RAM. And once guest seeks past that boundary, loading the subsequent 1k
entries, freeing the previous 1k entries, and so on.
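A rough sketch of that windowed approach, under stated assumptions: load_id_window() is a hypothetical helper, and the dir_ids array merely stands in for one full scan of the directory (on Windows that scan would be the _findfirst()/_findnext() loop). Each call keeps only the win_cap smallest file IDs that are greater than the last ID already delivered, so memory use is bounded by the window size regardless of directory size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: fill `win` with up to `win_cap` of the smallest IDs
 * from `dir_ids` that are greater than `last`, in ascending order. If
 * `have_last` is 0, start from the beginning. Returns the number of IDs
 * stored; 0 means the end of the directory was reached.
 */
size_t load_id_window(const uint64_t *dir_ids, size_t n_ids,
                      int have_last, uint64_t last,
                      uint64_t *win, size_t win_cap)
{
    size_t count = 0;

    assert(win != NULL || win_cap == 0);

    for (size_t i = 0; i < n_ids; i++) {
        uint64_t id = dir_ids[i];
        size_t pos;

        if (have_last && id <= last) {
            continue;           /* belongs to an earlier window */
        }
        if (count == win_cap) {
            if (win_cap == 0 || id >= win[count - 1]) {
                continue;       /* belongs to a later window */
            }
            count--;            /* evict the current largest ID */
        }

        /* insert in ascending order */
        pos = count;
        while (pos > 0 && win[pos - 1] > id) {
            win[pos] = win[pos - 1];
            pos--;
        }
        win[pos] = id;
        count++;
    }
    return count;
}
```

When the guest reads past the end of the window, the caller would pass the last ID of the current window as `last` and reload. That costs one directory scan per window instead of one per entry, which is the trade-off between the one-ID approach and the cache-everything approach. The insertion step is quadratic in the window size in the worst case, so a real implementation might prefer a heap.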
> >
> > > + }
> > > + file_id = (uint64_t)FileInfo.nFileIndexLow
> > > + + (((uint64_t)FileInfo.nFileIndexHigh)
> > > + << 32);
> > > +
> > > +
> > > + file_id_list[index] = file_id;
> > > +
> > > + if (strcmp(dd_data.name, ".") == 0) {
> > > + stream->dot_id = file_id_list[index];
> > > + if (index != 0) {
> > > + sort_first_two_entry = 1;
> > > + }
> > > + } else if (strcmp(dd_data.name, "..") == 0) {
> > > + stream->dot_dot_id = file_id_list[index];
> > > + if (index != 1) {
> > > + sort_first_two_entry = 1;
> > > + }
> > > + }
> > > + index++;
> > > + }
> > > + }
> > > + CloseHandle(hDirEntry);
> > > + }
> > > + g_free(full_dir_entry);
> > > + find_status = _findnext(dd_handle, &dd_data);
> > > + } while (find_status == 0);
> > > +
> > > + if (errno == ENOENT) {
> > > + /* No more matching files could be found, clean errno */
> > > + errno = 0;
> > > + } else {
> > > + err = errno;
> > > + goto out;
> > > + }
> > > +
> > > + stream->total_entries = index;
> > > + stream->file_id_list = file_id_list;
> > > +
> > > + if (sort_first_two_entry == 0) {
> > > + /*
> > > + * If the first two entry is "." and "..", then do not sort them.
> > > + *
> > > + * If the guest OS always considers first two entries are "." and
> > "..",
> > > + * sort the two entries may cause confused display in guest OS.
> > > + */
> > > + qsort(&file_id_list[2], index - 2, sizeof(file_id),
> > file_id_compare);
> > > + } else {
> > > + qsort(&file_id_list[0], index, sizeof(file_id), file_id_compare);
> > > + }
> >
> > Were there cases where you did not get "." and ".." ?
>
> NTFS always provides "." and "..".
> I could add more checks here to fix this risk
That's what I assumed. So you can probably just drop this code for simplicity.
>
> >
> > > +
> > > +out:
> > > + if (err != 0) {
> > > + errno = err;
> > > + if (stream != NULL) {
> > > + if (file_id_list != NULL) {
> > > + g_free(file_id_list);
> > > + }
> > > + CloseHandle(hDir);
> > > + g_free(stream);
> > > + stream = NULL;
> > > + }
> > > + }
> > > +
> > > + if (dd_handle != -1) {
> > > + _findclose(dd_handle);
> > > + }
> > > +
> > > + return (DIR *)stream;
> > > +}
> > > +
> > > +/*
> > > + * closedir_win32 - close a directory
> > > + *
> > > + * This function closes directory and free all cached resources.
> > > + */
> > > +int closedir_win32(DIR *pDir)
> > > +{
> > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > +
> > > + if (stream == NULL) {
> > > + errno = EBADF;
> > > + return -1;
> > > + }
> > > +
> > > + /* free all resources */
> > > + CloseHandle(stream->hDir);
> > > +
> > > + g_free(stream->file_id_list);
> > > +
> > > + g_free(stream);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +/*
> > > + * readdir_win32 - read a directory
> > > + *
> > > + * This function reads a directory entry from cached entry list.
> > > + */
> > > +struct dirent *readdir_win32(DIR *pDir) {
> > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > +
> > > + if (stream == NULL) {
> > > + errno = EBADF;
> > > + return NULL;
> > > + }
> > > +
> > > +retry:
> > > +
> > > + if (stream->offset >= stream->total_entries) {
> > > + /* reach to the end, return NULL without set errno */
> > > + return NULL;
> > > + }
> > > +
> > > + if (get_next_entry(stream) != 0) {
> > > + stream->offset++;
> > > + goto retry;
> > > + }
> > > +
> > > + /* Windows does not provide inode number */
> > > + stream->dd_dir.d_ino = 0;
> > > + stream->dd_dir.d_reclen = 0;
> > > + stream->dd_dir.d_namlen = strlen(stream->dd_dir.d_name);
> > > +
> > > + stream->offset++;
> > > +
> > > + return &stream->dd_dir;
> > > +}
> > > +
> > > +/*
> > > + * rewinddir_win32 - reset directory stream
> > > + *
> > > + * This function resets the position of the directory stream to the
> > > + * beginning of the directory.
> > > + */
> > > +void rewinddir_win32(DIR *pDir)
> > > +{
> > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > +
> > > + if (stream == NULL) {
> > > + errno = EBADF;
> > > + return;
> > > + }
> > > +
> > > + stream->offset = 0;
> > > +
> > > + return;
> > > +}
> > > +
> > > +/*
> > > + * seekdir_win32 - set the position of the next readdir() call in the
> > > +directory
> > > + *
> > > + * This function sets the position of the next readdir() call in the
> > > +directory
> > > + * from which the next readdir() call will start.
> > > + */
> > > +void seekdir_win32(DIR *pDir, long pos) {
> > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > +
> > > + if (stream == NULL) {
> > > + errno = EBADF;
> > > + return;
> > > + }
> > > +
> > > + if (pos < -1) {
> > > + errno = EINVAL;
> > > + return;
> > > + }
> > > +
> > > + if (pos == -1 || pos >= (long)stream->total_entries) {
> > > + /* seek to the end */
> > > + stream->offset = stream->total_entries;
> > > + return;
> > > + }
> > > +
> > > + if (pos - (long)stream->offset == 0) {
> > > + /* no need to seek */
> > > + return;
> > > + }
> > > +
> > > + stream->offset = pos;
> > > +
> > > + return;
> > > +}
> > > +
> > > +/*
> > > + * telldir_win32 - return current location in directory
> > > + *
> > > + * This function returns current location in directory.
> > > + */
> > > +long telldir_win32(DIR *pDir)
> > > +{
> > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > +
> > > + if (stream == NULL) {
> > > + errno = EBADF;
> > > + return -1;
> > > + }
> > > +
> > > + if (stream->offset > stream->total_entries) {
> > > + return -1;
> > > + }
> > > +
> > > + return (long)stream->offset;
> > > +}
> > >
> >
>
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* RE: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
2023-03-16 11:05 ` Christian Schoenebeck
@ 2023-03-16 17:28 ` Shi, Guohuai
2023-03-17 4:36 ` Shi, Guohuai
0 siblings, 1 reply; 32+ messages in thread
From: Shi, Guohuai @ 2023-03-16 17:28 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Meng, Bin
> -----Original Message-----
> From: Christian Schoenebeck <qemu_oss@crudebyte.com>
> Sent: Thursday, March 16, 2023 19:05
> To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org
> Cc: Meng, Bin <Bin.Meng@windriver.com>; Shi, Guohuai
> <Guohuai.Shi@windriver.com>
> Subject: Re: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir()
> APIs
>
> On Wednesday, March 15, 2023 8:05:34 PM CET Shi, Guohuai wrote:
> >
> > > -----Original Message-----
> > > From: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > Sent: Wednesday, March 15, 2023 00:06
> > > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org
> > > Cc: Shi, Guohuai <Guohuai.Shi@windriver.com>; Meng, Bin
> > > <Bin.Meng@windriver.com>
> > > Subject: Re: [PATCH v5 04/16] hw/9pfs: Implement Windows specific
> > > xxxdir() APIs
> > >
> > > On Monday, February 20, 2023 11:08:03 AM CET Bin Meng wrote:
> > > > From: Guohuai Shi <guohuai.shi@windriver.com>
> > > >
> > > > This commit implements Windows specific xxxdir() APIs for safety
> > > > directory access.
> > >
> > > That comment is seriously too short for this patch.
> > >
> > > 1. You should describe the behaviour implementation that you have
> > > chosen and why you have chosen it.
> > >
> > > 2. Like already said in the previous version of the patch, you
> > > should place a link to the discussion we had on this issue.
> > >
> > > > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
> > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > ---
> > > >
> > > > hw/9pfs/9p-util.h | 6 +
> > > > hw/9pfs/9p-util-win32.c | 443
> > > > ++++++++++++++++++++++++++++++++++++++++
> > > > 2 files changed, 449 insertions(+)
> > > >
> > > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index
> > > > 0f159fb4ce..c1c251fbd1 100644
> > > > --- a/hw/9pfs/9p-util.h
> > > > +++ b/hw/9pfs/9p-util.h
> > > > @@ -141,6 +141,12 @@ int unlinkat_win32(int dirfd, const char
> > > > *pathname, int flags); int statfs_win32(const char *root_path,
> > > > struct statfs *stbuf); int openat_dir(int dirfd, const char
> > > > *name); int openat_file(int dirfd, const char *name, int flags,
> > > > mode_t mode);
> > > > +DIR *opendir_win32(const char *full_file_name); int
> > > > +closedir_win32(DIR *pDir); struct dirent *readdir_win32(DIR
> > > > +*pDir); void rewinddir_win32(DIR *pDir); void seekdir_win32(DIR
> > > > +*pDir, long pos); long telldir_win32(DIR *pDir);
> > > > #endif
> > > >
> > > > static inline void close_preserve_errno(int fd) diff --git
> > > > a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index
> > > > a99d579a06..e9408f3c45 100644
> > > > --- a/hw/9pfs/9p-util-win32.c
> > > > +++ b/hw/9pfs/9p-util-win32.c
> > > > @@ -37,6 +37,16 @@
> > > > * Windows does not support opendir, the directory fd is created by
> > > > * CreateFile and convert to fd by _open_osfhandle(). Keep the fd
> open
> > > will
> > > > * lock and protect the directory (can not be modified or replaced)
> > > > + *
> > > > + * 5. Neither Windows native APIs, nor MinGW provide a POSIX
> > > > + compatible
> > > API for
> > > > + * acquiring directory entries in a safe way. Calling those APIs
> > > (native
> > > > + * _findfirst() and _findnext() or MinGW's readdir(), seekdir() and
> > > > + * telldir()) directly can lead to an inconsistent state if
> directory
> > > is
> > > > + * modified in between, e.g. the same directory appearing more than
> > > once
> > > > + * in output, or directories not appearing at all in output even
> though
> > > they
> > > > + * were neither newly created nor deleted. POSIX does not define
> what
> > > happens
> > > > + * with deleted or newly created directories in between, but it
> > > guarantees a
> > > > + * consistent state.
> > > > */
> > > >
> > > > #include "qemu/osdep.h"
> > > > @@ -51,6 +61,25 @@
> > > >
> > > > #define V9FS_MAGIC 0x53465039 /* string "9PFS" */
> > > >
> > > > +/*
> > > > + * MinGW and Windows does not provide a safe way to seek
> > > > +directory while other
> > > > + * thread is modifying the same directory.
> > > > + *
> > > > + * This structure is used to store sorted file id and ensure
> > > > +directory seek
> > > > + * consistency.
> > > > + */
> > > > +struct dir_win32 {
> > > > + struct dirent dd_dir;
> > > > + uint32_t offset;
> > > > + uint32_t total_entries;
> > > > + HANDLE hDir;
> > > > + uint32_t dir_name_len;
> > > > + uint64_t dot_id;
> > > > + uint64_t dot_dot_id;
> > > > + uint64_t *file_id_list;
> > > > + char dd_name[1];
> > > > +};
> > > > +
> > > > /*
> > > > * win32_error_to_posix - convert Win32 error to POSIX error number
> > > > *
> > > > @@ -977,3 +1006,417 @@ int qemu_mknodat(int dirfd, const char
> > > > *filename,
> > > mode_t mode, dev_t dev)
> > > > errno = ENOTSUP;
> > > > return -1;
> > > > }
> > > > +
> > > > +static int file_id_compare(const void *id_ptr1, const void
> > > > +*id_ptr2) {
> > > > + uint64_t id[2];
> > > > +
> > > > + id[0] = *(uint64_t *)id_ptr1;
> > > > + id[1] = *(uint64_t *)id_ptr2;
> > > > +
> > > > + if (id[0] > id[1]) {
> > > > + return 1;
> > > > + } else if (id[0] < id[1]) {
> > > > + return -1;
> > > > + } else {
> > > > + return 0;
> > > > + }
> > > > +}
> > > > +
> > > > +static int get_next_entry(struct dir_win32 *stream) {
> > > > + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> > > > + char *entry_name;
> > > > + char *entry_start;
> > > > + FILE_ID_DESCRIPTOR fid;
> > > > + DWORD attribute;
> > > > +
> > > > + if (stream->file_id_list[stream->offset] == stream->dot_id) {
> > > > + strcpy(stream->dd_dir.d_name, ".");
> > > > + return 0;
> > > > + }
> > > > +
> > > > + if (stream->file_id_list[stream->offset] == stream->dot_dot_id) {
> > > > + strcpy(stream->dd_dir.d_name, "..");
> > > > + return 0;
> > > > + }
> > > > +
> > > > + fid.dwSize = sizeof(fid);
> > > > + fid.Type = FileIdType;
> > > > +
> > > > + fid.FileId.QuadPart = stream->file_id_list[stream->offset];
> > > > +
> > > > + hDirEntry = OpenFileById(stream->hDir, &fid, GENERIC_READ,
> > > > + FILE_SHARE_READ | FILE_SHARE_WRITE
> > > > + | FILE_SHARE_DELETE,
> > > > + NULL,
> > > > + FILE_FLAG_BACKUP_SEMANTICS
> > > > + | FILE_FLAG_OPEN_REPARSE_POINT);
> > >
> > > What's the purpose of FILE_FLAG_OPEN_REPARSE_POINT here? As it's
> > > apparently not obvious, please add a comment.
> > >
> >
> > If this flag is not used and the file ID refers to a symbolic link, Windows
> > will not open the symbolic link itself, but the target file.
> > This flag is similar to the O_NOFOLLOW flag.
>
> OK, got it, thanks! But please add a comment in code that describes this.
>
> > > > +
> > > > + if (hDirEntry == INVALID_HANDLE_VALUE) {
> > > > + /*
> > > > + * Not open it successfully, it may be deleted.
> > >
> > > Wrong English. "Open failed, it may have been deleted in the meantime.".
> > >
> > > > + * Try next id.
> > > > + */
> > > > + return -1;
> > > > + }
> > > > +
> > > > + entry_name = get_full_path_win32(hDirEntry, NULL);
> > > > +
> > > > + CloseHandle(hDirEntry);
> > > > +
> > > > + if (entry_name == NULL) {
> > > > + return -1;
> > > > + }
> > > > +
> > > > + attribute = GetFileAttributes(entry_name);
> > > > +
> > > > + /* symlink is not allowed */
> > > > + if (attribute == INVALID_FILE_ATTRIBUTES
> > > > + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> > > > + return -1;
> > >
> > > Wouldn't it make sense to call warn_report_once() here to let the
> > > user know that he has some symlinks that are never delivered to guest?
> >
> > OK, Got it.
> >
> > >
> > > > + }
> > > > +
> > > > + if (memcmp(entry_name, stream->dd_name, stream->dir_name_len)
> > > > + !=
> > > > + 0) {
> > >
> > > No, that's unsafe. You want to use something like strncmp() instead.
> > >
> > > > + /*
> > > > + * The full entry file name should be a part of parent
> > > > + directory
> > > name,
> > > > + * except dot and dot_dot (is already handled).
> > > > + * If not, this entry should not be returned.
> > > > + */
> > > > + return -1;
> > > > + }
> > > > +
> > > > + entry_start = entry_name + stream->dir_name_len;
> > >
> > > s/entry_start/entry_basename/ ?
> > >
> > > > +
> > > > + /* skip slash */
> > > > + while (*entry_start == '\\') {
> > > > + entry_start++;
> > > > + }
> > > > +
> > > > + if (strchr(entry_start, '\\') != NULL) {
> > > > + return -1;
> > > > + }
> > > > +
> > > > + if (strlen(entry_start) == 0
> > > > + || strlen(entry_start) + 1 > sizeof(stream->dd_dir.d_name)) {
> > > > + return -1;
> > > > + }
> > > > + strcpy(stream->dd_dir.d_name, entry_start);
> > >
> > > g_path_get_basename() ? :)
> >
> > For above three comments:
> > This code is not good, should be fixed.
> > The code want to filter the following cases:
> > The parent directory path is not a part of entry's full path:
> > Parent: C:\123\456, entry: C:\123, C:\
> > Entry contains more than one name components:
> > Parent: C:\123\456, entry: C:\123\456\789\abc
> > Entry is zero length or name buffer is too long
> >
> > I will refactor this part.
>
> In general: writing parsing code yourself is extremely error prone. That's
> why it makes sense to use existing functions from glib, etc.
>
> > >
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +/*
> > > > + * opendir_win32 - open a directory
> > > > + *
> > > > + * This function opens a directory and caches all directory entries.
> > >
> > > It just caches all file IDs, doesn't it?
> > >
> >
> > Will fix it
> >
> > > > + */
> > > > +DIR *opendir_win32(const char *full_file_name) {
> > > > + HANDLE hDir = INVALID_HANDLE_VALUE;
> > > > + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> > > > + char *full_dir_entry = NULL;
> > > > + DWORD attribute;
> > > > + intptr_t dd_handle = -1;
> > > > + struct _finddata_t dd_data;
> > > > + uint64_t file_id;
> > > > + uint64_t *file_id_list = NULL;
> > > > + BY_HANDLE_FILE_INFORMATION FileInfo;
> > >
> > > FileInfo is the variable name, not a struct name, so no upper case
> > > for it please.
> >
> > Will fix it.
> > >
> > > > + struct dir_win32 *stream = NULL;
> > > > + int err = 0;
> > > > + int find_status;
> > > > + int sort_first_two_entry = 0;
> > > > + uint32_t list_count = 16;
> > >
> > > Magic number 16?
> >
> > Will change it to a macro.
> > >
> > > > + uint32_t index = 0;
> > > > +
> > > > + /* open directory to prevent it being removed */
> > > > +
> > > > + hDir = CreateFile(full_file_name, GENERIC_READ,
> > > > + FILE_SHARE_READ | FILE_SHARE_WRITE |
> > > FILE_SHARE_DELETE,
> > > > + NULL,
> > > > + OPEN_EXISTING,
> > > > + FILE_FLAG_BACKUP_SEMANTICS |
> > > FILE_FLAG_OPEN_REPARSE_POINT,
> > > > + NULL);
> > > > +
> > > > + if (hDir == INVALID_HANDLE_VALUE) {
> > > > + err = win32_error_to_posix(GetLastError());
> > > > + goto out;
> > > > + }
> > > > +
> > > > + attribute = GetFileAttributes(full_file_name);
> > > > +
> > > > + /* symlink is not allow */
> > > > + if (attribute == INVALID_FILE_ATTRIBUTES
> > > > + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> > > > + err = EACCES;
> > > > + goto out;
> > > > + }
> > > > +
> > > > + /* check if it is a directory */
> > > > + if ((attribute & FILE_ATTRIBUTE_DIRECTORY) == 0) {
> > > > + err = ENOTDIR;
> > > > + goto out;
> > > > + }
> > > > +
> > > > + file_id_list = g_malloc0(sizeof(uint64_t) * list_count);
> > > > +
> > > > + /*
> > > > + * findfirst() needs suffix format name like "\dir1\dir2\*",
> > > > + * allocate more buffer to store suffix.
> > > > + */
> > > > + stream = g_malloc0(sizeof(struct dir_win32) +
> > > > + strlen(full_file_name) + 3);
> > >
> > > Not that I would care much, but +2 would be correct here, as you
> > > declared the struct with one character already, so it is not a
> > > classic (zero size) flex
> > > array:
> > >
> > > struct dir_win32 {
> > > ...
> > > char dd_name[1];
> > > };
> > >
> > Will fix it.
> >
> > > > +
> > > > + strcpy(stream->dd_name, full_file_name);
> > > > + strcat(stream->dd_name, "\\*");
> > > > +
> > > > + stream->hDir = hDir;
> > > > + stream->dir_name_len = strlen(full_file_name);
> > > > +
> > > > + dd_handle = _findfirst(stream->dd_name, &dd_data);
> > > > +
> > > > + if (dd_handle == -1) {
> > > > + err = errno;
> > > > + goto out;
> > > > + }
> > > > +
> > > > + /* read all entries to link list */
> > >
> > > "read all entries as a linked list"
> > >
> > > However there is no linked list here. It seems to be an array.
> >
> > Will fix it.
> > >
> > > > + do {
> > > > + full_dir_entry = get_full_path_win32(hDir, dd_data.name);
> > > > +
> > > > + if (full_dir_entry == NULL) {
> > > > + err = ENOMEM;
> > > > + break;
> > > > + }
> > > > +
> > > > + /*
> > > > + * Open every entry and get the file informations.
> > > > + *
> > > > + * Skip symbolic links during reading directory.
> > > > + */
> > > > + hDirEntry = CreateFile(full_dir_entry,
> > > > + GENERIC_READ,
> > > > + FILE_SHARE_READ | FILE_SHARE_WRITE
> > > > + | FILE_SHARE_DELETE,
> > > > + NULL,
> > > > + OPEN_EXISTING,
> > > > + FILE_FLAG_BACKUP_SEMANTICS
> > > > + | FILE_FLAG_OPEN_REPARSE_POINT,
> > > > + NULL);
> > > > +
> > > > + if (hDirEntry != INVALID_HANDLE_VALUE) {
> > > > + if (GetFileInformationByHandle(hDirEntry,
> > > > + &FileInfo) == TRUE) {
> > > > + attribute = FileInfo.dwFileAttributes;
> > > > +
> > > > + /* only save validate entries */
> > > > + if ((attribute & FILE_ATTRIBUTE_REPARSE_POINT) == 0) {
> > > > + if (index >= list_count) {
> > > > + list_count = list_count + 16;
> > >
> > > Magic number 16 again.
> > >
> > > > + file_id_list = g_realloc(file_id_list,
> > > > + sizeof(uint64_t)
> > > > + * list_count);
> > >
> > > OK, so here we are finally at the point where you chose the overall
> > > behaviour for this that we discussed before.
> > >
> > > So you are constantly appending 16 entry chunks to the end of the
> > > array, periodically reallocate the entire array, and potentially end
> > > up with one giant dense array with *all* file IDs of the directory.
> > >
> > > That's not really what I had in mind, as it still has the potential
> > > to easily crash QEMU if there are large directories on host.
> > > Theoretically a Windows directory might then consume up to 16 GB of
> > > RAM for looking up only one single directory.
> > >
> > > So is this the implementation that you said was very slow, or did
> > > > you test a different one? Remember, my original idea (as starting
> > > point for Windows) was to only cache *one* file ID (the last being
> > > looked up). That's it. Not a list of file IDs.
> >
> > If only cache one file ID, that means for every read directory operation.
> > we need to look up whole directory to find out the next ID larger than last
> cached one.
> >
> > I provided some performance test in last patch:
> > Run test for read directory with 100, 1000, 10000 entries #1, For file
> > name cache solution, the time cost is: 2, 9, 44 (in ms).
> > #2, For file id cache solution, the time cost: 3, 438, 4338 (in ms). This
> is current solution.
> > #3, for cache one id solution, I just tested it: 4, 4788, more than
> > one minutes (in ms)
> >
> > I think it is not a good idea to cache one file id, it would be very
> > bad performance
>
> Yes, the performance would be lousy, but at least we would have a basis that
> just works^TM. Correct behaviour always comes before performance. And from
> there you could add additional patches on top to address performance
> improvements. Because the point is: your implementation is also suboptimal,
> and more importantly: prone to crashes like we discussed before.
>
> Regarding performance: for instance you are re-allocating an entire dense
> buffer on every 16 new entries. That will slow down things extremely. Please
> use a container from glib, because these are handling resize operations more
> smoothly for you out of the box, i.e. typically by doubling the container
> capacity instead of re-allocating frequently with small chunks like you did.
>
> However I am still not convinced that allocating a huge dense buffer with
> *all* file IDs of a directory makes sense.
>
> On the long-term it would make sense to do it like other implementations:
> store a snapshot of the directory temporarily on disk. That way it would not
> matter how huge the directory is. But that's a complex implementation, so not
> something that I would do in this series already.
>
> On the short/mid term I think we could simply make a mix of your solution and
> the one-ID solution that I suggested: keeping a maximum of e.g. 1k file IDs
> in RAM. And once guest seeks past that boundary, loading the subsequent 1k
> entries, free-ing the previous 1k entries, and so on.
>
Please note that the performance data above was measured on the native OS, not in QEMU.
It is even worse in QEMU.
I ran a Linux guest OS on a Windows host and used the "ls -l" command to list a directory with about 100 entries.
The "ls -l" command needed about 0.5 seconds to display each directory entry.
Caching only one node (file ID, file name, or anything else) would make 9pfs unusable: listing 100 directory entries takes about 50 seconds in the guest OS.
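
For reference, the reallocation concern in the quoted review (growing the file ID array in fixed 16-entry chunks versus doubling its capacity) can be sketched by counting reallocations only. This is a rough illustration, not code from the patch: the function names are made up for this example, and the doubling policy is an assumption about a typical growable container rather than a statement about glib internals.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not part of the patch): count how many times a
 * buffer must be reallocated until it can hold `entries` items.
 * Precondition: `initial` and `chunk` are greater than zero.
 */

/* Grow by a fixed chunk, as the loop in opendir_win32() does with 16. */
uint32_t reallocs_fixed_chunk(uint32_t entries, uint32_t initial,
                              uint32_t chunk)
{
    uint64_t capacity = initial;   /* 64 bit so the sum cannot wrap */
    uint32_t count = 0;

    while (capacity < entries) {
        capacity += chunk;
        count++;
    }
    return count;
}

/* Grow by doubling the capacity (assumed policy of a growable container). */
uint32_t reallocs_doubling(uint32_t entries, uint32_t initial)
{
    uint64_t capacity = initial;   /* 64 bit so doubling cannot wrap */
    uint32_t count = 0;

    while (capacity < entries) {
        capacity *= 2;
        count++;
    }
    return count;
}
```

With an initial capacity of 16, a directory with 10000 entries needs 624 reallocations when growing in 16-entry chunks, but only 10 when doubling. This counts reallocations only and says nothing about the total memory held, which is the other concern in the review.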
> > >
> > > > + }
> > > > + file_id = (uint64_t)FileInfo.nFileIndexLow
> > > > + +
> > > > + (((uint64_t)FileInfo.nFileIndexHigh)
> > > > + << 32);
> > > > +
> > > > +
> > > > + file_id_list[index] = file_id;
> > > > +
> > > > + if (strcmp(dd_data.name, ".") == 0) {
> > > > + stream->dot_id = file_id_list[index];
> > > > + if (index != 0) {
> > > > + sort_first_two_entry = 1;
> > > > + }
> > > > + } else if (strcmp(dd_data.name, "..") == 0) {
> > > > + stream->dot_dot_id = file_id_list[index];
> > > > + if (index != 1) {
> > > > + sort_first_two_entry = 1;
> > > > + }
> > > > + }
> > > > + index++;
> > > > + }
> > > > + }
> > > > + CloseHandle(hDirEntry);
> > > > + }
> > > > + g_free(full_dir_entry);
> > > > + find_status = _findnext(dd_handle, &dd_data);
> > > > + } while (find_status == 0);
> > > > +
> > > > + if (errno == ENOENT) {
> > > > + /* No more matching files could be found, clean errno */
> > > > + errno = 0;
> > > > + } else {
> > > > + err = errno;
> > > > + goto out;
> > > > + }
> > > > +
> > > > + stream->total_entries = index;
> > > > + stream->file_id_list = file_id_list;
> > > > +
> > > > + if (sort_first_two_entry == 0) {
> > > > + /*
> > > > + * If the first two entry is "." and "..", then do not sort
> them.
> > > > + *
> > > > + * If the guest OS always considers first two entries are
> > > > + "." and
> > > "..",
> > > > + * sort the two entries may cause confused display in guest
> OS.
> > > > + */
> > > > + qsort(&file_id_list[2], index - 2, sizeof(file_id),
> > > file_id_compare);
> > > > + } else {
> > > > + qsort(&file_id_list[0], index, sizeof(file_id),
> file_id_compare);
> > > > + }
> > >
> > > Were there cases where you did not get "." and ".." ?
> >
> > NTFS always provides "." and "..".
> > I could add more checks here to fix this risk
>
> That's what I assumed. So you can probably just drop this code for
> simplicity.
>
> >
> > >
> > > > +
> > > > +out:
> > > > + if (err != 0) {
> > > > + errno = err;
> > > > + if (stream != NULL) {
> > > > + if (file_id_list != NULL) {
> > > > + g_free(file_id_list);
> > > > + }
> > > > + CloseHandle(hDir);
> > > > + g_free(stream);
> > > > + stream = NULL;
> > > > + }
> > > > + }
> > > > +
> > > > + if (dd_handle != -1) {
> > > > + _findclose(dd_handle);
> > > > + }
> > > > +
> > > > + return (DIR *)stream;
> > > > +}
> > > > +
> > > > +/*
> > > > + * closedir_win32 - close a directory
> > > > + *
> > > > + * This function closes directory and free all cached resources.
> > > > + */
> > > > +int closedir_win32(DIR *pDir)
> > > > +{
> > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > +
> > > > + if (stream == NULL) {
> > > > + errno = EBADF;
> > > > + return -1;
> > > > + }
> > > > +
> > > > + /* free all resources */
> > > > + CloseHandle(stream->hDir);
> > > > +
> > > > + g_free(stream->file_id_list);
> > > > +
> > > > + g_free(stream);
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +/*
> > > > + * readdir_win32 - read a directory
> > > > + *
> > > > + * This function reads a directory entry from cached entry list.
> > > > + */
> > > > +struct dirent *readdir_win32(DIR *pDir) {
> > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > +
> > > > + if (stream == NULL) {
> > > > + errno = EBADF;
> > > > + return NULL;
> > > > + }
> > > > +
> > > > +retry:
> > > > +
> > > > + if (stream->offset >= stream->total_entries) {
> > > > + /* reach to the end, return NULL without set errno */
> > > > + return NULL;
> > > > + }
> > > > +
> > > > + if (get_next_entry(stream) != 0) {
> > > > + stream->offset++;
> > > > + goto retry;
> > > > + }
> > > > +
> > > > + /* Windows does not provide inode number */
> > > > + stream->dd_dir.d_ino = 0;
> > > > + stream->dd_dir.d_reclen = 0;
> > > > + stream->dd_dir.d_namlen = strlen(stream->dd_dir.d_name);
> > > > +
> > > > + stream->offset++;
> > > > +
> > > > + return &stream->dd_dir;
> > > > +}
> > > > +
> > > > +/*
> > > > + * rewinddir_win32 - reset directory stream
> > > > + *
> > > > + * This function resets the position of the directory stream to
> > > > +the
> > > > + * beginning of the directory.
> > > > + */
> > > > +void rewinddir_win32(DIR *pDir)
> > > > +{
> > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > +
> > > > + if (stream == NULL) {
> > > > + errno = EBADF;
> > > > + return;
> > > > + }
> > > > +
> > > > + stream->offset = 0;
> > > > +
> > > > + return;
> > > > +}
> > > > +
> > > > +/*
> > > > + * seekdir_win32 - set the position of the next readdir() call in
> > > > +the directory
> > > > + *
> > > > + * This function sets the position of the next readdir() call in
> > > > +the directory
> > > > + * from which the next readdir() call will start.
> > > > + */
> > > > +void seekdir_win32(DIR *pDir, long pos) {
> > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > +
> > > > + if (stream == NULL) {
> > > > + errno = EBADF;
> > > > + return;
> > > > + }
> > > > +
> > > > + if (pos < -1) {
> > > > + errno = EINVAL;
> > > > + return;
> > > > + }
> > > > +
> > > > + if (pos == -1 || pos >= (long)stream->total_entries) {
> > > > + /* seek to the end */
> > > > + stream->offset = stream->total_entries;
> > > > + return;
> > > > + }
> > > > +
> > > > + if (pos - (long)stream->offset == 0) {
> > > > + /* no need to seek */
> > > > + return;
> > > > + }
> > > > +
> > > > + stream->offset = pos;
> > > > +
> > > > + return;
> > > > +}
> > > > +
> > > > +/*
> > > > + * telldir_win32 - return current location in directory
> > > > + *
> > > > + * This function returns current location in directory.
> > > > + */
> > > > +long telldir_win32(DIR *pDir)
> > > > +{
> > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > +
> > > > + if (stream == NULL) {
> > > > + errno = EBADF;
> > > > + return -1;
> > > > + }
> > > > +
> > > > + if (stream->offset > stream->total_entries) {
> > > > + return -1;
> > > > + }
> > > > +
> > > > + return (long)stream->offset;
> > > > +}
> > > >
> > >
> >
> >
> >
>
>
* RE: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
2023-03-16 17:28 ` Shi, Guohuai
@ 2023-03-17 4:36 ` Shi, Guohuai
2023-03-17 12:16 ` Christian Schoenebeck
0 siblings, 1 reply; 32+ messages in thread
From: Shi, Guohuai @ 2023-03-17 4:36 UTC (permalink / raw)
To: Christian Schoenebeck, Greg Kurz, qemu-devel; +Cc: Meng, Bin
> -----Original Message-----
> From: Shi, Guohuai
> Sent: Friday, March 17, 2023 01:28
> To: Christian Schoenebeck <qemu_oss@crudebyte.com>; Greg Kurz
> <groug@kaod.org>; qemu-devel@nongnu.org
> Cc: Meng, Bin <Bin.Meng@windriver.com>
> Subject: RE: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir()
> APIs
>
>
>
> > -----Original Message-----
> > From: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > Sent: Thursday, March 16, 2023 19:05
> > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org
> > Cc: Meng, Bin <Bin.Meng@windriver.com>; Shi, Guohuai
> > <Guohuai.Shi@windriver.com>
> > Subject: Re: [PATCH v5 04/16] hw/9pfs: Implement Windows specific
> > xxxdir() APIs
> >
> > CAUTION: This email comes from a non Wind River email account!
> > Do not click links or open attachments unless you recognize the sender
> > and know the content is safe.
> >
> > On Wednesday, March 15, 2023 8:05:34 PM CET Shi, Guohuai wrote:
> > >
> > > > -----Original Message-----
> > > > From: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > > Sent: Wednesday, March 15, 2023 00:06
> > > > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org
> > > > Cc: Shi, Guohuai <Guohuai.Shi@windriver.com>; Meng, Bin
> > > > <Bin.Meng@windriver.com>
> > > > Subject: Re: [PATCH v5 04/16] hw/9pfs: Implement Windows specific
> > > > xxxdir() APIs
> > > >
> > > > CAUTION: This email comes from a non Wind River email account!
> > > > Do not click links or open attachments unless you recognize the
> > > > sender and know the content is safe.
> > > >
> > > > On Monday, February 20, 2023 11:08:03 AM CET Bin Meng wrote:
> > > > > From: Guohuai Shi <guohuai.shi@windriver.com>
> > > > >
> > > > > This commit implements Windows specific xxxdir() APIs for safety
> > > > > directory access.
> > > >
> > > > That comment is seriously too short for this patch.
> > > >
> > > > 1. You should describe the behaviour implementation that you have
> > > > chosen and why you have chosen it.
> > > >
> > > > 2. Like already said in the previous version of the patch, you
> > > > should place a link to the discussion we had on this issue.
> > > >
> > > > > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com>
> > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > ---
> > > > >
> > > > > hw/9pfs/9p-util.h | 6 +
> > > > > hw/9pfs/9p-util-win32.c | 443
> > > > > ++++++++++++++++++++++++++++++++++++++++
> > > > > 2 files changed, 449 insertions(+)
> > > > >
> > > > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index
> > > > > 0f159fb4ce..c1c251fbd1 100644
> > > > > --- a/hw/9pfs/9p-util.h
> > > > > +++ b/hw/9pfs/9p-util.h
> > > > > @@ -141,6 +141,12 @@ int unlinkat_win32(int dirfd, const char
> > > > > *pathname, int flags); int statfs_win32(const char *root_path,
> > > > > struct statfs *stbuf); int openat_dir(int dirfd, const char
> > > > > *name); int openat_file(int dirfd, const char *name, int flags,
> > > > > mode_t mode);
> > > > > +DIR *opendir_win32(const char *full_file_name); int
> > > > > +closedir_win32(DIR *pDir); struct dirent *readdir_win32(DIR
> > > > > +*pDir); void rewinddir_win32(DIR *pDir); void seekdir_win32(DIR
> > > > > +*pDir, long pos); long telldir_win32(DIR *pDir);
> > > > > #endif
> > > > >
> > > > > static inline void close_preserve_errno(int fd) diff --git
> > > > > a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index
> > > > > a99d579a06..e9408f3c45 100644
> > > > > --- a/hw/9pfs/9p-util-win32.c
> > > > > +++ b/hw/9pfs/9p-util-win32.c
> > > > > @@ -37,6 +37,16 @@
> > > > > * Windows does not support opendir, the directory fd is created by
> > > > > * CreateFile and convert to fd by _open_osfhandle(). Keep the fd
> > open
> > > > will
> > > > > * lock and protect the directory (can not be modified or replaced)
> > > > > + *
> > > > > + * 5. Neither Windows native APIs, nor MinGW provide a POSIX
> > > > > + compatible
> > > > API for
> > > > > + * acquiring directory entries in a safe way. Calling those APIs
> > > > (native
> > > > > + * _findfirst() and _findnext() or MinGW's readdir(), seekdir() and
> > > > > + * telldir()) directly can lead to an inconsistent state if
> > directory
> > > > is
> > > > > + * modified in between, e.g. the same directory appearing more
> than
> > > > once
> > > > > + * in output, or directories not appearing at all in output even
> > though
> > > > they
> > > > > + * were neither newly created nor deleted. POSIX does not define
> > what
> > > > happens
> > > > > + * with deleted or newly created directories in between, but it
> > > > guarantees a
> > > > > + * consistent state.
> > > > > */
> > > > >
> > > > > #include "qemu/osdep.h"
> > > > > @@ -51,6 +61,25 @@
> > > > >
> > > > > #define V9FS_MAGIC 0x53465039 /* string "9PFS" */
> > > > >
> > > > > +/*
> > > > > + * MinGW and Windows does not provide a safe way to seek
> > > > > +directory while other
> > > > > + * thread is modifying the same directory.
> > > > > + *
> > > > > + * This structure is used to store sorted file id and ensure
> > > > > +directory seek
> > > > > + * consistency.
> > > > > + */
> > > > > +struct dir_win32 {
> > > > > + struct dirent dd_dir;
> > > > > + uint32_t offset;
> > > > > + uint32_t total_entries;
> > > > > + HANDLE hDir;
> > > > > + uint32_t dir_name_len;
> > > > > + uint64_t dot_id;
> > > > > + uint64_t dot_dot_id;
> > > > > + uint64_t *file_id_list;
> > > > > + char dd_name[1];
> > > > > +};
> > > > > +
> > > > > /*
> > > > > * win32_error_to_posix - convert Win32 error to POSIX error
> number
> > > > > *
> > > > > @@ -977,3 +1006,417 @@ int qemu_mknodat(int dirfd, const char
> > > > > *filename,
> > > > mode_t mode, dev_t dev)
> > > > > errno = ENOTSUP;
> > > > > return -1;
> > > > > }
> > > > > +
> > > > > +static int file_id_compare(const void *id_ptr1, const void
> > > > > +*id_ptr2) {
> > > > > + uint64_t id[2];
> > > > > +
> > > > > + id[0] = *(uint64_t *)id_ptr1;
> > > > > + id[1] = *(uint64_t *)id_ptr2;
> > > > > +
> > > > > + if (id[0] > id[1]) {
> > > > > + return 1;
> > > > > + } else if (id[0] < id[1]) {
> > > > > + return -1;
> > > > > + } else {
> > > > > + return 0;
> > > > > + }
> > > > > +}
> > > > > +
> > > > > +static int get_next_entry(struct dir_win32 *stream) {
> > > > > + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> > > > > + char *entry_name;
> > > > > + char *entry_start;
> > > > > + FILE_ID_DESCRIPTOR fid;
> > > > > + DWORD attribute;
> > > > > +
> > > > > + if (stream->file_id_list[stream->offset] == stream->dot_id) {
> > > > > + strcpy(stream->dd_dir.d_name, ".");
> > > > > + return 0;
> > > > > + }
> > > > > +
> > > > > + if (stream->file_id_list[stream->offset] == stream->dot_dot_id) {
> > > > > + strcpy(stream->dd_dir.d_name, "..");
> > > > > + return 0;
> > > > > + }
> > > > > +
> > > > > + fid.dwSize = sizeof(fid);
> > > > > + fid.Type = FileIdType;
> > > > > +
> > > > > + fid.FileId.QuadPart = stream->file_id_list[stream->offset];
> > > > > +
> > > > > + hDirEntry = OpenFileById(stream->hDir, &fid, GENERIC_READ,
> > > > > + FILE_SHARE_READ | FILE_SHARE_WRITE
> > > > > + | FILE_SHARE_DELETE,
> > > > > + NULL,
> > > > > + FILE_FLAG_BACKUP_SEMANTICS
> > > > > + | FILE_FLAG_OPEN_REPARSE_POINT);
> > > >
> > > > What's the purpose of FILE_FLAG_OPEN_REPARSE_POINT here? As it's
> > > > apparently not obvious, please add a comment.
> > > >
> > >
> > > If do not use this flag, and if file id is a symbolic link, then
> > > Windows
> > will not symbolic link itself, but open the target file.
> > > This flag is similar as O_NOFOLLOW flag.
> >
> > OK, got it, thanks! But please add a comment in code that describes this.
> >
> > > > > +
> > > > > + if (hDirEntry == INVALID_HANDLE_VALUE) {
> > > > > + /*
> > > > > + * Not open it successfully, it may be deleted.
> > > >
> > > > Wrong English. "Open failed, it may have been deleted in the
> meantime.".
> > > >
> > > > > + * Try next id.
> > > > > + */
> > > > > + return -1;
> > > > > + }
> > > > > +
> > > > > + entry_name = get_full_path_win32(hDirEntry, NULL);
> > > > > +
> > > > > + CloseHandle(hDirEntry);
> > > > > +
> > > > > + if (entry_name == NULL) {
> > > > > + return -1;
> > > > > + }
> > > > > +
> > > > > + attribute = GetFileAttributes(entry_name);
> > > > > +
> > > > > + /* symlink is not allowed */
> > > > > + if (attribute == INVALID_FILE_ATTRIBUTES
> > > > > + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> > > > > + return -1;
> > > >
> > > > Wouldn't it make sense to call warn_report_once() here to let the
> > > > user know that he has some symlinks that are never delivered to guest?
> > >
> > > OK, Got it.
> > >
> > > >
> > > > > + }
> > > > > +
> > > > > + if (memcmp(entry_name, stream->dd_name,
> > > > > + stream->dir_name_len) !=
> > > > > + 0) {
> > > >
> > > > No, that's unsafe. You want to use something like strncmp() instead.
> > > >
> > > > > + /*
> > > > > + * The full entry file name should be a part of parent
> > > > > + directory
> > > > name,
> > > > > + * except dot and dot_dot (is already handled).
> > > > > + * If not, this entry should not be returned.
> > > > > + */
> > > > > + return -1;
> > > > > + }
> > > > > +
> > > > > + entry_start = entry_name + stream->dir_name_len;
> > > >
> > > > s/entry_start/entry_basename/ ?
> > > >
> > > > > +
> > > > > + /* skip slash */
> > > > > + while (*entry_start == '\\') {
> > > > > + entry_start++;
> > > > > + }
> > > > > +
> > > > > + if (strchr(entry_start, '\\') != NULL) {
> > > > > + return -1;
> > > > > + }
> > > > > +
> > > > > + if (strlen(entry_start) == 0
> > > > > + || strlen(entry_start) + 1 > sizeof(stream->dd_dir.d_name)) {
> > > > > + return -1;
> > > > > + }
> > > > > + strcpy(stream->dd_dir.d_name, entry_start);
> > > >
> > > > g_path_get_basename() ? :)
> > >
> > > For above three comments:
> > > This code is not good, should be fixed.
> > > The code want to filter the following cases:
> > > The parent directory path is not a part of entry's full path:
> > > Parent: C:\123\456, entry: C:\123, C:\ Entry contains more than one
> > > name components:
> > > Parent: C:\123\456, entry: C:\123\456\789\abc Entry is zero length
> > > or name buffer is too long
> > >
> > > I will refactor this part.
> >
> > In general: writing parsing code yourself is extremely error prone.
> > That's why it makes sense to use existing functions from glib, etc.
> >
> > > >
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * opendir_win32 - open a directory
> > > > > + *
> > > > > + * This function opens a directory and caches all directory entries.
> > > >
> > > > It just caches all file IDs, doesn't it?
> > > >
> > >
> > > Will fix it
> > >
> > > > > + */
> > > > > +DIR *opendir_win32(const char *full_file_name) {
> > > > > + HANDLE hDir = INVALID_HANDLE_VALUE;
> > > > > + HANDLE hDirEntry = INVALID_HANDLE_VALUE;
> > > > > + char *full_dir_entry = NULL;
> > > > > + DWORD attribute;
> > > > > + intptr_t dd_handle = -1;
> > > > > + struct _finddata_t dd_data;
> > > > > + uint64_t file_id;
> > > > > + uint64_t *file_id_list = NULL;
> > > > > + BY_HANDLE_FILE_INFORMATION FileInfo;
> > > >
> > > > FileInfo is the variable name, not a struct name, so no upper case
> > > > for it please.
> > >
> > > Will fix it.
> > > >
> > > > > + struct dir_win32 *stream = NULL;
> > > > > + int err = 0;
> > > > > + int find_status;
> > > > > + int sort_first_two_entry = 0;
> > > > > + uint32_t list_count = 16;
> > > >
> > > > Magic number 16?
> > >
> > > Will change it to a macro.
> > > >
> > > > > + uint32_t index = 0;
> > > > > +
> > > > > + /* open directory to prevent it being removed */
> > > > > +
> > > > > + hDir = CreateFile(full_file_name, GENERIC_READ,
> > > > > + FILE_SHARE_READ | FILE_SHARE_WRITE |
> > > > FILE_SHARE_DELETE,
> > > > > + NULL,
> > > > > + OPEN_EXISTING,
> > > > > + FILE_FLAG_BACKUP_SEMANTICS |
> > > > FILE_FLAG_OPEN_REPARSE_POINT,
> > > > > + NULL);
> > > > > +
> > > > > + if (hDir == INVALID_HANDLE_VALUE) {
> > > > > + err = win32_error_to_posix(GetLastError());
> > > > > + goto out;
> > > > > + }
> > > > > +
> > > > > + attribute = GetFileAttributes(full_file_name);
> > > > > +
> > > > > + /* symlink is not allow */
> > > > > + if (attribute == INVALID_FILE_ATTRIBUTES
> > > > > + || (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) {
> > > > > + err = EACCES;
> > > > > + goto out;
> > > > > + }
> > > > > +
> > > > > + /* check if it is a directory */
> > > > > + if ((attribute & FILE_ATTRIBUTE_DIRECTORY) == 0) {
> > > > > + err = ENOTDIR;
> > > > > + goto out;
> > > > > + }
> > > > > +
> > > > > + file_id_list = g_malloc0(sizeof(uint64_t) * list_count);
> > > > > +
> > > > > + /*
> > > > > + * findfirst() needs suffix format name like "\dir1\dir2\*",
> > > > > + * allocate more buffer to store suffix.
> > > > > + */
> > > > > + stream = g_malloc0(sizeof(struct dir_win32) +
> > > > > + strlen(full_file_name) + 3);
> > > >
> > > > Not that I would care much, but +2 would be correct here, as you
> > > > declared the struct with one character already, so it is not a
> > > > classic (zero size) flex
> > > > array:
> > > >
> > > > struct dir_win32 {
> > > > ...
> > > > char dd_name[1];
> > > > };
> > > >
> > > Will fix it.
> > >
> > > > > +
> > > > > + strcpy(stream->dd_name, full_file_name);
> > > > > + strcat(stream->dd_name, "\\*");
> > > > > +
> > > > > + stream->hDir = hDir;
> > > > > + stream->dir_name_len = strlen(full_file_name);
> > > > > +
> > > > > + dd_handle = _findfirst(stream->dd_name, &dd_data);
> > > > > +
> > > > > + if (dd_handle == -1) {
> > > > > + err = errno;
> > > > > + goto out;
> > > > > + }
> > > > > +
> > > > > + /* read all entries to link list */
> > > >
> > > > "read all entries as a linked list"
> > > >
> > > > However there is no linked list here. It seems to be an array.
> > >
> > > Will fix it.
> > > >
> > > > > + do {
> > > > > + full_dir_entry = get_full_path_win32(hDir,
> > > > > + dd_data.name);
> > > > > +
> > > > > + if (full_dir_entry == NULL) {
> > > > > + err = ENOMEM;
> > > > > + break;
> > > > > + }
> > > > > +
> > > > > + /*
> > > > > + * Open every entry and get the file informations.
> > > > > + *
> > > > > + * Skip symbolic links during reading directory.
> > > > > + */
> > > > > + hDirEntry = CreateFile(full_dir_entry,
> > > > > + GENERIC_READ,
> > > > > + FILE_SHARE_READ | FILE_SHARE_WRITE
> > > > > + | FILE_SHARE_DELETE,
> > > > > + NULL,
> > > > > + OPEN_EXISTING,
> > > > > + FILE_FLAG_BACKUP_SEMANTICS
> > > > > + | FILE_FLAG_OPEN_REPARSE_POINT,
> > > > > + NULL);
> > > > > +
> > > > > + if (hDirEntry != INVALID_HANDLE_VALUE) {
> > > > > + if (GetFileInformationByHandle(hDirEntry,
> > > > > + &FileInfo) == TRUE) {
> > > > > + attribute = FileInfo.dwFileAttributes;
> > > > > +
> > > > > + /* only save validate entries */
> > > > > + if ((attribute & FILE_ATTRIBUTE_REPARSE_POINT) == 0) {
> > > > > + if (index >= list_count) {
> > > > > + list_count = list_count + 16;
> > > >
> > > > Magic number 16 again.
> > > >
> > > > > + file_id_list = g_realloc(file_id_list,
> > > > > + sizeof(uint64_t)
> > > > > + * list_count);
> > > >
> > > > OK, so here we are finally at the point where you chose the
> > > > overall behaviour for this that we discussed before.
> > > >
> > > > So you are constantly appending 16 entry chunks to the end of the
> > > > array, periodically reallocate the entire array, and potentially
> > > > end up with one giant dense array with *all* file IDs of the directory.
> > > >
> > > > That's not really what I had in mind, as it still has the
> > > > potential to easily crash QEMU if there are large directories on host.
> > > > Theoretically a Windows directory might then consume up to 16 GB
> > > > of RAM for looking up only one single directory.
> > > >
> > > > So is this the implementation that you said was very slow, or did
> > > > > you test a different one? Remember, my original idea (as starting
> > > > point for Windows) was to only cache *one* file ID (the last being
> > > > looked up). That's it. Not a list of file IDs.
> > >
> > > If only cache one file ID, that means for every read directory operation.
> > > we need to look up whole directory to find out the next ID larger
> > > than last
> > cached one.
> > >
> > > I provided some performance test in last patch:
> > > Run test for read directory with 100, 1000, 10000 entries #1, For
> > > file name cache solution, the time cost is: 2, 9, 44 (in ms).
> > > #2, For file id cache solution, the time cost: 3, 438, 4338 (in ms).
> > > This
> > is current solution.
> > > #3, for cache one id solution, I just tested it: 4, 4788, more than
> > > one minutes (in ms)
> > >
> > > I think it is not a good idea to cache one file id, it would be very
> > > bad performance
> >
> > Yes, the performance would be lousy, but at least we would have a basis
> > that just works^TM. Correct behaviour always comes before performance.
> > And from there you could add additional patches on top to address
> > performance improvements. Because the point is: your implementation is
> > also suboptimal, and more importantly: prone to crashes like we discussed
> before.
> >
> > Regarding performance: for instance you are re-allocating an entire
> > dense buffer on every 16 new entries. That will slow down things
> > extremely. Please use a container from glib, because these are
> > handling resize operations more smoothly for you out of the box, i.e.
> > typically by doubling the container capacity instead of re-allocating
> frequently with small chunks like you did.
> >
> > However I am still not convinced that allocating a huge dense buffer
> > with
> > *all* file IDs of a directory makes sense.
> >
> > On the long-term it would make sense to do it like other implementations:
> > store a snapshot of the directory temporarily on disk. That way it
> > would not matter how huge the directory is. But that's a complex
> > implementation, so not something that I would do in this series already.
> >
> > On the short/mid term I think we could simply make a mix of your
> > solution and the one-ID solution that I suggested: keeping a maximum
> > of e.g. 1k file IDs in RAM. And once guest seeks past that boundary,
> > loading the subsequent 1k entries, free-ing the previous 1k entries, and so
> on.
> >
>
> Please note that the performance data is tested in native OS, but not in
> QEMU.
> It is even worse in QEMU.
>
> I run Linux guest OS on Windows host, use "ls -l" command to list a directory
> with about 100 entries.
> "ls -l" command need about 0.5 second to display one directory entry.
>
> Caching only one node (file id, or file name, or others) will make 9pfs not
> usable: listing 100 directory entries need 50 seconds in guest OS.
I have to point out that you are missing random access to a directory, which is the key to performance.
In the QEMU 9p directory reading code, it tries to read as many entries as possible (in function do_readdir_many).
When the buffer is not large enough, do_readdir_many re-seeks to the last read entry.
The key point is this "re-seek" of the directory.
Reading a directory always reads the next entry, so caching one ID would be OK there, with little performance impact.
But seeking a directory may seek to anywhere, so seeking requires all IDs to be cached.
Consider this case:
There are 100 files in a directory, named "file001" to "file100".
Currently, the next entry to read is "file050".
Now the user wants to seek to directory offset 20 (which should be "file020").
Because we only cached one ID ("file050"), we do not know the file ID for offset 20.
So we can only get the file ID at offset 0 (which needs a search of the whole directory for the minimal ID), then the file ID at offset 1, and so on up to offset 20.
So for random access, seeking to offset N in a directory with M entries requires searching the whole directory N times, reading M*N entries in total.
If there are 1000 files in a directory and we want to seek to offset 1000, we need to open files 1000*1000 times.
For the worst test case, read + seek + read for 1000 files, 9p on a Windows host needs to open files 1000*(1 + 2 + 3 + ... + 1000) = 500500000 times. It may need several hours to finish.
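As a rough check of the arithmetic above, here is a small sketch in C. It is only a cost model, not QEMU code: the function names are made up for illustration, and one full directory scan is assumed to cost M file opens.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical cost model, not QEMU code: with a single cached file ID,
 * seeking to offset n in a directory of m entries is modelled as n full
 * scans, each opening all m entries.
 */
static uint64_t opens_for_seek(uint64_t m, uint64_t n)
{
    return m * n;
}

/* Total opens for the "read + seek + read" pattern: seek to 1, 2, ..., m. */
static uint64_t opens_for_worst_case(uint64_t m)
{
    uint64_t total = 0;

    for (uint64_t n = 1; n <= m; n++) {
        total += opens_for_seek(m, n);
    }
    return total;
}
```

With m = 1000 the second function gives the 500500000 figure quoted above, under the stated assumption that every scan opens every entry.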
Another problem: if we cache only one ID, we cannot detect which directory entry was deleted.
That is no different from using the MinGW native APIs, and we are back at the starting point.
Caching one ID is useful for getting the next entry, but not for telling us where the current offset is.
After some entries are deleted, the guest OS may re-seek to the last offset, and a single stored ID is useless for that.
Here is a summary of the requirements:
1. Guest OS may seek directory randomly.
2. Some entries may be deleted during directory reading.
To meet these requirements, a snapshot of the directory may be the only solution.
So we should focus on which information should be in the snapshot (file ID or filename), and how to store it.
I do not think a large directory is a big problem. Actually, if there are more than 1 million files in a directory, Windows File Explorer may not respond.
>
> > > >
> > > > > + }
> > > > > + file_id = (uint64_t)FileInfo.nFileIndexLow
> > > > > + +
> > > > > + (((uint64_t)FileInfo.nFileIndexHigh)
> > > > > + << 32);
> > > > > +
> > > > > +
> > > > > + file_id_list[index] = file_id;
> > > > > +
> > > > > + if (strcmp(dd_data.name, ".") == 0) {
> > > > > + stream->dot_id = file_id_list[index];
> > > > > + if (index != 0) {
> > > > > + sort_first_two_entry = 1;
> > > > > + }
> > > > > + } else if (strcmp(dd_data.name, "..") == 0) {
> > > > > + stream->dot_dot_id = file_id_list[index];
> > > > > + if (index != 1) {
> > > > > + sort_first_two_entry = 1;
> > > > > + }
> > > > > + }
> > > > > + index++;
> > > > > + }
> > > > > + }
> > > > > + CloseHandle(hDirEntry);
> > > > > + }
> > > > > + g_free(full_dir_entry);
> > > > > + find_status = _findnext(dd_handle, &dd_data);
> > > > > + } while (find_status == 0);
> > > > > +
> > > > > + if (errno == ENOENT) {
> > > > > + /* No more matching files could be found, clean errno */
> > > > > + errno = 0;
> > > > > + } else {
> > > > > + err = errno;
> > > > > + goto out;
> > > > > + }
> > > > > +
> > > > > + stream->total_entries = index;
> > > > > + stream->file_id_list = file_id_list;
> > > > > +
> > > > > + if (sort_first_two_entry == 0) {
> > > > > + /*
> > > > > > + * If the first two entry is "." and "..", then do not sort them.
> > > > > > + *
> > > > > > + * If the guest OS always considers first two entries are "." and "..",
> > > > > > + * sort the two entries may cause confused display in guest OS.
> > > > > > + */
> > > > > > + qsort(&file_id_list[2], index - 2, sizeof(file_id), file_id_compare);
> > > > > > + } else {
> > > > > > + qsort(&file_id_list[0], index, sizeof(file_id), file_id_compare);
> > > > > + }
> > > >
> > > > Were there cases where you did not get "." and ".." ?
> > >
> > > NTFS always provides "." and "..".
> > > I could add more checks here to fix this risk
> >
> > That's what I assumed. So you can probably just drop this code for
> > simplicity.
> >
> > >
> > > >
> > > > > +
> > > > > +out:
> > > > > + if (err != 0) {
> > > > > + errno = err;
> > > > > + if (stream != NULL) {
> > > > > + if (file_id_list != NULL) {
> > > > > + g_free(file_id_list);
> > > > > + }
> > > > > + CloseHandle(hDir);
> > > > > + g_free(stream);
> > > > > + stream = NULL;
> > > > > + }
> > > > > + }
> > > > > +
> > > > > + if (dd_handle != -1) {
> > > > > + _findclose(dd_handle);
> > > > > + }
> > > > > +
> > > > > + return (DIR *)stream;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * closedir_win32 - close a directory
> > > > > + *
> > > > > + * This function closes directory and free all cached resources.
> > > > > + */
> > > > > +int closedir_win32(DIR *pDir)
> > > > > +{
> > > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > > +
> > > > > + if (stream == NULL) {
> > > > > + errno = EBADF;
> > > > > + return -1;
> > > > > + }
> > > > > +
> > > > > + /* free all resources */
> > > > > + CloseHandle(stream->hDir);
> > > > > +
> > > > > + g_free(stream->file_id_list);
> > > > > +
> > > > > + g_free(stream);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * readdir_win32 - read a directory
> > > > > + *
> > > > > + * This function reads a directory entry from cached entry list.
> > > > > + */
> > > > > > +struct dirent *readdir_win32(DIR *pDir)
> > > > > > +{
> > > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > > +
> > > > > + if (stream == NULL) {
> > > > > + errno = EBADF;
> > > > > + return NULL;
> > > > > + }
> > > > > +
> > > > > +retry:
> > > > > +
> > > > > + if (stream->offset >= stream->total_entries) {
> > > > > + /* reach to the end, return NULL without set errno */
> > > > > + return NULL;
> > > > > + }
> > > > > +
> > > > > + if (get_next_entry(stream) != 0) {
> > > > > + stream->offset++;
> > > > > + goto retry;
> > > > > + }
> > > > > +
> > > > > + /* Windows does not provide inode number */
> > > > > + stream->dd_dir.d_ino = 0;
> > > > > + stream->dd_dir.d_reclen = 0;
> > > > > + stream->dd_dir.d_namlen = strlen(stream->dd_dir.d_name);
> > > > > +
> > > > > + stream->offset++;
> > > > > +
> > > > > + return &stream->dd_dir;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * rewinddir_win32 - reset directory stream
> > > > > + *
> > > > > > + * This function resets the position of the directory stream to the
> > > > > > + * beginning of the directory.
> > > > > + */
> > > > > > +void rewinddir_win32(DIR *pDir)
> > > > > > +{
> > > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > > +
> > > > > + if (stream == NULL) {
> > > > > + errno = EBADF;
> > > > > + return;
> > > > > + }
> > > > > +
> > > > > + stream->offset = 0;
> > > > > +
> > > > > + return;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > > + * seekdir_win32 - set the position of the next readdir() call in the directory
> > > > > > + *
> > > > > > + * This function sets the position of the next readdir() call in the directory
> > > > > > + * from which the next readdir() call will start.
> > > > > + */
> > > > > > +void seekdir_win32(DIR *pDir, long pos)
> > > > > > +{
> > > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > > +
> > > > > + if (stream == NULL) {
> > > > > + errno = EBADF;
> > > > > + return;
> > > > > + }
> > > > > +
> > > > > + if (pos < -1) {
> > > > > + errno = EINVAL;
> > > > > + return;
> > > > > + }
> > > > > +
> > > > > + if (pos == -1 || pos >= (long)stream->total_entries) {
> > > > > + /* seek to the end */
> > > > > + stream->offset = stream->total_entries;
> > > > > + return;
> > > > > + }
> > > > > +
> > > > > + if (pos - (long)stream->offset == 0) {
> > > > > + /* no need to seek */
> > > > > + return;
> > > > > + }
> > > > > +
> > > > > + stream->offset = pos;
> > > > > +
> > > > > + return;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * telldir_win32 - return current location in directory
> > > > > + *
> > > > > + * This function returns current location in directory.
> > > > > + */
> > > > > +long telldir_win32(DIR *pDir)
> > > > > +{
> > > > > + struct dir_win32 *stream = (struct dir_win32 *)pDir;
> > > > > +
> > > > > + if (stream == NULL) {
> > > > > + errno = EBADF;
> > > > > + return -1;
> > > > > + }
> > > > > +
> > > > > + if (stream->offset > stream->total_entries) {
> > > > > + return -1;
> > > > > + }
> > > > > +
> > > > > > + return (long)stream->offset;
> > > > > > +}
> > > > >
> > > >
> > >
> > >
> > >
> >
> >
* Re: [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
2023-03-17 4:36 ` Shi, Guohuai
@ 2023-03-17 12:16 ` Christian Schoenebeck
0 siblings, 0 replies; 32+ messages in thread
From: Christian Schoenebeck @ 2023-03-17 12:16 UTC (permalink / raw)
To: Greg Kurz, qemu-devel; +Cc: Meng, Bin, Shi, Guohuai
On Friday, March 17, 2023 5:36:37 AM CET Shi, Guohuai wrote:
[...]
> > > > > > + do {
> > > > > > + full_dir_entry = get_full_path_win32(hDir,
> > > > > > + dd_data.name);
> > > > > > +
> > > > > > + if (full_dir_entry == NULL) {
> > > > > > + err = ENOMEM;
> > > > > > + break;
> > > > > > + }
> > > > > > +
> > > > > > + /*
> > > > > > + * Open every entry and get the file informations.
> > > > > > + *
> > > > > > + * Skip symbolic links during reading directory.
> > > > > > + */
> > > > > > + hDirEntry = CreateFile(full_dir_entry,
> > > > > > + GENERIC_READ,
> > > > > > + FILE_SHARE_READ | FILE_SHARE_WRITE
> > > > > > + | FILE_SHARE_DELETE,
> > > > > > + NULL,
> > > > > > + OPEN_EXISTING,
> > > > > > + FILE_FLAG_BACKUP_SEMANTICS
> > > > > > + | FILE_FLAG_OPEN_REPARSE_POINT,
> > > > > > + NULL);
> > > > > > +
> > > > > > + if (hDirEntry != INVALID_HANDLE_VALUE) {
> > > > > > + if (GetFileInformationByHandle(hDirEntry,
> > > > > > + &FileInfo) == TRUE) {
> > > > > > + attribute = FileInfo.dwFileAttributes;
> > > > > > +
> > > > > > + /* only save validate entries */
> > > > > > + if ((attribute & FILE_ATTRIBUTE_REPARSE_POINT) == 0) {
> > > > > > + if (index >= list_count) {
> > > > > > + list_count = list_count + 16;
> > > > >
> > > > > Magic number 16 again.
> > > > >
> > > > > > + file_id_list = g_realloc(file_id_list,
> > > > > > + sizeof(uint64_t)
> > > > > > + * list_count);
> > > > >
> > > > > OK, so here we are finally at the point where you chose the
> > > > > overall behaviour for this that we discussed before.
> > > > >
> > > > > So you are constantly appending 16 entry chunks to the end of the
> > > > > array, periodically reallocate the entire array, and potentially
> > > > > end up with one giant dense array with *all* file IDs of the directory.
> > > > >
> > > > > That's not really what I had in mind, as it still has the
> > > > > potential to easily crash QEMU if there are large directories on host.
> > > > > Theoretically a Windows directory might then consume up to 16 GB
> > > > > of RAM for looking up only one single directory.
> > > > >
> > > > > So is this the implementation that you said was very slow, or did
> > > > > > you test a different one? Remember, my original idea (as starting
> > > > > point for Windows) was to only cache *one* file ID (the last being
> > > > > looked up). That's it. Not a list of file IDs.
> > > >
> > > > If only cache one file ID, that means for every read directory operation.
> > > > we need to look up whole directory to find out the next ID larger
> > > > than last
> > > cached one.
> > > >
> > > > I provided some performance test in last patch:
> > > > Run test for read directory with 100, 1000, 10000 entries #1, For
> > > > file name cache solution, the time cost is: 2, 9, 44 (in ms).
> > > > #2, For file id cache solution, the time cost: 3, 438, 4338 (in ms).
> > > > This
> > > is current solution.
> > > > #3, for cache one id solution, I just tested it: 4, 4788, more than
> > > > one minutes (in ms)
> > > >
> > > > I think it is not a good idea to cache one file id, it would be very
> > > > bad performance
> > >
> > > Yes, the performance would be lousy, but at least we would have a basis
> > > that just works^TM. Correct behaviour always comes before performance.
> > > And from there you could add additional patches on top to address
> > > performance improvements. Because the point is: your implementation is
> > > also suboptimal, and more importantly: prone to crashes like we discussed
> > before.
> > >
> > > Regarding performance: for instance you are re-allocating an entire
> > > dense buffer on every 16 new entries. That will slow down things
> > > extremely. Please use a container from glib, because these are
> > > handling resize operations more smoothly for you out of the box, i.e.
> > > typically by doubling the container capacity instead of re-allocating
> > frequently with small chunks like you did.
> > >
> > > However I am still not convinced that allocating a huge dense buffer
> > > with
> > > *all* file IDs of a directory makes sense.
> > >
> > > On the long-term it would make sense to do it like other implementations:
> > > store a snapshot of the directory temporarily on disk. That way it
> > > would not matter how huge the directory is. But that's a complex
> > > implementation, so not something that I would do in this series already.
> > >
> > > On the short/mid term I think we could simply make a mix of your
> > > solution and the one-ID solution that I suggested: keeping a maximum
> > > of e.g. 1k file IDs in RAM. And once guest seeks past that boundary,
> > > loading the subsequent 1k entries, free-ing the previous 1k entries, and so
> > on.
> > >
> >
> > Please note that the performance data is tested in native OS, but not in
> > QEMU.
> > It is even worse in QEMU.
> >
> > I run Linux guest OS on Windows host, use "ls -l" command to list a directory
> > with about 100 entries.
> > "ls -l" command need about 0.5 second to display one directory entry.
> >
> > Caching only one node (file id, or file name, or others) will make 9pfs not
> > usable: listing 100 directory entries need 50 seconds in guest OS.
I think we have a misunderstanding here. To make this more clear: I had no
intention to roll that one-entry-cache solution out to customers. The idea
rather was for this to be the base patch, followed by whatever optimization
patch(es) on top of that. So this one-entry-cache solution would basically just
end up being buried in git history, not being used by a regular user at all.
Reasons for this preliminary DOA patch:
1. An optimized solution with n file IDs (which would then in fact be rolled
out in an official QEMU release to users) is a logical extension of a simple
implementation with only 1 file ID, and it always makes sense to split patches
at logical points.
2. If some problem arises, we can always tell people to rollback to this
simple implementation and check if the problem exists there as well (no matter
how long it takes to run the test).
3. If really necessary, we could even make this 1 file ID solution a runtime
option in a distant future, which would be overkill at this point though.
> I have to point out that you are missing random access to a directory, which is the key to performance.
> In the QEMU 9p directory reading code, it tries to read as many entries as possible (in function do_readdir_many).
> When the buffer is not large enough, do_readdir_many re-seeks to the last read entry.
> The key point is this "re-seek" of the directory.
>
> Reading a directory always reads the next entry, so caching one ID would be OK there, with little performance impact.
> But seeking a directory may seek to anywhere, so seeking requires all IDs to be cached.
No, random access is not permitted anywhere! We have three aspects on this:
1. On guest user space level there is seekdir() and telldir(). But it's not
as if a user could seek randomly, like telldir() + n. In fact, many file
systems don't support this kind of operation, as often some kind of internal,
file system dependent value is passed as "offset" for performance reasons,
e.g. something like:
Filename Offset
001.dat 240
002.dat 80
003.dat 586
...
Instead, POSIX defines that the argument passed to seekdir() *must* have been
obtained by a telldir() call before, exactly for the reason described above.
2. On 9p2000.L protocol level (the default 9p protocol version used by Linux
clients): here we have `Treaddir` only, which is not a random access request.
Instead it is designed to just split large directories into several smaller
requests, passing the offset of the *previous* `Treaddir` response as
argument to the next `Treaddir` request:
https://github.com/chaos/diod/blob/master/protocol.md#readdir---read-a-directory
3. On 9p2000.u protocol level (a 9p protocol version that we already
discourage using and are probably going to deprecate) there is no such thing
as `Treaddir`; instead `Tread` on a directory FID is used. However, also in
this case the protocol specs are clear that random access is not allowed,
quote:
"For directories, read returns an integral number of directory entries
exactly as in stat (see stat(5)), one for each member of the directory. The
read request message must have offset equal to zero or the value of offset in
the previous read on the directory, plus the number of bytes returned in the
previous read. In other words, seeking other than to the beginning is illegal
in a directory (see seek(2))."
http://ericvh.github.io/9p-rfc/rfc9p2000.html#anchor30
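To illustrate point 1, here is a minimal POSIX sketch in C. It is not taken from QEMU or from the patch: `reread_from_saved_position()` is a made-up helper, and it only demonstrates that the value handed to seekdir() is one previously returned by telldir() on the same stream, never a computed index.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Read `skip` entries, remember the position with telldir(), read one more
 * entry, then seekdir() back to the remembered position and read again.
 * Returns 0 if both reads yield the same name, -1 on any error.
 * The only value ever passed to seekdir() is one returned by telldir().
 */
static int reread_from_saved_position(const char *path, int skip)
{
    char first[256];
    struct dirent *de;
    long pos;
    int ret = -1;
    DIR *dir = opendir(path);

    if (dir == NULL) {
        return -1;
    }
    for (int i = 0; i < skip; i++) {
        if (readdir(dir) == NULL) {
            goto out;
        }
    }
    pos = telldir(dir);             /* opaque cookie, not an index */
    de = readdir(dir);
    if (pos == -1 || de == NULL) {
        goto out;
    }
    snprintf(first, sizeof(first), "%s", de->d_name);

    seekdir(dir, pos);              /* valid: pos came from telldir() */
    de = readdir(dir);
    if (de != NULL && strcmp(first, de->d_name) == 0) {
        ret = 0;
    }
out:
    closedir(dir);
    return ret;
}
```

Calling it as `reread_from_saved_position("/", 1)` should return 0 on a typical POSIX host. Passing something like `pos + 5` to seekdir() instead would have no defined meaning.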
> Consider this case:
> There are 100 files in a directory, named "file001" to "file100".
>
> Currently, the next entry to read is "file050".
> Now the user wants to seek to directory offset 20 (which should be "file020").
> Because we only cached one ID ("file050"), we do not know the file ID for offset 20.
> So we can only get the file ID at offset 0 (which needs a search of the whole directory for the minimal ID), then the file ID at offset 1, and so on up to offset 20.
>
> So for random access, seeking to offset N in a directory with M entries requires searching the whole directory N times, reading M*N entries in total.
Whenever you are capturing other file IDs - no matter if only 1 or
multiple different file IDs - you would need to *always* scan the entire
directory. Otherwise you would always risk incorrect behaviour.
That's why I suggested, on top of the 1-file-id patch, a subsequent 2nd patch
as optimization that would cache max. e.g. 1000 directory entries in RAM, to
avoid scanning the entire directory too often.
> If there are 1000 files in a directory and we want to seek to offset 1000, we need to open files 1000*1000 times.
> For the worst test case, read + seek + read for 1000 files, 9p on a Windows host needs to open files 1000*(1 + 2 + 3 + ... + 1000) = 500500000 times. It may need several hours to finish.
>
> Another problem: if we cache only one ID, we cannot detect which directory entry was deleted.
We don't care about detecting whether or not entries were deleted.
> That is no different from using the MinGW native APIs, and we are back at the starting point.
Yes it is! The essential difference is: with the MinGW API, when some entry is
deleted in between, offsets are shifted such that the guest might not receive
directory entries that *still* exist!
With the ordered file ID solution discussed here (no matter how many are
cached), as we would always return the directory entries sorted by file IDs to
guest, we can in contrast ensure that really all entries that *still* exist
are always returned to guest. And that's what we care about.
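To illustrate the ordered file ID idea, here is a minimal sketch in plain C. It is not the patch code: `next_file_id()` and the in-memory `ids` array are hypothetical stand-ins for a real host directory scan, and it assumes that file IDs are unique and that 0 is never a valid ID (so 0 can mean "start of directory").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a host directory scan: `ids` holds the file IDs
 * of the entries that currently exist, in arbitrary order. Finds the
 * smallest ID strictly greater than `last`. Returns 1 and stores it in
 * *next, or 0 if no such entry exists (end of directory).
 */
static int next_file_id(const uint64_t *ids, size_t n, uint64_t last,
                        uint64_t *next)
{
    int found = 0;

    for (size_t i = 0; i < n; i++) {
        if (ids[i] > last && (!found || ids[i] < *next)) {
            *next = ids[i];
            found = 1;
        }
    }
    return found;
}
```

If the directory holds IDs {40, 10, 30, 20} and the guest has already received 10 and 20, deleting entry 10 on the host does not change the answer to "next after 20": it is still 30, then 40. A position-based offset would instead shift by one and skip an entry that still exists.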
Another thing that I noticed when looking at your patch: you are first
obtaining only the file IDs of the individual directory entries and only
caching the file IDs. Which I understand, as you were really caching the
entire directory in RAM.
It's absolutely OK to cache other directory entry info as well. And if we are
limiting caching to e.g. max. 1k entries or so, then we don't have a problem
with cached size either.
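A bounded cache along these lines could look roughly like the following sketch. Again this is only an illustration under assumptions, not a proposed implementation: `fill_window()` is a made-up name, the `ids` array stands in for one pass over the host directory, IDs are assumed unique, and only file IDs are cached here although other entry info could be kept alongside.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: in one pass over `ids` (arbitrary order), collect up
 * to `cap` of the smallest file IDs strictly greater than `last` into
 * `window`, kept in ascending order. Returns the number of IDs stored.
 */
static size_t fill_window(const uint64_t *ids, size_t n, uint64_t last,
                          uint64_t *window, size_t cap)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t id = ids[i];
        size_t pos;

        if (id <= last || cap == 0) {
            continue;
        }
        if (used == cap && id >= window[used - 1]) {
            continue;               /* not among the cap smallest so far */
        }
        if (used < cap) {
            used++;
        }
        /* Insertion step: shift larger IDs up, dropping the largest if full. */
        pos = used - 1;
        while (pos > 0 && window[pos - 1] > id) {
            window[pos] = window[pos - 1];
            pos--;
        }
        window[pos] = id;
    }
    return used;
}
```

Once the guest has consumed the window, the next window is filled by calling the function again with `last` set to the largest ID of the previous window, so memory use stays bounded by `cap` regardless of directory size.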
> Caching one ID is useful for getting the next entry, but not for telling us where the current offset is.
> After some entries are deleted, the guest OS may re-seek to the last offset, and a single stored ID is useless for that.
>
> Here is a summary of the requirements:
> 1. Guest OS may seek directory randomly.
> 2. Some entries may be deleted during directory reading.
>
> To meet these requirements, a snapshot of the directory may be the only solution.
> So we should focus on which information should be in the snapshot (file ID or filename), and how to store it.
> I do not think a large directory is a big problem. Actually, if there are more than 1 million files in a directory, Windows File Explorer may not respond.
:) That's the solution that I suggested as long-term solution several times
before, as I also pointed out that other file servers are using this solution
as well. And yes, that is "probably" the "best" solution. But I think you are
underestimating the complexity of this solution.
Of course you can easily capture all directory entries in one go, serialize
them as raw structs to a temporary file, and deserialize those structs when
being accessed. That's not the issue. But there is a lot more to this: e.g.
where would you store these temporary files? How long would you store them
there and what would be the precise mechanism to drop them? What about cleanup
mechanisms after an unclean QEMU shutdown? And would it really be faster than
say caching 1000 entries in RAM? Do we share directory snapshots, and if yes how?
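The "easy" part mentioned above, serializing raw structs to a temporary file and reading one back, might be sketched as follows. This is only an illustration: `struct snap_entry`, `snapshot_write()` and `snapshot_read()` are hypothetical names, and the standard C tmpfile() is used purely for brevity. It does not answer any of the open questions about storage location, lifetime, cleanup or sharing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size record; a real snapshot would need more fields. */
struct snap_entry {
    uint64_t file_id;
    char name[256];
};

/* Serialize all entries as raw structs into an anonymous temporary file. */
static FILE *snapshot_write(const struct snap_entry *entries, size_t count)
{
    FILE *fp = tmpfile();   /* removed automatically when closed */

    if (fp == NULL) {
        return NULL;
    }
    if (fwrite(entries, sizeof(*entries), count, fp) != count) {
        fclose(fp);
        return NULL;
    }
    return fp;
}

/* Deserialize the entry at `index`; returns 0 on success, -1 otherwise. */
static int snapshot_read(FILE *fp, long index, struct snap_entry *out)
{
    if (fseek(fp, index * (long)sizeof(*out), SEEK_SET) != 0) {
        return -1;
    }
    return fread(out, sizeof(*out), 1, fp) == 1 ? 0 : -1;
}
```

Because every record has the same size, the directory offset maps directly to a file position, so memory use no longer depends on how many entries the directory has.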
Thread overview: 32+ messages
2023-02-20 10:07 [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows Bin Meng
2023-02-20 10:08 ` [PATCH v5 01/16] hw/9pfs: Add missing definitions " Bin Meng
2023-02-20 10:08 ` [PATCH v5 02/16] hw/9pfs: Implement Windows specific utilities functions for 9pfs Bin Meng
2023-02-20 10:08 ` [PATCH v5 03/16] hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper Bin Meng
2023-03-06 9:31 ` Philippe Mathieu-Daudé
2023-03-06 9:35 ` Bin Meng
2023-02-20 10:08 ` [PATCH v5 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs Bin Meng
2023-03-14 16:05 ` Christian Schoenebeck
2023-03-15 19:05 ` Shi, Guohuai
2023-03-16 11:05 ` Christian Schoenebeck
2023-03-16 17:28 ` Shi, Guohuai
2023-03-17 4:36 ` Shi, Guohuai
2023-03-17 12:16 ` Christian Schoenebeck
2023-02-20 10:08 ` [PATCH v5 05/16] hw/9pfs: Update the local fs driver to support Windows Bin Meng
2023-02-20 10:08 ` [PATCH v5 06/16] hw/9pfs: Support getting current directory offset for Windows Bin Meng
2023-02-20 10:08 ` [PATCH v5 07/16] hw/9pfs: Update helper qemu_stat_rdev() Bin Meng
2023-02-20 10:08 ` [PATCH v5 08/16] hw/9pfs: Add a helper qemu_stat_blksize() Bin Meng
2023-02-20 10:08 ` [PATCH v5 09/16] hw/9pfs: Disable unsupported flags and features for Windows Bin Meng
2023-02-20 10:08 ` [PATCH v5 10/16] hw/9pfs: Update v9fs_set_fd_limit() " Bin Meng
2023-02-20 10:08 ` [PATCH v5 11/16] hw/9pfs: Add Linux error number definition Bin Meng
2023-02-20 10:08 ` [PATCH v5 12/16] hw/9pfs: Translate Windows errno to Linux value Bin Meng
2023-02-20 10:08 ` [PATCH v5 13/16] fsdev: Disable proxy fs driver on Windows Bin Meng
2023-03-06 9:28 ` Philippe Mathieu-Daudé
2023-02-20 10:08 ` [PATCH v5 14/16] hw/9pfs: Update synth fs driver for Windows Bin Meng
2023-02-20 10:08 ` [PATCH v5 15/16] tests/qtest: virtio-9p-test: Adapt the case for win32 Bin Meng
2023-02-20 10:08 ` [PATCH v5 16/16] meson.build: Turn on virtfs for Windows Bin Meng
2023-03-13 12:53 ` Christian Schoenebeck
2023-03-06 6:04 ` [PATCH v5 00/16] hw/9pfs: Add 9pfs support " Bin Meng
2023-03-06 14:15 ` Christian Schoenebeck
2023-03-06 14:30 ` Philippe Mathieu-Daudé
2023-03-06 14:56 ` Bin Meng
2023-03-07 12:44 ` Christian Schoenebeck