All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/9] tests/virtio-9p: add terminating null in v9fs_string_read()
  2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
@ 2019-12-18 13:06 ` Christian Schoenebeck
  2020-01-06 11:00   ` Greg Kurz
  2019-12-18 13:17 ` [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir Christian Schoenebeck
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 13:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

The 9p protocol sends strings in general without null termination
over the wire. However for future use of this functions it is
beneficial for the delivered string to be null terminated though
for being able to use the string with standard C functions which
often rely on strings being null terminated.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 tests/virtio-9p-test.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
index e7b58e3a0c..06263edb53 100644
--- a/tests/virtio-9p-test.c
+++ b/tests/virtio-9p-test.c
@@ -130,8 +130,9 @@ static void v9fs_string_read(P9Req *req, uint16_t *len, char **string)
         *len = local_len;
     }
     if (string) {
-        *string = g_malloc(local_len);
+        *string = g_malloc(local_len + 1);
         v9fs_memread(req, *string, local_len);
+        (*string)[local_len] = 0;
     } else {
         v9fs_memskip(req, local_len);
     }
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir
  2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
  2019-12-18 13:06 ` [PATCH v2 1/9] tests/virtio-9p: add terminating null in v9fs_string_read() Christian Schoenebeck
@ 2019-12-18 13:17 ` Christian Schoenebeck
  2020-01-06 12:30   ` Greg Kurz
  2019-12-18 13:23 ` [PATCH v2 3/9] hw/9pfs/9p-synth: added directory for readdir test Christian Schoenebeck
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 13:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

A good 9p client sends T_readdir with "count" parameter that's
sufficiently smaller than client's initially negotiated msize
(maximum message size). We perform a check for that though to
avoid the server to be interrupted with a "Failed to encode
VirtFS reply type 41" error message by bad clients.

Note: we should probably also check for a minimum size of
msize during T_version to avoid issues and/or too complicated
count/size checks later on in other requests of that client.
T_version should submit an msize that's at least as large as
the largest request's header size.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 520177f40c..30e33b6573 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -2414,6 +2414,7 @@ static void coroutine_fn v9fs_readdir(void *opaque)
     int32_t count;
     uint32_t max_count;
     V9fsPDU *pdu = opaque;
+    V9fsState *s = pdu->s;
 
     retval = pdu_unmarshal(pdu, offset, "dqd", &fid,
                            &initial_offset, &max_count);
@@ -2422,6 +2423,13 @@ static void coroutine_fn v9fs_readdir(void *opaque)
     }
     trace_v9fs_readdir(pdu->tag, pdu->id, fid, initial_offset, max_count);
 
+    if (max_count > s->msize - P9_IOHDRSZ) {
+        max_count = s->msize - P9_IOHDRSZ;
+        warn_report_once(
+            "9p: bad client: T_readdir with count > msize - P9_IOHDRSZ"
+        );
+    }
+
     fidp = get_fid(pdu, fid);
     if (fidp == NULL) {
         retval = -EINVAL;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 3/9] hw/9pfs/9p-synth: added directory for readdir test
  2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
  2019-12-18 13:06 ` [PATCH v2 1/9] tests/virtio-9p: add terminating null in v9fs_string_read() Christian Schoenebeck
  2019-12-18 13:17 ` [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir Christian Schoenebeck
@ 2019-12-18 13:23 ` Christian Schoenebeck
  2020-01-09 18:49   ` Greg Kurz
  2019-12-18 13:30 ` [PATCH v2 4/9] tests/virtio-9p: added " Christian Schoenebeck
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 13:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

This will provide the following virtual files by the 9pfs
synth driver:

  - /ReadDirDir/ReadDirFile99
  - /ReadDirDir/ReadDirFile98
  ...
  - /ReadDirDir/ReadDirFile1
  - /ReadDirDir/ReadDirFile0

This virtual directory and its virtual 100 files will be
used by the upcoming 9pfs readdir tests.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p-synth.c | 19 +++++++++++++++++++
 hw/9pfs/9p-synth.h |  5 +++++
 2 files changed, 24 insertions(+)

diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
index 54239c9bbf..7eb210ffa8 100644
--- a/hw/9pfs/9p-synth.c
+++ b/hw/9pfs/9p-synth.c
@@ -578,6 +578,25 @@ static int synth_init(FsContext *ctx, Error **errp)
                                        NULL, v9fs_synth_qtest_flush_write,
                                        ctx);
         assert(!ret);
+
+        /* Directory for READDIR test */
+        {
+            V9fsSynthNode *dir = NULL;
+            ret = qemu_v9fs_synth_mkdir(
+                NULL, 0700, QTEST_V9FS_SYNTH_READDIR_DIR, &dir
+            );
+            assert(!ret);
+            for (i = 0; i < QTEST_V9FS_SYNTH_READDIR_NFILES; ++i) {
+                char *name = g_strdup_printf(
+                    QTEST_V9FS_SYNTH_READDIR_FILE, i
+                );
+                ret = qemu_v9fs_synth_add_file(
+                    dir, 0, name, NULL, NULL, ctx
+                );
+                assert(!ret);
+                g_free(name);
+            }
+        }
     }
 
     return 0;
diff --git a/hw/9pfs/9p-synth.h b/hw/9pfs/9p-synth.h
index af7a993a1e..036d7e4a5b 100644
--- a/hw/9pfs/9p-synth.h
+++ b/hw/9pfs/9p-synth.h
@@ -55,6 +55,11 @@ int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int mode,
 #define QTEST_V9FS_SYNTH_LOPEN_FILE "LOPEN"
 #define QTEST_V9FS_SYNTH_WRITE_FILE "WRITE"
 
+/* for READDIR test */
+#define QTEST_V9FS_SYNTH_READDIR_DIR "ReadDirDir"
+#define QTEST_V9FS_SYNTH_READDIR_FILE "ReadDirFile%d"
+#define QTEST_V9FS_SYNTH_READDIR_NFILES 100
+
 /* Any write to the "FLUSH" file is handled one byte at a time by the
  * backend. If the byte is zero, the backend returns success (ie, 1),
  * otherwise it forces the server to try again forever. Thus allowing
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 4/9] tests/virtio-9p: added readdir test
  2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
                   ` (2 preceding siblings ...)
  2019-12-18 13:23 ` [PATCH v2 3/9] hw/9pfs/9p-synth: added directory for readdir test Christian Schoenebeck
@ 2019-12-18 13:30 ` Christian Schoenebeck
  2020-01-06 17:22   ` Greg Kurz
  2020-01-08 23:55   ` Greg Kurz
  2019-12-18 13:35 ` [PATCH v2 5/9] tests/virtio-9p: check file names of R_readdir response Christian Schoenebeck
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 13:30 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

This first readdir test simply checks the amount of directory
entries returned by 9pfs server, according to the created amount
of virtual files on 9pfs synth driver side.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 tests/virtio-9p-test.c | 124 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)

diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
index 06263edb53..48c0eca292 100644
--- a/tests/virtio-9p-test.c
+++ b/tests/virtio-9p-test.c
@@ -68,6 +68,11 @@ static void v9fs_memread(P9Req *req, void *addr, size_t len)
     req->r_off += len;
 }
 
+static void v9fs_uint8_read(P9Req *req, uint8_t *val)
+{
+    v9fs_memread(req, val, 1);
+}
+
 static void v9fs_uint16_write(P9Req *req, uint16_t val)
 {
     uint16_t le_val = cpu_to_le16(val);
@@ -101,6 +106,12 @@ static void v9fs_uint32_read(P9Req *req, uint32_t *val)
     le32_to_cpus(val);
 }
 
+static void v9fs_uint64_read(P9Req *req, uint64_t *val)
+{
+    v9fs_memread(req, val, 8);
+    le64_to_cpus(val);
+}
+
 /* len[2] string[len] */
 static uint16_t v9fs_string_size(const char *string)
 {
@@ -191,6 +202,7 @@ static const char *rmessage_name(uint8_t id)
         id == P9_RLOPEN ? "RLOPEN" :
         id == P9_RWRITE ? "RWRITE" :
         id == P9_RFLUSH ? "RFLUSH" :
+        id == P9_RREADDIR ? "READDIR" :
         "<unknown>";
 }
 
@@ -348,6 +360,79 @@ static void v9fs_rwalk(P9Req *req, uint16_t *nwqid, v9fs_qid **wqid)
     v9fs_req_free(req);
 }
 
+/* size[4] Treaddir tag[2] fid[4] offset[8] count[4] */
+static P9Req *v9fs_treaddir(QVirtio9P *v9p, uint32_t fid, uint64_t offset,
+                            uint32_t count, uint16_t tag)
+{
+    P9Req *req;
+
+    req = v9fs_req_init(v9p, 4 + 8 + 4, P9_TREADDIR, tag);
+    v9fs_uint32_write(req, fid);
+    v9fs_uint64_write(req, offset);
+    v9fs_uint32_write(req, count);
+    v9fs_req_send(req);
+    return req;
+}
+
+struct v9fs_dirent {
+    v9fs_qid qid;
+    uint64_t offset;
+    uint8_t type;
+    char *name;
+    struct v9fs_dirent *next;
+};
+
+/* size[4] Rreaddir tag[2] count[4] data[count] */
+static void v9fs_rreaddir(P9Req *req, uint32_t *count, uint32_t *nentries,
+                          struct v9fs_dirent **entries)
+{
+    uint32_t sz;
+    struct v9fs_dirent *e = NULL;
+    uint16_t slen;
+    uint32_t n = 0;
+
+    v9fs_req_recv(req, P9_RREADDIR);
+    v9fs_uint32_read(req, &sz);
+
+    if (count) {
+        *count = sz;
+    }
+
+    for (int32_t togo = (int32_t)sz;
+         togo >= 13 + 8 + 1 + 2;
+         togo -= 13 + 8 + 1 + 2 + slen, ++n)
+    {
+        if (!e) {
+            e = g_malloc(sizeof(struct v9fs_dirent));
+            if (entries)
+                *entries = e;
+        } else {
+            e = e->next = g_malloc(sizeof(struct v9fs_dirent));
+        }
+        e->next = NULL;
+        /* qid[13] offset[8] type[1] name[s] */
+        v9fs_memread(req, &e->qid, 13);
+        v9fs_uint64_read(req, &e->offset);
+        v9fs_uint8_read(req, &e->type);
+        v9fs_string_read(req, &slen, &e->name);
+    }
+
+    if (nentries) {
+        *nentries = n;
+    }
+}
+
+static void v9fs_free_dirents(struct v9fs_dirent *e)
+{
+    struct v9fs_dirent *next = NULL;
+
+    for (; e; e = next) {
+        next = e->next;
+        g_free(e->name);
+        g_free(e);
+    }
+}
+
 /* size[4] Tlopen tag[2] fid[4] flags[4] */
 static P9Req *v9fs_tlopen(QVirtio9P *v9p, uint32_t fid, uint32_t flags,
                           uint16_t tag)
@@ -480,6 +565,44 @@ static void fs_walk(void *obj, void *data, QGuestAllocator *t_alloc)
     g_free(wqid);
 }
 
+static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
+{
+    QVirtio9P *v9p = obj;
+    alloc = t_alloc;
+    char *const wnames[] = { g_strdup(QTEST_V9FS_SYNTH_READDIR_DIR) };
+    uint16_t nqid;
+    v9fs_qid qid;
+    uint32_t count, nentries;
+    struct v9fs_dirent *entries = NULL;
+    P9Req *req;
+
+    fs_attach(v9p, NULL, t_alloc);
+    req = v9fs_twalk(v9p, 0, 1, 1, wnames, 0);
+    v9fs_req_wait_for_reply(req, NULL);
+    v9fs_rwalk(req, &nqid, NULL);
+    g_assert_cmpint(nqid, ==, 1);
+
+    req = v9fs_tlopen(v9p, 1, O_DIRECTORY, 0);
+    v9fs_req_wait_for_reply(req, NULL);
+    v9fs_rlopen(req, &qid, NULL);
+
+    req = v9fs_treaddir(v9p, 1, 0, P9_MAX_SIZE - P9_IOHDRSZ, 0);
+    v9fs_req_wait_for_reply(req, NULL);
+    v9fs_rreaddir(req, &count, &nentries, &entries);
+
+    /*
+     * Assuming msize (P9_MAX_SIZE) is large enough so we can retrieve all
+     * dir entries with only one readdir request.
+     */
+    g_assert_cmpint(
+        nentries, ==,
+        QTEST_V9FS_SYNTH_READDIR_NFILES + 2 /* "." and ".." */
+    );
+
+    v9fs_free_dirents(entries);
+    g_free(wnames[0]);
+}
+
 static void fs_walk_no_slash(void *obj, void *data, QGuestAllocator *t_alloc)
 {
     QVirtio9P *v9p = obj;
@@ -658,6 +781,7 @@ static void register_virtio_9p_test(void)
                  NULL);
     qos_add_test("fs/flush/ignored", "virtio-9p", fs_flush_ignored,
                  NULL);
+    qos_add_test("fs/readdir/basic", "virtio-9p", fs_readdir, NULL);
 }
 
 libqos_init(register_virtio_9p_test);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 5/9] tests/virtio-9p: check file names of R_readdir response
  2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
                   ` (3 preceding siblings ...)
  2019-12-18 13:30 ` [PATCH v2 4/9] tests/virtio-9p: added " Christian Schoenebeck
@ 2019-12-18 13:35 ` Christian Schoenebeck
  2020-01-06 17:07   ` Greg Kurz
  2019-12-18 13:43 ` [PATCH v2 6/9] 9pfs: readdir benchmark Christian Schoenebeck
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 13:35 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

Additionally to the already existing check for expected amount
of directory entries returned by R_readdir response, also check
whether all directory entries have the expected file names (as
created on 9pfs synth driver side), ignoring their precise order
in result list though.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 tests/virtio-9p-test.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
index 48c0eca292..cb5c9fb420 100644
--- a/tests/virtio-9p-test.c
+++ b/tests/virtio-9p-test.c
@@ -565,6 +565,16 @@ static void fs_walk(void *obj, void *data, QGuestAllocator *t_alloc)
     g_free(wqid);
 }
 
+static bool fs_dirents_contain_name(struct v9fs_dirent *e, const char* name)
+{
+    for (; e; e = e->next) {
+        if (!strcmp(e->name, name)) {
+            return true;
+        }
+    }
+    return false;
+}
+
 static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
 {
     QVirtio9P *v9p = obj;
@@ -599,6 +609,18 @@ static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
         QTEST_V9FS_SYNTH_READDIR_NFILES + 2 /* "." and ".." */
     );
 
+    /*
+     * Check all file names exist in returned entries, ignore their order
+     * though.
+     */
+    g_assert_cmpint(fs_dirents_contain_name(entries, "."), ==, true);
+    g_assert_cmpint(fs_dirents_contain_name(entries, ".."), ==, true);
+    for (int i = 0; i < QTEST_V9FS_SYNTH_READDIR_NFILES; ++i) {
+        char *name = g_strdup_printf(QTEST_V9FS_SYNTH_READDIR_FILE, i);
+        g_assert_cmpint(fs_dirents_contain_name(entries, name), ==, true);
+        g_free(name);
+    }
+
     v9fs_free_dirents(entries);
     g_free(wnames[0]);
 }
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 6/9] 9pfs: readdir benchmark
  2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
                   ` (4 preceding siblings ...)
  2019-12-18 13:35 ` [PATCH v2 5/9] tests/virtio-9p: check file names of R_readdir response Christian Schoenebeck
@ 2019-12-18 13:43 ` Christian Schoenebeck
  2019-12-18 13:52 ` [PATCH v2 7/9] hw/9pfs/9p-synth: avoid n-square issue in synth_readdir() Christian Schoenebeck
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 13:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

This patch is not intended to be merged. It just provides a
temporary benchmark foundation for coneniently A/B comparison
of the subsequent 9p readdir optimization patches:

* hw/9pfs/9p-synth: increase amount of simulated files for
  readdir test to 2000 files.

* tests/virtio-9p: measure wall time that elapsed between
  sending T_readdir request and arrival of R_readdir response
  and print out that measured duration, as well as amount of
  directory entries received, and the amount of bytes of the
  response message.

* tests/virtio-9p: increased msize to 256kiB to allow
  retrieving all 2000 files (simulated by 9pfs synth driver)
  with only one T_readdir request.

Running this benchmark is fairly quick & simple and does not
require any guest OS installation or other prerequisites:

cd build
make && make tests/qos-test
export QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
tests/qos-test -p $(tests/qos-test -l | grep readdir/basic)

Since this benchmark uses the 9pfs synth driver, the host
machine's I/O hardware (SSDs/HDDs) is not relevant for the
benchmark result, because the synth backend's readdir
implementation returns immediately (without any blocking I/O
that would incur with a real-life fs driver) and just returns
already prepared, simulated directory entries directly from RAM.
So this benchmark focuses on the efficiency of the 9pfs
controller code (or top half) for readdir request handling.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p-synth.h     |  2 +-
 tests/virtio-9p-test.c | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/hw/9pfs/9p-synth.h b/hw/9pfs/9p-synth.h
index 036d7e4a5b..7d6cedcdac 100644
--- a/hw/9pfs/9p-synth.h
+++ b/hw/9pfs/9p-synth.h
@@ -58,7 +58,7 @@ int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int mode,
 /* for READDIR test */
 #define QTEST_V9FS_SYNTH_READDIR_DIR "ReadDirDir"
 #define QTEST_V9FS_SYNTH_READDIR_FILE "ReadDirFile%d"
-#define QTEST_V9FS_SYNTH_READDIR_NFILES 100
+#define QTEST_V9FS_SYNTH_READDIR_NFILES 2000
 
 /* Any write to the "FLUSH" file is handled one byte at a time by the
  * backend. If the byte is zero, the backend returns success (ie, 1),
diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
index cb5c9fb420..7feeca06ff 100644
--- a/tests/virtio-9p-test.c
+++ b/tests/virtio-9p-test.c
@@ -15,6 +15,18 @@
 #include "libqos/virtio-9p.h"
 #include "libqos/qgraph.h"
 
+/*
+ * to benchmark the real time (not CPU time) that elapsed between start of
+ * a request and arrival of its response
+ */
+static double wall_time(void)
+{
+    struct timeval t;
+    struct timezone tz;
+    gettimeofday(&t, &tz);
+    return t.tv_sec + t.tv_usec * 0.000001;
+}
+
 #define QVIRTIO_9P_TIMEOUT_US (10 * 1000 * 1000)
 static QGuestAllocator *alloc;
 
@@ -36,7 +48,7 @@ static void pci_config(void *obj, void *data, QGuestAllocator *t_alloc)
     g_free(tag);
 }
 
-#define P9_MAX_SIZE 4096 /* Max size of a T-message or R-message */
+#define P9_MAX_SIZE (256 * 1024) /* Max size of a T-message or R-message */
 
 typedef struct {
     QTestState *qts;
@@ -596,9 +608,32 @@ static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
     v9fs_req_wait_for_reply(req, NULL);
     v9fs_rlopen(req, &qid, NULL);
 
+    const double start = wall_time();
+
     req = v9fs_treaddir(v9p, 1, 0, P9_MAX_SIZE - P9_IOHDRSZ, 0);
+    const double treaddir = wall_time();
     v9fs_req_wait_for_reply(req, NULL);
+    const double waitforreply = wall_time();
     v9fs_rreaddir(req, &count, &nentries, &entries);
+    const double end = wall_time();
+
+    printf("\nTime client spent on sending T_readdir: %fs\n\n",
+           treaddir - start);
+
+    printf("Time client spent for waiting for reply from server: %fs "
+           "[MOST IMPORTANT]\n", waitforreply - start);
+    printf("(This is the most important value, because it reflects the time\n"
+           "the 9p server required to process and return the result of the\n"
+           "T_readdir request.)\n\n");
+
+    printf("Total client time: %fs\n", end - start);
+    printf("(NOTE: this time is not relevant; this huge time comes from\n"
+           "inefficient qtest_memread() calls. So you can discard this\n"
+           "value as a problem of this test client implementation while\n"
+           "processing the received server T_readdir reply.)\n\n");
+
+    printf("Details of response message data: R_readddir nentries=%d "
+           "rbytes=%d\n", nentries, count);
 
     /*
      * Assuming msize (P9_MAX_SIZE) is large enough so we can retrieve all
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 7/9] hw/9pfs/9p-synth: avoid n-square issue in synth_readdir()
  2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
                   ` (5 preceding siblings ...)
  2019-12-18 13:43 ` [PATCH v2 6/9] 9pfs: readdir benchmark Christian Schoenebeck
@ 2019-12-18 13:52 ` Christian Schoenebeck
  2019-12-18 14:00 ` [PATCH v2 8/9] 9pfs: T_readdir latency optimization Christian Schoenebeck
  2019-12-18 14:10 ` [PATCH v2 9/9] hw/9pfs/9p.c: benchmark time on T_readdir request Christian Schoenebeck
  8 siblings, 0 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 13:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

This patch is just a temporary benchmark hack, not intended
to be merged!

9pfs synth driver's readdir() implementation has a severe
n-square performance problem. This patch is a quick and dirty
hack to prevent that performance problem from tainting the
readdir() benchmark results. In its current form, this patch
is not useful for anything else than for an isolated readdir
benchmark.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p-synth.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
index 7eb210ffa8..54dc30f37b 100644
--- a/hw/9pfs/9p-synth.c
+++ b/hw/9pfs/9p-synth.c
@@ -225,7 +225,8 @@ static void synth_direntry(V9fsSynthNode *node,
 }
 
 static struct dirent *synth_get_dentry(V9fsSynthNode *dir,
-                                            struct dirent *entry, off_t off)
+                                       struct dirent *entry, off_t off,
+                                       V9fsSynthNode **hack)
 {
     int i = 0;
     V9fsSynthNode *node;
@@ -243,16 +244,38 @@ static struct dirent *synth_get_dentry(V9fsSynthNode *dir,
         /* end of directory */
         return NULL;
     }
+    *hack = node;
     synth_direntry(node, entry, off);
     return entry;
 }
 
 static struct dirent *synth_readdir(FsContext *ctx, V9fsFidOpenState *fs)
 {
-    struct dirent *entry;
+    struct dirent *entry = NULL;
     V9fsSynthOpenState *synth_open = fs->private;
     V9fsSynthNode *node = synth_open->node;
-    entry = synth_get_dentry(node, &synth_open->dent, synth_open->offset);
+
+    /*
+     * HACK: This is just intended for benchmark, to avoid severe n-square
+     * performance problem of synth driver's readdir implementation here which
+     * would otherwise unncessarily taint the benchmark results. By simply
+     * caching (globally) the previous node (of the previous synth_readdir()
+     * call) we can simply proceed to next node in chained list efficiently.
+     *
+     * not a good idea for any production code ;-)
+     */
+    static struct V9fsSynthNode *cachedNode;
+
+    if (!cachedNode) {
+        entry = synth_get_dentry(node, &synth_open->dent, synth_open->offset,
+                                 &cachedNode);
+    } else {
+        cachedNode = cachedNode->sibling.le_next;
+        if (cachedNode) {
+            entry = &synth_open->dent;
+            synth_direntry(cachedNode, entry, synth_open->offset + 1);
+        }
+    }
     if (entry) {
         synth_open->offset++;
     }
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 8/9] 9pfs: T_readdir latency optimization
  2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
                   ` (6 preceding siblings ...)
  2019-12-18 13:52 ` [PATCH v2 7/9] hw/9pfs/9p-synth: avoid n-square issue in synth_readdir() Christian Schoenebeck
@ 2019-12-18 14:00 ` Christian Schoenebeck
  2019-12-18 14:10 ` [PATCH v2 9/9] hw/9pfs/9p.c: benchmark time on T_readdir request Christian Schoenebeck
  8 siblings, 0 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

Make top half really top half and bottom half really bottom half:

Each T_readdir request handling is hopping between threads (main
I/O thread and background I/O driver threads) several times for
every individual directory entry, which sums up to huge latencies
for handling just a single T_readdir request.

Instead of doing that, collect now all required directory entries
(including all potentially required stat buffers for each entry) in
one rush on a background I/O thread from fs driver, then assemble
the entire resulting network response message for the readdir
request on main I/O thread. The fs driver is still aborting the
directory entry retrieval loop (on the background I/O thread) as
soon as it would exceed the client's requested maximum R_readdir
response size. So we should not have any performance penalty by
doing this.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p.c    | 124 +++++++++++++++-----------------
 hw/9pfs/9p.h    |  23 ++++++
 hw/9pfs/codir.c | 183 +++++++++++++++++++++++++++++++++++++++++++++---
 hw/9pfs/coth.h  |   3 +
 4 files changed, 254 insertions(+), 79 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 30e33b6573..a7d36b6350 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -971,30 +971,6 @@ static int coroutine_fn fid_to_qid(V9fsPDU *pdu, V9fsFidState *fidp,
     return 0;
 }
 
-static int coroutine_fn dirent_to_qid(V9fsPDU *pdu, V9fsFidState *fidp,
-                                      struct dirent *dent, V9fsQID *qidp)
-{
-    struct stat stbuf;
-    V9fsPath path;
-    int err;
-
-    v9fs_path_init(&path);
-
-    err = v9fs_co_name_to_path(pdu, &fidp->path, dent->d_name, &path);
-    if (err < 0) {
-        goto out;
-    }
-    err = v9fs_co_lstat(pdu, &path, &stbuf);
-    if (err < 0) {
-        goto out;
-    }
-    err = stat_to_qid(pdu, &stbuf, qidp);
-
-out:
-    v9fs_path_free(&path);
-    return err;
-}
-
 V9fsPDU *pdu_alloc(V9fsState *s)
 {
     V9fsPDU *pdu = NULL;
@@ -2302,7 +2278,7 @@ out_nofid:
     pdu_complete(pdu, err);
 }
 
-static size_t v9fs_readdir_data_size(V9fsString *name)
+size_t v9fs_readdir_response_size(V9fsString *name)
 {
     /*
      * Size of each dirent on the wire: size of qid (13) + size of offset (8)
@@ -2311,6 +2287,18 @@ static size_t v9fs_readdir_data_size(V9fsString *name)
     return 24 + v9fs_string_size(name);
 }
 
+static void v9fs_free_dirents(struct V9fsDirEnt *e)
+{
+    struct V9fsDirEnt *next = NULL;
+
+    for (; e; e = next) {
+        next = e->next;
+        g_free(e->dent);
+        g_free(e->st);
+        g_free(e);
+    }
+}
+
 static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
                                         int32_t max_count)
 {
@@ -2319,54 +2307,53 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
     V9fsString name;
     int len, err = 0;
     int32_t count = 0;
-    off_t saved_dir_pos;
     struct dirent *dent;
+    struct stat *st;
+    struct V9fsDirEnt *entries = NULL;
 
-    /* save the directory position */
-    saved_dir_pos = v9fs_co_telldir(pdu, fidp);
-    if (saved_dir_pos < 0) {
-        return saved_dir_pos;
-    }
-
-    while (1) {
-        v9fs_readdir_lock(&fidp->fs.dir);
+    /*
+     * inode remapping requires the device id, which in turn might be
+     * different for different directory entries, so if inode remapping is
+     * enabled we have to make a full stat for each directory entry
+     */
+    const bool dostat = pdu->s->ctx.export_flags & V9FS_REMAP_INODES;
 
-        err = v9fs_co_readdir(pdu, fidp, &dent);
-        if (err || !dent) {
-            break;
-        }
-        v9fs_string_init(&name);
-        v9fs_string_sprintf(&name, "%s", dent->d_name);
-        if ((count + v9fs_readdir_data_size(&name)) > max_count) {
-            v9fs_readdir_unlock(&fidp->fs.dir);
+    /*
+     * Fetch all required directory entries altogether on a background IO
+     * thread from fs driver. We don't want to do that for each entry
+     * individually, because hopping between threads (this main IO thread
+     * and background IO driver thread) would sum up to huge latencies.
+     */
+    count = v9fs_co_readdir_lowlat(pdu, fidp, &entries, max_count, dostat);
+    if (count < 0) {
+        err = count;
+        count = 0;
+        goto out;
+    }
+    count = 0;
 
-            /* Ran out of buffer. Set dir back to old position and return */
-            v9fs_co_seekdir(pdu, fidp, saved_dir_pos);
-            v9fs_string_free(&name);
-            return count;
-        }
+    for (struct V9fsDirEnt *e = entries; e; e = e->next) {
+        dent = e->dent;
 
         if (pdu->s->ctx.export_flags & V9FS_REMAP_INODES) {
-            /*
-             * dirent_to_qid() implies expensive stat call for each entry,
-             * we must do that here though since inode remapping requires
-             * the device id, which in turn might be different for
-             * different entries; we cannot make any assumption to avoid
-             * that here.
-             */
-            err = dirent_to_qid(pdu, fidp, dent, &qid);
+            st = e->st;
+            /* e->st should never be NULL, but just to be sure */
+            if (!st) {
+                err = -1;
+                break;
+            }
+
+            /* remap inode */
+            err = stat_to_qid(pdu, st, &qid);
             if (err < 0) {
-                v9fs_readdir_unlock(&fidp->fs.dir);
-                v9fs_co_seekdir(pdu, fidp, saved_dir_pos);
-                v9fs_string_free(&name);
-                return err;
+                break;
             }
         } else {
             /*
              * Fill up just the path field of qid because the client uses
              * only that. To fill the entire qid structure we will have
              * to stat each dirent found, which is expensive. For the
-             * latter reason we don't call dirent_to_qid() here. Only drawback
+             * latter reason we don't call stat_to_qid() here. Only drawback
              * is that no multi-device export detection of stat_to_qid()
              * would be done and provided as error to the user here. But
              * user would get that error anyway when accessing those
@@ -2379,25 +2366,26 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
             qid.version = 0;
         }
 
+        v9fs_string_init(&name);
+        v9fs_string_sprintf(&name, "%s", dent->d_name);
+
         /* 11 = 7 + 4 (7 = start offset, 4 = space for storing count) */
         len = pdu_marshal(pdu, 11 + count, "Qqbs",
                           &qid, dent->d_off,
                           dent->d_type, &name);
 
-        v9fs_readdir_unlock(&fidp->fs.dir);
+        v9fs_string_free(&name);
 
         if (len < 0) {
-            v9fs_co_seekdir(pdu, fidp, saved_dir_pos);
-            v9fs_string_free(&name);
-            return len;
+            err = len;
+            break;
         }
+
         count += len;
-        v9fs_string_free(&name);
-        saved_dir_pos = dent->d_off;
     }
 
-    v9fs_readdir_unlock(&fidp->fs.dir);
-
+out:
+    v9fs_free_dirents(entries);
     if (err < 0) {
         return err;
     }
diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 3904f82901..38712dd16d 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -204,6 +204,28 @@ static inline void v9fs_readdir_init(V9fsDir *dir)
     qemu_mutex_init(&dir->readdir_mutex);
 }
 
+/*
+ * Type for 9p fs drivers' (a.k.a. 9p backends) result of readdir requests,
+ * which is a chained list of directory entries.
+ */
+typedef struct V9fsDirEnt {
+    /* mandatory (must not be NULL) information for all readdir requests */
+    struct dirent *dent;
+    /*
+     * optional (may be NULL): A full stat of each directory entry is just
+     * done if explicitly told to fs driver.
+     */
+    struct stat *st;
+    /*
+     * instead of an array, directory entries are always returned as
+     * chained list, that's because the amount of entries retrieved by fs
+     * drivers is dependent on the individual entries' name (since response
+     * messages are size limited), so the final amount cannot be estimated
+     * before hand
+     */
+    struct V9fsDirEnt *next;
+} V9fsDirEnt;
+
 /*
  * Filled by fs driver on open and other
  * calls.
@@ -408,6 +430,7 @@ void v9fs_path_init(V9fsPath *path);
 void v9fs_path_free(V9fsPath *path);
 void v9fs_path_sprintf(V9fsPath *path, const char *fmt, ...);
 void v9fs_path_copy(V9fsPath *dst, const V9fsPath *src);
+size_t v9fs_readdir_response_size(V9fsString *name);
 int v9fs_name_to_path(V9fsState *s, V9fsPath *dirpath,
                       const char *name, V9fsPath *path);
 int v9fs_device_realize_common(V9fsState *s, const V9fsTransport *t,
diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c
index 73f9a751e1..bd36491510 100644
--- a/hw/9pfs/codir.c
+++ b/hw/9pfs/codir.c
@@ -18,28 +18,189 @@
 #include "qemu/main-loop.h"
 #include "coth.h"
 
+/*
+ * This is solely executed on a background IO thread.
+ */
+static int do_readdir(V9fsPDU *pdu, V9fsFidState *fidp, struct dirent **dent)
+{
+    int err = 0;
+    V9fsState *s = pdu->s;
+    struct dirent *entry;
+
+    errno = 0;
+    entry = s->ops->readdir(&s->ctx, &fidp->fs);
+    if (!entry && errno) {
+        *dent = NULL;
+        err = -errno;
+    } else {
+        *dent = entry;
+    }
+    return err;
+}
+
+/*
+ * TODO: This will be removed for performance reasons.
+ * Use v9fs_co_readdir_lowlat() instead.
+ */
 int coroutine_fn v9fs_co_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
                                  struct dirent **dent)
 {
     int err;
-    V9fsState *s = pdu->s;
 
     if (v9fs_request_cancelled(pdu)) {
         return -EINTR;
     }
-    v9fs_co_run_in_worker(
-        {
-            struct dirent *entry;
+    v9fs_co_run_in_worker({
+        err = do_readdir(pdu, fidp, dent);
+    });
+    return err;
+}
+
+/*
+ * This is solely executed on a background IO thread.
+ *
+ * See v9fs_co_readdir_lowlat() (as its only user) below for details.
+ */
+static int do_readdir_lowlat(V9fsPDU *pdu, V9fsFidState *fidp,
+                             struct V9fsDirEnt **entries,
+                             int32_t maxsize, bool dostat)
+{
+    V9fsState *s = pdu->s;
+    V9fsString name;
+    int len, err = 0;
+    int32_t size = 0;
+    off_t saved_dir_pos;
+    struct dirent *dent;
+    struct V9fsDirEnt *e = NULL;
+    V9fsPath path;
+    struct stat stbuf;
+
+    *entries = NULL;
+    v9fs_path_init(&path);
+
+    /*
+     * TODO: Here should be an error_warn_once() if lock failed.
+     *
+     * With a good 9p client we should not get into concurrency here,
+     * because a good client would not use the same fid for concurrent
+     * requests. We do the lock here for safety reasons though. However
+     * the client would then suffer performance issues, so better log that
+     * issue here.
+     */
+    v9fs_readdir_lock(&fidp->fs.dir);
+
+    /* save the directory position */
+    saved_dir_pos = s->ops->telldir(&s->ctx, &fidp->fs);
+    if (saved_dir_pos < 0) {
+        err = saved_dir_pos;
+        goto out;
+    }
+
+    while (true) {
+        /* get directory entry from fs driver */
+        err = do_readdir(pdu, fidp, &dent);
+        if (err || !dent) {
+            break;
+        }
+
+        /*
+         * stop this loop as soon as it would exceed the allowed maximum
+         * response message size for the directory entries collected so far,
+         * because anything beyond that size would need to be discarded by
+         * 9p controller (main thread / top half) anyway
+         */
+        v9fs_string_init(&name);
+        v9fs_string_sprintf(&name, "%s", dent->d_name);
+        len = v9fs_readdir_response_size(&name);
+        v9fs_string_free(&name);
+        if (size + len > maxsize) {
+            /* this is not an error case actually */
+            break;
+        }
+
+        /* append next node to result chain */
+        if (!e) {
+            *entries = e = g_malloc0(sizeof(V9fsDirEnt));
+        } else {
+            e = e->next = g_malloc0(sizeof(V9fsDirEnt));
+        }
+        e->dent = g_malloc0(sizeof(struct dirent));
+        memcpy(e->dent, dent, sizeof(struct dirent));
 
-            errno = 0;
-            entry = s->ops->readdir(&s->ctx, &fidp->fs);
-            if (!entry && errno) {
+        /* perform a full stat() for directory entry if requested by caller */
+        if (dostat) {
+            err = s->ops->name_to_path(
+                &s->ctx, &fidp->path, dent->d_name, &path
+            );
+            if (err < 0) {
                 err = -errno;
-            } else {
-                *dent = entry;
-                err = 0;
+                break;
             }
-        });
+
+            err = s->ops->lstat(&s->ctx, &path, &stbuf);
+            if (err < 0) {
+                err = -errno;
+                break;
+            }
+
+            e->st = g_malloc0(sizeof(struct stat));
+            memcpy(e->st, &stbuf, sizeof(struct stat));
+        }
+
+        size += len;
+        saved_dir_pos = dent->d_off;
+    }
+
+    /* restore (last) saved position */
+    s->ops->seekdir(&s->ctx, &fidp->fs, saved_dir_pos);
+
+out:
+    v9fs_readdir_unlock(&fidp->fs.dir);
+    v9fs_path_free(&path);
+    if (err < 0) {
+        return err;
+    }
+    return size;
+}
+
+/**
+ * @brief Low latency variant of fs driver readdir handling.
+ *
+ * Retrieves the requested (max. amount of) directory entries from the fs
+ * driver. This function must only be called by the main IO thread (top half).
+ * Internally this function call will be dispatched to a background IO thread
+ * (bottom half) where it is eventually executed by the fs driver.
+ *
+ * The old readdir implementation above just retrieves always one dir entry
+ * per call. The problem of that implementation above is that latency is
+ * added for (retrieval of) each directory entry, which in practice lead to
+ * latencies of several hundred ms for readdir of only one directory.
+ *
+ * This is avoided in this function by letting the fs driver retrieve all
+ * required directory entries with only call of this function and hence with
+ * only a single fs driver request.
+ *
+ * @param pdu - the causing 9p (T_readdir) client request
+ * @param fidp - already opened directory where readdir shall be performed on
+ * @param entries - output for directory entries (must not be NULL)
+ * @param maxsize - maximum result message body size (in bytes)
+ * @param dostat - whether a stat() should be performed and returned for
+ *                 each directory entry
+ * @returns resulting response message body size (in bytes) on success,
+ *          negative error code otherwise
+ */
+int coroutine_fn v9fs_co_readdir_lowlat(V9fsPDU *pdu, V9fsFidState *fidp,
+                                        struct V9fsDirEnt **entries,
+                                        int32_t maxsize, bool dostat)
+{
+    int err = 0;
+
+    if (v9fs_request_cancelled(pdu)) {
+        return -EINTR;
+    }
+    v9fs_co_run_in_worker({
+        err = do_readdir_lowlat(pdu, fidp, entries, maxsize, dostat);
+    });
     return err;
 }
 
diff --git a/hw/9pfs/coth.h b/hw/9pfs/coth.h
index c2cdc7a9ea..1249dbe6df 100644
--- a/hw/9pfs/coth.h
+++ b/hw/9pfs/coth.h
@@ -49,6 +49,9 @@
 void co_run_in_worker_bh(void *);
 int coroutine_fn v9fs_co_readlink(V9fsPDU *, V9fsPath *, V9fsString *);
 int coroutine_fn v9fs_co_readdir(V9fsPDU *, V9fsFidState *, struct dirent **);
+int coroutine_fn v9fs_co_readdir_lowlat(V9fsPDU *, V9fsFidState *,
+                                        struct V9fsDirEnt **,
+                                        int32_t, bool);
 off_t coroutine_fn v9fs_co_telldir(V9fsPDU *, V9fsFidState *);
 void coroutine_fn v9fs_co_seekdir(V9fsPDU *, V9fsFidState *, off_t);
 void coroutine_fn v9fs_co_rewinddir(V9fsPDU *, V9fsFidState *);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 9/9] hw/9pfs/9p.c: benchmark time on T_readdir request
  2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
                   ` (7 preceding siblings ...)
  2019-12-18 14:00 ` [PATCH v2 8/9] 9pfs: T_readdir latency optimization Christian Schoenebeck
@ 2019-12-18 14:10 ` Christian Schoenebeck
  8 siblings, 0 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 14:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

This patch is not intended to be merged, it measures
and prints the time the 9p server spends on handling
a T_readdir request. It prints the total time it spent
on handling the request, and also the time it spent
on I/O (fs driver) only.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index a7d36b6350..c6f70b96ee 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -2299,6 +2299,15 @@ static void v9fs_free_dirents(struct V9fsDirEnt *e)
     }
 }
 
+static double wall_time(void)
+{
+    struct timeval t;
+    struct timezone tz;
+    gettimeofday(&t, &tz);
+    return t.tv_sec + t.tv_usec * 0.000001;
+}
+
+
 static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
                                         int32_t max_count)
 {
@@ -2318,6 +2327,8 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
      */
     const bool dostat = pdu->s->ctx.export_flags & V9FS_REMAP_INODES;
 
+    const double start = wall_time();
+
     /*
      * Fetch all required directory entries altogether on a background IO
      * thread from fs driver. We don't want to do that for each entry
@@ -2332,6 +2343,10 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
     }
     count = 0;
 
+    const double end = wall_time();
+    printf("\n\nTime 9p server spent on synth_readdir() I/O only (synth "
+           "driver): %fs\n", end - start);
+
     for (struct V9fsDirEnt *e = entries; e; e = e->next) {
         dent = e->dent;
 
@@ -2404,6 +2419,8 @@ static void coroutine_fn v9fs_readdir(void *opaque)
     V9fsPDU *pdu = opaque;
     V9fsState *s = pdu->s;
 
+    const double start = wall_time();
+
     retval = pdu_unmarshal(pdu, offset, "dqd", &fid,
                            &initial_offset, &max_count);
     if (retval < 0) {
@@ -2447,6 +2464,10 @@ out:
     put_fid(pdu, fidp);
 out_nofid:
     pdu_complete(pdu, retval);
+
+    const double end = wall_time();
+    printf("Time 9p server spent on entire T_readdir request: %fs "
+           "[IMPORTANT]\n", end - start);
 }
 
 static int v9fs_xattr_write(V9fsState *s, V9fsPDU *pdu, V9fsFidState *fidp,
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 0/9] 9pfs: readdir optimization
@ 2019-12-18 14:17 Christian Schoenebeck
  2019-12-18 13:06 ` [PATCH v2 1/9] tests/virtio-9p: add terminating null in v9fs_string_read() Christian Schoenebeck
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2019-12-18 14:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

As previously mentioned, I was investigating performance issues with 9pfs.
Raw file read/write of 9pfs is actually quite good, provided that client
picked a reasonable high msize (maximum message size). I would recommend
to log a warning on 9p server side if a client attached with a small msize
that would cause performance issues for that reason.

However there other aspects where 9pfs currently performs suboptimally,
especially readdir handling of 9pfs is extremely slow, a simple readdir
request of a guest typically blocks for several hundred milliseconds or
even several seconds, no matter how powerful the underlying hardware is.
The reason for this performance issue: latency.
Currently 9pfs is heavily dispatching a T_readdir request numerous times
between main I/O thread and a background I/O thread back and forth; in fact
it is actually hopping between threads even multiple times for every single
directory entry during T_readdir request handling which leads in total to
huge latencies for a single T_readdir request.

This patch series aims to address this severe performance issue of 9pfs
T_readdir request handling. The actual performance fix is patch 8. I also
provided a convenient benchmark for comparing the performance improvements
by using the 9pfs "synth" driver (see patch 6 for instructions how to run
the benchmark), so no guest OS installation is required to peform this
benchmark A/B comparison. With patch 8 I achieved a performance improvement
of factor 40 on my test machine.

** NOTE: ** These patches are not heavily tested yet, nor thouroughly
reviewed for potential security issues yet. I decided to post them already
though, because I won't have the time in the next few weeks for polishing
them. The benchmark results should demonstrate though that it is worth the
hassle. So any testing/reviews/fixes appreciated!

v1->v2:

  * Fixed missing email threading of this patch set.

  * Fixed code style issues.

  * No need to check for NULL when calling g_free() [patch 4] and
    [patch 8].

  * No need to initialize static variable with NULL [patch 7].

  * Adjustments to commit log messages.

Christian Schoenebeck (9):
  tests/virtio-9p: add terminating null in v9fs_string_read()
  9pfs: validate count sent by client with T_readdir
  hw/9pfs/9p-synth: added directory for readdir test
  tests/virtio-9p: added readdir test
  tests/virtio-9p: check file names of R_readdir response
  9pfs: readdir benchmark
  hw/9pfs/9p-synth: avoid n-square issue in synth_readdir()
  9pfs: T_readdir latency optimization
  hw/9pfs/9p.c: benchmark time on T_readdir request

 hw/9pfs/9p-synth.c     |  48 ++++++++++-
 hw/9pfs/9p-synth.h     |   5 ++
 hw/9pfs/9p.c           | 151 ++++++++++++++++++---------------
 hw/9pfs/9p.h           |  23 +++++
 hw/9pfs/codir.c        | 183 +++++++++++++++++++++++++++++++++++++---
 hw/9pfs/coth.h         |   3 +
 tests/virtio-9p-test.c | 186 ++++++++++++++++++++++++++++++++++++++++-
 7 files changed, 516 insertions(+), 83 deletions(-)

-- 
2.20.1



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/9] tests/virtio-9p: add terminating null in v9fs_string_read()
  2019-12-18 13:06 ` [PATCH v2 1/9] tests/virtio-9p: add terminating null in v9fs_string_read() Christian Schoenebeck
@ 2020-01-06 11:00   ` Greg Kurz
  0 siblings, 0 replies; 26+ messages in thread
From: Greg Kurz @ 2020-01-06 11:00 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Wed, 18 Dec 2019 14:06:30 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> The 9p protocol sends strings in general without null termination
> over the wire. However for future use of this functions it is
> beneficial for the delivered string to be null terminated though
> for being able to use the string with standard C functions which
> often rely on strings being null terminated.
> 
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---

LGTM

Reviewed-by: Greg Kurz <groug@kaod.org>

>  tests/virtio-9p-test.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
> index e7b58e3a0c..06263edb53 100644
> --- a/tests/virtio-9p-test.c
> +++ b/tests/virtio-9p-test.c
> @@ -130,8 +130,9 @@ static void v9fs_string_read(P9Req *req, uint16_t *len, char **string)
>          *len = local_len;
>      }
>      if (string) {
> -        *string = g_malloc(local_len);
> +        *string = g_malloc(local_len + 1);
>          v9fs_memread(req, *string, local_len);
> +        (*string)[local_len] = 0;
>      } else {
>          v9fs_memskip(req, local_len);
>      }



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir
  2019-12-18 13:17 ` [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir Christian Schoenebeck
@ 2020-01-06 12:30   ` Greg Kurz
  2020-01-06 15:10     ` Christian Schoenebeck
  0 siblings, 1 reply; 26+ messages in thread
From: Greg Kurz @ 2020-01-06 12:30 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Wed, 18 Dec 2019 14:17:59 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> A good 9p client sends T_readdir with "count" parameter that's
> sufficiently smaller than client's initially negotiated msize
> (maximum message size). We perform a check for that though to
> avoid the server to be interrupted with a "Failed to encode
> VirtFS reply type 41" error message by bad clients.
> 

Hmm... doesn't v9fs_do_readdir() already take care of that ?

> Note: we should probably also check for a minimum size of
> msize during T_version to avoid issues and/or too complicated
> count/size checks later on in other requests of that client.
> T_version should submit an msize that's at least as large as
> the largest request's header size.

Do you mean that the server should expose such an msize in the
R_version response ? The 9p spec only says that the server must
return an msize <= to the one proposed by the client [1]. Not
sure we can do more than to emit a warning and/or interrupt the
server if the client sends a silly size.

[1] https://9fans.github.io/plan9port/man/man9/version.html

> 
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---
>  hw/9pfs/9p.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 520177f40c..30e33b6573 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -2414,6 +2414,7 @@ static void coroutine_fn v9fs_readdir(void *opaque)
>      int32_t count;
>      uint32_t max_count;
>      V9fsPDU *pdu = opaque;
> +    V9fsState *s = pdu->s;
>  
>      retval = pdu_unmarshal(pdu, offset, "dqd", &fid,
>                             &initial_offset, &max_count);
> @@ -2422,6 +2423,13 @@ static void coroutine_fn v9fs_readdir(void *opaque)
>      }
>      trace_v9fs_readdir(pdu->tag, pdu->id, fid, initial_offset, max_count);
>  
> +    if (max_count > s->msize - P9_IOHDRSZ) {
> +        max_count = s->msize - P9_IOHDRSZ;

What if s->msize < P9_IOHDRSZ ?

> +        warn_report_once(
> +            "9p: bad client: T_readdir with count > msize - P9_IOHDRSZ"
> +        );
> +    }
> +
>      fidp = get_fid(pdu, fid);
>      if (fidp == NULL) {
>          retval = -EINVAL;



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir
  2020-01-06 12:30   ` Greg Kurz
@ 2020-01-06 15:10     ` Christian Schoenebeck
  2020-01-06 17:49       ` Greg Kurz
  2020-01-08 23:53       ` Greg Kurz
  0 siblings, 2 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2020-01-06 15:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Greg Kurz, Christian Schoenebeck, Stefano Stabellini,
	anthony.perard, Stefano Stabellini

On Montag, 6. Januar 2020 13:30:24 CET Greg Kurz wrote:
> On Wed, 18 Dec 2019 14:17:59 +0100
> 
> Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > A good 9p client sends T_readdir with "count" parameter that's
> > sufficiently smaller than client's initially negotiated msize
> > (maximum message size). We perform a check for that though to
> > avoid the server to be interrupted with a "Failed to encode
> > VirtFS reply type 41" error message by bad clients.
> 
> Hmm... doesn't v9fs_do_readdir() already take care of that ?

No, unfortunately it doesn't. It just looks at the "count" parameter 
transmitted with client's T_readdir request, but not at session's msize.
So a bad client could send a T_readdir request with a "count" parameter that's 
substantially higher than session's msize, which would lead to that mentioned 
transport error ATM. Hence I suggested this patch here to address it.

You can easily reproduce this issue:

1. Omit this patch 2 (since it would fix it).

2. Apply patch 3 and patch 4 of this patch set, which assemble a T_readdir
   test case combined, then stop patches here (i.e. don't apply subsequent 
   patches of this patch set, since e.g. patch 6 would increase test client's 
   msize).

3. Set QTEST_V9FS_SYNTH_READDIR_NFILES in hw/9pfs/9p-synth.h to a high number
   (e.g. several thousands).

4. Run the T_readdir test case:

   cd build
   make && make tests/qos-test
   export QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
   tests/qos-test -p $(tests/qos-test -l | grep readdir/basic)

Result: Since the test client's msize is quite small at this point (4kB), test 
client would transmit a very large "count" parameter with T_readdir request to 
retrieve all QTEST_V9FS_SYNTH_READDIR_NFILES files, and you'll end up getting 
the quoted transport error message on server (see precise error message 
above).

> > Note: we should probably also check for a minimum size of
> > msize during T_version to avoid issues and/or too complicated
> > count/size checks later on in other requests of that client.
> > T_version should submit an msize that's at least as large as
> > the largest request's header size.
> 
> Do you mean that the server should expose such an msize in the
> R_version response ? The 9p spec only says that the server must
> return an msize <= to the one proposed by the client [1]. Not
> sure we can do more than to emit a warning and/or interrupt the
> server if the client sends a silly size.
> 
> [1] https://9fans.github.io/plan9port/man/man9/version.html

My idea was to "handle it as an error" immediately when server receives a 
T_version request with a "msize" argument transmitted by client that would be 
way too small for anything.

Because if client sends T_version with msize < P9_IOHDRSZ then it is obvious 
that this msize would be way too small for handling any subsequent 9p request 
at all.

> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > ---
> > 
> >  hw/9pfs/9p.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > index 520177f40c..30e33b6573 100644
> > --- a/hw/9pfs/9p.c
> > +++ b/hw/9pfs/9p.c
> > @@ -2414,6 +2414,7 @@ static void coroutine_fn v9fs_readdir(void *opaque)
> > 
> >      int32_t count;
> >      uint32_t max_count;
> >      V9fsPDU *pdu = opaque;
> > 
> > +    V9fsState *s = pdu->s;
> > 
> >      retval = pdu_unmarshal(pdu, offset, "dqd", &fid,
> >      
> >                             &initial_offset, &max_count);
> > 
> > @@ -2422,6 +2423,13 @@ static void coroutine_fn v9fs_readdir(void *opaque)
> > 
> >      }
> >      trace_v9fs_readdir(pdu->tag, pdu->id, fid, initial_offset,
> >      max_count);
> > 
> > +    if (max_count > s->msize - P9_IOHDRSZ) {
> > +        max_count = s->msize - P9_IOHDRSZ;
> 
> What if s->msize < P9_IOHDRSZ ?

Exactly, that's why I added that comment to the commit log of this patch for 
now to make this circumstance clear as yet TODO.

This issue with T_version and msize needs to be addressed anyway, because it 
will popup again and again with various other request types and also with 
transport aspects like previously discussed on a transport buffer size issue 
(submitters of that transport patch on CC here for that reason):
https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg04701.html

The required patch to address this overall minimum msize issue would be very 
small: just comparing client's transmitted "msize" parameter of client's 
T_version request and if that transmitted msize is smaller than a certain 
absolute minimum msize then raise an error immediately to prevent the session 
to start.

But there are some decisions to be made, that's why I haven't provided a patch 
for this min. msize issue in this patch series yet:

1. What is the minimum msize to be allowed with T_version?

   P9_IOHDRSZ would be IMO too small, because P9_IOHDRSZ is the size of the 
   common header portion of all IO requests. So it would rather make sense IMO
   reviewing the protocol and pick the largest header size among all possible 
   requests supported by 9pfs server ATM. The advantage of this approach would 
   be that overall code would be easier too maintain, since we don't have to 
   add minimum msize checks in any (or even all) individual request type 
   handlers. T_version handler of server would already enforce a minimum msize 
   and prevent the session if msize too small, and that's it.

2. How to handle that error with T_version exactly?

   As far as I can see it, it was originally not considered that T_version 
   might react with an error response at all. The specs are ambiguous about 
   this topic. All you can find is that if the transmitted protocol version
   is not supported by server, then server should respond with "unknown" with 
   its R_version response, but should not respond with R_error in that case.

   The specs do not prohibit R_error for T_version in general, but I could 
   imagine that some clients might not expect if we would send R_error. On the 
   other hand responding with R_version "unknown" would not be appropriate for 
   this msize error either, because that would mean to the client that the 
   protocol version was not supported, which is not the case. So it would 
   leave client uninformed about the actual reason/cause of the error.

3. Are there any undocumented special msizes, e.g. msize=0 for "don't care" ?

   Probably not, but you might have seen more client implementations than me.

Best regards,
Christian Schoenebeck




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 5/9] tests/virtio-9p: check file names of R_readdir response
  2019-12-18 13:35 ` [PATCH v2 5/9] tests/virtio-9p: check file names of R_readdir response Christian Schoenebeck
@ 2020-01-06 17:07   ` Greg Kurz
  2020-01-07 12:28     ` Christian Schoenebeck
  0 siblings, 1 reply; 26+ messages in thread
From: Greg Kurz @ 2020-01-06 17:07 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Wed, 18 Dec 2019 14:35:56 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> Additionally to the already existing check for expected amount
> of directory entries returned by R_readdir response, also check
> whether all directory entries have the expected file names (as
> created on 9pfs synth driver side), ignoring their precise order
> in result list though.
> 
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---

LGTM and trivial enough that it can be folded in the previous
patch.

>  tests/virtio-9p-test.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
> index 48c0eca292..cb5c9fb420 100644
> --- a/tests/virtio-9p-test.c
> +++ b/tests/virtio-9p-test.c
> @@ -565,6 +565,16 @@ static void fs_walk(void *obj, void *data, QGuestAllocator *t_alloc)
>      g_free(wqid);
>  }
>  
> +static bool fs_dirents_contain_name(struct v9fs_dirent *e, const char* name)
> +{
> +    for (; e; e = e->next) {
> +        if (!strcmp(e->name, name)) {
> +            return true;
> +        }
> +    }
> +    return false;
> +}
> +
>  static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
>  {
>      QVirtio9P *v9p = obj;
> @@ -599,6 +609,18 @@ static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
>          QTEST_V9FS_SYNTH_READDIR_NFILES + 2 /* "." and ".." */
>      );
>  
> +    /*
> +     * Check all file names exist in returned entries, ignore their order
> +     * though.
> +     */
> +    g_assert_cmpint(fs_dirents_contain_name(entries, "."), ==, true);
> +    g_assert_cmpint(fs_dirents_contain_name(entries, ".."), ==, true);
> +    for (int i = 0; i < QTEST_V9FS_SYNTH_READDIR_NFILES; ++i) {
> +        char *name = g_strdup_printf(QTEST_V9FS_SYNTH_READDIR_FILE, i);
> +        g_assert_cmpint(fs_dirents_contain_name(entries, name), ==, true);
> +        g_free(name);
> +    }
> +
>      v9fs_free_dirents(entries);
>      g_free(wnames[0]);
>  }



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/9] tests/virtio-9p: added readdir test
  2019-12-18 13:30 ` [PATCH v2 4/9] tests/virtio-9p: added " Christian Schoenebeck
@ 2020-01-06 17:22   ` Greg Kurz
  2020-01-07 12:25     ` Christian Schoenebeck
  2020-01-08 23:55   ` Greg Kurz
  1 sibling, 1 reply; 26+ messages in thread
From: Greg Kurz @ 2020-01-06 17:22 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Wed, 18 Dec 2019 14:30:43 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> This first readdir test simply checks the amount of directory
> entries returned by 9pfs server, according to the created amount
> of virtual files on 9pfs synth driver side.
> 
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---
>  tests/virtio-9p-test.c | 124 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 124 insertions(+)
> 
> diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
> index 06263edb53..48c0eca292 100644
> --- a/tests/virtio-9p-test.c
> +++ b/tests/virtio-9p-test.c
> @@ -68,6 +68,11 @@ static void v9fs_memread(P9Req *req, void *addr, size_t len)
>      req->r_off += len;
>  }
>  
> +static void v9fs_uint8_read(P9Req *req, uint8_t *val)
> +{
> +    v9fs_memread(req, val, 1);
> +}
> +
>  static void v9fs_uint16_write(P9Req *req, uint16_t val)
>  {
>      uint16_t le_val = cpu_to_le16(val);
> @@ -101,6 +106,12 @@ static void v9fs_uint32_read(P9Req *req, uint32_t *val)
>      le32_to_cpus(val);
>  }
>  
> +static void v9fs_uint64_read(P9Req *req, uint64_t *val)
> +{
> +    v9fs_memread(req, val, 8);
> +    le64_to_cpus(val);
> +}
> +
>  /* len[2] string[len] */
>  static uint16_t v9fs_string_size(const char *string)
>  {
> @@ -191,6 +202,7 @@ static const char *rmessage_name(uint8_t id)
>          id == P9_RLOPEN ? "RLOPEN" :
>          id == P9_RWRITE ? "RWRITE" :
>          id == P9_RFLUSH ? "RFLUSH" :
> +        id == P9_RREADDIR ? "READDIR" :
>          "<unknown>";
>  }
>  
> @@ -348,6 +360,79 @@ static void v9fs_rwalk(P9Req *req, uint16_t *nwqid, v9fs_qid **wqid)
>      v9fs_req_free(req);
>  }
>  
> +/* size[4] Treaddir tag[2] fid[4] offset[8] count[4] */
> +static P9Req *v9fs_treaddir(QVirtio9P *v9p, uint32_t fid, uint64_t offset,
> +                            uint32_t count, uint16_t tag)
> +{
> +    P9Req *req;
> +
> +    req = v9fs_req_init(v9p, 4 + 8 + 4, P9_TREADDIR, tag);
> +    v9fs_uint32_write(req, fid);
> +    v9fs_uint64_write(req, offset);
> +    v9fs_uint32_write(req, count);
> +    v9fs_req_send(req);
> +    return req;
> +}
> +
> +struct v9fs_dirent {

The QEMU coding style calls for a CamelCase typedef,

ie.

typedef struct V9fsDirent V9fsDirent;

> +    v9fs_qid qid;

Yeah... I should have done the same when I introduced this type ;-)

> +    uint64_t offset;
> +    uint8_t type;
> +    char *name;
> +    struct v9fs_dirent *next;
> +};
> +
> +/* size[4] Rreaddir tag[2] count[4] data[count] */
> +static void v9fs_rreaddir(P9Req *req, uint32_t *count, uint32_t *nentries,
> +                          struct v9fs_dirent **entries)
> +{
> +    uint32_t sz;
> +    struct v9fs_dirent *e = NULL;
> +    uint16_t slen;
> +    uint32_t n = 0;
> +
> +    v9fs_req_recv(req, P9_RREADDIR);
> +    v9fs_uint32_read(req, &sz);
> +
> +    if (count) {
> +        *count = sz;
> +    }
> +
> +    for (int32_t togo = (int32_t)sz;
> +         togo >= 13 + 8 + 1 + 2;
> +         togo -= 13 + 8 + 1 + 2 + slen, ++n)
> +    {
> +        if (!e) {
> +            e = g_malloc(sizeof(struct v9fs_dirent));
> +            if (entries)
> +                *entries = e;
> +        } else {
> +            e = e->next = g_malloc(sizeof(struct v9fs_dirent));
> +        }
> +        e->next = NULL;
> +        /* qid[13] offset[8] type[1] name[s] */
> +        v9fs_memread(req, &e->qid, 13);
> +        v9fs_uint64_read(req, &e->offset);
> +        v9fs_uint8_read(req, &e->type);
> +        v9fs_string_read(req, &slen, &e->name);
> +    }
> +
> +    if (nentries) {
> +        *nentries = n;
> +    }
> +}
> +
> +static void v9fs_free_dirents(struct v9fs_dirent *e)
> +{
> +    struct v9fs_dirent *next = NULL;
> +
> +    for (; e; e = next) {
> +        next = e->next;
> +        g_free(e->name);
> +        g_free(e);
> +    }
> +}
> +
>  /* size[4] Tlopen tag[2] fid[4] flags[4] */
>  static P9Req *v9fs_tlopen(QVirtio9P *v9p, uint32_t fid, uint32_t flags,
>                            uint16_t tag)
> @@ -480,6 +565,44 @@ static void fs_walk(void *obj, void *data, QGuestAllocator *t_alloc)
>      g_free(wqid);
>  }
>  
> +static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
> +{
> +    QVirtio9P *v9p = obj;
> +    alloc = t_alloc;
> +    char *const wnames[] = { g_strdup(QTEST_V9FS_SYNTH_READDIR_DIR) };
> +    uint16_t nqid;
> +    v9fs_qid qid;
> +    uint32_t count, nentries;
> +    struct v9fs_dirent *entries = NULL;
> +    P9Req *req;
> +
> +    fs_attach(v9p, NULL, t_alloc);
> +    req = v9fs_twalk(v9p, 0, 1, 1, wnames, 0);
> +    v9fs_req_wait_for_reply(req, NULL);
> +    v9fs_rwalk(req, &nqid, NULL);
> +    g_assert_cmpint(nqid, ==, 1);
> +
> +    req = v9fs_tlopen(v9p, 1, O_DIRECTORY, 0);
> +    v9fs_req_wait_for_reply(req, NULL);
> +    v9fs_rlopen(req, &qid, NULL);
> +
> +    req = v9fs_treaddir(v9p, 1, 0, P9_MAX_SIZE - P9_IOHDRSZ, 0);
> +    v9fs_req_wait_for_reply(req, NULL);
> +    v9fs_rreaddir(req, &count, &nentries, &entries);
> +
> +    /*
> +     * Assuming msize (P9_MAX_SIZE) is large enough so we can retrieve all
> +     * dir entries with only one readdir request.
> +     */
> +    g_assert_cmpint(
> +        nentries, ==,
> +        QTEST_V9FS_SYNTH_READDIR_NFILES + 2 /* "." and ".." */
> +    );

What about coming up with a version of this test that loops until
it could read all the entries instead of this assumption ?

> +
> +    v9fs_free_dirents(entries);
> +    g_free(wnames[0]);
> +}
> +
>  static void fs_walk_no_slash(void *obj, void *data, QGuestAllocator *t_alloc)
>  {
>      QVirtio9P *v9p = obj;
> @@ -658,6 +781,7 @@ static void register_virtio_9p_test(void)
>                   NULL);
>      qos_add_test("fs/flush/ignored", "virtio-9p", fs_flush_ignored,
>                   NULL);
> +    qos_add_test("fs/readdir/basic", "virtio-9p", fs_readdir, NULL);
>  }
>  
>  libqos_init(register_virtio_9p_test);



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir
  2020-01-06 15:10     ` Christian Schoenebeck
@ 2020-01-06 17:49       ` Greg Kurz
  2020-01-06 21:43         ` Christian Schoenebeck
  2020-01-08 23:53       ` Greg Kurz
  1 sibling, 1 reply; 26+ messages in thread
From: Greg Kurz @ 2020-01-06 17:49 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: anthony.perard, Stefano Stabellini, qemu-devel, Stefano Stabellini

On Mon, 06 Jan 2020 16:10:28 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Montag, 6. Januar 2020 13:30:24 CET Greg Kurz wrote:
> > On Wed, 18 Dec 2019 14:17:59 +0100
> > 
> > Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > A good 9p client sends T_readdir with "count" parameter that's
> > > sufficiently smaller than client's initially negotiated msize
> > > (maximum message size). We perform a check for that though to
> > > avoid the server to be interrupted with a "Failed to encode
> > > VirtFS reply type 41" error message by bad clients.
> > 
> > Hmm... doesn't v9fs_do_readdir() already take care of that ?
> 
> No, unfortunately it doesn't. It just looks at the "count" parameter 
> transmitted with client's T_readdir request, but not at session's msize.

Indeed.

> So a bad client could send a T_readdir request with a "count" parameter that's 
> substantially higher than session's msize, which would lead to that mentioned 
> transport error ATM. Hence I suggested this patch here to address it.
> 
> You can easily reproduce this issue:
> 
> 1. Omit this patch 2 (since it would fix it).
> 
> 2. Apply patch 3 and patch 4 of this patch set, which assemble a T_readdir
>    test case combined, then stop patches here (i.e. don't apply subsequent 
>    patches of this patch set, since e.g. patch 6 would increase test client's 
>    msize).
> 
> 3. Set QTEST_V9FS_SYNTH_READDIR_NFILES in hw/9pfs/9p-synth.h to a high number
>    (e.g. several thousands).
> 
> 4. Run the T_readdir test case:
> 
>    cd build
>    make && make tests/qos-test
>    export QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
>    tests/qos-test -p $(tests/qos-test -l | grep readdir/basic)
> 
> Result: Since the test client's msize is quite small at this point (4kB), test 
> client would transmit a very large "count" parameter with T_readdir request to 
> retrieve all QTEST_V9FS_SYNTH_READDIR_NFILES files, and you'll end up getting 
> the quoted transport error message on server (see precise error message 
> above).
> 

I don't observe this behavior with:

-#define QTEST_V9FS_SYNTH_READDIR_NFILES 100
+#define QTEST_V9FS_SYNTH_READDIR_NFILES 20000

/x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/virtio-9p-pci/virtio-9p/virtio-9p-tests/fs/readdir/basic: **
ERROR:/home/greg/Work/qemu/qemu-9p/tests/virtio-9p-test.c:597:fs_readdir: assertion failed (nentries == QTEST_V9FS_SYNTH_READDIR_NFILES + 2): (101 == 20002)

The server didn't hit the transport error...

> > > Note: we should probably also check for a minimum size of
> > > msize during T_version to avoid issues and/or too complicated
> > > count/size checks later on in other requests of that client.
> > > T_version should submit an msize that's at least as large as
> > > the largest request's header size.
> > 
> > Do you mean that the server should expose such an msize in the
> > R_version response ? The 9p spec only says that the server must
> > return an msize <= to the one proposed by the client [1]. Not
> > sure we can do more than to emit a warning and/or interrupt the
> > server if the client sends a silly size.
> > 
> > [1] https://9fans.github.io/plan9port/man/man9/version.html
> 
> My idea was to "handle it as an error" immediately when server receives a 
> T_version request with a "msize" argument transmitted by client that would be 
> way too small for anything.
> 
> Because if client sends T_version with msize < P9_IOHDRSZ then it is obvious 
> that this msize would be way too small for handling any subsequent 9p request 
> at all.
> 
> > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > ---
> > > 
> > >  hw/9pfs/9p.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > > index 520177f40c..30e33b6573 100644
> > > --- a/hw/9pfs/9p.c
> > > +++ b/hw/9pfs/9p.c
> > > @@ -2414,6 +2414,7 @@ static void coroutine_fn v9fs_readdir(void *opaque)
> > > 
> > >      int32_t count;
> > >      uint32_t max_count;
> > >      V9fsPDU *pdu = opaque;
> > > 
> > > +    V9fsState *s = pdu->s;
> > > 
> > >      retval = pdu_unmarshal(pdu, offset, "dqd", &fid,
> > >      
> > >                             &initial_offset, &max_count);
> > > 
> > > @@ -2422,6 +2423,13 @@ static void coroutine_fn v9fs_readdir(void *opaque)
> > > 
> > >      }
> > >      trace_v9fs_readdir(pdu->tag, pdu->id, fid, initial_offset,
> > >      max_count);
> > > 
> > > +    if (max_count > s->msize - P9_IOHDRSZ) {
> > > +        max_count = s->msize - P9_IOHDRSZ;
> > 
> > What if s->msize < P9_IOHDRSZ ?
> 
> Exactly, that's why I added that comment to the commit log of this patch for 
> now to make this circumstance clear as yet TODO.
> 
> This issue with T_version and msize needs to be addressed anyway, because it 
> will popup again and again with various other request types and also with 
> transport aspects like previously discussed on a transport buffer size issue 
> (submitters of that transport patch on CC here for that reason):
> https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg04701.html
> 
> The required patch to address this overall minimum msize issue would be very 
> small: just comparing client's transmitted "msize" parameter of client's 
> T_version request and if that transmitted msize is smaller than a certain 
> absolute minimum msize then raise an error immediately to prevent the session 
> to start.
> 
> But there are some decisions to be made, that's why I haven't provided a patch 
> for this min. msize issue in this patch series yet:
> 
> 1. What is the minimum msize to be allowed with T_version?
> 
>    P9_IOHDRSZ would be IMO too small, because P9_IOHDRSZ is the size of the 
>    common header portion of all IO requests. So it would rather make sense IMO
>    reviewing the protocol and pick the largest header size among all possible 
>    requests supported by 9pfs server ATM. The advantage of this approach would 
>    be that overall code would be easier too maintain, since we don't have to 
>    add minimum msize checks in any (or even all) individual request type 
>    handlers. T_version handler of server would already enforce a minimum msize 
>    and prevent the session if msize too small, and that's it.
> 
> 2. How to handle that error with T_version exactly?
> 
>    As far as I can see it, it was originally not considered that T_version 
>    might react with an error response at all. The specs are ambiguous about 
>    this topic. All you can find is that if the transmitted protocol version
>    is not supported by server, then server should respond with "unknown" with 
>    its R_version response, but should not respond with R_error in that case.
> 
>    The specs do not prohibit R_error for T_version in general, but I could 
>    imagine that some clients might not expect if we would send R_error. On the 
>    other hand responding with R_version "unknown" would not be appropriate for 
>    this msize error either, because that would mean to the client that the 
>    protocol version was not supported, which is not the case. So it would 
>    leave client uninformed about the actual reason/cause of the error.
> 
> 3. Are there any undocumented special msizes, e.g. msize=0 for "don't care" ?
> 
>    Probably not, but you might have seen more client implementations than me.
> 
> Best regards,
> Christian Schoenebeck
> 
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir
  2020-01-06 17:49       ` Greg Kurz
@ 2020-01-06 21:43         ` Christian Schoenebeck
  0 siblings, 0 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2020-01-06 21:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: Greg Kurz, Christian Schoenebeck, anthony.perard,
	Stefano Stabellini, Stefano Stabellini

On Montag, 6. Januar 2020 18:49:10 CET Greg Kurz wrote:
> On Mon, 06 Jan 2020 16:10:28 +0100
> 
> Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > On Montag, 6. Januar 2020 13:30:24 CET Greg Kurz wrote:
> > > On Wed, 18 Dec 2019 14:17:59 +0100
> > > 
> > > Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > > A good 9p client sends T_readdir with "count" parameter that's
> > > > sufficiently smaller than client's initially negotiated msize
> > > > (maximum message size). We perform a check for that though to
> > > > avoid the server to be interrupted with a "Failed to encode
> > > > VirtFS reply type 41" error message by bad clients.
> > > 
> > > Hmm... doesn't v9fs_do_readdir() already take care of that ?
> > 
> > No, unfortunately it doesn't. It just looks at the "count" parameter
> > transmitted with client's T_readdir request, but not at session's msize.
> 
> Indeed.
> 
> > So a bad client could send a T_readdir request with a "count" parameter
> > that's substantially higher than session's msize, which would lead to
> > that mentioned transport error ATM. Hence I suggested this patch here to
> > address it.
> > 
> > You can easily reproduce this issue:
> > 
> > 1. Omit this patch 2 (since it would fix it).
> > 
> > 2. Apply patch 3 and patch 4 of this patch set, which assemble a T_readdir
> > 
> >    test case combined, then stop patches here (i.e. don't apply subsequent
> >    patches of this patch set, since e.g. patch 6 would increase test
> >    client's
> >    msize).
> > 
> > 3. Set QTEST_V9FS_SYNTH_READDIR_NFILES in hw/9pfs/9p-synth.h to a high
> > number> 
> >    (e.g. several thousands).
> > 
> > 4. Run the T_readdir test case:
> >    cd build
> >    make && make tests/qos-test
> >    export QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
> >    tests/qos-test -p $(tests/qos-test -l | grep readdir/basic)
> > 
> > Result: Since the test client's msize is quite small at this point (4kB),
> > test client would transmit a very large "count" parameter with T_readdir
> > request to retrieve all QTEST_V9FS_SYNTH_READDIR_NFILES files, and you'll
> > end up getting the quoted transport error message on server (see precise
> > error message above).
> 
> I don't observe this behavior with:
> 
> -#define QTEST_V9FS_SYNTH_READDIR_NFILES 100
> +#define QTEST_V9FS_SYNTH_READDIR_NFILES 20000
> 
> /x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/virtio-9p-pci/virtio-9p/virtio-
> 9p-tests/fs/readdir/basic: **
> ERROR:/home/greg/Work/qemu/qemu-9p/tests/virtio-9p-test.c:597:fs_readdir:
> assertion failed (nentries == QTEST_V9FS_SYNTH_READDIR_NFILES + 2): (101 ==
> 20002)
> 
> The server didn't hit the transport error...

Ah sorry, I forgot I had this already defused on test case side. Change this 
as well to reproduce it:

diff --git a/hw/9pfs/9p-synth.h b/hw/9pfs/9p-synth.h
index 036d7e4a5b..7d6cedcdac 100644
--- a/hw/9pfs/9p-synth.h
+++ b/hw/9pfs/9p-synth.h
@@ -58,7 +58,7 @@ int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int 
mode,
 /* for READDIR test */
 #define QTEST_V9FS_SYNTH_READDIR_DIR "ReadDirDir"
 #define QTEST_V9FS_SYNTH_READDIR_FILE "ReadDirFile%d"
-#define QTEST_V9FS_SYNTH_READDIR_NFILES 100
+#define QTEST_V9FS_SYNTH_READDIR_NFILES 2000
 
 /* Any write to the "FLUSH" file is handled one byte at a time by the
  * backend. If the byte is zero, the backend returns success (ie, 1),
diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
index 48c0eca292..1d49d97a83 100644
--- a/tests/virtio-9p-test.c
+++ b/tests/virtio-9p-test.c
@@ -586,7 +586,7 @@ static void fs_readdir(void *obj, void *data, 
QGuestAllocator *t_alloc)
     v9fs_req_wait_for_reply(req, NULL);
     v9fs_rlopen(req, &qid, NULL);
 
-    req = v9fs_treaddir(v9p, 1, 0, P9_MAX_SIZE - P9_IOHDRSZ, 0);
+    req = v9fs_treaddir(v9p, 1, 0, 900000, 0);
     v9fs_req_wait_for_reply(req, NULL);
     v9fs_rreaddir(req, &count, &nentries, &entries);

Then run the test, which leads to this error here:

/x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/virtio-9p-pci/virtio-9p/
virtio-9p-tests/fs/readdir/basic: qemu-system-x86_64: Failed to encode VirtFS 
reply type 41

-> Hang.

Best regards,
Christian Schoenebeck




^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/9] tests/virtio-9p: added readdir test
  2020-01-06 17:22   ` Greg Kurz
@ 2020-01-07 12:25     ` Christian Schoenebeck
  2020-01-07 15:27       ` Greg Kurz
  0 siblings, 1 reply; 26+ messages in thread
From: Christian Schoenebeck @ 2020-01-07 12:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, Christian Schoenebeck

On Montag, 6. Januar 2020 18:22:52 CET Greg Kurz wrote:
> > diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
> > index 06263edb53..48c0eca292 100644
> > --- a/tests/virtio-9p-test.c
> > +++ b/tests/virtio-9p-test.c
> > @@ -68,6 +68,11 @@ static void v9fs_memread(P9Req *req, void *addr, size_t
> > len)> 
> >      req->r_off += len;
> >  
> >  }
> > 
> > +static void v9fs_uint8_read(P9Req *req, uint8_t *val)
> > +{
> > +    v9fs_memread(req, val, 1);
> > +}
> > +
> > 
> >  static void v9fs_uint16_write(P9Req *req, uint16_t val)
> >  {
> >  
> >      uint16_t le_val = cpu_to_le16(val);
> > 
> > @@ -101,6 +106,12 @@ static void v9fs_uint32_read(P9Req *req, uint32_t
> > *val)> 
> >      le32_to_cpus(val);
> >  
> >  }
> > 
> > +static void v9fs_uint64_read(P9Req *req, uint64_t *val)
> > +{
> > +    v9fs_memread(req, val, 8);
> > +    le64_to_cpus(val);
> > +}
> > +
> > 
> >  /* len[2] string[len] */
> >  static uint16_t v9fs_string_size(const char *string)
> >  {
> > 
> > @@ -191,6 +202,7 @@ static const char *rmessage_name(uint8_t id)
> > 
> >          id == P9_RLOPEN ? "RLOPEN" :
> >          id == P9_RWRITE ? "RWRITE" :
> > 
> >          id == P9_RFLUSH ? "RFLUSH" :
> > +        id == P9_RREADDIR ? "READDIR" :
> >          "<unknown>";
> >  
> >  }
> > 
> > @@ -348,6 +360,79 @@ static void v9fs_rwalk(P9Req *req, uint16_t *nwqid,
> > v9fs_qid **wqid)> 
> >      v9fs_req_free(req);
> >  
> >  }
> > 
> > +/* size[4] Treaddir tag[2] fid[4] offset[8] count[4] */
> > +static P9Req *v9fs_treaddir(QVirtio9P *v9p, uint32_t fid, uint64_t
> > offset,
> > +                            uint32_t count, uint16_t tag)
> > +{
> > +    P9Req *req;
> > +
> > +    req = v9fs_req_init(v9p, 4 + 8 + 4, P9_TREADDIR, tag);
> > +    v9fs_uint32_write(req, fid);
> > +    v9fs_uint64_write(req, offset);
> > +    v9fs_uint32_write(req, count);
> > +    v9fs_req_send(req);
> > +    return req;
> > +}
> > +
> > +struct v9fs_dirent {
> 
> The QEMU coding style calls for a CamelCase typedef,
> 
> ie.
> 
> typedef struct V9fsDirent V9fsDirent;

np

> > +    v9fs_qid qid;
> 
> Yeah... I should have done the same when I introduced this type ;-)

So I'll probably address your sin with a separate patch then.

> > +static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
> > +{
> > +    QVirtio9P *v9p = obj;
> > +    alloc = t_alloc;
> > +    char *const wnames[] = { g_strdup(QTEST_V9FS_SYNTH_READDIR_DIR) };
> > +    uint16_t nqid;
> > +    v9fs_qid qid;
> > +    uint32_t count, nentries;
> > +    struct v9fs_dirent *entries = NULL;
> > +    P9Req *req;
> > +
> > +    fs_attach(v9p, NULL, t_alloc);
> > +    req = v9fs_twalk(v9p, 0, 1, 1, wnames, 0);
> > +    v9fs_req_wait_for_reply(req, NULL);
> > +    v9fs_rwalk(req, &nqid, NULL);
> > +    g_assert_cmpint(nqid, ==, 1);
> > +
> > +    req = v9fs_tlopen(v9p, 1, O_DIRECTORY, 0);
> > +    v9fs_req_wait_for_reply(req, NULL);
> > +    v9fs_rlopen(req, &qid, NULL);
> > +
> > +    req = v9fs_treaddir(v9p, 1, 0, P9_MAX_SIZE - P9_IOHDRSZ, 0);
> > +    v9fs_req_wait_for_reply(req, NULL);
> > +    v9fs_rreaddir(req, &count, &nentries, &entries);
> > +
> > +    /*
> > +     * Assuming msize (P9_MAX_SIZE) is large enough so we can retrieve
> > all
> > +     * dir entries with only one readdir request.
> > +     */
> > +    g_assert_cmpint(
> > +        nentries, ==,
> > +        QTEST_V9FS_SYNTH_READDIR_NFILES + 2 /* "." and ".." */
> > +    );
> 
> What about coming up with a version of this test that loops until
> it could read all the entries instead of this assumption ?

Yes, I had this planned a bit later though. And not as a replacement for this 
one, but rather as a subsequent advanced readdir test. Because it makes sense 
to cover both cases: readdir a large amount of entries with a single request, 
but also splitted down by several readdir requests as subsequent, separate 
test.

Best regards,
Christian Schoenebeck




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 5/9] tests/virtio-9p: check file names of R_readdir response
  2020-01-06 17:07   ` Greg Kurz
@ 2020-01-07 12:28     ` Christian Schoenebeck
  2020-01-07 15:29       ` Greg Kurz
  0 siblings, 1 reply; 26+ messages in thread
From: Christian Schoenebeck @ 2020-01-07 12:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, Christian Schoenebeck

On Montag, 6. Januar 2020 18:07:11 CET Greg Kurz wrote:
> On Wed, 18 Dec 2019 14:35:56 +0100
> 
> Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > Additionally to the already existing check for expected amount
> > of directory entries returned by R_readdir response, also check
> > whether all directory entries have the expected file names (as
> > created on 9pfs synth driver side), ignoring their precise order
> > in result list though.
> > 
> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > ---
> 
> LGTM and trivial enough that it can be folded in the previous
> patch.

So you want that patch to be squashed. Np.

Best regards,
Christian Schoenebeck




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/9] tests/virtio-9p: added readdir test
  2020-01-07 12:25     ` Christian Schoenebeck
@ 2020-01-07 15:27       ` Greg Kurz
  0 siblings, 0 replies; 26+ messages in thread
From: Greg Kurz @ 2020-01-07 15:27 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Tue, 07 Jan 2020 13:25:46 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Montag, 6. Januar 2020 18:22:52 CET Greg Kurz wrote:
> > > diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
> > > index 06263edb53..48c0eca292 100644
> > > --- a/tests/virtio-9p-test.c
> > > +++ b/tests/virtio-9p-test.c
> > > @@ -68,6 +68,11 @@ static void v9fs_memread(P9Req *req, void *addr, size_t
> > > len)> 
> > >      req->r_off += len;
> > >  
> > >  }
> > > 
> > > +static void v9fs_uint8_read(P9Req *req, uint8_t *val)
> > > +{
> > > +    v9fs_memread(req, val, 1);
> > > +}
> > > +
> > > 
> > >  static void v9fs_uint16_write(P9Req *req, uint16_t val)
> > >  {
> > >  
> > >      uint16_t le_val = cpu_to_le16(val);
> > > 
> > > @@ -101,6 +106,12 @@ static void v9fs_uint32_read(P9Req *req, uint32_t
> > > *val)> 
> > >      le32_to_cpus(val);
> > >  
> > >  }
> > > 
> > > +static void v9fs_uint64_read(P9Req *req, uint64_t *val)
> > > +{
> > > +    v9fs_memread(req, val, 8);
> > > +    le64_to_cpus(val);
> > > +}
> > > +
> > > 
> > >  /* len[2] string[len] */
> > >  static uint16_t v9fs_string_size(const char *string)
> > >  {
> > > 
> > > @@ -191,6 +202,7 @@ static const char *rmessage_name(uint8_t id)
> > > 
> > >          id == P9_RLOPEN ? "RLOPEN" :
> > >          id == P9_RWRITE ? "RWRITE" :
> > > 
> > >          id == P9_RFLUSH ? "RFLUSH" :
> > > +        id == P9_RREADDIR ? "READDIR" :
> > >          "<unknown>";
> > >  
> > >  }
> > > 
> > > @@ -348,6 +360,79 @@ static void v9fs_rwalk(P9Req *req, uint16_t *nwqid,
> > > v9fs_qid **wqid)> 
> > >      v9fs_req_free(req);
> > >  
> > >  }
> > > 
> > > +/* size[4] Treaddir tag[2] fid[4] offset[8] count[4] */
> > > +static P9Req *v9fs_treaddir(QVirtio9P *v9p, uint32_t fid, uint64_t
> > > offset,
> > > +                            uint32_t count, uint16_t tag)
> > > +{
> > > +    P9Req *req;
> > > +
> > > +    req = v9fs_req_init(v9p, 4 + 8 + 4, P9_TREADDIR, tag);
> > > +    v9fs_uint32_write(req, fid);
> > > +    v9fs_uint64_write(req, offset);
> > > +    v9fs_uint32_write(req, count);
> > > +    v9fs_req_send(req);
> > > +    return req;
> > > +}
> > > +
> > > +struct v9fs_dirent {
> > 
> > The QEMU coding style calls for a CamelCase typedef,
> > 
> > ie.
> > 
> > typedef struct V9fsDirent V9fsDirent;
> 
> np
> 
> > > +    v9fs_qid qid;
> > 
> > Yeah... I should have done the same when I introduced this type ;-)
> 
> So I'll probably address your sin with a separate patch then.
> 

Thanks. :)

> > > +static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
> > > +{
> > > +    QVirtio9P *v9p = obj;
> > > +    alloc = t_alloc;
> > > +    char *const wnames[] = { g_strdup(QTEST_V9FS_SYNTH_READDIR_DIR) };
> > > +    uint16_t nqid;
> > > +    v9fs_qid qid;
> > > +    uint32_t count, nentries;
> > > +    struct v9fs_dirent *entries = NULL;
> > > +    P9Req *req;
> > > +
> > > +    fs_attach(v9p, NULL, t_alloc);
> > > +    req = v9fs_twalk(v9p, 0, 1, 1, wnames, 0);
> > > +    v9fs_req_wait_for_reply(req, NULL);
> > > +    v9fs_rwalk(req, &nqid, NULL);
> > > +    g_assert_cmpint(nqid, ==, 1);
> > > +
> > > +    req = v9fs_tlopen(v9p, 1, O_DIRECTORY, 0);
> > > +    v9fs_req_wait_for_reply(req, NULL);
> > > +    v9fs_rlopen(req, &qid, NULL);
> > > +
> > > +    req = v9fs_treaddir(v9p, 1, 0, P9_MAX_SIZE - P9_IOHDRSZ, 0);
> > > +    v9fs_req_wait_for_reply(req, NULL);
> > > +    v9fs_rreaddir(req, &count, &nentries, &entries);
> > > +
> > > +    /*
> > > +     * Assuming msize (P9_MAX_SIZE) is large enough so we can retrieve
> > > all
> > > +     * dir entries with only one readdir request.
> > > +     */
> > > +    g_assert_cmpint(
> > > +        nentries, ==,
> > > +        QTEST_V9FS_SYNTH_READDIR_NFILES + 2 /* "." and ".." */
> > > +    );
> > 
> > What about coming up with a version of this test that loops until
> > it could read all the entries instead of this assumption ?
> 
> Yes, I had this planned a bit later though. And not as a replacement for this 
> one, but rather as a subsequent advanced readdir test. Because it makes sense 
> to cover both cases: readdir a large amount of entries with a single request, 
> but also splitted down by several readdir requests as subsequent, separate 
> test.
> 

Works for me.

> Best regards,
> Christian Schoenebeck
> 
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 5/9] tests/virtio-9p: check file names of R_readdir response
  2020-01-07 12:28     ` Christian Schoenebeck
@ 2020-01-07 15:29       ` Greg Kurz
  0 siblings, 0 replies; 26+ messages in thread
From: Greg Kurz @ 2020-01-07 15:29 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Tue, 07 Jan 2020 13:28:38 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Montag, 6. Januar 2020 18:07:11 CET Greg Kurz wrote:
> > On Wed, 18 Dec 2019 14:35:56 +0100
> > 
> > Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > Additionally to the already existing check for expected amount
> > > of directory entries returned by R_readdir response, also check
> > > whether all directory entries have the expected file names (as
> > > created on 9pfs synth driver side), ignoring their precise order
> > > in result list though.
> > > 
> > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > ---
> > 
> > LGTM and trivial enough that it can be folded in the previous
> > patch.
> 
> So you want that patch to be squashed. Np.
> 

Yes. Thanks.

> Best regards,
> Christian Schoenebeck
> 
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir
  2020-01-06 15:10     ` Christian Schoenebeck
  2020-01-06 17:49       ` Greg Kurz
@ 2020-01-08 23:53       ` Greg Kurz
  2020-01-10 12:03         ` Christian Schoenebeck
  1 sibling, 1 reply; 26+ messages in thread
From: Greg Kurz @ 2020-01-08 23:53 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: anthony.perard, Stefano Stabellini, qemu-devel, Stefano Stabellini

On Mon, 06 Jan 2020 16:10:28 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Montag, 6. Januar 2020 13:30:24 CET Greg Kurz wrote:
> > On Wed, 18 Dec 2019 14:17:59 +0100
> > 
> > Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > A good 9p client sends T_readdir with "count" parameter that's
> > > sufficiently smaller than client's initially negotiated msize
> > > (maximum message size). We perform a check for that though to
> > > avoid the server to be interrupted with a "Failed to encode
> > > VirtFS reply type 41" error message by bad clients.
> > 
> > Hmm... doesn't v9fs_do_readdir() already take care of that ?
> 
> No, unfortunately it doesn't. It just looks at the "count" parameter 
> transmitted with client's T_readdir request, but not at session's msize.
> So a bad client could send a T_readdir request with a "count" parameter that's 
> substantially higher than session's msize, which would lead to that mentioned 
> transport error ATM. Hence I suggested this patch here to address it.
> 
> You can easily reproduce this issue:
> 
> 1. Omit this patch 2 (since it would fix it).
> 
> 2. Apply patch 3 and patch 4 of this patch set, which assemble a T_readdir
>    test case combined, then stop patches here (i.e. don't apply subsequent 
>    patches of this patch set, since e.g. patch 6 would increase test client's 
>    msize).
> 
> 3. Set QTEST_V9FS_SYNTH_READDIR_NFILES in hw/9pfs/9p-synth.h to a high number
>    (e.g. several thousands).
> 
> 4. Run the T_readdir test case:
> 
>    cd build
>    make && make tests/qos-test
>    export QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
>    tests/qos-test -p $(tests/qos-test -l | grep readdir/basic)
> 
> Result: Since the test client's msize is quite small at this point (4kB), test 
> client would transmit a very large "count" parameter with T_readdir request to 
> retrieve all QTEST_V9FS_SYNTH_READDIR_NFILES files, and you'll end up getting 
> the quoted transport error message on server (see precise error message 
> above).
> 
> > > Note: we should probably also check for a minimum size of
> > > msize during T_version to avoid issues and/or too complicated
> > > count/size checks later on in other requests of that client.
> > > T_version should submit an msize that's at least as large as
> > > the largest request's header size.
> > 
> > Do you mean that the server should expose such an msize in the
> > R_version response ? The 9p spec only says that the server must
> > return an msize <= to the one proposed by the client [1]. Not
> > sure we can do more than to emit a warning and/or interrupt the
> > server if the client sends a silly size.
> > 
> > [1] https://9fans.github.io/plan9port/man/man9/version.html
> 
> My idea was to "handle it as an error" immediately when server receives a 
> T_version request with a "msize" argument transmitted by client that would be 
> way too small for anything.
> 
> Because if client sends T_version with msize < P9_IOHDRSZ then it is obvious 
> that this msize would be way too small for handling any subsequent 9p request 
> at all.
> 

P9_IOHDRSZ is 24 bytes, it is definitely not enough to accommodate an
R_getattr message (153 bytes and heavily used by the linux client) for
example, so we would likely need a higher minimum as you're suggesting
later in this mail.

> > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > ---
> > > 
> > >  hw/9pfs/9p.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > > index 520177f40c..30e33b6573 100644
> > > --- a/hw/9pfs/9p.c
> > > +++ b/hw/9pfs/9p.c
> > > @@ -2414,6 +2414,7 @@ static void coroutine_fn v9fs_readdir(void *opaque)
> > > 
> > >      int32_t count;
> > >      uint32_t max_count;
> > >      V9fsPDU *pdu = opaque;
> > > 
> > > +    V9fsState *s = pdu->s;
> > > 
> > >      retval = pdu_unmarshal(pdu, offset, "dqd", &fid,
> > >      
> > >                             &initial_offset, &max_count);
> > > 
> > > @@ -2422,6 +2423,13 @@ static void coroutine_fn v9fs_readdir(void *opaque)
> > > 
> > >      }
> > >      trace_v9fs_readdir(pdu->tag, pdu->id, fid, initial_offset,
> > >      max_count);
> > > 
> > > +    if (max_count > s->msize - P9_IOHDRSZ) {
> > > +        max_count = s->msize - P9_IOHDRSZ;
> > 
> > What if s->msize < P9_IOHDRSZ ?
> 
> Exactly, that's why I added that comment to the commit log of this patch for 
> now to make this circumstance clear as yet TODO.
> 
> This issue with T_version and msize needs to be addressed anyway, because it 
> will popup again and again with various other request types and also with 
> transport aspects like previously discussed on a transport buffer size issue 
> (submitters of that transport patch on CC here for that reason):
> https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg04701.html
> 
> The required patch to address this overall minimum msize issue would be very 
> small: just comparing client's transmitted "msize" parameter of client's 
> T_version request and if that transmitted msize is smaller than a certain 
> absolute minimum msize then raise an error immediately to prevent the session 
> to start.
> 
> But there are some decisions to be made, that's why I haven't provided a patch 
> for this min. msize issue in this patch series yet:
> 
> 1. What is the minimum msize to be allowed with T_version?
> 
>    P9_IOHDRSZ would be IMO too small, because P9_IOHDRSZ is the size of the 
>    common header portion of all IO requests. So it would rather make sense IMO
>    reviewing the protocol and pick the largest header size among all possible 

Not the largest header size, rather the size of the largest message,
otherwise we may still disconnect. Some response messages contain a
variable size string that the 9p spec forbids to truncate. In the case
of R_readlink for example, this could go up to the largest target size
supported by the host filesystem for symbolic links... ie. likely PATH_MAX.

>    requests supported by 9pfs server ATM. The advantage of this approach would 
>    be that overall code would be easier too maintain, since we don't have to 
>    add minimum msize checks in any (or even all) individual request type 
>    handlers. T_version handler of server would already enforce a minimum msize 
>    and prevent the session if msize too small, and that's it.
> 

Makes sense. Since the linux client already requires msize to be at
least 4096, this could be a reasonable choice: it doesn't break any
existing setups and it allows fairly large strings in messages.

> 2. How to handle that error with T_version exactly?
> 
>    As far as I can see it, it was originally not considered that T_version 
>    might react with an error response at all. The specs are ambiguous about 
>    this topic. All you can find is that if the transmitted protocol version
>    is not supported by server, then server should respond with "unknown" with 
>    its R_version response, but should not respond with R_error in that case.
> 
>    The specs do not prohibit R_error for T_version in general, but I could 
>    imagine that some clients might not expect if we would send R_error. On the 
>    other hand responding with R_version "unknown" would not be appropriate for 
>    this msize error either, because that would mean to the client that the 
>    protocol version was not supported, which is not the case. So it would 
>    leave client uninformed about the actual reason/cause of the error.
> 

Other 9p implementations have minimal msize checks and return an error
message. We could go that way too. The only fishy thing I see, and that
the other implementations seem to ignore, is that the error message format
depends on the protocol version. It is R_error for older 9p2000 and 9p2000.u,
while 9p2000.L uses R_lerror. If the client doesn't understand any of these
protocols, which is very unlikely, it seems reasonable to interrupt the
server.

> 3. Are there any undocumented special msizes, e.g. msize=0 for "don't care" ?
> 
>    Probably not, but you might have seen more client implementations than me.
> 

I'm not aware of special msizes.

> Best regards,
> Christian Schoenebeck
> 
> 

Cheers,

--
Greg


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/9] tests/virtio-9p: added readdir test
  2019-12-18 13:30 ` [PATCH v2 4/9] tests/virtio-9p: added " Christian Schoenebeck
  2020-01-06 17:22   ` Greg Kurz
@ 2020-01-08 23:55   ` Greg Kurz
  2020-01-10 12:10     ` Christian Schoenebeck
  1 sibling, 1 reply; 26+ messages in thread
From: Greg Kurz @ 2020-01-08 23:55 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Wed, 18 Dec 2019 14:30:43 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> This first readdir test simply checks the amount of directory
> entries returned by 9pfs server, according to the created amount
> of virtual files on 9pfs synth driver side.
> 
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---

More comments below.

>  tests/virtio-9p-test.c | 124 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 124 insertions(+)
> 
> diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
> index 06263edb53..48c0eca292 100644
> --- a/tests/virtio-9p-test.c
> +++ b/tests/virtio-9p-test.c
> @@ -68,6 +68,11 @@ static void v9fs_memread(P9Req *req, void *addr, size_t len)
>      req->r_off += len;
>  }
>  
> +static void v9fs_uint8_read(P9Req *req, uint8_t *val)
> +{
> +    v9fs_memread(req, val, 1);
> +}
> +
>  static void v9fs_uint16_write(P9Req *req, uint16_t val)
>  {
>      uint16_t le_val = cpu_to_le16(val);
> @@ -101,6 +106,12 @@ static void v9fs_uint32_read(P9Req *req, uint32_t *val)
>      le32_to_cpus(val);
>  }
>  
> +static void v9fs_uint64_read(P9Req *req, uint64_t *val)
> +{
> +    v9fs_memread(req, val, 8);
> +    le64_to_cpus(val);
> +}
> +
>  /* len[2] string[len] */
>  static uint16_t v9fs_string_size(const char *string)
>  {
> @@ -191,6 +202,7 @@ static const char *rmessage_name(uint8_t id)
>          id == P9_RLOPEN ? "RLOPEN" :
>          id == P9_RWRITE ? "RWRITE" :
>          id == P9_RFLUSH ? "RFLUSH" :
> +        id == P9_RREADDIR ? "READDIR" :
>          "<unknown>";
>  }
>  
> @@ -348,6 +360,79 @@ static void v9fs_rwalk(P9Req *req, uint16_t *nwqid, v9fs_qid **wqid)
>      v9fs_req_free(req);
>  }
>  
> +/* size[4] Treaddir tag[2] fid[4] offset[8] count[4] */
> +static P9Req *v9fs_treaddir(QVirtio9P *v9p, uint32_t fid, uint64_t offset,
> +                            uint32_t count, uint16_t tag)
> +{
> +    P9Req *req;
> +
> +    req = v9fs_req_init(v9p, 4 + 8 + 4, P9_TREADDIR, tag);
> +    v9fs_uint32_write(req, fid);
> +    v9fs_uint64_write(req, offset);
> +    v9fs_uint32_write(req, count);
> +    v9fs_req_send(req);
> +    return req;
> +}
> +
> +struct v9fs_dirent {
> +    v9fs_qid qid;
> +    uint64_t offset;
> +    uint8_t type;
> +    char *name;
> +    struct v9fs_dirent *next;
> +};
> +
> +/* size[4] Rreaddir tag[2] count[4] data[count] */
> +static void v9fs_rreaddir(P9Req *req, uint32_t *count, uint32_t *nentries,
> +                          struct v9fs_dirent **entries)
> +{
> +    uint32_t sz;

Even if this is a size indeed, the 9p spec uses the wording "count" and
so does the function signature. Please rename this variable to local_count.
Some other functions that return server originated data already use this
naming scheme.

> +    struct v9fs_dirent *e = NULL;
> +    uint16_t slen;
> +    uint32_t n = 0;
> +
> +    v9fs_req_recv(req, P9_RREADDIR);
> +    v9fs_uint32_read(req, &sz);
> +
> +    if (count) {
> +        *count = sz;
> +    }
> +
> +    for (int32_t togo = (int32_t)sz;
> +         togo >= 13 + 8 + 1 + 2;
> +         togo -= 13 + 8 + 1 + 2 + slen, ++n)
> +    {
> +        if (!e) {
> +            e = g_malloc(sizeof(struct v9fs_dirent));
> +            if (entries)

ERROR: braces {} are necessary for all arms of this statement
#98: FILE: tests/virtio-9p-test.c:407:
+            if (entries)
[...]

> +                *entries = e;
> +        } else {
> +            e = e->next = g_malloc(sizeof(struct v9fs_dirent));
> +        }
> +        e->next = NULL;
> +        /* qid[13] offset[8] type[1] name[s] */
> +        v9fs_memread(req, &e->qid, 13);
> +        v9fs_uint64_read(req, &e->offset);
> +        v9fs_uint8_read(req, &e->type);
> +        v9fs_string_read(req, &slen, &e->name);
> +    }
> +
> +    if (nentries) {
> +        *nentries = n;
> +    }
> +}
> +
> +static void v9fs_free_dirents(struct v9fs_dirent *e)
> +{
> +    struct v9fs_dirent *next = NULL;
> +
> +    for (; e; e = next) {
> +        next = e->next;
> +        g_free(e->name);
> +        g_free(e);
> +    }
> +}
> +
>  /* size[4] Tlopen tag[2] fid[4] flags[4] */
>  static P9Req *v9fs_tlopen(QVirtio9P *v9p, uint32_t fid, uint32_t flags,
>                            uint16_t tag)
> @@ -480,6 +565,44 @@ static void fs_walk(void *obj, void *data, QGuestAllocator *t_alloc)
>      g_free(wqid);
>  }
>  
> +static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
> +{
> +    QVirtio9P *v9p = obj;
> +    alloc = t_alloc;
> +    char *const wnames[] = { g_strdup(QTEST_V9FS_SYNTH_READDIR_DIR) };
> +    uint16_t nqid;
> +    v9fs_qid qid;
> +    uint32_t count, nentries;
> +    struct v9fs_dirent *entries = NULL;
> +    P9Req *req;
> +
> +    fs_attach(v9p, NULL, t_alloc);
> +    req = v9fs_twalk(v9p, 0, 1, 1, wnames, 0);
> +    v9fs_req_wait_for_reply(req, NULL);
> +    v9fs_rwalk(req, &nqid, NULL);
> +    g_assert_cmpint(nqid, ==, 1);
> +
> +    req = v9fs_tlopen(v9p, 1, O_DIRECTORY, 0);
> +    v9fs_req_wait_for_reply(req, NULL);
> +    v9fs_rlopen(req, &qid, NULL);
> +
> +    req = v9fs_treaddir(v9p, 1, 0, P9_MAX_SIZE - P9_IOHDRSZ, 0);
> +    v9fs_req_wait_for_reply(req, NULL);
> +    v9fs_rreaddir(req, &count, &nentries, &entries);
> +
> +    /*
> +     * Assuming msize (P9_MAX_SIZE) is large enough so we can retrieve all
> +     * dir entries with only one readdir request.
> +     */
> +    g_assert_cmpint(
> +        nentries, ==,
> +        QTEST_V9FS_SYNTH_READDIR_NFILES + 2 /* "." and ".." */
> +    );
> +
> +    v9fs_free_dirents(entries);
> +    g_free(wnames[0]);
> +}
> +
>  static void fs_walk_no_slash(void *obj, void *data, QGuestAllocator *t_alloc)
>  {
>      QVirtio9P *v9p = obj;
> @@ -658,6 +781,7 @@ static void register_virtio_9p_test(void)
>                   NULL);
>      qos_add_test("fs/flush/ignored", "virtio-9p", fs_flush_ignored,
>                   NULL);
> +    qos_add_test("fs/readdir/basic", "virtio-9p", fs_readdir, NULL);
>  }
>  
>  libqos_init(register_virtio_9p_test);



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/9] hw/9pfs/9p-synth: added directory for readdir test
  2019-12-18 13:23 ` [PATCH v2 3/9] hw/9pfs/9p-synth: added directory for readdir test Christian Schoenebeck
@ 2020-01-09 18:49   ` Greg Kurz
  0 siblings, 0 replies; 26+ messages in thread
From: Greg Kurz @ 2020-01-09 18:49 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Wed, 18 Dec 2019 14:23:48 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> This will provide the following virtual files by the 9pfs
> synth driver:
> 
>   - /ReadDirDir/ReadDirFile99
>   - /ReadDirDir/ReadDirFile98
>   ...
>   - /ReadDirDir/ReadDirFile1
>   - /ReadDirDir/ReadDirFile0
> 
> This virtual directory and its virtual 100 files will be
> used by the upcoming 9pfs readdir tests.
> 
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---

LGTM

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/9pfs/9p-synth.c | 19 +++++++++++++++++++
>  hw/9pfs/9p-synth.h |  5 +++++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
> index 54239c9bbf..7eb210ffa8 100644
> --- a/hw/9pfs/9p-synth.c
> +++ b/hw/9pfs/9p-synth.c
> @@ -578,6 +578,25 @@ static int synth_init(FsContext *ctx, Error **errp)
>                                         NULL, v9fs_synth_qtest_flush_write,
>                                         ctx);
>          assert(!ret);
> +
> +        /* Directory for READDIR test */
> +        {
> +            V9fsSynthNode *dir = NULL;
> +            ret = qemu_v9fs_synth_mkdir(
> +                NULL, 0700, QTEST_V9FS_SYNTH_READDIR_DIR, &dir
> +            );
> +            assert(!ret);
> +            for (i = 0; i < QTEST_V9FS_SYNTH_READDIR_NFILES; ++i) {
> +                char *name = g_strdup_printf(
> +                    QTEST_V9FS_SYNTH_READDIR_FILE, i
> +                );
> +                ret = qemu_v9fs_synth_add_file(
> +                    dir, 0, name, NULL, NULL, ctx
> +                );
> +                assert(!ret);
> +                g_free(name);
> +            }
> +        }
>      }
>  
>      return 0;
> diff --git a/hw/9pfs/9p-synth.h b/hw/9pfs/9p-synth.h
> index af7a993a1e..036d7e4a5b 100644
> --- a/hw/9pfs/9p-synth.h
> +++ b/hw/9pfs/9p-synth.h
> @@ -55,6 +55,11 @@ int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int mode,
>  #define QTEST_V9FS_SYNTH_LOPEN_FILE "LOPEN"
>  #define QTEST_V9FS_SYNTH_WRITE_FILE "WRITE"
>  
> +/* for READDIR test */
> +#define QTEST_V9FS_SYNTH_READDIR_DIR "ReadDirDir"
> +#define QTEST_V9FS_SYNTH_READDIR_FILE "ReadDirFile%d"
> +#define QTEST_V9FS_SYNTH_READDIR_NFILES 100
> +
>  /* Any write to the "FLUSH" file is handled one byte at a time by the
>   * backend. If the byte is zero, the backend returns success (ie, 1),
>   * otherwise it forces the server to try again forever. Thus allowing



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir
  2020-01-08 23:53       ` Greg Kurz
@ 2020-01-10 12:03         ` Christian Schoenebeck
  0 siblings, 0 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2020-01-10 12:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Greg Kurz, Christian Schoenebeck, anthony.perard,
	Stefano Stabellini, Stefano Stabellini

On Donnerstag, 9. Januar 2020 00:53:35 CET Greg Kurz wrote:
> > My idea was to "handle it as an error" immediately when server receives a
> > T_version request with a "msize" argument transmitted by client that would
> > be way too small for anything.
> > 
> > Because if client sends T_version with msize < P9_IOHDRSZ then it is
> > obvious that this msize would be way too small for handling any
> > subsequent 9p request at all.
> 
> P9_IOHDRSZ is 24 bytes, it is definitely not enough to accommodate an
> R_getattr message (153 bytes and heavily used by the linux client) for
> example, so we would likely need a higher minimum as you're suggesting
> later in this mail.
[snip]
> > 1. What is the minimum msize to be allowed with T_version?
> > 
> >    P9_IOHDRSZ would be IMO too small, because P9_IOHDRSZ is the size of
> >    the
> >    common header portion of all IO requests. So it would rather make sense
> >    IMO
> >    reviewing the protocol and pick the largest header size among all
> >    possible
> 
> Not the largest header size, rather the size of the largest message,
> otherwise we may still disconnect. Some response messages contain a
> variable size string that the 9p spec forbids to truncate. In the case
> of R_readlink for example, this could go up to the largest target size
> supported by the host filesystem for symbolic links... ie. likely PATH_MAX.

Good point.

> >    requests supported by 9pfs server ATM. The advantage of this approach
> >    would
> >    be that overall code would be easier too maintain, since we don't have
> >    to
> >    add minimum msize checks in any (or even all) individual request type
> >    handlers. T_version handler of server would already enforce a minimum
> >    msize
> >    and prevent the session if msize too small, and that's it.
> 
> Makes sense. Since the linux client already requires msize to be at
> least 4096, this could be a reasonable choice: it doesn't break any
> existing setups and it allows fairly large strings in messages.

4096 as min. msize is fine with me.

In practice people should use a *much* higher msize for performance reasons 
anyway. And I planned to make this circumstance more clear in the docs.

> > 2. How to handle that error with T_version exactly?
> > 
> >    As far as I can see it, it was originally not considered that T_version
> >    might react with an error response at all. The specs are ambiguous
> >    about
> >    this topic. All you can find is that if the transmitted protocol
> >    version
> >    is not supported by server, then server should respond with "unknown"
> >    with
> >    its R_version response, but should not respond with R_error in that
> >    case.
> >    
> >    The specs do not prohibit R_error for T_version in general, but I could
> >    imagine that some clients might not expect if we would send R_error. On
> >    the
> >    other hand responding with R_version "unknown" would not be appropriate
> >    for
> >    this msize error either, because that would mean to the client that the
> >    protocol version was not supported, which is not the case. So it would
> >    leave client uninformed about the actual reason/cause of the error.
> 
> Other 9p implementations have minimal msize checks and return an error
> message. We could go that way too. The only fishy thing I see, and that
> the other implementations seem to ignore, is that the error message format
> depends on the protocol version. It is R_error for older 9p2000 and
> 9p2000.u, while 9p2000.L uses R_lerror.

Ok, noted.

> If the client doesn't understand
> any of these protocols, which is very unlikely, it seems reasonable to
> interrupt the server.

No need. If it is neither 9P2000.u nor 9P2000.L then the appropriate reply of 
server was and still is:

R_version "unknown"

Since the unsupported protocol version still weights higher than the 
unsupported msize. If you are very, very picky, you might want to distinguish 
in this particular case from the case where msize is even too small for 
R_version "unknown", but personally I don't care since the latter would be an 
extreme exotic and broken client.

Best regards,
Christian Schoenebeck




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/9] tests/virtio-9p: added readdir test
  2020-01-08 23:55   ` Greg Kurz
@ 2020-01-10 12:10     ` Christian Schoenebeck
  0 siblings, 0 replies; 26+ messages in thread
From: Christian Schoenebeck @ 2020-01-10 12:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, Christian Schoenebeck

On Donnerstag, 9. Januar 2020 00:55:45 CET Greg Kurz wrote:
> > diff --git a/tests/virtio-9p-test.c b/tests/virtio-9p-test.c
> > index 06263edb53..48c0eca292 100644
> > --- a/tests/virtio-9p-test.c
> > +++ b/tests/virtio-9p-test.c
> > @@ -68,6 +68,11 @@ static void v9fs_memread(P9Req *req, void *addr, size_t
> > len)> 
> >      req->r_off += len;
> >  
> >  }
> > 
> > +static void v9fs_uint8_read(P9Req *req, uint8_t *val)
> > +{
> > +    v9fs_memread(req, val, 1);
> > +}
> > +
> > 
> >  static void v9fs_uint16_write(P9Req *req, uint16_t val)
> >  {
> >  
> >      uint16_t le_val = cpu_to_le16(val);
> > 
> > @@ -101,6 +106,12 @@ static void v9fs_uint32_read(P9Req *req, uint32_t
> > *val)> 
> >      le32_to_cpus(val);
> >  
> >  }
> > 
> > +static void v9fs_uint64_read(P9Req *req, uint64_t *val)
> > +{
> > +    v9fs_memread(req, val, 8);
> > +    le64_to_cpus(val);
> > +}
> > +
> > 
> >  /* len[2] string[len] */
> >  static uint16_t v9fs_string_size(const char *string)
> >  {
> > 
> > @@ -191,6 +202,7 @@ static const char *rmessage_name(uint8_t id)
> > 
> >          id == P9_RLOPEN ? "RLOPEN" :
> >          id == P9_RWRITE ? "RWRITE" :
> > 
> >          id == P9_RFLUSH ? "RFLUSH" :
> > +        id == P9_RREADDIR ? "READDIR" :
> >          "<unknown>";
> >  
> >  }
> > 
> > @@ -348,6 +360,79 @@ static void v9fs_rwalk(P9Req *req, uint16_t *nwqid,
> > v9fs_qid **wqid)> 
> >      v9fs_req_free(req);
> >  
> >  }
> > 
> > +/* size[4] Treaddir tag[2] fid[4] offset[8] count[4] */
> > +static P9Req *v9fs_treaddir(QVirtio9P *v9p, uint32_t fid, uint64_t
> > offset,
> > +                            uint32_t count, uint16_t tag)
> > +{
> > +    P9Req *req;
> > +
> > +    req = v9fs_req_init(v9p, 4 + 8 + 4, P9_TREADDIR, tag);
> > +    v9fs_uint32_write(req, fid);
> > +    v9fs_uint64_write(req, offset);
> > +    v9fs_uint32_write(req, count);
> > +    v9fs_req_send(req);
> > +    return req;
> > +}
> > +
> > +struct v9fs_dirent {
> > +    v9fs_qid qid;
> > +    uint64_t offset;
> > +    uint8_t type;
> > +    char *name;
> > +    struct v9fs_dirent *next;
> > +};
> > +
> > +/* size[4] Rreaddir tag[2] count[4] data[count] */
> > +static void v9fs_rreaddir(P9Req *req, uint32_t *count, uint32_t
> > *nentries,
> > +                          struct v9fs_dirent **entries)
> > +{
> > +    uint32_t sz;
> 
> Even if this is a size indeed, the 9p spec uses the wording "count" and
> so does the function signature. Please rename this variable to local_count.
> Some other functions that return server originated data already use this
> naming scheme.

I know, I did that intentionally. But I don't care for such code style details 
enough to start argueing, so I'll change it.

> 
> > +    struct v9fs_dirent *e = NULL;
> > +    uint16_t slen;
> > +    uint32_t n = 0;
> > +
> > +    v9fs_req_recv(req, P9_RREADDIR);
> > +    v9fs_uint32_read(req, &sz);
> > +
> > +    if (count) {
> > +        *count = sz;
> > +    }
> > +
> > +    for (int32_t togo = (int32_t)sz;
> > +         togo >= 13 + 8 + 1 + 2;
> > +         togo -= 13 + 8 + 1 + 2 + slen, ++n)
> > +    {
> > +        if (!e) {
> > +            e = g_malloc(sizeof(struct v9fs_dirent));
> > +            if (entries)
> 
> ERROR: braces {} are necessary for all arms of this statement
> #98: FILE: tests/virtio-9p-test.c:407:
> +            if (entries)
> [...]

Right, sorry I missed that for some reason.

Best regards,
Christian Schoenebeck




^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-01-10 12:11 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-18 14:17 [PATCH v2 0/9] 9pfs: readdir optimization Christian Schoenebeck
2019-12-18 13:06 ` [PATCH v2 1/9] tests/virtio-9p: add terminating null in v9fs_string_read() Christian Schoenebeck
2020-01-06 11:00   ` Greg Kurz
2019-12-18 13:17 ` [PATCH v2 2/9] 9pfs: validate count sent by client with T_readdir Christian Schoenebeck
2020-01-06 12:30   ` Greg Kurz
2020-01-06 15:10     ` Christian Schoenebeck
2020-01-06 17:49       ` Greg Kurz
2020-01-06 21:43         ` Christian Schoenebeck
2020-01-08 23:53       ` Greg Kurz
2020-01-10 12:03         ` Christian Schoenebeck
2019-12-18 13:23 ` [PATCH v2 3/9] hw/9pfs/9p-synth: added directory for readdir test Christian Schoenebeck
2020-01-09 18:49   ` Greg Kurz
2019-12-18 13:30 ` [PATCH v2 4/9] tests/virtio-9p: added " Christian Schoenebeck
2020-01-06 17:22   ` Greg Kurz
2020-01-07 12:25     ` Christian Schoenebeck
2020-01-07 15:27       ` Greg Kurz
2020-01-08 23:55   ` Greg Kurz
2020-01-10 12:10     ` Christian Schoenebeck
2019-12-18 13:35 ` [PATCH v2 5/9] tests/virtio-9p: check file names of R_readdir response Christian Schoenebeck
2020-01-06 17:07   ` Greg Kurz
2020-01-07 12:28     ` Christian Schoenebeck
2020-01-07 15:29       ` Greg Kurz
2019-12-18 13:43 ` [PATCH v2 6/9] 9pfs: readdir benchmark Christian Schoenebeck
2019-12-18 13:52 ` [PATCH v2 7/9] hw/9pfs/9p-synth: avoid n-square issue in synth_readdir() Christian Schoenebeck
2019-12-18 14:00 ` [PATCH v2 8/9] 9pfs: T_readdir latency optimization Christian Schoenebeck
2019-12-18 14:10 ` [PATCH v2 9/9] hw/9pfs/9p.c: benchmark time on T_readdir request Christian Schoenebeck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.