* [PATCH v2 2/7] 9pfs: Twalk benchmark
  2021-06-04 15:48 [PATCH v2 0/7] 9pfs: Twalk optimization Christian Schoenebeck
@ 2021-05-27 17:03 ` Christian Schoenebeck
  2021-05-27 17:04 ` [PATCH v2 3/7] 9pfs: capture root stat Christian Schoenebeck
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Christian Schoenebeck @ 2021-05-27 17:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

This patch is not intended to be merged; it just acts as a performance
A/B comparison benchmark for the subsequent patches.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 tests/qtest/virtio-9p-test.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/tests/qtest/virtio-9p-test.c b/tests/qtest/virtio-9p-test.c
index 41fed41de1..2cd9e427b4 100644
--- a/tests/qtest/virtio-9p-test.c
+++ b/tests/qtest/virtio-9p-test.c
@@ -20,6 +20,18 @@
 #include "libqos/virtio-9p.h"
 #include "libqos/qgraph.h"
 
+/*
+ * to benchmark the real time (not CPU time) that elapsed between start of
+ * a request and arrival of its response
+ */
+static double wall_time(void)
+{
+    struct timeval t;
+    struct timezone tz;
+    gettimeofday(&t, &tz);
+    return t.tv_sec + t.tv_usec * 0.000001;
+}
+
 #define QVIRTIO_9P_TIMEOUT_US (10 * 1000 * 1000)
 static QGuestAllocator *alloc;
 
@@ -646,12 +658,30 @@ static void fs_walk(void *obj, void *data, QGuestAllocator *t_alloc)
     }
 
     do_attach(v9p);
+    const double start = wall_time();
     req = v9fs_twalk(v9p, 0, 1, P9_MAXWELEM, wnames, 0);
+    const double twalk = wall_time();
     v9fs_req_wait_for_reply(req, NULL);
+    const double waitforreply = wall_time();
     v9fs_rwalk(req, &nwqid, &wqid);
+    const double end = wall_time();
 
     g_assert_cmpint(nwqid, ==, P9_MAXWELEM);
 
+    printf("\nTime client spent on sending Twalk: %fs\n\n",
+           twalk - start);
+
+    printf("Time client spent for waiting for reply from server: %fs "
+           "[MOST IMPORTANT]\n", waitforreply - start);
+    printf("(This is the most important value, because it reflects the time\n"
+           "the 9p server required to process and return the result of the\n"
+           "Twalk request.)\n\n");
+
+    printf("Total client time: %fs\n", end - start);
+
+    //printf("Details of response message data: R_readddir nentries=%d "
+    //       "rbytes=%d\n", nentries, count);
+
     for (i = 0; i < P9_MAXWELEM; i++) {
         g_free(wnames[i]);
     }
-- 
2.20.1




* [PATCH v2 3/7] 9pfs: capture root stat
  2021-06-04 15:48 [PATCH v2 0/7] 9pfs: Twalk optimization Christian Schoenebeck
  2021-05-27 17:03 ` [PATCH v2 2/7] 9pfs: Twalk benchmark Christian Schoenebeck
@ 2021-05-27 17:04 ` Christian Schoenebeck
  2021-06-04 14:46 ` [PATCH v2 1/7] 9pfs: fix not_same_qid() Christian Schoenebeck
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Christian Schoenebeck @ 2021-05-27 17:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

We already capture the QID of the exported 9p root path, i.e. to
prevent client access outside the defined, exported filesystem's tree.
This is currently checked by comparing the root QID with another FID's
QID.

The problem with the latter is that resolving the QID of any given 9p
path may only be done on the 9p server's main thread, because doing so
might mutate the server's state if inode remapping is enabled.

For that reason, also capture the POSIX stat info of the root path, so
that any (e.g. worker) thread can identify whether an arbitrarily
given path is identical to the export root.
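
For illustration, a minimal sketch of the identity check this enables
(the helper name is illustrative only; a later patch in this series
introduces the actual same_stat_id() helper):

    /* works on any thread: no QID resolution, just a dev/inode compare */
    static bool is_export_root(const V9fsState *s, const struct stat *stbuf)
    {
        return s->root_st.st_dev == stbuf->st_dev &&
               s->root_st.st_ino == stbuf->st_ino;
    }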

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p.c | 10 +++++++++-
 hw/9pfs/9p.h |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index e10a02f71d..eb15ec2082 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1400,6 +1400,7 @@ static void coroutine_fn v9fs_attach(void *opaque)
     size_t offset = 7;
     V9fsQID qid;
     ssize_t err;
+    struct stat stbuf;
 
     v9fs_string_init(&uname);
     v9fs_string_init(&aname);
@@ -1422,7 +1423,13 @@ static void coroutine_fn v9fs_attach(void *opaque)
         clunk_fid(s, fid);
         goto out;
     }
-    err = fid_to_qid(pdu, fidp, &qid);
+    err = v9fs_co_lstat(pdu, &fidp->path, &stbuf);
+    if (err < 0) {
+        err = -EINVAL;
+        clunk_fid(s, fid);
+        goto out;
+    }
+    err = stat_to_qid(pdu, &stbuf, &qid);
     if (err < 0) {
         err = -EINVAL;
         clunk_fid(s, fid);
@@ -1455,6 +1462,7 @@ static void coroutine_fn v9fs_attach(void *opaque)
     err += offset;
 
     memcpy(&s->root_qid, &qid, sizeof(qid));
+    memcpy(&s->root_st, &stbuf, sizeof(stbuf));
     trace_v9fs_attach_return(pdu->tag, pdu->id,
                              qid.type, qid.version, qid.path);
 out:
diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 00381591ff..6f0b4c78c0 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -356,6 +356,7 @@ struct V9fsState {
     Error *migration_blocker;
     V9fsConf fsconf;
     V9fsQID root_qid;
+    struct stat root_st;
     dev_t dev_id;
     struct qht qpd_table;
     struct qht qpp_table;
-- 
2.20.1




* [PATCH v2 1/7] 9pfs: fix not_same_qid()
  2021-06-04 15:48 [PATCH v2 0/7] 9pfs: Twalk optimization Christian Schoenebeck
  2021-05-27 17:03 ` [PATCH v2 2/7] 9pfs: Twalk benchmark Christian Schoenebeck
  2021-05-27 17:04 ` [PATCH v2 3/7] 9pfs: capture root stat Christian Schoenebeck
@ 2021-06-04 14:46 ` Christian Schoenebeck
  2021-06-04 15:13 ` [PATCH v2 4/7] 9pfs: drop fid_to_qid() Christian Schoenebeck
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Christian Schoenebeck @ 2021-06-04 14:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

There is only one user of not_same_qid(), namely v9fs_walk(), which
uses it to compare a client-supplied path with the 9p export root
path, for the sole purpose of preventing a Twalk request from escaping
the exported 9p tree via "..".

However, for that specific purpose the implementation of not_same_qid()
is wrong: the QID version member is derived from the file's mtime, so
if the mtime of the 9p export root path changed between Tattach and
Twalk, then not_same_qid() returns true even when actually comparing
against the export root path (see the sketch below).

To match the semantics actually intended, compare only the QID path
members, not the version or type members.
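
For reference, a rough sketch of why the version member is unstable
(assuming stat_to_qid()'s QID derivation without inode remapping; the
exact expression may differ between QEMU versions):

    /* stat_to_qid() derives the QID version from the file's mtime: */
    qidp->version = stbuf->st_mtime ^ (stbuf->st_size << 8);
    /* so any modification of the export root changes its QID version */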

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 89aa07db78..e10a02f71d 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1707,10 +1707,7 @@ static bool name_is_illegal(const char *name)
 
 static bool not_same_qid(const V9fsQID *qid1, const V9fsQID *qid2)
 {
-    return
-        qid1->type != qid2->type ||
-        qid1->version != qid2->version ||
-        qid1->path != qid2->path;
+    return qid1->path != qid2->path;
 }
 
 static void coroutine_fn v9fs_walk(void *opaque)
-- 
2.20.1




* [PATCH v2 4/7] 9pfs: drop fid_to_qid()
  2021-06-04 15:48 [PATCH v2 0/7] 9pfs: Twalk optimization Christian Schoenebeck
                   ` (2 preceding siblings ...)
  2021-06-04 14:46 ` [PATCH v2 1/7] 9pfs: fix not_same_qid() Christian Schoenebeck
@ 2021-06-04 15:13 ` Christian Schoenebeck
  2021-06-04 15:21 ` [PATCH v2 5/7] 9pfs: replace not_same_qid() by same_stat_id() Christian Schoenebeck
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Christian Schoenebeck @ 2021-06-04 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

There is only one user of fid_to_qid(), namely v9fs_walk(). Let's
open-code fid_to_qid() directly within v9fs_walk(), because
fid_to_qid() hides the POSIX stat buffer, which we are going to need
in the subsequent patch.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p.c | 23 +++++------------------
 1 file changed, 5 insertions(+), 18 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index eb15ec2082..0e3857798d 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -971,23 +971,6 @@ static int stat_to_qid(V9fsPDU *pdu, const struct stat *stbuf, V9fsQID *qidp)
     return 0;
 }
 
-static int coroutine_fn fid_to_qid(V9fsPDU *pdu, V9fsFidState *fidp,
-                                   V9fsQID *qidp)
-{
-    struct stat stbuf;
-    int err;
-
-    err = v9fs_co_lstat(pdu, &fidp->path, &stbuf);
-    if (err < 0) {
-        return err;
-    }
-    err = stat_to_qid(pdu, &stbuf, qidp);
-    if (err < 0) {
-        return err;
-    }
-    return 0;
-}
-
 V9fsPDU *pdu_alloc(V9fsState *s)
 {
     V9fsPDU *pdu = NULL;
@@ -1772,7 +1755,11 @@ static void coroutine_fn v9fs_walk(void *opaque)
     v9fs_path_init(&dpath);
     v9fs_path_init(&path);
 
-    err = fid_to_qid(pdu, fidp, &qid);
+    err = v9fs_co_lstat(pdu, &fidp->path, &stbuf);
+    if (err < 0) {
+        goto out;
+    }
+    err = stat_to_qid(pdu, &stbuf, &qid);
     if (err < 0) {
         goto out;
     }
-- 
2.20.1




* [PATCH v2 5/7] 9pfs: replace not_same_qid() by same_stat_id()
  2021-06-04 15:48 [PATCH v2 0/7] 9pfs: Twalk optimization Christian Schoenebeck
                   ` (3 preceding siblings ...)
  2021-06-04 15:13 ` [PATCH v2 4/7] 9pfs: drop fid_to_qid() Christian Schoenebeck
@ 2021-06-04 15:21 ` Christian Schoenebeck
  2021-06-04 15:32 ` [PATCH v2 6/7] 9pfs: drop root_qid Christian Schoenebeck
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Christian Schoenebeck @ 2021-06-04 15:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

As we are actually only comparing the filesystem ID (i.e. the device
number and inode number pair), let's use the POSIX stat buffer instead
of QIDs, because QIDs may only be resolved on the 9p server's main
thread, as resolving them might mutate the server state if inode
remapping is enabled.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 0e3857798d..47b000d3a9 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1696,9 +1696,9 @@ static bool name_is_illegal(const char *name)
     return !*name || strchr(name, '/') != NULL;
 }
 
-static bool not_same_qid(const V9fsQID *qid1, const V9fsQID *qid2)
+static bool same_stat_id(const struct stat *a, const struct stat *b)
 {
-    return qid1->path != qid2->path;
+    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
 }
 
 static void coroutine_fn v9fs_walk(void *opaque)
@@ -1771,7 +1771,7 @@ static void coroutine_fn v9fs_walk(void *opaque)
     v9fs_path_copy(&dpath, &fidp->path);
     v9fs_path_copy(&path, &fidp->path);
     for (name_idx = 0; name_idx < nwnames; name_idx++) {
-        if (not_same_qid(&pdu->s->root_qid, &qid) ||
+        if (!same_stat_id(&pdu->s->root_st, &stbuf) ||
             strcmp("..", wnames[name_idx].data)) {
             err = v9fs_co_name_to_path(pdu, &dpath, wnames[name_idx].data,
                                        &path);
-- 
2.20.1




* [PATCH v2 6/7] 9pfs: drop root_qid
  2021-06-04 15:48 [PATCH v2 0/7] 9pfs: Twalk optimization Christian Schoenebeck
                   ` (4 preceding siblings ...)
  2021-06-04 15:21 ` [PATCH v2 5/7] 9pfs: replace not_same_qid() by same_stat_id() Christian Schoenebeck
@ 2021-06-04 15:32 ` Christian Schoenebeck
  2021-06-04 15:38 ` [PATCH v2 7/7] 9pfs: reduce latency of Twalk Christian Schoenebeck
  2021-06-04 16:31 ` [PATCH v2 0/7] 9pfs: Twalk optimization Greg Kurz
  7 siblings, 0 replies; 14+ messages in thread
From: Christian Schoenebeck @ 2021-06-04 15:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

There is no longer a user of root_qid, so drop it.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p.c | 1 -
 hw/9pfs/9p.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 47b000d3a9..7be07f2d68 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1444,7 +1444,6 @@ static void coroutine_fn v9fs_attach(void *opaque)
     }
     err += offset;
 
-    memcpy(&s->root_qid, &qid, sizeof(qid));
     memcpy(&s->root_st, &stbuf, sizeof(stbuf));
     trace_v9fs_attach_return(pdu->tag, pdu->id,
                              qid.type, qid.version, qid.path);
diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 6f0b4c78c0..1567b67841 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -355,7 +355,6 @@ struct V9fsState {
     int32_t root_fid;
     Error *migration_blocker;
     V9fsConf fsconf;
-    V9fsQID root_qid;
     struct stat root_st;
     dev_t dev_id;
     struct qht qpd_table;
-- 
2.20.1




* [PATCH v2 7/7] 9pfs: reduce latency of Twalk
  2021-06-04 15:48 [PATCH v2 0/7] 9pfs: Twalk optimization Christian Schoenebeck
                   ` (5 preceding siblings ...)
  2021-06-04 15:32 ` [PATCH v2 6/7] 9pfs: drop root_qid Christian Schoenebeck
@ 2021-06-04 15:38 ` Christian Schoenebeck
  2021-07-02 14:36   ` Greg Kurz
  2021-06-04 16:31 ` [PATCH v2 0/7] 9pfs: Twalk optimization Greg Kurz
  7 siblings, 1 reply; 14+ messages in thread
From: Christian Schoenebeck @ 2021-06-04 15:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

As with the previous performance optimization on Treaddir handling:
reduce the overall latency, i.e. the overall time spent on processing
a Twalk request, by reducing the amount of thread hops between the
9p server's main thread and fs worker thread(s).

In fact this patch even reduces the thread hops for Twalk handling
to its theoretical minimum of exactly 2 thread hops:

main thread -> fs worker thread -> main thread

This is achieved by doing all the required fs driver tasks together
in a single v9fs_co_run_in_worker({ ... }); code block.
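
A minimal sketch of the resulting control flow (condensed for
illustration; v9fs_co_run_in_worker() is the existing macro from
hw/9pfs/coth.h):

    v9fs_co_run_in_worker({
        /* hop 1: this entire block runs on an fs worker thread */
        err = s->ops->lstat(&s->ctx, &dpath, &fidst);
        /* ... all name_to_path()/lstat() calls of the walk ... */
    });
    /* hop 2: back on the 9p server's main thread */
    err = stat_to_qid(pdu, &fidst, &qid);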

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 hw/9pfs/9p.c | 89 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 70 insertions(+), 19 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 7be07f2d68..2815257f42 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1705,9 +1705,9 @@ static void coroutine_fn v9fs_walk(void *opaque)
     int name_idx;
     V9fsQID *qids = NULL;
     int i, err = 0;
-    V9fsPath dpath, path;
+    V9fsPath dpath, path, *pathes = NULL;
     uint16_t nwnames;
-    struct stat stbuf;
+    struct stat stbuf, fidst, *stbufs = NULL;
     size_t offset = 7;
     int32_t fid, newfid;
     V9fsString *wnames = NULL;
@@ -1733,6 +1733,8 @@ static void coroutine_fn v9fs_walk(void *opaque)
     if (nwnames) {
         wnames = g_new0(V9fsString, nwnames);
         qids   = g_new0(V9fsQID, nwnames);
+        stbufs = g_new0(struct stat, nwnames);
+        pathes = g_new0(V9fsPath, nwnames);
         for (i = 0; i < nwnames; i++) {
             err = pdu_unmarshal(pdu, offset, "s", &wnames[i]);
             if (err < 0) {
@@ -1753,39 +1755,85 @@ static void coroutine_fn v9fs_walk(void *opaque)
 
     v9fs_path_init(&dpath);
     v9fs_path_init(&path);
+    /*
+     * Both dpath and path initially point to fidp.
+     * Needed to handle request with nwnames == 0
+     */
+    v9fs_path_copy(&dpath, &fidp->path);
+    v9fs_path_copy(&path, &fidp->path);
 
-    err = v9fs_co_lstat(pdu, &fidp->path, &stbuf);
+    /*
+     * To keep latency (i.e. overall execution time for processing this
+     * Twalk client request) as small as possible, run all the required fs
+     * driver code altogether inside the following block.
+     */
+    v9fs_co_run_in_worker({
+        if (v9fs_request_cancelled(pdu)) {
+            err = -EINTR;
+            break;
+        }
+        err = s->ops->lstat(&s->ctx, &dpath, &fidst);
+        if (err < 0) {
+            err = -errno;
+            break;
+        }
+        stbuf = fidst;
+        for (name_idx = 0; name_idx < nwnames; name_idx++) {
+            if (v9fs_request_cancelled(pdu)) {
+                err = -EINTR;
+                break;
+            }
+            if (!same_stat_id(&pdu->s->root_st, &stbuf) ||
+                strcmp("..", wnames[name_idx].data))
+            {
+                err = s->ops->name_to_path(&s->ctx, &dpath,
+                                        wnames[name_idx].data, &path);
+                if (err < 0) {
+                    err = -errno;
+                    break;
+                }
+                if (v9fs_request_cancelled(pdu)) {
+                    err = -EINTR;
+                    break;
+                }
+                err = s->ops->lstat(&s->ctx, &path, &stbuf);
+                if (err < 0) {
+                    err = -errno;
+                    break;
+                }
+                stbufs[name_idx] = stbuf;
+                v9fs_path_copy(&dpath, &path);
+                v9fs_path_copy(&pathes[name_idx], &path);
+            }
+        }
+    });
+    /*
+     * Handle all the rest of this Twalk request on main thread ...
+     */
     if (err < 0) {
         goto out;
     }
-    err = stat_to_qid(pdu, &stbuf, &qid);
+
+    err = stat_to_qid(pdu, &fidst, &qid);
     if (err < 0) {
         goto out;
     }
+    stbuf = fidst;
 
-    /*
-     * Both dpath and path initially poin to fidp.
-     * Needed to handle request with nwnames == 0
-     */
+    /* reset dpath and path */
     v9fs_path_copy(&dpath, &fidp->path);
     v9fs_path_copy(&path, &fidp->path);
+
     for (name_idx = 0; name_idx < nwnames; name_idx++) {
         if (!same_stat_id(&pdu->s->root_st, &stbuf) ||
-            strcmp("..", wnames[name_idx].data)) {
-            err = v9fs_co_name_to_path(pdu, &dpath, wnames[name_idx].data,
-                                       &path);
-            if (err < 0) {
-                goto out;
-            }
-
-            err = v9fs_co_lstat(pdu, &path, &stbuf);
-            if (err < 0) {
-                goto out;
-            }
+            strcmp("..", wnames[name_idx].data))
+        {
+            stbuf = stbufs[name_idx];
             err = stat_to_qid(pdu, &stbuf, &qid);
             if (err < 0) {
                 goto out;
             }
+            v9fs_path_copy(&path, &pathes[name_idx]);
             v9fs_path_copy(&dpath, &path);
         }
         memcpy(&qids[name_idx], &qid, sizeof(qid));
@@ -1821,9 +1869,12 @@ out_nofid:
     if (nwnames && nwnames <= P9_MAXWELEM) {
         for (name_idx = 0; name_idx < nwnames; name_idx++) {
             v9fs_string_free(&wnames[name_idx]);
+            v9fs_path_free(&pathes[name_idx]);
         }
         g_free(wnames);
         g_free(qids);
+        g_free(stbufs);
+        g_free(pathes);
     }
 }
 
-- 
2.20.1




* [PATCH v2 0/7] 9pfs: Twalk optimization
@ 2021-06-04 15:48 Christian Schoenebeck
  2021-05-27 17:03 ` [PATCH v2 2/7] 9pfs: Twalk benchmark Christian Schoenebeck
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Christian Schoenebeck @ 2021-06-04 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

This patch set reduces the latency of Twalk handling by reducing the
amount of thread hops, similar to the previous Treaddir optimization.

The performance gain of this patch set is not as spectacular as previously
with Treaddir, but there is definitely a measurable difference.

The actual performance optimization is patch 7. With the benchmark of
patch 2, the runtime of the Twalk test was cut in half. In real-world
tests (i.e. running an entire guest OS, and hence a mix with all other
kinds of 9p requests) I measured a performance gain of about 2%;
again, that was measured in a mix, not concentrated on Twalk alone.

v1 -> v2:

  * Added a separate [NEW PATCH 1] for fixing not_same_qid() semantics.

  * Code style: use sizeof(var-name) instead of sizeof(type-name)
    [PATCH 3].

  * Split out dropping fid_to_qid() into separate [PATCH 4].

  * Split out replacing not_same_qid() by same_stat_id() into separate
    [PATCH 5].

  * Split out dropping root_qid into separate [PATCH 6].

Christian Schoenebeck (7):
  9pfs: fix not_same_qid()
  9pfs: Twalk benchmark
  9pfs: capture root stat
  9pfs: drop fid_to_qid()
  9pfs: replace not_same_qid() by same_stat_id()
  9pfs: drop root_qid
  9pfs: reduce latency of Twalk

 hw/9pfs/9p.c                 | 128 +++++++++++++++++++++++------------
 hw/9pfs/9p.h                 |   2 +-
 tests/qtest/virtio-9p-test.c |  30 ++++++++
 3 files changed, 116 insertions(+), 44 deletions(-)

-- 
2.20.1




* Re: [PATCH v2 0/7] 9pfs: Twalk optimization
  2021-06-04 15:48 [PATCH v2 0/7] 9pfs: Twalk optimization Christian Schoenebeck
                   ` (6 preceding siblings ...)
  2021-06-04 15:38 ` [PATCH v2 7/7] 9pfs: reduce latency of Twalk Christian Schoenebeck
@ 2021-06-04 16:31 ` Greg Kurz
  2021-06-04 18:23   ` Christian Schoenebeck
  7 siblings, 1 reply; 14+ messages in thread
From: Greg Kurz @ 2021-06-04 16:31 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Fri, 4 Jun 2021 17:48:49 +0200
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> This patch set reduces latency of Twalk handling by reducing the amount of
> thread hops, similar to previous Treaddir optimization.
> 
> The performance gain of this patch set is not as spectacular as previously
> with Treaddir, but there is definitely a measurable difference.
> 
> The actual performance optimization is patch 7. With the benchmark of
> patch 2, the runtime of the Twalk test was cut in half. In real world tests
> I measured a performance gain (i.e. running an entire guest OS, and hence
> mixed with all other kinds of 9p requests) of about 2%, again measured in a
> mix, not concentrated on Twalk alone.
> 
> v1 -> v2:
> 
>   * Added a separate [NEW PATCH 1] for fixing not_same_qid() semantics.
> 

Strangely, patch 1 appears between patches 3 and 4 in my email client.

>   * Code style: use sizeof(var-name) instead of sizeof(type-name)
>     [PATCH 3].
> 
>   * Split out dropping fid_to_qid() into separate [PATCH 4].
> 
>   * Split out replacing not_same_qid() by same_stat_id() into separate
>     [PATCH 5].
> 
>   * Split out dropping root_qid into separate [PATCH 6].
> 

I could have a look at all the patches except the last one. LGTM.
You can add my R-b for patches 1 and 3 to 6.

Reviewed-by: Greg Kurz <groug@kaod.org>

I'll try to find some time for patch 7 next week.

> Christian Schoenebeck (7):
>   9pfs: fix not_same_qid()
>   9pfs: Twalk benchmark
>   9pfs: capture root stat
>   9pfs: drop fid_to_qid()
>   9pfs: replace not_same_qid() by same_stat_id()
>   9pfs: drop root_qid
>   9pfs: reduce latency of Twalk
> 
>  hw/9pfs/9p.c                 | 128 +++++++++++++++++++++++------------
>  hw/9pfs/9p.h                 |   2 +-
>  tests/qtest/virtio-9p-test.c |  30 ++++++++
>  3 files changed, 116 insertions(+), 44 deletions(-)
> 




* Re: [PATCH v2 0/7] 9pfs: Twalk optimization
  2021-06-04 16:31 ` [PATCH v2 0/7] 9pfs: Twalk optimization Greg Kurz
@ 2021-06-04 18:23   ` Christian Schoenebeck
  2021-06-28 10:20     ` Christian Schoenebeck
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Schoenebeck @ 2021-06-04 18:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

On Friday, 4 June 2021 18:31:28 CEST Greg Kurz wrote:
> On Fri, 4 Jun 2021 17:48:49 +0200
> 
> Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > This patch set reduces latency of Twalk handling by reducing the amount of
> > thread hops, similar to previous Treaddir optimization.
> > 
> > The performance gain of this patch set is not as spectacular as previously
> > with Treaddir, but there is definitely a measurable difference.
> > 
> > The actual performance optimization is patch 7. With the benchmark of
> > patch 2, the runtime of the Twalk test was cut in half. In real world
> > tests
> > I measured a performance gain (i.e. running an entire guest OS, and hence
> > mixed with all other kinds of 9p requests) of about 2%, again measured in
> > a
> > mix, not concentrated on Twalk alone.
> > 
> > v1 -> v2:
> >   * Added a separate [NEW PATCH 1] for fixing not_same_qid() semantics.
> 
> Strangely, patch 1 appears between patch 3 and 4 in my email client.

My bad, I forgot to take care of the timestamps of the individual patches
this time.

> >   * Code style: use sizeof(var-name) instead of sizeof(type-name)
> >   
> >     [PATCH 3].
> >   
> >   * Split out dropping fid_to_qid() into separate [PATCH 4].
> >   
> >   * Split out replacing not_same_qid() by same_stat_id() into separate
> >   
> >     [PATCH 5].
> >   
> >   * Split out dropping root_qid into separate [PATCH 6].
> 
> I could have a look at all the patches except the last one. LGTM.
> You can add my R-b for patches 1 and 3 to 6.
> 
> Reviewed-by: Greg Kurz <groug@kaod.org>

Queued patches 1, 3..6 on 9p.next:

https://github.com/cschoenebeck/qemu/commits/9p.next

Thanks!

> I'll try to find some time for patch 7 next week.
> 

Twalk can be tricky, so no hurry! It takes what it takes.

> > Christian Schoenebeck (7):
> >   9pfs: fix not_same_qid()
> >   9pfs: Twalk benchmark
> >   9pfs: capture root stat
> >   9pfs: drop fid_to_qid()
> >   9pfs: replace not_same_qid() by same_stat_id()
> >   9pfs: drop root_qid
> >   9pfs: reduce latency of Twalk
> >  
> >  hw/9pfs/9p.c                 | 128 +++++++++++++++++++++++------------
> >  hw/9pfs/9p.h                 |   2 +-
> >  tests/qtest/virtio-9p-test.c |  30 ++++++++
> >  3 files changed, 116 insertions(+), 44 deletions(-)

Best regards,
Christian Schoenebeck





* Re: [PATCH v2 0/7] 9pfs: Twalk optimization
  2021-06-04 18:23   ` Christian Schoenebeck
@ 2021-06-28 10:20     ` Christian Schoenebeck
  0 siblings, 0 replies; 14+ messages in thread
From: Christian Schoenebeck @ 2021-06-28 10:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

On Friday, 4 June 2021 20:23:21 CEST Christian Schoenebeck wrote:
> On Friday, 4 June 2021 18:31:28 CEST Greg Kurz wrote:
> > On Fri, 4 Jun 2021 17:48:49 +0200
> > 
> > Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > This patch set reduces latency of Twalk handling by reducing the amount
> > > of
> > > thread hops, similar to previous Treaddir optimization.
> > > 
> > > The performance gain of this patch set is not as spectacular as
> > > previously
> > > with Treaddir, but there is definitely a measurable difference.
> > > 
> > > The actual performance optimization is patch 7. With the benchmark of
> > > patch 2, the runtime of the Twalk test was cut in half. In real world
> > > tests
> > > I measured a performance gain (i.e. running an entire guest OS, and
> > > hence
> > > mixed with all other kinds of 9p requests) of about 2%, again measured
> > > in
> > > a
> > > mix, not concentrated on Twalk alone.
> > > 
> > > v1 -> v2:
> > >   * Added a separate [NEW PATCH 1] for fixing not_same_qid() semantics.
> > 
> > Strangely, patch 1 appears between patch 3 and 4 in my email client.
> 
> My bad, I forgot to take care of the timestamps of the individual
> patches this time.
> 
> > >   * Code style: use sizeof(var-name) instead of sizeof(type-name)
> > >   
> > >     [PATCH 3].
> > >   
> > >   * Split out dropping fid_to_qid() into separate [PATCH 4].
> > >   
> > >   * Split out replacing not_same_qid() by same_stat_id() into separate
> > >   
> > >     [PATCH 5].
> > >   
> > >   * Split out dropping root_qid into separate [PATCH 6].
> > 
> > I could have a look at all the patches except the last one. LGTM.
> > You can add my R-b for patches 1 and 3 to 6.
> > 
> > Reviewed-by: Greg Kurz <groug@kaod.org>
> 
> Queued patches 1, 3..6 on 9p.next:
> 
> https://github.com/cschoenebeck/qemu/commits/9p.next
> 
> Thanks!
> 
> > I'll try to find some time for patch 7 next week.
> 
> Twalk can be tricky, so no hurry! It takes what it takes.
> 
> > > Christian Schoenebeck (7):
> > >   9pfs: fix not_same_qid()
> > >   9pfs: Twalk benchmark
> > >   9pfs: capture root stat
> > >   9pfs: drop fid_to_qid()
> > >   9pfs: replace not_same_qid() by same_stat_id()
> > >   9pfs: drop root_qid
> > >   9pfs: reduce latency of Twalk
> > >  
> > >  hw/9pfs/9p.c                 | 128 +++++++++++++++++++++++------------
> > >  hw/9pfs/9p.h                 |   2 +-
> > >  tests/qtest/virtio-9p-test.c |  30 ++++++++
> > >  3 files changed, 116 insertions(+), 44 deletions(-)

Ping

How is your schedule, Greg? Any chance for a time slice on this one?

Best regards,
Christian Schoenebeck





* Re: [PATCH v2 7/7] 9pfs: reduce latency of Twalk
  2021-06-04 15:38 ` [PATCH v2 7/7] 9pfs: reduce latency of Twalk Christian Schoenebeck
@ 2021-07-02 14:36   ` Greg Kurz
  2021-07-02 15:05     ` Christian Schoenebeck
  0 siblings, 1 reply; 14+ messages in thread
From: Greg Kurz @ 2021-07-02 14:36 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Fri, 4 Jun 2021 17:38:31 +0200
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> As with previous performance optimization on Treaddir handling;
> reduce the overall latency, i.e. overall time spent on processing
> a Twalk request by reducing the amount of thread hops between the
> 9p server's main thread and fs worker thread(s).
> 
> In fact this patch even reduces the thread hops for Twalk handling
> to its theoretical minimum of exactly 2 thread hops:
> 
> main thread -> fs worker thread -> main thread
> 
> This is achieved by doing all the required fs driver tasks altogether
> in a single v9fs_co_run_in_worker({ ... }); code block.
> 
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---
>  hw/9pfs/9p.c | 89 +++++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 70 insertions(+), 19 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 7be07f2d68..2815257f42 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -1705,9 +1705,9 @@ static void coroutine_fn v9fs_walk(void *opaque)
>      int name_idx;
>      V9fsQID *qids = NULL;
>      int i, err = 0;
> -    V9fsPath dpath, path;
> +    V9fsPath dpath, path, *pathes = NULL;
>      uint16_t nwnames;
> -    struct stat stbuf;
> +    struct stat stbuf, fidst, *stbufs = NULL;
>      size_t offset = 7;
>      int32_t fid, newfid;
>      V9fsString *wnames = NULL;
> @@ -1733,6 +1733,8 @@ static void coroutine_fn v9fs_walk(void *opaque)
>      if (nwnames) {
>          wnames = g_new0(V9fsString, nwnames);
>          qids   = g_new0(V9fsQID, nwnames);
> +        stbufs = g_new0(struct stat, nwnames);
> +        pathes = g_new0(V9fsPath, nwnames);
>          for (i = 0; i < nwnames; i++) {
>              err = pdu_unmarshal(pdu, offset, "s", &wnames[i]);
>              if (err < 0) {
> @@ -1753,39 +1755,85 @@ static void coroutine_fn v9fs_walk(void *opaque)
>  
>      v9fs_path_init(&dpath);
>      v9fs_path_init(&path);
> +    /*
> +     * Both dpath and path initially point to fidp.
> +     * Needed to handle request with nwnames == 0
> +     */
> +    v9fs_path_copy(&dpath, &fidp->path);
> +    v9fs_path_copy(&path, &fidp->path);
>  
> -    err = v9fs_co_lstat(pdu, &fidp->path, &stbuf);
> +    /*
> +     * To keep latency (i.e. overall execution time for processing this
> +     * Twalk client request) as small as possible, run all the required fs
> +     * driver code altogether inside the following block.
> +     */
> +    v9fs_co_run_in_worker({
> +        if (v9fs_request_cancelled(pdu)) {
> +            err = -EINTR;
> +            break;
> +        }
> +        err = s->ops->lstat(&s->ctx, &dpath, &fidst);
> +        if (err < 0) {
> +            err = -errno;
> +            break;
> +        }
> +        stbuf = fidst;
> +        for (name_idx = 0; name_idx < nwnames; name_idx++) {
> +            if (v9fs_request_cancelled(pdu)) {
> +                err = -EINTR;
> +                break;
> +            }
> +            if (!same_stat_id(&pdu->s->root_st, &stbuf) ||
> +                strcmp("..", wnames[name_idx].data))
> +            {
> +                err = s->ops->name_to_path(&s->ctx, &dpath,
> +                                        wnames[name_idx].data, &path);

It seems you could pass &pathes[name_idx] instead of &path and...

> +                if (err < 0) {
> +                    err = -errno;
> +                    break;
> +                }
> +                if (v9fs_request_cancelled(pdu)) {
> +                    err = -EINTR;
> +                    break;
> +                }
> +                err = s->ops->lstat(&s->ctx, &path, &stbuf);
> +                if (err < 0) {
> +                    err = -errno;
> +                    break;
> +                }
> +                stbufs[name_idx] = stbuf;
> +                v9fs_path_copy(&dpath, &path);
> +                v9fs_path_copy(&pathes[name_idx], &path);

... avoid a copy.

Also, I believe the path -> dpath could be avoided as well in
the existing code, but this is a separate cleanup.
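
Untested sketch of the suggested variant (writing straight into
pathes[name_idx] and dropping the extra copy):

    err = s->ops->name_to_path(&s->ctx, &dpath,
                               wnames[name_idx].data, &pathes[name_idx]);
    /* ... same errno / cancellation handling as in the patch ... */
    err = s->ops->lstat(&s->ctx, &pathes[name_idx], &stbuf);
    if (err < 0) {
        err = -errno;
        break;
    }
    stbufs[name_idx] = stbuf;
    v9fs_path_copy(&dpath, &pathes[name_idx]);
    /* ... and the v9fs_path_copy(&pathes[name_idx], &path) goes away */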

> +            }
> +        }
> +    });
> +    /*
> +     * Handle all the rest of this Twalk request on main thread ...
> +     */
>      if (err < 0) {
>          goto out;
>      }
> -    err = stat_to_qid(pdu, &stbuf, &qid);
> +
> +    err = stat_to_qid(pdu, &fidst, &qid);
>      if (err < 0) {
>          goto out;
>      }
> +    stbuf = fidst;
>  
> -    /*
> -     * Both dpath and path initially poin to fidp.
> -     * Needed to handle request with nwnames == 0
> -     */
> +    /* reset dpath and path */
>      v9fs_path_copy(&dpath, &fidp->path);
>      v9fs_path_copy(&path, &fidp->path);
> +
>      for (name_idx = 0; name_idx < nwnames; name_idx++) {
>          if (!same_stat_id(&pdu->s->root_st, &stbuf) ||
> -            strcmp("..", wnames[name_idx].data)) {
> -            err = v9fs_co_name_to_path(pdu, &dpath, wnames[name_idx].data,
> -                                       &path);
> -            if (err < 0) {
> -                goto out;
> -            }
> -
> -            err = v9fs_co_lstat(pdu, &path, &stbuf);
> -            if (err < 0) {
> -                goto out;
> -            }
> +            strcmp("..", wnames[name_idx].data))
> +        {
> +            stbuf = stbufs[name_idx];
>              err = stat_to_qid(pdu, &stbuf, &qid);
>              if (err < 0) {
>                  goto out;
>              }
> +            v9fs_path_copy(&path, &pathes[name_idx]);
>              v9fs_path_copy(&dpath, &path);
>          }
>          memcpy(&qids[name_idx], &qid, sizeof(qid));
> @@ -1821,9 +1869,12 @@ out_nofid:
>      if (nwnames && nwnames <= P9_MAXWELEM) {
>          for (name_idx = 0; name_idx < nwnames; name_idx++) {
>              v9fs_string_free(&wnames[name_idx]);
> +            v9fs_path_free(&pathes[name_idx]);
>          }
>          g_free(wnames);
>          g_free(qids);
> +        g_free(stbufs);
> +        g_free(pathes);

All of these guys should be converted to g_autofree. Separate cleanup
again.
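
I.e. something along these lines (sketch; g_autofree only replaces the
trailing g_free() calls, the per-element v9fs_string_free() and
v9fs_path_free() calls in the loop are still needed):

    g_autofree V9fsString *wnames = NULL;
    g_autofree V9fsQID *qids = NULL;
    g_autofree struct stat *stbufs = NULL;
    g_autofree V9fsPath *pathes = NULL;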

v9fs_walk() was already a bit hairy and the diffstat definitely
doesn't improve things... this being said, the change makes sense
and I haven't spotted anything wrong, so:

Reviewed-by: Greg Kurz <groug@kaod.org>

Improvements could be to:
- track the previous path with a V9fsPath * instead of copying
  (rough sketch below)
- have a separate path for the nwnames == 0 case
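
Rough sketch of the pointer-tracking idea (untested; prev is a
hypothetical local):

    V9fsPath *prev = &fidp->path;
    for (name_idx = 0; name_idx < nwnames; name_idx++) {
        /* ('..'-at-export-root special case and error handling elided) */
        err = s->ops->name_to_path(&s->ctx, prev,
                                   wnames[name_idx].data,
                                   &pathes[name_idx]);
        prev = &pathes[name_idx];
    }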

>      }
>  }
>  




* Re: [PATCH v2 7/7] 9pfs: reduce latency of Twalk
  2021-07-02 14:36   ` Greg Kurz
@ 2021-07-02 15:05     ` Christian Schoenebeck
  2021-07-02 15:26       ` Greg Kurz
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Schoenebeck @ 2021-07-02 15:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz

On Friday, 2 July 2021 16:36:56 CEST Greg Kurz wrote:
> On Fri, 4 Jun 2021 17:38:31 +0200
> 
> Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > As with previous performance optimization on Treaddir handling;
> > reduce the overall latency, i.e. overall time spent on processing
> > a Twalk request by reducing the amount of thread hops between the
> > 9p server's main thread and fs worker thread(s).
> > 
> > In fact this patch even reduces the thread hops for Twalk handling
> > to its theoretical minimum of exactly 2 thread hops:
> > 
> > main thread -> fs worker thread -> main thread
> > 
> > This is achieved by doing all the required fs driver tasks altogether
> > in a single v9fs_co_run_in_worker({ ... }); code block.
> > 
> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > ---
> > 
> >  hw/9pfs/9p.c | 89 +++++++++++++++++++++++++++++++++++++++++-----------
> >  1 file changed, 70 insertions(+), 19 deletions(-)
> > 
> > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > index 7be07f2d68..2815257f42 100644
> > --- a/hw/9pfs/9p.c
> > +++ b/hw/9pfs/9p.c
> > @@ -1705,9 +1705,9 @@ static void coroutine_fn v9fs_walk(void *opaque)
> > 
> >      int name_idx;
> >      V9fsQID *qids = NULL;
> >      int i, err = 0;
> > 
> > -    V9fsPath dpath, path;
> > +    V9fsPath dpath, path, *pathes = NULL;
> > 
> >      uint16_t nwnames;
> > 
> > -    struct stat stbuf;
> > +    struct stat stbuf, fidst, *stbufs = NULL;
> > 
> >      size_t offset = 7;
> >      int32_t fid, newfid;
> >      V9fsString *wnames = NULL;
> > 
> > @@ -1733,6 +1733,8 @@ static void coroutine_fn v9fs_walk(void *opaque)
> > 
> >      if (nwnames) {
> >      
> >          wnames = g_new0(V9fsString, nwnames);
> >          qids   = g_new0(V9fsQID, nwnames);
> > 
> > +        stbufs = g_new0(struct stat, nwnames);
> > +        pathes = g_new0(V9fsPath, nwnames);
> > 
> >          for (i = 0; i < nwnames; i++) {
> >          
> >              err = pdu_unmarshal(pdu, offset, "s", &wnames[i]);
> >              if (err < 0) {
> > 
> > @@ -1753,39 +1755,85 @@ static void coroutine_fn v9fs_walk(void *opaque)
> > 
> >      v9fs_path_init(&dpath);
> >      v9fs_path_init(&path);
> > 
> > +    /*
> > +     * Both dpath and path initially point to fidp.
> > +     * Needed to handle request with nwnames == 0
> > +     */
> > +    v9fs_path_copy(&dpath, &fidp->path);
> > +    v9fs_path_copy(&path, &fidp->path);
> > 
> > -    err = v9fs_co_lstat(pdu, &fidp->path, &stbuf);
> > +    /*
> > +     * To keep latency (i.e. overall execution time for processing this
> > +     * Twalk client request) as small as possible, run all the required
> > fs
> > +     * driver code altogether inside the following block.
> > +     */
> > +    v9fs_co_run_in_worker({
> > +        if (v9fs_request_cancelled(pdu)) {
> > +            err = -EINTR;
> > +            break;
> > +        }
> > +        err = s->ops->lstat(&s->ctx, &dpath, &fidst);
> > +        if (err < 0) {
> > +            err = -errno;
> > +            break;
> > +        }
> > +        stbuf = fidst;
> > +        for (name_idx = 0; name_idx < nwnames; name_idx++) {
> > +            if (v9fs_request_cancelled(pdu)) {
> > +                err = -EINTR;
> > +                break;
> > +            }
> > +            if (!same_stat_id(&pdu->s->root_st, &stbuf) ||
> > +                strcmp("..", wnames[name_idx].data))
> > +            {
> > +                err = s->ops->name_to_path(&s->ctx, &dpath,
> > +                                        wnames[name_idx].data, &path);
> 
> It seems you could pass &pathes[name_idx] instead of &path and...
> 
> > +                if (err < 0) {
> > +                    err = -errno;
> > +                    break;
> > +                }
> > +                if (v9fs_request_cancelled(pdu)) {
> > +                    err = -EINTR;
> > +                    break;
> > +                }
> > +                err = s->ops->lstat(&s->ctx, &path, &stbuf);
> > +                if (err < 0) {
> > +                    err = -errno;
> > +                    break;
> > +                }
> > +                stbufs[name_idx] = stbuf;
> > +                v9fs_path_copy(&dpath, &path);
> > +                v9fs_path_copy(&pathes[name_idx], &path);
> 
> ... avoid a copy.

It's been a while since I did this, but looks like you are right.

> Also, I believe the path -> dpath could be avoided as well in
> the existing code, but this is a separate cleanup.
> 
> > +            }
> > +        }
> > +    });
> > +    /*
> > +     * Handle all the rest of this Twalk request on main thread ...
> > +     */
> > 
> >      if (err < 0) {
> >      
> >          goto out;
> >      
> >      }
> > 
> > -    err = stat_to_qid(pdu, &stbuf, &qid);
> > +
> > +    err = stat_to_qid(pdu, &fidst, &qid);
> > 
> >      if (err < 0) {
> >      
> >          goto out;
> >      
> >      }
> > 
> > +    stbuf = fidst;
> > 
> > -    /*
> > -     * Both dpath and path initially poin to fidp.
> > -     * Needed to handle request with nwnames == 0
> > -     */
> > +    /* reset dpath and path */
> > 
> >      v9fs_path_copy(&dpath, &fidp->path);
> >      v9fs_path_copy(&path, &fidp->path);
> > 
> > +
> > 
> >      for (name_idx = 0; name_idx < nwnames; name_idx++) {
> >      
> >          if (!same_stat_id(&pdu->s->root_st, &stbuf) ||
> > 
> > -            strcmp("..", wnames[name_idx].data)) {
> > -            err = v9fs_co_name_to_path(pdu, &dpath,
> > wnames[name_idx].data,
> > -                                       &path);
> > -            if (err < 0) {
> > -                goto out;
> > -            }
> > -
> > -            err = v9fs_co_lstat(pdu, &path, &stbuf);
> > -            if (err < 0) {
> > -                goto out;
> > -            }
> > +            strcmp("..", wnames[name_idx].data))
> > +        {
> > +            stbuf = stbufs[name_idx];
> > 
> >              err = stat_to_qid(pdu, &stbuf, &qid);
> >              if (err < 0) {
> >              
> >                  goto out;
> >              
> >              }
> > 
> > +            v9fs_path_copy(&path, &pathes[name_idx]);
> > 
> >              v9fs_path_copy(&dpath, &path);
> >          
> >          }
> >          memcpy(&qids[name_idx], &qid, sizeof(qid));
> > 
> > @@ -1821,9 +1869,12 @@ out_nofid:
> >      if (nwnames && nwnames <= P9_MAXWELEM) {
> >      
> >          for (name_idx = 0; name_idx < nwnames; name_idx++) {
> >          
> >              v9fs_string_free(&wnames[name_idx]);
> > 
> > +            v9fs_path_free(&pathes[name_idx]);
> > 
> >          }
> >          g_free(wnames);
> >          g_free(qids);
> > 
> > +        g_free(stbufs);
> > +        g_free(pathes);
> 
> All of these guys should be converted to g_autofree. Separate cleanup
> again.

Definitely agreed on that.

> v9fs_walk() was already a bit hairy and the diffstat definitely
> doesn't improve things... this being said, the change makes sense
> and I haven't spotted anything wrong, so:
> 
> Reviewed-by: Greg Kurz <groug@kaod.org>
> 
> Improvements could be to :
> - track the previous path with a V9fsPath * instead of copying
> - have a separate path for the nwnames == 0 case
> 
> >      }
> >  
> >  }

Yeah, there is a lot more to do on v9fs_walk(), both cleanup and the
previously (a couple of weeks ago) mentioned protocol fix (i.e. Twalk
should only reply Rerror if there is an error on the very first path
element).

If you don't mind, I'll queue this patch as is for now and prepare a PR
for the current 9p patches in my queue, in order to get them in before
the soft freeze deadline.

Thanks for looking at this, Greg, I appreciate it!

Best regards,
Christian Schoenebeck





* Re: [PATCH v2 7/7] 9pfs: reduce latency of Twalk
  2021-07-02 15:05     ` Christian Schoenebeck
@ 2021-07-02 15:26       ` Greg Kurz
  0 siblings, 0 replies; 14+ messages in thread
From: Greg Kurz @ 2021-07-02 15:26 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: qemu-devel

On Fri, 02 Jul 2021 17:05:32 +0200
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Friday, 2 July 2021 16:36:56 CEST Greg Kurz wrote:
> > On Fri, 4 Jun 2021 17:38:31 +0200
> > 
> > Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > As with previous performance optimization on Treaddir handling;
> > > reduce the overall latency, i.e. overall time spent on processing
> > > a Twalk request by reducing the amount of thread hops between the
> > > 9p server's main thread and fs worker thread(s).
> > > 
> > > In fact this patch even reduces the thread hops for Twalk handling
> > > to its theoretical minimum of exactly 2 thread hops:
> > > 
> > > main thread -> fs worker thread -> main thread
> > > 
> > > This is achieved by doing all the required fs driver tasks altogether
> > > in a single v9fs_co_run_in_worker({ ... }); code block.
> > > 
> > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > ---
> > > 
> > >  hw/9pfs/9p.c | 89 +++++++++++++++++++++++++++++++++++++++++-----------
> > >  1 file changed, 70 insertions(+), 19 deletions(-)
> > > 

[...]

> 
> Yeah, there is a lot more to do on v9fs_walk(), both cleanup, as well as the 
> previously (couple weeks ago) mentioned protocol fix (i.e. Twalk should only 
> reply Rerror if there is an error on the very first path element).
> 

Ah yes... I had forgotten about this one. One more argument for a thorough
rewrite of this function.

> If you don't mind, I queue this patch as is for now and prepare a PR for the 
> current 9p patches in my queue in order to bring them into the soft freeze 
> deadline.
> 

Sure, please do.

> Thanks for looking at this Greg, I appreciate it!
> 

The upcoming soft freeze provided the extra motivation to finish the review :-)

> Best regards,
> Christian Schoenebeck
> 
> 

Cheers,

--
Greg



