All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET 0/4] fs: Support for LOOKUP_CACHED / RESOLVE_CACHED
@ 2020-12-17 16:19 Jens Axboe
  2020-12-17 16:19 ` [PATCH 1/4] fs: make unlazy_walk() error handling consistent Jens Axboe
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Jens Axboe @ 2020-12-17 16:19 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, viro

Hi,

Here's v3 of the LOOKUP_CACHED change. It's exposed as both a flag for
openat2(), and it's used internally by io_uring to speed up (and make more
efficient) the openat/openat2 support there. As posted in the v3 thread,
performance numbers for various levels of the filename lookup already
being cached:

Cached		5.10-git	5.10-git+LOOKUP_CACHED	Speedup
---------------------------------------------------------------
33%		1,014,975	900,474			1.1x
89%		 545,466	292,937			1.9x
100%		 435,636	151,475			2.9x

The more cache hot we are, the faster the inline LOOKUP_CACHED
optimization helps. This is unsurprising and expected, as a thread
offload becomes a more dominant part of the total overhead. If we look
at io_uring tracing, doing an IORING_OP_OPENAT on a file that isn't in
the dentry cache will yield:

275.550481: io_uring_create: ring 00000000ddda6278, fd 3 sq size 8, cq size 16, flags 0
275.550491: io_uring_submit_sqe: ring 00000000ddda6278, op 18, data 0x0, non block 1, sq_thread 0
275.550498: io_uring_queue_async_work: ring 00000000ddda6278, request 00000000c0267d17, flags 69760, normal queue, work 000000003d683991
275.550502: io_uring_cqring_wait: ring 00000000ddda6278, min_events 1
275.550556: io_uring_complete: ring 00000000ddda6278, user_data 0x0, result 4

which shows a failed nonblock lookup, then punt to worker, and then we
complete with fd == 4. This takes 65 usec in total. Re-running the same
test case again:

281.253956: io_uring_create: ring 0000000008207252, fd 3 sq size 8, cq size 16, flags 0
281.253967: io_uring_submit_sqe: ring 0000000008207252, op 18, data 0x0, non block 1, sq_thread 0
281.253973: io_uring_complete: ring 0000000008207252, user_data 0x0, result 4

shows the same request completing inline, also returning fd == 4. This
takes 6 usec.

Using openat2, we see that an attempted RESOLVE_CACHED open of an uncached
file will fail with -EAGAIN, and a subsequent attempt will too as it's
still not cached. ls the file and retry, and we successfully open it
with RESOLVE_CACHED:

[test@archlinux ~]$ ./openat2-cached /etc/nanorc
open: -1
openat2: Resource temporarily unavailable
[test@archlinux ~]$ ./openat2-cached /etc/nanorc
open: -1
openat2: Resource temporarily unavailable
[test@archlinux ~]$ ls -al /etc/nanorc
-rw-r--r-- 1 root root 10066 Dec 17 16:15 /etc/nanorc
[test@archlinux ~]$ ./openat2-cached /etc/nanorc
open: 3

Minor polish since v3:

- Rename LOOKUP_NONBLOCK -> LOOKUP_CACHED, and ditto for the RESOLVE_
  flag. This better explains what the feature does, making it more self
  explanatory in terms of both code readability and for the user visible
  part.

- Remove dead LOOKUP_NONBLOCK check after we've dropped LOOKUP_RCU
  already, spotted by Al.

- Add O_TMPFILE to the checks upfront, so we can drop the checking in
  do_tmpfile().

-- 
Jens Axboe




^ permalink raw reply	[flat|nested] 13+ messages in thread
* [PATCHSET v3 0/4] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK
@ 2020-12-14 19:13 Jens Axboe
  2020-12-14 19:13 ` [PATCH 1/4] fs: make unlazy_walk() error handling consistent Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2020-12-14 19:13 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, viro

Hi,

Wanted to throw out what the current state of this is, as we keep
getting closer to something palatable.

This time I've included the io_uring change too. I've tested this through
both openat2, and through io_uring as well.

I'm pretty happy with this at this point. The core change is very simple,
and the users end up being trivial too.

Also available here:

https://git.kernel.dk/cgit/linux-block/log/?h=nonblock-path-lookup

Changes since v2:

- Simplify the LOOKUP_NONBLOCK to _just_ cover LOOKUP_RCU and
  lookup_fast(), as per Linus's suggestion. I verified fast path
  operations are indeed just that, and it avoids having to mess with
  the inode lock and mnt_want_write() completely.

- Make O_TMPFILE return -EAGAIN for LOOKUP_NONBLOCK.

- Improve/add a few comments.

- Folded what was left of the open part of LOOKUP_NONBLOCK into the
  main patch.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-12-26 18:26 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-17 16:19 [PATCHSET 0/4] fs: Support for LOOKUP_CACHED / RESOLVE_CACHED Jens Axboe
2020-12-17 16:19 ` [PATCH 1/4] fs: make unlazy_walk() error handling consistent Jens Axboe
2020-12-26  2:41   ` Jens Axboe
2020-12-26  4:50     ` Al Viro
2020-12-26 17:33       ` Jens Axboe
2020-12-26 17:58         ` Matthew Wilcox
2020-12-26 18:24           ` Jens Axboe
2020-12-17 16:19 ` [PATCH 2/4] fs: add support for LOOKUP_CACHED Jens Axboe
2020-12-17 16:19 ` [PATCH 3/4] fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED Jens Axboe
2020-12-17 16:19 ` [PATCH 4/4] io_uring: enable LOOKUP_CACHED path resolution for filename lookups Jens Axboe
2020-12-17 18:09 ` [PATCHSET 0/4] fs: Support for LOOKUP_CACHED / RESOLVE_CACHED Linus Torvalds
2020-12-17 19:23   ` Jens Axboe
  -- strict thread matches above, loose matches on Subject: below --
2020-12-14 19:13 [PATCHSET v3 0/4] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK Jens Axboe
2020-12-14 19:13 ` [PATCH 1/4] fs: make unlazy_walk() error handling consistent Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.