From: Jeffle Xu <jefflexu@linux.alibaba.com>
To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org,
	chao@kernel.org, linux-erofs@lists.ozlabs.org
Cc: torvalds@linux-foundation.org, gregkh@linuxfoundation.org,
	willy@infradead.org, linux-fsdevel@vger.kernel.org,
	joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com,
	tao.peng@linux.alibaba.com, gerry@linux.alibaba.com,
	eguan@linux.alibaba.com, linux-kernel@vger.kernel.org,
	luodaowen.backend@bytedance.com, tianzichen@kuaishou.com,
	fannaihao@baidu.com, zhangjiachen.jaycee@bytedance.com,
	zhujia.zj@bytedance.com
Subject: [PATCH v10 08/21] cachefiles: document on-demand read mode
Date: Mon, 25 Apr 2022 20:21:30 +0800
Message-ID: <20220425122143.56815-9-jefflexu@linux.alibaba.com>
In-Reply-To: <20220425122143.56815-1-jefflexu@linux.alibaba.com>

Document the new user interface introduced by on-demand read mode.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 .../filesystems/caching/cachefiles.rst        | 174 ++++++++++++++++++
 1 file changed, 174 insertions(+)

diff --git a/Documentation/filesystems/caching/cachefiles.rst b/Documentation/filesystems/caching/cachefiles.rst
index 8bf396b76359..0b05391c8217 100644
--- a/Documentation/filesystems/caching/cachefiles.rst
+++ b/Documentation/filesystems/caching/cachefiles.rst
@@ -28,6 +28,7 @@ Cache on Already Mounted Filesystem
 
  (*) Debugging.
 
+ (*) On-demand Read.
 
 
 Overview
@@ -482,3 +483,176 @@ the control file.  For example::
 	echo $((1|4|8)) >/sys/module/cachefiles/parameters/debug
 
 will turn on all function entry debugging.
+
+
+On-demand Read
+==============
+
+When working in its original mode, CacheFiles serves as a local cache for a
+remote networking fs, while in on-demand read mode, CacheFiles can speed up
+scenarios where on-demand read semantics are needed, e.g. container image
+distribution.
+
+The essential difference between these two modes is seen when a cache miss
+occurs: In the original mode, the netfs will fetch the data from the remote
+server and then write it to the cache file; in on-demand read mode, fetching
+the data and writing it into the cache is delegated to a user daemon.
+
+``CONFIG_CACHEFILES_ONDEMAND`` should be enabled to support on-demand read mode.
+
+
+Protocol Communication
+----------------------
+
+The on-demand read mode uses a simple protocol for communication between kernel
+and user daemon. The protocol can be modeled as::
+
+	kernel --[request]--> user daemon --[reply]--> kernel
+
+CacheFiles will send requests to the user daemon when needed.  The user daemon
+should poll the devnode ('/dev/cachefiles') to check if there's a pending
+request to be processed.  A POLLIN event will be returned when there's a pending
+request.
+
+The user daemon then reads the devnode to fetch a request to process.  It should
+be noted that each read only gets one request. When it has finished processing
+the request, the user daemon should write the reply to the devnode.
+
+Each request starts with a message header of the form::
+
+	struct cachefiles_msg {
+		__u32 msg_id;
+		__u32 opcode;
+		__u32 len;
+		__u32 object_id;
+		__u8  data[];
+	};
+
+	where:
+
+	* ``msg_id`` is a unique ID identifying this request among all pending
+	  requests.
+
+	* ``opcode`` indicates the type of this request.
+
+	* ``object_id`` is a unique ID identifying the cache file operated on.
+
+	* ``data`` indicates the payload of this request.
+
+	* ``len`` indicates the whole length of this request, including the
+	  header and following type-specific payload.
+
+
+Turning on On-demand Mode
+-------------------------
+
+An optional parameter becomes available to the "bind" command::
+
+	bind [ondemand]
+
+When the "bind" command is given no argument, it defaults to the original mode.
+When it is given the "ondemand" argument, i.e. "bind ondemand", on-demand read
+mode will be enabled.
+
+
+The OPEN Request
+----------------
+
+When the netfs opens a cache file for the first time, a request with the
+CACHEFILES_OP_OPEN opcode, a.k.a. an OPEN request, will be sent to the user
+daemon.  The payload format is of the form::
+
+	struct cachefiles_open {
+		__u32 volume_key_size;
+		__u32 cookie_key_size;
+		__u32 fd;
+		__u32 flags;
+		__u8  data[];
+	};
+
+	where:
+
+	* ``data`` contains the volume_key followed directly by the cookie_key.
+	  The volume key is a NUL-terminated string; the cookie key is binary
+	  data.
+
+	* ``volume_key_size`` indicates the size of the volume key in bytes.
+
+	* ``cookie_key_size`` indicates the size of the cookie key in bytes.
+
+	* ``fd`` indicates an anonymous fd referring to the cache file, through
+	  which the user daemon can perform write/llseek file operations on the
+	  cache file.
+
+
+The user daemon can use the given (volume_key, cookie_key) pair to distinguish
+the requested cache file.  With the given anonymous fd, the user daemon can
+fetch the data and write it to the cache file in the background, even when
+kernel has not triggered a cache miss yet.
+
+Note that each cache file has a unique object_id, while it may have multiple
+anonymous fds.  The user daemon may duplicate anonymous fds from the initial
+anonymous fd indicated by the @fd field through dup().  Thus each object_id can
+be mapped to multiple anonymous fds, while the user daemon itself needs to
+maintain the mapping.
+
+When implementing a user daemon, please be careful of RLIMIT_NOFILE,
+``/proc/sys/fs/nr_open`` and ``/proc/sys/fs/file-max``.  Typically these needn't
+be huge since they're related to the number of open device blobs rather than
+open files of each individual filesystem.
+
+The user daemon should reply to the OPEN request by issuing a "copen"
+(complete open) command on the devnode::
+
+	copen <msg_id>,<cache_size>
+
+	* ``msg_id`` must match the msg_id field of the OPEN request.
+
+	* When >= 0, ``cache_size`` indicates the size of the cache file;
+	  when < 0, ``cache_size`` indicates any error code encountered by the
+	  user daemon.
+
+
+The CLOSE Request
+-----------------
+
+When a cookie is withdrawn, a CLOSE request (opcode CACHEFILES_OP_CLOSE) will
+be sent to the user daemon.  This tells the user daemon to close all anonymous
+fds associated with the given object_id.  The CLOSE request has no extra
+payload, and shouldn't be replied to.
+
+
+The READ Request
+----------------
+
+When a cache miss is encountered in on-demand read mode, CacheFiles will send a
+READ request (opcode CACHEFILES_OP_READ) to the user daemon. This tells the user
+daemon to fetch the contents of the requested file range.  The payload is of the
+form::
+
+	struct cachefiles_read {
+		__u64 off;
+		__u64 len;
+	};
+
+	where:
+
+	* ``off`` indicates the starting offset of the requested file range.
+
+	* ``len`` indicates the length of the requested file range.
+
+
+When it receives a READ request, the user daemon should fetch the requested data
+and write it to the cache file identified by object_id.
+
+When it has finished processing the READ request, the user daemon should reply
+by using the CACHEFILES_IOC_READ_COMPLETE ioctl on one of the anonymous fds
+associated with the object_id given in the READ request.  The ioctl is of the
+form::
+
+	ioctl(fd, CACHEFILES_IOC_READ_COMPLETE, msg_id);
+
+	* ``fd`` is one of the anonymous fds associated with the object_id
+	  given.
+
+	* ``msg_id`` must match the msg_id field of the READ request.
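
For illustration only, the READ request handling documented above could be
sketched in a user daemon roughly as follows.  This is a minimal sketch, not
part of the patch: `msg_hdr`, `read_payload`, `parse_read()` and
`complete_read()` are hypothetical names that mirror the layouts quoted in the
documentation, and the ioctl request value is passed in as a parameter (it is
expected to be CACHEFILES_IOC_READ_COMPLETE from the kernel headers) rather
than assumed here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical local mirror of struct cachefiles_msg (fixed-width types). */
struct msg_hdr {
	uint32_t msg_id;
	uint32_t opcode;
	uint32_t len;
	uint32_t object_id;
	uint8_t  data[];
};

/* Hypothetical local mirror of struct cachefiles_read. */
struct read_payload {
	uint64_t off;
	uint64_t len;
};

/*
 * Decode one READ request from a buffer of n bytes read from the devnode.
 * The caller is assumed to have checked the opcode already and to pass a
 * buffer aligned for uint32_t.  Returns 0 on success, -1 if the message is
 * too short to hold the header plus the READ payload.
 */
static int parse_read(const void *buf, size_t n, uint32_t *msg_id,
		      uint32_t *object_id, struct read_payload *out)
{
	const struct msg_hdr *hdr = buf;

	if (n < sizeof(*hdr) || hdr->len > n ||
	    hdr->len < sizeof(*hdr) + sizeof(*out))
		return -1;

	/* memcpy avoids relying on the payload's alignment. */
	memcpy(out, hdr->data, sizeof(*out));
	*msg_id = hdr->msg_id;
	*object_id = hdr->object_id;
	return 0;
}

/*
 * Acknowledge a READ request once the requested range has been written to
 * the cache file.  complete_req is expected to be
 * CACHEFILES_IOC_READ_COMPLETE; anon_fd is any anonymous fd the daemon has
 * mapped to the request's object_id.
 */
static int complete_read(int anon_fd, unsigned long complete_req,
			 uint32_t msg_id)
{
	return ioctl(anon_fd, complete_req, (unsigned long)msg_id);
}
```

Between the two calls, the daemon would fetch `out.len` bytes starting at
`out.off` and write them through the anonymous fd; how the data is fetched is
left to the daemon.
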
-- 
2.27.0
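
As a further illustration of the OPEN request described in the patch, a user
daemon might split the payload into its two keys and build the "copen" reply
along these lines.  Again a sketch under stated assumptions, not the author's
implementation: all struct and function names are hypothetical, and the code
assumes that `volume_key_size` counts the terminating NUL of the volume key.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical local mirrors of struct cachefiles_msg / cachefiles_open. */
struct msg_hdr {
	uint32_t msg_id;
	uint32_t opcode;
	uint32_t len;
	uint32_t object_id;
	uint8_t  data[];
};

struct open_payload {
	uint32_t volume_key_size;
	uint32_t cookie_key_size;
	uint32_t fd;
	uint32_t flags;
	uint8_t  data[];
};

/* Decoded view of an OPEN request; pointers refer into the caller's buffer. */
struct open_view {
	uint32_t       msg_id;
	uint32_t       object_id;
	int            fd;
	const char    *volume_key;
	const uint8_t *cookie_key;
	uint32_t       cookie_key_size;
};

/* Returns 0 on success, -1 if the sizes do not fit the message length. */
static int parse_open(const void *buf, size_t n, struct open_view *out)
{
	const struct msg_hdr *hdr = buf;
	const struct open_payload *op;
	uint64_t need;

	if (n < sizeof(*hdr) || hdr->len > n ||
	    hdr->len < sizeof(*hdr) + sizeof(*op))
		return -1;

	op = (const struct open_payload *)hdr->data;
	need = (uint64_t)sizeof(*hdr) + sizeof(*op) +
	       op->volume_key_size + op->cookie_key_size;
	if (need > hdr->len)
		return -1;

	/* Assumption: volume_key_size includes the terminating NUL. */
	if (op->volume_key_size == 0 ||
	    op->data[op->volume_key_size - 1] != '\0')
		return -1;

	out->msg_id = hdr->msg_id;
	out->object_id = hdr->object_id;
	out->fd = (int)op->fd;
	out->volume_key = (const char *)op->data;
	out->cookie_key = op->data + op->volume_key_size;
	out->cookie_key_size = op->cookie_key_size;
	return 0;
}

/*
 * Format "copen <msg_id>,<cache_size>"; a negative size_or_err carries an
 * error code.  Returns the string length, or -1 if buf is too small.
 */
static int format_copen(char *buf, size_t cap, uint32_t msg_id,
			long long size_or_err)
{
	int n = snprintf(buf, cap, "copen %" PRIu32 ",%lld", msg_id,
			 size_or_err);

	return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

The formatted string would then be written to the devnode with write(2),
and the daemon would record the (object_id, fd) pair so that later READ and
CLOSE requests can be mapped back to the anonymous fd.
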

