* [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics
@ 2022-02-09  6:00 Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 01/22] fscache: export fscache_end_operation() Jeffle Xu
                   ` (22 more replies)
  0 siblings, 23 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

changes since v2:
- fscache,erofs: erofs now uses fscache_read() directly instead of the
  netfs library to read data from the cache, to avoid a potential
  conflict with the following netfs library refactoring [1] (patch 12)
  (David Howells)
- erofs: Implement fscache-based readahead. The current implementation
  is quite rough and synchronous, though; it needs to be improved in a
  following iteration.
- cachefiles_ondemand: use an xarray instead of an IDR to manage
  pending read requests (patch 5) (Matthew Wilcox)
- I have also uploaded this patch set at:
  https://github.com/lostjeffle/linux/commits/jingbo/dev-erofs-fscache

[1] https://lore.kernel.org/all/2946d871-b9e1-cf29-6d39-bcab30f2854f@linux.alibaba.com/t/#mfbb2053476760d8fac723c57dad529192a5084c6

RFC: https://lore.kernel.org/all/YbRL2glGzjfZkVbH@B-P7TQMD6M-0146.local/t/
v1: https://lore.kernel.org/lkml/47831875-4bdd-8398-9f2d-0466b31a4382@linux.alibaba.com/T/
v2: https://lore.kernel.org/all/2946d871-b9e1-cf29-6d39-bcab30f2854f@linux.alibaba.com/t/


[Background]
============
Nydus is a remote container snapshotter specially optimised for
distributing container images over the network. It has recently been
accepted as a sub-project of containerd [1]. Nydus is an excellent
container image acceleration solution, since it only pulls data from
the remote end when it is actually needed, a.k.a. on-demand reading.

erofs (Enhanced Read-Only File System) is a filesystem specially
optimised for read-only scenarios. (Documentation/filesystems/erofs.rst)

Recently we have been focusing on erofs in the container image
distribution scenario [2], trying to combine it with Nydus. In this
case, erofs can be mounted from one bootstrap file (metadata) with
(optionally) multiple data blob files (data) stored on another local
filesystem. (All these files are actually image files in the erofs
disk format.)

To accelerate container startup (fetching the container image from the
remote end and then starting the container), we hope that the
bootstrap/data blob files can support demand reading. That is, erofs
can be mounted and accessed even before the bootstrap/data blob files
have been fully downloaded.

That means we have to manage the cache state of the bootstrap/data blob
files (on a cache hit, read directly from the local cache; on a cache
miss, fetch the data somehow). It would be painful, and arguably
redundant, for erofs to implement this cache management itself, so we
prefer to let fscache/cachefiles do it. Besides, the demand-read
feature is general enough to benefit other use cases if it is
implemented at the fscache level.

[1] https://d7y.io/en-us/blog/containerd_accepted_nydus-snapshotter.html
[2] https://sched.co/pcdL


[Overall Design]
================
The upper fs uses a backing file on the local fs as the local cache
(exactly the "cachefiles" way), and relies on fscache to detect whether
the data is ready or not (cache hit/miss). Since fscache currently
detects cache hit/miss by detecting holes in the backing files, our
demand-read mechanism also relies on hole detection.

1. initial phase
At the very beginning, the user daemon creates the backing files
(bootstrap/data blob files) under the corresponding directory (under
<root>/cache/<volume>/<fan>/) in advance. These backing files are
completely sparse files (with zero disk usage). Since these backing
files are all read-only and their sizes are known prior to mounting,
the user daemon sets the corresponding file sizes and thus creates all
of these sparse backing files in advance.

2. cache miss
When a file range (of a bootstrap/data blob file) is accessed for the
first time, a cache miss is triggered and then .issue_op() is called to
fetch the data somehow.

In the demand-read case, we rely on a user daemon to fetch the data
from local/remote storage. Here, .issue_op() just packages the file
range into a message and informs the user daemon. The user daemon needs
to poll and wait on the devnode (/dev/cachefiles_ondemand). Once woken,
the user daemon reads the devnode to get the file range information,
and then somehow fetches the data corresponding to that file range,
e.g. by downloading it from the remote end over the network. Once the
data is ready, the user daemon writes the fetched data into the backing
file and then informs the cachefiles backend by writing to the devnode.
The cachefiles backend, blocked in the previous .issue_op() call, is
then woken up. By that time the data is ready in the backing file, and
the upper fs will reinitiate a read request from the backing file.

3. cache hit
Once the data is ready in the backing file, the upper fs reads from the
backing file directly.
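
A minimal, hypothetical sketch of the daemon side of this protocol is
shown below (it is not part of this series). It assumes the
'cachefiles_ondemand' devnode, the cachefiles_req_in layout and the
"done <id>" command introduced in patch 5; fetch_range() is a made-up
placeholder for "download the data and write it into the backing file
at the given offset", and a real daemon would first open the devnode
and issue the usual bind/dir configuration commands.

/*
 * Hypothetical daemon-side sketch.  struct cachefiles_req_in mirrors
 * include/uapi/linux/cachefiles_ondemand.h from patch 5; fetch_range()
 * stands for the application-specific download plus pwrite() into the
 * backing file.
 */
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <linux/limits.h>

struct cachefiles_req_in {
	uint64_t id;
	uint64_t off;
	uint64_t len;
	char path[NAME_MAX];
};

int fetch_range(const char *path, uint64_t off, uint64_t len); /* placeholder */

static void handle_ondemand_reads(int devfd)
{
	struct pollfd pfd = { .fd = devfd, .events = POLLIN };
	struct cachefiles_req_in req;
	char cmd[32];

	for (;;) {
		/* wait until the kernel queues a pending read request */
		if (poll(&pfd, 1, -1) <= 0)
			continue;
		/* each read returns one pending read request */
		if (read(devfd, &req, sizeof(req)) < (ssize_t)sizeof(req))
			continue;
		/* fill the hole [off, off + len) in the backing file */
		if (fetch_range(req.path, req.off, req.len))
			continue;
		/* wake up the process blocked on this request */
		snprintf(cmd, sizeof(cmd), "done %llu",
			 (unsigned long long)req.id);
		write(devfd, cmd, strlen(cmd));
	}
}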


[Advantage of fscache-based demand-read]
========================================
1. Asynchronous Prefetch
In the current mechanism, fscache is responsible for cache state
management, while the data plane (fetching data from local/remote
storage on a cache miss) is handled on the user daemon side.

If the data is already ready in the backing file, the upper fs (e.g.
erofs) reads from the backing file directly and no longer traps into
user space. The user daemon can thus fetch data (from the remote end)
asynchronously in the background, which accelerates access to the
backing files to some degree.

2. Support for massive blob files
Besides, this mechanism supports a large number of backing files, and
thus can benefit densely deployed scenarios.

In our use case, one container image can correspond to one bootstrap
file (required) and multiple data blob files (optional). For example,
one container image for node.js corresponds to ~20 files in total. In a
densely deployed environment, there could be as many as hundreds of
containers and thus thousands of backing files on one machine.


[Test]
======
You can start a quick test with
https://github.com/lostjeffle/demand-read-cachefilesd



Jeffle Xu (22):
  fscache: export fscache_end_operation()
  fscache: add a method to support on-demand read semantics
  cachefiles: extract generic function for daemon methods
  cachefiles: detect backing file size in on-demand read mode
  cachefiles: introduce new devnode for on-demand read mode
  erofs: use meta buffers for erofs_read_superblock()
  erofs: export erofs_map_blocks()
  erofs: add mode checking helper
  erofs: register global fscache volume
  erofs: add cookie context helper functions
  erofs: add anonymous inode managing page cache of blob file
  erofs: add erofs_fscache_read_page() helper
  erofs: register cookie context for bootstrap blob
  erofs: implement fscache-based metadata read
  erofs: implement fscache-based data read for non-inline layout
  erofs: implement fscache-based data read for inline layout
  erofs: register cookie context for data blobs
  erofs: implement fscache-based data read for data blobs
  erofs: implement fscache-based data readahead for hole
  erofs: implement fscache-based data readahead for non-inline layout
  erofs: implement fscache-based data readahead for inline layout
  erofs: add 'uuid' mount option

 Documentation/filesystems/netfs_library.rst |  18 +
 fs/cachefiles/Kconfig                       |  13 +
 fs/cachefiles/daemon.c                      | 243 +++++++++--
 fs/cachefiles/internal.h                    |  12 +
 fs/cachefiles/io.c                          |  60 +++
 fs/cachefiles/main.c                        |  27 ++
 fs/cachefiles/namei.c                       |  60 ++-
 fs/erofs/Makefile                           |   3 +-
 fs/erofs/data.c                             |  18 +-
 fs/erofs/fscache.c                          | 451 ++++++++++++++++++++
 fs/erofs/inode.c                            |   6 +-
 fs/erofs/internal.h                         |  30 ++
 fs/erofs/super.c                            | 106 ++++-
 fs/fscache/internal.h                       |  11 -
 fs/nfs/fscache.c                            |   8 -
 include/linux/fscache.h                     |  39 ++
 include/linux/netfs.h                       |   4 +
 include/uapi/linux/cachefiles_ondemand.h    |  14 +
 18 files changed, 1050 insertions(+), 73 deletions(-)
 create mode 100644 fs/erofs/fscache.c
 create mode 100644 include/uapi/linux/cachefiles_ondemand.h

-- 
2.27.0



* [PATCH v3 01/22] fscache: export fscache_end_operation()
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-17  7:44   ` Liu Bo
  2022-02-09  6:00 ` [PATCH v3 02/22] fscache: add a method to support on-demand read semantics Jeffle Xu
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Export fscache_end_operation() to avoid code duplication.

Besides, since the paired fscache_begin_read_operation() is already
exported, it makes sense to also export fscache_end_operation().

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/fscache/internal.h   | 11 -----------
 fs/nfs/fscache.c        |  8 --------
 include/linux/fscache.h | 14 ++++++++++++++
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index f121c21590dc..ed1c9ed737f2 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -70,17 +70,6 @@ static inline void fscache_see_cookie(struct fscache_cookie *cookie,
 			     where);
 }
 
-/*
- * io.c
- */
-static inline void fscache_end_operation(struct netfs_cache_resources *cres)
-{
-	const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
-
-	if (ops)
-		ops->end_operation(cres);
-}
-
 /*
  * main.c
  */
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index cfe901650ab0..39654ca72d3d 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -249,14 +249,6 @@ void nfs_fscache_release_file(struct inode *inode, struct file *filp)
 	}
 }
 
-static inline void fscache_end_operation(struct netfs_cache_resources *cres)
-{
-	const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
-
-	if (ops)
-		ops->end_operation(cres);
-}
-
 /*
  * Fallback page reading interface.
  */
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 296c5f1d9f35..d2430da8aa67 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -456,6 +456,20 @@ int fscache_begin_read_operation(struct netfs_cache_resources *cres,
 	return -ENOBUFS;
 }
 
+/**
+ * fscache_end_operation - End the read operation for the netfs lib
+ * @cres: The cache resources for the read operation
+ *
+ * Clean up the resources at the end of the read request.
+ */
+static inline void fscache_end_operation(struct netfs_cache_resources *cres)
+{
+	const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
+
+	if (ops)
+		ops->end_operation(cres);
+}
+
 /**
  * fscache_read - Start a read from the cache.
  * @cres: The cache resources to use
-- 
2.27.0



* [PATCH v3 02/22] fscache: add a method to support on-demand read semantics
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 01/22] fscache: export fscache_end_operation() Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 03/22] cachefiles: extract generic function for daemon methods Jeffle Xu
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Add an .ondemand_read() callback to netfs_cache_ops to implement
on-demand reading.

The precondition for implementing on-demand read semantics is that all
blob files have been placed under the corresponding directory, with the
correct file size (as sparse files), at the very beginning. When the
upper fs starts to access a blob file, it will get a "cache miss" (hit
a hole). The .ondemand_read() callback can then be called to notify the
backend to prepare the data.

The implementation of the .ondemand_read() callback can be backend
specific. The following patch will introduce the implementation for
cachefiles, which notifies the user daemon of the requested file range
to read. The .ondemand_read() callback will stay blocked until the user
daemon has prepared the corresponding data.

Once the .ondemand_read() callback returns 0, it is guaranteed that the
requested data is ready. In that case, users can retry reading from the
backing file.
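
For illustration only (this is not code from this patch, and the helper
read_from_cache() is made up), the expected calling pattern on the
filesystem side could roughly look like:

int read_from_cache(struct netfs_cache_resources *cres,
		    loff_t pos, size_t len);	/* placeholder */

/*
 * Hypothetical sketch of driving .ondemand_read() through
 * fscache_ondemand_read(): on a cache miss, ask the backend to prepare
 * the range and then retry the read.
 */
static int read_with_ondemand(struct netfs_cache_resources *cres,
			      loff_t pos, size_t len)
{
	int ret = read_from_cache(cres, pos, len);

	if (ret == -ENODATA) {		/* cache miss: hit a hole */
		ret = fscache_ondemand_read(cres, pos, len);
		if (!ret)		/* data is now in the cache file */
			ret = read_from_cache(cres, pos, len);
	}
	return ret;
}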

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 Documentation/filesystems/netfs_library.rst | 18 +++++++++++++++
 include/linux/fscache.h                     | 25 +++++++++++++++++++++
 include/linux/netfs.h                       |  4 ++++
 3 files changed, 47 insertions(+)

diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
index 4f373a8ec47b..e544d6688100 100644
--- a/Documentation/filesystems/netfs_library.rst
+++ b/Documentation/filesystems/netfs_library.rst
@@ -466,6 +466,8 @@ operation table looks like the following::
 		int (*query_occupancy)(struct netfs_cache_resources *cres,
 				       loff_t start, size_t len, size_t granularity,
 				       loff_t *_data_start, size_t *_data_len);
+		int (*ondemand_read)(struct netfs_cache_resources *cres,
+				     loff_t start_pos, size_t len);
 	};
 
 With a termination handler function pointer::
@@ -552,6 +554,22 @@ The methods defined in the table are:
    It returns 0 if some data was found, -ENODATA if there was no usable data
    within the region or -ENOBUFS if there is no caching on this file.
 
+ * ``ondemand_read()``
+
+   [Optional] Called to make cache prepare for the data. It shall be called only
+   when on-demand read semantics is required. It will be called when a cache
+   miss is encountered. The function will make the backend somehow prepare for
+   the data in the region specified by @start_pos/@len of the cache file. It may
+   get blocked until the backend has prepared the data in the cache file
+   successfully, or error encountered.
+
+   Once it returns with 0, it is guaranteed that the requested data has been
+   ready in the cache file. In this case, users can retry to read from the cache
+   file.
+
+   It returns 0 if data has been ready in the cache file, or other error code
+   from the cache, such as -ENOMEM.
+
 Note that these methods are passed a pointer to the cache resource structure,
 not the read request structure as they could be used in other situations where
 there isn't a read request structure as well, such as writing dirty data to the
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index d2430da8aa67..efcd5d5c6726 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -514,6 +514,31 @@ int fscache_read(struct netfs_cache_resources *cres,
 			 term_func, term_func_priv);
 }
 
+/**
+ * fscache_ondemand_read - Make cache prepare for the data.
+ * @cres: The cache resources to use
+ * @start_pos: The beginning file offset in the cache file
+ * @len: The length of the file offset range in the cache file
+ *
+ * This shall only be called when a cache miss is encountered. It will make
+ * the backend somehow prepare for the data in the file offset range specified
+ * by @start_pos/@len of the cache file. It may get blocked until the backend
+ * has prepared the data in the cache file successfully, or error encountered.
+ *
+ * Returns:
+ * * 0		- Success (Data is ready in the cache file)
+ * * Other error code from the cache, such as -ENOMEM.
+ */
+static inline
+int fscache_ondemand_read(struct netfs_cache_resources *cres,
+			  loff_t start_pos, size_t len)
+{
+	const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
+	if (ops->ondemand_read)
+		return ops->ondemand_read(cres, start_pos, len);
+	return -EOPNOTSUPP;
+}
+
 /**
  * fscache_begin_write_operation - Begin a write operation for the netfs lib
  * @cres: The cache resources for the write being performed
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 614f22213e21..81fe707ad38d 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -251,6 +251,10 @@ struct netfs_cache_ops {
 	int (*query_occupancy)(struct netfs_cache_resources *cres,
 			       loff_t start, size_t len, size_t granularity,
 			       loff_t *_data_start, size_t *_data_len);
+
+	/* Make cache prepare for the data */
+	int (*ondemand_read)(struct netfs_cache_resources *cres,
+			     loff_t start_pos, size_t len);
 };
 
 struct readahead_control;
-- 
2.27.0



* [PATCH v3 03/22] cachefiles: extract generic function for daemon methods
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 01/22] fscache: export fscache_end_operation() Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 02/22] fscache: add a method to support on-demand read semantics Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-17  8:17   ` Liu Bo
  2022-02-09  6:00 ` [PATCH v3 04/22] cachefiles: detect backing file size in on-demand read mode Jeffle Xu
                   ` (19 subsequent siblings)
  22 siblings, 1 reply; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

... so that the following new devnode can reuse most of the code when
implementing its own methods.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/cachefiles/daemon.c | 70 +++++++++++++++++++++++++++---------------
 1 file changed, 45 insertions(+), 25 deletions(-)

diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 7ac04ee2c0a0..6b8d7c5bbe5d 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -78,6 +78,34 @@ static const struct cachefiles_daemon_cmd cachefiles_daemon_cmds[] = {
 	{ "",		NULL				}
 };
 
+static struct cachefiles_cache *cachefiles_daemon_open_cache(void)
+{
+	struct cachefiles_cache *cache;
+
+	/* allocate a cache record */
+	cache = kzalloc(sizeof(struct cachefiles_cache), GFP_KERNEL);
+	if (cache) {
+		mutex_init(&cache->daemon_mutex);
+		init_waitqueue_head(&cache->daemon_pollwq);
+		INIT_LIST_HEAD(&cache->volumes);
+		INIT_LIST_HEAD(&cache->object_list);
+		spin_lock_init(&cache->object_list_lock);
+
+		/* set default caching limits
+		 * - limit at 1% free space and/or free files
+		 * - cull below 5% free space and/or free files
+		 * - cease culling above 7% free space and/or free files
+		 */
+		cache->frun_percent = 7;
+		cache->fcull_percent = 5;
+		cache->fstop_percent = 1;
+		cache->brun_percent = 7;
+		cache->bcull_percent = 5;
+		cache->bstop_percent = 1;
+	}
+
+	return cache;
+}
 
 /*
  * Prepare a cache for caching.
@@ -96,31 +124,13 @@ static int cachefiles_daemon_open(struct inode *inode, struct file *file)
 	if (xchg(&cachefiles_open, 1) == 1)
 		return -EBUSY;
 
-	/* allocate a cache record */
-	cache = kzalloc(sizeof(struct cachefiles_cache), GFP_KERNEL);
+
+	cache = cachefiles_daemon_open_cache();
 	if (!cache) {
 		cachefiles_open = 0;
 		return -ENOMEM;
 	}
 
-	mutex_init(&cache->daemon_mutex);
-	init_waitqueue_head(&cache->daemon_pollwq);
-	INIT_LIST_HEAD(&cache->volumes);
-	INIT_LIST_HEAD(&cache->object_list);
-	spin_lock_init(&cache->object_list_lock);
-
-	/* set default caching limits
-	 * - limit at 1% free space and/or free files
-	 * - cull below 5% free space and/or free files
-	 * - cease culling above 7% free space and/or free files
-	 */
-	cache->frun_percent = 7;
-	cache->fcull_percent = 5;
-	cache->fstop_percent = 1;
-	cache->brun_percent = 7;
-	cache->bcull_percent = 5;
-	cache->bstop_percent = 1;
-
 	file->private_data = cache;
 	cache->cachefilesd = file;
 	return 0;
@@ -209,10 +219,11 @@ static ssize_t cachefiles_daemon_read(struct file *file, char __user *_buffer,
 /*
  * Take a command from cachefilesd, parse it and act on it.
  */
-static ssize_t cachefiles_daemon_write(struct file *file,
-				       const char __user *_data,
-				       size_t datalen,
-				       loff_t *pos)
+static ssize_t cachefiles_daemon_do_write(struct file *file,
+					  const char __user *_data,
+					  size_t datalen,
+					  loff_t *pos,
+			const struct cachefiles_daemon_cmd *cmds)
 {
 	const struct cachefiles_daemon_cmd *cmd;
 	struct cachefiles_cache *cache = file->private_data;
@@ -261,7 +272,7 @@ static ssize_t cachefiles_daemon_write(struct file *file,
 	}
 
 	/* run the appropriate command handler */
-	for (cmd = cachefiles_daemon_cmds; cmd->name[0]; cmd++)
+	for (cmd = cmds; cmd->name[0]; cmd++)
 		if (strcmp(cmd->name, data) == 0)
 			goto found_command;
 
@@ -284,6 +295,15 @@ static ssize_t cachefiles_daemon_write(struct file *file,
 	goto error;
 }
 
+static ssize_t cachefiles_daemon_write(struct file *file,
+				       const char __user *_data,
+				       size_t datalen,
+				       loff_t *pos)
+{
+	return cachefiles_daemon_do_write(file, _data, datalen, pos,
+					  cachefiles_daemon_cmds);
+}
+
 /*
  * Poll for culling state
  * - use EPOLLOUT to indicate culling state
-- 
2.27.0



* [PATCH v3 04/22] cachefiles: detect backing file size in on-demand read mode
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (2 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 03/22] cachefiles: extract generic function for daemon methods Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 05/22] cachefiles: introduce new devnode for " Jeffle Xu
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Fscache/cachefiles used to serve as a local cache for remote
filesystems. The following patches will introduce a new use case, in
which a local read-only filesystem could implement on-demand reading
with fscache. In this case, the upper read-only fs may have no idea of
the size of the backing file.

It is worth noting that, in this scenario, the user daemon is
responsible for preparing all backing files with the correct file size
at the very beginning. (The backing files are all sparse files in this
case.) And since it's read-only, we can get the backing file size at
runtime and use it as the object size.

This patch also adds one flag bit to distinguish the newly introduced
on-demand read mode from the original mode. A following patch will
introduce how users configure it.
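
For illustration only (not part of this patch; the helper name is made
up), preparing such a backing file from the daemon side can be as
simple as creating it and truncating it to the known blob size, which
yields a fully sparse file whose size the backend can later pick up as
the object size:

#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

/* Hypothetical helper: pre-create a sparse backing file of blob_size. */
static int prepare_backing_file(const char *path, off_t blob_size)
{
	int fd = open(path, O_CREAT | O_WRONLY, 0600);

	if (fd < 0)
		return -1;
	/* extend the file without allocating blocks: a fully sparse file */
	if (ftruncate(fd, blob_size) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}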

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/cachefiles/Kconfig    | 13 +++++++++
 fs/cachefiles/internal.h |  1 +
 fs/cachefiles/namei.c    | 60 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/fs/cachefiles/Kconfig b/fs/cachefiles/Kconfig
index 719faeeda168..cef412cfd127 100644
--- a/fs/cachefiles/Kconfig
+++ b/fs/cachefiles/Kconfig
@@ -26,3 +26,16 @@ config CACHEFILES_ERROR_INJECTION
 	help
 	  This permits error injection to be enabled in cachefiles whilst a
 	  cache is in service.
+
+config CACHEFILES_ONDEMAND
+	bool "Support for on-demand reading"
+	depends on CACHEFILES
+	default n
+	help
+	  This permits on-demand read mode of cachefiles. In this mode, when
+	  cache miss, the cachefiles backend instead of the upper fs using
+	  fscache is responsible for fetching data, e.g. through user daemon.
+	  Then after the data's ready, upper fs can reinitiate a read from the
+	  cache.
+
+	  If unsure, say N.
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 1a837de7b070..8400501bbd56 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -98,6 +98,7 @@ struct cachefiles_cache {
 #define CACHEFILES_DEAD			1	/* T if cache dead */
 #define CACHEFILES_CULLING		2	/* T if cull engaged */
 #define CACHEFILES_STATE_CHANGED	3	/* T if state changed (poll trigger) */
+#define CACHEFILES_ONDEMAND_MODE	4	/* T if in on-demand read mode */
 	char				*rootdirname;	/* name of cache root directory */
 	char				*secctx;	/* LSM security context */
 	char				*tag;		/* cache binding tag */
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 7ebc29210b70..90479cb55e0a 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -510,15 +510,69 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
 	return file;
 }
 
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+static inline bool cachefiles_can_create_file(struct cachefiles_cache *cache)
+{
+	/*
+	 * On-demand read mode requires that backing files have been prepared
+	 * with correct file size under corresponding directory in the very
+	 * first beginning. We can get here when the backing file doesn't exist
+	 * under corresponding directory, or the file size is unexpectedly 0.
+	 */
+	return !test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags);
+
+}
+
+/*
+ * Fs using fscache for on-demand reading may have no idea of the file size of
+ * backing files. Thus the on-demand read mode requires that backing files shall
+ * be prepared with correct file size under corresponding directory by the user
+ * daemon in the first beginning. Then the backend is responsible for taking the
+ * file size of the backing file as the object size at runtime.
+ */
+static int cachefiles_recheck_size(struct cachefiles_object *object,
+				   struct file *file)
+{
+	loff_t size;
+	struct cachefiles_cache *cache = object->volume->cache;
+
+	if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags))
+		return 0;
+
+	size = i_size_read(file_inode(file));
+	if (!size)
+		return -EINVAL;
+
+	object->cookie->object_size = size;
+	return 0;
+}
+#else
+static inline bool cachefiles_can_create_file(struct cachefiles_cache *cache)
+{
+	return true;
+}
+
+static int cachefiles_recheck_size(struct cachefiles_object *object,
+				   struct file *file)
+{
+	return 0;
+}
+#endif
+
+
 /*
  * Create a new file.
  */
 static bool cachefiles_create_file(struct cachefiles_object *object)
 {
+	struct cachefiles_cache *cache = object->volume->cache;
 	struct file *file;
 	int ret;
 
-	ret = cachefiles_has_space(object->volume->cache, 1, 0,
+	if (!cachefiles_can_create_file(cache))
+		return false;
+
+	ret = cachefiles_has_space(cache, 1, 0,
 				   cachefiles_has_space_for_create);
 	if (ret < 0)
 		return false;
@@ -573,6 +627,10 @@ static bool cachefiles_open_file(struct cachefiles_object *object,
 	}
 	_debug("file -> %pd positive", dentry);
 
+	ret = cachefiles_recheck_size(object, file);
+	if (ret < 0)
+		goto check_failed;
+
 	ret = cachefiles_check_auxdata(object, file);
 	if (ret < 0)
 		goto check_failed;
-- 
2.27.0



* [PATCH v3 05/22] cachefiles: introduce new devnode for on-demand read mode
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (3 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 04/22] cachefiles: detect backing file size in on-demand read mode Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-15  9:03   ` JeffleXu
  2022-02-09  6:00 ` [PATCH v3 06/22] erofs: use meta buffers for erofs_read_superblock() Jeffle Xu
                   ` (17 subsequent siblings)
  22 siblings, 1 reply; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

This patch introduces a new devnode 'cachefiles_ondemand' to support the
newly introduced on-demand read mode.

The precondition for on-demand reading semantics is that all blob files
have been placed under the corresponding directory, with the correct
file size (as sparse files), at the very beginning. When the upper fs
starts to access a blob file, it will get a "cache miss" (hit a hole)
and then turn to the user daemon to prepare the data.

The interaction between the kernel and the user daemon is described
below.
1. On a cache miss, the .ondemand_read() callback of the corresponding
   fscache backend is called to prepare the data. As for cachefiles, it
   just packages the related metadata (file range to read, etc.) into a
   pending read request, and then the process that triggered the cache
   miss goes to sleep until the corresponding data gets fetched later.
2. The user daemon needs to poll on the devnode ('cachefiles_ondemand'),
   waiting for pending read requests.
3. Once there is a pending read request, the user daemon will be
   notified and shall read the devnode ('cachefiles_ondemand') to fetch
   one pending read request to process.
4. For the fetched read request, the user daemon needs to somehow
   prepare the data (e.g. download it from the remote end over the
   network) and then write the fetched data into the backing file to
   fill the hole.
5. After that, the user daemon needs to notify the cachefiles backend
   by writing a 'done' command to the devnode ('cachefiles_ondemand').
   This also wakes up the process that previously went to sleep on the
   cache miss.
6. By the time the process is woken up, the data is ready in the
   backing file. The process can then re-initiate a read request from
   the backing file.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/cachefiles/daemon.c                   | 173 +++++++++++++++++++++++
 fs/cachefiles/internal.h                 |  11 ++
 fs/cachefiles/io.c                       |  60 ++++++++
 fs/cachefiles/main.c                     |  27 ++++
 include/uapi/linux/cachefiles_ondemand.h |  14 ++
 5 files changed, 285 insertions(+)
 create mode 100644 include/uapi/linux/cachefiles_ondemand.h

diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 6b8d7c5bbe5d..977cf1a42c30 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -757,3 +757,176 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
 
 	_leave("");
 }
+
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+static unsigned long cachefiles_open_ondemand;
+
+static int cachefiles_ondemand_open(struct inode *inode, struct file *file);
+static int cachefiles_ondemand_release(struct inode *inode, struct file *file);
+static ssize_t cachefiles_ondemand_write(struct file *, const char __user *,
+					 size_t, loff_t *);
+static ssize_t cachefiles_ondemand_read(struct file *, char __user *, size_t,
+					loff_t *);
+static __poll_t cachefiles_ondemand_poll(struct file *,
+					 struct poll_table_struct *);
+static int cachefiles_daemon_done(struct cachefiles_cache *, char *);
+
+const struct file_operations cachefiles_ondemand_fops = {
+	.owner		= THIS_MODULE,
+	.open		= cachefiles_ondemand_open,
+	.release	= cachefiles_ondemand_release,
+	.read		= cachefiles_ondemand_read,
+	.write		= cachefiles_ondemand_write,
+	.poll		= cachefiles_ondemand_poll,
+	.llseek		= noop_llseek,
+};
+
+static const struct cachefiles_daemon_cmd cachefiles_ondemand_cmds[] = {
+	{ "bind",	cachefiles_daemon_bind		},
+	{ "brun",	cachefiles_daemon_brun		},
+	{ "bcull",	cachefiles_daemon_bcull		},
+	{ "bstop",	cachefiles_daemon_bstop		},
+	{ "cull",	cachefiles_daemon_cull		},
+	{ "debug",	cachefiles_daemon_debug		},
+	{ "dir",	cachefiles_daemon_dir		},
+	{ "frun",	cachefiles_daemon_frun		},
+	{ "fcull",	cachefiles_daemon_fcull		},
+	{ "fstop",	cachefiles_daemon_fstop		},
+	{ "inuse",	cachefiles_daemon_inuse		},
+	{ "secctx",	cachefiles_daemon_secctx	},
+	{ "tag",	cachefiles_daemon_tag		},
+	{ "done",	cachefiles_daemon_done		},
+	{ "",		NULL				}
+};
+
+static int cachefiles_ondemand_open(struct inode *inode, struct file *file)
+{
+	struct cachefiles_cache *cache;
+
+	_enter("");
+
+	/* only the superuser may do this */
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	/* the cachefiles device may only be open once at a time */
+	if (xchg(&cachefiles_open_ondemand, 1) == 1)
+		return -EBUSY;
+
+	cache = cachefiles_daemon_open_cache();
+	if (!cache) {
+		cachefiles_open_ondemand = 0;
+		return -ENOMEM;
+	}
+
+	xa_init_flags(&cache->reqs, XA_FLAGS_ALLOC);
+	set_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags);
+
+	file->private_data = cache;
+	cache->cachefilesd = file;
+	return 0;
+}
+
+static int cachefiles_ondemand_release(struct inode *inode, struct file *file)
+{
+	struct cachefiles_cache *cache = file->private_data;
+
+	_enter("");
+
+	ASSERT(cache);
+
+	set_bit(CACHEFILES_DEAD, &cache->flags);
+
+	cachefiles_daemon_unbind(cache);
+
+	/* clean up the control file interface */
+	xa_destroy(&cache->reqs);
+	cache->cachefilesd = NULL;
+	file->private_data = NULL;
+	cachefiles_open_ondemand = 0;
+
+	kfree(cache);
+
+	_leave("");
+	return 0;
+}
+
+static ssize_t cachefiles_ondemand_write(struct file *file,
+					 const char __user *_data,
+					 size_t datalen,
+					 loff_t *pos)
+{
+	return cachefiles_daemon_do_write(file, _data, datalen, pos,
+					  cachefiles_ondemand_cmds);
+}
+
+static ssize_t cachefiles_ondemand_read(struct file *file, char __user *_buffer,
+					size_t buflen, loff_t *pos)
+{
+	struct cachefiles_cache *cache = file->private_data;
+	struct cachefiles_req *req;
+	unsigned long id = 0;
+	int n;
+
+	if (!test_bit(CACHEFILES_READY, &cache->flags))
+		return 0;
+
+	req = xa_find(&cache->reqs, &id, UINT_MAX, XA_PRESENT);
+	if (!req)
+		return 0;
+
+	n = sizeof(struct cachefiles_req_in);
+	if (n > buflen)
+		return -EMSGSIZE;
+
+	req->base.id = id;
+	if (copy_to_user(_buffer, &req->base, n) != 0)
+		return -EFAULT;
+
+	return n;
+}
+
+static __poll_t cachefiles_ondemand_poll(struct file *file,
+					 struct poll_table_struct *poll)
+{
+	struct cachefiles_cache *cache = file->private_data;
+	__poll_t mask;
+
+	poll_wait(file, &cache->daemon_pollwq, poll);
+	mask = 0;
+
+	if (!xa_empty(&cache->reqs))
+		mask |= EPOLLIN;
+
+	return mask;
+}
+
+/*
+ * Request completion
+ * - command: "done <id>"
+ */
+static int cachefiles_daemon_done(struct cachefiles_cache *cache, char *args)
+{
+	struct cachefiles_req *req;
+	unsigned long id;
+	int ret;
+
+	_enter(",%s", args);
+
+	if (!*args) {
+		pr_err("Empty id specified\n");
+		return -EINVAL;
+	}
+
+	ret = kstrtoul(args, 0, &id);
+	if (ret)
+		return ret;
+
+	req = xa_erase(&cache->reqs, id);
+	if (!req)
+		return -EINVAL;
+
+	complete(&req->done);
+	return 0;
+}
+#endif
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 8400501bbd56..46259feba7ac 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -15,6 +15,8 @@
 #include <linux/fscache-cache.h>
 #include <linux/cred.h>
 #include <linux/security.h>
+#include <linux/xarray.h>
+#include <linux/cachefiles_ondemand.h>
 
 #define CACHEFILES_DIO_BLOCK_SIZE 4096
 
@@ -102,6 +104,14 @@ struct cachefiles_cache {
 	char				*rootdirname;	/* name of cache root directory */
 	char				*secctx;	/* LSM security context */
 	char				*tag;		/* cache binding tag */
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+	struct xarray			reqs;
+#endif
+};
+
+struct cachefiles_req {
+	struct cachefiles_req_in base;
+	struct completion done;
 };
 
 #include <trace/events/cachefiles.h>
@@ -146,6 +156,7 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache,
  * daemon.c
  */
 extern const struct file_operations cachefiles_daemon_fops;
+extern const struct file_operations cachefiles_ondemand_fops;
 
 /*
  * error_inject.c
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 753986ea1583..1d1a279e5be4 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -597,6 +597,63 @@ static void cachefiles_end_operation(struct netfs_cache_resources *cres)
 	fscache_end_cookie_access(fscache_cres_cookie(cres), fscache_access_io_end);
 }
 
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+static struct cachefiles_req *cachefiles_alloc_req(struct cachefiles_object *object,
+						   loff_t start_pos,
+						   size_t len)
+{
+	struct cachefiles_req *req;
+	struct cachefiles_req_in *base;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return NULL;
+
+	base = &req->base;
+
+	base->off = start_pos;
+	base->len = len;
+	strncpy(base->path, object->d_name, sizeof(base->path) - 1);
+
+	init_completion(&req->done);
+
+	return req;
+}
+
+int cachefiles_ondemand_read(struct netfs_cache_resources *cres,
+			     loff_t start_pos, size_t len)
+{
+	struct cachefiles_object *object;
+	struct cachefiles_cache *cache;
+	struct cachefiles_req *req;
+	int ret;
+	u32 id;
+
+	object = cachefiles_cres_object(cres);
+	cache = object->volume->cache;
+
+	if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags))
+		return -EOPNOTSUPP;
+
+	req = cachefiles_alloc_req(object, start_pos, len);
+	if (!req)
+		return -ENOMEM;
+
+	ret = xa_alloc(&cache->reqs, &id, req, xa_limit_32b, GFP_KERNEL);
+	if (ret) {
+		kfree(req);
+		return -ENOMEM;
+	}
+
+	wake_up_all(&cache->daemon_pollwq);
+
+	wait_for_completion(&req->done);
+	kfree(req);
+
+	return 0;
+}
+#endif
+
 static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
 	.end_operation		= cachefiles_end_operation,
 	.read			= cachefiles_read,
@@ -604,6 +661,9 @@ static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
 	.prepare_read		= cachefiles_prepare_read,
 	.prepare_write		= cachefiles_prepare_write,
 	.query_occupancy	= cachefiles_query_occupancy,
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+	.ondemand_read		= cachefiles_ondemand_read,
+#endif
 };
 
 /*
diff --git a/fs/cachefiles/main.c b/fs/cachefiles/main.c
index 3f369c6f816d..eab17c3140d9 100644
--- a/fs/cachefiles/main.c
+++ b/fs/cachefiles/main.c
@@ -39,6 +39,27 @@ static struct miscdevice cachefiles_dev = {
 	.fops	= &cachefiles_daemon_fops,
 };
 
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+static struct miscdevice cachefiles_ondemand_dev = {
+	.minor	= MISC_DYNAMIC_MINOR,
+	.name	= "cachefiles_ondemand",
+	.fops	= &cachefiles_ondemand_fops,
+};
+
+static inline int cachefiles_init_ondemand(void)
+{
+	return misc_register(&cachefiles_ondemand_dev);
+}
+
+static inline void cachefiles_exit_ondemand(void)
+{
+	misc_deregister(&cachefiles_ondemand_dev);
+}
+#else
+static inline int cachefiles_init_ondemand(void) { return 0; }
+static inline void cachefiles_exit_ondemand(void) {}
+#endif
+
 /*
  * initialise the fs caching module
  */
@@ -52,6 +73,9 @@ static int __init cachefiles_init(void)
 	ret = misc_register(&cachefiles_dev);
 	if (ret < 0)
 		goto error_dev;
+	ret = cachefiles_init_ondemand();
+	if (ret < 0)
+		goto error_ondemand_dev;
 
 	/* create an object jar */
 	ret = -ENOMEM;
@@ -68,6 +92,8 @@ static int __init cachefiles_init(void)
 	return 0;
 
 error_object_jar:
+	cachefiles_exit_ondemand();
+error_ondemand_dev:
 	misc_deregister(&cachefiles_dev);
 error_dev:
 	cachefiles_unregister_error_injection();
@@ -86,6 +112,7 @@ static void __exit cachefiles_exit(void)
 	pr_info("Unloading\n");
 
 	kmem_cache_destroy(cachefiles_object_jar);
+	cachefiles_exit_ondemand();
 	misc_deregister(&cachefiles_dev);
 	cachefiles_unregister_error_injection();
 }
diff --git a/include/uapi/linux/cachefiles_ondemand.h b/include/uapi/linux/cachefiles_ondemand.h
new file mode 100644
index 000000000000..e639a82f1098
--- /dev/null
+++ b/include/uapi/linux/cachefiles_ondemand.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _LINUX_CACHEFILES_ONDEMAND_H
+#define _LINUX_CACHEFILES_ONDEMAND_H
+
+#include <linux/limits.h>
+
+struct cachefiles_req_in {
+	uint64_t id;
+	uint64_t off;
+	uint64_t len;
+	char path[NAME_MAX];
+};
+
+#endif
-- 
2.27.0



* [PATCH v3 06/22] erofs: use meta buffers for erofs_read_superblock()
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (4 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 05/22] cachefiles: introduce new devnode for " Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  7:52   ` Gao Xiang
  2022-02-09  6:00 ` [PATCH v3 07/22] erofs: export erofs_map_blocks() Jeffle Xu
                   ` (16 subsequent siblings)
  22 siblings, 1 reply; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

The only change is that meta buffers read the cache page without the
__GFP_FS flag, which shall not matter.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/super.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 915eefe0d7e2..12755217631f 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -281,21 +281,19 @@ static int erofs_init_devices(struct super_block *sb,
 static int erofs_read_superblock(struct super_block *sb)
 {
 	struct erofs_sb_info *sbi;
-	struct page *page;
+	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
 	struct erofs_super_block *dsb;
 	unsigned int blkszbits;
 	void *data;
 	int ret;
 
-	page = read_mapping_page(sb->s_bdev->bd_inode->i_mapping, 0, NULL);
-	if (IS_ERR(page)) {
+	data = erofs_read_metabuf(&buf, sb, 0, EROFS_KMAP);
+	if (IS_ERR(data)) {
 		erofs_err(sb, "cannot read erofs superblock");
-		return PTR_ERR(page);
+		return PTR_ERR(data);
 	}
 
 	sbi = EROFS_SB(sb);
-
-	data = kmap(page);
 	dsb = (struct erofs_super_block *)(data + EROFS_SUPER_OFFSET);
 
 	ret = -EINVAL;
@@ -365,8 +363,7 @@ static int erofs_read_superblock(struct super_block *sb)
 	if (erofs_sb_has_ztailpacking(sbi))
 		erofs_info(sb, "EXPERIMENTAL compressed inline data feature in use. Use at your own risk!");
 out:
-	kunmap(page);
-	put_page(page);
+	erofs_put_metabuf(&buf);
 	return ret;
 }
 
-- 
2.27.0



* [PATCH v3 07/22] erofs: export erofs_map_blocks()
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (5 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 06/22] erofs: use meta buffers for erofs_read_superblock() Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 08/22] erofs: add mode checking helper Jeffle Xu
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

... so that it can be used in the following introduced fs/erofs/fscache.c.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/data.c     | 4 ++--
 fs/erofs/internal.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 226a57c57ee6..6e2a28242453 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -104,8 +104,8 @@ static int erofs_map_blocks_flatmode(struct inode *inode,
 	return 0;
 }
 
-static int erofs_map_blocks(struct inode *inode,
-			    struct erofs_map_blocks *map, int flags)
+int erofs_map_blocks(struct inode *inode,
+		     struct erofs_map_blocks *map, int flags)
 {
 	struct super_block *sb = inode->i_sb;
 	struct erofs_inode *vi = EROFS_I(inode);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index b8272fb95fd6..f9f94d63d40f 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -484,6 +484,8 @@ void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb,
 int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *dev);
 int erofs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		 u64 start, u64 len);
+int erofs_map_blocks(struct inode *inode,
+		     struct erofs_map_blocks *map, int flags);
 
 /* inode.c */
 static inline unsigned long erofs_inode_hash(erofs_nid_t nid)
-- 
2.27.0



* [PATCH v3 08/22] erofs: add mode checking helper
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (6 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 07/22] erofs: export erofs_map_blocks() Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 09/22] erofs: register global fscache volume Jeffle Xu
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Until now, erofs has been strictly a blockdev-based filesystem. In
other use cases (e.g. container images), erofs needs to run upon files.

This patch set introduces a new nodev mode, in which erofs can be
mounted from a bootstrap blob file containing a complete erofs image.

Add a helper for checking which mode erofs works in.
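A hypothetical caller sketch (not part of this patch; the two helpers
named below are placeholders) shows how later patches are expected to
branch on the mode when deciding where to read from:

void *read_from_bdev(struct super_block *sb, erofs_blk_t blkaddr);	  /* placeholder */
void *read_from_blob_cache(struct super_block *sb, erofs_blk_t blkaddr); /* placeholder */

static void *erofs_read_block(struct super_block *sb, erofs_blk_t blkaddr)
{
	if (erofs_bdev_mode(sb))
		return read_from_bdev(sb, blkaddr);	/* existing blkdev path */
	return read_from_blob_cache(sb, blkaddr);	/* new nodev (fscache) path */
}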

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/internal.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index f9f94d63d40f..2b9337d385ce 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -161,6 +161,11 @@ struct erofs_sb_info {
 #define set_opt(opt, option)	((opt)->mount_opt |= EROFS_MOUNT_##option)
 #define test_opt(opt, option)	((opt)->mount_opt & EROFS_MOUNT_##option)
 
+static inline bool erofs_bdev_mode(struct super_block *sb)
+{
+	return sb->s_bdev;
+}
+
 enum {
 	EROFS_ZIP_CACHE_DISABLED,
 	EROFS_ZIP_CACHE_READAHEAD,
-- 
2.27.0



* [PATCH v3 09/22] erofs: register global fscache volume
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (7 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 08/22] erofs: add mode checking helper Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 10/22] erofs: add cookie context helper functions Jeffle Xu
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

All erofs instances will share one global fscache volume.

In this use case, one erofs instance can be mounted from one (or
multiple) blob files instead of a blkdev. The number of blob files that
each erofs instance corresponds to is limited, since these blob files
are quite large. For example, when used for container image
distribution, one erofs instance for a node.js container image will
correspond to ~20 blob files in total. Thus in a densely deployed
environment, there could be as many as hundreds of containers and thus
thousands of fscache cookies under one fscache volume.

As for the cachefiles backend, the hash table managing all cookies
under one volume contains 32K slots, so the hashing functionality shall
scale well in this case. Besides, the cachefiles backend scatters
backing files under 256 fan sub-directories, so the scalability of
looking up backing files shall not be an issue either.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/Makefile   |  3 ++-
 fs/erofs/fscache.c  | 21 +++++++++++++++++++++
 fs/erofs/internal.h |  5 +++++
 fs/erofs/super.c    |  7 +++++++
 4 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 fs/erofs/fscache.c

diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
index 8a3317e38e5a..21999e8a4728 100644
--- a/fs/erofs/Makefile
+++ b/fs/erofs/Makefile
@@ -1,7 +1,8 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 obj-$(CONFIG_EROFS_FS) += erofs.o
-erofs-objs := super.o inode.o data.o namei.o dir.o utils.o pcpubuf.o sysfs.o
+erofs-objs := super.o inode.o data.o namei.o dir.o utils.o pcpubuf.o sysfs.o \
+	      fscache.o
 erofs-$(CONFIG_EROFS_FS_XATTR) += xattr.o
 erofs-$(CONFIG_EROFS_FS_ZIP) += decompressor.o zmap.o zdata.o
 erofs-$(CONFIG_EROFS_FS_ZIP_LZMA) += decompressor_lzma.o
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
new file mode 100644
index 000000000000..9c32f42e1056
--- /dev/null
+++ b/fs/erofs/fscache.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2021, Alibaba Cloud
+ */
+#include "internal.h"
+
+static struct fscache_volume *volume;
+
+int __init erofs_init_fscache(void)
+{
+	volume = fscache_acquire_volume("erofs", NULL, NULL, 0);
+	if (!volume)
+		return -EINVAL;
+
+	return 0;
+}
+
+void erofs_exit_fscache(void)
+{
+	fscache_relinquish_volume(volume, NULL, false);
+}
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 2b9337d385ce..c2608a469107 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -17,6 +17,7 @@
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 #include <linux/iomap.h>
+#include <linux/fscache.h>
 #include "erofs_fs.h"
 
 /* redefine pr_fmt "erofs: " */
@@ -616,6 +617,10 @@ static inline int z_erofs_load_lzma_config(struct super_block *sb,
 }
 #endif	/* !CONFIG_EROFS_FS_ZIP */
 
+/* fscache.c */
+int erofs_init_fscache(void);
+void erofs_exit_fscache(void);
+
 #define EFSCORRUPTED    EUCLEAN         /* Filesystem is corrupted */
 
 #endif	/* __EROFS_INTERNAL_H */
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 12755217631f..798f0c379e35 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -814,6 +814,10 @@ static int __init erofs_module_init(void)
 	if (err)
 		goto sysfs_err;
 
+	err = erofs_init_fscache();
+	if (err)
+		goto fscache_err;
+
 	err = register_filesystem(&erofs_fs_type);
 	if (err)
 		goto fs_err;
@@ -821,6 +825,8 @@ static int __init erofs_module_init(void)
 	return 0;
 
 fs_err:
+	erofs_exit_fscache();
+fscache_err:
 	erofs_exit_sysfs();
 sysfs_err:
 	z_erofs_exit_zip_subsystem();
@@ -841,6 +847,7 @@ static void __exit erofs_module_exit(void)
 	/* Ensure all RCU free inodes / pclusters are safe to be destroyed. */
 	rcu_barrier();
 
+	erofs_exit_fscache();
 	erofs_exit_sysfs();
 	z_erofs_exit_zip_subsystem();
 	z_erofs_lzma_exit();
-- 
2.27.0



* [PATCH v3 10/22] erofs: add cookie context helper functions
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (8 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 09/22] erofs: register global fscache volume Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 11/22] erofs: add anonymous inode managing page cache of blob file Jeffle Xu
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Introduce 'struct erofs_fscache_context' for managing the cookie of a
backing file; it will be used by the read interface for backing files
introduced in the following patches.

Besides, introduce two helper functions for initializing and cleaning
up an erofs_fscache_context:

struct erofs_fscache_context *
erofs_fscache_get_ctx(struct super_block *sb, char *path);

void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx);
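
For illustration (not code from this patch; "bootstrap" stands in for
the blob path key, and the real callers arrive in later patches), usage
would roughly look like:

static int example_use(struct super_block *sb)
{
	struct erofs_fscache_context *ctx;

	ctx = erofs_fscache_get_ctx(sb, "bootstrap");
	if (IS_ERR(ctx))
		return PTR_ERR(ctx);

	/* ... read blob data through ctx->cookie ... */

	erofs_fscache_put_ctx(ctx);
	return 0;
}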

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/fscache.c  | 78 +++++++++++++++++++++++++++++++++++++++++++++
 fs/erofs/internal.h |  8 +++++
 2 files changed, 86 insertions(+)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 9c32f42e1056..c043d7709d65 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -6,6 +6,84 @@
 
 static struct fscache_volume *volume;
 
+static int erofs_fscache_init_cookie(struct erofs_fscache_context *ctx,
+				     char *path)
+{
+	struct fscache_cookie *cookie;
+
+	/*
+	 * @object_size shall be non-zero to avoid
+	 * FSCACHE_COOKIE_NO_DATA_TO_READ.
+	 */
+	cookie = fscache_acquire_cookie(volume, 0,
+					path, strlen(path),
+					NULL, 0, -1);
+	if (!cookie)
+		return -EINVAL;
+
+	fscache_use_cookie(cookie, false);
+	ctx->cookie = cookie;
+	return 0;
+}
+
+static inline
+void erofs_fscache_cleanup_cookie(struct erofs_fscache_context *ctx)
+{
+	struct fscache_cookie *cookie = ctx->cookie;
+
+	fscache_unuse_cookie(cookie, NULL, NULL);
+	fscache_relinquish_cookie(cookie, false);
+	ctx->cookie = NULL;
+}
+
+static int erofs_fscache_init_ctx(struct erofs_fscache_context *ctx,
+				  struct super_block *sb, char *path)
+{
+	int ret;
+
+	ret = erofs_fscache_init_cookie(ctx, path);
+	if (ret) {
+		erofs_err(sb, "failed to init cookie");
+		return ret;
+	}
+
+	return 0;
+}
+
+static inline
+void erofs_fscache_cleanup_ctx(struct erofs_fscache_context *ctx)
+{
+	erofs_fscache_cleanup_cookie(ctx);
+}
+
+struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb,
+						char *path)
+{
+	struct erofs_fscache_context *ctx;
+	int ret;
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return ERR_PTR(-ENOMEM);
+
+	ret = erofs_fscache_init_ctx(ctx, sb, path);
+	if (ret) {
+		kfree(ctx);
+		return ERR_PTR(ret);
+	}
+
+	return ctx;
+}
+
+void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx)
+{
+	if (!ctx)
+		return;
+
+	erofs_fscache_cleanup_ctx(ctx);
+	kfree(ctx);
+}
+
 int __init erofs_init_fscache(void)
 {
 	volume = fscache_acquire_volume("erofs", NULL, NULL, 0);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index c2608a469107..1f5bc69e8e9f 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -97,6 +97,10 @@ struct erofs_sb_lz4_info {
 	u16 max_pclusterblks;
 };
 
+struct erofs_fscache_context {
+	struct fscache_cookie *cookie;
+};
+
 struct erofs_sb_info {
 	struct erofs_mount_opts opt;	/* options */
 #ifdef CONFIG_EROFS_FS_ZIP
@@ -621,6 +625,10 @@ static inline int z_erofs_load_lzma_config(struct super_block *sb,
 int erofs_init_fscache(void);
 void erofs_exit_fscache(void);
 
+struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb,
+						char *path);
+void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx);
+
 #define EFSCORRUPTED    EUCLEAN         /* Filesystem is corrupted */
 
 #endif	/* __EROFS_INTERNAL_H */
-- 
2.27.0



* [PATCH v3 11/22] erofs: add anonymous inode managing page cache of blob file
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (9 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 10/22] erofs: add cookie context helper functions Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 12/22] erofs: add erofs_fscache_read_page() helper Jeffle Xu
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Introduce an anonymous inode for managing the page cache of the
corresponding blob file. Then erofs can read directly from the address
space of the anonymous inode on a cache hit.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/fscache.c  | 45 ++++++++++++++++++++++++++++++++++++++++++---
 fs/erofs/internal.h |  3 ++-
 2 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index c043d7709d65..3addd9aa549c 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -6,6 +6,9 @@
 
 static struct fscache_volume *volume;
 
+static const struct address_space_operations erofs_fscache_blob_aops = {
+};
+
 static int erofs_fscache_init_cookie(struct erofs_fscache_context *ctx,
 				     char *path)
 {
@@ -36,8 +39,34 @@ void erofs_fscache_cleanup_cookie(struct erofs_fscache_context *ctx)
 	ctx->cookie = NULL;
 }
 
+static int erofs_fscache_get_inode(struct erofs_fscache_context *ctx,
+				   struct super_block *sb)
+{
+	struct inode *const inode = new_inode(sb);
+
+	if (!inode)
+		return -ENOMEM;
+
+	set_nlink(inode, 1);
+	inode->i_size = OFFSET_MAX;
+
+	inode->i_mapping->a_ops = &erofs_fscache_blob_aops;
+	mapping_set_gfp_mask(inode->i_mapping,
+			GFP_NOFS | __GFP_HIGHMEM | __GFP_MOVABLE);
+	ctx->inode = inode;
+	return 0;
+}
+
+static inline
+void erofs_fscache_put_inode(struct erofs_fscache_context *ctx)
+{
+	iput(ctx->inode);
+	ctx->inode = NULL;
+}
+
 static int erofs_fscache_init_ctx(struct erofs_fscache_context *ctx,
-				  struct super_block *sb, char *path)
+				  struct super_block *sb, char *path,
+				  bool need_inode)
 {
 	int ret;
 
@@ -47,6 +76,15 @@ static int erofs_fscache_init_ctx(struct erofs_fscache_context *ctx,
 		return ret;
 	}
 
+	if (need_inode) {
+		ret = erofs_fscache_get_inode(ctx, sb);
+		if (ret) {
+			erofs_err(sb, "failed to get anonymous inode");
+			erofs_fscache_cleanup_cookie(ctx);
+			return ret;
+		}
+	}
+
 	return 0;
 }
 
@@ -54,10 +92,11 @@ static inline
 void erofs_fscache_cleanup_ctx(struct erofs_fscache_context *ctx)
 {
 	erofs_fscache_cleanup_cookie(ctx);
+	erofs_fscache_put_inode(ctx);
 }
 
 struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb,
-						char *path)
+						char *path, bool need_inode)
 {
 	struct erofs_fscache_context *ctx;
 	int ret;
@@ -66,7 +105,7 @@ struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb,
 	if (!ctx)
 		return ERR_PTR(-ENOMEM);
 
-	ret = erofs_fscache_init_ctx(ctx, sb, path);
+	ret = erofs_fscache_init_ctx(ctx, sb, path, need_inode);
 	if (ret) {
 		kfree(ctx);
 		return ERR_PTR(ret);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 1f5bc69e8e9f..bb5e992fe0df 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -99,6 +99,7 @@ struct erofs_sb_lz4_info {
 
 struct erofs_fscache_context {
 	struct fscache_cookie *cookie;
+	struct inode *inode;
 };
 
 struct erofs_sb_info {
@@ -626,7 +627,7 @@ int erofs_init_fscache(void);
 void erofs_exit_fscache(void);
 
 struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb,
-						char *path);
+						char *path, bool need_inode);
 void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx);
 
 #define EFSCORRUPTED    EUCLEAN         /* Filesystem is corrupted */
-- 
2.27.0



* [PATCH v3 12/22] erofs: add erofs_fscache_read_page() helper
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (10 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 11/22] erofs: add anonymous inode managing page cache of blob file Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  6:00 ` [PATCH v3 13/22] erofs: register cookie context for bootstrap blob Jeffle Xu
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Add an erofs_fscache_read_page() helper that reads from fscache. It
supports on-demand read semantics: on cache miss, it asks the backend to
prepare the data, and once the data is ready it reinitiates a read from
the cache.

This helper can then be used to implement .readpage() with on-demand
reading semantics.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/fscache.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 3addd9aa549c..f4aade711664 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -6,6 +6,42 @@
 
 static struct fscache_volume *volume;
 
+static int erofs_fscache_read_page(struct fscache_cookie *cookie,
+				   struct page *page, loff_t start_pos)
+{
+	struct netfs_cache_resources cres;
+	struct bio_vec bvec[1];
+	struct iov_iter iter;
+	int ret;
+
+	memset(&cres, 0, sizeof(cres));
+
+	ret = fscache_begin_read_operation(&cres, cookie);
+	if (ret)
+		return ret;
+
+	bvec[0].bv_page         = page;
+	bvec[0].bv_offset       = 0;
+	bvec[0].bv_len          = PAGE_SIZE;
+	iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
+
+	ret = fscache_read(&cres, start_pos, &iter,
+			   NETFS_READ_HOLE_FAIL, NULL, NULL);
+	/*
+	 * -ENODATA will be returned when cache miss. In this case, make the
+	 * backend prepare for the data and then reinitiate a read from cache.
+	 */
+	if (ret == -ENODATA) {
+		ret = fscache_ondemand_read(&cres, start_pos, PAGE_SIZE);
+		if (ret == 0)
+			ret = fscache_read(&cres, start_pos, &iter,
+					   NETFS_READ_HOLE_FAIL, NULL, NULL);
+	}
+
+	fscache_end_operation(&cres);
+	return ret;
+}
+
 static const struct address_space_operations erofs_fscache_blob_aops = {
 };
 
-- 
2.27.0



* [PATCH v3 13/22] erofs: register cookie context for bootstrap blob
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (11 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 12/22] erofs: add erofs_fscache_read_page() helper Jeffle Xu
@ 2022-02-09  6:00 ` Jeffle Xu
  2022-02-09  6:01 ` [PATCH v3 14/22] erofs: implement fscache-based metadata read Jeffle Xu
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:00 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Register an fscache_cookie for the bootstrap blob file. The bootstrap
blob file can be specified by a new mount option, which is going to be
introduced by a following patch.

Two points are worth mentioning about the cleanup routine:

1. The init routine runs before the root inode gets initialized, and
thus the corresponding cleanup routine shall be placed under the
.kill_sb() callback.

2. The init routine instantiates anonymous inodes under the
super_block, and thus the .put_super() callback shall also contain the
cleanup routine; otherwise we'll get a "VFS: Busy inodes after unmount."
warning.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/internal.h |  3 +++
 fs/erofs/super.c    | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index bb5e992fe0df..277dcd5888ea 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -75,6 +75,7 @@ struct erofs_mount_opts {
 	unsigned int max_sync_decompress_pages;
 #endif
 	unsigned int mount_opt;
+	char *uuid;
 };
 
 struct erofs_dev_context {
@@ -152,6 +153,8 @@ struct erofs_sb_info {
 	/* sysfs support */
 	struct kobject s_kobj;		/* /sys/fs/erofs/<devname> */
 	struct completion s_kobj_unregister;
+
+	struct erofs_fscache_context *bootstrap;
 };
 
 #define EROFS_SB(sb) ((struct erofs_sb_info *)(sb)->s_fs_info)
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 798f0c379e35..8c5783c6f71f 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -598,6 +598,16 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 	sbi->devs = ctx->devs;
 	ctx->devs = NULL;
 
+	if (!erofs_bdev_mode(sb)) {
+		struct erofs_fscache_context *bootstrap;
+
+		bootstrap = erofs_fscache_get_ctx(sb, ctx->opt.uuid, true);
+		if (IS_ERR(bootstrap))
+			return PTR_ERR(bootstrap);
+
+		sbi->bootstrap = bootstrap;
+	}
+
 	err = erofs_read_superblock(sb);
 	if (err)
 		return err;
@@ -753,6 +763,7 @@ static void erofs_kill_sb(struct super_block *sb)
 		return;
 
 	erofs_free_dev_context(sbi->devs);
+	erofs_fscache_put_ctx(sbi->bootstrap);
 	fs_put_dax(sbi->dax_dev);
 	kfree(sbi);
 	sb->s_fs_info = NULL;
@@ -771,6 +782,8 @@ static void erofs_put_super(struct super_block *sb)
 	iput(sbi->managed_cache);
 	sbi->managed_cache = NULL;
 #endif
+	erofs_fscache_put_ctx(sbi->bootstrap);
+	sbi->bootstrap = NULL;
 }
 
 static struct file_system_type erofs_fs_type = {
-- 
2.27.0



* [PATCH v3 14/22] erofs: implement fscache-based metadata read
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (12 preceding siblings ...)
  2022-02-09  6:00 ` [PATCH v3 13/22] erofs: register cookie context for bootstrap blob Jeffle Xu
@ 2022-02-09  6:01 ` Jeffle Xu
  2022-02-09  6:01 ` [PATCH v3 15/22] erofs: implement fscache-based data read for non-inline layout Jeffle Xu
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:01 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

This patch implements the data plane of reading metadata from the
bootstrap blob file over fscache.

Note that currently it only supports the scenario where the backing
file has no hole. Once it hits a hole in the backing file, erofs will
fail the IO with -EOPNOTSUPP for now. The following patch will fix this
issue by implementing the on-demand reading mode.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/data.c     | 11 +++++++++--
 fs/erofs/fscache.c  | 24 ++++++++++++++++++++++++
 fs/erofs/internal.h |  3 +++
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 6e2a28242453..1bff99576883 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -31,15 +31,22 @@ void erofs_put_metabuf(struct erofs_buf *buf)
 void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb,
 			erofs_blk_t blkaddr, enum erofs_kmap_type type)
 {
-	struct address_space *const mapping = sb->s_bdev->bd_inode->i_mapping;
+	struct address_space *mapping;
+	struct erofs_sb_info *sbi = EROFS_SB(sb);
 	erofs_off_t offset = blknr_to_addr(blkaddr);
 	pgoff_t index = offset >> PAGE_SHIFT;
 	struct page *page = buf->page;
 
 	if (!page || page->index != index) {
 		erofs_put_metabuf(buf);
-		page = read_cache_page_gfp(mapping, index,
+		if (erofs_bdev_mode(sb)) {
+			mapping = sb->s_bdev->bd_inode->i_mapping;
+			page = read_cache_page_gfp(mapping, index,
 				mapping_gfp_constraint(mapping, ~__GFP_FS));
+		} else {
+			page = erofs_fscache_read_cache_page(sbi->bootstrap,
+				index);
+		}
 		if (IS_ERR(page))
 			return page;
 		/* should already be PageUptodate, no need to lock page */
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index f4aade711664..a29d2ecff58b 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -42,9 +42,33 @@ static int erofs_fscache_read_page(struct fscache_cookie *cookie,
 	return ret;
 }
 
+static int erofs_fscache_readpage_blob(struct file *data, struct page *page)
+{
+	int ret;
+	struct erofs_fscache_context *ctx =
+		(struct erofs_fscache_context *)data;
+
+	ret = erofs_fscache_read_page(ctx->cookie, page, page_offset(page));
+	if (ret)
+		SetPageError(page);
+	else
+		SetPageUptodate(page);
+
+	unlock_page(page);
+	return ret;
+}
+
 static const struct address_space_operations erofs_fscache_blob_aops = {
+	.readpage = erofs_fscache_readpage_blob,
 };
 
+struct page *erofs_fscache_read_cache_page(struct erofs_fscache_context *ctx,
+					   pgoff_t index)
+{
+	DBG_BUGON(!ctx->inode);
+	return read_mapping_page(ctx->inode->i_mapping, index, ctx);
+}
+
 static int erofs_fscache_init_cookie(struct erofs_fscache_context *ctx,
 				     char *path)
 {
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 277dcd5888ea..fca706cfaf72 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -633,6 +633,9 @@ struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb,
 						char *path, bool need_inode);
 void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx);
 
+struct page *erofs_fscache_read_cache_page(struct erofs_fscache_context *ctx,
+					   pgoff_t index);
+
 #define EFSCORRUPTED    EUCLEAN         /* Filesystem is corrupted */
 
 #endif	/* __EROFS_INTERNAL_H */
-- 
2.27.0



* [PATCH v3 15/22] erofs: implement fscache-based data read for non-inline layout
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (13 preceding siblings ...)
  2022-02-09  6:01 ` [PATCH v3 14/22] erofs: implement fscache-based metadata read Jeffle Xu
@ 2022-02-09  6:01 ` Jeffle Xu
  2022-02-09  6:01 ` [PATCH v3 16/22] erofs: implement fscache-based data read for inline layout Jeffle Xu
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:01 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

This patch implements the data plane of reading data from the bootstrap
blob file over fscache for the non-inline layout.

Note that the compressed layout is not supported yet.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/fscache.c  | 94 +++++++++++++++++++++++++++++++++++++++++++++
 fs/erofs/inode.c    |  6 ++-
 fs/erofs/internal.h |  1 +
 3 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index a29d2ecff58b..82fdde054b0b 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -4,6 +4,12 @@
  */
 #include "internal.h"
 
+struct erofs_fscache_map {
+	struct erofs_fscache_context *m_ctx;
+	erofs_off_t m_pa, m_la, o_la;
+	u64 m_llen;
+};
+
 static struct fscache_volume *volume;
 
 static int erofs_fscache_read_page(struct fscache_cookie *cookie,
@@ -58,10 +64,98 @@ static int erofs_fscache_readpage_blob(struct file *data, struct page *page)
 	return ret;
 }
 
+static inline int erofs_fscache_get_map(struct erofs_fscache_map *fsmap,
+					struct erofs_map_blocks *map,
+					struct super_block *sb)
+{
+	struct erofs_sb_info *sbi = EROFS_SB(sb);
+
+	fsmap->m_ctx  = sbi->bootstrap;
+	fsmap->m_la   = map->m_la;
+	fsmap->m_pa   = map->m_pa;
+	fsmap->m_llen = map->m_llen;
+
+	return 0;
+}
+
+static int erofs_fscache_readpage_noinline(struct page *page,
+					   struct erofs_fscache_map *fsmap)
+{
+	struct fscache_cookie *cookie = fsmap->m_ctx->cookie;
+	/*
+	 * 1) For FLAT_PLAIN layout, the output map.m_la shall be equal to o_la,
+	 * and the output map.m_pa is exactly the physical address of o_la.
+	 * 2) For CHUNK_BASED layout, the output map.m_la is rounded down to the
+	 * nearest chunk boundary, and the output map.m_pa is actually the
+	 * physical address of this chunk boundary. So we need to recalculate
+	 * the actual physical address of o_la.
+	 */
+	loff_t start = fsmap->m_pa + fsmap->o_la - fsmap->m_la;
+
+	return erofs_fscache_read_page(cookie, page, start);
+}
+
+static int erofs_fscache_do_readpage(struct page *page)
+{
+	struct inode *inode = page->mapping->host;
+	struct erofs_inode *vi = EROFS_I(inode);
+	struct super_block *sb = inode->i_sb;
+	struct erofs_map_blocks map;
+	struct erofs_fscache_map fsmap;
+	int ret;
+
+	if (erofs_inode_is_data_compressed(vi->datalayout)) {
+		erofs_info(sb, "compressed layout not supported yet");
+		return -EOPNOTSUPP;
+	}
+
+	map.m_la = fsmap.o_la = page_offset(page);
+
+	ret = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
+	if (ret)
+		return ret;
+
+	if (!(map.m_flags & EROFS_MAP_MAPPED)) {
+		zero_user(page, 0, PAGE_SIZE);
+		return 0;
+	}
+
+	ret = erofs_fscache_get_map(&fsmap, &map, sb);
+	if (ret)
+		return ret;
+
+	switch (vi->datalayout) {
+	case EROFS_INODE_FLAT_PLAIN:
+	case EROFS_INODE_CHUNK_BASED:
+		return erofs_fscache_readpage_noinline(page, &fsmap);
+	default:
+		DBG_BUGON(1);
+		return -EOPNOTSUPP;
+	}
+}
+
+static int erofs_fscache_readpage(struct file *file, struct page *page)
+{
+	int ret;
+
+	ret = erofs_fscache_do_readpage(page);
+	if (ret)
+		SetPageError(page);
+	else
+		SetPageUptodate(page);
+
+	unlock_page(page);
+	return ret;
+}
+
 static const struct address_space_operations erofs_fscache_blob_aops = {
 	.readpage = erofs_fscache_readpage_blob,
 };
 
+const struct address_space_operations erofs_fscache_access_aops = {
+	.readpage = erofs_fscache_readpage,
+};
+
 struct page *erofs_fscache_read_cache_page(struct erofs_fscache_context *ctx,
 					   pgoff_t index)
 {
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index ff62f84f47d3..2f450cb3a7b9 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -296,7 +296,11 @@ static int erofs_fill_inode(struct inode *inode, int isdir)
 		err = z_erofs_fill_inode(inode);
 		goto out_unlock;
 	}
-	inode->i_mapping->a_ops = &erofs_raw_access_aops;
+
+	if (erofs_bdev_mode(inode->i_sb))
+		inode->i_mapping->a_ops = &erofs_raw_access_aops;
+	else
+		inode->i_mapping->a_ops = &erofs_fscache_access_aops;
 
 out_unlock:
 	erofs_put_metabuf(&buf);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index fca706cfaf72..548f928b0ded 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -393,6 +393,7 @@ struct page *erofs_grab_cache_page_nowait(struct address_space *mapping,
 extern const struct super_operations erofs_sops;
 
 extern const struct address_space_operations erofs_raw_access_aops;
+extern const struct address_space_operations erofs_fscache_access_aops;
 extern const struct address_space_operations z_erofs_aops;
 
 /*
-- 
2.27.0



* [PATCH v3 16/22] erofs: implement fscache-based data read for inline layout
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (14 preceding siblings ...)
  2022-02-09  6:01 ` [PATCH v3 15/22] erofs: implement fscache-based data read for non-inline layout Jeffle Xu
@ 2022-02-09  6:01 ` Jeffle Xu
  2022-02-09  6:01 ` [PATCH v3 17/22] erofs: register cookie context for data blobs Jeffle Xu
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:01 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

This patch implements the data plane of reading data from the bootstrap
blob file over fscache for the inline layout.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/fscache.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 82fdde054b0b..fcd686f4dc9f 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -95,6 +95,41 @@ static int erofs_fscache_readpage_noinline(struct page *page,
 	return erofs_fscache_read_page(cookie, page, start);
 }
 
+static int erofs_fscache_readpage_inline(struct page *page,
+					 struct erofs_fscache_map *fsmap)
+{
+	struct inode *inode = page->mapping->host;
+	struct super_block *sb = inode->i_sb;
+	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
+	erofs_blk_t blknr;
+	size_t offset, len;
+	void *src, *dst;
+
+	/*
+	 * For inline (tail packing) layout, the offset may be non-zero, while
+	 * the offset can be calculated from corresponding physical address
+	 * directly.
+	 * Currently only flat layout supports inline (FLAT_INLINE), and the
+	 * output map.m_pa is exactly the physical address of o_la in this case.
+	 */
+	offset = erofs_blkoff(fsmap->m_pa);
+	blknr = erofs_blknr(fsmap->m_pa);
+	len = fsmap->m_llen;
+
+	src = erofs_read_metabuf(&buf, sb, blknr, EROFS_KMAP);
+	if (IS_ERR(src))
+		return PTR_ERR(src);
+
+	dst = kmap(page);
+	memcpy(dst, src + offset, len);
+	memset(dst + len, 0, PAGE_SIZE - len);
+	kunmap(page);
+
+	erofs_put_metabuf(&buf);
+
+	return 0;
+}
+
 static int erofs_fscache_do_readpage(struct page *page)
 {
 	struct inode *inode = page->mapping->host;
@@ -128,6 +163,8 @@ static int erofs_fscache_do_readpage(struct page *page)
 	case EROFS_INODE_FLAT_PLAIN:
 	case EROFS_INODE_CHUNK_BASED:
 		return erofs_fscache_readpage_noinline(page, &fsmap);
+	case EROFS_INODE_FLAT_INLINE:
+		return erofs_fscache_readpage_inline(page, &fsmap);
 	default:
 		DBG_BUGON(1);
 		return -EOPNOTSUPP;
-- 
2.27.0



* [PATCH v3 17/22] erofs: register cookie context for data blobs
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (15 preceding siblings ...)
  2022-02-09  6:01 ` [PATCH v3 16/22] erofs: implement fscache-based data read for inline layout Jeffle Xu
@ 2022-02-09  6:01 ` Jeffle Xu
  2022-02-09  6:01 ` [PATCH v3 18/22] erofs: implement fscache-based data read " Jeffle Xu
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:01 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Similar to the multi-device mode, erofs can be mounted from multiple
blob files (one bootstrap blob file and optionally multiple data blob
files). In this case, each device slot contains the path of the
corresponding data blob file.

This patch registers a corresponding cookie context for each data blob
file.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/internal.h |  1 +
 fs/erofs/super.c    | 27 +++++++++++++++++++--------
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 548f928b0ded..5d514c7b73cc 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -53,6 +53,7 @@ struct erofs_device_info {
 	struct block_device *bdev;
 	struct dax_device *dax_dev;
 	u64 dax_part_off;
+	struct erofs_fscache_context *ctx;
 
 	u32 blocks;
 	u32 mapped_blkaddr;
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 8c5783c6f71f..f058a04a00c7 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -250,6 +250,7 @@ static int erofs_init_devices(struct super_block *sb,
 	down_read(&sbi->devs->rwsem);
 	idr_for_each_entry(&sbi->devs->tree, dif, id) {
 		struct block_device *bdev;
+		struct erofs_fscache_context *ctx;
 
 		ptr = erofs_read_metabuf(&buf, sb, erofs_blknr(pos),
 					 EROFS_KMAP);
@@ -259,15 +260,24 @@ static int erofs_init_devices(struct super_block *sb,
 		}
 		dis = ptr + erofs_blkoff(pos);
 
-		bdev = blkdev_get_by_path(dif->path,
-					  FMODE_READ | FMODE_EXCL,
-					  sb->s_type);
-		if (IS_ERR(bdev)) {
-			err = PTR_ERR(bdev);
-			break;
+		if (erofs_bdev_mode(sb)) {
+			bdev = blkdev_get_by_path(dif->path,
+						  FMODE_READ | FMODE_EXCL,
+						  sb->s_type);
+			if (IS_ERR(bdev)) {
+				err = PTR_ERR(bdev);
+				break;
+			}
+			dif->bdev = bdev;
+			dif->dax_dev = fs_dax_get_by_bdev(bdev, &dif->dax_part_off);
+		} else {
+			ctx = erofs_fscache_get_ctx(sb, dif->path, false);
+			if (IS_ERR(ctx)) {
+				err = PTR_ERR(ctx);
+				break;
+			}
+			dif->ctx = ctx;
 		}
-		dif->bdev = bdev;
-		dif->dax_dev = fs_dax_get_by_bdev(bdev, &dif->dax_part_off);
 		dif->blocks = le32_to_cpu(dis->blocks);
 		dif->mapped_blkaddr = le32_to_cpu(dis->mapped_blkaddr);
 		sbi->total_blocks += dif->blocks;
@@ -694,6 +704,7 @@ static int erofs_release_device_info(int id, void *ptr, void *data)
 {
 	struct erofs_device_info *dif = ptr;
 
+	erofs_fscache_put_ctx(dif->ctx);
 	fs_put_dax(dif->dax_dev);
 	if (dif->bdev)
 		blkdev_put(dif->bdev, FMODE_READ | FMODE_EXCL);
-- 
2.27.0



* [PATCH v3 18/22] erofs: implement fscache-based data read for data blobs
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (16 preceding siblings ...)
  2022-02-09  6:01 ` [PATCH v3 17/22] erofs: register cookie context for data blobs Jeffle Xu
@ 2022-02-09  6:01 ` Jeffle Xu
  2022-02-09  6:01 ` [PATCH v3 19/22] erofs: implement fscache-based data readahead for hole Jeffle Xu
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:01 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

This patch implements the data plane of reading data from the data blob
files over fscache.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/data.c     |  3 +++
 fs/erofs/fscache.c  | 16 +++++++++++++---
 fs/erofs/internal.h |  1 +
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 1bff99576883..c5ccf55c050c 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -200,6 +200,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
 	map->m_bdev = sb->s_bdev;
 	map->m_daxdev = EROFS_SB(sb)->dax_dev;
 	map->m_dax_part_off = EROFS_SB(sb)->dax_part_off;
+	map->m_ctx = EROFS_SB(sb)->bootstrap;
 
 	if (map->m_deviceid) {
 		down_read(&devs->rwsem);
@@ -211,6 +212,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
 		map->m_bdev = dif->bdev;
 		map->m_daxdev = dif->dax_dev;
 		map->m_dax_part_off = dif->dax_part_off;
+		map->m_ctx = dif->ctx;
 		up_read(&devs->rwsem);
 	} else if (devs->extra_devices) {
 		down_read(&devs->rwsem);
@@ -228,6 +230,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
 				map->m_bdev = dif->bdev;
 				map->m_daxdev = dif->dax_dev;
 				map->m_dax_part_off = dif->dax_part_off;
+				map->m_ctx = dif->ctx;
 				break;
 			}
 		}
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index fcd686f4dc9f..c7762e154064 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -68,11 +68,21 @@ static inline int erofs_fscache_get_map(struct erofs_fscache_map *fsmap,
 					struct erofs_map_blocks *map,
 					struct super_block *sb)
 {
-	struct erofs_sb_info *sbi = EROFS_SB(sb);
+	struct erofs_map_dev mdev;
+	int ret;
+
+	mdev = (struct erofs_map_dev) {
+		.m_deviceid = map->m_deviceid,
+		.m_pa = map->m_pa,
+	};
+
+	ret = erofs_map_dev(sb, &mdev);
+	if (ret)
+		return ret;
 
-	fsmap->m_ctx  = sbi->bootstrap;
+	fsmap->m_ctx  = mdev.m_ctx;
+	fsmap->m_pa   = mdev.m_pa;
 	fsmap->m_la   = map->m_la;
-	fsmap->m_pa   = map->m_pa;
 	fsmap->m_llen = map->m_llen;
 
 	return 0;
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 5d514c7b73cc..6ccf14952b2d 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -486,6 +486,7 @@ struct erofs_map_dev {
 	struct block_device *m_bdev;
 	struct dax_device *m_daxdev;
 	u64 m_dax_part_off;
+	struct erofs_fscache_context *m_ctx;
 
 	erofs_off_t m_pa;
 	unsigned int m_deviceid;
-- 
2.27.0



* [PATCH v3 19/22] erofs: implement fscache-based data readahead for hole
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (17 preceding siblings ...)
  2022-02-09  6:01 ` [PATCH v3 18/22] erofs: implement fscache-based data read " Jeffle Xu
@ 2022-02-09  6:01 ` Jeffle Xu
  2022-02-09  6:01 ` [PATCH v3 20/22] erofs: implement fscache-based data readahead for non-inline layout Jeffle Xu
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:01 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Implement fscache-based data readahead. This patch only supports
readahead for holes, while the following patches will handle the other
cases.

Besides, this patch also registers an individual bdi for each erofs
instance so that readahead can be enabled.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/fscache.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/erofs/super.c   |  4 +++
 2 files changed, 87 insertions(+)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index c7762e154064..c8a0851230e5 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -195,12 +195,95 @@ static int erofs_fscache_readpage(struct file *file, struct page *page)
 	return ret;
 }
 
+enum erofs_fscache_readahead_type {
+	EROFS_FSCACHE_READAHEAD_TYPE_HOLE,
+};
+
+static int erofs_fscache_do_readahead(struct readahead_control *rac,
+				      struct erofs_fscache_map *fsmap,
+				      enum erofs_fscache_readahead_type type)
+{
+	size_t offset, length, done;
+	struct page *page;
+
+	/*
+	 * 1) For CHUNK_BASED (HOLE), the output map.m_la is rounded down to
+	 *    the nearest chunk boundary, and thus offset will be non-zero.
+	 */
+	offset = fsmap->o_la - fsmap->m_la;
+	length = fsmap->m_llen - offset;
+
+	for (done = 0; done < length; done += PAGE_SIZE) {
+		page = readahead_page(rac);
+		if (!page)
+			break;
+
+		switch (type) {
+		case EROFS_FSCACHE_READAHEAD_TYPE_HOLE:
+			zero_user(page, 0, PAGE_SIZE);
+			break;
+		default:
+			DBG_BUGON(1);
+			return -EINVAL;
+		}
+
+		SetPageUptodate(page);
+		unlock_page(page);
+	}
+
+	return done;
+}
+
+static void erofs_fscache_readahead(struct readahead_control *rac)
+{
+	struct inode *inode = rac->mapping->host;
+	struct erofs_inode *vi = EROFS_I(inode);
+	struct super_block *sb = inode->i_sb;
+	size_t length = readahead_length(rac);
+	struct erofs_map_blocks map;
+	struct erofs_fscache_map fsmap;
+	int ret;
+
+	if (erofs_inode_is_data_compressed(vi->datalayout)) {
+		erofs_info(sb, "compressed layout not supported yet");
+		return;
+	}
+
+	while (length) {
+		map.m_la = fsmap.o_la = readahead_pos(rac);
+
+		ret = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
+		if (ret)
+			return;
+
+		if (!(map.m_flags & EROFS_MAP_MAPPED)) {
+			/* Only CHUNK_BASED layout supports hole. */
+			fsmap.m_la   = map.m_la;
+			fsmap.m_llen = map.m_llen;
+			ret = erofs_fscache_do_readahead(rac, &fsmap,
+					EROFS_FSCACHE_READAHEAD_TYPE_HOLE);
+		} else {
+			switch (vi->datalayout) {
+			default:
+				DBG_BUGON(1);
+				return;
+			}
+		}
+
+		if (ret <= 0)
+			return;
+
+		length -= ret;
+	}
+}
+
 static const struct address_space_operations erofs_fscache_blob_aops = {
 	.readpage = erofs_fscache_readpage_blob,
 };
 
 const struct address_space_operations erofs_fscache_access_aops = {
 	.readpage = erofs_fscache_readpage,
+	.readahead = erofs_fscache_readahead,
 };
 
 struct page *erofs_fscache_read_cache_page(struct erofs_fscache_context *ctx,
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index f058a04a00c7..2942029a7049 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -616,6 +616,10 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 			return PTR_ERR(bootstrap);
 
 		sbi->bootstrap = bootstrap;
+
+		err = super_setup_bdi(sb);
+		if (err)
+			return err;
 	}
 
 	err = erofs_read_superblock(sb);
-- 
2.27.0



* [PATCH v3 20/22] erofs: implement fscache-based data readahead for non-inline layout
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (18 preceding siblings ...)
  2022-02-09  6:01 ` [PATCH v3 19/22] erofs: implement fscache-based data readahead for hole Jeffle Xu
@ 2022-02-09  6:01 ` Jeffle Xu
  2022-02-09  6:01 ` [PATCH v3 21/22] erofs: implement fscache-based data readahead for inline layout Jeffle Xu
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:01 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/fscache.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index c8a0851230e5..ef5eef33e3d5 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -197,6 +197,7 @@ static int erofs_fscache_readpage(struct file *file, struct page *page)
 
 enum erofs_fscache_readahead_type {
 	EROFS_FSCACHE_READAHEAD_TYPE_HOLE,
+	EROFS_FSCACHE_READAHEAD_TYPE_NOINLINE,
 };
 
 static int erofs_fscache_do_readahead(struct readahead_control *rac,
@@ -205,10 +206,14 @@ static int erofs_fscache_do_readahead(struct readahead_control *rac,
 {
 	size_t offset, length, done;
 	struct page *page;
+	int ret = 0;
 
 	/*
-	 * 1) For CHUNK_BASED (HOLE), the output map.m_la is rounded down to
-	 *    the nearest chunk boundary, and thus offset will be non-zero.
+	 * 1) For CHUNK_BASED (HOLE/NOINLINE), the output map.m_la is rounded
+	 *    down to the nearest chunk boundary, and thus offset will be
+	 *    non-zero.
+	 * 2) For the other cases, the output map.m_la shall be equal to o_la,
+	 *    and thus offset will be zero.
 	 */
 	offset = fsmap->o_la - fsmap->m_la;
 	length = fsmap->m_llen - offset;
@@ -222,11 +227,18 @@ static int erofs_fscache_do_readahead(struct readahead_control *rac,
 		case EROFS_FSCACHE_READAHEAD_TYPE_HOLE:
 			zero_user(page, 0, PAGE_SIZE);
 			break;
+		case EROFS_FSCACHE_READAHEAD_TYPE_NOINLINE:
+			ret = erofs_fscache_readpage_noinline(page, fsmap);
+			fsmap->m_pa += EROFS_BLKSIZ;
+			break;
 		default:
 			DBG_BUGON(1);
 			return -EINVAL;
 		}
 
+		if (ret)
+			return ret;
+
 		SetPageUptodate(page);
 		unlock_page(page);
 	}
@@ -263,7 +275,16 @@ static void erofs_fscache_readahead(struct readahead_control *rac)
 			ret = erofs_fscache_do_readahead(rac, &fsmap,
 					EROFS_FSCACHE_READAHEAD_TYPE_HOLE);
 		} else {
+			ret = erofs_fscache_get_map(&fsmap, &map, sb);
+			if (ret)
+				return;
+
 			switch (vi->datalayout) {
+			case EROFS_INODE_FLAT_PLAIN:
+			case EROFS_INODE_CHUNK_BASED:
+				ret = erofs_fscache_do_readahead(rac, &fsmap,
+					EROFS_FSCACHE_READAHEAD_TYPE_NOINLINE);
+				break;
 			default:
 				DBG_BUGON(1);
 				return;
-- 
2.27.0



* [PATCH v3 21/22] erofs: implement fscache-based data readahead for inline layout
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (19 preceding siblings ...)
  2022-02-09  6:01 ` [PATCH v3 20/22] erofs: implement fscache-based data readahead for non-inline layout Jeffle Xu
@ 2022-02-09  6:01 ` Jeffle Xu
  2022-02-09  6:01 ` [PATCH v3 22/22] erofs: add 'uuid' mount option Jeffle Xu
  2022-02-10  5:58 ` [Linux-cachefs] [PATCH v3 00/22] fscache, erofs: fscache-based demand-read semantics Gao Xiang
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:01 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/fscache.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index ef5eef33e3d5..003f9abdaf1b 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -198,6 +198,7 @@ static int erofs_fscache_readpage(struct file *file, struct page *page)
 enum erofs_fscache_readahead_type {
 	EROFS_FSCACHE_READAHEAD_TYPE_HOLE,
 	EROFS_FSCACHE_READAHEAD_TYPE_NOINLINE,
+	EROFS_FSCACHE_READAHEAD_TYPE_INLINE,
 };
 
 static int erofs_fscache_do_readahead(struct readahead_control *rac,
@@ -231,6 +232,9 @@ static int erofs_fscache_do_readahead(struct readahead_control *rac,
 			ret = erofs_fscache_readpage_noinline(page, fsmap);
 			fsmap->m_pa += EROFS_BLKSIZ;
 			break;
+		case EROFS_FSCACHE_READAHEAD_TYPE_INLINE:
+			ret = erofs_fscache_readpage_inline(page, fsmap);
+			break;
 		default:
 			DBG_BUGON(1);
 			return -EINVAL;
@@ -285,6 +289,10 @@ static void erofs_fscache_readahead(struct readahead_control *rac)
 				ret = erofs_fscache_do_readahead(rac, &fsmap,
 					EROFS_FSCACHE_READAHEAD_TYPE_NOINLINE);
 				break;
+			case EROFS_INODE_FLAT_INLINE:
+				ret = erofs_fscache_do_readahead(rac, &fsmap,
+					EROFS_FSCACHE_READAHEAD_TYPE_INLINE);
+				break;
 			default:
 				DBG_BUGON(1);
 				return;
-- 
2.27.0



* [PATCH v3 22/22] erofs: add 'uuid' mount option
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (20 preceding siblings ...)
  2022-02-09  6:01 ` [PATCH v3 21/22] erofs: implement fscache-based data readahead for inline layout Jeffle Xu
@ 2022-02-09  6:01 ` Jeffle Xu
  2022-02-10  5:58 ` [Linux-cachefs] [PATCH v3 00/22] fscache, erofs: fscache-based demand-read semantics Gao Xiang
  22 siblings, 0 replies; 35+ messages in thread
From: Jeffle Xu @ 2022-02-09  6:01 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

Introduce a 'uuid' mount option to enable on-demand read semantics. In
this case, erofs can be mounted from blob files instead of a blkdev,
with users specifying the path of the bootstrap blob file containing
the complete erofs image.
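
A mount invocation then looks roughly like the sketch below
(illustrative only: the value passed via 'uuid=' names the bootstrap
blob registered with fscache/cachefiles, and since get_tree_nodev() is
taken in this mode the device argument is effectively a placeholder):

	mount -t erofs none -o uuid=<bootstrap-blob> /mnt/erofs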

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/super.c | 44 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 37 insertions(+), 7 deletions(-)

diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 2942029a7049..8bc4b782f9a9 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -400,6 +400,7 @@ enum {
 	Opt_dax,
 	Opt_dax_enum,
 	Opt_device,
+	Opt_uuid,
 	Opt_err
 };
 
@@ -424,6 +425,7 @@ static const struct fs_parameter_spec erofs_fs_parameters[] = {
 	fsparam_flag("dax",             Opt_dax),
 	fsparam_enum("dax",		Opt_dax_enum, erofs_dax_param_enums),
 	fsparam_string("device",	Opt_device),
+	fsparam_string("uuid",		Opt_uuid),
 	{}
 };
 
@@ -519,6 +521,12 @@ static int erofs_fc_parse_param(struct fs_context *fc,
 		}
 		++ctx->devs->extra_devices;
 		break;
+	case Opt_uuid:
+		kfree(ctx->opt.uuid);
+		ctx->opt.uuid = kstrdup(param->string, GFP_KERNEL);
+		if (!ctx->opt.uuid)
+			return -ENOMEM;
+		break;
 	default:
 		return -ENOPARAM;
 	}
@@ -593,9 +601,14 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 
 	sb->s_magic = EROFS_SUPER_MAGIC;
 
-	if (!sb_set_blocksize(sb, EROFS_BLKSIZ)) {
-		erofs_err(sb, "failed to set erofs blksize");
-		return -EINVAL;
+	if (erofs_bdev_mode(sb)) {
+		if (!sb_set_blocksize(sb, EROFS_BLKSIZ)) {
+			erofs_err(sb, "failed to set erofs blksize");
+			return -EINVAL;
+		}
+	} else {
+		sb->s_blocksize = EROFS_BLKSIZ;
+		sb->s_blocksize_bits = LOG_BLOCK_SIZE;
 	}
 
 	sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
@@ -604,11 +617,12 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 
 	sb->s_fs_info = sbi;
 	sbi->opt = ctx->opt;
-	sbi->dax_dev = fs_dax_get_by_bdev(sb->s_bdev, &sbi->dax_part_off);
 	sbi->devs = ctx->devs;
 	ctx->devs = NULL;
 
-	if (!erofs_bdev_mode(sb)) {
+	if (erofs_bdev_mode(sb)) {
+		sbi->dax_dev = fs_dax_get_by_bdev(sb->s_bdev, &sbi->dax_part_off);
+	} else {
 		struct erofs_fscache_context *bootstrap;
 
 		bootstrap = erofs_fscache_get_ctx(sb, ctx->opt.uuid, true);
@@ -620,6 +634,8 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 		err = super_setup_bdi(sb);
 		if (err)
 			return err;
+
+		sbi->dax_dev = NULL;
 	}
 
 	err = erofs_read_superblock(sb);
@@ -682,6 +698,11 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 
 static int erofs_fc_get_tree(struct fs_context *fc)
 {
+	struct erofs_fs_context *ctx = fc->fs_private;
+
+	if (ctx->opt.uuid)
+		return get_tree_nodev(fc, erofs_fc_fill_super);
+
 	return get_tree_bdev(fc, erofs_fc_fill_super);
 }
 
@@ -731,6 +752,7 @@ static void erofs_fc_free(struct fs_context *fc)
 	struct erofs_fs_context *ctx = fc->fs_private;
 
 	erofs_free_dev_context(ctx->devs);
+	kfree(ctx->opt.uuid);
 	kfree(ctx);
 }
 
@@ -771,7 +793,10 @@ static void erofs_kill_sb(struct super_block *sb)
 
 	WARN_ON(sb->s_magic != EROFS_SUPER_MAGIC);
 
-	kill_block_super(sb);
+	if (erofs_bdev_mode(sb))
+		kill_block_super(sb);
+	else
+		generic_shutdown_super(sb);
 
 	sbi = EROFS_SB(sb);
 	if (!sbi)
@@ -889,7 +914,12 @@ static int erofs_statfs(struct dentry *dentry, struct kstatfs *buf)
 {
 	struct super_block *sb = dentry->d_sb;
 	struct erofs_sb_info *sbi = EROFS_SB(sb);
-	u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
+	u64 id;
+
+	if (erofs_bdev_mode(sb))
+		id = huge_encode_dev(sb->s_bdev->bd_dev);
+	else
+		id = 0; /* TODO */
 
 	buf->f_type = sb->s_magic;
 	buf->f_bsize = EROFS_BLKSIZ;
-- 
2.27.0



* Re: [PATCH v3 06/22] erofs: use meta buffers for erofs_read_superblock()
  2022-02-09  6:00 ` [PATCH v3 06/22] erofs: use meta buffers for erofs_read_superblock() Jeffle Xu
@ 2022-02-09  7:52   ` Gao Xiang
  0 siblings, 0 replies; 35+ messages in thread
From: Gao Xiang @ 2022-02-09  7:52 UTC (permalink / raw)
  To: Jeffle Xu
  Cc: dhowells, linux-cachefs, xiang, chao, linux-erofs, torvalds,
	gregkh, willy, linux-fsdevel, joseph.qi, bo.liu, tao.peng, gerry,
	eguan, linux-kernel

On Wed, Feb 09, 2022 at 02:00:52PM +0800, Jeffle Xu wrote:
> The only change is that, meta buffers read cache page without __GFP_FS
> flag, which shall not matter.
> 
> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>

Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>

(If this patchset is left behind anyway, I will submit this cleanup
 independently for the next cycle.)

Thanks,
Gao Xiang

> ---
>  fs/erofs/super.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 915eefe0d7e2..12755217631f 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -281,21 +281,19 @@ static int erofs_init_devices(struct super_block *sb,
>  static int erofs_read_superblock(struct super_block *sb)
>  {
>  	struct erofs_sb_info *sbi;
> -	struct page *page;
> +	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
>  	struct erofs_super_block *dsb;
>  	unsigned int blkszbits;
>  	void *data;
>  	int ret;
>  
> -	page = read_mapping_page(sb->s_bdev->bd_inode->i_mapping, 0, NULL);
> -	if (IS_ERR(page)) {
> +	data = erofs_read_metabuf(&buf, sb, 0, EROFS_KMAP);
> +	if (IS_ERR(data)) {
>  		erofs_err(sb, "cannot read erofs superblock");
> -		return PTR_ERR(page);
> +		return PTR_ERR(data);
>  	}
>  
>  	sbi = EROFS_SB(sb);
> -
> -	data = kmap(page);
>  	dsb = (struct erofs_super_block *)(data + EROFS_SUPER_OFFSET);
>  
>  	ret = -EINVAL;
> @@ -365,8 +363,7 @@ static int erofs_read_superblock(struct super_block *sb)
>  	if (erofs_sb_has_ztailpacking(sbi))
>  		erofs_info(sb, "EXPERIMENTAL compressed inline data feature in use. Use at your own risk!");
>  out:
> -	kunmap(page);
> -	put_page(page);
> +	erofs_put_metabuf(&buf);
>  	return ret;
>  }
>  
> -- 
> 2.27.0


* Re: [Linux-cachefs] [PATCH v3 00/22] fscache,  erofs: fscache-based demand-read semantics
  2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
                   ` (21 preceding siblings ...)
  2022-02-09  6:01 ` [PATCH v3 22/22] erofs: add 'uuid' mount option Jeffle Xu
@ 2022-02-10  5:58 ` Gao Xiang
  22 siblings, 0 replies; 35+ messages in thread
From: Gao Xiang @ 2022-02-10  5:58 UTC (permalink / raw)
  To: David Howells
  Cc: Jeffle Xu, linux-cachefs, xiang, chao, linux-erofs, gregkh,
	tao.peng, willy, linux-kernel, joseph.qi, bo.liu, linux-fsdevel,
	eguan, gerry, torvalds

Hi David,

On Wed, Feb 09, 2022 at 02:00:46PM +0800, Jeffle Xu wrote:

...

> 
> 
> Jeffle Xu (22):
>   fscache: export fscache_end_operation()
>   fscache: add a method to support on-demand read semantics
>   cachefiles: extract generic function for daemon methods
>   cachefiles: detect backing file size in on-demand read mode
>   cachefiles: introduce new devnode for on-demand read mode

...

> 
>  Documentation/filesystems/netfs_library.rst |  18 +
>  fs/cachefiles/Kconfig                       |  13 +
>  fs/cachefiles/daemon.c                      | 243 +++++++++--
>  fs/cachefiles/internal.h                    |  12 +
>  fs/cachefiles/io.c                          |  60 +++
>  fs/cachefiles/main.c                        |  27 ++
>  fs/cachefiles/namei.c                       |  60 ++-

Would you mind taking a look at this version? We followed your previous
advice on v2, and it reuses almost all of the cachefiles code, except
that it gives a slightly different meaning to the cachefile file size
and adds a new daemon node.

I think it could serve as a first step towards implementing
fscache-based on-demand read.

Thanks,
Gao Xiang



* Re: [PATCH v3 05/22] cachefiles: introduce new devnode for on-demand read mode
  2022-02-09  6:00 ` [PATCH v3 05/22] cachefiles: introduce new devnode for " Jeffle Xu
@ 2022-02-15  9:03   ` JeffleXu
  2022-02-15 10:37     ` Greg KH
  2022-02-15 11:13     ` [PATCH v4 05/23] " Jeffle Xu
  0 siblings, 2 replies; 35+ messages in thread
From: JeffleXu @ 2022-02-15  9:03 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: gregkh, willy, linux-kernel, joseph.qi, linux-fsdevel, gerry, torvalds

Hi David,

FYI, I've updated this patch at [1].

[1]
https://github.com/lostjeffle/linux/commit/589dd838dc539aee291d1032406653a8f6269e6f.

This new version mainly adds cachefiles_ondemand_flush_reqs(), which
drains the pending read requests when cachefilesd is going to exit.

On 2/9/22 2:00 PM, Jeffle Xu wrote:
> This patch introduces a new devnode 'cachefiles_ondemand' to support the
> newly introduced on-demand read mode.
> 
> The precondition for on-demand reading semantics is that, all blob files
> have been placed under corresponding directory with correct file size
> (sparse files) on the first beginning. When upper fs starts to access
> the blob file, it will "cache miss" (hit the hole) and then turn to user
> daemon for preparing the data.
> 
> The interaction between kernel and user daemon is described as below.
> 1. Once cache miss, .ondemand_read() callback of corresponding fscache
>    backend is called to prepare the data. As for cachefiles, it just
>    packages related metadata (file range to read, etc.) into a pending
>    read request, and then the process triggering cache miss will fall
>    asleep until the corresponding data gets fetched later.
> 2. User daemon needs to poll on the devnode ('cachefiles_ondemand'),
>    waiting for pending read request.
> 3. Once there's pending read request, user daemon will be notified and
>    shall read the devnode ('cachefiles_ondemand') to fetch one pending
>    read request to process.
> 4. For the fetched read request, user daemon need to somehow prepare the
>    data (e.g. download from remote through network) and then write the
>    fetched data into the backing file to fill the hole.
> 5. After that, user daemon need to notify cachefiles backend by writing a
>    'done' command to devnode ('cachefiles_ondemand'). It will also
>    awake the previous asleep process triggering cache miss.
> 6. By the time the process gets awaken, the data has been ready in the
>    backing file. Then process can re-initiate a read request from the
>    backing file.
> 
> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
> ---


-- 
Thanks,
Jeffle


* Re: [PATCH v3 05/22] cachefiles: introduce new devnode for on-demand read mode
  2022-02-15  9:03   ` JeffleXu
@ 2022-02-15 10:37     ` Greg KH
  2022-02-16  8:17       ` JeffleXu
  2022-02-15 11:13     ` [PATCH v4 05/23] " Jeffle Xu
  1 sibling, 1 reply; 35+ messages in thread
From: Greg KH @ 2022-02-15 10:37 UTC (permalink / raw)
  To: JeffleXu
  Cc: dhowells, linux-cachefs, xiang, chao, linux-erofs, willy,
	linux-kernel, joseph.qi, linux-fsdevel, gerry, torvalds

On Tue, Feb 15, 2022 at 05:03:16PM +0800, JeffleXu wrote:
> Hi David,
> 
> FYI I've updated this patch on [1].
> 
> [1]
> https://github.com/lostjeffle/linux/commit/589dd838dc539aee291d1032406653a8f6269e6f.

We can not review random github links :(



* [PATCH v4 05/23] cachefiles: introduce new devnode for on-demand read mode
  2022-02-15  9:03   ` JeffleXu
  2022-02-15 10:37     ` Greg KH
@ 2022-02-15 11:13     ` Jeffle Xu
  2022-02-16 10:48       ` Greg KH
  1 sibling, 1 reply; 35+ messages in thread
From: Jeffle Xu @ 2022-02-15 11:13 UTC (permalink / raw)
  To: dhowells, linux-cachefs, xiang, chao, linux-erofs
  Cc: torvalds, gregkh, willy, linux-fsdevel, joseph.qi, bo.liu,
	tao.peng, gerry, eguan, linux-kernel

This patch introduces a new devnode 'cachefiles_ondemand' to support the
newly introduced on-demand read mode.

The precondition for on-demand reading semantics is that all blob files
have been placed under the corresponding directory as sparse files with
the correct file size before the cache is used. When the upper fs starts
to access a blob file, it will get a cache miss (hit a hole) and then
turn to the user daemon to prepare the data.

The interaction between kernel and user daemon is described below.
1. Upon a cache miss, the .ondemand_read() callback of the corresponding
   fscache backend is called to prepare the data. As for cachefiles, it
   just packages the related metadata (file range to read, etc.) into a
   pending read request, and then the process triggering the cache miss
   falls asleep until the corresponding data gets fetched later.
2. The user daemon needs to poll on the devnode ('cachefiles_ondemand'),
   waiting for pending read requests.
3. Once there is a pending read request, the user daemon will be
   notified and shall read the devnode ('cachefiles_ondemand') to fetch
   one pending read request to process.
4. For the fetched read request, the user daemon needs to somehow
   prepare the data (e.g. download it from remote over the network) and
   then write the fetched data into the backing file to fill the hole.
5. After that, the user daemon needs to notify the cachefiles backend by
   writing a 'done' command to the devnode ('cachefiles_ondemand'). This
   also wakes up the process that previously went to sleep on the cache
   miss.
6. By the time the process gets woken up, the data is ready in the
   backing file. The process can then re-initiate a read request from
   the backing file.

If the user daemon exits early while the upper fs is still mounted, no
new on-demand read requests can be queued anymore and the existing
pending read requests will fail with -EIO.
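
For illustration, a user daemon consuming this interface could look
roughly like the sketch below. The field layout of struct
cachefiles_req_in is an assumption for illustration only (the real
definition lives in the new uapi header, not reproduced here), and the
cache setup commands ("dir", "tag", "bind", ...) as well as error
handling are omitted:

	#include <poll.h>
	#include <stdio.h>
	#include <stdint.h>
	#include <string.h>
	#include <unistd.h>

	struct cachefiles_req_in {	/* assumed layout */
		uint64_t id;		/* request id, echoed back in "done <id>" */
		uint64_t off;		/* start of the missing range */
		uint64_t len;		/* length of the missing range */
		char	 path[256];	/* name of the backing file to fill */
	};

	static void ondemand_loop(int devfd)
	{
		struct pollfd pfd = { .fd = devfd, .events = POLLIN };
		struct cachefiles_req_in req;
		char cmd[64];

		for (;;) {
			/* 2. wait until a pending read request is queued */
			if (poll(&pfd, 1, -1) <= 0)
				continue;

			/* 3. fetch one pending read request */
			if (read(devfd, &req, sizeof(req)) <= 0)
				continue;

			/*
			 * 4. fetch req.len bytes at req.off for the backing
			 *    file named req.path (e.g. download from remote)
			 *    and pwrite() them into that file to fill the
			 *    hole -- omitted here.
			 */

			/* 5. tell cachefiles the range is now populated */
			snprintf(cmd, sizeof(cmd), "done %llu",
				 (unsigned long long)req.id);
			write(devfd, cmd, strlen(cmd));
		}
	}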

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/cachefiles/daemon.c                   | 185 +++++++++++++++++++++++
 fs/cachefiles/internal.h                 |  12 ++
 fs/cachefiles/io.c                       |  64 ++++++++
 fs/cachefiles/main.c                     |  27 ++++
 include/uapi/linux/cachefiles_ondemand.h |  14 ++
 5 files changed, 302 insertions(+)
 create mode 100644 include/uapi/linux/cachefiles_ondemand.h

diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 6b8d7c5bbe5d..96aee8e0eb14 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -757,3 +757,188 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
 
 	_leave("");
 }
+
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+static unsigned long cachefiles_open_ondemand;
+
+static int cachefiles_ondemand_open(struct inode *inode, struct file *file);
+static int cachefiles_ondemand_release(struct inode *inode, struct file *file);
+static ssize_t cachefiles_ondemand_write(struct file *, const char __user *,
+					 size_t, loff_t *);
+static ssize_t cachefiles_ondemand_read(struct file *, char __user *, size_t,
+					loff_t *);
+static __poll_t cachefiles_ondemand_poll(struct file *,
+					 struct poll_table_struct *);
+static int cachefiles_daemon_done(struct cachefiles_cache *, char *);
+
+const struct file_operations cachefiles_ondemand_fops = {
+	.owner		= THIS_MODULE,
+	.open		= cachefiles_ondemand_open,
+	.release	= cachefiles_ondemand_release,
+	.read		= cachefiles_ondemand_read,
+	.write		= cachefiles_ondemand_write,
+	.poll		= cachefiles_ondemand_poll,
+	.llseek		= noop_llseek,
+};
+
+static const struct cachefiles_daemon_cmd cachefiles_ondemand_cmds[] = {
+	{ "bind",	cachefiles_daemon_bind		},
+	{ "brun",	cachefiles_daemon_brun		},
+	{ "bcull",	cachefiles_daemon_bcull		},
+	{ "bstop",	cachefiles_daemon_bstop		},
+	{ "cull",	cachefiles_daemon_cull		},
+	{ "debug",	cachefiles_daemon_debug		},
+	{ "dir",	cachefiles_daemon_dir		},
+	{ "frun",	cachefiles_daemon_frun		},
+	{ "fcull",	cachefiles_daemon_fcull		},
+	{ "fstop",	cachefiles_daemon_fstop		},
+	{ "inuse",	cachefiles_daemon_inuse		},
+	{ "secctx",	cachefiles_daemon_secctx	},
+	{ "tag",	cachefiles_daemon_tag		},
+	{ "done",	cachefiles_daemon_done		},
+	{ "",		NULL				}
+};
+
+static int cachefiles_ondemand_open(struct inode *inode, struct file *file)
+{
+	struct cachefiles_cache *cache;
+
+	_enter("");
+
+	/* only the superuser may do this */
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	/* the cachefiles device may only be open once at a time */
+	if (xchg(&cachefiles_open_ondemand, 1) == 1)
+		return -EBUSY;
+
+	cache = cachefiles_daemon_open_cache();
+	if (!cache) {
+		cachefiles_open_ondemand = 0;
+		return -ENOMEM;
+	}
+
+	xa_init_flags(&cache->reqs, XA_FLAGS_ALLOC);
+	set_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags);
+
+	file->private_data = cache;
+	cache->cachefilesd = file;
+	return 0;
+}
+
+static void cachefiles_ondemand_flush_reqs(struct cachefiles_cache *cache)
+{
+	struct cachefiles_req *req;
+	unsigned long index;
+
+	xa_for_each(&cache->reqs, index, req) {
+		req->error = -EIO;
+		complete(&req->done);
+	}
+}
+
+static int cachefiles_ondemand_release(struct inode *inode, struct file *file)
+{
+	struct cachefiles_cache *cache = file->private_data;
+
+	_enter("");
+
+	ASSERT(cache);
+
+	set_bit(CACHEFILES_DEAD, &cache->flags);
+
+	cachefiles_ondemand_flush_reqs(cache);
+	cachefiles_daemon_unbind(cache);
+
+	/* clean up the control file interface */
+	xa_destroy(&cache->reqs);
+	cache->cachefilesd = NULL;
+	file->private_data = NULL;
+	cachefiles_open_ondemand = 0;
+
+	kfree(cache);
+
+	_leave("");
+	return 0;
+}
+
+static ssize_t cachefiles_ondemand_write(struct file *file,
+					 const char __user *_data,
+					 size_t datalen,
+					 loff_t *pos)
+{
+	return cachefiles_daemon_do_write(file, _data, datalen, pos,
+					  cachefiles_ondemand_cmds);
+}
+
+static ssize_t cachefiles_ondemand_read(struct file *file, char __user *_buffer,
+					size_t buflen, loff_t *pos)
+{
+	struct cachefiles_cache *cache = file->private_data;
+	struct cachefiles_req *req;
+	unsigned long id = 0;
+	int n;
+
+	if (!test_bit(CACHEFILES_READY, &cache->flags))
+		return 0;
+
+	req = xa_find(&cache->reqs, &id, UINT_MAX, XA_PRESENT);
+	if (!req)
+		return 0;
+
+	n = sizeof(struct cachefiles_req_in);
+	if (n > buflen)
+		return -EMSGSIZE;
+
+	req->base.id = id;
+	if (copy_to_user(_buffer, &req->base, n) != 0)
+		return -EFAULT;
+
+	return n;
+}
+
+static __poll_t cachefiles_ondemand_poll(struct file *file,
+					 struct poll_table_struct *poll)
+{
+	struct cachefiles_cache *cache = file->private_data;
+	__poll_t mask;
+
+	poll_wait(file, &cache->daemon_pollwq, poll);
+	mask = 0;
+
+	if (!xa_empty(&cache->reqs))
+		mask |= EPOLLIN;
+
+	return mask;
+}
+
+/*
+ * Request completion
+ * - command: "done <id>"
+ */
+static int cachefiles_daemon_done(struct cachefiles_cache *cache, char *args)
+{
+	struct cachefiles_req *req;
+	unsigned long id;
+	int ret;
+
+	_enter(",%s", args);
+
+	if (!*args) {
+		pr_err("Empty id specified\n");
+		return -EINVAL;
+	}
+
+	ret = kstrtoul(args, 0, &id);
+	if (ret)
+		return ret;
+
+	req = xa_erase(&cache->reqs, id);
+	if (!req)
+		return -EINVAL;
+
+	complete(&req->done);
+	return 0;
+}
+#endif
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 6473634c41a9..59dd11e42cb3 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -15,6 +15,8 @@
 #include <linux/fscache-cache.h>
 #include <linux/cred.h>
 #include <linux/security.h>
+#include <linux/xarray.h>
+#include <linux/cachefiles_ondemand.h>
 
 #define CACHEFILES_DIO_BLOCK_SIZE 4096
 
@@ -102,6 +104,15 @@ struct cachefiles_cache {
 	char				*rootdirname;	/* name of cache root directory */
 	char				*secctx;	/* LSM security context */
 	char				*tag;		/* cache binding tag */
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+	struct xarray			reqs;
+#endif
+};
+
+struct cachefiles_req {
+	struct cachefiles_req_in base;
+	struct completion done;
+	int error;
 };
 
 #include <trace/events/cachefiles.h>
@@ -146,6 +157,7 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache,
  * daemon.c
  */
 extern const struct file_operations cachefiles_daemon_fops;
+extern const struct file_operations cachefiles_ondemand_fops;
 
 /*
  * error_inject.c
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 753986ea1583..7c51e53d52d1 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -597,6 +597,67 @@ static void cachefiles_end_operation(struct netfs_cache_resources *cres)
 	fscache_end_cookie_access(fscache_cres_cookie(cres), fscache_access_io_end);
 }
 
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+static struct cachefiles_req *cachefiles_alloc_req(struct cachefiles_object *object,
+						   loff_t start_pos,
+						   size_t len)
+{
+	struct cachefiles_req *req;
+	struct cachefiles_req_in *base;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return NULL;
+
+	base = &req->base;
+
+	base->off = start_pos;
+	base->len = len;
+	strncpy(base->path, object->d_name, sizeof(base->path) - 1);
+
+	init_completion(&req->done);
+
+	return req;
+}
+
+static int cachefiles_ondemand_read(struct netfs_cache_resources *cres,
+				    loff_t start_pos, size_t len)
+{
+	struct cachefiles_object *object;
+	struct cachefiles_cache *cache;
+	struct cachefiles_req *req;
+	int ret;
+	u32 id;
+
+	object = cachefiles_cres_object(cres);
+	cache = object->volume->cache;
+
+	if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags))
+		return -EOPNOTSUPP;
+
+	if (test_bit(CACHEFILES_DEAD, &cache->flags))
+		return -EIO;
+
+	req = cachefiles_alloc_req(object, start_pos, len);
+	if (!req)
+		return -ENOMEM;
+
+	ret = xa_alloc(&cache->reqs, &id, req, xa_limit_32b, GFP_KERNEL);
+	if (ret) {
+		kfree(req);
+		return -ENOMEM;
+	}
+
+	wake_up_all(&cache->daemon_pollwq);
+
+	wait_for_completion(&req->done);
+	ret = req->error;
+	kfree(req);
+
+	return ret;
+}
+#endif
+
 static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
 	.end_operation		= cachefiles_end_operation,
 	.read			= cachefiles_read,
@@ -604,6 +665,9 @@ static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
 	.prepare_read		= cachefiles_prepare_read,
 	.prepare_write		= cachefiles_prepare_write,
 	.query_occupancy	= cachefiles_query_occupancy,
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+	.ondemand_read		= cachefiles_ondemand_read,
+#endif
 };
 
 /*
diff --git a/fs/cachefiles/main.c b/fs/cachefiles/main.c
index 3f369c6f816d..eab17c3140d9 100644
--- a/fs/cachefiles/main.c
+++ b/fs/cachefiles/main.c
@@ -39,6 +39,27 @@ static struct miscdevice cachefiles_dev = {
 	.fops	= &cachefiles_daemon_fops,
 };
 
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+static struct miscdevice cachefiles_ondemand_dev = {
+	.minor	= MISC_DYNAMIC_MINOR,
+	.name	= "cachefiles_ondemand",
+	.fops	= &cachefiles_ondemand_fops,
+};
+
+static inline int cachefiles_init_ondemand(void)
+{
+	return misc_register(&cachefiles_ondemand_dev);
+}
+
+static inline void cachefiles_exit_ondemand(void)
+{
+	misc_deregister(&cachefiles_ondemand_dev);
+}
+#else
+static inline int cachefiles_init_ondemand(void) { return 0; }
+static inline void cachefiles_exit_ondemand(void) {}
+#endif
+
 /*
  * initialise the fs caching module
  */
@@ -52,6 +73,9 @@ static int __init cachefiles_init(void)
 	ret = misc_register(&cachefiles_dev);
 	if (ret < 0)
 		goto error_dev;
+	ret = cachefiles_init_ondemand();
+	if (ret < 0)
+		goto error_ondemand_dev;
 
 	/* create an object jar */
 	ret = -ENOMEM;
@@ -68,6 +92,8 @@ static int __init cachefiles_init(void)
 	return 0;
 
 error_object_jar:
+	cachefiles_exit_ondemand();
+error_ondemand_dev:
 	misc_deregister(&cachefiles_dev);
 error_dev:
 	cachefiles_unregister_error_injection();
@@ -86,6 +112,7 @@ static void __exit cachefiles_exit(void)
 	pr_info("Unloading\n");
 
 	kmem_cache_destroy(cachefiles_object_jar);
+	cachefiles_exit_ondemand();
 	misc_deregister(&cachefiles_dev);
 	cachefiles_unregister_error_injection();
 }
diff --git a/include/uapi/linux/cachefiles_ondemand.h b/include/uapi/linux/cachefiles_ondemand.h
new file mode 100644
index 000000000000..e639a82f1098
--- /dev/null
+++ b/include/uapi/linux/cachefiles_ondemand.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _LINUX_CACHEFILES_ONDEMAND_H
+#define _LINUX_CACHEFILES_ONDEMAND_H
+
+#include <linux/limits.h>
+
+struct cachefiles_req_in {
+	uint64_t id;
+	uint64_t off;
+	uint64_t len;
+	char path[NAME_MAX];
+};
+
+#endif
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 05/22] cachefiles: introduce new devnode for on-demand read mode
  2022-02-15 10:37     ` Greg KH
@ 2022-02-16  8:17       ` JeffleXu
  0 siblings, 0 replies; 35+ messages in thread
From: JeffleXu @ 2022-02-16  8:17 UTC (permalink / raw)
  To: Greg KH
  Cc: dhowells, linux-cachefs, xiang, chao, linux-erofs, willy,
	linux-kernel, joseph.qi, linux-fsdevel, gerry, torvalds



On 2/15/22 6:37 PM, Greg KH wrote:
> On Tue, Feb 15, 2022 at 05:03:16PM +0800, JeffleXu wrote:
>> Hi David,
>>
>> FYI I've updated this patch on [1].
>>
>> [1]
>> https://github.com/lostjeffle/linux/commit/589dd838dc539aee291d1032406653a8f6269e6f.
> 
> We can not review random github links :(

Thanks. The new version of the patch has been posted as a reply in this thread.

-- 
Thanks,
Jeffle

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v4 05/23] cachefiles: introduce new devnode for on-demand read mode
  2022-02-15 11:13     ` [PATCH v4 05/23] " Jeffle Xu
@ 2022-02-16 10:48       ` Greg KH
  2022-02-16 12:49         ` JeffleXu
  0 siblings, 1 reply; 35+ messages in thread
From: Greg KH @ 2022-02-16 10:48 UTC (permalink / raw)
  To: Jeffle Xu
  Cc: dhowells, linux-cachefs, xiang, chao, linux-erofs, torvalds,
	willy, linux-fsdevel, joseph.qi, bo.liu, tao.peng, gerry, eguan,
	linux-kernel

On Tue, Feb 15, 2022 at 07:13:35PM +0800, Jeffle Xu wrote:
> This patch introduces a new devnode 'cachefiles_ondemand' to support the
> newly introduced on-demand read mode.
> 
> The precondition for on-demand reading semantics is that all blob files
> have been placed under the corresponding directory as sparse files of
> the correct size beforehand. When the upper fs starts to access a blob
> file, it will "cache miss" (hit a hole) and then turn to the user
> daemon to prepare the data.
> 
> The interaction between kernel and user daemon is described below.
> 1. On a cache miss, the .ondemand_read() callback of the corresponding
>    fscache backend is called to prepare the data. As for cachefiles, it
>    just packages the related metadata (file range to read, etc.) into a
>    pending read request, and the process triggering the cache miss then
>    sleeps until the corresponding data gets fetched later.
> 2. The user daemon needs to poll on the devnode ('cachefiles_ondemand'),
>    waiting for pending read requests.
> 3. Once there is a pending read request, the user daemon is notified and
>    shall read the devnode ('cachefiles_ondemand') to fetch one pending
>    read request to process.
> 4. For the fetched read request, the user daemon needs to somehow prepare
>    the data (e.g. download it from the remote over the network) and then
>    write the fetched data into the backing file to fill the hole.
> 5. After that, the user daemon needs to notify the cachefiles backend by
>    writing a 'done' command to the devnode ('cachefiles_ondemand'). This
>    also wakes up the process that triggered the cache miss.
> 6. By the time that process is woken, the data is ready in the backing
>    file, and the process can re-initiate a read request from the backing
>    file.
> 
> If the user daemon exits while the upper fs is still mounted, no new
> on-demand read requests can be queued and the existing pending read
> requests will fail with -EIO.
> 
> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
> ---
>  fs/cachefiles/daemon.c                   | 185 +++++++++++++++++++++++
>  fs/cachefiles/internal.h                 |  12 ++
>  fs/cachefiles/io.c                       |  64 ++++++++
>  fs/cachefiles/main.c                     |  27 ++++
>  include/uapi/linux/cachefiles_ondemand.h |  14 ++
>  5 files changed, 302 insertions(+)
>  create mode 100644 include/uapi/linux/cachefiles_ondemand.h
> 
> diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
> index 6b8d7c5bbe5d..96aee8e0eb14 100644
> --- a/fs/cachefiles/daemon.c
> +++ b/fs/cachefiles/daemon.c
> @@ -757,3 +757,188 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
>  
>  	_leave("");
>  }
> +
> +#ifdef CONFIG_CACHEFILES_ONDEMAND
> +static unsigned long cachefiles_open_ondemand;
> +
> +static int cachefiles_ondemand_open(struct inode *inode, struct file *file);
> +static int cachefiles_ondemand_release(struct inode *inode, struct file *file);
> +static ssize_t cachefiles_ondemand_write(struct file *, const char __user *,
> +					 size_t, loff_t *);
> +static ssize_t cachefiles_ondemand_read(struct file *, char __user *, size_t,
> +					loff_t *);
> +static __poll_t cachefiles_ondemand_poll(struct file *,
> +					 struct poll_table_struct *);
> +static int cachefiles_daemon_done(struct cachefiles_cache *, char *);
> +
> +const struct file_operations cachefiles_ondemand_fops = {
> +	.owner		= THIS_MODULE,
> +	.open		= cachefiles_ondemand_open,
> +	.release	= cachefiles_ondemand_release,
> +	.read		= cachefiles_ondemand_read,
> +	.write		= cachefiles_ondemand_write,
> +	.poll		= cachefiles_ondemand_poll,
> +	.llseek		= noop_llseek,
> +};
> +
> +static const struct cachefiles_daemon_cmd cachefiles_ondemand_cmds[] = {
> +	{ "bind",	cachefiles_daemon_bind		},
> +	{ "brun",	cachefiles_daemon_brun		},
> +	{ "bcull",	cachefiles_daemon_bcull		},
> +	{ "bstop",	cachefiles_daemon_bstop		},
> +	{ "cull",	cachefiles_daemon_cull		},
> +	{ "debug",	cachefiles_daemon_debug		},
> +	{ "dir",	cachefiles_daemon_dir		},
> +	{ "frun",	cachefiles_daemon_frun		},
> +	{ "fcull",	cachefiles_daemon_fcull		},
> +	{ "fstop",	cachefiles_daemon_fstop		},
> +	{ "inuse",	cachefiles_daemon_inuse		},
> +	{ "secctx",	cachefiles_daemon_secctx	},
> +	{ "tag",	cachefiles_daemon_tag		},
> +	{ "done",	cachefiles_daemon_done		},
> +	{ "",		NULL				}
> +};
> +
> +static int cachefiles_ondemand_open(struct inode *inode, struct file *file)
> +{
> +	struct cachefiles_cache *cache;
> +
> +	_enter("");

ftrace is your friend, no need to try to duplicate it with debugging
stuff.  This and the _leave() calls should be removed.

> +
> +	/* only the superuser may do this */
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;

Shouldn't you rely on the userspace permissions of the file instead of
this?

> +
> +	/* the cachefiles device may only be open once at a time */
> +	if (xchg(&cachefiles_open_ondemand, 1) == 1)
> +		return -EBUSY;
> +
> +	cache = cachefiles_daemon_open_cache();
> +	if (!cache) {
> +		cachefiles_open_ondemand = 0;
> +		return -ENOMEM;
> +	}
> +
> +	xa_init_flags(&cache->reqs, XA_FLAGS_ALLOC);
> +	set_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags);
> +
> +	file->private_data = cache;
> +	cache->cachefilesd = file;
> +	return 0;
> +}
> +
> +static void cachefiles_ondemand_flush_reqs(struct cachefiles_cache *cache)
> +{
> +	struct cachefiles_req *req;
> +	unsigned long index;
> +
> +	xa_for_each(&cache->reqs, index, req) {
> +		req->error = -EIO;
> +		complete(&req->done);
> +	}
> +}
> +
> +static int cachefiles_ondemand_release(struct inode *inode, struct file *file)
> +{
> +	struct cachefiles_cache *cache = file->private_data;
> +
> +	_enter("");
> +
> +	ASSERT(cache);

We don't mess with ASSERT() in the kernel, how can this ever be false?

> +
> +	set_bit(CACHEFILES_DEAD, &cache->flags);
> +
> +	cachefiles_ondemand_flush_reqs(cache);
> +	cachefiles_daemon_unbind(cache);
> +
> +	/* clean up the control file interface */
> +	xa_destroy(&cache->reqs);
> +	cache->cachefilesd = NULL;
> +	file->private_data = NULL;
> +	cachefiles_open_ondemand = 0;
> +
> +	kfree(cache);
> +
> +	_leave("");
> +	return 0;
> +}
> +
> +static ssize_t cachefiles_ondemand_write(struct file *file,
> +					 const char __user *_data,
> +					 size_t datalen,
> +					 loff_t *pos)
> +{
> +	return cachefiles_daemon_do_write(file, _data, datalen, pos,
> +					  cachefiles_ondemand_cmds);
> +}
> +
> +static ssize_t cachefiles_ondemand_read(struct file *file, char __user *_buffer,
> +					size_t buflen, loff_t *pos)
> +{
> +	struct cachefiles_cache *cache = file->private_data;
> +	struct cachefiles_req *req;
> +	unsigned long id = 0;
> +	int n;
> +
> +	if (!test_bit(CACHEFILES_READY, &cache->flags))
> +		return 0;
> +
> +	req = xa_find(&cache->reqs, &id, UINT_MAX, XA_PRESENT);
> +	if (!req)
> +		return 0;
> +
> +	n = sizeof(struct cachefiles_req_in);
> +	if (n > buflen)
> +		return -EMSGSIZE;

You forgot to test if you have a big enough buffer to copy the data
into :(

> +
> +	req->base.id = id;
> +	if (copy_to_user(_buffer, &req->base, n) != 0)

No endian issues?

> +		return -EFAULT;
> +
> +	return n;
> +}
> +
> +static __poll_t cachefiles_ondemand_poll(struct file *file,
> +					 struct poll_table_struct *poll)
> +{
> +	struct cachefiles_cache *cache = file->private_data;
> +	__poll_t mask;
> +
> +	poll_wait(file, &cache->daemon_pollwq, poll);
> +	mask = 0;
> +
> +	if (!xa_empty(&cache->reqs))
> +		mask |= EPOLLIN;
> +
> +	return mask;
> +}
> +
> +/*
> + * Request completion
> + * - command: "done <id>"
> + */
> +static int cachefiles_daemon_done(struct cachefiles_cache *cache, char *args)
> +{
> +	struct cachefiles_req *req;
> +	unsigned long id;
> +	int ret;
> +
> +	_enter(",%s", args);
> +
> +	if (!*args) {
> +		pr_err("Empty id specified\n");
> +		return -EINVAL;
> +	}
> +
> +	ret = kstrtoul(args, 0, &id);
> +	if (ret)
> +		return ret;
> +
> +	req = xa_erase(&cache->reqs, id);
> +	if (!req)
> +		return -EINVAL;
> +
> +	complete(&req->done);
> +	return 0;
> +}
> +#endif
> diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
> index 6473634c41a9..59dd11e42cb3 100644
> --- a/fs/cachefiles/internal.h
> +++ b/fs/cachefiles/internal.h
> @@ -15,6 +15,8 @@
>  #include <linux/fscache-cache.h>
>  #include <linux/cred.h>
>  #include <linux/security.h>
> +#include <linux/xarray.h>
> +#include <linux/cachefiles_ondemand.h>
>  
>  #define CACHEFILES_DIO_BLOCK_SIZE 4096
>  
> @@ -102,6 +104,15 @@ struct cachefiles_cache {
>  	char				*rootdirname;	/* name of cache root directory */
>  	char				*secctx;	/* LSM security context */
>  	char				*tag;		/* cache binding tag */
> +#ifdef CONFIG_CACHEFILES_ONDEMAND
> +	struct xarray			reqs;
> +#endif
> +};
> +
> +struct cachefiles_req {
> +	struct cachefiles_req_in base;
> +	struct completion done;
> +	int error;
>  };
>  
>  #include <trace/events/cachefiles.h>
> @@ -146,6 +157,7 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache,
>   * daemon.c
>   */
>  extern const struct file_operations cachefiles_daemon_fops;
> +extern const struct file_operations cachefiles_ondemand_fops;
>  
>  /*
>   * error_inject.c
> diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
> index 753986ea1583..7c51e53d52d1 100644
> --- a/fs/cachefiles/io.c
> +++ b/fs/cachefiles/io.c
> @@ -597,6 +597,67 @@ static void cachefiles_end_operation(struct netfs_cache_resources *cres)
>  	fscache_end_cookie_access(fscache_cres_cookie(cres), fscache_access_io_end);
>  }
>  
> +#ifdef CONFIG_CACHEFILES_ONDEMAND
> +static struct cachefiles_req *cachefiles_alloc_req(struct cachefiles_object *object,
> +						   loff_t start_pos,
> +						   size_t len)
> +{
> +	struct cachefiles_req *req;
> +	struct cachefiles_req_in *base;
> +
> +	req = kzalloc(sizeof(*req), GFP_KERNEL);
> +	if (!req)
> +		return NULL;
> +
> +	base = &req->base;
> +
> +	base->off = start_pos;
> +	base->len = len;
> +	strncpy(base->path, object->d_name, sizeof(base->path) - 1);
> +
> +	init_completion(&req->done);
> +
> +	return req;
> +}
> +
> +static int cachefiles_ondemand_read(struct netfs_cache_resources *cres,
> +				    loff_t start_pos, size_t len)
> +{
> +	struct cachefiles_object *object;
> +	struct cachefiles_cache *cache;
> +	struct cachefiles_req *req;
> +	int ret;
> +	u32 id;
> +
> +	object = cachefiles_cres_object(cres);
> +	cache = object->volume->cache;
> +
> +	if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags))
> +		return -EOPNOTSUPP;
> +
> +	if (test_bit(CACHEFILES_DEAD, &cache->flags))
> +		return -EIO;
> +
> +	req = cachefiles_alloc_req(object, start_pos, len);
> +	if (!req)
> +		return -ENOMEM;
> +
> +	ret = xa_alloc(&cache->reqs, &id, req, xa_limit_32b, GFP_KERNEL);
> +	if (ret) {
> +		kfree(req);
> +		return -ENOMEM;
> +	}
> +
> +	wake_up_all(&cache->daemon_pollwq);
> +
> +	wait_for_completion(&req->done);
> +	ret = req->error;
> +	kfree(req);
> +
> +	return ret;
> +}
> +#endif
> +
>  static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
>  	.end_operation		= cachefiles_end_operation,
>  	.read			= cachefiles_read,
> @@ -604,6 +665,9 @@ static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
>  	.prepare_read		= cachefiles_prepare_read,
>  	.prepare_write		= cachefiles_prepare_write,
>  	.query_occupancy	= cachefiles_query_occupancy,
> +#ifdef CONFIG_CACHEFILES_ONDEMAND
> +	.ondemand_read		= cachefiles_ondemand_read,
> +#endif
>  };
>  
>  /*
> diff --git a/fs/cachefiles/main.c b/fs/cachefiles/main.c
> index 3f369c6f816d..eab17c3140d9 100644
> --- a/fs/cachefiles/main.c
> +++ b/fs/cachefiles/main.c
> @@ -39,6 +39,27 @@ static struct miscdevice cachefiles_dev = {
>  	.fops	= &cachefiles_daemon_fops,
>  };
>  
> +#ifdef CONFIG_CACHEFILES_ONDEMAND
> +static struct miscdevice cachefiles_ondemand_dev = {
> +	.minor	= MISC_DYNAMIC_MINOR,
> +	.name	= "cachefiles_ondemand",

That is a very big device node name.  Are you sure that is what you
want?

And where are you documenting this new misc device node name and format
so that userspace knows about it?

> +	.fops	= &cachefiles_ondemand_fops,
> +};
> +
> +static inline int cachefiles_init_ondemand(void)
> +{
> +	return misc_register(&cachefiles_ondemand_dev);
> +}
> +
> +static inline void cachefiles_exit_ondemand(void)
> +{
> +	misc_deregister(&cachefiles_ondemand_dev);
> +}
> +#else
> +static inline int cachefiles_init_ondemand(void) { return 0; }
> +static inline void cachefiles_exit_ondemand(void) {}
> +#endif
> +
>  /*
>   * initialise the fs caching module
>   */
> @@ -52,6 +73,9 @@ static int __init cachefiles_init(void)
>  	ret = misc_register(&cachefiles_dev);
>  	if (ret < 0)
>  		goto error_dev;
> +	ret = cachefiles_init_ondemand();
> +	if (ret < 0)
> +		goto error_ondemand_dev;
>  
>  	/* create an object jar */
>  	ret = -ENOMEM;
> @@ -68,6 +92,8 @@ static int __init cachefiles_init(void)
>  	return 0;
>  
>  error_object_jar:
> +	cachefiles_exit_ondemand();
> +error_ondemand_dev:
>  	misc_deregister(&cachefiles_dev);
>  error_dev:
>  	cachefiles_unregister_error_injection();
> @@ -86,6 +112,7 @@ static void __exit cachefiles_exit(void)
>  	pr_info("Unloading\n");
>  
>  	kmem_cache_destroy(cachefiles_object_jar);
> +	cachefiles_exit_ondemand();
>  	misc_deregister(&cachefiles_dev);
>  	cachefiles_unregister_error_injection();
>  }
> diff --git a/include/uapi/linux/cachefiles_ondemand.h b/include/uapi/linux/cachefiles_ondemand.h
> new file mode 100644
> index 000000000000..e639a82f1098
> --- /dev/null
> +++ b/include/uapi/linux/cachefiles_ondemand.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +#ifndef _LINUX_CACHEFILES_ONDEMAND_H
> +#define _LINUX_CACHEFILES_ONDEMAND_H
> +
> +#include <linux/limits.h>
> +
> +struct cachefiles_req_in {
> +	uint64_t id;
> +	uint64_t off;
> +	uint64_t len;

For structures that cross the user/kernel boundary, you have to use the
correct types.  For this it would be __u64.

> +	char path[NAME_MAX];

__u8.

Also, what is the endian of the other values here?  Always native?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v4 05/23] cachefiles: introduce new devnode for on-demand read mode
  2022-02-16 10:48       ` Greg KH
@ 2022-02-16 12:49         ` JeffleXu
  2022-02-16 17:48           ` Greg KH
  0 siblings, 1 reply; 35+ messages in thread
From: JeffleXu @ 2022-02-16 12:49 UTC (permalink / raw)
  To: Greg KH
  Cc: dhowells, linux-cachefs, xiang, chao, linux-erofs, torvalds,
	willy, linux-fsdevel, joseph.qi, bo.liu, tao.peng, gerry, eguan,
	linux-kernel

Hi,

Thanks for reviewing. :)


On 2/16/22 6:48 PM, Greg KH wrote:
> On Tue, Feb 15, 2022 at 07:13:35PM +0800, Jeffle Xu wrote:
>> This patch introduces a new devnode 'cachefiles_ondemand' to support the
>> newly introduced on-demand read mode.
>>
>> The precondition for on-demand reading semantics is that all blob files
>> have been placed under the corresponding directory as sparse files of
>> the correct size beforehand. When the upper fs starts to access a blob
>> file, it will "cache miss" (hit a hole) and then turn to the user
>> daemon to prepare the data.
>>
>> The interaction between kernel and user daemon is described below.
>> 1. On a cache miss, the .ondemand_read() callback of the corresponding
>>    fscache backend is called to prepare the data. As for cachefiles, it
>>    just packages the related metadata (file range to read, etc.) into a
>>    pending read request, and the process triggering the cache miss then
>>    sleeps until the corresponding data gets fetched later.
>> 2. The user daemon needs to poll on the devnode ('cachefiles_ondemand'),
>>    waiting for pending read requests.
>> 3. Once there is a pending read request, the user daemon is notified and
>>    shall read the devnode ('cachefiles_ondemand') to fetch one pending
>>    read request to process.
>> 4. For the fetched read request, the user daemon needs to somehow prepare
>>    the data (e.g. download it from the remote over the network) and then
>>    write the fetched data into the backing file to fill the hole.
>> 5. After that, the user daemon needs to notify the cachefiles backend by
>>    writing a 'done' command to the devnode ('cachefiles_ondemand'). This
>>    also wakes up the process that triggered the cache miss.
>> 6. By the time that process is woken, the data is ready in the backing
>>    file, and the process can re-initiate a read request from the backing
>>    file.
>>
>> If the user daemon exits while the upper fs is still mounted, no new
>> on-demand read requests can be queued and the existing pending read
>> requests will fail with -EIO.
>>
>> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
>> ---
>>  fs/cachefiles/daemon.c                   | 185 +++++++++++++++++++++++
>>  fs/cachefiles/internal.h                 |  12 ++
>>  fs/cachefiles/io.c                       |  64 ++++++++
>>  fs/cachefiles/main.c                     |  27 ++++
>>  include/uapi/linux/cachefiles_ondemand.h |  14 ++
>>  5 files changed, 302 insertions(+)
>>  create mode 100644 include/uapi/linux/cachefiles_ondemand.h
>>
>> diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
>> index 6b8d7c5bbe5d..96aee8e0eb14 100644
>> --- a/fs/cachefiles/daemon.c
>> +++ b/fs/cachefiles/daemon.c
>> @@ -757,3 +757,188 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
>>  
>>  	_leave("");
>>  }
>> +
>> +#ifdef CONFIG_CACHEFILES_ONDEMAND
>> +static unsigned long cachefiles_open_ondemand;
>> +
>> +static int cachefiles_ondemand_open(struct inode *inode, struct file *file);
>> +static int cachefiles_ondemand_release(struct inode *inode, struct file *file);
>> +static ssize_t cachefiles_ondemand_write(struct file *, const char __user *,
>> +					 size_t, loff_t *);
>> +static ssize_t cachefiles_ondemand_read(struct file *, char __user *, size_t,
>> +					loff_t *);
>> +static __poll_t cachefiles_ondemand_poll(struct file *,
>> +					 struct poll_table_struct *);
>> +static int cachefiles_daemon_done(struct cachefiles_cache *, char *);
>> +
>> +const struct file_operations cachefiles_ondemand_fops = {
>> +	.owner		= THIS_MODULE,
>> +	.open		= cachefiles_ondemand_open,
>> +	.release	= cachefiles_ondemand_release,
>> +	.read		= cachefiles_ondemand_read,
>> +	.write		= cachefiles_ondemand_write,
>> +	.poll		= cachefiles_ondemand_poll,
>> +	.llseek		= noop_llseek,
>> +};
>> +
>> +static const struct cachefiles_daemon_cmd cachefiles_ondemand_cmds[] = {
>> +	{ "bind",	cachefiles_daemon_bind		},
>> +	{ "brun",	cachefiles_daemon_brun		},
>> +	{ "bcull",	cachefiles_daemon_bcull		},
>> +	{ "bstop",	cachefiles_daemon_bstop		},
>> +	{ "cull",	cachefiles_daemon_cull		},
>> +	{ "debug",	cachefiles_daemon_debug		},
>> +	{ "dir",	cachefiles_daemon_dir		},
>> +	{ "frun",	cachefiles_daemon_frun		},
>> +	{ "fcull",	cachefiles_daemon_fcull		},
>> +	{ "fstop",	cachefiles_daemon_fstop		},
>> +	{ "inuse",	cachefiles_daemon_inuse		},
>> +	{ "secctx",	cachefiles_daemon_secctx	},
>> +	{ "tag",	cachefiles_daemon_tag		},
>> +	{ "done",	cachefiles_daemon_done		},
>> +	{ "",		NULL				}
>> +};
>> +
>> +static int cachefiles_ondemand_open(struct inode *inode, struct file *file)
>> +{
>> +	struct cachefiles_cache *cache;
>> +
>> +	_enter("");
> 
> ftrace is your friend, no need to try to duplicate it with debugging
> stuff.  This and the _leave() calls should be removed.
> 
>> +
>> +	/* only the superuser may do this */
>> +	if (!capable(CAP_SYS_ADMIN))
>> +		return -EPERM;
> 
> Shouldn't you rely on the userspace permissions of the file instead of
> this?
> 

OK. You are right.


>> +
>> +	/* the cachefiles device may only be open once at a time */
>> +	if (xchg(&cachefiles_open_ondemand, 1) == 1)
>> +		return -EBUSY;
>> +
>> +	cache = cachefiles_daemon_open_cache();
>> +	if (!cache) {
>> +		cachefiles_open_ondemand = 0;
>> +		return -ENOMEM;
>> +	}
>> +
>> +	xa_init_flags(&cache->reqs, XA_FLAGS_ALLOC);
>> +	set_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags);
>> +
>> +	file->private_data = cache;
>> +	cache->cachefilesd = file;
>> +	return 0;
>> +}
>> +
>> +static void cachefiles_ondemand_flush_reqs(struct cachefiles_cache *cache)
>> +{
>> +	struct cachefiles_req *req;
>> +	unsigned long index;
>> +
>> +	xa_for_each(&cache->reqs, index, req) {
>> +		req->error = -EIO;
>> +		complete(&req->done);
>> +	}
>> +}
>> +
>> +static int cachefiles_ondemand_release(struct inode *inode, struct file *file)
>> +{
>> +	struct cachefiles_cache *cache = file->private_data;
>> +
>> +	_enter("");
>> +
>> +	ASSERT(cache);
> 
> We don't mess with ASSERT() in the kernel, how can this ever be false?
> 
>> +
>> +	set_bit(CACHEFILES_DEAD, &cache->flags);
>> +
>> +	cachefiles_ondemand_flush_reqs(cache);
>> +	cachefiles_daemon_unbind(cache);
>> +
>> +	/* clean up the control file interface */
>> +	xa_destroy(&cache->reqs);
>> +	cache->cachefilesd = NULL;
>> +	file->private_data = NULL;
>> +	cachefiles_open_ondemand = 0;
>> +
>> +	kfree(cache);
>> +
>> +	_leave("");
>> +	return 0;
>> +}
>> +
>> +static ssize_t cachefiles_ondemand_write(struct file *file,
>> +					 const char __user *_data,
>> +					 size_t datalen,
>> +					 loff_t *pos)
>> +{
>> +	return cachefiles_daemon_do_write(file, _data, datalen, pos,
>> +					  cachefiles_ondemand_cmds);
>> +}
>> +
>> +static ssize_t cachefiles_ondemand_read(struct file *file, char __user *_buffer,
>> +					size_t buflen, loff_t *pos)
>> +{
>> +	struct cachefiles_cache *cache = file->private_data;
>> +	struct cachefiles_req *req;
>> +	unsigned long id = 0;
>> +	int n;
>> +
>> +	if (!test_bit(CACHEFILES_READY, &cache->flags))
>> +		return 0;
>> +
>> +	req = xa_find(&cache->reqs, &id, UINT_MAX, XA_PRESENT);
>> +	if (!req)
>> +		return 0;
>> +
>> +	n = sizeof(struct cachefiles_req_in);
>> +	if (n > buflen)
		^
This statement already checks whether the user buffer is big enough to
hold the data; req->base is of type 'struct cachefiles_req_in'. That said,
it would be better to write it as

"n = sizeof(req->base);"

so the check stays correct even if the type of req->base is changed in the
future.
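
That is, just the one-line change shown in context (a sketch only, nothing
else modified):

	n = sizeof(req->base);
	if (n > buflen)
		return -EMSGSIZE;

	req->base.id = id;
	if (copy_to_user(_buffer, &req->base, n) != 0)
		return -EFAULT;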


>> +		return -EMSGSIZE;
> 
> You forgot to test if you have a big enough buffer to copy the data
> into :(



> 
>> +
>> +	req->base.id = id;
>> +	if (copy_to_user(_buffer, &req->base, n) != 0)
> 
> No endian issues?

'struct cachefiles_req_in' only ever lives in memory and is never written
to disk, so there is no endian issue.

> 
>> +		return -EFAULT;
>> +
>> +	return n;
>> +}
>> +
>> +static __poll_t cachefiles_ondemand_poll(struct file *file,
>> +					 struct poll_table_struct *poll)
>> +{
>> +	struct cachefiles_cache *cache = file->private_data;
>> +	__poll_t mask;
>> +
>> +	poll_wait(file, &cache->daemon_pollwq, poll);
>> +	mask = 0;
>> +
>> +	if (!xa_empty(&cache->reqs))
>> +		mask |= EPOLLIN;
>> +
>> +	return mask;
>> +}
>> +
>> +/*
>> + * Request completion
>> + * - command: "done <id>"
>> + */
>> +static int cachefiles_daemon_done(struct cachefiles_cache *cache, char *args)
>> +{
>> +	struct cachefiles_req *req;
>> +	unsigned long id;
>> +	int ret;
>> +
>> +	_enter(",%s", args);
>> +
>> +	if (!*args) {
>> +		pr_err("Empty id specified\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	ret = kstrtoul(args, 0, &id);
>> +	if (ret)
>> +		return ret;
>> +
>> +	req = xa_erase(&cache->reqs, id);
>> +	if (!req)
>> +		return -EINVAL;
>> +
>> +	complete(&req->done);
>> +	return 0;
>> +}
>> +#endif
>> diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
>> index 6473634c41a9..59dd11e42cb3 100644
>> --- a/fs/cachefiles/internal.h
>> +++ b/fs/cachefiles/internal.h
>> @@ -15,6 +15,8 @@
>>  #include <linux/fscache-cache.h>
>>  #include <linux/cred.h>
>>  #include <linux/security.h>
>> +#include <linux/xarray.h>
>> +#include <linux/cachefiles_ondemand.h>
>>  
>>  #define CACHEFILES_DIO_BLOCK_SIZE 4096
>>  
>> @@ -102,6 +104,15 @@ struct cachefiles_cache {
>>  	char				*rootdirname;	/* name of cache root directory */
>>  	char				*secctx;	/* LSM security context */
>>  	char				*tag;		/* cache binding tag */
>> +#ifdef CONFIG_CACHEFILES_ONDEMAND
>> +	struct xarray			reqs;
>> +#endif
>> +};
>> +
>> +struct cachefiles_req {
>> +	struct cachefiles_req_in base;
>> +	struct completion done;
>> +	int error;
>>  };
>>  
>>  #include <trace/events/cachefiles.h>
>> @@ -146,6 +157,7 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache,
>>   * daemon.c
>>   */
>>  extern const struct file_operations cachefiles_daemon_fops;
>> +extern const struct file_operations cachefiles_ondemand_fops;
>>  
>>  /*
>>   * error_inject.c
>> diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
>> index 753986ea1583..7c51e53d52d1 100644
>> --- a/fs/cachefiles/io.c
>> +++ b/fs/cachefiles/io.c
>> @@ -597,6 +597,67 @@ static void cachefiles_end_operation(struct netfs_cache_resources *cres)
>>  	fscache_end_cookie_access(fscache_cres_cookie(cres), fscache_access_io_end);
>>  }
>>  
>> +#ifdef CONFIG_CACHEFILES_ONDEMAND
>> +static struct cachefiles_req *cachefiles_alloc_req(struct cachefiles_object *object,
>> +						   loff_t start_pos,
>> +						   size_t len)
>> +{
>> +	struct cachefiles_req *req;
>> +	struct cachefiles_req_in *base;
>> +
>> +	req = kzalloc(sizeof(*req), GFP_KERNEL);
>> +	if (!req)
>> +		return NULL;
>> +
>> +	base = &req->base;
>> +
>> +	base->off = start_pos;
>> +	base->len = len;
>> +	strncpy(base->path, object->d_name, sizeof(base->path) - 1);
>> +
>> +	init_completion(&req->done);
>> +
>> +	return req;
>> +}
>> +
>> +static int cachefiles_ondemand_read(struct netfs_cache_resources *cres,
>> +				    loff_t start_pos, size_t len)
>> +{
>> +	struct cachefiles_object *object;
>> +	struct cachefiles_cache *cache;
>> +	struct cachefiles_req *req;
>> +	int ret;
>> +	u32 id;
>> +
>> +	object = cachefiles_cres_object(cres);
>> +	cache = object->volume->cache;
>> +
>> +	if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags))
>> +		return -EOPNOTSUPP;
>> +
>> +	if (test_bit(CACHEFILES_DEAD, &cache->flags))
>> +		return -EIO;
>> +
>> +	req = cachefiles_alloc_req(object, start_pos, len);
>> +	if (!req)
>> +		return -ENOMEM;
>> +
>> +	ret = xa_alloc(&cache->reqs, &id, req, xa_limit_32b, GFP_KERNEL);
>> +	if (ret) {
>> +		kfree(req);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	wake_up_all(&cache->daemon_pollwq);
>> +
>> +	wait_for_completion(&req->done);
>> +	ret = req->error;
>> +	kfree(req);
>> +
>> +	return ret;
>> +}
>> +#endif
>> +
>>  static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
>>  	.end_operation		= cachefiles_end_operation,
>>  	.read			= cachefiles_read,
>> @@ -604,6 +665,9 @@ static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
>>  	.prepare_read		= cachefiles_prepare_read,
>>  	.prepare_write		= cachefiles_prepare_write,
>>  	.query_occupancy	= cachefiles_query_occupancy,
>> +#ifdef CONFIG_CACHEFILES_ONDEMAND
>> +	.ondemand_read		= cachefiles_ondemand_read,
>> +#endif
>>  };
>>  
>>  /*
>> diff --git a/fs/cachefiles/main.c b/fs/cachefiles/main.c
>> index 3f369c6f816d..eab17c3140d9 100644
>> --- a/fs/cachefiles/main.c
>> +++ b/fs/cachefiles/main.c
>> @@ -39,6 +39,27 @@ static struct miscdevice cachefiles_dev = {
>>  	.fops	= &cachefiles_daemon_fops,
>>  };
>>  
>> +#ifdef CONFIG_CACHEFILES_ONDEMAND
>> +static struct miscdevice cachefiles_ondemand_dev = {
>> +	.minor	= MISC_DYNAMIC_MINOR,
>> +	.name	= "cachefiles_ondemand",
> 
> That is a very big device node name.  Are you sure that is what you
> want?

I have to admit that it's not a good name. I need to think of a better
name...

> 
> And where are you documenting this new misc device node name and format
> so that userspace knows about it?

Sorry, I haven't documented any of this yet. Indeed, we need proper documentation.
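
For now, here is a rough sketch of how a user daemon is expected to drive
the devnode with the interface in this patch. The /dev/cachefiles_ondemand
path just follows the miscdevice name, the cache directory and tag are made
up for the example, fetch_and_fill() is a purely hypothetical placeholder
for locating the sparse backing file and filling the requested range, and
error handling is omitted:

#include <fcntl.h>
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <linux/cachefiles_ondemand.h>

/* hypothetical placeholder: locate the sparse backing file under the
 * cache directory, download [off, off + len) of the blob named by
 * 'path' from remote, and pwrite() it into the backing file */
static void fetch_and_fill(const char *path, uint64_t off, uint64_t len)
{
}

static void command(int fd, const char *cmd)
{
	write(fd, cmd, strlen(cmd));
}

int main(void)
{
	struct cachefiles_req_in req;
	struct pollfd pfd;
	char done[64];
	int fd;

	fd = open("/dev/cachefiles_ondemand", O_RDWR);
	if (fd < 0)
		return 1;

	/* configure and bind the cache, as with /dev/cachefiles */
	command(fd, "dir /var/cache/fscache");
	command(fd, "tag mycache");
	command(fd, "bind");

	pfd.fd = fd;
	pfd.events = POLLIN;

	for (;;) {
		/* wait until the kernel queues a pending read request */
		poll(&pfd, 1, -1);
		if (read(fd, &req, sizeof(req)) < (ssize_t)sizeof(req))
			continue;

		fetch_and_fill(req.path, req.off, req.len);

		/* ack completion; this wakes the process that hit the miss */
		snprintf(done, sizeof(done), "done %llu",
			 (unsigned long long)req.id);
		command(fd, done);
	}
}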

> 
>> +	.fops	= &cachefiles_ondemand_fops,
>> +};
>> +
>> +static inline int cachefiles_init_ondemand(void)
>> +{
>> +	return misc_register(&cachefiles_ondemand_dev);
>> +}
>> +
>> +static inline void cachefiles_exit_ondemand(void)
>> +{
>> +	misc_deregister(&cachefiles_ondemand_dev);
>> +}
>> +#else
>> +static inline int cachefiles_init_ondemand(void) { return 0; }
>> +static inline void cachefiles_exit_ondemand(void) {}
>> +#endif
>> +
>>  /*
>>   * initialise the fs caching module
>>   */
>> @@ -52,6 +73,9 @@ static int __init cachefiles_init(void)
>>  	ret = misc_register(&cachefiles_dev);
>>  	if (ret < 0)
>>  		goto error_dev;
>> +	ret = cachefiles_init_ondemand();
>> +	if (ret < 0)
>> +		goto error_ondemand_dev;
>>  
>>  	/* create an object jar */
>>  	ret = -ENOMEM;
>> @@ -68,6 +92,8 @@ static int __init cachefiles_init(void)
>>  	return 0;
>>  
>>  error_object_jar:
>> +	cachefiles_exit_ondemand();
>> +error_ondemand_dev:
>>  	misc_deregister(&cachefiles_dev);
>>  error_dev:
>>  	cachefiles_unregister_error_injection();
>> @@ -86,6 +112,7 @@ static void __exit cachefiles_exit(void)
>>  	pr_info("Unloading\n");
>>  
>>  	kmem_cache_destroy(cachefiles_object_jar);
>> +	cachefiles_exit_ondemand();
>>  	misc_deregister(&cachefiles_dev);
>>  	cachefiles_unregister_error_injection();
>>  }
>> diff --git a/include/uapi/linux/cachefiles_ondemand.h b/include/uapi/linux/cachefiles_ondemand.h
>> new file mode 100644
>> index 000000000000..e639a82f1098
>> --- /dev/null
>> +++ b/include/uapi/linux/cachefiles_ondemand.h
>> @@ -0,0 +1,14 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +#ifndef _LINUX_CACHEFILES_ONDEMAND_H
>> +#define _LINUX_CACHEFILES_ONDEMAND_H
>> +
>> +#include <linux/limits.h>
>> +
>> +struct cachefiles_req_in {
>> +	uint64_t id;
>> +	uint64_t off;
>> +	uint64_t len;
> 
> For structures that cross the user/kernel boundary, you have to use the
> correct types.  For this it would be __u64.

OK I will change to __xx style in the next version.

By the way, I can't understand the disadvantage of uintxx_t style. I can
only find the initial commit [1] that introduced the __xx style. But
still I can't get any background info.

[1] commit d13ff31cfeedbf2fefc7ba13cb753775648eac0c ("types: create
<asm-generic/int-*.h>")

> 
>> +	char path[NAME_MAX];
> 
> __u8.
> 
> Also, what is the endian of the other values here?  Always native?

As stated previously, these structures are always in memory, and thus
there's no endian issue.
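
Putting the two type comments together, the UAPI header in the next version
would then look something like this (same fields, only the types changed to
the fixed-width kernel ones):

/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef _LINUX_CACHEFILES_ONDEMAND_H
#define _LINUX_CACHEFILES_ONDEMAND_H

#include <linux/types.h>
#include <linux/limits.h>

struct cachefiles_req_in {
	__u64 id;
	__u64 off;
	__u64 len;
	__u8  path[NAME_MAX];
};

#endif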

-- 
Thanks,
Jeffle

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v4 05/23] cachefiles: introduce new devnode for on-demand read mode
  2022-02-16 12:49         ` JeffleXu
@ 2022-02-16 17:48           ` Greg KH
  2022-02-17  1:49             ` JeffleXu
  0 siblings, 1 reply; 35+ messages in thread
From: Greg KH @ 2022-02-16 17:48 UTC (permalink / raw)
  To: JeffleXu
  Cc: dhowells, linux-cachefs, xiang, chao, linux-erofs, torvalds,
	willy, linux-fsdevel, joseph.qi, bo.liu, tao.peng, gerry, eguan,
	linux-kernel

On Wed, Feb 16, 2022 at 08:49:35PM +0800, JeffleXu wrote:
> >> +struct cachefiles_req_in {
> >> +	uint64_t id;
> >> +	uint64_t off;
> >> +	uint64_t len;
> > 
> > For structures that cross the user/kernel boundary, you have to use the
> > correct types.  For this it would be __u64.
> 
> OK I will change to __xx style in the next version.
> 
> By the way, I can't understand the disadvantage of uintxx_t style.

The "uint*" types are not valid kernel types.  They are userspace types
and do not transfer properly in all arches and situations when crossing
the user/kernel boundary.  They are also in a different C "namespace", so
should not even be used in kernel code, although a lot of people do
because they are used to writing userspace C code :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v4 05/23] cachefiles: introduce new devnode for on-demand read mode
  2022-02-16 17:48           ` Greg KH
@ 2022-02-17  1:49             ` JeffleXu
  0 siblings, 0 replies; 35+ messages in thread
From: JeffleXu @ 2022-02-17  1:49 UTC (permalink / raw)
  To: Greg KH
  Cc: dhowells, linux-cachefs, xiang, chao, linux-erofs, torvalds,
	willy, linux-fsdevel, joseph.qi, bo.liu, tao.peng, gerry, eguan,
	linux-kernel



On 2/17/22 1:48 AM, Greg KH wrote:
> On Wed, Feb 16, 2022 at 08:49:35PM +0800, JeffleXu wrote:
>>>> +struct cachefiles_req_in {
>>>> +	uint64_t id;
>>>> +	uint64_t off;
>>>> +	uint64_t len;
>>>
>>> For structures that cross the user/kernel boundary, you have to use the
>>> correct types.  For this it would be __u64.
>>
>> OK I will change to __xx style in the next version.
>>
>> By the way, I can't understand the disadvantage of uintxx_t style.
> 
> The "uint*" types are not valid kernel types.  They are userspace types
> and do not transfer properly in all arches and situations when crossing
> the user/kernel boundary.  They are also in a different C "namespace", so
> should not even be used in kernel code, although a lot of people do
> because they are used to writing userspace C code :(

OK. "uint*" types are defined in ISO C library, while it seems that
linux kernel doesn't expect any C library [1].

[1] https://kernelnewbies.org/FAQ/LibraryFunctionsInKernel

Thanks for explaining it.

-- 
Thanks,
Jeffle

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 01/22] fscache: export fscache_end_operation()
  2022-02-09  6:00 ` [PATCH v3 01/22] fscache: export fscache_end_operation() Jeffle Xu
@ 2022-02-17  7:44   ` Liu Bo
  0 siblings, 0 replies; 35+ messages in thread
From: Liu Bo @ 2022-02-17  7:44 UTC (permalink / raw)
  To: Jeffle Xu
  Cc: dhowells, linux-cachefs, xiang, chao, linux-erofs, torvalds,
	gregkh, willy, linux-fsdevel, joseph.qi, tao.peng, gerry, eguan,
	linux-kernel

On Wed, Feb 09, 2022 at 02:00:47PM +0800, Jeffle Xu wrote:
> Export fscache_end_operation() to avoid code duplication.
> 
> Besides, considering the paired fscache_begin_read_operation() is
> already exported, it shall make sense to also export
> fscache_end_operation().
>

Looks reasonable to me.

Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>

> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
> ---
>  fs/fscache/internal.h   | 11 -----------
>  fs/nfs/fscache.c        |  8 --------
>  include/linux/fscache.h | 14 ++++++++++++++
>  3 files changed, 14 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
> index f121c21590dc..ed1c9ed737f2 100644
> --- a/fs/fscache/internal.h
> +++ b/fs/fscache/internal.h
> @@ -70,17 +70,6 @@ static inline void fscache_see_cookie(struct fscache_cookie *cookie,
>  			     where);
>  }
>  
> -/*
> - * io.c
> - */
> -static inline void fscache_end_operation(struct netfs_cache_resources *cres)
> -{
> -	const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
> -
> -	if (ops)
> -		ops->end_operation(cres);
> -}
> -
>  /*
>   * main.c
>   */
> diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> index cfe901650ab0..39654ca72d3d 100644
> --- a/fs/nfs/fscache.c
> +++ b/fs/nfs/fscache.c
> @@ -249,14 +249,6 @@ void nfs_fscache_release_file(struct inode *inode, struct file *filp)
>  	}
>  }
>  
> -static inline void fscache_end_operation(struct netfs_cache_resources *cres)
> -{
> -	const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
> -
> -	if (ops)
> -		ops->end_operation(cres);
> -}
> -
>  /*
>   * Fallback page reading interface.
>   */
> diff --git a/include/linux/fscache.h b/include/linux/fscache.h
> index 296c5f1d9f35..d2430da8aa67 100644
> --- a/include/linux/fscache.h
> +++ b/include/linux/fscache.h
> @@ -456,6 +456,20 @@ int fscache_begin_read_operation(struct netfs_cache_resources *cres,
>  	return -ENOBUFS;
>  }
>  
> +/**
> + * fscache_end_operation - End the read operation for the netfs lib
> + * @cres: The cache resources for the read operation
> + *
> + * Clean up the resources at the end of the read request.
> + */
> +static inline void fscache_end_operation(struct netfs_cache_resources *cres)
> +{
> +	const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
> +
> +	if (ops)
> +		ops->end_operation(cres);
> +}
> +
>  /**
>   * fscache_read - Start a read from the cache.
>   * @cres: The cache resources to use
> -- 
> 2.27.0

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 03/22] cachefiles: extract generic function for daemon methods
  2022-02-09  6:00 ` [PATCH v3 03/22] cachefiles: extract generic function for daemon methods Jeffle Xu
@ 2022-02-17  8:17   ` Liu Bo
  0 siblings, 0 replies; 35+ messages in thread
From: Liu Bo @ 2022-02-17  8:17 UTC (permalink / raw)
  To: Jeffle Xu
  Cc: dhowells, linux-cachefs, xiang, chao, linux-erofs, torvalds,
	gregkh, willy, linux-fsdevel, joseph.qi, tao.peng, gerry, eguan,
	linux-kernel

On Wed, Feb 09, 2022 at 02:00:49PM +0800, Jeffle Xu wrote:
> ... so that the following new devnode can reuse most of the code when
> implementing its own methods.
>

Looks good.

Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
liubo

> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
> ---
>  fs/cachefiles/daemon.c | 70 +++++++++++++++++++++++++++---------------
>  1 file changed, 45 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
> index 7ac04ee2c0a0..6b8d7c5bbe5d 100644
> --- a/fs/cachefiles/daemon.c
> +++ b/fs/cachefiles/daemon.c
> @@ -78,6 +78,34 @@ static const struct cachefiles_daemon_cmd cachefiles_daemon_cmds[] = {
>  	{ "",		NULL				}
>  };
>  
> +static struct cachefiles_cache *cachefiles_daemon_open_cache(void)
> +{
> +	struct cachefiles_cache *cache;
> +
> +	/* allocate a cache record */
> +	cache = kzalloc(sizeof(struct cachefiles_cache), GFP_KERNEL);
> +	if (cache) {
> +		mutex_init(&cache->daemon_mutex);
> +		init_waitqueue_head(&cache->daemon_pollwq);
> +		INIT_LIST_HEAD(&cache->volumes);
> +		INIT_LIST_HEAD(&cache->object_list);
> +		spin_lock_init(&cache->object_list_lock);
> +
> +		/* set default caching limits
> +		 * - limit at 1% free space and/or free files
> +		 * - cull below 5% free space and/or free files
> +		 * - cease culling above 7% free space and/or free files
> +		 */
> +		cache->frun_percent = 7;
> +		cache->fcull_percent = 5;
> +		cache->fstop_percent = 1;
> +		cache->brun_percent = 7;
> +		cache->bcull_percent = 5;
> +		cache->bstop_percent = 1;
> +	}
> +
> +	return cache;
> +}
>  
>  /*
>   * Prepare a cache for caching.
> @@ -96,31 +124,13 @@ static int cachefiles_daemon_open(struct inode *inode, struct file *file)
>  	if (xchg(&cachefiles_open, 1) == 1)
>  		return -EBUSY;
>  
> -	/* allocate a cache record */
> -	cache = kzalloc(sizeof(struct cachefiles_cache), GFP_KERNEL);
> +
> +	cache = cachefiles_daemon_open_cache();
>  	if (!cache) {
>  		cachefiles_open = 0;
>  		return -ENOMEM;
>  	}
>  
> -	mutex_init(&cache->daemon_mutex);
> -	init_waitqueue_head(&cache->daemon_pollwq);
> -	INIT_LIST_HEAD(&cache->volumes);
> -	INIT_LIST_HEAD(&cache->object_list);
> -	spin_lock_init(&cache->object_list_lock);
> -
> -	/* set default caching limits
> -	 * - limit at 1% free space and/or free files
> -	 * - cull below 5% free space and/or free files
> -	 * - cease culling above 7% free space and/or free files
> -	 */
> -	cache->frun_percent = 7;
> -	cache->fcull_percent = 5;
> -	cache->fstop_percent = 1;
> -	cache->brun_percent = 7;
> -	cache->bcull_percent = 5;
> -	cache->bstop_percent = 1;
> -
>  	file->private_data = cache;
>  	cache->cachefilesd = file;
>  	return 0;
> @@ -209,10 +219,11 @@ static ssize_t cachefiles_daemon_read(struct file *file, char __user *_buffer,
>  /*
>   * Take a command from cachefilesd, parse it and act on it.
>   */
> -static ssize_t cachefiles_daemon_write(struct file *file,
> -				       const char __user *_data,
> -				       size_t datalen,
> -				       loff_t *pos)
> +static ssize_t cachefiles_daemon_do_write(struct file *file,
> +					  const char __user *_data,
> +					  size_t datalen,
> +					  loff_t *pos,
> +			const struct cachefiles_daemon_cmd *cmds)
>  {
>  	const struct cachefiles_daemon_cmd *cmd;
>  	struct cachefiles_cache *cache = file->private_data;
> @@ -261,7 +272,7 @@ static ssize_t cachefiles_daemon_write(struct file *file,
>  	}
>  
>  	/* run the appropriate command handler */
> -	for (cmd = cachefiles_daemon_cmds; cmd->name[0]; cmd++)
> +	for (cmd = cmds; cmd->name[0]; cmd++)
>  		if (strcmp(cmd->name, data) == 0)
>  			goto found_command;
>  
> @@ -284,6 +295,15 @@ static ssize_t cachefiles_daemon_write(struct file *file,
>  	goto error;
>  }
>  
> +static ssize_t cachefiles_daemon_write(struct file *file,
> +				       const char __user *_data,
> +				       size_t datalen,
> +				       loff_t *pos)
> +{
> +	return cachefiles_daemon_do_write(file, _data, datalen, pos,
> +					  cachefiles_daemon_cmds);
> +}
> +
>  /*
>   * Poll for culling state
>   * - use EPOLLOUT to indicate culling state
> -- 
> 2.27.0

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2022-02-17  8:17 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
2022-02-09  6:00 [PATCH v3 00/22] fscache,erofs: fscache-based demand-read semantics Jeffle Xu
2022-02-09  6:00 ` [PATCH v3 01/22] fscache: export fscache_end_operation() Jeffle Xu
2022-02-17  7:44   ` Liu Bo
2022-02-09  6:00 ` [PATCH v3 02/22] fscache: add a method to support on-demand read semantics Jeffle Xu
2022-02-09  6:00 ` [PATCH v3 03/22] cachefiles: extract generic function for daemon methods Jeffle Xu
2022-02-17  8:17   ` Liu Bo
2022-02-09  6:00 ` [PATCH v3 04/22] cachefiles: detect backing file size in on-demand read mode Jeffle Xu
2022-02-09  6:00 ` [PATCH v3 05/22] cachefiles: introduce new devnode for " Jeffle Xu
2022-02-15  9:03   ` JeffleXu
2022-02-15 10:37     ` Greg KH
2022-02-16  8:17       ` JeffleXu
2022-02-15 11:13     ` [PATCH v4 05/23] " Jeffle Xu
2022-02-16 10:48       ` Greg KH
2022-02-16 12:49         ` JeffleXu
2022-02-16 17:48           ` Greg KH
2022-02-17  1:49             ` JeffleXu
2022-02-09  6:00 ` [PATCH v3 06/22] erofs: use meta buffers for erofs_read_superblock() Jeffle Xu
2022-02-09  7:52   ` Gao Xiang
2022-02-09  6:00 ` [PATCH v3 07/22] erofs: export erofs_map_blocks() Jeffle Xu
2022-02-09  6:00 ` [PATCH v3 08/22] erofs: add mode checking helper Jeffle Xu
2022-02-09  6:00 ` [PATCH v3 09/22] erofs: register global fscache volume Jeffle Xu
2022-02-09  6:00 ` [PATCH v3 10/22] erofs: add cookie context helper functions Jeffle Xu
2022-02-09  6:00 ` [PATCH v3 11/22] erofs: add anonymous inode managing page cache of blob file Jeffle Xu
2022-02-09  6:00 ` [PATCH v3 12/22] erofs: add erofs_fscache_read_page() helper Jeffle Xu
2022-02-09  6:00 ` [PATCH v3 13/22] erofs: register cookie context for bootstrap blob Jeffle Xu
2022-02-09  6:01 ` [PATCH v3 14/22] erofs: implement fscache-based metadata read Jeffle Xu
2022-02-09  6:01 ` [PATCH v3 15/22] erofs: implement fscache-based data read for non-inline layout Jeffle Xu
2022-02-09  6:01 ` [PATCH v3 16/22] erofs: implement fscache-based data read for inline layout Jeffle Xu
2022-02-09  6:01 ` [PATCH v3 17/22] erofs: register cookie context for data blobs Jeffle Xu
2022-02-09  6:01 ` [PATCH v3 18/22] erofs: implement fscache-based data read " Jeffle Xu
2022-02-09  6:01 ` [PATCH v3 19/22] erofs: implement fscache-based data readahead for hole Jeffle Xu
2022-02-09  6:01 ` [PATCH v3 20/22] erofs: implement fscache-based data readahead for non-inline layout Jeffle Xu
2022-02-09  6:01 ` [PATCH v3 21/22] erofs: implement fscache-based data readahead for inline layout Jeffle Xu
2022-02-09  6:01 ` [PATCH v3 22/22] erofs: add 'uuid' mount option Jeffle Xu
2022-02-10  5:58 ` [Linux-cachefs] [PATCH v3 00/22] fscache, erofs: fscache-based demand-read semantics Gao Xiang
