* [RFC PATCH v4 0/2] erofs: support compressed fragments data
@ 2022-09-13 11:05 ` Yue Hu
  0 siblings, 0 replies; 12+ messages in thread
From: Yue Hu @ 2022-09-13 11:05 UTC (permalink / raw)
  To: xiang, chao
  Cc: linux-erofs, linux-kernel, zhangwen, shaojunjun, zbestahu, Yue Hu

From: Yue Hu <huyue2@coolpad.com>

This feature can merge the tail of each file, or whole files, into a
special inode to achieve a greater compression ratio.

Also add support for an interlaced uncompressed data layout for
compressed files, since the fragments feature (and later ones) can use it.

mkfs v8: https://lore.kernel.org/all/cover.1663065968.git.huyue2@coolpad.com/

changes from v3:
 - improve the interlaced layout for non-4K uncompressed data as well (Xiang)
 - support 64-bit fragment offset for the fragment inode and the legacy
   compression layout (Xiang)

changes from v2:
 - enhance the condition to check if pcluster is interlaced or not;
 - no typo.

changes from v1:
 - fix a compile error without CONFIG_EROFS_FS_ZIP, reported by kernel test
   robot <lkp@intel.com>;
 - introduce the term 'interlaced' for patch 1/2, suggested by Xiang;
 - fix the packed inode failure path when reading the superblock, pointed
   out by Xiang;
 - use kmap_local_page instead of kmap_atomic, pointed out by Xiang;
 - use a simpler way to avoid reading fragment data twice, suggested by
   Xiang;
 - update the commit message.

Yue Hu (2):
  erofs: support interlaced uncompressed data for compressed files
  erofs: support on-disk compressed fragments data

 fs/erofs/decompressor.c | 47 ++++++++++++++++++-------------
 fs/erofs/erofs_fs.h     | 31 +++++++++++++++++----
 fs/erofs/internal.h     | 17 +++++++++--
 fs/erofs/super.c        | 15 ++++++++++
 fs/erofs/sysfs.c        |  2 ++
 fs/erofs/zdata.c        | 48 ++++++++++++++++++++++++++++++-
 fs/erofs/zmap.c         | 62 +++++++++++++++++++++++++++++++++++++----
 7 files changed, 187 insertions(+), 35 deletions(-)

-- 
2.17.1



* [RFC PATCH v4 1/2] erofs: support interlaced uncompressed data for compressed files
       [not found] ` <cover.1663066966.git.huyue2@coolpad.com>
@ 2022-09-13 11:05     ` Yue Hu
  2022-09-13 11:05     ` Yue Hu
  1 sibling, 0 replies; 12+ messages in thread
From: Yue Hu @ 2022-09-13 11:05 UTC (permalink / raw)
  To: xiang, chao
  Cc: linux-erofs, linux-kernel, zhangwen, shaojunjun, zbestahu, Yue Hu

From: Yue Hu <huyue2@coolpad.com>

Currently, uncompressed data is always handled in the shifted way, which
means we have to shift the whole on-disk plain pcluster to get the
logical data. However, since in-place I/O is also used for uncompressed
data, data copies are reduced a lot if the pcluster is recorded in the
interlaced way, as illustrated below:
 _______________________________________________________________
|               |    |               |_ tail part |_ head part _|
|<-   blk0    ->| .. |<-   blkn-2  ->|<-         blkn-1       ->|

The logical data then becomes:
 ________________________________________________________
|_ head part _|_  blk0  _| .. |_  blkn-2  _|_ tail part _|

In addition, the interlaced layout also supports non-4k plain
pclusters, so it can be used for non-4k lclusters as well.

However, it is almost impossible to de-duplicate uncompressed data
stored in the interlaced way, so shifted uncompressed data is still
useful.

Signed-off-by: Yue Hu <huyue2@coolpad.com>
---
 fs/erofs/decompressor.c | 47 ++++++++++++++++++++++++-----------------
 fs/erofs/erofs_fs.h     |  2 ++
 fs/erofs/internal.h     |  1 +
 fs/erofs/zmap.c         | 14 ++++++++----
 4 files changed, 41 insertions(+), 23 deletions(-)
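
Note for readers (not part of the patch): the difference between the two
plain layouts can be sketched in user space. This is only an illustration
under stated assumptions: a toy 16-byte block instead of PAGE_SIZE, a
pcluster made of a single block, and hypothetical helper names
(read_shifted, read_interlaced, layouts_agree). Only the copy offsets
mirror z_erofs_transform_plain() in the diff below.

```c
#include <string.h>

#define BLK 16u	/* toy block/page size; the kernel uses PAGE_SIZE */

/* Bytes that still fit into the first output page starting at `ofs`. */
static unsigned int right_half(unsigned int ofs, unsigned int len)
{
	unsigned int room = BLK - ofs;

	return len < room ? len : room;
}

/* Shifted: the block stores the logical bytes back to back from offset 0. */
static void read_shifted(const unsigned char *blk, unsigned int ofs,
			 unsigned int len, unsigned char *page0,
			 unsigned char *page1)
{
	unsigned int right = right_half(ofs, len);

	memcpy(page0 + ofs, blk, right);
	memcpy(page1, blk + right, len - right);
}

/* Interlaced: the head part sits at `ofs`, the tail part wraps to offset 0. */
static void read_interlaced(const unsigned char *blk, unsigned int ofs,
			    unsigned int len, unsigned char *page0,
			    unsigned char *page1)
{
	unsigned int right = right_half(ofs, len);

	memcpy(page0 + ofs, blk + ofs, right);
	memcpy(page1, blk, len - right);
}

/*
 * Store BLK logical bytes both ways and read them back. Returns 1 if both
 * layouts yield the same pages and every byte of the interlaced block
 * already sits at its final in-page offset (the in-place I/O property).
 */
static int layouts_agree(unsigned int ofs)
{
	unsigned char logical[BLK], shifted[BLK], interlaced[BLK];
	unsigned char a0[BLK] = {0}, a1[BLK] = {0};
	unsigned char b0[BLK] = {0}, b1[BLK] = {0};
	unsigned int right = BLK - ofs, i;

	for (i = 0; i < BLK; i++)
		logical[i] = (unsigned char)('A' + i);

	memcpy(shifted, logical, BLK);
	memcpy(interlaced + ofs, logical, right);		/* head part */
	memcpy(interlaced, logical + right, BLK - right);	/* tail part */

	read_shifted(shifted, ofs, BLK, a0, a1);
	read_interlaced(interlaced, ofs, BLK, b0, b1);

	for (i = 0; i < BLK; i++)
		if (interlaced[i] != (i >= ofs ? b0[i] : b1[i]))
			return 0;
	return !memcmp(a0, b0, BLK) && !memcmp(a1, b1, BLK);
}
```

Calling layouts_agree() with any offset in [0, BLK) is expected to return
1. In the sketch both readers still copy; the point is that an interlaced
block needs no byte to change its in-page offset, which is what lets
in-place I/O skip the copy.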

diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
index 2d55569f96ac..51b7ac7166d9 100644
--- a/fs/erofs/decompressor.c
+++ b/fs/erofs/decompressor.c
@@ -317,52 +317,61 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq,
 	return ret;
 }
 
-static int z_erofs_shifted_transform(struct z_erofs_decompress_req *rq,
-				     struct page **pagepool)
+static int z_erofs_transform_plain(struct z_erofs_decompress_req *rq,
+				   struct page **pagepool)
 {
-	const unsigned int nrpages_out =
+	const unsigned int inpages = PAGE_ALIGN(rq->inputsize) >> PAGE_SHIFT;
+	const unsigned int outpages =
 		PAGE_ALIGN(rq->pageofs_out + rq->outputsize) >> PAGE_SHIFT;
 	const unsigned int righthalf = min_t(unsigned int, rq->outputsize,
 					     PAGE_SIZE - rq->pageofs_out);
 	const unsigned int lefthalf = rq->outputsize - righthalf;
+	const unsigned int interlaced_offset =
+		rq->alg == Z_EROFS_COMPRESSION_SHIFTED ? 0 : rq->pageofs_out;
 	unsigned char *src, *dst;
 
-	if (nrpages_out > 2) {
+	if (outpages > 2 && rq->alg == Z_EROFS_COMPRESSION_SHIFTED) {
 		DBG_BUGON(1);
-		return -EIO;
+		return -EFSCORRUPTED;
 	}
 
 	if (rq->out[0] == *rq->in) {
-		DBG_BUGON(nrpages_out != 1);
+		DBG_BUGON(rq->pageofs_out);
 		return 0;
 	}
 
-	src = kmap_atomic(*rq->in) + rq->pageofs_in;
+	src = kmap_local_page(rq->in[inpages - 1]) + rq->pageofs_in;
 	if (rq->out[0]) {
-		dst = kmap_atomic(rq->out[0]);
-		memcpy(dst + rq->pageofs_out, src, righthalf);
-		kunmap_atomic(dst);
+		dst = kmap_local_page(rq->out[0]);
+		memcpy(dst + rq->pageofs_out, src + interlaced_offset,
+		       righthalf);
+		kunmap_local(dst);
 	}
 
-	if (nrpages_out == 2) {
-		DBG_BUGON(!rq->out[1]);
-		if (rq->out[1] == *rq->in) {
+	if (outpages > inpages) {
+		DBG_BUGON(!rq->out[outpages - 1]);
+		if (rq->out[outpages - 1] != rq->in[inpages - 1]) {
+			dst = kmap_local_page(rq->out[outpages - 1]);
+			memcpy(dst, interlaced_offset ? src :
+					(src + righthalf), lefthalf);
+			kunmap_local(dst);
+		} else if (!interlaced_offset) {
 			memmove(src, src + righthalf, lefthalf);
-		} else {
-			dst = kmap_atomic(rq->out[1]);
-			memcpy(dst, src + righthalf, lefthalf);
-			kunmap_atomic(dst);
 		}
 	}
-	kunmap_atomic(src);
+	kunmap_local(src);
 	return 0;
 }
 
 static struct z_erofs_decompressor decompressors[] = {
 	[Z_EROFS_COMPRESSION_SHIFTED] = {
-		.decompress = z_erofs_shifted_transform,
+		.decompress = z_erofs_transform_plain,
 		.name = "shifted"
 	},
+	[Z_EROFS_COMPRESSION_INTERLACED] = {
+		.decompress = z_erofs_transform_plain,
+		.name = "interlaced"
+	},
 	[Z_EROFS_COMPRESSION_LZ4] = {
 		.decompress = z_erofs_lz4_decompress,
 		.name = "lz4"
diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index 2b48373f690b..5c1de6d7ad71 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -295,11 +295,13 @@ struct z_erofs_lzma_cfgs {
  * bit 1 : HEAD1 big pcluster (0 - off; 1 - on)
  * bit 2 : HEAD2 big pcluster (0 - off; 1 - on)
  * bit 3 : tailpacking inline pcluster (0 - off; 1 - on)
+ * bit 4 : interlaced plain pcluster (0 - off; 1 - on)
  */
 #define Z_EROFS_ADVISE_COMPACTED_2B		0x0001
 #define Z_EROFS_ADVISE_BIG_PCLUSTER_1		0x0002
 #define Z_EROFS_ADVISE_BIG_PCLUSTER_2		0x0004
 #define Z_EROFS_ADVISE_INLINE_PCLUSTER		0x0008
+#define Z_EROFS_ADVISE_INTERLACED_PCLUSTER	0x0010
 
 struct z_erofs_map_header {
 	__le16	h_reserved1;
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index cfee49d33b95..f3ed36445d73 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -436,6 +436,7 @@ struct erofs_map_blocks {
 
 enum {
 	Z_EROFS_COMPRESSION_SHIFTED = Z_EROFS_COMPRESSION_MAX,
+	Z_EROFS_COMPRESSION_INTERLACED,
 	Z_EROFS_COMPRESSION_RUNTIME_MAX
 };
 
diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c
index d58549ca1df9..7196235a441c 100644
--- a/fs/erofs/zmap.c
+++ b/fs/erofs/zmap.c
@@ -679,12 +679,18 @@ static int z_erofs_do_map_blocks(struct inode *inode,
 			goto out;
 	}
 
-	if (m.headtype == Z_EROFS_VLE_CLUSTER_TYPE_PLAIN)
-		map->m_algorithmformat = Z_EROFS_COMPRESSION_SHIFTED;
-	else if (m.headtype == Z_EROFS_VLE_CLUSTER_TYPE_HEAD2)
+	if (m.headtype == Z_EROFS_VLE_CLUSTER_TYPE_PLAIN) {
+		if (vi->z_advise & Z_EROFS_ADVISE_INTERLACED_PCLUSTER)
+			map->m_algorithmformat =
+				Z_EROFS_COMPRESSION_INTERLACED;
+		else
+			map->m_algorithmformat =
+				Z_EROFS_COMPRESSION_SHIFTED;
+	} else if (m.headtype == Z_EROFS_VLE_CLUSTER_TYPE_HEAD2) {
 		map->m_algorithmformat = vi->z_algorithmtype[1];
-	else
+	} else {
 		map->m_algorithmformat = vi->z_algorithmtype[0];
+	}
 
 	if ((flags & EROFS_GET_BLOCKS_FIEMAP) ||
 	    ((flags & EROFS_GET_BLOCKS_READMORE) &&
-- 
2.17.1



* [RFC PATCH v4 2/2] erofs: support on-disk compressed fragments data
       [not found] ` <cover.1663066966.git.huyue2@coolpad.com>
@ 2022-09-13 11:05     ` Yue Hu
  2022-09-13 11:05     ` Yue Hu
  1 sibling, 0 replies; 12+ messages in thread
From: Yue Hu @ 2022-09-13 11:05 UTC (permalink / raw)
  To: xiang, chao
  Cc: linux-erofs, linux-kernel, zhangwen, shaojunjun, zbestahu, Yue Hu

From: Yue Hu <huyue2@coolpad.com>

Introduce the on-disk compressed fragments data feature.

This approach adds a new field called `h_fragmentoff' to the per-file
compression header to indicate the fragment offset of each tail pcluster
or of the whole file in the special packed inode.

Similar to ztailpacking, it also finds and records the 'headlcn' of the
tail pcluster when initializing the per-inode zmap, to make follow-on
requests easier.

Signed-off-by: Yue Hu <huyue2@coolpad.com>
---
 fs/erofs/erofs_fs.h | 29 +++++++++++++++++++++------
 fs/erofs/internal.h | 16 ++++++++++++---
 fs/erofs/super.c    | 15 ++++++++++++++
 fs/erofs/sysfs.c    |  2 ++
 fs/erofs/zdata.c    | 48 ++++++++++++++++++++++++++++++++++++++++++++-
 fs/erofs/zmap.c     | 48 +++++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 146 insertions(+), 12 deletions(-)
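
Note for readers (not part of the patch): the way the 8-byte map header
encodes the fragment offset can be sketched in user space. The struct,
enum and function names here (frag_info, decode_fragment_header,
widen_fragmentoff) are hypothetical; the byte positions assume the header
layout shown in the erofs_fs.h hunk below (4-byte union, 2-byte h_advise,
1-byte h_algorithmtype, 1-byte h_clusterbits), and the sketch skips all
other header fields.

```c
#include <stdint.h>

#define FRAGMENT_INODE_BIT	7	/* mirrors Z_EROFS_FRAGMENT_INODE_BIT */
#define ADVISE_FRAGMENT		0x0020	/* mirrors Z_EROFS_ADVISE_FRAGMENT_PCLUSTER */

enum frag_kind { FRAG_NONE, FRAG_TAIL, FRAG_WHOLE_FILE };

struct frag_info {
	enum frag_kind kind;
	uint64_t fragmentoff;	/* offset inside the packed inode */
};

/* Assemble a little-endian value from `n` raw bytes (n <= 8). */
static uint64_t get_le(const unsigned char *p, unsigned int n)
{
	uint64_t v = 0;

	while (n--)
		v = (v << 8) | p[n];
	return v;
}

/* hdr: the 8 raw on-disk bytes of the map header. */
static struct frag_info decode_fragment_header(const unsigned char hdr[8])
{
	struct frag_info fi = { FRAG_NONE, 0 };
	uint16_t advise = (uint16_t)get_le(hdr + 4, 2);

	if (hdr[7] >> FRAGMENT_INODE_BIT) {
		/* top bit of the 8-byte header: whole file is packed */
		fi.kind = FRAG_WHOLE_FILE;
		fi.fragmentoff = get_le(hdr, 8) ^ (1ULL << 63);
	} else if (advise & ADVISE_FRAGMENT) {
		/* only the tail pcluster is packed: low 32 bits are here */
		fi.kind = FRAG_TAIL;
		fi.fragmentoff = get_le(hdr, 4);
	}
	return fi;
}

/* Legacy (non-compact) indexes: the tail index supplies the high 32 bits. */
static uint64_t widen_fragmentoff(uint64_t lo, uint32_t pblk)
{
	return lo | ((uint64_t)pblk << 32);
}
```

For a tail fragment with legacy indexes, the value returned by
decode_fragment_header() would be combined with the block address field
of the tail index via widen_fragmentoff(), matching the
`vi->z_fragmentoff |= (u64)m.pblk << 32' line in the zmap.c hunk.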

diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index 5c1de6d7ad71..aa976757328b 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -25,6 +25,7 @@
 #define EROFS_FEATURE_INCOMPAT_DEVICE_TABLE	0x00000008
 #define EROFS_FEATURE_INCOMPAT_COMPR_HEAD2	0x00000008
 #define EROFS_FEATURE_INCOMPAT_ZTAILPACKING	0x00000010
+#define EROFS_FEATURE_INCOMPAT_FRAGMENTS	0x00000020
 #define EROFS_ALL_FEATURE_INCOMPAT		\
 	(EROFS_FEATURE_INCOMPAT_ZERO_PADDING | \
 	 EROFS_FEATURE_INCOMPAT_COMPR_CFGS | \
@@ -32,7 +33,8 @@
 	 EROFS_FEATURE_INCOMPAT_CHUNKED_FILE | \
 	 EROFS_FEATURE_INCOMPAT_DEVICE_TABLE | \
 	 EROFS_FEATURE_INCOMPAT_COMPR_HEAD2 | \
-	 EROFS_FEATURE_INCOMPAT_ZTAILPACKING)
+	 EROFS_FEATURE_INCOMPAT_ZTAILPACKING | \
+	 EROFS_FEATURE_INCOMPAT_FRAGMENTS)
 
 #define EROFS_SB_EXTSLOT_SIZE	16
 
@@ -71,7 +73,9 @@ struct erofs_super_block {
 	} __packed u1;
 	__le16 extra_devices;	/* # of devices besides the primary device */
 	__le16 devt_slotoff;	/* startoff = devt_slotoff * devt_slotsize */
-	__u8 reserved2[38];
+	__u8 reserved[6];
+	__le64 packed_nid;	/* nid of the special packed inode */
+	__u8 reserved2[24];
 };
 
 /*
@@ -296,17 +300,26 @@ struct z_erofs_lzma_cfgs {
  * bit 2 : HEAD2 big pcluster (0 - off; 1 - on)
  * bit 3 : tailpacking inline pcluster (0 - off; 1 - on)
  * bit 4 : interlaced plain pcluster (0 - off; 1 - on)
+ * bit 5 : fragment pcluster (0 - off; 1 - on)
  */
 #define Z_EROFS_ADVISE_COMPACTED_2B		0x0001
 #define Z_EROFS_ADVISE_BIG_PCLUSTER_1		0x0002
 #define Z_EROFS_ADVISE_BIG_PCLUSTER_2		0x0004
 #define Z_EROFS_ADVISE_INLINE_PCLUSTER		0x0008
 #define Z_EROFS_ADVISE_INTERLACED_PCLUSTER	0x0010
+#define Z_EROFS_ADVISE_FRAGMENT_PCLUSTER	0x0020
 
+#define Z_EROFS_FRAGMENT_INODE_BIT              7
 struct z_erofs_map_header {
-	__le16	h_reserved1;
-	/* indicates the encoded size of tailpacking data */
-	__le16  h_idata_size;
+	union {
+		/* fragment data offset in the packed inode */
+		__le32  h_fragmentoff;
+		struct {
+			__le16  h_reserved1;
+			/* indicates the encoded size of tailpacking data */
+			__le16  h_idata_size;
+		};
+	};
 	__le16	h_advise;
 	/*
 	 * bit 0-3 : algorithm type of head 1 (logical cluster type 01);
@@ -315,7 +328,8 @@ struct z_erofs_map_header {
 	__u8	h_algorithmtype;
 	/*
 	 * bit 0-2 : logical cluster bits - 12, e.g. 0 for 4096;
-	 * bit 3-7 : reserved.
+	 * bit 3-6 : reserved;
+	 * bit 7   : move the whole file into packed inode or not.
 	 */
 	__u8	h_clusterbits;
 };
@@ -421,6 +435,9 @@ static inline void erofs_check_ondisk_layout_definitions(void)
 
 	BUILD_BUG_ON(BIT(Z_EROFS_VLE_DI_CLUSTER_TYPE_BITS) <
 		     Z_EROFS_VLE_CLUSTER_TYPE_MAX - 1);
+	WARN_ON(*(__le64 *)&(struct z_erofs_map_header) {
+			.h_clusterbits = 1 << Z_EROFS_FRAGMENT_INODE_BIT
+		} != cpu_to_le64(1ULL << 63));
 }
 
 #endif
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index f3ed36445d73..b133664b4ad2 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -120,6 +120,7 @@ struct erofs_sb_info {
 	struct inode *managed_cache;
 
 	struct erofs_sb_lz4_info lz4;
+	struct inode *packed_inode;
 #endif	/* CONFIG_EROFS_FS_ZIP */
 	struct erofs_dev_context *devs;
 	struct dax_device *dax_dev;
@@ -306,6 +307,7 @@ EROFS_FEATURE_FUNCS(chunked_file, incompat, INCOMPAT_CHUNKED_FILE)
 EROFS_FEATURE_FUNCS(device_table, incompat, INCOMPAT_DEVICE_TABLE)
 EROFS_FEATURE_FUNCS(compr_head2, incompat, INCOMPAT_COMPR_HEAD2)
 EROFS_FEATURE_FUNCS(ztailpacking, incompat, INCOMPAT_ZTAILPACKING)
+EROFS_FEATURE_FUNCS(fragments, incompat, INCOMPAT_FRAGMENTS)
 EROFS_FEATURE_FUNCS(sb_chksum, compat, COMPAT_SB_CHKSUM)
 
 /* atomic flag definitions */
@@ -341,8 +343,13 @@ struct erofs_inode {
 			unsigned char  z_algorithmtype[2];
 			unsigned char  z_logical_clusterbits;
 			unsigned long  z_tailextent_headlcn;
-			erofs_off_t    z_idataoff;
-			unsigned short z_idata_size;
+			union {
+				struct {
+					erofs_off_t    z_idataoff;
+					unsigned short z_idata_size;
+				};
+				erofs_off_t z_fragmentoff;
+			};
 		};
 #endif	/* CONFIG_EROFS_FS_ZIP */
 	};
@@ -400,6 +407,7 @@ extern const struct address_space_operations z_erofs_aops;
 enum {
 	BH_Encoded = BH_PrivateStart,
 	BH_FullMapped,
+	BH_Fragment,
 };
 
 /* Has a disk mapping */
@@ -410,6 +418,8 @@ enum {
 #define EROFS_MAP_ENCODED	(1 << BH_Encoded)
 /* The length of extent is full */
 #define EROFS_MAP_FULL_MAPPED	(1 << BH_FullMapped)
+/* Located in the special packed inode */
+#define EROFS_MAP_FRAGMENT	(1 << BH_Fragment)
 
 struct erofs_map_blocks {
 	struct erofs_buf buf;
@@ -431,7 +441,7 @@ struct erofs_map_blocks {
 #define EROFS_GET_BLOCKS_FIEMAP	0x0002
 /* Used to map the whole extent if non-negligible data is requested for LZMA */
 #define EROFS_GET_BLOCKS_READMORE	0x0004
-/* Used to map tail extent for tailpacking inline pcluster */
+/* Used to map tail extent for tailpacking inline or fragment pcluster */
 #define EROFS_GET_BLOCKS_FINDTAIL	0x0008
 
 enum {
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 3173debeaa5a..8170c0d8ab92 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -381,6 +381,17 @@ static int erofs_read_superblock(struct super_block *sb)
 #endif
 	sbi->islotbits = ilog2(sizeof(struct erofs_inode_compact));
 	sbi->root_nid = le16_to_cpu(dsb->root_nid);
+#ifdef CONFIG_EROFS_FS_ZIP
+	sbi->packed_inode = NULL;
+	if (erofs_sb_has_fragments(sbi)) {
+		sbi->packed_inode =
+			erofs_iget(sb, le64_to_cpu(dsb->packed_nid), false);
+		if (IS_ERR(sbi->packed_inode)) {
+			ret = PTR_ERR(sbi->packed_inode);
+			goto out;
+		}
+	}
+#endif
 	sbi->inos = le64_to_cpu(dsb->inos);
 
 	sbi->build_time = le64_to_cpu(dsb->build_time);
@@ -411,6 +422,8 @@ static int erofs_read_superblock(struct super_block *sb)
 		erofs_info(sb, "EXPERIMENTAL compressed inline data feature in use. Use at your own risk!");
 	if (erofs_is_fscache_mode(sb))
 		erofs_info(sb, "EXPERIMENTAL fscache-based on-demand read feature in use. Use at your own risk!");
+	if (erofs_sb_has_fragments(sbi))
+		erofs_info(sb, "EXPERIMENTAL compressed fragments feature in use. Use at your own risk!");
 out:
 	erofs_put_metabuf(&buf);
 	return ret;
@@ -908,6 +921,8 @@ static void erofs_put_super(struct super_block *sb)
 #ifdef CONFIG_EROFS_FS_ZIP
 	iput(sbi->managed_cache);
 	sbi->managed_cache = NULL;
+	iput(sbi->packed_inode);
+	sbi->packed_inode = NULL;
 #endif
 	erofs_fscache_unregister_cookie(&sbi->s_fscache);
 }
diff --git a/fs/erofs/sysfs.c b/fs/erofs/sysfs.c
index c1383e508bbe..1b52395be82a 100644
--- a/fs/erofs/sysfs.c
+++ b/fs/erofs/sysfs.c
@@ -76,6 +76,7 @@ EROFS_ATTR_FEATURE(device_table);
 EROFS_ATTR_FEATURE(compr_head2);
 EROFS_ATTR_FEATURE(sb_chksum);
 EROFS_ATTR_FEATURE(ztailpacking);
+EROFS_ATTR_FEATURE(fragments);
 
 static struct attribute *erofs_feat_attrs[] = {
 	ATTR_LIST(zero_padding),
@@ -86,6 +87,7 @@ static struct attribute *erofs_feat_attrs[] = {
 	ATTR_LIST(compr_head2),
 	ATTR_LIST(sb_chksum),
 	ATTR_LIST(ztailpacking),
+	ATTR_LIST(fragments),
 	NULL,
 };
 ATTRIBUTE_GROUPS(erofs_feat);
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 5792ca9e0d5e..aa2a3cdeea57 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -650,6 +650,33 @@ static bool should_alloc_managed_pages(struct z_erofs_decompress_frontend *fe,
 		la < fe->headoffset;
 }
 
+static int z_erofs_read_fragment(struct inode *inode, erofs_off_t pos,
+				 struct page *page, unsigned int pageofs,
+				 unsigned int len)
+{
+	struct inode *packed_inode = EROFS_I_SB(inode)->packed_inode;
+	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
+	u8 *src, *dst;
+	unsigned int i, cnt;
+
+	pos += EROFS_I(inode)->z_fragmentoff;
+	for (i = 0; i < len; i += cnt) {
+		cnt = min_t(unsigned int, len - i,
+			    EROFS_BLKSIZ - erofs_blkoff(pos));
+		src = erofs_bread(&buf, packed_inode,
+				  erofs_blknr(pos), EROFS_KMAP);
+		if (IS_ERR(src))
+			return PTR_ERR(src);
+
+		dst = kmap_local_page(page);
+		memcpy(dst + pageofs + i, src + erofs_blkoff(pos), cnt);
+		kunmap_local(dst);
+		pos += cnt;
+	}
+	erofs_put_metabuf(&buf);
+	return 0;
+}
+
 static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
 				struct page *page, struct page **pagepool)
 {
@@ -688,7 +715,8 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
 		/* didn't get a valid pcluster previously (very rare) */
 	}
 
-	if (!(map->m_flags & EROFS_MAP_MAPPED))
+	if (!(map->m_flags & EROFS_MAP_MAPPED) ||
+	    map->m_flags & EROFS_MAP_FRAGMENT)
 		goto hitted;
 
 	err = z_erofs_collector_begin(fe);
@@ -735,6 +763,24 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
 		zero_user_segment(page, cur, end);
 		goto next_part;
 	}
+	if (map->m_flags & EROFS_MAP_FRAGMENT) {
+		unsigned int pageofs, skip, len;
+
+		if (offset > map->m_la) {
+			pageofs = 0;
+			skip = offset - map->m_la;
+		} else {
+			pageofs = map->m_la & ~PAGE_MASK;
+			skip = 0;
+		}
+		len = min_t(unsigned int, map->m_llen - skip, end - cur);
+		err = z_erofs_read_fragment(inode, skip, page, pageofs, len);
+		if (err)
+			goto out;
+		++spiltted;
+		tight = false;
+		goto next_part;
+	}
 
 	exclusive = (!cur && (!spiltted || tight));
 	if (cur)
diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c
index 7196235a441c..6830999529d7 100644
--- a/fs/erofs/zmap.c
+++ b/fs/erofs/zmap.c
@@ -69,6 +69,16 @@ static int z_erofs_fill_inode_lazy(struct inode *inode)
 	}
 
 	h = kaddr + erofs_blkoff(pos);
+	/*
+	 * if the highest bit of the 8-byte map header is set, the whole file
+	 * is stored in the packed inode. The rest bits keeps z_fragmentoff.
+	 */
+	if (h->h_clusterbits >> Z_EROFS_FRAGMENT_INODE_BIT) {
+		vi->z_advise = Z_EROFS_ADVISE_FRAGMENT_PCLUSTER;
+		vi->z_fragmentoff = le64_to_cpu(*(__le64 *)h) ^ (1ULL << 63);
+		vi->z_tailextent_headlcn = 0;
+		goto unmap_done;
+	}
 	vi->z_advise = le16_to_cpu(h->h_advise);
 	vi->z_algorithmtype[0] = h->h_algorithmtype & 15;
 	vi->z_algorithmtype[1] = h->h_algorithmtype >> 4;
@@ -123,6 +133,20 @@ static int z_erofs_fill_inode_lazy(struct inode *inode)
 		if (err < 0)
 			goto out_unlock;
 	}
+
+	if (vi->z_advise & Z_EROFS_ADVISE_FRAGMENT_PCLUSTER &&
+	    !(h->h_clusterbits >> Z_EROFS_FRAGMENT_INODE_BIT)) {
+		struct erofs_map_blocks map = {
+			.buf = __EROFS_BUF_INITIALIZER
+		};
+
+		vi->z_fragmentoff = le32_to_cpu(h->h_fragmentoff);
+		err = z_erofs_do_map_blocks(inode, &map,
+					    EROFS_GET_BLOCKS_FINDTAIL);
+		erofs_put_metabuf(&map.buf);
+		if (err < 0)
+			goto out_unlock;
+	}
 	/* paired with smp_mb() at the beginning of the function */
 	smp_mb();
 	set_bit(EROFS_I_Z_INITED_BIT, &vi->flags);
@@ -598,6 +622,7 @@ static int z_erofs_do_map_blocks(struct inode *inode,
 {
 	struct erofs_inode *const vi = EROFS_I(inode);
 	bool ztailpacking = vi->z_advise & Z_EROFS_ADVISE_INLINE_PCLUSTER;
+	bool fragment = vi->z_advise & Z_EROFS_ADVISE_FRAGMENT_PCLUSTER;
 	struct z_erofs_maprecorder m = {
 		.inode = inode,
 		.map = map,
@@ -666,12 +691,20 @@ static int z_erofs_do_map_blocks(struct inode *inode,
 
 	map->m_llen = end - map->m_la;
 
-	if (flags & EROFS_GET_BLOCKS_FINDTAIL)
+	if (flags & EROFS_GET_BLOCKS_FINDTAIL) {
 		vi->z_tailextent_headlcn = m.lcn;
+		/* for non-compact indexes, fragmentoff is 64 bits */
+		if (fragment &&
+		    vi->datalayout == EROFS_INODE_FLAT_COMPRESSION_LEGACY)
+			vi->z_fragmentoff |= (u64)m.pblk << 32;
+	}
 	if (ztailpacking && m.lcn == vi->z_tailextent_headlcn) {
 		map->m_flags |= EROFS_MAP_META;
 		map->m_pa = vi->z_idataoff;
 		map->m_plen = vi->z_idata_size;
+	} else if (fragment && m.lcn == vi->z_tailextent_headlcn) {
+		map->m_flags |= EROFS_MAP_FRAGMENT;
+		DBG_BUGON(!map->m_la);
 	} else {
 		map->m_pa = blknr_to_addr(m.pblk);
 		err = z_erofs_get_extent_compressedlen(&m, initial_lcn);
@@ -715,6 +748,7 @@ int z_erofs_map_blocks_iter(struct inode *inode,
 			    struct erofs_map_blocks *map,
 			    int flags)
 {
+	struct erofs_inode *const vi = EROFS_I(inode);
 	int err = 0;
 
 	trace_z_erofs_map_blocks_iter_enter(inode, map, flags);
@@ -731,6 +765,15 @@ int z_erofs_map_blocks_iter(struct inode *inode,
 	if (err)
 		goto out;
 
+	if ((vi->z_advise & Z_EROFS_ADVISE_FRAGMENT_PCLUSTER) &&
+	    !vi->z_tailextent_headlcn) {
+		map->m_la = 0;
+		map->m_llen = inode->i_size;
+		map->m_flags = EROFS_MAP_MAPPED | EROFS_MAP_FULL_MAPPED |
+				EROFS_MAP_FRAGMENT;
+		goto out;
+	}
+
 	err = z_erofs_do_map_blocks(inode, map, flags);
 out:
 	trace_z_erofs_map_blocks_iter_exit(inode, map, flags, err);
@@ -757,7 +800,8 @@ static int z_erofs_iomap_begin_report(struct inode *inode, loff_t offset,
 	iomap->length = map.m_llen;
 	if (map.m_flags & EROFS_MAP_MAPPED) {
 		iomap->type = IOMAP_MAPPED;
-		iomap->addr = map.m_pa;
+		iomap->addr = map.m_flags & EROFS_MAP_FRAGMENT ?
+			      IOMAP_NULL_ADDR : map.m_pa;
 	} else {
 		iomap->type = IOMAP_HOLE;
 		iomap->addr = IOMAP_NULL_ADDR;
-- 
2.17.1



* [RFC PATCH v4 2/2] erofs: support on-disk compressed fragments data
@ 2022-09-13 11:05     ` Yue Hu
  0 siblings, 0 replies; 12+ messages in thread
From: Yue Hu @ 2022-09-13 11:05 UTC (permalink / raw)
  To: xiang, chao; +Cc: linux-kernel, zhangwen, Yue Hu, linux-erofs, shaojunjun

From: Yue Hu <huyue2@coolpad.com>

Introduce the on-disk compressed fragments data feature.

This approach adds a new field called `h_fragmentoff' to the per-file
compression header to indicate the fragment offset of each tail pcluster
or of the whole file in the special packed inode.

Similar to ztailpacking, it also finds and records the 'headlcn' of the
tail pcluster when initializing the per-inode zmap, to make follow-on
requests easier.

Signed-off-by: Yue Hu <huyue2@coolpad.com>
---
 fs/erofs/erofs_fs.h | 29 +++++++++++++++++++++------
 fs/erofs/internal.h | 16 ++++++++++++---
 fs/erofs/super.c    | 15 ++++++++++++++
 fs/erofs/sysfs.c    |  2 ++
 fs/erofs/zdata.c    | 48 ++++++++++++++++++++++++++++++++++++++++++++-
 fs/erofs/zmap.c     | 48 +++++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 146 insertions(+), 12 deletions(-)

diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index 5c1de6d7ad71..aa976757328b 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -25,6 +25,7 @@
 #define EROFS_FEATURE_INCOMPAT_DEVICE_TABLE	0x00000008
 #define EROFS_FEATURE_INCOMPAT_COMPR_HEAD2	0x00000008
 #define EROFS_FEATURE_INCOMPAT_ZTAILPACKING	0x00000010
+#define EROFS_FEATURE_INCOMPAT_FRAGMENTS	0x00000020
 #define EROFS_ALL_FEATURE_INCOMPAT		\
 	(EROFS_FEATURE_INCOMPAT_ZERO_PADDING | \
 	 EROFS_FEATURE_INCOMPAT_COMPR_CFGS | \
@@ -32,7 +33,8 @@
 	 EROFS_FEATURE_INCOMPAT_CHUNKED_FILE | \
 	 EROFS_FEATURE_INCOMPAT_DEVICE_TABLE | \
 	 EROFS_FEATURE_INCOMPAT_COMPR_HEAD2 | \
-	 EROFS_FEATURE_INCOMPAT_ZTAILPACKING)
+	 EROFS_FEATURE_INCOMPAT_ZTAILPACKING | \
+	 EROFS_FEATURE_INCOMPAT_FRAGMENTS)
 
 #define EROFS_SB_EXTSLOT_SIZE	16
 
@@ -71,7 +73,9 @@ struct erofs_super_block {
 	} __packed u1;
 	__le16 extra_devices;	/* # of devices besides the primary device */
 	__le16 devt_slotoff;	/* startoff = devt_slotoff * devt_slotsize */
-	__u8 reserved2[38];
+	__u8 reserved[6];
+	__le64 packed_nid;	/* nid of the special packed inode */
+	__u8 reserved2[24];
 };
 
 /*
@@ -296,17 +300,26 @@ struct z_erofs_lzma_cfgs {
  * bit 2 : HEAD2 big pcluster (0 - off; 1 - on)
  * bit 3 : tailpacking inline pcluster (0 - off; 1 - on)
  * bit 4 : interlaced plain pcluster (0 - off; 1 - on)
+ * bit 5 : fragment pcluster (0 - off; 1 - on)
  */
 #define Z_EROFS_ADVISE_COMPACTED_2B		0x0001
 #define Z_EROFS_ADVISE_BIG_PCLUSTER_1		0x0002
 #define Z_EROFS_ADVISE_BIG_PCLUSTER_2		0x0004
 #define Z_EROFS_ADVISE_INLINE_PCLUSTER		0x0008
 #define Z_EROFS_ADVISE_INTERLACED_PCLUSTER	0x0010
+#define Z_EROFS_ADVISE_FRAGMENT_PCLUSTER	0x0020
 
+#define Z_EROFS_FRAGMENT_INODE_BIT              7
 struct z_erofs_map_header {
-	__le16	h_reserved1;
-	/* indicates the encoded size of tailpacking data */
-	__le16  h_idata_size;
+	union {
+		/* fragment data offset in the packed inode */
+		__le32  h_fragmentoff;
+		struct {
+			__le16  h_reserved1;
+			/* indicates the encoded size of tailpacking data */
+			__le16  h_idata_size;
+		};
+	};
 	__le16	h_advise;
 	/*
 	 * bit 0-3 : algorithm type of head 1 (logical cluster type 01);
@@ -315,7 +328,8 @@ struct z_erofs_map_header {
 	__u8	h_algorithmtype;
 	/*
 	 * bit 0-2 : logical cluster bits - 12, e.g. 0 for 4096;
-	 * bit 3-7 : reserved.
+	 * bit 3-6 : reserved;
+	 * bit 7   : move the whole file into packed inode or not.
 	 */
 	__u8	h_clusterbits;
 };
@@ -421,6 +435,9 @@ static inline void erofs_check_ondisk_layout_definitions(void)
 
 	BUILD_BUG_ON(BIT(Z_EROFS_VLE_DI_CLUSTER_TYPE_BITS) <
 		     Z_EROFS_VLE_CLUSTER_TYPE_MAX - 1);
+	WARN_ON(*(__le64 *)&(struct z_erofs_map_header) {
+			.h_clusterbits = 1 << Z_EROFS_FRAGMENT_INODE_BIT
+		} != cpu_to_le64(1ULL << 63));
 }
 
 #endif
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index f3ed36445d73..b133664b4ad2 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -120,6 +120,7 @@ struct erofs_sb_info {
 	struct inode *managed_cache;
 
 	struct erofs_sb_lz4_info lz4;
+	struct inode *packed_inode;
 #endif	/* CONFIG_EROFS_FS_ZIP */
 	struct erofs_dev_context *devs;
 	struct dax_device *dax_dev;
@@ -306,6 +307,7 @@ EROFS_FEATURE_FUNCS(chunked_file, incompat, INCOMPAT_CHUNKED_FILE)
 EROFS_FEATURE_FUNCS(device_table, incompat, INCOMPAT_DEVICE_TABLE)
 EROFS_FEATURE_FUNCS(compr_head2, incompat, INCOMPAT_COMPR_HEAD2)
 EROFS_FEATURE_FUNCS(ztailpacking, incompat, INCOMPAT_ZTAILPACKING)
+EROFS_FEATURE_FUNCS(fragments, incompat, INCOMPAT_FRAGMENTS)
 EROFS_FEATURE_FUNCS(sb_chksum, compat, COMPAT_SB_CHKSUM)
 
 /* atomic flag definitions */
@@ -341,8 +343,13 @@ struct erofs_inode {
 			unsigned char  z_algorithmtype[2];
 			unsigned char  z_logical_clusterbits;
 			unsigned long  z_tailextent_headlcn;
-			erofs_off_t    z_idataoff;
-			unsigned short z_idata_size;
+			union {
+				struct {
+					erofs_off_t    z_idataoff;
+					unsigned short z_idata_size;
+				};
+				erofs_off_t z_fragmentoff;
+			};
 		};
 #endif	/* CONFIG_EROFS_FS_ZIP */
 	};
@@ -400,6 +407,7 @@ extern const struct address_space_operations z_erofs_aops;
 enum {
 	BH_Encoded = BH_PrivateStart,
 	BH_FullMapped,
+	BH_Fragment,
 };
 
 /* Has a disk mapping */
@@ -410,6 +418,8 @@ enum {
 #define EROFS_MAP_ENCODED	(1 << BH_Encoded)
 /* The length of extent is full */
 #define EROFS_MAP_FULL_MAPPED	(1 << BH_FullMapped)
+/* Located in the special packed inode */
+#define EROFS_MAP_FRAGMENT	(1 << BH_Fragment)
 
 struct erofs_map_blocks {
 	struct erofs_buf buf;
@@ -431,7 +441,7 @@ struct erofs_map_blocks {
 #define EROFS_GET_BLOCKS_FIEMAP	0x0002
 /* Used to map the whole extent if non-negligible data is requested for LZMA */
 #define EROFS_GET_BLOCKS_READMORE	0x0004
-/* Used to map tail extent for tailpacking inline pcluster */
+/* Used to map tail extent for tailpacking inline or fragment pcluster */
 #define EROFS_GET_BLOCKS_FINDTAIL	0x0008
 
 enum {
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 3173debeaa5a..8170c0d8ab92 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -381,6 +381,17 @@ static int erofs_read_superblock(struct super_block *sb)
 #endif
 	sbi->islotbits = ilog2(sizeof(struct erofs_inode_compact));
 	sbi->root_nid = le16_to_cpu(dsb->root_nid);
+#ifdef CONFIG_EROFS_FS_ZIP
+	sbi->packed_inode = NULL;
+	if (erofs_sb_has_fragments(sbi)) {
+		sbi->packed_inode =
+			erofs_iget(sb, le64_to_cpu(dsb->packed_nid), false);
+		if (IS_ERR(sbi->packed_inode)) {
+			ret = PTR_ERR(sbi->packed_inode);
+			goto out;
+		}
+	}
+#endif
 	sbi->inos = le64_to_cpu(dsb->inos);
 
 	sbi->build_time = le64_to_cpu(dsb->build_time);
@@ -411,6 +422,8 @@ static int erofs_read_superblock(struct super_block *sb)
 		erofs_info(sb, "EXPERIMENTAL compressed inline data feature in use. Use at your own risk!");
 	if (erofs_is_fscache_mode(sb))
 		erofs_info(sb, "EXPERIMENTAL fscache-based on-demand read feature in use. Use at your own risk!");
+	if (erofs_sb_has_fragments(sbi))
+		erofs_info(sb, "EXPERIMENTAL compressed fragments feature in use. Use at your own risk!");
 out:
 	erofs_put_metabuf(&buf);
 	return ret;
@@ -908,6 +921,8 @@ static void erofs_put_super(struct super_block *sb)
 #ifdef CONFIG_EROFS_FS_ZIP
 	iput(sbi->managed_cache);
 	sbi->managed_cache = NULL;
+	iput(sbi->packed_inode);
+	sbi->packed_inode = NULL;
 #endif
 	erofs_fscache_unregister_cookie(&sbi->s_fscache);
 }
diff --git a/fs/erofs/sysfs.c b/fs/erofs/sysfs.c
index c1383e508bbe..1b52395be82a 100644
--- a/fs/erofs/sysfs.c
+++ b/fs/erofs/sysfs.c
@@ -76,6 +76,7 @@ EROFS_ATTR_FEATURE(device_table);
 EROFS_ATTR_FEATURE(compr_head2);
 EROFS_ATTR_FEATURE(sb_chksum);
 EROFS_ATTR_FEATURE(ztailpacking);
+EROFS_ATTR_FEATURE(fragments);
 
 static struct attribute *erofs_feat_attrs[] = {
 	ATTR_LIST(zero_padding),
@@ -86,6 +87,7 @@ static struct attribute *erofs_feat_attrs[] = {
 	ATTR_LIST(compr_head2),
 	ATTR_LIST(sb_chksum),
 	ATTR_LIST(ztailpacking),
+	ATTR_LIST(fragments),
 	NULL,
 };
 ATTRIBUTE_GROUPS(erofs_feat);
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 5792ca9e0d5e..aa2a3cdeea57 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -650,6 +650,33 @@ static bool should_alloc_managed_pages(struct z_erofs_decompress_frontend *fe,
 		la < fe->headoffset;
 }
 
+static int z_erofs_read_fragment(struct inode *inode, erofs_off_t pos,
+				 struct page *page, unsigned int pageofs,
+				 unsigned int len)
+{
+	struct inode *packed_inode = EROFS_I_SB(inode)->packed_inode;
+	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
+	u8 *src, *dst;
+	unsigned int i, cnt;
+
+	pos += EROFS_I(inode)->z_fragmentoff;
+	for (i = 0; i < len; i += cnt) {
+		cnt = min_t(unsigned int, len - i,
+			    EROFS_BLKSIZ - erofs_blkoff(pos));
+		src = erofs_bread(&buf, packed_inode,
+				  erofs_blknr(pos), EROFS_KMAP);
+		if (IS_ERR(src))
+			return PTR_ERR(src);
+
+		dst = kmap_local_page(page);
+		memcpy(dst + pageofs + i, src + erofs_blkoff(pos), cnt);
+		kunmap_local(dst);
+		pos += cnt;
+	}
+	erofs_put_metabuf(&buf);
+	return 0;
+}
+
 static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
 				struct page *page, struct page **pagepool)
 {
@@ -688,7 +715,8 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
 		/* didn't get a valid pcluster previously (very rare) */
 	}
 
-	if (!(map->m_flags & EROFS_MAP_MAPPED))
+	if (!(map->m_flags & EROFS_MAP_MAPPED) ||
+	    map->m_flags & EROFS_MAP_FRAGMENT)
 		goto hitted;
 
 	err = z_erofs_collector_begin(fe);
@@ -735,6 +763,24 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
 		zero_user_segment(page, cur, end);
 		goto next_part;
 	}
+	if (map->m_flags & EROFS_MAP_FRAGMENT) {
+		unsigned int pageofs, skip, len;
+
+		if (offset > map->m_la) {
+			pageofs = 0;
+			skip = offset - map->m_la;
+		} else {
+			pageofs = map->m_la & ~PAGE_MASK;
+			skip = 0;
+		}
+		len = min_t(unsigned int, map->m_llen - skip, end - cur);
+		err = z_erofs_read_fragment(inode, skip, page, pageofs, len);
+		if (err)
+			goto out;
+		++spiltted;
+		tight = false;
+		goto next_part;
+	}
 
 	exclusive = (!cur && (!spiltted || tight));
 	if (cur)
diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c
index 7196235a441c..6830999529d7 100644
--- a/fs/erofs/zmap.c
+++ b/fs/erofs/zmap.c
@@ -69,6 +69,16 @@ static int z_erofs_fill_inode_lazy(struct inode *inode)
 	}
 
 	h = kaddr + erofs_blkoff(pos);
+	/*
+	 * if the highest bit of the 8-byte map header is set, the whole file
+	 * is stored in the packed inode. The rest bits keeps z_fragmentoff.
+	 */
+	if (h->h_clusterbits >> Z_EROFS_FRAGMENT_INODE_BIT) {
+		vi->z_advise = Z_EROFS_ADVISE_FRAGMENT_PCLUSTER;
+		vi->z_fragmentoff = le64_to_cpu(*(__le64 *)h) ^ (1ULL << 63);
+		vi->z_tailextent_headlcn = 0;
+		goto unmap_done;
+	}
 	vi->z_advise = le16_to_cpu(h->h_advise);
 	vi->z_algorithmtype[0] = h->h_algorithmtype & 15;
 	vi->z_algorithmtype[1] = h->h_algorithmtype >> 4;
@@ -123,6 +133,20 @@ static int z_erofs_fill_inode_lazy(struct inode *inode)
 		if (err < 0)
 			goto out_unlock;
 	}
+
+	if (vi->z_advise & Z_EROFS_ADVISE_FRAGMENT_PCLUSTER &&
+	    !(h->h_clusterbits >> Z_EROFS_FRAGMENT_INODE_BIT)) {
+		struct erofs_map_blocks map = {
+			.buf = __EROFS_BUF_INITIALIZER
+		};
+
+		vi->z_fragmentoff = le32_to_cpu(h->h_fragmentoff);
+		err = z_erofs_do_map_blocks(inode, &map,
+					    EROFS_GET_BLOCKS_FINDTAIL);
+		erofs_put_metabuf(&map.buf);
+		if (err < 0)
+			goto out_unlock;
+	}
 	/* paired with smp_mb() at the beginning of the function */
 	smp_mb();
 	set_bit(EROFS_I_Z_INITED_BIT, &vi->flags);
@@ -598,6 +622,7 @@ static int z_erofs_do_map_blocks(struct inode *inode,
 {
 	struct erofs_inode *const vi = EROFS_I(inode);
 	bool ztailpacking = vi->z_advise & Z_EROFS_ADVISE_INLINE_PCLUSTER;
+	bool fragment = vi->z_advise & Z_EROFS_ADVISE_FRAGMENT_PCLUSTER;
 	struct z_erofs_maprecorder m = {
 		.inode = inode,
 		.map = map,
@@ -666,12 +691,20 @@ static int z_erofs_do_map_blocks(struct inode *inode,
 
 	map->m_llen = end - map->m_la;
 
-	if (flags & EROFS_GET_BLOCKS_FINDTAIL)
+	if (flags & EROFS_GET_BLOCKS_FINDTAIL) {
 		vi->z_tailextent_headlcn = m.lcn;
+		/* for non-compact indexes, fragmentoff is 64 bits */
+		if (fragment &&
+		    vi->datalayout == EROFS_INODE_FLAT_COMPRESSION_LEGACY)
+			vi->z_fragmentoff |= (u64)m.pblk << 32;
+	}
 	if (ztailpacking && m.lcn == vi->z_tailextent_headlcn) {
 		map->m_flags |= EROFS_MAP_META;
 		map->m_pa = vi->z_idataoff;
 		map->m_plen = vi->z_idata_size;
+	} else if (fragment && m.lcn == vi->z_tailextent_headlcn) {
+		map->m_flags |= EROFS_MAP_FRAGMENT;
+		DBG_BUGON(!map->m_la);
 	} else {
 		map->m_pa = blknr_to_addr(m.pblk);
 		err = z_erofs_get_extent_compressedlen(&m, initial_lcn);
@@ -715,6 +748,7 @@ int z_erofs_map_blocks_iter(struct inode *inode,
 			    struct erofs_map_blocks *map,
 			    int flags)
 {
+	struct erofs_inode *const vi = EROFS_I(inode);
 	int err = 0;
 
 	trace_z_erofs_map_blocks_iter_enter(inode, map, flags);
@@ -731,6 +765,15 @@ int z_erofs_map_blocks_iter(struct inode *inode,
 	if (err)
 		goto out;
 
+	if ((vi->z_advise & Z_EROFS_ADVISE_FRAGMENT_PCLUSTER) &&
+	    !vi->z_tailextent_headlcn) {
+		map->m_la = 0;
+		map->m_llen = inode->i_size;
+		map->m_flags = EROFS_MAP_MAPPED | EROFS_MAP_FULL_MAPPED |
+				EROFS_MAP_FRAGMENT;
+		goto out;
+	}
+
 	err = z_erofs_do_map_blocks(inode, map, flags);
 out:
 	trace_z_erofs_map_blocks_iter_exit(inode, map, flags, err);
@@ -757,7 +800,8 @@ static int z_erofs_iomap_begin_report(struct inode *inode, loff_t offset,
 	iomap->length = map.m_llen;
 	if (map.m_flags & EROFS_MAP_MAPPED) {
 		iomap->type = IOMAP_MAPPED;
-		iomap->addr = map.m_pa;
+		iomap->addr = map.m_flags & EROFS_MAP_FRAGMENT ?
+			      IOMAP_NULL_ADDR : map.m_pa;
 	} else {
 		iomap->type = IOMAP_HOLE;
 		iomap->addr = IOMAP_NULL_ADDR;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH v4 1/2] erofs: support interlaced uncompressed data for compressed files
  2022-09-13 11:05     ` Yue Hu
@ 2022-09-16  9:04       ` Gao Xiang
  -1 siblings, 0 replies; 12+ messages in thread
From: Gao Xiang @ 2022-09-16  9:04 UTC (permalink / raw)
  To: Yue Hu
  Cc: xiang, chao, linux-erofs, linux-kernel, zhangwen, shaojunjun, Yue Hu

On Tue, Sep 13, 2022 at 07:05:51PM +0800, Yue Hu wrote:
> From: Yue Hu <huyue2@coolpad.com>
> 
> Currently, uncompressed data is all handled in the shifted way, which
> means we have to shift the whole on-disk plain pcluster to get the
> logical data. However, since we are also using in-place I/O for
> uncompressed data, data copy will be reduced a lot if pcluster is
> recorded in the interlaced way as illustrated below:
>  _______________________________________________________________
> |               |    |               |_ tail part |_ head part _|
> |<-   blk0    ->| .. |<-   blkn-2  ->|<-         blkn-1       ->|
> 
> The logical data then becomes:
>  ________________________________________________________
> |_ head part _|_  blk0  _| .. |_  blkn-2  _|_ tail part _|
> 
> In addition, non-4k plain pclusters are also supported by the
> interlaced way, which can be used for non-4k lclusters as well.
> 
> However, it's almost impossible to de-duplicate uncompressed data
> in the interlaced way, therefore shifted uncompressed data is still
> useful.
> 
> Signed-off-by: Yue Hu <huyue2@coolpad.com>

This version looks good to me,
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Thanks,
Gao Xiang


>  fs/erofs/decompressor.c | 47 ++++++++++++++++++++++++-----------------
>  fs/erofs/erofs_fs.h     |  2 ++
>  fs/erofs/internal.h     |  1 +
>  fs/erofs/zmap.c         | 14 ++++++++----
>  4 files changed, 41 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
> index 2d55569f96ac..51b7ac7166d9 100644
> --- a/fs/erofs/decompressor.c
> +++ b/fs/erofs/decompressor.c
> @@ -317,52 +317,61 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq,
>  	return ret;
>  }
>  
> -static int z_erofs_shifted_transform(struct z_erofs_decompress_req *rq,
> -				     struct page **pagepool)
> +static int z_erofs_transform_plain(struct z_erofs_decompress_req *rq,
> +				   struct page **pagepool)
>  {
> -	const unsigned int nrpages_out =
> +	const unsigned int inpages = PAGE_ALIGN(rq->inputsize) >> PAGE_SHIFT;
> +	const unsigned int outpages =
>  		PAGE_ALIGN(rq->pageofs_out + rq->outputsize) >> PAGE_SHIFT;
>  	const unsigned int righthalf = min_t(unsigned int, rq->outputsize,
>  					     PAGE_SIZE - rq->pageofs_out);
>  	const unsigned int lefthalf = rq->outputsize - righthalf;
> +	const unsigned int interlaced_offset =
> +		rq->alg == Z_EROFS_COMPRESSION_SHIFTED ? 0 : rq->pageofs_out;
>  	unsigned char *src, *dst;
>  
> -	if (nrpages_out > 2) {
> +	if (outpages > 2 && rq->alg == Z_EROFS_COMPRESSION_SHIFTED) {
>  		DBG_BUGON(1);
> -		return -EIO;
> +		return -EFSCORRUPTED;
>  	}
>  
>  	if (rq->out[0] == *rq->in) {
> -		DBG_BUGON(nrpages_out != 1);
> +		DBG_BUGON(rq->pageofs_out);
>  		return 0;
>  	}
>  
> -	src = kmap_atomic(*rq->in) + rq->pageofs_in;
> +	src = kmap_local_page(rq->in[inpages - 1]) + rq->pageofs_in;
>  	if (rq->out[0]) {
> -		dst = kmap_atomic(rq->out[0]);
> -		memcpy(dst + rq->pageofs_out, src, righthalf);
> -		kunmap_atomic(dst);
> +		dst = kmap_local_page(rq->out[0]);
> +		memcpy(dst + rq->pageofs_out, src + interlaced_offset,
> +		       righthalf);
> +		kunmap_local(dst);
>  	}
>  
> -	if (nrpages_out == 2) {
> -		DBG_BUGON(!rq->out[1]);
> -		if (rq->out[1] == *rq->in) {
> +	if (outpages > inpages) {
> +		DBG_BUGON(!rq->out[outpages - 1]);
> +		if (rq->out[outpages - 1] != rq->in[inpages - 1]) {
> +			dst = kmap_local_page(rq->out[outpages - 1]);
> +			memcpy(dst, interlaced_offset ? src :
> +					(src + righthalf), lefthalf);
> +			kunmap_local(dst);
> +		} else if (!interlaced_offset) {
>  			memmove(src, src + righthalf, lefthalf);
> -		} else {
> -			dst = kmap_atomic(rq->out[1]);
> -			memcpy(dst, src + righthalf, lefthalf);
> -			kunmap_atomic(dst);
>  		}
>  	}
> -	kunmap_atomic(src);
> +	kunmap_local(src);
>  	return 0;
>  }
>  
>  static struct z_erofs_decompressor decompressors[] = {
>  	[Z_EROFS_COMPRESSION_SHIFTED] = {
> -		.decompress = z_erofs_shifted_transform,
> +		.decompress = z_erofs_transform_plain,
>  		.name = "shifted"
>  	},
> +	[Z_EROFS_COMPRESSION_INTERLACED] = {
> +		.decompress = z_erofs_transform_plain,
> +		.name = "interlaced"
> +	},
>  	[Z_EROFS_COMPRESSION_LZ4] = {
>  		.decompress = z_erofs_lz4_decompress,
>  		.name = "lz4"
> diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
> index 2b48373f690b..5c1de6d7ad71 100644
> --- a/fs/erofs/erofs_fs.h
> +++ b/fs/erofs/erofs_fs.h
> @@ -295,11 +295,13 @@ struct z_erofs_lzma_cfgs {
>   * bit 1 : HEAD1 big pcluster (0 - off; 1 - on)
>   * bit 2 : HEAD2 big pcluster (0 - off; 1 - on)
>   * bit 3 : tailpacking inline pcluster (0 - off; 1 - on)
> + * bit 4 : interlaced plain pcluster (0 - off; 1 - on)
>   */
>  #define Z_EROFS_ADVISE_COMPACTED_2B		0x0001
>  #define Z_EROFS_ADVISE_BIG_PCLUSTER_1		0x0002
>  #define Z_EROFS_ADVISE_BIG_PCLUSTER_2		0x0004
>  #define Z_EROFS_ADVISE_INLINE_PCLUSTER		0x0008
> +#define Z_EROFS_ADVISE_INTERLACED_PCLUSTER	0x0010
>  
>  struct z_erofs_map_header {
>  	__le16	h_reserved1;
> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> index cfee49d33b95..f3ed36445d73 100644
> --- a/fs/erofs/internal.h
> +++ b/fs/erofs/internal.h
> @@ -436,6 +436,7 @@ struct erofs_map_blocks {
>  
>  enum {
>  	Z_EROFS_COMPRESSION_SHIFTED = Z_EROFS_COMPRESSION_MAX,
> +	Z_EROFS_COMPRESSION_INTERLACED,
>  	Z_EROFS_COMPRESSION_RUNTIME_MAX
>  };
>  
> diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c
> index d58549ca1df9..7196235a441c 100644
> --- a/fs/erofs/zmap.c
> +++ b/fs/erofs/zmap.c
> @@ -679,12 +679,18 @@ static int z_erofs_do_map_blocks(struct inode *inode,
>  			goto out;
>  	}
>  
> -	if (m.headtype == Z_EROFS_VLE_CLUSTER_TYPE_PLAIN)
> -		map->m_algorithmformat = Z_EROFS_COMPRESSION_SHIFTED;
> -	else if (m.headtype == Z_EROFS_VLE_CLUSTER_TYPE_HEAD2)
> +	if (m.headtype == Z_EROFS_VLE_CLUSTER_TYPE_PLAIN) {
> +		if (vi->z_advise & Z_EROFS_ADVISE_INTERLACED_PCLUSTER)
> +			map->m_algorithmformat =
> +				Z_EROFS_COMPRESSION_INTERLACED;
> +		else
> +			map->m_algorithmformat =
> +				Z_EROFS_COMPRESSION_SHIFTED;
> +	} else if (m.headtype == Z_EROFS_VLE_CLUSTER_TYPE_HEAD2) {
>  		map->m_algorithmformat = vi->z_algorithmtype[1];
> -	else
> +	} else {
>  		map->m_algorithmformat = vi->z_algorithmtype[0];
> +	}
>  
>  	if ((flags & EROFS_GET_BLOCKS_FIEMAP) ||
>  	    ((flags & EROFS_GET_BLOCKS_READMORE) &&
> -- 
> 2.17.1

^ permalink raw reply	[flat|nested] 12+ messages in thread
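
The difference between the shifted and interlaced layouts described in the commit message of patch 1/2 can be sketched for a single-block plain pcluster as follows. This is a simplified userspace illustration, not the kernel routine: `plain_to_logical()` is a hypothetical name, the input offset (pageofs_in) is assumed to be 0, and the output is written to one linear buffer instead of separate pages. The names righthalf and lefthalf follow z_erofs_transform_plain().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Rebuild the logical data of one plain block.
 *   shifted:    the block holds the logical bytes in order from offset 0;
 *   interlaced: the head part sits at offset pageofs_out, the tail part
 *               wraps around to offset 0 of the same block.
 */
static void plain_to_logical(const uint8_t *blk, size_t blksz,
			     size_t pageofs_out, size_t outsize,
			     bool interlaced, uint8_t *out)
{
	/* bytes that land in the first output page (the head part) */
	size_t righthalf = outsize < blksz - pageofs_out ?
			   outsize : blksz - pageofs_out;
	/* bytes that spill into the next output page (the tail part) */
	size_t lefthalf = outsize - righthalf;
	size_t head_off = interlaced ? pageofs_out : 0;
	size_t tail_off = interlaced ? 0 : righthalf;

	assert(pageofs_out < blksz && outsize <= blksz);
	memcpy(out, blk + head_off, righthalf);
	memcpy(out + righthalf, blk + tail_off, lefthalf);
}
```

With a toy 8-byte block and pageofs_out = 5, the interlaced block "defghabc" and the shifted block "abcdefgh" both yield the logical data "abcdefgh". In the interlaced case every byte already sits at the in-page offset it has in the logical data, which is why in-place I/O needs less copying there.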

* Re: [RFC PATCH v4 2/2] erofs: support on-disk compressed fragments data
  2022-09-13 11:05     ` Yue Hu
@ 2022-09-16  9:08       ` Gao Xiang
  -1 siblings, 0 replies; 12+ messages in thread
From: Gao Xiang @ 2022-09-16  9:08 UTC (permalink / raw)
  To: Yue Hu
  Cc: xiang, chao, linux-erofs, linux-kernel, zhangwen, shaojunjun, Yue Hu

On Tue, Sep 13, 2022 at 07:05:52PM +0800, Yue Hu wrote:
> From: Yue Hu <huyue2@coolpad.com>
> 
> Introduce on-disk compressed fragments data feature.
> 
> This approach adds a new field called `h_fragmentoff' in the per-file
> compression header to indicate the fragment offset of each tail pcluster
> or the whole file in the special packed inode.
> 
> Similar to ztailpacking, it will also find and record the 'headlcn'
> of the tail pcluster when initializing per-inode zmap to make
> follow-on requests easier.
> 
> Signed-off-by: Yue Hu <huyue2@coolpad.com>
> ---
>  fs/erofs/erofs_fs.h | 29 +++++++++++++++++++++------
>  fs/erofs/internal.h | 16 ++++++++++++---
>  fs/erofs/super.c    | 15 ++++++++++++++
>  fs/erofs/sysfs.c    |  2 ++
>  fs/erofs/zdata.c    | 48 ++++++++++++++++++++++++++++++++++++++++++++-
>  fs/erofs/zmap.c     | 48 +++++++++++++++++++++++++++++++++++++++++++--
>  6 files changed, 146 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
> index 5c1de6d7ad71..aa976757328b 100644
> --- a/fs/erofs/erofs_fs.h
> +++ b/fs/erofs/erofs_fs.h
> @@ -25,6 +25,7 @@
>  #define EROFS_FEATURE_INCOMPAT_DEVICE_TABLE	0x00000008
>  #define EROFS_FEATURE_INCOMPAT_COMPR_HEAD2	0x00000008
>  #define EROFS_FEATURE_INCOMPAT_ZTAILPACKING	0x00000010
> +#define EROFS_FEATURE_INCOMPAT_FRAGMENTS	0x00000020
>  #define EROFS_ALL_FEATURE_INCOMPAT		\
>  	(EROFS_FEATURE_INCOMPAT_ZERO_PADDING | \
>  	 EROFS_FEATURE_INCOMPAT_COMPR_CFGS | \
> @@ -32,7 +33,8 @@
>  	 EROFS_FEATURE_INCOMPAT_CHUNKED_FILE | \
>  	 EROFS_FEATURE_INCOMPAT_DEVICE_TABLE | \
>  	 EROFS_FEATURE_INCOMPAT_COMPR_HEAD2 | \
> -	 EROFS_FEATURE_INCOMPAT_ZTAILPACKING)
> +	 EROFS_FEATURE_INCOMPAT_ZTAILPACKING | \
> +	 EROFS_FEATURE_INCOMPAT_FRAGMENTS)
>  
>  #define EROFS_SB_EXTSLOT_SIZE	16
>  
> @@ -71,7 +73,9 @@ struct erofs_super_block {
>  	} __packed u1;
>  	__le16 extra_devices;	/* # of devices besides the primary device */
>  	__le16 devt_slotoff;	/* startoff = devt_slotoff * devt_slotsize */
> -	__u8 reserved2[38];
> +	__u8 reserved[6];
> +	__le64 packed_nid;	/* nid of the special packed inode */
> +	__u8 reserved2[24];
>  };
>  
>  /*
> @@ -296,17 +300,26 @@ struct z_erofs_lzma_cfgs {
>   * bit 2 : HEAD2 big pcluster (0 - off; 1 - on)
>   * bit 3 : tailpacking inline pcluster (0 - off; 1 - on)
>   * bit 4 : interlaced plain pcluster (0 - off; 1 - on)
> + * bit 5 : fragment pcluster (0 - off; 1 - on)
>   */
>  #define Z_EROFS_ADVISE_COMPACTED_2B		0x0001
>  #define Z_EROFS_ADVISE_BIG_PCLUSTER_1		0x0002
>  #define Z_EROFS_ADVISE_BIG_PCLUSTER_2		0x0004
>  #define Z_EROFS_ADVISE_INLINE_PCLUSTER		0x0008
>  #define Z_EROFS_ADVISE_INTERLACED_PCLUSTER	0x0010
> +#define Z_EROFS_ADVISE_FRAGMENT_PCLUSTER	0x0020
>  
> +#define Z_EROFS_FRAGMENT_INODE_BIT              7
>  struct z_erofs_map_header {
> -	__le16	h_reserved1;
> -	/* indicates the encoded size of tailpacking data */
> -	__le16  h_idata_size;
> +	union {
> +		/* fragment data offset in the packed inode */
> +		__le32  h_fragmentoff;
> +		struct {
> +			__le16  h_reserved1;
> +			/* indicates the encoded size of tailpacking data */
> +			__le16  h_idata_size;
> +		};
> +	};
>  	__le16	h_advise;
>  	/*
>  	 * bit 0-3 : algorithm type of head 1 (logical cluster type 01);
> @@ -315,7 +328,8 @@ struct z_erofs_map_header {
>  	__u8	h_algorithmtype;
>  	/*
>  	 * bit 0-2 : logical cluster bits - 12, e.g. 0 for 4096;
> -	 * bit 3-7 : reserved.
> +	 * bit 3-6 : reserved;
> +	 * bit 7   : move the whole file into packed inode or not.
>  	 */
>  	__u8	h_clusterbits;
>  };
> @@ -421,6 +435,9 @@ static inline void erofs_check_ondisk_layout_definitions(void)
>  
>  	BUILD_BUG_ON(BIT(Z_EROFS_VLE_DI_CLUSTER_TYPE_BITS) <
>  		     Z_EROFS_VLE_CLUSTER_TYPE_MAX - 1);
> +	WARN_ON(*(__le64 *)&(struct z_erofs_map_header) {
> +			.h_clusterbits = 1 << Z_EROFS_FRAGMENT_INODE_BIT
> +		} != cpu_to_le64(1ULL << 63));

why not BUILD_BUG_ON here?
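
One way to express the same layout guarantee at build time is sketched below. It is only an illustration in userspace C11, not what the patch or any later revision does: in ISO C the dereferenced compound literal in the quoted WARN_ON is not an integer constant expression, so whether a build-time assertion can evaluate it depends on the compiler folding it. The sketch checks the struct size and the offset of h_clusterbits instead. `struct map_header` and `whole_file_flag()` are hypothetical stand-ins, with uint32_t/uint16_t replacing __le32/__le16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAGMENT_INODE_BIT 7

/* Userspace mirror of the patched 8-byte map header. */
struct map_header {
	union {
		uint32_t h_fragmentoff;
		struct {
			uint16_t h_reserved1;
			uint16_t h_idata_size;
		};
	};
	uint16_t h_advise;
	uint8_t h_algorithmtype;
	uint8_t h_clusterbits;
};

/* Bit 7 of the last byte is bit 63 of the header read as one le64. */
static_assert(sizeof(struct map_header) == 8, "header must be 8 bytes");
static_assert(offsetof(struct map_header, h_clusterbits) == 7,
	      "h_clusterbits must be the last byte");
static_assert(FRAGMENT_INODE_BIT == 7, "flag must be the top bit");

/* Runtime cross-check: read the header bytes as a little-endian u64. */
static uint64_t whole_file_flag(void)
{
	struct map_header h = { .h_clusterbits = 1u << FRAGMENT_INODE_BIT };
	const uint8_t *p = (const uint8_t *)&h;
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}
```

The three static assertions together imply the property the WARN_ON tests, and whole_file_flag() returns 1ULL << 63 regardless of host endianness because it assembles the value byte by byte.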

>  }
>  
>  #endif
> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> index f3ed36445d73..b133664b4ad2 100644
> --- a/fs/erofs/internal.h
> +++ b/fs/erofs/internal.h
> @@ -120,6 +120,7 @@ struct erofs_sb_info {
>  	struct inode *managed_cache;
>  
>  	struct erofs_sb_lz4_info lz4;
> +	struct inode *packed_inode;
>  #endif	/* CONFIG_EROFS_FS_ZIP */
>  	struct erofs_dev_context *devs;
>  	struct dax_device *dax_dev;
> @@ -306,6 +307,7 @@ EROFS_FEATURE_FUNCS(chunked_file, incompat, INCOMPAT_CHUNKED_FILE)
>  EROFS_FEATURE_FUNCS(device_table, incompat, INCOMPAT_DEVICE_TABLE)
>  EROFS_FEATURE_FUNCS(compr_head2, incompat, INCOMPAT_COMPR_HEAD2)
>  EROFS_FEATURE_FUNCS(ztailpacking, incompat, INCOMPAT_ZTAILPACKING)
> +EROFS_FEATURE_FUNCS(fragments, incompat, INCOMPAT_FRAGMENTS)
>  EROFS_FEATURE_FUNCS(sb_chksum, compat, COMPAT_SB_CHKSUM)
>  
>  /* atomic flag definitions */
> @@ -341,8 +343,13 @@ struct erofs_inode {
>  			unsigned char  z_algorithmtype[2];
>  			unsigned char  z_logical_clusterbits;
>  			unsigned long  z_tailextent_headlcn;
> -			erofs_off_t    z_idataoff;
> -			unsigned short z_idata_size;
> +			union {
> +				struct {
> +					erofs_off_t    z_idataoff;
> +					unsigned short z_idata_size;
> +				};
> +				erofs_off_t z_fragmentoff;
> +			};
>  		};
>  #endif	/* CONFIG_EROFS_FS_ZIP */
>  	};
> @@ -400,6 +407,7 @@ extern const struct address_space_operations z_erofs_aops;
>  enum {
>  	BH_Encoded = BH_PrivateStart,
>  	BH_FullMapped,
> +	BH_Fragment,
>  };
>  
>  /* Has a disk mapping */
> @@ -410,6 +418,8 @@ enum {
>  #define EROFS_MAP_ENCODED	(1 << BH_Encoded)
>  /* The length of extent is full */
>  #define EROFS_MAP_FULL_MAPPED	(1 << BH_FullMapped)
> +/* Located in the special packed inode */
> +#define EROFS_MAP_FRAGMENT	(1 << BH_Fragment)
>  
>  struct erofs_map_blocks {
>  	struct erofs_buf buf;
> @@ -431,7 +441,7 @@ struct erofs_map_blocks {
>  #define EROFS_GET_BLOCKS_FIEMAP	0x0002
>  /* Used to map the whole extent if non-negligible data is requested for LZMA */
>  #define EROFS_GET_BLOCKS_READMORE	0x0004
> -/* Used to map tail extent for tailpacking inline pcluster */
> +/* Used to map tail extent for tailpacking inline or fragment pcluster */
>  #define EROFS_GET_BLOCKS_FINDTAIL	0x0008
>  
>  enum {
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 3173debeaa5a..8170c0d8ab92 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -381,6 +381,17 @@ static int erofs_read_superblock(struct super_block *sb)
>  #endif
>  	sbi->islotbits = ilog2(sizeof(struct erofs_inode_compact));
>  	sbi->root_nid = le16_to_cpu(dsb->root_nid);
> +#ifdef CONFIG_EROFS_FS_ZIP
> +	sbi->packed_inode = NULL;
> +	if (erofs_sb_has_fragments(sbi)) {
> +		sbi->packed_inode =
> +			erofs_iget(sb, le64_to_cpu(dsb->packed_nid), false);
> +		if (IS_ERR(sbi->packed_inode)) {
> +			ret = PTR_ERR(sbi->packed_inode);
> +			goto out;
> +		}
> +	}
> +#endif
>  	sbi->inos = le64_to_cpu(dsb->inos);
>  
>  	sbi->build_time = le64_to_cpu(dsb->build_time);
> @@ -411,6 +422,8 @@ static int erofs_read_superblock(struct super_block *sb)
>  		erofs_info(sb, "EXPERIMENTAL compressed inline data feature in use. Use at your own risk!");
>  	if (erofs_is_fscache_mode(sb))
>  		erofs_info(sb, "EXPERIMENTAL fscache-based on-demand read feature in use. Use at your own risk!");
> +	if (erofs_sb_has_fragments(sbi))
> +		erofs_info(sb, "EXPERIMENTAL compressed fragments feature in use. Use at your own risk!");
>  out:
>  	erofs_put_metabuf(&buf);
>  	return ret;
> @@ -908,6 +921,8 @@ static void erofs_put_super(struct super_block *sb)
>  #ifdef CONFIG_EROFS_FS_ZIP
>  	iput(sbi->managed_cache);
>  	sbi->managed_cache = NULL;
> +	iput(sbi->packed_inode);
> +	sbi->packed_inode = NULL;
>  #endif
>  	erofs_fscache_unregister_cookie(&sbi->s_fscache);
>  }
> diff --git a/fs/erofs/sysfs.c b/fs/erofs/sysfs.c
> index c1383e508bbe..1b52395be82a 100644
> --- a/fs/erofs/sysfs.c
> +++ b/fs/erofs/sysfs.c
> @@ -76,6 +76,7 @@ EROFS_ATTR_FEATURE(device_table);
>  EROFS_ATTR_FEATURE(compr_head2);
>  EROFS_ATTR_FEATURE(sb_chksum);
>  EROFS_ATTR_FEATURE(ztailpacking);
> +EROFS_ATTR_FEATURE(fragments);
>  
>  static struct attribute *erofs_feat_attrs[] = {
>  	ATTR_LIST(zero_padding),
> @@ -86,6 +87,7 @@ static struct attribute *erofs_feat_attrs[] = {
>  	ATTR_LIST(compr_head2),
>  	ATTR_LIST(sb_chksum),
>  	ATTR_LIST(ztailpacking),
> +	ATTR_LIST(fragments),
>  	NULL,
>  };
>  ATTRIBUTE_GROUPS(erofs_feat);
> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
> index 5792ca9e0d5e..aa2a3cdeea57 100644
> --- a/fs/erofs/zdata.c
> +++ b/fs/erofs/zdata.c
> @@ -650,6 +650,33 @@ static bool should_alloc_managed_pages(struct z_erofs_decompress_frontend *fe,
>  		la < fe->headoffset;
>  }
>  
> +static int z_erofs_read_fragment(struct inode *inode, erofs_off_t pos,
> +				 struct page *page, unsigned int pageofs,
> +				 unsigned int len)
> +{
> +	struct inode *packed_inode = EROFS_I_SB(inode)->packed_inode;
> +	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
> +	u8 *src, *dst;
> +	unsigned int i, cnt;
> +
> +	pos += EROFS_I(inode)->z_fragmentoff;
> +	for (i = 0; i < len; i += cnt) {
> +		cnt = min_t(unsigned int, len - i,
> +			    EROFS_BLKSIZ - erofs_blkoff(pos));
> +		src = erofs_bread(&buf, packed_inode,
> +				  erofs_blknr(pos), EROFS_KMAP);
> +		if (IS_ERR(src))

			^
			need to erofs_put_metabuf() here anyway?

> +			return PTR_ERR(src);
> +
> +		dst = kmap_local_page(page);
> +		memcpy(dst + pageofs + i, src + erofs_blkoff(pos), cnt);
> +		kunmap_local(dst);
> +		pos += cnt;
> +	}
> +	erofs_put_metabuf(&buf);
> +	return 0;
> +}

Thanks,
Gao Xiang

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH v4 2/2] erofs: support on-disk compressed fragments data
  2022-09-16  9:08       ` Gao Xiang
@ 2022-09-17  3:56         ` Yue Hu
  -1 siblings, 0 replies; 12+ messages in thread
From: Yue Hu @ 2022-09-17  3:56 UTC (permalink / raw)
  To: Gao Xiang
  Cc: xiang, chao, linux-erofs, linux-kernel, zhangwen, shaojunjun, Yue Hu

On Fri, 16 Sep 2022 17:08:51 +0800
Gao Xiang <hsiangkao@linux.alibaba.com> wrote:

> On Tue, Sep 13, 2022 at 07:05:52PM +0800, Yue Hu wrote:
> > From: Yue Hu <huyue2@coolpad.com>
> > 
> > Introduce on-disk compressed fragments data feature.
> > 
> > This approach adds a new field called `h_fragmentoff' in the per-file
> > compression header to indicate the fragment offset of each tail pcluster
> > or the whole file in the special packed inode.
> > 
> > Similar to ztailpacking, it will also find and record the 'headlcn'
> > of the tail pcluster when initializing per-inode zmap to make
> > follow-on requests easier.
> > 
> > Signed-off-by: Yue Hu <huyue2@coolpad.com>
> > ---
> >  fs/erofs/erofs_fs.h | 29 +++++++++++++++++++++------
> >  fs/erofs/internal.h | 16 ++++++++++++---
> >  fs/erofs/super.c    | 15 ++++++++++++++
> >  fs/erofs/sysfs.c    |  2 ++
> >  fs/erofs/zdata.c    | 48 ++++++++++++++++++++++++++++++++++++++++++++-
> >  fs/erofs/zmap.c     | 48 +++++++++++++++++++++++++++++++++++++++++++--
> >  6 files changed, 146 insertions(+), 12 deletions(-)
> > 
> > diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
> > index 5c1de6d7ad71..aa976757328b 100644
> > --- a/fs/erofs/erofs_fs.h
> > +++ b/fs/erofs/erofs_fs.h
> > @@ -25,6 +25,7 @@
> >  #define EROFS_FEATURE_INCOMPAT_DEVICE_TABLE	0x00000008
> >  #define EROFS_FEATURE_INCOMPAT_COMPR_HEAD2	0x00000008
> >  #define EROFS_FEATURE_INCOMPAT_ZTAILPACKING	0x00000010
> > +#define EROFS_FEATURE_INCOMPAT_FRAGMENTS	0x00000020
> >  #define EROFS_ALL_FEATURE_INCOMPAT		\
> >  	(EROFS_FEATURE_INCOMPAT_ZERO_PADDING | \
> >  	 EROFS_FEATURE_INCOMPAT_COMPR_CFGS | \
> > @@ -32,7 +33,8 @@
> >  	 EROFS_FEATURE_INCOMPAT_CHUNKED_FILE | \
> >  	 EROFS_FEATURE_INCOMPAT_DEVICE_TABLE | \
> >  	 EROFS_FEATURE_INCOMPAT_COMPR_HEAD2 | \
> > -	 EROFS_FEATURE_INCOMPAT_ZTAILPACKING)
> > +	 EROFS_FEATURE_INCOMPAT_ZTAILPACKING | \
> > +	 EROFS_FEATURE_INCOMPAT_FRAGMENTS)
> >  
> >  #define EROFS_SB_EXTSLOT_SIZE	16
> >  
> > @@ -71,7 +73,9 @@ struct erofs_super_block {
> >  	} __packed u1;
> >  	__le16 extra_devices;	/* # of devices besides the primary device */
> >  	__le16 devt_slotoff;	/* startoff = devt_slotoff * devt_slotsize */
> > -	__u8 reserved2[38];
> > +	__u8 reserved[6];
> > +	__le64 packed_nid;	/* nid of the special packed inode */
> > +	__u8 reserved2[24];
> >  };
> >  
> >  /*
> > @@ -296,17 +300,26 @@ struct z_erofs_lzma_cfgs {
> >   * bit 2 : HEAD2 big pcluster (0 - off; 1 - on)
> >   * bit 3 : tailpacking inline pcluster (0 - off; 1 - on)
> >   * bit 4 : interlaced plain pcluster (0 - off; 1 - on)
> > + * bit 5 : fragment pcluster (0 - off; 1 - on)
> >   */
> >  #define Z_EROFS_ADVISE_COMPACTED_2B		0x0001
> >  #define Z_EROFS_ADVISE_BIG_PCLUSTER_1		0x0002
> >  #define Z_EROFS_ADVISE_BIG_PCLUSTER_2		0x0004
> >  #define Z_EROFS_ADVISE_INLINE_PCLUSTER		0x0008
> >  #define Z_EROFS_ADVISE_INTERLACED_PCLUSTER	0x0010
> > +#define Z_EROFS_ADVISE_FRAGMENT_PCLUSTER	0x0020
> >  
> > +#define Z_EROFS_FRAGMENT_INODE_BIT              7
> >  struct z_erofs_map_header {
> > -	__le16	h_reserved1;
> > -	/* indicates the encoded size of tailpacking data */
> > -	__le16  h_idata_size;
> > +	union {
> > +		/* fragment data offset in the packed inode */
> > +		__le32  h_fragmentoff;
> > +		struct {
> > +			__le16  h_reserved1;
> > +			/* indicates the encoded size of tailpacking data */
> > +			__le16  h_idata_size;
> > +		};
> > +	};
> >  	__le16	h_advise;
> >  	/*
> >  	 * bit 0-3 : algorithm type of head 1 (logical cluster type 01);
> > @@ -315,7 +328,8 @@ struct z_erofs_map_header {
> >  	__u8	h_algorithmtype;
> >  	/*
> >  	 * bit 0-2 : logical cluster bits - 12, e.g. 0 for 4096;
> > -	 * bit 3-7 : reserved.
> > +	 * bit 3-6 : reserved;
> > +	 * bit 7   : move the whole file into packed inode or not.
> >  	 */
> >  	__u8	h_clusterbits;
> >  };
> > @@ -421,6 +435,9 @@ static inline void erofs_check_ondisk_layout_definitions(void)
> >  
> >  	BUILD_BUG_ON(BIT(Z_EROFS_VLE_DI_CLUSTER_TYPE_BITS) <
> >  		     Z_EROFS_VLE_CLUSTER_TYPE_MAX - 1);
> > +	WARN_ON(*(__le64 *)&(struct z_erofs_map_header) {
> > +			.h_clusterbits = 1 << Z_EROFS_FRAGMENT_INODE_BIT
> > +		} != cpu_to_le64(1ULL << 63));  
> 
> why not BUILD_BUG_ON here?

BUILD_BUG_ON gives a compile error here; let me check further.
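
For reference, a minimal userspace sketch of one way to check the same
property at build time without dereferencing a compound literal. It is an
illustration only and not what this patch does: struct map_header,
FRAGMENT_INODE_BIT and header_as_le64() are hypothetical names that mirror
the layout quoted above, and C11 static_assert stands in for BUILD_BUG_ON.
The idea is that if h_clusterbits is the last byte of an 8-byte header, its
bit 7 is bit 63 of the header read as a little-endian 64-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAGMENT_INODE_BIT 7	/* mirrors Z_EROFS_FRAGMENT_INODE_BIT */

/* Hypothetical userspace mirror of struct z_erofs_map_header. */
struct map_header {
	union {
		uint32_t h_fragmentoff;
		struct {
			uint16_t h_reserved1;
			uint16_t h_idata_size;
		};
	};
	uint16_t h_advise;
	uint8_t h_algorithmtype;
	uint8_t h_clusterbits;
};

/* Layout checks that only use constant expressions. */
static_assert(sizeof(struct map_header) == 8,
	      "map header must be exactly 8 bytes");
static_assert(8 * offsetof(struct map_header, h_clusterbits) +
	      FRAGMENT_INODE_BIT == 63,
	      "fragment inode bit must be bit 63 of the header");

/* Runtime cross-check: read the header bytes as a little-endian u64. */
static uint64_t header_as_le64(const struct map_header *h)
{
	unsigned char bytes[sizeof(*h)];
	uint64_t v = 0;
	size_t i;

	memcpy(bytes, h, sizeof(bytes));
	for (i = 0; i < sizeof(bytes); i++)
		v |= (uint64_t)bytes[i] << (8 * i);
	return v;
}
```

Because both assertions depend only on sizeof and offsetof, they do not rely
on the compiler folding a type-punned load into a constant.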

> 
> >  }
> >  
> >  #endif
> > diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> > index f3ed36445d73..b133664b4ad2 100644
> > --- a/fs/erofs/internal.h
> > +++ b/fs/erofs/internal.h
> > @@ -120,6 +120,7 @@ struct erofs_sb_info {
> >  	struct inode *managed_cache;
> >  
> >  	struct erofs_sb_lz4_info lz4;
> > +	struct inode *packed_inode;
> >  #endif	/* CONFIG_EROFS_FS_ZIP */
> >  	struct erofs_dev_context *devs;
> >  	struct dax_device *dax_dev;
> > @@ -306,6 +307,7 @@ EROFS_FEATURE_FUNCS(chunked_file, incompat, INCOMPAT_CHUNKED_FILE)
> >  EROFS_FEATURE_FUNCS(device_table, incompat, INCOMPAT_DEVICE_TABLE)
> >  EROFS_FEATURE_FUNCS(compr_head2, incompat, INCOMPAT_COMPR_HEAD2)
> >  EROFS_FEATURE_FUNCS(ztailpacking, incompat, INCOMPAT_ZTAILPACKING)
> > +EROFS_FEATURE_FUNCS(fragments, incompat, INCOMPAT_FRAGMENTS)
> >  EROFS_FEATURE_FUNCS(sb_chksum, compat, COMPAT_SB_CHKSUM)
> >  
> >  /* atomic flag definitions */
> > @@ -341,8 +343,13 @@ struct erofs_inode {
> >  			unsigned char  z_algorithmtype[2];
> >  			unsigned char  z_logical_clusterbits;
> >  			unsigned long  z_tailextent_headlcn;
> > -			erofs_off_t    z_idataoff;
> > -			unsigned short z_idata_size;
> > +			union {
> > +				struct {
> > +					erofs_off_t    z_idataoff;
> > +					unsigned short z_idata_size;
> > +				};
> > +				erofs_off_t z_fragmentoff;
> > +			};
> >  		};
> >  #endif	/* CONFIG_EROFS_FS_ZIP */
> >  	};
> > @@ -400,6 +407,7 @@ extern const struct address_space_operations z_erofs_aops;
> >  enum {
> >  	BH_Encoded = BH_PrivateStart,
> >  	BH_FullMapped,
> > +	BH_Fragment,
> >  };
> >  
> >  /* Has a disk mapping */
> > @@ -410,6 +418,8 @@ enum {
> >  #define EROFS_MAP_ENCODED	(1 << BH_Encoded)
> >  /* The length of extent is full */
> >  #define EROFS_MAP_FULL_MAPPED	(1 << BH_FullMapped)
> > +/* Located in the special packed inode */
> > +#define EROFS_MAP_FRAGMENT	(1 << BH_Fragment)
> >  
> >  struct erofs_map_blocks {
> >  	struct erofs_buf buf;
> > @@ -431,7 +441,7 @@ struct erofs_map_blocks {
> >  #define EROFS_GET_BLOCKS_FIEMAP	0x0002
> >  /* Used to map the whole extent if non-negligible data is requested for LZMA */
> >  #define EROFS_GET_BLOCKS_READMORE	0x0004
> > -/* Used to map tail extent for tailpacking inline pcluster */
> > +/* Used to map tail extent for tailpacking inline or fragment pcluster */
> >  #define EROFS_GET_BLOCKS_FINDTAIL	0x0008
> >  
> >  enum {
> > diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> > index 3173debeaa5a..8170c0d8ab92 100644
> > --- a/fs/erofs/super.c
> > +++ b/fs/erofs/super.c
> > @@ -381,6 +381,17 @@ static int erofs_read_superblock(struct super_block *sb)
> >  #endif
> >  	sbi->islotbits = ilog2(sizeof(struct erofs_inode_compact));
> >  	sbi->root_nid = le16_to_cpu(dsb->root_nid);
> > +#ifdef CONFIG_EROFS_FS_ZIP
> > +	sbi->packed_inode = NULL;
> > +	if (erofs_sb_has_fragments(sbi)) {
> > +		sbi->packed_inode =
> > +			erofs_iget(sb, le64_to_cpu(dsb->packed_nid), false);
> > +		if (IS_ERR(sbi->packed_inode)) {
> > +			ret = PTR_ERR(sbi->packed_inode);
> > +			goto out;
> > +		}
> > +	}
> > +#endif
> >  	sbi->inos = le64_to_cpu(dsb->inos);
> >  
> >  	sbi->build_time = le64_to_cpu(dsb->build_time);
> > @@ -411,6 +422,8 @@ static int erofs_read_superblock(struct super_block *sb)
> >  		erofs_info(sb, "EXPERIMENTAL compressed inline data feature in use. Use at your own risk!");
> >  	if (erofs_is_fscache_mode(sb))
> >  		erofs_info(sb, "EXPERIMENTAL fscache-based on-demand read feature in use. Use at your own risk!");
> > +	if (erofs_sb_has_fragments(sbi))
> > +		erofs_info(sb, "EXPERIMENTAL compressed fragments feature in use. Use at your own risk!");
> >  out:
> >  	erofs_put_metabuf(&buf);
> >  	return ret;
> > @@ -908,6 +921,8 @@ static void erofs_put_super(struct super_block *sb)
> >  #ifdef CONFIG_EROFS_FS_ZIP
> >  	iput(sbi->managed_cache);
> >  	sbi->managed_cache = NULL;
> > +	iput(sbi->packed_inode);
> > +	sbi->packed_inode = NULL;
> >  #endif
> >  	erofs_fscache_unregister_cookie(&sbi->s_fscache);
> >  }
> > diff --git a/fs/erofs/sysfs.c b/fs/erofs/sysfs.c
> > index c1383e508bbe..1b52395be82a 100644
> > --- a/fs/erofs/sysfs.c
> > +++ b/fs/erofs/sysfs.c
> > @@ -76,6 +76,7 @@ EROFS_ATTR_FEATURE(device_table);
> >  EROFS_ATTR_FEATURE(compr_head2);
> >  EROFS_ATTR_FEATURE(sb_chksum);
> >  EROFS_ATTR_FEATURE(ztailpacking);
> > +EROFS_ATTR_FEATURE(fragments);
> >  
> >  static struct attribute *erofs_feat_attrs[] = {
> >  	ATTR_LIST(zero_padding),
> > @@ -86,6 +87,7 @@ static struct attribute *erofs_feat_attrs[] = {
> >  	ATTR_LIST(compr_head2),
> >  	ATTR_LIST(sb_chksum),
> >  	ATTR_LIST(ztailpacking),
> > +	ATTR_LIST(fragments),
> >  	NULL,
> >  };
> >  ATTRIBUTE_GROUPS(erofs_feat);
> > diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
> > index 5792ca9e0d5e..aa2a3cdeea57 100644
> > --- a/fs/erofs/zdata.c
> > +++ b/fs/erofs/zdata.c
> > @@ -650,6 +650,33 @@ static bool should_alloc_managed_pages(struct z_erofs_decompress_frontend *fe,
> >  		la < fe->headoffset;
> >  }
> >  
> > +static int z_erofs_read_fragment(struct inode *inode, erofs_off_t pos,
> > +				 struct page *page, unsigned int pageofs,
> > +				 unsigned int len)
> > +{
> > +	struct inode *packed_inode = EROFS_I_SB(inode)->packed_inode;
> > +	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
> > +	u8 *src, *dst;
> > +	unsigned int i, cnt;
> > +
> > +	pos += EROFS_I(inode)->z_fragmentoff;
> > +	for (i = 0; i < len; i += cnt) {
> > +		cnt = min_t(unsigned int, len - i,
> > +			    EROFS_BLKSIZ - erofs_blkoff(pos));
> > +		src = erofs_bread(&buf, packed_inode,
> > +				  erofs_blknr(pos), EROFS_KMAP);
> > +		if (IS_ERR(src))  
> 
> 			^
> 			need to erofs_put_metabuf() here anyway?

Right, it's needed.

Thanks.
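
A minimal userspace sketch of the shape such a fix could take: break out of
the loop on error and release the buffer on one common exit path. Everything
here is a hypothetical stand-in (mock_buf, mock_bread(), mock_put(),
mock_read_fragment(), MOCK_BLKSIZ), not the EROFS API. The mock deliberately
keeps the previous block pinned when a read fails, which may differ from the
real erofs_bread(); the point is only that a single release point is safe
either way.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MOCK_BLKSIZ 16u		/* tiny block size keeps the example small */

/* Hypothetical stand-in for struct erofs_buf. */
struct mock_buf {
	const unsigned char *blk;	/* currently pinned block, or NULL */
};

static int mock_pinned;		/* number of blocks currently pinned */

/* Release the pinned block, if any; safe to call on an empty buffer. */
static void mock_put(struct mock_buf *buf)
{
	if (buf->blk) {
		mock_pinned--;
		buf->blk = NULL;
	}
}

/*
 * Pin block blknr of image and return it. Returns NULL if blknr is out
 * of range, leaving any previously pinned block pinned.
 */
static const unsigned char *mock_bread(struct mock_buf *buf,
				       const unsigned char *image,
				       size_t nblocks, size_t blknr)
{
	if (blknr >= nblocks)
		return NULL;
	mock_put(buf);
	buf->blk = image + blknr * MOCK_BLKSIZ;
	mock_pinned++;
	return buf->blk;
}

/* Copy len bytes starting at pos; every exit goes through mock_put(). */
static int mock_read_fragment(const unsigned char *image, size_t nblocks,
			      size_t pos, unsigned char *dst, size_t len)
{
	struct mock_buf buf = { NULL };
	size_t i, cnt;
	int err = 0;

	for (i = 0; i < len; i += cnt) {
		const unsigned char *src;
		size_t blkoff = pos % MOCK_BLKSIZ;

		/* Never copy past the end of the current block. */
		cnt = len - i;
		if (cnt > MOCK_BLKSIZ - blkoff)
			cnt = MOCK_BLKSIZ - blkoff;
		src = mock_bread(&buf, image, nblocks, pos / MOCK_BLKSIZ);
		if (!src) {
			err = -EIO;
			break;	/* fall through to the common release */
		}
		memcpy(dst + i, src + blkoff, cnt);
		pos += cnt;
	}
	mock_put(&buf);
	return err;
}
```

With the early return replaced by a break, a failure after the first block
still drops the reference taken for the earlier block.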

> 
> > +			return PTR_ERR(src);
> > +
> > +		dst = kmap_local_page(page);
> > +		memcpy(dst + pageofs + i, src + erofs_blkoff(pos), cnt);
> > +		kunmap_local(dst);
> > +		pos += cnt;
> > +	}
> > +	erofs_put_metabuf(&buf);
> > +	return 0;
> > +}  
> 
> Thanks,
> Gao Xiang


^ permalink raw reply	[flat|nested] 12+ messages in thread

> >  	NULL,
> >  };
> >  ATTRIBUTE_GROUPS(erofs_feat);
> > diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
> > index 5792ca9e0d5e..aa2a3cdeea57 100644
> > --- a/fs/erofs/zdata.c
> > +++ b/fs/erofs/zdata.c
> > @@ -650,6 +650,33 @@ static bool should_alloc_managed_pages(struct z_erofs_decompress_frontend *fe,
> >  		la < fe->headoffset;
> >  }
> >  
> > +static int z_erofs_read_fragment(struct inode *inode, erofs_off_t pos,
> > +				 struct page *page, unsigned int pageofs,
> > +				 unsigned int len)
> > +{
> > +	struct inode *packed_inode = EROFS_I_SB(inode)->packed_inode;
> > +	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
> > +	u8 *src, *dst;
> > +	unsigned int i, cnt;
> > +
> > +	pos += EROFS_I(inode)->z_fragmentoff;
> > +	for (i = 0; i < len; i += cnt) {
> > +		cnt = min_t(unsigned int, len - i,
> > +			    EROFS_BLKSIZ - erofs_blkoff(pos));
> > +		src = erofs_bread(&buf, packed_inode,
> > +				  erofs_blknr(pos), EROFS_KMAP);
> > +		if (IS_ERR(src))  
> 
> 			^
> 			need to erofs_put_metabuf() here anyway?

Right, it is needed.

Thanks.

> 
> > +			return PTR_ERR(src);
> > +
> > +		dst = kmap_local_page(page);
> > +		memcpy(dst + pageofs + i, src + erofs_blkoff(pos), cnt);
> > +		kunmap_local(dst);
> > +		pos += cnt;
> > +	}
> > +	erofs_put_metabuf(&buf);
> > +	return 0;
> > +}  
> 
> Thanks,
> Gao Xiang
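
The error-path point raised above (release the buffer before returning from
the copy loop) can be illustrated with a small userspace sketch. This is not
the kernel code and does not model how erofs_bread() itself behaves on
failure; demo_buf, demo_bread, demo_put, demo_read_fragment, BLKSIZ, NBLKS,
packed and live_refs are all hypothetical stand-ins chosen for the example.
It only shows the pattern: copy in block-bounded chunks, and route both the
success and the failure exit through one release call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define BLKSIZ 16u	/* hypothetical block size, kept tiny for the demo */
#define NBLKS  4u	/* hypothetical number of blocks in the packed data */

/* Hypothetical stand-in for the packed inode's data. */
static unsigned char packed[NBLKS * BLKSIZ];

/* Hypothetical stand-in for struct erofs_buf: tracks one held reference. */
struct demo_buf { int held; };
static int live_refs;	/* number of buffers currently held */

static void demo_put(struct demo_buf *buf)
{
	if (buf->held) {
		buf->held = 0;
		live_refs--;
	}
}

/*
 * Return a pointer to block blknr, or NULL when blknr is out of range
 * (a simulated I/O error). In this sketch a failure leaves any reference
 * taken by an earlier call in place, so the caller must release it.
 */
static const unsigned char *demo_bread(struct demo_buf *buf, size_t blknr)
{
	if (blknr >= NBLKS)
		return NULL;
	if (!buf->held) {
		buf->held = 1;
		live_refs++;
	}
	return packed + blknr * BLKSIZ;
}

/* Copy len bytes starting at pos into dst, one block-bounded chunk at a time. */
static int demo_read_fragment(size_t pos, unsigned char *dst, size_t len)
{
	struct demo_buf buf = { 0 };
	size_t i, cnt;
	int err = 0;

	assert(dst != NULL);
	for (i = 0; i < len; i += cnt) {
		size_t blkoff = pos % BLKSIZ;
		const unsigned char *src;

		/* Never copy past the end of the current block. */
		cnt = len - i;
		if (cnt > BLKSIZ - blkoff)
			cnt = BLKSIZ - blkoff;

		src = demo_bread(&buf, pos / BLKSIZ);
		if (!src) {
			err = -EIO;
			break;	/* fall through to the shared release below */
		}
		memcpy(dst + i, src + blkoff, cnt);
		pos += cnt;
	}
	demo_put(&buf);	/* single release point for success and failure */
	return err;
}
```

With a single release point, adding another failure exit inside the loop
later cannot leak the buffer as long as it breaks out instead of returning.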



end of thread, other threads:[~2022-09-17  3:53 UTC | newest]

Thread overview: 12+ messages
2022-09-13 11:05 [RFC PATCH v4 0/2] erofs: support compressed fragments data Yue Hu
2022-09-13 11:05 ` Yue Hu
     [not found] ` <cover.1663066966.git.huyue2@coolpad.com>
2022-09-13 11:05   ` [RFC PATCH v4 1/2] erofs: support interlaced uncompressed data for compressed files Yue Hu
2022-09-13 11:05     ` Yue Hu
2022-09-16  9:04     ` Gao Xiang
2022-09-16  9:04       ` Gao Xiang
2022-09-13 11:05   ` [RFC PATCH v4 2/2] erofs: support on-disk compressed fragments data Yue Hu
2022-09-13 11:05     ` Yue Hu
2022-09-16  9:08     ` Gao Xiang
2022-09-16  9:08       ` Gao Xiang
2022-09-17  3:56       ` Yue Hu
2022-09-17  3:56         ` Yue Hu
