linux-erofs.lists.ozlabs.org archive mirror
From: gaoxiang25@huawei.com (Gao Xiang)
Subject: [PATCH v6 24/24] erofs: add document
Date: Fri, 2 Aug 2019 20:53:47 +0800
Message-ID: <20190802125347.166018-25-gaoxiang25@huawei.com>
In-Reply-To: <20190802125347.166018-1-gaoxiang25@huawei.com>

This documents key features, usage, and
on-disk design of erofs.

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 Documentation/filesystems/erofs.txt | 225 ++++++++++++++++++++++++++++
 1 file changed, 225 insertions(+)
 create mode 100644 Documentation/filesystems/erofs.txt

diff --git a/Documentation/filesystems/erofs.txt b/Documentation/filesystems/erofs.txt
new file mode 100644
index 000000000000..457e601e0467
--- /dev/null
+++ b/Documentation/filesystems/erofs.txt
@@ -0,0 +1,225 @@
+Overview
+========
+
+EROFS stands for Enhanced Read-Only File System. Unlike other read-only
+file systems, it is designed for flexibility and scalability while
+remaining simple and high-performance.
+
+It is designed as a better filesystem solution for the following scenarios:
+ - read-only storage media;
+
+ - part of a fully trusted read-only solution, which needs to be
+   immutable and bit-for-bit identical to the official golden image of
+   a release for security and other reasons;
+
+ - saving storage space with guaranteed end-to-end performance by
+   using reduced metadata and transparent file compression, especially
+   on embedded devices with limited memory (e.g. smartphones).
+
+Here are the main features of EROFS:
+ - Little endian on-disk design;
+
+ - Currently 4KB block size (nobh) and therefore maximum 16TB address space;
+
+ - Metadata & data can be mixed by design;
+
+ - 2 inode versions for different requirements:
+                          v1            v2
+   Inode metadata size:   32 bytes      64 bytes
+   Max file size:         4 GB          16 EB (also limited by max. vol size)
+   Max uids/gids:         65536         4294967296
+   File creation time:    no            yes (64 + 32-bit timestamp)
+   Max hardlinks:         65536         4294967296
+   Metadata reserved:     4 bytes       14 bytes
+
+ - Support extended attributes (xattrs) as an option;
+
+ - Support xattr inline and tail-end data inline for all files;
+
+ - Support POSIX.1e ACLs by using xattrs;
+
+ - Support statx();
+
+ - Support transparent file compression as an option:
+   LZ4 algorithm with 4 KB fixed-output compression for high performance;
+
+The following git tree provides the file system user-space tools under
+development (e.g. the formatting tool mkfs.erofs):
+>> git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git
+
+Bug reports and patches are welcome. Please send them to the linux-erofs
+mailing list:
+>> linux-erofs mailing list   <linux-erofs@lists.ozlabs.org>
+
+Note that EROFS is still a work in progress as a Linux staging driver, so
+Cc'ing the staging mailing list as well is highly recommended:
+>> Linux Driver Project Developer List <devel@driverdev.osuosl.org>
+
+Mount options
+=============
+
+fault_injection=%d     Enable fault injection in all supported types with the
+                       specified injection rate. Supported injection types:
+                       Type_Name                Type_Value
+                       FAULT_KMALLOC            0x000000001
+                       FAULT_READ_IO            0x000000002
+(no)user_xattr         Set up Extended User Attributes. Note: xattr is enabled
+                       by default if CONFIG_EROFS_FS_XATTR is selected.
+(no)acl                Set up POSIX Access Control List. Note: acl is enabled
+                       by default if CONFIG_EROFS_FS_POSIX_ACL is selected.
+cache_strategy=%s      Select a strategy for cached decompression from now on:
+                         disabled: In-place I/O decompression only;
+                        readahead: Cache the last incomplete compressed physical
+                                   cluster for further reading. It still does
+                                   in-place I/O decompression for the remaining
+                                   compressed physical clusters;
+                       readaround: Cache both ends of incomplete compressed
+                                   physical clusters for further reading. It
+                                   still does in-place I/O decompression for
+                                   the remaining compressed physical clusters.
+
+Module parameters
+=================
+use_vmap=[0|1]         Use vmap() instead of vm_map_ram() (default 0).
+
+On-disk details
+===============
+
+Summary
+-------
+Unlike other read-only file systems, an EROFS volume is designed
+to be as simple as possible:
+
+                                |-> aligned with the block size
+   ____________________________________________________________
+  | |SB| | ... | Metadata | ... | Data | Metadata | ... | Data |
+  |_|__|_|_____|__________|_____|______|__________|_____|______|
+  0 +1K
+
+All data areas should be aligned with the block size, but metadata areas
+need not be. All metadata can currently be observed in two spaces (views):
+ 1. Inode metadata space
+    Each valid inode should be aligned with an inode slot, whose size is
+    fixed (32 bytes) and designed to be kept in line with the v1 inode size.
+
+    Each inode can be directly found with the following formula:
+         inode offset = meta_blkaddr * block_size + 32 * nid
+
+                                |-> aligned with 8B
+                                           |-> followed closely
+    + meta_blkaddr blocks                                      |-> another slot
+     _____________________________________________________________________
+    |  ...   | inode |  xattrs  | extents  | data inline | ... | inode ...
+    |________|_______|(optional)|(optional)|__(optional)_|_____|__________
+             |-> aligned with the inode slot size
+                  .                   .
+                .                         .
+              .                              .
+            .                                    .
+          .                                         .
+        .                                              .
+      .____________________________________________________|-> aligned with 4B
+      | xattr_ibody_header | shared xattrs | inline xattrs |
+      |____________________|_______________|_______________|
+      |->    12 bytes    <-|->x * 4 bytes<-|               .
+                          .                .                 .
+                    .                      .                   .
+               .                           .                     .
+           ._______________________________.______________________.
+           | id | id | id | id |  ... | id | ent | ... | ent| ... |
+           |____|____|____|____|______|____|_____|_____|____|_____|
+                                           |-> aligned with 4B
+                                                       |-> aligned with 4B
+
+    An inode can be 32 or 64 bytes, which can be distinguished by a common
+    field that all inode versions have -- i_advise:
+
+        __________________               __________________
+       |     i_advise     |             |     i_advise     |
+       |__________________|             |__________________|
+       |        ...       |             |        ...       |
+       |                  |             |                  |
+       |__________________| 32 bytes    |                  |
+                                        |                  |
+                                        |__________________| 64 bytes
+
+    Xattrs, extents and inline data follow the corresponding inode with
+    proper alignment, and they are optional depending on the data mapping.
+    _Currently_ there are 3 valid data mappings supported:
+
+     1) flat file data without data inline (no extent);
+     2) fixed-output size data compression (must have extents);
+     3) flat file data with tail-end data inline (no extent);
+
+    The size of the optional xattrs is indicated by i_xattr_count in the inode
+    header. Large xattrs or xattrs shared by many different files can be
+    stored in shared xattrs metadata rather than inlined right after the inode.
+
+ 2. Shared xattrs metadata space
+    Shared xattrs space is similar to the above inode space. It starts at
+    a specific block indicated by xattr_blkaddr, with entries organized one
+    by one with proper alignment.
+
+    Each shared xattr can also be directly found by the following formula:
+         xattr offset = xattr_blkaddr * block_size + 4 * xattr_id
+
+                           |-> aligned by  4 bytes
+    + xattr_blkaddr blocks                     |-> aligned with 4 bytes
+     _________________________________________________________________________
+    |  ...   | xattr_entry |  xattr data | ... |  xattr_entry | xattr data  ...
+    |________|_____________|_____________|_____|______________|_______________
+
+Directories
+-----------
+All directories are now organized in a compact on-disk format. Note that
+each directory block is divided into index and name areas in order to support
+random file lookup, and all directory entries are _strictly_ recorded in
+alphabetical order in order to support an improved prefix binary search
+algorithm (refer to the related source code).
+
+                 ___________________________
+                /                           |
+               /              ______________|________________
+              /              /              | nameoff1       | nameoffN-1
+ ____________.______________._______________v________________v__________
+| dirent | dirent | ... | dirent | filename | filename | ... | filename |
+|___.0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
+     \                           ^
+      \                          |                           * could have
+       \                         |                             trailing '\0'
+        \________________________| nameoff0
+
+                             Directory block
+
+Note that apart from the offset of the first filename, nameoff0 also indicates
+the total number of directory entries in this block, so there is no need to
+introduce another on-disk field for it.
+
+Compression
+-----------
+Currently, EROFS supports 4KB fixed-output clustersize transparent file
+compression, as illustrated below:
+
+         |---- Variant-Length Extent ----|-------- VLE --------|----- VLE -----
+         clusterofs                      clusterofs            clusterofs
+         |                               |                     |   logical data
+_________v_______________________________v_____________________v_______________
+... |    .        |             |        .    |             |  .          | ...
+____|____.________|_____________|________.____|_____________|__.__________|____
+    |-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|
+         size          size          size          size          size
+          .                             .                .                   .
+           .                       .               .                  .
+            .                  .              .                .
+      _______._____________._____________._____________._____________________
+         ... |             |             |             | ... physical data
+      _______|_____________|_____________|_____________|_____________________
+             |-> cluster <-|-> cluster <-|-> cluster <-|
+                  size          size          size
+
+Currently each on-disk physical cluster can contain 4KB (un)compressed data
+at most. For each logical cluster, there is a corresponding on-disk index to
+describe its cluster type, physical cluster address, etc.
+
+See "struct z_erofs_vle_decompressed_index" in erofs_fs.h for more details.
+
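A note on the two lookup formulas in the "On-disk details" section of the
patch above: the arithmetic can be sketched in plain userspace C as follows.
This is only an illustration and not the kernel implementation. The helper
and macro names (inode_offset, shared_xattr_offset, SKETCH_*) are
hypothetical; only the formulas and the 4KB / 32-byte / 4-byte constants
come from the document.

```c
#include <stdint.h>

/* Constants taken from the document; the macro names are hypothetical. */
#define SKETCH_BLOCK_SIZE  4096u /* "Currently 4KB block size" */
#define SKETCH_INODE_SLOT  32u   /* inode slot size, same as the v1 inode */
#define SKETCH_XATTR_UNIT  4u    /* shared xattrs are addressed in 4-byte units */

/* inode offset = meta_blkaddr * block_size + 32 * nid */
static uint64_t inode_offset(uint32_t meta_blkaddr, uint64_t nid)
{
	return (uint64_t)meta_blkaddr * SKETCH_BLOCK_SIZE +
	       (uint64_t)SKETCH_INODE_SLOT * nid;
}

/* xattr offset = xattr_blkaddr * block_size + 4 * xattr_id */
static uint64_t shared_xattr_offset(uint32_t xattr_blkaddr, uint32_t xattr_id)
{
	return (uint64_t)xattr_blkaddr * SKETCH_BLOCK_SIZE +
	       (uint64_t)SKETCH_XATTR_UNIT * xattr_id;
}
```

For example, with meta_blkaddr = 1 and nid = 36 the inode would start at byte
4096 + 32 * 36 = 5248 of the volume. The casts to 64 bits matter because a
16TB address space does not fit in 32-bit arithmetic.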
-- 
2.17.1
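
A note on the "Directories" section of the patch above: because the name area
starts right after the last dirent, nameoff0 divided by the dirent size gives
the number of entries in a block. The sketch below illustrates this in
userspace C. It assumes a 12-byte dirent with a 16-bit little-endian nameoff
field at byte 8, which is my reading of struct erofs_dirent in erofs_fs.h and
is not stated in this document; all function and macro names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout (not stated in the document): 12-byte dirents,
 * with the 16-bit little-endian nameoff field at byte 8. */
#define SKETCH_DIRENT_SIZE 12u
#define SKETCH_NAMEOFF_POS 8u

/* Read a little-endian 16-bit value regardless of host endianness. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* nameoff of the i-th dirent in a directory block */
static uint16_t dirent_nameoff(const uint8_t *blk, unsigned int i)
{
	return get_le16(blk + i * SKETCH_DIRENT_SIZE + SKETCH_NAMEOFF_POS);
}

/* The name area starts where the index area ends, so nameoff0 also
 * encodes the number of dirents in this block. */
static unsigned int dirent_count(const uint8_t *blk)
{
	return dirent_nameoff(blk, 0) / SKETCH_DIRENT_SIZE;
}

/* Length of the i-th name: it ends at the next nameoff, or for the last
 * entry at the end of the block or the first trailing '\0'. */
static size_t dirent_namelen(const uint8_t *blk, size_t blksz, unsigned int i)
{
	size_t start = dirent_nameoff(blk, i);
	size_t end = (i + 1 < dirent_count(blk)) ? dirent_nameoff(blk, i + 1)
						 : blksz;
	size_t len = 0;

	while (start + len < end && blk[start + len] != '\0')
		len++;
	return len;
}
```

This sketch trusts the block contents. A real parser would have to validate
nameoff values against the block size before dereferencing them.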
